Problem with collision check in merge_tablemaps #518
Or actually, why copy.copy again if db[alias] already exists for the same original Table instance? |
Could this be related somehow to #521? |
Or, said differently: should we make sure that each time we pass a second (dict) argument to merge_tablemaps(), we make a "deepcopy"? Since we can't control whether the second argument gets updated (modified), I think it would make more sense... |
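A minimal sketch of that suggestion, with a hypothetical helper (`merge_tablemaps_safe` is an illustrative name, not pyDAL API). Note the caveat in the comments: deep-copying protects the caller's dict from mutation, but it cannot fix the identity-based collision check discussed below, since copies are new objects.

```python
import copy

def merge_tablemaps_safe(first, second):
    """Hypothetical variant: deep-copy the second dict so callers
    never see it mutated. This protects the caller's argument, but it
    does NOT help the identity-based collision check, because
    deep-copied values are new objects and would fail an `is` test."""
    second = copy.deepcopy(second)
    merged = dict(first)
    merged.update(second)
    return merged

a = {'t1': 1}
b = {'t2': 2}
merged = merge_tablemaps_safe(a, b)
assert merged == {'t1': 1, 't2': 2}
assert b == {'t2': 2}   # caller's dict is untouched
```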
@nursix, can you provide a failing example for me, to check whether this would fix your issue? Thanks |
Ah - okay, I should probably have described the problem better. I have two functions - the first one produces a Query, like:
...and passes this query on to another function, like:
Then, that other_function performs a select from a left join that includes the same aliased table:
Now, this raises a "Name Conflict" error, because there seem to be two tables "my_aliased_table". Obviously, it is exactly the same table with exactly the same alias (so exactly the same resulting SQL), but merge_tablemaps doesn't recognize that, because table.with_alias() produces a new instance of the Table every time it is called, and merge_tablemaps checks for object identity.

As I see it, with_alias stores the aliased table instance in the DAL instance (self._db), so it should be able to detect when it is called again for the same table with the same alias, and in that case return the same instance from self._db. I can work around the issue by wrapping table.with_alias in a function that does exactly that: checking whether db already has an aliased instance of the same table with the same alias, and in that case returning that instance rather than creating a new one with with_alias.

That wrapper function shouldn't be necessary, though - I think Table.with_alias should do this itself, so that this would succeed (if you need a unit test):
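The actual unit test was not preserved here; below is a sketch of the intended behavior, simulated with a toy stand-in class (`FakeTable` and `cached_alias` are illustrative names, not pyDAL code):

```python
import copy

class FakeTable(object):
    """Toy stand-in for pydal's Table (illustrative only)."""
    def __init__(self, db, name):
        self._db, self._tablename, self._dalname = db, name, name

    def with_alias(self, alias):
        # mimics pydal: always a fresh copy, registered in the db
        other = copy.copy(self)
        other._tablename = alias
        self._db[alias] = other
        return other

def cached_alias(db, table, alias):
    """Hypothetical wrapper: reuse the db's cached alias when it refers
    to the same original table, instead of creating a new instance."""
    cached = db.get(alias)
    if cached is not None and cached._dalname == table._dalname:
        return cached
    return table.with_alias(alias)

db = {}
table = FakeTable(db, 'my_table')
first = cached_alias(db, table, 'my_aliased_table')
second = cached_alias(db, table, 'my_aliased_table')
assert first is second   # identity holds, so an identity check passes
```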
|
Here's my workaround for the issue, which may help you to understand the problem:
(Yeah, it contains another workaround for a problem in older PyDAL versions, since we do not always know which version our software is deployed with - but ignore that, just look at the first part here) |
You could also consider this as an inconsistency. ...because, when I do this:
...then every subsequent

I can see how this is debatable, of course: the new with_alias call could indeed be intended to create a new object even when called for the same table and with the same alias, and since there is no way to remove an aliased table from a DAL instance, this would be the only (non-intrusive) way to achieve that. So the question may be what the more common case is: re-using the same instance, or creating a new instance? I, for my part, struggle to imagine a case where I would want to override an alias with a new instance of the same table - but then again, that's a very narrow perspective, perhaps. That's why I raised this issue rather than providing a fix. |
No luck, my fix doesn't help at all... This works in 2.14.6:

```python
query = db.auth_user.with_alias('user_table').username == 'USERNAME'
rows = db(query).select(db.auth_user.with_alias('user_table').username)
len(rows)  # 1
```

...and doesn't with trunk, where it falls on:

```
---> 88             raise ValueError('Name conflict in table list: %s' % key)
     89         # Merge
     90         big.update(small)
```

in this function:

```python
def merge_tablemaps(*maplist):
    """Merge arguments into a single dict, check for name collisions.
    Arguments may be modified in the process."""
    ret = maplist[0]
    for item in maplist[1:]:
        if len(ret) > len(item):
            big, small = ret, item
        else:
            big, small = item, ret
        # Check for name collisions
        for key, val in small.items():
            if big.get(key, val) is not val:
                raise ValueError('Name conflict in table list: %s' % key)
        # Merge
        big.update(small)
        ret = big
    return ret
```

This function was introduced in pyDAL 17.01 (v16.11...v17.01#diff-114ce07f361177e0669ec9a374ef7d6aR71), which came with this bunch of changes: #408 |
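The identity check can be exercised without a database; a minimal sketch re-using the merge_tablemaps code quoted above, with plain objects standing in for Table instances:

```python
def merge_tablemaps(*maplist):
    """Copied from pydal/helpers/methods.py (quoted above)."""
    ret = maplist[0]
    for item in maplist[1:]:
        if len(ret) > len(item):
            big, small = ret, item
        else:
            big, small = item, ret
        # Check for name collisions
        for key, val in small.items():
            if big.get(key, val) is not val:
                raise ValueError('Name conflict in table list: %s' % key)
        # Merge
        big.update(small)
        ret = big
    return ret

alias_a = object()   # stand-ins for two Table instances that
alias_b = object()   # both carry the alias 'user_table'

# same key, same object: merges fine
merged = merge_tablemaps({'user_table': alias_a}, {'user_table': alias_a})
assert merged == {'user_table': alias_a}

# same key, different objects (two with_alias() calls): ValueError
try:
    merge_tablemaps({'user_table': alias_a}, {'user_table': alias_b})
    raised = False
except ValueError:
    raised = True
assert raised
```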
@nextghost, you may be of some help here... |
I had a limited understanding of the underlying science of all these changes. :D |
It would be good enough. And it would also be much, much, MUCH easier to screw up that check, which is why I decided not to support that particular edge case and to use simple and safe object identity. Table.with_alias() returns a new object every time because one of my goals for later was to eliminate alias caching on the DAL instance entirely. Relying on cached aliases is a great way to shoot yourself in the foot. You should create the table alias once and then pass it wherever you need to use it. |
That's not a nice thing to say :/ because it's virtually impossible to do what you suggest. I think it's time to abandon web2py/PyDAL altogether. |
And to make the point: I don't see the slightest reason why combining two of the same (table.with_alias("alias") and table.with_alias("alias")) should crash a query. It doesn't make any sense - it's the same table, the same alias. Where exactly is the "simple and safe" in that? I'm sorry, but getting brushed off like that doesn't make PyDAL any more attractive :( We've been using it for a long time (as long as it exists), but it's being developed away from us. |
Cool down guys, there is surely a way... |
And what would possibly be the risk of caching aliases that refer to the same original table, anyway? I can see how there is a risk if the same alias name is used for a /different/ table - but for the same table? Where would it make a difference? We have long had a separation of filter construction and actual querying - and we cannot possibly change all of that just because PyDAL is suddenly unable to combine two completely equal aliases in the same query. |
@BuhtigithuB apparently the way is for us to rewrite our entire software and make it much more complicated in order to match PyDAL's quirks. I'm sorry, but I just can't see how this change in PyDAL is helpful with anything. We're not stupid here, we've been doing this for a very long time - yet I get near-absurd suggestions to rewrite thousands of lines of code for something with very little perceivable value. |
I too question the added value of the nested select stuff if it renders the DAL backward incompatible... But I don't think we should lose our temper... There is surely a way to solve the problem that we cannot see for the moment... |
Because in order to check that two different instances of Table point to the same database table, you have to check that:
And someone could easily make some change in the future that will complicate this check even further.
That's exactly the risk: redefining the alias to a different table between two uses that assume it's still the same table.

The easiest way you can upgrade your separation of filters and queries to the new API is to redefine the filters as a function that takes the table (or a dict of table aliases) as an argument. Then just pass the function around instead of the original filter object, and call it while building the final query with the right table instances as arguments.

Anyway, I quit Web2Py/PyDAL more than a year ago because (to put it mildly) I'm not happy with the core team's approach to code quality. If you want to revert all my contributions, I don't care in the least. |
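A sketch of the migration nextghost suggests, using a toy filter factory (`make_username_filter` is an illustrative name; real pyDAL filters would be Query objects, represented here as strings for brevity):

```python
def make_username_filter(username):
    """Hypothetical filter factory: instead of capturing one specific
    Table instance in a pre-built Query, capture the *logic* and take
    the table (alias) as an argument at query-building time."""
    def build(user_table):
        # in real pyDAL this would be: user_table.username == username
        return "%s.username = '%s'" % (user_table, username)
    return build

# construct the filter factory once, pass the function around...
username_filter = make_username_filter('USERNAME')

# ...and apply it later with whatever table instance the final query uses:
assert username_filter('user_table') == "user_table.username = 'USERNAME'"
assert username_filter('auth_user') == "auth_user.username = 'USERNAME'"
```

The design point: since the filter is built fresh from the caller-supplied table, no stale alias instance is ever captured, so the identity check in merge_tablemaps cannot trip.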
Here is the merge commit: |
I still struggle to see the problem, to be honest. It seems you're merely looking at merge_tablemaps, not at Table.with_alias. Since with_alias is an instance method of Table, you already have the DAL object (self._db), so all you need to check is whether the cached alias in that DAL object refers to the same dalname - which is trivial, as proven by my workaround.

It only ever becomes complicated when you abandon alias caching, because then you would indeed need to compare the entire object hierarchy - so that is why it seems absurd to do that. It would make a lot more sense to move the alias caching into the Table object itself, so that you can skip the dalname check and only ever look at the alias: whenever you call Table.with_alias with an alias name for which there already is a cached instance in /that table/, return that same instance.

Your proposal is still for us to rewrite our code rather than fixing this newly introduced issue in PyDAL? This isn't really the most helpful advice, but if you say there is no other way and you're not going to fix this, then okay, thanks for your time. |
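A minimal sketch of that per-table caching proposal, simulated with a toy class (illustrative only, not pyDAL code):

```python
import copy

class Table(object):
    """Toy stand-in sketching nursix's proposal: cache aliases on the
    Table itself and return the cached instance when with_alias() is
    called again with the same name. Because the cache lives on the
    original table, no dalname comparison is needed."""
    def __init__(self, name):
        self._tablename = name
        self._aliases = {}   # alias name -> aliased Table instance

    def with_alias(self, alias):
        if alias not in self._aliases:
            other = copy.copy(self)
            other._tablename = alias
            other._dalname = self._tablename
            other._aliases = {}
            self._aliases[alias] = other
        return self._aliases[alias]

table = Table('my_table')
a = table.with_alias('my_aliased_table')
b = table.with_alias('my_aliased_table')
assert a is b   # same Python object, so an identity check passes
```

One caveat worth noting (hinted at elsewhere in this thread): caching aliases for the lifetime of the Table means they can never be garbage-collected or redefined, which is exactly the trade-off being debated.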
Yeah - and thanks for clarifying that you don't care about us. |
@nursix Please leave this issue open... |
If you wish...sorry, I understood it as resolved with "won't fix", so I closed it. |
To me it's a regression, as it breaks backward compatibility, and I can't see why we would allow a new feature to break backward compatibility... I am not qualified (and don't have the time to wrap my head around this issue), so I prefer to let others look into that. I have documented the issue the best I can, and thank you for your collaboration too. Hope @mdipierro and @gi0baro will pass by here soon and help clear that out. Richard |
I said I've quit. I've explained my reasons for writing the code the way I did and now it's up to the core team to deal with this issue as they see fit. |
No prob, thank you @nextghost - that is the whole point of leaving the issue open... :D |
This makes sense to me as well, and would be my preference if it doesn't cause memory leaks/thread safety problems. Either way, this is definitely a bug, you do not break people's applications like this. |
Ahm - please take my proposal with a grain of salt. Mind my app-developer perspective ;)

I'm not saying it's a bug - it's rather somewhat implausible from the app perspective. If I create two Table.with_alias() instances from the same original table with the same alias, then it isn't obvious why they should collide in a query. To me as the app developer, they both constitute one and the same TABLE AS expression, and hence I don't see why the app should take care that they also are the same Python objects. That's a bit counter-intuitive, and not really obvious from the documentation.

If it is a bug, like you say, then it should be fixed in some way - but I don't claim that my proposal is necessarily the correct way to fix it, it's just how I could imagine it to work. But if it is a feature, as has also been suggested here, then I would be interested to learn what the requirement behind it might be. |
It's definitely a bug. I can sympathize with nextghost wanting to simplify the way the DAL works to make it more manageable, but you cannot break applications in this way. It's not like your application is using some kind of private methods or monkey-patching a class; you're just calling regular functions documented in the book. Everything in software could be made simpler and more elegant if no one used it. |
I see it as a bug too... The DAL should allow you to construct SQL without having to know all the complexity behind it. Aliasing in SQL is used all the time to save a few keystrokes... In web2py it doesn't have the same usage, as it doesn't save any keystrokes at all, but it should be supported, and we shouldn't break backward compatibility without notice. So it is definitely a bug: it used to work before, and it is something you logically expect to work if you start to use aliases. |
I will have a try ASAP... |
@leonelcamara, I tested it; it seems to solve the issue... Could it have any drawbacks?? @nursix, can you try with your app whether it solves all the issues you were having? Just apply the change here: leonelcamara@62d0948#diff-68220c912f02a03fd8bce58415479334R1053 over: gluon/packages/pydal/objects.py Remember to reboot web2py... :) |
This works (is equivalent to my workaround), but having the standard case in the "except" branch seems a bit expensive. It's not my call to make - but wouldn't it be better to store aliases in the Table instance (e.g. as |
Btw - I'm also wondering that because it seems to me that currently, I'm not a PyDAL expert developer, but /this/ seems even more serious to me than the original issue here. |
I mean this:
i.e. a (poorly chosen) alias for the |
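The shadowing hazard hinted at here can be sketched with a toy dict-based "db" (illustrative only; in pyDAL the relevant line is `self._db[alias] = other` in objects.py):

```python
# toy "db": table name -> table representation
db = {'person': 'person table', 'address': 'address table'}

def with_alias(db, tablename, alias):
    # mimics objects.py registering the alias on the DAL instance
    db[alias] = '%s AS %s' % (tablename, alias)
    return db[alias]

# a (poorly chosen) alias for 'person' that collides with the real
# 'address' table silently replaces db['address']:
with_alias(db, 'person', 'address')
assert db['address'] == 'person AS address'   # original table shadowed
```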
@nursix I could easily change this caching to be done at the table level; the point is that it was already being done at the DAL level, so the shadowing was already happening, and I assumed it was for some valid reason, so I simply took advantage of it. Perhaps @mdipierro or @gi0baro can explain why the aliases are being stored in self._db; otherwise I will simply have to read more of the DAL code and probably start commenting it.

> but having the standard case in the "except" branch seems a bit expensive.

True, it may be worth changing this to an "if" statement, as the try/except only becomes an optimization if you call the same with_alias like 10 times or more, and I doubt that's common. |
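The two caching styles being weighed can be sketched side by side (toy cache, hypothetical function names, not pyDAL code):

```python
cache = {}

def with_alias_eafp(alias):
    """try/except style: cheapest on cache hits, but pays the cost of
    raising and catching KeyError on every miss."""
    try:
        return cache[alias]
    except KeyError:
        cache[alias] = object()   # stand-in for the aliased table
        return cache[alias]

def with_alias_lbyl(alias):
    """'if' style: a tiny constant membership-test cost per call, no
    exception overhead on misses - better when misses dominate."""
    if alias not in cache:
        cache[alias] = object()
    return cache[alias]

x = with_alias_eafp('user_table')
assert with_alias_eafp('user_table') is x   # second call hits the cache
y = with_alias_lbyl('other_table')
assert with_alias_lbyl('other_table') is y
```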
I believe the reason is that we want to have db.alias_name defined and pointing to the aliased table, so we can build Rows from the response object - because web2py lacks a mechanism to pass information from the query to the select about which tables are aliased. Without this line in objects.py:

```python
self._db[alias] = other
```

I think the rows data would go into rows[k]._extra and not into rows[k].alias_name. I am sure there is a better way. Try removing the line above and see what breaks when you do a select with a join.
|
I'm struggling with this:
https://github.com/web2py/pydal/blob/master/pydal/helpers/methods.py#L83
...as it requires identity of aliased Table instances, and hence fails if the same aliasing happens twice, i.e.:
...even though it's the same table, and the same alias, and consequently the same SQL.
Why does it have to be this way - wouldn't it be good enough to ensure that the DAL name of both tables is the same (i.e. that they refer to the same original table)? Do they really have to be exactly the same Python object?