Issue when composing ecto queries #1284
Comments
@iwarshak sorry you are having issues. We'd be happy to help you with this on the Ecto Slack channel or on IRC. |
Thanks @parkerl - @josevalim asked me to make this into an issue after discussing it in IRC. |
@iwarshak my bad. This is definitely a tough problem. Did you come up with any ideas in your conversation with @josevalim ? |
Thank you @iwarshak. To recap what we have discussed on IRC, the issue happens because when
def with_popular_post(query) do
  query
  |> join(:inner, [comment], p in assoc(comment, :post))
  |> where([comment, _author, p], p.likes == 4)
end
However, that obviously does not compose well if we always need to track the previous bindings. The only solution I have in mind so far is to allow ... to be specified in the bindings as a way to say "everything between". So we would rewrite the query to:
def with_popular_post(query) do
  query
  |> join(:inner, [comment], p in assoc(comment, :post))
  |> where([comment, ..., p], p.likes == 4)
end
And p would always bind automatically, regardless of whether author was joined before or not. This is still a bit surprising because folks won't know about such possibilities upfront, but it would solve the problem cleanly. Finally, it is worth noting that this issue does not happen when using keyword queries because we can track bindings more efficiently there. |
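As a rough illustration of the failure mode recapped above, the sketch below composes two helpers that use positional bindings. It is not taken from the linked repository: it assumes Ecto 3.x fetched through Mix.install, uses schemaless string sources instead of schemas, and the module name, table names and columns (authors, posts, author_id, post_id, likes) are all hypothetical.

```elixir
# Assumption: Elixir 1.12+ with network access so Mix.install can fetch Ecto 3.x.
Mix.install([{:ecto, "~> 3.0"}])

defmodule PositionalSketch do
  import Ecto.Query

  # Hypothetical helper: adds authors as one more positional binding.
  def with_author(query) do
    join(query, :inner, [c], a in "authors", on: a.id == c.author_id)
  end

  # Written as if posts were always the second binding of the query.
  def with_popular_post(query) do
    query
    |> join(:inner, [c], p in "posts", on: p.id == c.post_id)
    |> where([_, p], p.likes == 4)
  end
end

import Ecto.Query

comments = from(c in "comments")

# Used alone, the second binding is posts, so the filter lands where intended.
comments
|> PositionalSketch.with_popular_post()
|> IO.inspect()

# After with_author/1 the second binding is authors, so the same `where`
# is built against the authors join instead of posts.
comments
|> PositionalSketch.with_author()
|> PositionalSketch.with_popular_post()
|> IO.inspect()
```

Comparing the two inspected queries should show the `where` attached to a different join in each case, which is the composition problem this issue is about.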
I'm not really sure the |
@michalmuskala right, that would have been my answer as well, except people really, really, really prefer using the pipeline, so to me "using keyword syntax" is not a good enough answer. Another solution to this problem is to allow the association syntax to work without an explicit join:
def with_popular_post(query) do
  where(query, [c], assoc(c, :post).likes == 4)
end
When such is used we would automatically join on |
Is there always the implied assumption that if you are composing queries that the If I have Authors with Posts, and Posts with Comments and Tags, then I could/should not attempt the following?
but have...
which would become?
|
@josevalim can you explain this a bit more, please?
How does this get around the problem of a chained query not being aware of the previous bindings? |
If you write:
We know exactly where |
Running into the same issue. Your last comment isn't clear to me. It seems like a more appropriate example of composition would be:
which won't work. Knowing nothing about how Ecto works, my "solution" would involve allowing binding names to be meaningful, thus emulating SQL aliases. Then you could pass the alias to the composition and somehow use it there. |
My two cents (this has been causing me lots of trouble recently). Have you considered some variation of the following syntax?
Post
|> join(:inner, [p], c in assoc(p, :comments))
|> join(:inner, [p], a in assoc(p, :authors))
|> where(%{comments: c}, c.id > 5) # bind by table name instead of index
That way we could always explicitly say what table we are interested in. |
The problem with such a solution is: what if we join twice on the same table? It's a completely reasonable thing to do, and the table name is ambiguous in that case. Another issue is that we lose backwards compatibility. |
Regarding compatibility: afaik you can't use a map as a binding right now, so all previous code should just work fine. |
@iwarshak thanks for creating this issue. Right now we are in the process of migrating a Ruby/Rails application to Elixir/Phoenix (with Ecto, of course). We are also having a similar problem and it's very serious in our use case. In every resource we have a queryable named visible_by that is responsible for access control, and we use it everywhere we get data from the Repo to avoid any data leakage. Pseudo code below:
def visible_by(query, current_admin) do
  case current_admin.role do
    "role_1" ->
      query
      |> join(:inner, [o], p in assoc(o, :placements))
      |> where([_, p], p.admin_id == ^current_admin.id)
    "role_2" ->
      query
      |> join(:inner, [o], p in assoc(o, :placements))
      |> join(:inner, [_, p], a in assoc(p, :assignments))
      |> where([_, _, a], a.admin_id == ^current_admin.id)
    _ ->
      query
  end
end
Currently this queryable is useless since it's impossible to compose it, because we don't know how many joins there are:
Offer
|> Offer.visible_by(current_admin)
|> join(:inner, [o], p in assoc(o, :placements))
|> where([_, p], p.followed == true)
We can go back to doing access control in every query manually, but using visible_by is very elegant and avoids data leakage in case someone forgets to add a join. I know we can do it like this, but we have double joins if the role is "role_1" or "role_2", and I'm not a big fan of "from" when we use a lot of composable queries:
query = Offer |> Offer.visible_by(current_admin)
query =
  from o in query,
    join: p in assoc(o, :placements),
    where: p.followed == true
I like the idea proposed by @teamon; ActiveRecord works like that and it is useful in most cases:
Post
|> join(:inner, [p], c in assoc(p, :comments))
|> join(:inner, [_, c], a in assoc(c, :authors))
|> where(%{comments: c}, c.id > 5)
or
|> where([comments: c], c.id > 5)
@michalmuskala I wouldn't be worried about backwards compatibility since this can be only an alternative syntax. If you need multiple joins on the same table you can use normal bindings (it's not that common to use multiple joins on the same table). What do you think @josevalim @michalmuskala @parkerl? If you think it makes sense, we can dedicate some resources, try to write it and open a pull request. Another possible solution would be to make visible_by a subquery if it uses any join(). That way it would be composable and we could join on it without any issues. It won't be as performant, but maybe the DB planner will optimize it. UPDATE: We went with the subquery approach. This seems to be the best approach in this use case without complicating the code. |
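The subquery approach mentioned in the update might look roughly like the sketch below. This is not the commenter's actual code: it assumes Ecto 3.x fetched through Mix.install, schemaless string sources, and hypothetical table and column names (offers, placements, offer_id, admin_id, title, followed). The idea is that the joins needed for access control stay inside the subquery, so the outer query starts again with a single binding.

```elixir
# Assumption: Elixir 1.12+ with network access so Mix.install can fetch Ecto 3.x.
Mix.install([{:ecto, "~> 3.0"}])

defmodule VisibilitySketch do
  import Ecto.Query

  # Hypothetical access-control scope. The placements join is hidden inside
  # a subquery, so callers only ever see one binding (the offer).
  def visible_by(query, admin_id) do
    scoped =
      query
      |> join(:inner, [o], p in "placements", on: p.offer_id == o.id)
      |> where([_, p], p.admin_id == ^admin_id)
      |> distinct(true)
      |> select([o], %{id: o.id, title: o.title})

    from(o in subquery(scoped))
  end

  # A later step can use positional bindings safely: the offer is first and
  # its own join is second, whatever visible_by/2 joined internally.
  def followed(query) do
    query
    |> join(:inner, [o], p in "placements", on: p.offer_id == o.id)
    |> where([_, p], p.followed == true)
  end
end

import Ecto.Query

from(o in "offers")
|> VisibilitySketch.visible_by(42)
|> VisibilitySketch.followed()
|> IO.inspect()
```

The trade-off is the one noted in the comment: the database has to plan a nested query, and only the columns selected inside the subquery are visible to the outer steps.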
I think using the names is a sensible idea although we will need to raise if you attempt to use duplicates. |
I was thinking about this as well, and I see how we could use the names. Instead of table names, I would rather use the names given in bindings, e.g. The only problem is what you already mentioned, @josevalim: duplicate bindings (as in binding name, not table/column) would need to raise. This is not backwards-compatible, but I think it would be fine to break this, potentially by introducing a warning in a pre-release and checking how many people complain. |
@michalmuskala we can raise only if the named binding is used. :) |
@michalmuskala I'm not sure if I got it correctly but:
def with_date(query, date), do: query |> where(%{c: c}, c.date == ^date)
query
|> join(:inner, [p], x in assoc(p, :comments)) # note the x
|> with_date(today)
this would not match ( |
I am missing something...with this solution instead of remembering how many bindings there are between composing functions, I have to remember the name I gave the binding between functions? Could someone translate the original example above into what this new syntax might look like? Sorry, I'm a bit slow.
IMHO, this is a fairly common use case. A solution that did not allow for this would seem a bit incomplete. |
def with_popular_post(query) do
  query
  |> join(:inner, %{comment: comment}, post in assoc(comment, :post))
  |> where(%{post: post}, post.likes == 4)
end
You would give different names to your bindings. |
I have pushed a series of commits that add last binding match with
query
|> join(:inner, [p], a in assoc(p, :authors))
|> where([p, ..., a], p.category_id == a.category_id)
In the example above, you know that post is the first binding and that the author is the last one, so you can use I would like to try this approach for now and, if it is not enough, we can add named bindings. In fact, the changes we did to support |
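Applied to the example from the start of the thread, the last-binding match might be used as in the sketch below. It assumes Ecto 3.x fetched through Mix.install (the thread states `...` is supported since 2.2) and schemaless string sources; the module, table and column names are hypothetical.

```elixir
# Assumption: Elixir 1.12+ with network access so Mix.install can fetch Ecto 3.x.
Mix.install([{:ecto, "~> 3.0"}])

defmodule LastBindingSketch do
  import Ecto.Query

  # Hypothetical helper that may or may not run before with_popular_post/1.
  def with_author(query) do
    join(query, :inner, [c], a in "authors", on: a.id == c.author_id)
  end

  def with_popular_post(query) do
    query
    |> join(:inner, [c], p in "posts", on: p.id == c.post_id)
    # `c` is the first binding, `p` is whatever was joined last, and `...`
    # absorbs any bindings in between (possibly none).
    |> where([c, ..., p], p.likes == 4)
  end
end

import Ecto.Query

# The filter should target posts whether or not authors were joined first.
from(c in "comments")
|> LastBindingSketch.with_author()
|> LastBindingSketch.with_popular_post()
|> IO.inspect()
```

This only helps while the join of interest is still the last one when the `where` is built. A function further down the pipeline still cannot find posts once something else has been joined after it, which is the gap named bindings are meant to close.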
@josevalim Thanks! We will definitely try it out in our codebase :) |
We write some pretty large queries, and we're going to have to end up writing a module that allows us to track what things have been joined onto a query, and allows us to join if we need to, or just find the binding and perform further filtering if not. This has probably been our only pain point with Ecto query syntax. If we had named bindings, we could write functions that take a query, the name of the binding for what you want to filter, and the parameters, and do some really great stuff. The sort of interface I want is something like
So you only join if an existing binding exists with that kind of join, and then you can filter. These would be infinitely composable. |
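The commenter's own interface is not preserved in this thread, so the following is only a guess at the general shape. It leans on named bindings as they later shipped in Ecto: the `as:` join option and `has_named_binding?/2`, assumed to be available in the Ecto 3.x release fetched here. The module, table and column names are hypothetical.

```elixir
# Assumption: Elixir 1.12+ with network access so Mix.install can fetch Ecto 3.x.
Mix.install([{:ecto, "~> 3.0"}])

defmodule JoinOnceSketch do
  import Ecto.Query

  # Joins posts under the name :post unless an earlier step already did.
  def ensure_post(query) do
    if has_named_binding?(query, :post) do
      query
    else
      join(query, :inner, [c], p in "posts", as: :post, on: p.id == c.post_id)
    end
  end

  # Each filter asks for the join it needs and then refers to it by name,
  # so the order in which filters are composed does not matter.
  def popular(query, min_likes) do
    query |> ensure_post() |> where([post: p], p.likes >= ^min_likes)
  end

  def recent(query, since) do
    query |> ensure_post() |> where([post: p], p.inserted_at >= ^since)
  end
end

import Ecto.Query

from(c in "comments")
|> JoinOnceSketch.popular(4)
|> JoinOnceSketch.recent(~N[2020-01-01 00:00:00])
|> IO.inspect()
```

Both filters end up sharing a single posts join, and either can be dropped or reordered without touching the other.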
Named bindings would be very appreciated, the We compose a query depending on user input, which may include one or more. At the end of the query it's impossible to actually select a join unless it's first, because we don't know where it would be in the bindings array. |
Another example
This query doesn't work at the moment because groups is a many-to-many association. The problem is that if I join the MessageConfig with group, the select will have the wrong bindings whenever the query involves the group join. |
I'd really love to see named bindings introduced. The new My use case is translating a nested and/or query described in an application specific DSL into a sql query. The kicker is that there is a central I had hoped I could create a dynamic fragment like this: The intent was to have a number of single line functions building up this where, while other functions were responsible for doing the proper nesting in parentheses. Only problem is the bindings are positional, and inside of these tiny functions I just don't know what has been bound already or not, and I don't know what order they were necessarily bound. If multiple tables are joined to the
Now I presume Postgres is smart enough to not join tables not asked for in the |
To work around this lack of named bindings in Phoenix Datatables, I actually use a macro to create 25 different functions (actually 50, we have 25 for order_by, 25 for where) that accept an ordinal to which position they will bind an update to a I've run into a need for this in other circumstances though, and am considering spinning off a library to help do this sort of thing more generically. This is madness. Please give us named bindings :) |
Named bindings is already in master... :) |
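For readers landing here later, the original example might be written with named bindings roughly as follows. This is a sketch assuming the syntax as released in Ecto 3.x (the `as:` option on joins and keyword-style bindings such as `[post: p]`), fetched through Mix.install, with schemaless string sources and hypothetical table and column names.

```elixir
# Assumption: Elixir 1.12+ with network access so Mix.install can fetch Ecto 3.x.
Mix.install([{:ecto, "~> 3.0"}])

defmodule NamedBindingSketch do
  import Ecto.Query

  def with_author(query) do
    join(query, :inner, [c], a in "authors", as: :author, on: a.id == c.author_id)
  end

  def with_popular_post(query) do
    query
    |> join(:inner, [c], p in "posts", as: :post, on: p.id == c.post_id)
    # Refers to the join by name, so its position no longer matters.
    |> where([post: p], p.likes == 4)
  end

  # Named bindings can also be referenced from dynamic/2, which lets small
  # functions build conditions without knowing the join order.
  def by_author_name(query, name) do
    condition = dynamic([author: a], a.name == ^name)
    where(query, ^condition)
  end
end

import Ecto.Query

from(c in "comments")
|> NamedBindingSketch.with_author()
|> NamedBindingSketch.with_popular_post()
|> NamedBindingSketch.by_author_name("jane")
|> IO.inspect()
```

As discussed earlier in the thread, a name can only be used once per query, so joining the same table twice needs two distinct names.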
I am currently on Ecto master, to see if named bindings solve my problem at hand. My where clauses are built using Now is this a by-product that just-works™ or intended behaviour/syntax? In consequence Being on the bleeding edge: What is the outlook/intention for dynamic in conjunction with named bindings? |
@Overbryd `...` is supported since 2.2. Do you mean named bindings do not work with dynamic?
|
Environment
I am trying to write a set of composable queries. I am running into an issue when composing 2 queries where each query joins a separate table, and the second query has a where condition.
It seems to me that the root of the issue is that there is another binding available after the first query runs, and the second query, having no idea about it, does the where on the incorrect binding. This would seem to make it difficult to compose queries, since they would need to know of all the previously joined tables.
Any ideas?
Full source with failing test. https://github.com/iwarshak/ecto_compose_error