Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Preload many_to_many through preload query discards my data #2785

Closed
chingan-tsc opened this issue Nov 2, 2018 · 6 comments
Closed

Preload many_to_many through preload query discards my data #2785

chingan-tsc opened this issue Nov 2, 2018 · 6 comments

Comments

@chingan-tsc
Copy link

chingan-tsc commented Nov 2, 2018

Precheck

  • Do not use the issues tracker for help or support requests (try Elixir Forum, Stack Overflow, IRC or mailing lists, etc).
  • For proposing a new feature, please start a discussion on elixir-ecto.
  • For bugs, do a quick search and make sure the bug has not yet been reported.
  • Finally, be nice and have fun!

Environment

  • Elixir version (elixir -v): 1.6.6
  • Database and version (PostgreSQL 9.4, MongoDB 3.2, etc.): PostgreSQL9.6
  • Ecto version (mix deps): ecto 2.2.10
  • Database adapter and version (mix deps): ecto 2.2.10
  • Operating system: Mac OS X

Current behavior

I have created a sample app to simulate this behavior here: https://github.com/chingan-tsc/ecto_preload_test forked from #2534 as our issue is similar but yet different.

So first things first, the schema, we have a many to many relationship between Post and Tag, where Tag schema has a belongs_to with Comment (the schema might not makes sense in terms of actual practicality but it'll simulate the behavior). I have two queries to list all Post(s) along with all its associated Tag (and the parent Comment attached to the Tag).

def list_posts_a() do
    preload_query = from(t in Tag, join: c in assoc(t, :comment), preload: [comment: c])

    from(p in Post, preload: [tags: ^preload_query])
    |> PreloadTest.Repo.all()
  end

  def list_posts_b() do
    preload_query = from(t in Tag, join: c in assoc(t, :comment), preload: [comment: []])

    from(p in Post, preload: [tags: ^preload_query])
    |> PreloadTest.Repo.all()
  end

Now I have written a test case in my sample app:

test "preload many_to_many through preload query" do
      post1 = post_fixture()
      post2 = post_fixture()

      comment1 = comment_fixture()
      comment2 = comment_fixture()

      tag1 = tag_fixture(%{comment_id: comment1.id})
      tag2 = tag_fixture(%{comment_id: comment2.id})

      post_tag_fixture(%{post_id: post1.id, tag_id: tag1.id})
      post_tag_fixture(%{post_id: post2.id, tag_id: tag1.id})
      post_tag_fixture(%{post_id: post1.id, tag_id: tag2.id})

      IO.inspect(Query.list_posts_a())
      IO.inspect(Query.list_posts_b())
      assert Query.list_posts_a() == Query.list_posts_b()
    end

Lets talk about the second query, Query.list_posts_b() it returns 2 Posts (post1 and post2) where post1 has 1 Tag object under tags key and post2 has 2 Tag objects undertags key. All is fine.

But the first query, Query.list_posts_a() returns post1 as before but post2 with empty array ontags key. Now it doesn't makes any sense to me as two routines are actually almost the same.

What I did was I copied the actual SQL query used for preloading from the debug log and both returns 3 Tag objects but the second query "distribute" them nicely into two Post(s) nicely where the first query just drop 1 of the Tag object.

Please advise.

Expected behavior

Both queries should return the same result.

@josevalim
Copy link
Member

Hi @chingan-tsc!

For many_to_many preload queries, you need the first join to be the association table, otherwise it works completely by chance. Something like this:

preload_query = from(t in Tag, join: pt in "post_tags", on: pt.tag_id == t.id, join: c in assoc(t, :comment), preload: [comment: c])

I will improve the docs to make this clear.

@josevalim
Copy link
Member

Actually, I think this should work as is as well. I am investigating it further.

@chingan-tsc
Copy link
Author

Hi @josevalim, thanks for the prompt reply. I have added another test case on my repo as such:

test "preload many_to_many through preload query with non overlapping tags" do
      post1 = post_fixture()
      post2 = post_fixture()

      comment1 = comment_fixture()
      comment2 = comment_fixture()

      tag1 = tag_fixture(%{comment_id: comment1.id})
      tag2 = tag_fixture(%{comment_id: comment2.id})

      post_tag_fixture(%{post_id: post1.id, tag_id: tag1.id})
      post_tag_fixture(%{post_id: post2.id, tag_id: tag2.id})

      IO.inspect(Query.list_posts_a())
      IO.inspect(Query.list_posts_b())

      assert Query.list_posts_a() == Query.list_posts_b()
    end

In this test case both routine returns expected result, but only when post1 and post2 don't have any overlapping Tag. Actually I fixed the issue in my production app by using just preload: [comment: []] but I was just baffled by the inconsistencies of the result.

@josevalim
Copy link
Member

Ok, I have dug deeper into this and the issue is that when using joins such as comment: c, we have duplicated results. However, since you also get duplicate results from the query posts-tags relationship, the joiner ends up merging all of them together when processing comments.

Here is the query that we run internally:

from t in PreloadTest.Tag, join: c in assoc(t, :comment),
 join: p0 in PreloadTest.Post, on: p0.id in ^[16, 15], join: p1 in "post_tags",
 on: p1.post_id == p0.id, where: p1.tag_id == t.id, order_by: [asc: p0.id],
 select: {p0.id, t}, preload: [comment: c]

In the overlapping case, it is supposed to return three entries, but you can see it returns only 2. This would actually be an overall bug with joins that happens to manifest via preloads.

@chingan-tsc
Copy link
Author

I am guessing its a bug too but what baffles me is that both raw SQL queries seen on the debug log actually yields the same number of rows (well of course preload: [comment: c] yields more columns) but when the Ecto Preloader "injects" the result set into the base object, somehow overlapped Tag would be discarded.

@josevalim
Copy link
Member

It should be fixed in master. :)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants