Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add new functions to Repo behaviour #4427

Open
wants to merge 2 commits into
base: master
Choose a base branch
from

Conversation

tmbb
Copy link

@tmbb tmbb commented Jun 2, 2024

This PR is not suitable for merging right away because of a few reasons (detailed below), but it serves as a template for what I think should be included in Ecto.

Rationale

Currently there is no way to recursively preload associations in deeply nested changesets. Preloading association in deeply nested changesets is important if one is working with nested forms inside LiveView. LiveView expects forms built from changesets (not ecto structs!) and it expects associations to be preloaded so that the HTML components can iterate over the value's "children".

One could think of dropping changesets from LiveView, but changesets are really convenient to validate data and to make sure that errors are rendered correctly in HTML forms.

Nothing inside this PR refers to LiveView, of course, but it's just the motivating example that led me to write it.

Changes implemented in this PR

This PR implements two new public callbacks in the Repo behaviour, namely:

  • @spec preload_in_result({:ok, Ecto.Schema.t()} | {:error, Ecto.Changeset.t()}, list(), list()) :: {:ok, struct} | {:error, changeset} which is optimized for piping results returned by functions such as Repo.insert() and Repo.update()
  • @spec preload_in_changeset(changeset, list(), list()) to preload associations inside a changeset (the function above calls this one when given a {:error, changeset} tuple)

The implementation is not very efficient in the sense that it probably generates more queries than is really required, but that can be optimized by walking the nested changesets twice (once to get the IDs of all the parents to get the children in a single query and then to attribute each child to the correct parent based on the ID; changesets without IDs can't have children already in the database, and don't need to be queryeed - I already do this optimization).

Testing

I'm trying to test this with the TestRepo which uses a custom adapter (which I don't understand at all, I've never looked into adapter code), but it seems like the TestRepo is not generating IDs correctly when multiple changesets are inserted (?).

Questions

  1. Is this considered to be in scope for Ecto? I believe it should be considered in scope because it's non-trivial functionality which might be useful for a number of projects, but probably too trivial to be in its own package.
  2. How can I make the test adapter generate IDs correctly? Should I instead test with a new adapter, even if I have to import something just for testing?

@josevalim
Copy link
Member

Hi @tmbb! I need to understand the problem a bit better. Why can't the associations be loaded before you put the data in the changeset? Can you provide a general example with before/after that could benefit for this?

@tmbb
Copy link
Author

tmbb commented Jun 3, 2024

@josevalim the problem of the "non-preloaded" associations happens when I use the cast_assoc(:assoc, sort_param: :assoc_sort) functionalot to create a new child association. In that case, neither the child association nor its siblings have their associations preloaded. Maybe this is a problem that should be solved at the level of the cast_assoc/2 function, but I don't think the cast_assoc/2 function receives enough data to know which associations should be preloaded, or even what the default value of the preloaded associations should be. And preloading things shouldn't be the job of the cast_assoc/2 function anyway (it doesn't even have access to the repo).

An example can be found in the test I've added in the newest commit:

@collection_attrs %{
"name" => "The Con Collection",
"books_sort" => [0, "new"],
"books" => %{
0 => %{
"title" => "Necronomicon",
"chapters" => [
%{"title" => "Necro 1"},
]
}
}
}
test "preload_in_changeset" do
changeset =
%Collection{}
|> TestRepo.preload(books: [:chapters])
|> Collection.changeset(@collection_attrs)
# There are now two book changesets
assert [book_changeset_1, book_changeset_2] = changeset.changes.books
assert %Ecto.Changeset{} = book_changeset_1
assert %Ecto.Changeset{} = book_changeset_2
# The associations are not loaded!
assert %Ecto.Association.NotLoaded{} = book_changeset_1.data.chapters
assert %Ecto.Association.NotLoaded{} = book_changeset_2.data.chapters
# Call our new `preload_in_changeset/2` function
preloaded_changeset = TestRepo.preload_in_changeset(changeset, books: [:chapters])
assert [preloaded_book_changeset_1, preloaded_book_changeset_2] = preloaded_changeset.changes.books
assert %Ecto.Changeset{} = preloaded_book_changeset_1
assert %Ecto.Changeset{} = preloaded_book_changeset_2
# Preloaded (empty) associations
assert [] == preloaded_book_changeset_1.data.chapters
assert [] == preloaded_book_changeset_2.data.chapters
end

@josevalim
Copy link
Member

Why can't you preload the associations into the struct before you call cast? You already have to load it from the database, why not preload it by then?

@greg-rychlewski
Copy link
Member

It looks like their issue is when the data is not already in the db but when casting will produce a changeset for the creation of a new entry.

But I am still not 100% sure I understand why it's an issue. It seems like it's normal for that to exist only in changes before the persistence happens. And then once you call Repo.insert/update, you can have them in data.

@tmbb
Copy link
Author

tmbb commented Jun 3, 2024

Why can't you preload the associations into the struct before you call cast?

As shown in the code above, I am already preloading the data before the cast. It still doesn't give me preloaded associations for the books.

It looks like their issue is when the data is not already in the db but when casting will produce a changeset for the creation of a new entry.

The problem is not exactly that, the problem is that I want the :chapters associations to be preloaded in the data of the book change set (even if the preload is an empty list).

I need the associations to be preloaded in the changeset data (the changeset data, that is, order to be able to iterate over the chapters in nested inputs in a Phoenix form (with <.nested_inputs_for/>).

I want the form to work even with changeset that haven't been persisted, in which the user mau be performing edits with incremental validation.

I hope I've made it more clear

@josevalim
Copy link
Member

I think the simplest solution is to add a flag to cast_assoc that resets the association if unloaded?

@greg-rychlewski
Copy link
Member

i think something like that makes sense. to me it sounds like when the cast functions are creating changesets for new inserts the associations are defaulting to unloaded and that is causing the issue.

@tmbb
Copy link
Author

tmbb commented Jun 3, 2024

i think something like that makes sense. to me it sounds like when the cast functions are creating changesets for new inserts the associations are defaulting to unloaded and that is causing the issue.

Exactly, that's the issue. But I'm not sure that cast_assoc/2 even has enough information to preload the associations or to even known which associations to preload (which in case of creation means that the associations are "empty"). Is the list of preloads stored anywhere in the data or changesets?

@josevalim
Copy link
Member

We should not really preload it. The issue you describe is for new entries, so we should allow resetting it for new entries, and then we will be all good.

@greg-rychlewski
Copy link
Member

greg-rychlewski commented Jun 6, 2024

@tmbb I think it can be as simple as adding this change here

data = if opts[:force_load], do: %{data | key => Relation.load!(data, original)}, else: data
changeset = %{force_update(changeset, opts) | data: data, changes: changes, valid?: valid?}

By the time you get here you know if it's not loaded then it's a new entry. Otherwise this would have raised.

I don't know if that's the best name for the opt but hopefully this gives the idea.

@josevalim
Copy link
Member

@greg-rychlewski I didn't check the code (sorry, I am being lazy) but would that only be invoked for new records? For updates it can lead to weird failures.

@greg-rychlewski
Copy link
Member

greg-rychlewski commented Jun 6, 2024

I think so but maybe I need to check all the cases more closely. My thinking is that original is defined this way:

original = Map.get(data, key)

So when you do

Relation.load!(data, original)

The only thing it can do is change is an unloaded assoc to empty. And I don't think it should happen for updates because we do this at the beginning

original = Map.get(data, key)
current = Relation.load!(data, original)

Which raises if data is loaded but the assoc is unloaded. So it should essentially be a no-op for updates, if I'm seeing this correctly.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants