Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Dynamic repos - Redefining module warnings #152

Closed
lawik opened this issue Sep 30, 2019 · 9 comments

Comments

@lawik
Copy link
Contributor

@lawik lawik commented Sep 30, 2019

Environment

  • Elixir version (elixir -v): 1.9
  • Database and version (PostgreSQL 9.4, MongoDB 3.2, etc.): 10.5
  • Ecto version (mix deps): 3.1.7
  • Database adapter and version (mix deps): postgrex 0.15 and ecto_sql 3.16
  • Operating system: MacOS Mojave

Current behavior

As covered in #151 I've been creating databases at runtime for use with dynamic repos.

While running my tests for this I end up with a warning that's perfectly reasonable but also rather regrettable.

warning: redefining module Morot.DynamicRepo.Migrations.CreateDocumentations (current version defined in memory)

I do this:

def ensure_repo_exists(repo_module, options) do
    repo_module.__adapter__.storage_up(options)
    options = Keyword.put(options, :pool_size, 2)
    {:ok, repo_pid} = repo_module.start_link(options)
    repo_module.put_dynamic_repo(repo_pid)
    Ecto.Migrator.run(repo_module, :up, all: true, dynamic_repo: repo_pid)
    repo_module.stop(1000)
  end

Ecto.Migrator.run loads all migrations as modules and performs the migrations. As far as I can tell it doesn't avoid loading them again if they already exist. Running this multiple times is to my knowledge unavoidable if I'm creating multiple databases but I see no real reason why the migration files should be loaded into modules if they already exist.

Expected behavior

Either the warnings could be suppressed if the current behavior is accurate or we might benefit from not loading the modules if they already exist.

A potential issue could be not detecting a module name collision with either approach, I guess?

But I don't think the current behavior is ideal :)

@lawik

This comment has been minimized.

Copy link
Contributor Author

@lawik lawik commented Sep 30, 2019

It may be that just replacing Code.load_file with Code.require_file in Ecto.Migrator.load_migration might do the trick but I didn't get a clear idea of what to expect from load_file since it lacks documentation and is marked for deprecation. I fell into an erlang docs black hole in my first surface search for what it does.

@josevalim

This comment has been minimized.

Copy link
Member

@josevalim josevalim commented Sep 30, 2019

You have two options here:

  1. Instead of running migrations from a path, you can pass the migration modules and versions as arguments to migrator run. This can be an advantage as it gives you more fine grained control.

  2. Replace load_file by require_file is also a great call, PR please? load_file basically loads the file multiple times, require_file only once.

@lawik

This comment has been minimized.

Copy link
Contributor Author

@lawik lawik commented Sep 30, 2019

  1. Oh, sweet. I'll look into that if 2. doesn't fix my issue in which case I might not need it.
  2. Sure thing :) I'll try to wrangle a PR for you.
@lawik

This comment has been minimized.

Copy link
Contributor Author

@lawik lawik commented Sep 30, 2019

Ok, I've found a complication that I'm not sure how to tackle in a good way within ecto/ecto_sql.

It seems like the only time I can map a file name to the modules it contains is at the time of loading the file into Elixir. Code.load_file works around this by always loading regardless if it ends up reloading a file and throwing warnings. Code.require_file returns nil when it encounters a file that's already been loaded. Which means we can't find out what modules are in the file.

Code.required_files gives you a list of which files have been required. But not mapped to the modules/atoms that I'd need to identify them and actually run the right migrations.

Diving into the Elixir codebase tells me I'm still not very good at reading Erlang. But from what I can divine it seems the data structure in elixir_code_server.erl for handle_call({acquire, Path..) link is basically a map with a file path pointing to either true or an intermediary reference thing to manage concurrent loading.

I see two approaches off of the top of my head. But I may be overcomplicating.

One is to maintain a sort of registry for which migration modules have been loaded in ecto_sql. This seems weird to me.

The second one is to make Code.require_file and the elixir code server keep the module names/atoms instead of the value true which would mean the Code module could get the ability to tell you what modules were actually loaded from the file, not just that they have been loaded once. This seems like it could be useful for other use-cases.

I have no idea if this impacts other things than require_file but it seems fairly isolated to those features.

@josevalim

This comment has been minimized.

Copy link
Member

@josevalim josevalim commented Sep 30, 2019

Ah yes, now I recall why. There is zero chance we will add “caching” to require_file because we would need to keep both the module name and byte code in memory and this could end-up taking a lot of space. It is definitely something someone could implement on top though. My suggestion is for you to keep all migrations in lib/ and then pass a list of module names and versions when calling the migrator.

@josevalim

This comment has been minimized.

Copy link
Member

@josevalim josevalim commented Sep 30, 2019

“Zero chance” may sound a bit too harsh, sorry if it is the case. :) thanks for following up and for the investigation. It is really appreciated.

@lawik

This comment has been minimized.

Copy link
Contributor Author

@lawik lawik commented Sep 30, 2019

No worries. Ah, right require_file as it is returns the bytecode too. I didn't think of that. Completely makes sense why that wouldn't be an option. Still feels like a bit of an odd limitation to not be able to figure out which modules were loaded in the manners, not even from required_files. That one could potentially keep the module name instead of true.

But I'll close this up. The limitations to this approach are clear to me now :)

@lawik lawik closed this Sep 30, 2019
@lawik

This comment has been minimized.

Copy link
Contributor Author

@lawik lawik commented Sep 30, 2019

Then again, feel free to re-open if you want me to dig into options for Ecto.SQL and loading migrations. It uses load_file and won't work with require_file. So not sure how the deprecation should be handled.

@josevalim

This comment has been minimized.

Copy link
Member

@josevalim josevalim commented Sep 30, 2019

We introduced compile_file which is equivalent to load_file in this case, so that would be the fix. :)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
2 participants
You can’t perform that action at this time.