Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add :source option for schema field #1998

Closed
wants to merge 6 commits into from

Conversation

Bharat311
Copy link
Contributor

@Bharat311 Bharat311 commented Mar 20, 2017

This PR adds a source option to field.

The source name is then used within the database, whereas the name provided with field is used to refer to that field in the codebase.

Refer #1892 for more details.

TODO:

  • Insertion
  • Updation
  • Deletion
  • Reading
  • Quering
  • Error Reporting
  • Uniqueness
  • Embedded Schema
  • Tests

@sourcelevel-bot
Copy link

Hello, @Bharat311! This is your first Pull Request that will be reviewed by Ebert, an automatic Code Review service. It will leave comments on this diff with potential issues and style violations found in the code as you push new commits. You can also see all the issues found on this Pull Request on its review page. Please check our documentation for more information.

@Bharat311
Copy link
Contributor Author

Hi, @josevalim - Could you please review the work done so far in this PR and let me know your thoughts ?

Also, I would like to add the ability to query a record using an aliased name. So, If the schema is

defmodule EctoDemo.Post do
  use EctoDemo.Web, :model

  schema "posts" do
    field :title, :string
    field :b, :string, alias: :body
  end
end

then

EctoDemo.Post |> where(body: "some test body") |> Repo.all

should be perfectly valid statement and return all posts with body="some test body".
Same goes for other query constructs like select, order, group, etc.

I am a bit confused as to what will be the best place to replace the aliased names with the actual names - The Ecto.Query.Builder or the Ecto.Query.Planner.

I believe it should be done in Ecto.Query.Planner because we prepare the final query at this place and also have all the related schema info, but i am unsure as to how it could be done, as it already receives the quoted expression in the %Ecto.Query{} for various expressions like select, where, order, etc. set by the Ecto.Query.Builder and i don't know whether / how we could modify it.

@OvermindDL1
Copy link
Contributor

Oh please bring something in like this! I keep having to write things like from s in sldbn, where: s.somelongdatabasename_id = 42 and s.somelongdatabasename_blorp = ^vwoop because I have to interact with an old database that prefix's every-single-field-with-the-table-name (which is not terribly short on many of them either); it is so verbose.

So yes, I have a use-case for it. Thanks for the work!

@@ -396,13 +398,14 @@ defmodule Ecto.Schema do
autogenerate = @ecto_autogenerate |> Enum.reverse
autoupdate = @ecto_autoupdate |> Enum.reverse
fields = @ecto_fields |> Enum.reverse
aliased_fields = @ecto_aliased_fields |> Enum.reverse

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Use a function call when a pipeline is only one function long

@Bharat311
Copy link
Contributor Author

Hi @josevalim. Did you manage to get a look at this PR and my earlier comment ?

Apologies if i seem too impatient, but i really need to complete this PR to completely switch over my Rails project to an Elixir app and i am stuck with the above mentioned problem. Any help in this regard is highly appreciated.

Thanks a ton for all your efforts :)

@michalmuskala
Copy link
Member

I did have a look, and I'm not sure that's the way to do it. We introduce quite a lot of additional reflection lookups with this, and I'm not sure we're doing the conversions in the right places.

The conversions should happen so that the user never sees the database columns, and that the database never sees the user columns.

@Bharat311
Copy link
Contributor Author

@michalmuskala - Thanks for your valuable feedback. I thought that it would be better if the user would be able to use both the aliased name and the original name for the field.

However, what you suggest also makes sense. But, I am not sure as to how i could implement it.
If you could guide me, I'd be happy to make the necessary changes to this PR.

Thanks again for your feedback.

@johnnyshields
Copy link

johnnyshields commented Mar 27, 2017

@michalmuskala @josevalim please provide guidance to Bharat on this PR?

IMHO it's better if the Ecto user can use the alias/non-alias names interchangeably, while the DB only sees the aliased name (if present) -- this is what Ruby's Mongoid lib allows for example.

@ericmj
Copy link
Member

ericmj commented Mar 28, 2017

IMHO it's better if the Ecto user can use the alias/non-alias names interchangeably, while the DB only sees the aliased name (if present) -- this is what Ruby's Mongoid lib allows for example.

Can you elaborate on why this is better? The problem is that there is only one way to name struct keys so you have a mismatch between your ecto queries/changsets etc. and your elixir code.

@wojtekmach
Copy link
Member

maybe instead of:

field :first_name, alias: :fn

we should have:

field :fn, as: :first_name

which implies 1-to-1 conversion?

@johnnyshields
Copy link

johnnyshields commented Mar 28, 2017

@ericmj

Can you elaborate on why (using alias/non-alias names interchangeably) is better?

EDIT - Changed my opinion. It's not needed. Even though my current library (Ruby Mongoid) supports it, I've never actually used it.

@wojtekmach

we should have (as: parameter)

I think store_as: or db: might be a better name. The word "store" makes it clear that it's the DB name (rather than as: / :alias: which are ambiguous), and also we can re-use the same :store_as option for embedded models, e.g. embeds_many :addresses, store_as: :addrs

@johnnyshields
Copy link

@ericmj edited above

@johnnyshields
Copy link

hi all, may we please get resolution on the points above so we can move this PR forward? we need it for our project.

@raarts
Copy link

raarts commented Apr 5, 2017

My 0.02. For users of the module the name of the table is determined by the module name. The schema macro then points to the actual underlying table.

In a similar way the users of the module will expect the first argument to the field macro to be the field name, and will expect to be able to use that name in their code.

So my preference would be something like

field :first_name, xxxxx: :fn

On the other hand the alias name now introduces confusion, because that is something you should use instead of something else.

So my suggestion would be:

field :first_name, column: :fn

or even better:

field :first_name, source: :fn

because the docs for the schema macro also use source for the underlying table.

@Bharat311
Copy link
Contributor Author

Bharat311 commented Apr 6, 2017

@michalmuskala

I did have a look, and I'm not sure that's the way to do it. We introduce quite a lot of additional reflection lookups with this, and I'm not sure we're doing the conversions in the right places.

Could you please guide me to the right places where these conversions should take place. I believe it to be the Ecto.Query.Planner ?

The conversions should happen so that the user never sees the database columns, and that the database never sees the user columns.

I agree and would make the necessary changes 👍

@johnnyshields
Copy link

Hi all, just want to confirm the name of option we're adding. Candidates are:

  • as:
  • alias:
  • store_as:
  • db:
  • db_name:
  • repo:
  • repo_name:
  • column:
  • source:

I prefer store_as: or db_name: @josevalim @michalmuskala any strong opinions here?

@OvermindDL1
Copy link
Contributor

As a vote, I also prefer something like :store_as or :db_column_name or something that is very clear.

@ericmj
Copy link
Member

ericmj commented Apr 10, 2017

I prefer :source since that's what we use for the table name with schemas.

@overture8
Copy link

I like :store_as or :source. :db_column_name is a bit too long and :db_name is potentially ambiguous.


%{wheres: wheres, havings: havings} = query

query = %{ query | wheres: rewrite_aliases(wheres, aliases_map),

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is no whitespace around parentheses/brackets most of the time, but here there is.

%{wheres: wheres, havings: havings} = query

query = %{ query | wheres: rewrite_aliases(wheres, aliases_map),
havings: rewrite_aliases(havings, aliases_map) }

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is no whitespace around parentheses/brackets most of the time, but here there is.


field_name = Enum.at(aliases_map, binding)[field] || field

%{query | expr: {op, [], [{{:., [], [{:&, [], [binding]}, field_name]}, [], []}, rhs]} }

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is no whitespace around parentheses/brackets most of the time, but here there is.

%{query | expr: {op, [], [{{:., [], [{:&, [], [binding]}, field_name]}, [], []}, rhs]} }
end

defp aliases_list(%{ sources: sources }) do

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is no whitespace around parentheses/brackets most of the time, but here there is.

def __schema__(:source), do: unquote(source)
def __schema__(:fields), do: unquote(field_names)
def __schema__(:db_fields), do: unquote(db_field_names)
def __schema__(:aliased_fields), do: unquote(aliased_fields)

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

seems unnecessary to store both db_fields and aliased fields ?

@@ -578,6 +579,33 @@ defmodule Ecto.Query.Planner do
end
end

defp rewrite_aliases(query) do

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

how about calling this "evolve_field_names"

@josevalim josevalim force-pushed the master branch 3 times, most recently from 5387818 to 7acc780 Compare April 23, 2017 09:26
@johnnyshields
Copy link

@michalmuskala @josevalim PR is ready for merge, please review.

@johnnyshields
Copy link

@michalmuskala @josevalim pinging again on this, been a month and no movement.

@hbobenicio
Copy link

+1 on this.
Really looking forward for this feature.
Pretty common case when database column name conventions are not desired to be inherited on our ecto models.

@zachdaniel
Copy link
Contributor

We'd love to see this. And even better if it could support fragments :D

@johnnyshields
Copy link

@josevalim @michalmuskala this pr is now conflicted. If we resolve conflicts will you merge?

@josevalim
Copy link
Member

We can rebase when merging, no worries.

@johnnyshields
Copy link

@josevalim so how do we move forward here? This PR has been collecting dust for several months now.

@josevalim
Copy link
Member

josevalim commented Jul 17, 2017 via email

@johnnyshields
Copy link

johnnyshields commented Jul 17, 2017

@josevalim my company wrote this (@Bharat311 works for us and wrote it on company time) and we are using this branch on 3 Elixir projects currently.

@josevalim
Copy link
Member

josevalim commented Jul 17, 2017 via email

@mpinkston
Copy link
Contributor

I'd like to add my 👍 here. I've been using the branch successfully and have come to depend on it (but master has since added some critical updates as well)

@johnnyshields
Copy link

@Bharat311 please update this branch with master if possible.

@Bharat311
Copy link
Contributor Author

@johnnyshields @mpinkston - I have rebased the changes to latest master.

@paulnicholson
Copy link

I would also like to see this merged, needed to use it today.

@johnnyshields
Copy link

@josevalim @michalmuskala please help with this

defp get_field_source(%{data: %{__struct__: schema}}, field) when is_atom(schema),
do: schema.__schema__(:source, field) || field
defp get_field_source(%{data: data}, _field),
do: raise(ArgumentError, "cannot add constraint to changeset because it does not have a source, got: #{inspect data}")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a slight backwards incompatibility. Previously we would just use the field but now we'll fail. I think we should default to using the field in case there's no schema.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@michalmuskala - Updated the PR.

@@ -215,6 +215,7 @@ defmodule Ecto.Query.Planner do
query
|> prepare_sources(adapter)
|> prepare_assocs
|> rewrite_aliases
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is going to have a performance impact since it means we'll need to traverse the whole query every time. Would it make sense to do it only if there is something to rewrite? Alternatively, maybe we could do some of the work in the builder?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@michalmuskala - I am new to elixir and am not familiar with internal workings of ecto. I am not too sure if / how we could do this in builder. I, however, think that it would be better to traverse only if the schema uses field aliases and needs rewrite. I think this should be doable and would work on it.

@@ -706,7 +706,7 @@ defmodule Ecto.Repo.Schema do
defp dump_field!(action, schema, field, type, value, adapter) do
case Ecto.Type.adapter_dump(adapter, type, value) do
{:ok, value} ->
{field, value}
{schema.__schema__(:source, field), value}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since we'll do this for all fields anyway, would it make sense to load all the "sources" as a map and fetch from the map similar to what we do for types?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@michalmuskala - If i understand it correctly, the current types are stored in the changeset. So, we should also add a key source in the changeset similar to types and read the source from there.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This happens at the point we no longer use changesets, so I don't see a reason to store it in the changeset.

@@ -255,6 +255,8 @@ defmodule Ecto.Schema do
* `__schema__(:primary_key)` - Returns a list of primary key fields (empty if there is none);

* `__schema__(:fields)` - Returns a list of all non-virtual field names;
* `__schema__(:db_fields)` - Returns a list of all field names as stored in db;
* `__schema__(:aliases)` - Returns a list of fields with their changed source;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure I like the name aliases and source we introduce. The sound confusing and inconsistent. Maybe we should go for field_alias and field_aliases?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@michalmuskala - I agree field_aliases is better than aliases, I'll rename the method. But i am confused about the field_alias. Do you want me to replace the option :source ?

I have intentionally kept it as source because we use it for defining name of the schema as well. I think we could better it with field_source. (You can check the discussions it the same PR for other suggestions as well)

Moreover, i think field_alias would also require the following change. Currently,

field :first_name, :string, source: :fn

will become:

field :fn, :string, field_alias: :first_name

Since this is something big, I wanted to know your thoughts on this.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍 for using consistent nomenclature everywhere

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was referring only to what we use in the reflection. I would avoid source exactly because it's already quite an overloaded term and is used primarily to refer to the table name. It might be fine within the limited scope of field macro, but I think it's confusing for the reflection mechanism since you have much less context there.

Copy link
Contributor Author

@Bharat311 Bharat311 Jul 31, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Okay. I get it now. I have made the necessary changes. 👍

I haven't changed the option name, its still called as :source. However, within the code everywhere, we are now using field_alias or field_aliases.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I disagree with this, because the column name is not an alias for our field, it's the other way around. I would suggest using something like field_column_name or field_source internally.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree with @raarts. field_source is probably better to account for NoSQL dbs which don't have columns in the strict sense.

they would then be the same for all records
they would then be the same for all records.

* `:source` - Defines the name that is to be used in database for this field.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should source be a string or an atom? I think an atom would be better.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@michalmuskala - Yes, I agree. The source should be defined as an atom. However, I am not able to understand what needs change as per this comment?

@doc false
def __load__(schema, prefix, source, context, data, loader) do
struct = schema.__struct__()
types = schema.__schema__(:types)
aliases = schema.__schema__(:aliases)

data = replace_field_sources(data, aliases)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This adds another iteration over the data on every load. It will have a significant impact on performance. Is there maybe a way to do this in one pass?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@michalmuskala - I understand that this affects performance but i am not sure if there is any other better way to do this. If you could suggest me something better, I'll gladly change it.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The primary problem for me is that this affects performance for everybody - no matter if you use the feature or not.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@michalmuskala - I understand your concern and agree with it completely. However, what i meant was that i can't think of any other way to fix this. If you could help me in the right direction, i'll make the necessary changes.

Another option that i can think of is providing a global configuration that can enable / disable this feature. However, again, i am not sure if it would help.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In production load would only be called once at load time, correct?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No load is called every time we load data from the database - it's responsible for putting the raw data into schemas with proper types.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I will think on how we could make this less costly

field_names = Enum.map(fields, &elem(&1, 0))

db_field_names = Enum.map(field_names, fn(field) ->
if List.keymember?(aliased_fields, field, 0), do: aliased_fields[field], else: field
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wouldn't be the same as Keyword.get(aliased_fields, field, field)?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, this should work fine. Updated the PR. 👍

@Bharat311
Copy link
Contributor Author

@michalmuskala - Thanks for the review. I'll look over these points and will update the PR shortly.

@sourcelevel-bot
Copy link

Ebert has finished reviewing this Pull Request and has found:

  • 19 possible new issues (including those that may have been commented here).

But beware that this branch is 7 commits behind the elixir-ecto:master branch, and a review of an up to date branch would produce more accurate results.

You can see more details about this review at https://ebertapp.io/github/elixir-ecto/ecto/pulls/1998.

@josevalim
Copy link
Member

I will merge this pull request over the next days. Thanks for all the work and please do not rebase!

Notes to self:

  1. There are no integration tests
  2. Returning in insert_all and insert is not considered in field aliases

@josevalim
Copy link
Member

Merged manually and pushed it to master with further fixes, thank you! ❤️ 💚 💙 💛 💜

@josevalim josevalim closed this Aug 5, 2017
@johnnyshields
Copy link

@josevalim thank you so much for your help, and great work @Bharat311

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.