Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

change alias priority logic #1178

Merged
merged 8 commits into from Jan 24, 2020
Merged

change alias priority logic #1178

merged 8 commits into from Jan 24, 2020

Conversation

samuelcolvin
Copy link
Owner

@samuelcolvin samuelcolvin commented Jan 19, 2020

Change Summary

This changes the priority of aliases to fix #1177, however it introduces a potential breaking change.

For example:

def test_alias_generator_on_child():
class Parent(BaseModel):
x: bool = Field(..., alias='abc')
y: str
class Child(Parent):
y: str
z: str
class Config:
@staticmethod
def alias_generator(x):
return x.upper()
assert [f.alias for f in Parent.__fields__.values()] == ['abc', 'y']
assert [f.alias for f in Child.__fields__.values()] == ['abc', 'Y', 'Z']

before alias_generator on the child would take taken priority. While it's arguable whether this is correct and we could allow alias_generator to return a tuple (<alias>, <alias priority>) to allow customisation, I'm aware this is a breaking change.

However I'm not sure how else to fix the issue in a cleaner way. @tiangolo, @dmontagu, anyone else involved - do you think we should:

  • merge this with a "potential breaking change" warning
  • wait to merge this until v2
  • provide some other fix for #1177

?

Checklist

  • Unit tests for the changes exist
  • Tests pass on CI and coverage remains at 100%
  • Documentation reflects the changes where applicable
  • changes/<pull request or issue id>-<github username>.md file added

@codecov
Copy link

@codecov codecov bot commented Jan 19, 2020

Codecov Report

Merging #1178 into master will not change coverage.
The diff coverage is 100%.

@@          Coverage Diff           @@
##           master   #1178   +/-   ##
======================================
  Coverage     100%    100%           
======================================
  Files          20      20           
  Lines        3490    3523   +33     
  Branches      679     687    +8     
======================================
+ Hits         3490    3523   +33
Impacted Files Coverage Δ
pydantic/fields.py 100% <100%> (ø) ⬆️
pydantic/types.py 100% <0%> (ø) ⬆️
pydantic/env_settings.py 100% <0%> (ø) ⬆️
pydantic/networks.py 100% <0%> (ø) ⬆️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update a9e8db3...3c56a93. Read the comment docs.

@dmontagu
Copy link
Collaborator

@dmontagu dmontagu commented Jan 20, 2020

I personally view this as a bugfix -- I think manually labeled fields should always retain their manually labeled alias unless specifically overridden. It doesn't really surprise me that this hasn't caused issues because I think it is rare that the two are used together.

I am sensitive to the idea that it is a breaking change, but in my opinion this seems pretty clearly like incorrect behavior instead of a change from valid implementation to another.


It would be different if a lot of people were now dependent on the incorrect logic. That would very much surprise me, but I'm not sure how to confirm it's not the case.

Do you have a sense of how hard it would be to raise a model-creation-time warning in the case where this PR would actually result in a different (i.e., incompatible) alias being used? The warning message could say something like "your model may be dependeing on buggy behavior that will be fixed in an upcoming minor version". That way, if anyone is depending on this behavior, they will have the chance to let us know they would prefer we didn't change anything.

pydantic/main.py Outdated Show resolved Hide resolved
pydantic/main.py Outdated Show resolved Hide resolved
@samuelcolvin
Copy link
Owner Author

@samuelcolvin samuelcolvin commented Jan 21, 2020

@dmontagu very interesting.

I think in theory the warning approach would be possible, we would change this PR to raise the warning on the new behaviour but not change the alias. I guess we'd then either need to tell people "wait for v1.5 when the new behaviour will come in" or "add the use_new_alias_behaviour flag to your alias definitions".

This all sounds like a lot of work and complexity for something which:

  • might well not be used by many people
  • might be what people are expecting is happening anyway.

I think if we give an appropriate warning in the release notes and are explicit about what's changed, people can't be that upset by the change.

@aljets
Copy link

@aljets aljets commented Jan 21, 2020

That seems ok. I could see a pretty reasonable argument it should work that way. I could also see a pretty reasonable argument that children should always override the parent. But following specificity, as you are, seems reasonable to me. So long as it is clearly documented.

My 2 cents: I'm not sure how I feel about adding an alias_priority (in your latest commits), however, as that seems like a decent complication that could quickly get messy, so it may be worth thinking through use cases and alternatives--it may be cleaner for consumers to work around priority in other ways.

Thanks for working on this, and for your speed!

@samuelcolvin
Copy link
Owner Author

@samuelcolvin samuelcolvin commented Jan 22, 2020

So long as it is clearly documented.

Done I hope, feedback welcome on the docs.

My 2 cents: I'm not sure how I feel about adding an alias_priority

Me neither, I haven't documented it and I don't think people should ever use alias_priority directly. However it isn't possible to get sensible or consistent alias precedence without it (as far as I could work out).

so it may be worth thinking through use cases and alternatives

again I completely agree, but that will have to wait for v2, see #1186


For me this is done, any feedback welcome. Otherwise I'll merge tomorrowish. I'm intending to deploy v1.4 fairly soon.

!!! warning
Alias priority logic changed in **v1.4** to resolve buggy and unexpected behaviour in previous versions.
In some circumstances this may represent a **breaking change**,
see [#1178](https://github.com/samuelcolvin/pydantic/issues/1177) and the precedence order below for details.
Copy link

@aljets aljets Jan 23, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Did you intend to indicate issue 1178 but link to 1177? Pretty minor since they are related though.

In some circumstances this may represent a **breaking change**,
see [#1178](https://github.com/samuelcolvin/pydantic/issues/1177) and the precedence order below for details.

A field alias's is chosen based on the following
Copy link

@aljets aljets Jan 23, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think you meant A field's alias (or maybe this line is unnecessary/duplicated below)

@aljets
Copy link

@aljets aljets commented Jan 23, 2020

Sounds great. Documentation looks good. Thanks!

@samuelcolvin samuelcolvin merged commit 943a8a0 into master Jan 24, 2020
13 checks passed
@samuelcolvin samuelcolvin deleted the alias-generator-fix branch Jan 24, 2020
@mat02
Copy link

@mat02 mat02 commented Jun 28, 2020

Hi everyone!
Is there any option to increase priority of alias_generator for particular model?
I have a situation, where I am extending old API of some legacy software with some additional functionality under a new, sort of "unified" API.

I have a base model with aliases that match legacy API. I also have extended models that inherit from that base models and these will be served for client application. Now the problem is that I would like to use alias_generator in that extended models to replace snake case field names with camel case. Obviously, just as explained in the docs, this works only for the fields defined in child model.
Here's the simplified code example:

def to_camel(string: str) -> str:
    return ''.join(word.capitalize() for word in string.split('_'))

class OldAPIModel(BaseModel):
    id: int = Field(..., alias='Identyfikator')
    name: str = Field(..., alias='Nazwa')
    description: str = Field(..., alias='DlugiOpis')
    
class NewAPIModel(OldAPIModel):
    product_keywords: Optional[List[str]] = Field([])
    class Config:
        alias_generator = to_camel

Now, the use case for that is as following:
I get data from legacy api using OldAPIModel, then gather addidional data from other database and pass all of that to the client using NewAPIModel. As I see it, it would be nice to set some flag (that is disabled by default) to force use of alias_generator in child models. Of course I can just copy all fields from OldAPIModel to NewAPIModel, but I would like to avoid code duplication.
Or maybe there is some other way/usage pattern that I should follow in this case?
BR,
Mateusz

@samuelcolvin
Copy link
Owner Author

@samuelcolvin samuelcolvin commented Jun 28, 2020

no way I know of to force use of a different alias.

You have the options of:

  • copy them like you said
  • generate the model yourself using create_model(), with Field() and alias's set on the field
  • look through the pydantic code and find some way to hack it, no idea if this is possible, might break in minor version updates

@ainsleymcgrath
Copy link

@ainsleymcgrath ainsleymcgrath commented Aug 10, 2020

Hi everyone!
Is there any option to increase priority of alias_generator for particular model?
I have a situation, where I am extending old API of some legacy software with some additional functionality under a new, sort of "unified" API.

I have a base model with aliases that match legacy API. I also have extended models that inherit from that base models and these will be served for client application. Now the problem is that I would like to use alias_generator in that extended models to replace snake case field names with camel case. Obviously, just as explained in the docs, this works only for the fields defined in child model.
Here's the simplified code example:

def to_camel(string: str) -> str:
    return ''.join(word.capitalize() for word in string.split('_'))

class OldAPIModel(BaseModel):
    id: int = Field(..., alias='Identyfikator')
    name: str = Field(..., alias='Nazwa')
    description: str = Field(..., alias='DlugiOpis')
    
class NewAPIModel(OldAPIModel):
    product_keywords: Optional[List[str]] = Field([])
    class Config:
        alias_generator = to_camel

Now, the use case for that is as following:
I get data from legacy api using OldAPIModel, then gather addidional data from other database and pass all of that to the client using NewAPIModel. As I see it, it would be nice to set some flag (that is disabled by default) to force use of alias_generator in child models. Of course I can just copy all fields from OldAPIModel to NewAPIModel, but I would like to avoid code duplication.
Or maybe there is some other way/usage pattern that I should follow in this case?
BR,
Mateusz

@mat02, if you're still looking for a way to do this, I figured out a reasonable solution using create_model.
If you already figured it out, then this can just hang out here for posterity 😸

from itertools import chain

from pydantic import BaseModel, create_model


class Bread(BaseModel):
    class Config:
        alias_generator = lambda s: f"bread_{s}"

    name: str
    id: int


class Butter(BaseModel):
    class Config:
        alias_generator = lambda s: f"butter_{s}"

    name: str
    id: int


print(Butter.__fields__)
print(Bread.__fields__)


class BreadAndButterDoesntWork(Bread, Butter):
    """This won't work :("""


print(BreadAndButterDoesntWork.__fields__)


def BreadAndButterWorks():
    return create_model(
        "BreadAndButter",
        **{
            f.alias: (f.type_, ...)
            for f in chain.from_iterable(
                getattr(m, "__fields__").values() for m in [Bread, Butter]
            )
        },
    )


print(BreadAndButterWorks().__fields__)

Hopefully this helps someone!

Fwiw, I was dealing with a use case where we are using Pydantic models to get bits of data from various APIs and databases in order to ultimately stitch them together into one big data warehouse table.

Imagine in the above, that Bread and Butter were each responses from separate apis The idea was to compose all the models together in order to form that one big table with minimal repetition and a lot of consistency. Since each of the "parts" had a source_id and a natural_key, disambiguation with alias_generator made a ton of sense. What a surprise when the big rolled up model lost all those aliases!

My experience is limited with this library, and my problem is somewhat unusual, but while I'm here I'll toss 2 cents in the bucket for raising the priority of alias_generator in the future. :)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants