Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Is EnvSettingsSource's behavior intentional or bug? #258

Open
qkrwoghk15 opened this issue Mar 20, 2024 · 6 comments
Open

Is EnvSettingsSource's behavior intentional or bug? #258

qkrwoghk15 opened this issue Mar 20, 2024 · 6 comments
Assignees

Comments

@qkrwoghk15
Copy link

qkrwoghk15 commented Mar 20, 2024

Hi, I was using the dotenv related settings in pydantic BaseSettings and found something weird.

Nested 'BaseModel' is used for 'BaseSettings', and the alias of the inner model operate dependently on each other as it is set with 'case_sensitive' in the outer model.

Below is an example code for the above situation.
when outer Model's case_sensitive is False,

import os

from pydantic import Field, BaseModel
from pydantic_settings import BaseSettings, SettingsConfigDict

class MyModel(BaseModel):
    a: int = Field(alias="A")


class MySettings(BaseSettings):
    uid: str
    name: str
    inner: MyModel

    model_config = SettingsConfigDict(env_nested_delimiter="__", case_sensitive=False)

def test(environ_name:str):
    os.environ[environ_name] = "3"
    
    try:
        settings = MySettings(uid="uid", name="name")
    except Exception as e:
        print(e)
    else:
        print(settings)
    finally:
        os.environ.pop(environ_name)

then

>>> test("inner__A")
1 validation error for MySettings
inner.A
  Field required [type=missing, input_value={'a': '3'}, input_type=dict]
    For further information visit https://errors.pydantic.dev/2.6/v/missing

but when outer Model's case_sensitive is True,

...

class MySettings(BaseSettings):
    uid: str
    name: str
    inner: MyModel

    model_config = SettingsConfigDict(env_nested_delimiter="__", case_sensitive=True)

...

then the result is below.

>>> test("inner__A")
uid='uid' name='name' inner=MyModel(a=3)

I think it's a movement that comes from this line
Is this an intended action or is it a bug?

Thanks for reading my long issue

@hramezani
Copy link
Member

Thanks @qkrwoghk15 for reporting this.

This is the intended behavior. pydantic-settings respect the case_sensitive config of the settings model that it starts to initialize. it means in your example it takes this config from MySettings settings model.

@qkrwoghk15
Copy link
Author

Oh, Thanks @hramezani for your response.

As you say, when capital letters exist in the alias of the inner(nested) model,

  • If the case_sensitive of the outer(parent) model is False, validation will fail when alias has upper case characters. <-- I have something weird.
  • However, when it is True, validation can be successful if the environment variable and alias clearly match. <-- It's OK👌

I would appreciate it if you could check the code results below to make sure my understanding is correct.

  1. case 1: case_sensitive=False with lower case alias
class MyModel(BaseModel):
    a: int = Field(alias="a")

class MySettings(BaseSettings):
    uid: str
    name: str
    inner: MyModel

    model_config = SettingsConfigDict(env_nested_delimiter="__", case_sensitive=False)

test("inner__a")
# uid='uid' name='name' inner=MyModel(a=3)

test("inner__A")
# uid='uid' name='name' inner=MyModel(a=3)
  1. case 2: case_sensitive=False with upper case alias
  • I have something weird in this case. So, I would like to check through this issue.
  • In this case, I thought that validation should all succeed by case_sensitive=False, or if alias is independent of case_sensitive, validation should succeed for "inner_A".
  • This is just my initial testing thought and just confirms your intent. I'm not trying to inject my intentions :)
class MyModel(BaseModel):
    a: int = Field(alias="A")

class MySettings(BaseSettings):
    uid: str
    name: str
    inner: MyModel

    model_config = SettingsConfigDict(env_nested_delimiter="__", case_sensitive=False)
    
test("inner__a")
"""
1 validation error for MySettings
inner.A
  Field required [type=missing, input_value={'a': '3'}, input_type=dict]
    For further information visit https://errors.pydantic.dev/2.6/v/missing
"""

test("inner__A")
"""
1 validation error for MySettings
inner.A
  Field required [type=missing, input_value={'a': '3'}, input_type=dict]
    For further information visit https://errors.pydantic.dev/2.6/v/missing
"""

I can infer the relationship between env_prefix and alias from this page of the documents through the following words,

env_prefix does not apply to fields with alias. It means the environment variable name is the same as field alias:

but it was difficult to infer this dependence between case_sensitive and alias.
Is there any information I can get about this from the documents?

@hramezani
Copy link
Member

@qkrwoghk15pydantic-settings convert all the envs to lowercase when case_sensitive=False. So, in both cases env will be inner__a.
In case 1.1, the alias is a. so inner__a will be matched with field a.
in case 1.2, the alias is A. so inner__a won't be matched with field a.

Probably, making all envs lowercase at the beginning is not a good approach but nested model parsing in very complicated and I am afraid of changing it needs a lot of changes.

feel free to ask here if you don't understand it clearly or have more questions.

@qkrwoghk15
Copy link
Author

qkrwoghk15 commented Mar 25, 2024

Thank you so much for sharing your opinions freely @hramezani .

I just thought it would be good to have a clear definition of the relationship between case_sensitive and alias in the inner model.

In general, in BaseSettings's alias and case_sensitive works as expected.
For example, if alias is A and case_sensitive is True, only A is matched.
On the other hand, if alias is A and case_sensitive is False, both A and a will match.

below is an example.
If case_sensitive is False

class MySettings(BaseSettings):
    name: str = Field(alias="Name")

    model_config = SettingsConfigDict(case_sensitive=False)

test(MySettings, "name")  # name = "..."
test(MySettings, "Name")  # name = "..."
test(MySettings, "NamE")  # name = "..."

and if case_sensitive is True

class MySettings(BaseSettings):
    name: str = Field(alias="Name")

    model_config = SettingsConfigDict(case_sensitive=True)

test(MySettings, "name")  # 1 validation error for MySettings ...
test(MySettings, "Name")  # name = "..."
test(MySettings, "NamE")  # 1 validation error for MySettings ...

However, the interaction between alias in the inner(nested) model and case_sensitive in the outer(parent) model exhibits a different phenomenon.
In general, if you expect consistent behavior regardless of whether it is inner(nested) or not,
In case 2, both inner_a and inner_A should have succeeded in validation, but the actual results are that both are failing. (ref. case 2)

I am also afraid because it is not clear to what extent this change will affect me, and I believe that such usage examples (capitalizing the nested model's alias and setting the parent model's case_sensitive to False) will not be common. Since it is expected and there is no rush to make changes, I think it would be a good idea to think about it slowly and together.

I have not yet looked into where the nested model is processed in pydantic BaseModel's __init__.
I roughly thought that it would be possible to use case_sensitive as a context in the validation stage rather than changing the original source (input) itself.

Let's think about some more good ways :)

@hramezani
Copy link
Member

I have not yet looked into where the nested model is processed in pydantic BaseModel's init.
I roughly thought that it would be possible to use case_sensitive as a context in the validation stage rather than changing the original source (input) itself.

Not sure how you would use the validation context here?

I would like to work on this part and improve inner model value extraction but unfortunately, I can't say when it will happen.

As I mentioned before the inner model value extraction implementation is complex and I am afraid we introduce a breaking change by changing this part

@qkrwoghk15
Copy link
Author

qkrwoghk15 commented Mar 28, 2024

Sorry for the delay in replying because of personal schedule.

I often use pydantic(settings) and am very interested in this module.
Therefore, I just wanted to share my thoughts, so I would appreciate it if you could listen at ease. :)

What I thought was temporary and also a bad idea because I don't know where and how to use ruff.
It seem to have side effects or may harm the basic operation of BaseSettings,

  1. In this line
    Our BaseSettings has a predefined model_validator(mode="before") that uses case_sensitive as context,
    and __pydantic_self__.validate_python(context={"case_sensitive": case_sensitive}) rather than __init__.

  2. Or, I understand that BaseModel's schema has already been built by metaclass.
    Therefore, there is some curiosity as to whether this information can be used when doing source build.

I don't think this issue is urgent
This modification is expected to have an overall impact on the design.
Therefore, I think it would be better to slowly devise a better solution.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants