Class attributes starting with underscore do not support assignment #288

Closed
dmfigol opened this issue Oct 30, 2018 · 57 comments
Labels
feature request help wanted Pull Request welcome

Comments

@dmfigol

dmfigol commented Oct 30, 2018

Bug

For bugs/questions:

  • OS: macOS
  • Python version import sys; print(sys.version): 3.6.6
  • Pydantic version import pydantic; print(pydantic.VERSION): 0.14

In #184 it was suggested to use variables starting with the underscore, however this does not work.
The last comment in #184 referred to the same problem, but it is technically a separate issue.

from pydantic import BaseModel
m = BaseModel()
m._foo = "bar"
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/Users/dmfigol/projects/my/public/simple-smartsheet/.venv/lib/python3.6/site-packages/pydantic/main.py", line 176, in __setattr__
    raise ValueError(f'"{self.__class__.__name__}" object has no field "{name}"')
ValueError: "BaseModel" object has no field "_foo"
@samuelcolvin changed the title from "ValueError: Class attributes starting with underscore do not work" to "Class attributes starting with underscore do not support assignment" on Nov 15, 2018
@samuelcolvin
Member

(I've updated the issue title to better reflect the issue).

Personally, I don't think it's that important, but if you'd like the feature, I'd be happy to accept a PR.

@asemic-horizon

I'm having this same problem in a non-workaroundable context. Can someone maybe give some context on why this restriction exists and whether there's any way to hack around it, even if ugly?

@samuelcolvin
Member

Have you tried using field aliases?

e.g.:

class MyModel(BaseModel):
    foobar: str
    class Config:
        fields = {'foobar': '_foobar'}
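
A roughly equivalent sketch using a per-field alias instead of Config.fields, assuming pydantic 1.0 or later (where Field is available). The model and key names simply mirror the example above; nothing here is specific to this issue's eventual resolution:

```python
from pydantic import BaseModel, Field


class MyModel(BaseModel):
    # The Python-side name stays public; the external key keeps its underscore.
    foobar: str = Field(alias="_foobar")


# Input uses the underscored key; attribute access uses the public name.
m = MyModel(**{"_foobar": "bar"})
print(m.foobar)  # bar
```

The underscore only exists at the boundary (input data), so the model itself never needs a field whose name starts with an underscore.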

@asemic-horizon

That works in a small example

import json, my_models

with open('test.json') as f: d = json.load(f)
model = my_models.MyModel(**d)

I continue to have problems in the context of FastAPI. That's not the beginning of a new question, it's for the next person to arrive here from Google.

Thanks for your help!

@edatpb

edatpb commented May 30, 2019

That's not the beginning of a new question, it's for the next person to arrive here from Google.

👋 Oh hi!
@asemic-horizon Did you find a solution or a good workaround yet? I'll post again if I find one

@asemic-horizon

Yeah sorry.

As I had said, field aliases worked in small toy examples but didn't seem to work in a larger app context, which made me suspect the problem was in FastAPI itself and therefore not pertinent to this issue.

After further rounds of research and bug squashing, minor and subtle errors on my end (my code on top of FastAPI) surfaced. By then I had forgotten I had gone to Pydantic support for it.

Thanks everyone!

@asemic-horizon

(On my end I would "vote this issue closed" or its equivalent in the github issue tracker. I suspect I can't because I didn't actually open it.)

@samuelcolvin
Member

samuelcolvin commented May 30, 2019

I'm pretty sure aliases work fine in fastapi.

Closing, but if someone really wants this, let me know and I can reopen it.

@tiangolo
Member

tiangolo commented Jun 3, 2019

@asemic-horizon field aliases work normally in FastAPI and are even documented/recommended in several places.

If you still have issues with them, feel free to open an issue in FastAPI with the minimum sample to reproduce your issue: https://github.com/tiangolo/fastapi/issues/new/choose

@bartekbrak

bartekbrak commented Jun 26, 2019

I read the above but saw no explanation for this limitation. From a brief study of the source code I could not find any danger in sunder fields.

The check was introduced in the very first commit, a8e844d, and another check was added in f9cf6b4, but neither commit message gives an explanation.

Python itself, by design, does not attach any special meaning to sunder and dunder fields: the "we are all consenting adults here" approach. I am used to the convention of marking private-like fields with a sunder.

Using aliases is clumsy. I use Pydantic to save lines of code and make code more readable, and I would rather not add this magic. It also prevents me from doing the arguably dirty trick of having both _response and response (the processed response).

I would very much like to know the motivation or have this restriction lifted.

Still, I adore Pydantic, thanks!

@pdonorio

pdonorio commented Dec 7, 2019

Hello there, I use pydantic in FastAPI as well, and I use mongodb as database.

Mongo sets identifiers as the field _id in collections and query outputs, and handling those via Pydantic models is quite confusing because of this issue. Let me show what I do, to ask whether I'm doing something wrong. These are the situations I've met in my use case:

  1. when sending the id as output I have to create a Model with the alias as
class Foo(BaseModel):
    id: str = Field(..., alias='_id')
    ...

query = db.collection.find_one(...)  # find_one returns a single document; find returns a cursor
validated = Foo(**query).dict()
# where validated = {'id': MONGO_HASH, ...}
# send validated as response

Note: if I want to send the id the way mongo does (for example, if my API is used by another service that deals with queries and mongo), I need to use

validated = Foo(**query).dict(by_alias=True)
# where validated = {'_id': MONGO_HASH, ...}
  2. when storing a PUT in the database, the above requires the input to use id instead of _id, like:
data = read_from_source(...)
# data = {'id': EXISTING_MONGO_HASH, ...}
db.collection.update(Foo(**data).dict(by_alias=True))

but if some other service is sending me raw data I would get the _id in input so I need to change the model to

class Foo(BaseModel):
    id: str = Field(..., alias='_id')
    ...
    class Config:
        allow_population_by_field_name = True

otherwise I get a validation error


There are other situations that I can't recall well enough to show with code,
but in general pydantic currently forces me to take care of these differences, since the underscores are not accepted.
I would rather not use the id translation at all and always go with _id, because I'm risking non-deterministic behaviour in wrapper methods for all models in the codebase and in the responses.

Hope this helps to reason on it, but thanks for this awesome library!

@gpakosz

gpakosz commented Feb 7, 2020

Hello @samuelcolvin 👋

I just started using pydantic along with FastAPI and pymongo. I used alias like @pdonorio mentions above. But I would like to understand why attributes starting with underscore are disallowed in the first place?

@samuelcolvin
Member

I still think most python users would not expect "private" attributes to be exposed as fields.

I know python doesn't formally prevent attributes whose names start with an underscore from being accessed externally, but it's a pretty solid convention that you're hacking if you have to access attributes with leading underscores. (Yes I know there are some exceptions, but in the grand scheme they're rare).

In the example quoted above regarding mongo, I would personally feel it's inelegant design to keep the mongo attribute _id in python with the same name. You should be able to use an alias generator and a shared custom base model to change the field name in all models without repeated code.

Still, if people really want this, I'd accept a PR to allow field names with underscores. I guess the best solution would be to add a new method to Config which replaces this logic, perhaps returning None to fallback to the default logic.

Then in v2 we can completely remove Config.keep_untouched; it's a bad name and a confusing attribute.


None of my business, but: @gpakosz, unless I'm missing something, you're using a synchronous library for io (pymongo) while using an async web framework (starlette+fastapi). I would worry that your entire process will hang during every database call, thereby negating the point of asyncio.
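
The alias-generator suggestion in this comment could look roughly like the sketch below. MongoModel, mongo_alias, Foo and Bar are hypothetical names chosen for illustration, and the class-based Config shown is the pydantic v1 style (newer releases accept it with a deprecation warning):

```python
from pydantic import BaseModel


def mongo_alias(field_name: str) -> str:
    """Map the Python-side name `id` to Mongo's `_id`; leave other names alone."""
    return "_id" if field_name == "id" else field_name


class MongoModel(BaseModel):
    """Hypothetical shared base: every subclass inherits the alias rule."""

    class Config:
        alias_generator = mongo_alias


class Foo(MongoModel):
    id: str
    title: str


class Bar(MongoModel):
    id: str
    count: int


# Both models accept Mongo-style input without declaring an alias themselves.
foo = Foo(**{"_id": "abc123", "title": "hello"})
bar = Bar(**{"_id": "def456", "count": 3})
print(foo.id, bar.id)  # abc123 def456
```

The point of the shared base is that the id/_id translation lives in one place rather than being repeated as a Field alias on every model.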

@samuelcolvin samuelcolvin reopened this Feb 7, 2020
@Garito

Garito commented Feb 8, 2020

I agree that, in python, underscore variables are considered private, so I understand your point.
But
everyone who uses mongodb (and perhaps other databases) must litter their code with by_alias=True, which is not that bad (although talking about elegance seems very weird).
The problem becomes deeper when you use lists of pydantic models.
Then you must add a good number of extra for loops to return the information in the correct way (for read and write).

What about a config attribute like always_by_alias = True?

@samuelcolvin
Member

You can just implement your own custom base model and override .dict() so that by_alias=True is the default.

I don't want to add more config options unless absolutely necessary. Also it would be weird since I think .dict() and .schema() have different defaults.

Perhaps we can change the default to True in V2 once we have load and dump aliases.
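
A minimal sketch of that custom base model, assuming the .dict() export method (still callable, though deprecated, in newer pydantic releases). AliasBaseModel, Item and Bag are hypothetical names used only for illustration:

```python
from typing import Any, Dict, List

from pydantic import BaseModel, Field


class AliasBaseModel(BaseModel):
    """Hypothetical base model whose .dict() exports by alias unless told otherwise."""

    def dict(self, **kwargs: Any) -> Dict[str, Any]:
        # Only fill in by_alias when the caller did not pass it explicitly.
        kwargs.setdefault("by_alias", True)
        return super().dict(**kwargs)


class Item(AliasBaseModel):
    id: str = Field(alias="_id")


class Bag(AliasBaseModel):
    entries: List[Item]


bag = Bag(entries=[{"_id": "a"}, {"_id": "b"}])
print(bag.dict())  # {'entries': [{'_id': 'a'}, {'_id': 'b'}]}
```

Because by_alias is passed down to nested models during export, lists of models come out with their aliases too, without extra loops at the call site.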

@samuelcolvin
Member

samuelcolvin commented Feb 10, 2020

Allowing fields with underscores by default would also conflict with #1139.

@mortbauer

I'm very new to pydantic and I'm really impressed and really like it. However, the current implementation filters out any field starting with an underscore (main.py 162), and this is completely unconfigurable
except by monkey patching, which is not really acceptable. So I think it is really necessary to add an additional configuration parameter.

@SmadusankaB

SmadusankaB commented Mar 31, 2020

The same thing happens in the connexion framework. I want to use the default mongo _id as the id.

class FileMetadata(BaseModel):
    _id: str
    created_date: str

Set id

data = FileMetadata.parse_obj(info.to_dict())
data._id = my_custom_id

Error

ValueError: "FileMetadata" object has no field "_id"

Change summary explained here solved the problem.

from pydantic import BaseModel, Extra

class FileMetadata(BaseModel):
    _id: str
    created_date: str

    class Config:
        extra = Extra.allow

@samuelcolvin
Member

Better to use an alias for that field.

@StephenBrown2
Contributor

For the pymongo issue specifically I moved to mongoengine, which exposes _id as id. No more need for aliasing!

@iyedg

iyedg commented Apr 30, 2020

The solution for me was using aliases and by_alias when exporting the model. Using the example from @Seshirantha that would look as follows:

from datetime import date

from pydantic import BaseModel, Field

class FileMetadata(BaseModel):
    id: str = Field(alias="_id")  # can also be done using the Config class
    created_date: str

input_data = {"_id": "some id", "created_date": str(date.today())}
FileMetadata(**input_data).dict(by_alias=True)

That should return {"_id": "some id", "created_date": "whatever today is"}

I am using Pydantic with pymongo therefore having access to _id and _meta on the validated model is very important.

@killswitch-GUI

killswitch-GUI commented May 14, 2020

+1 on this, would be a nice feature. I'll use an alias in the meantime, but this for sure caught me off-guard.

@mortbauer

mortbauer commented May 17, 2020

I earlier thought I needed that feature, and also found a quite easy way to have it via monkey patching:

# monkey patch to get underscore fields
import pydantic.main

def is_valid_field(name: str) -> bool:
    # accept single-underscore names; reject dunder names except '__root__'
    if not name.startswith('__'):
        return True
    return name == '__root__'

pydantic.main.is_valid_field = is_valid_field

However I have to say that using an alias works very well now for me!

@PrettyWood
Member

#1679 would probably be a good solution

@hoangddt

Can we support this feature? Our database tables have fields with a leading underscore for metadata (_created, _updated, ...). I am using fastapi and pydantic as a data layer to pass data from the user to the database and vice versa, but the _ fields are not loaded into the model.

@marianpa

I had the same problem with pydantic. In my case this is a must. It looks like this problem is still being ignored, and the only solution is to use a different library such as Marshmallow, Attrs or Dataclass.

@frndlytm

frndlytm commented Jan 25, 2022

Yeah, I second all the _attribute support. It's not just Mongo: any Elasticsearch work behind FastAPI misses out on the meta-properties on documents. And in general, as a DB developer, I find _attributes an expressive way of working around language-specific keywords when necessary, like _id and _timestamp.

To me, it seems like a by_alias workaround, while sufficient to solve any immediate issues, does not express the schema being parsed.

It seems that the justification for not supporting this is that Python conventionally (not enforced) uses underscores to mark private attributes, but I would argue that Python metaclasses, being a way of having the language rewrite itself, throw that argument out the window, particularly given their use in pydantic.

Additionally, I would ask: what is the core responsibility or value proposition of pydantic? I believe it is a model parsing library with applications in messaging (e.g. there's nothing stopping a protobuf loader in the Model Config), and if that IS the value proposition, then the BaseModel declaration should look as close as possible to the model being parsed.

so really, any restrictions on identifiers that aren't explicitly prohibited by the python language (like @timestamp), should probably be supported natively.

Just my two cents.

If there's anything I can do to help push these changes along, I'm dying to get involved in OSS.

@jtfidje

jtfidje commented Feb 8, 2022

Yea we also really need this for use with ArangoDB.

@emremrah

emremrah commented Mar 3, 2022

So I've been working on this. What I needed is to easily cast between ObjectId and str, to save to MongoDB or return data to the frontend without the "$oid" nested object.

To keep things simple on the frontend, object ids in the frontend data models are strings, not objects with "$oid" like {person: {"$oid": "..."}}.

So I implement my data models like this:

import bson
from pydantic import BaseModel

# to cast strings to object id for incoming data, especially from frontend
class MongoObjectId(str):
    """Cast string to ObjectId."""
    @classmethod
    def __get_validators__(cls):
        yield cls.validate

    @classmethod
    def validate(cls, v):
        if bson.ObjectId.is_valid(v):
            return bson.ObjectId(v)
        raise ValueError(f'{v} is not a valid ObjectId for type {cls}')

def oidstr(d):
    """Recursively search for values that are ObjectIds and convert them to strings."""
    if isinstance(d, dict):
        return {k: oidstr(v) for k, v in d.items()}
    if isinstance(d, list):
        return [oidstr(v) for v in d]
    if isinstance(d, bson.ObjectId):
        return str(d)
    return d

class MyModel:
    def dict_with_str(self, *args, **kwargs):
        d = BaseModel.dict(self, *args, **kwargs)

        return oidstr(d)

class Person(MyModel, BaseModel):
    person_id: MongoObjectId

id = '622076858cdc0a82c37f1f56'
model = Person(person_id=id)
print(model)  # person_id=ObjectId('622076858cdc0a82c37f1f56')
print(model.dict(by_alias=True))  # {'person_id': ObjectId('622076858cdc0a82c37f1f56')}
print(model.dict_with_str(by_alias=True))  # {'person_id': '622076858cdc0a82c37f1f56'}

I don't know if I'm overcomplicating things while trying to keep it simple for front end. Please share your thoughts. Thank you.

@lgvld

lgvld commented Apr 5, 2022

As @frndlytm said:

so really, any restrictions on identifiers that aren't explicitly prohibited by the python language (like @timestamp), should probably be supported natively.

Regardless of the underlying database, Pydantic should not enforce what is only a convention. This is definitely not an expected behavior.

alexpovel added a commit to alexpovel/ancv that referenced this issue Jul 18, 2022
These aren't allowed with pydantic:

pydantic/pydantic#288

and

https://stackoverflow.com/q/59562997/11477374

The leading underscore was auto-generated by datamodel-codegen and only caught now.

*However*, just 'schema' is also a bad idea, it shadows the built-in method:

https://pydantic-docs.helpmanual.io/usage/schema/
@Bi0max

Bi0max commented Oct 30, 2022

I still think most python users would not expect "private" attributes to be exposed as fields.

I know python doesn't formally prevent attributes whose names start with an underscore from being accessed externally, but it's a pretty solid convention that you're hacking if you have to access attributes with leading underscores. (Yes I know there are some exceptions, but in the grand scheme they're rare).

In the example quoted above regarding mongo, I would personally feel it's inelegant design to keep the mongo attribute _id in python with the same name. You should be able to use an alias generator and a shared custom base model to change the field name in all models without repeated code.

Still, if people really want this, I'd accept a PR to allow field names with underscores. I guess the best solution would be to add a new method to Config which replaces this logic, perhaps returning None to fallback to the default logic.

Then in v2 we can completely remove Config.keep_untouched; it's a bad name and a confusing attribute.

None of my business, but: @gpakosz, unless I'm missing something, you're using a synchronous library for io (pymongo) while using an async web framework (starlette+fastapi). I would worry that your entire process will hang during every database call, thereby negating the point of asyncio.

I also need a possibility to use underscore attributes.
As @samuelcolvin suggested himself, the easiest way would probably be to implement a simple switch in the Config that allows the "_attribute" style.
Is that something you would accept in a PR?

@fernando-freitas-alves

fernando-freitas-alves commented Nov 18, 2022

Maybe this helps:

from typing import Any

from pydantic import BaseModel


class PrivateInitBaseModel(BaseModel):
    "Workaround for initializing models by specifying their private attributes"

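    def __init__(self, **data: Any) -> None:
        # Pull out only the private values that were actually passed in.
        private_values = {
            key: data.pop(key)
            for key in self.__class__.__private_attributes__
            if key in data
        }

        super().__init__(**data)

        # Set them after validation so private defaults don't overwrite them.
        for private_key, value in private_values.items():
            setattr(self, private_key, value)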

@misuzu

misuzu commented Dec 6, 2022

I just got bitten by this issue while migrating from dataclasses. I'm using a pydantic model to describe an aggregate result from mongodb:

class CollectionAggregateId(BaseModel):
    field1: str
    field2: str
    field3: str


class CollectionAggregateModel(BaseModel):
    _id: CollectionAggregateId

    count: int

AttributeError: 'CollectionAggregateModel' object has no attribute '_id'

Using alias is not an option for me because I have data with both _id and id fields in my database.
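
For data that carries both _id and id, one possibility (a sketch, not something confirmed in this thread) is to give the underscored key a different Python-side name and keep id as an ordinary field. group_key is a hypothetical name, and the id field is added here only to illustrate the coexistence:

```python
from pydantic import BaseModel, Field


class CollectionAggregateId(BaseModel):
    field1: str
    field2: str


class CollectionAggregateModel(BaseModel):
    # Hypothetical Python-side name for Mongo's `_id`; populated via the alias.
    group_key: CollectionAggregateId = Field(alias="_id")
    # A separate, ordinary `id` field from the same document.
    id: str
    count: int


row = CollectionAggregateModel(
    **{"_id": {"field1": "a", "field2": "b"}, "id": "x1", "count": 2}
)
print(row.group_key.field1, row.id, row.count)  # a x1 2
```

Whether this is acceptable depends on whether the Python-side attribute really has to be called _id; the alias only controls the external key.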

@Pedrexus

Can the original question be solved in any way? I tried using alias, but still model._foo = False raises the same error.

@tumma72

tumma72 commented Mar 5, 2023

I have just started to use Pydantic in a couple of applications I am working on at the moment. The issue I am facing, which is probably related to the design I am using is the following:

  1. I have created a series of Pydantic BaseModel classes, with a shared MyBaseModel, which are converting and validating data coming from a remote server API. The validation works, I am able to test it with a series of mock data and also live data, they behave as expected. So I was very happy with Pydantic till this point 😉
  2. I need to integrate the data models into application models, I am not using FastAPI but just my own python application. What I mean with this, is that I have for example a Market(MarketBaseModel, AutoLogger) which does a classical mixin between the Pydantic model and the AutoLogger class which adds distributed logging features. The AutoLogger adds a local self.__logger attribute to the Market class, and getter and setter properties for the logger.

The problem I am facing is that no matter how I name the attribute (self.__logger, self._logger or self.__logger__), and even though it is initialized in the __init__ method and isn't declared as a class attribute, the MarketBaseModel, being a Pydantic model, extends validation not only to the attributes defined as Pydantic fields but also to every other attribute set in __init__. This happens even though I have added to the Config class all the options that appeared to make sense for achieving the desired behavior:

class Config:
    extra = Extra.allow
    exclude = {"__logger__"}
    arbitrary_types_allowed = True
    underscore_attrs_are_private = True
    json_encoders = {bytes: lambda b: b.decode("utf-8")}

Now I might be overcomplicating this, and surely I could use composition and pass a logger instance to the constructor of the Market, but that would mean adding a lot of boilerplate code that I want to avoid, at least in the high-level APIs. Any idea how I can still inherit from a Pydantic BaseModel while having arbitrarily defined instance attributes?

@dmontagu
Contributor

dmontagu commented Mar 6, 2023

I'm thinking it might be possible to do something with keep_untouched, and setting one of the attributes on the class to have a value that is a type in that setting.

More explicitly, you should be able to do something like:

class MyModel(BaseModel):
    a: int
    b: str  # whatever, random fields

    logger = MyLogger()

    class Config:
        ...  # whatever other values you want to set on the Config
        keep_untouched = (MyLogger,)

Any chance this approach might work for you?

@aolefira

aolefira commented Mar 9, 2023

I'm thinking it might be possible to do something with keep_untouched, and setting one of the attributes on the class to have a value that is a type in that setting.

More explicitly, you should be able to do something like:

class MyModel(BaseModel):
    a: int
    b: str  # whatever, random fields

    logger = MyLogger()

    class Config:
        ...  # whatever other values you want to set on the Config
        keep_untouched = (MyLogger,)

Any chance this approach might work for you?

logger seems to be a public variable here; use __logger instead.

@tumma72

tumma72 commented Mar 9, 2023

Thanks for your suggestions. It appears I have managed to solve this problem, at least for the moment, by changing the design and going for composition instead of inheritance. My model classes now have a class attribute called schema_class, which is type-bound to my schema superclass (a Pydantic BaseModel extension), and from_data and to_data methods, which read the serialized structure, validate it using the schema_class, and store the result as self.data in the model class. With the composition design I have also managed to create a PersistenceEnabled mixin, which allows me to extend the main Model with SQLAlchemy persistence using the same pattern: declaring a db_class class attribute and providing saving, loading and searching features on the model. This of course requires having 3 classes implemented:

  1. The main domain related Model class (e.g.: MyModel) which now can mixin with all other regular python classes;
  2. The Pydantic schema class (e.g.: MyModelSchema) which does all of the validation and serialization business;
  3. If the domain model needs persistence also a db class (e.g.: MyModelDB) which does all of the CRUD business behind the scenes, and allows also for transaction participation (in case of multiple items operation).

I didn't really solve the problem, but I went around it. Every piece is smaller, has clearer responsibilities and doesn't cause interference between the different frameworks.
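
A minimal sketch of the composition pattern described above, covering only the first two classes (no persistence). The names schema_class, from_data, to_data and data come from the description; the fields, the _logger attribute and the method bodies are assumptions made for illustration:

```python
from typing import Any, ClassVar, Dict, Type

from pydantic import BaseModel


class MyModelSchema(BaseModel):
    """Pydantic schema: handles validation and serialization only."""

    name: str
    price: float


class MyModel:
    """Plain domain class: free to mix in loggers or any other attributes."""

    schema_class: ClassVar[Type[BaseModel]] = MyModelSchema

    def __init__(self, data: BaseModel) -> None:
        self.data = data
        self._logger = None  # arbitrary private attributes are fine here

    @classmethod
    def from_data(cls, raw: Dict[str, Any]) -> "MyModel":
        # Validate the raw structure through the schema, then wrap it.
        return cls(cls.schema_class(**raw))

    def to_data(self) -> Dict[str, Any]:
        return self.data.dict()


model = MyModel.from_data({"name": "widget", "price": "1.5"})
print(model.to_data())  # {'name': 'widget', 'price': 1.5}
```

Since MyModel is not itself a BaseModel, pydantic never sees its instance attributes, which is what sidesteps the underscore restriction.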

@tommyjcarpenter

tommyjcarpenter commented Jul 21, 2023

Sorry if I am being naive here, but after reading this thread, I came across this: https://docs.pydantic.dev/latest/usage/models/#private-model-attributes

This seems to solve this problem?

from datetime import date

from pydantic import BaseModel, Extra, PrivateAttr

class Foo(BaseModel):
    class Config:
        extra = Extra.forbid

    bar: date
    _myvar_ = PrivateAttr(...)
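
Applied to the logger case discussed earlier in the thread, a private attribute might look like the sketch below. It assumes a pydantic release that provides PrivateAttr (1.7 or later); Market and the logger names are hypothetical:

```python
import logging

from pydantic import BaseModel, PrivateAttr


class Market(BaseModel):
    name: str
    # Private attribute: not a field, not validated, not part of exports.
    _logger: logging.Logger = PrivateAttr(
        default_factory=lambda: logging.getLogger("market")
    )


market = Market(name="NYSE")
# Plain assignment to the underscored attribute works.
market._logger = logging.getLogger("market.nyse")
print(market._logger.name)  # market.nyse
```

This covers underscored attributes that are internal state. It does not turn _logger into a validated field, so it does not help when the underscored name has to be parsed from input data.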

@rreusser

@tommyjcarpenter I believe pydantic v2 was just released, and it looks like the diff in that code block is:

# v1.1:
    _secret_value: str = PrivateAttr()

# v2:
    _secret_value: str

so maybe it's that PrivateAttr is the default behavior for underscored attributes so that you don't explicitly need to specify? Not certain if there are differences beyond that.

@tommyjcarpenter

@rreusser sorry I don't follow - just saying, I used PrivateAttr and it seemed to solve all the commotion in this thread.

@rreusser

rreusser commented Jul 21, 2023

@tommyjcarpenter Oh, sorry, I misread! Disregard. I've done the same, but I'm on v1.1 still, and when I clicked the above link I noticed that maybe this has changed slightly in v2.
