How to ensure_ascii in Pydantic v2 #8825

Closed
FyZzyss opened this issue Feb 15, 2024 Discussed in #8821 · 4 comments

Comments

FyZzyss commented Feb 15, 2024

Discussed in #8821

Originally posted by FyZzyss February 15, 2024
How to ensure_ascii in Pydantic v2?

SomeModel.model_dump_json() in V2 no longer escapes non-ASCII characters.

Example:

from pydantic import BaseModel

class TextMessage(BaseModel):
    text: str

print(TextMessage.model_validate({"text": "Что"}).model_dump_json(by_alias=True, exclude_unset=True))
#> {"text":"Что"}

sydney-runkle commented Feb 15, 2024

@FyZzyss,

Thanks for your question!

You could do something like this:

from pydantic import BaseModel
import json

class TextMessage(BaseModel):
    text: str


# Dump to a plain dict first, then let the standard json module do the escaping.
dumped_data = TextMessage.model_validate({"text": "Что"}).model_dump(by_alias=True, exclude_unset=True)
print(dumped_data)
#> {'text': 'Что'}
print(json.dumps(dumped_data, ensure_ascii=True))
#> {"text": "\u0427\u0442\u043e"}

Or even:

from pydantic import BaseModel, model_serializer
import json

class TextMessage(BaseModel):
    text: str

    # Wrap the default serializer, but only when dumping to JSON.
    @model_serializer(mode='wrap', when_used='json')
    def serialize(self, handler) -> str:
        return json.dumps(handler(self), ensure_ascii=True)


# Note: the result is a JSON string that itself contains the escaped JSON document.
print(TextMessage.model_validate({"text": "Что"}).model_dump_json(by_alias=True, exclude_unset=True))
#> "{\"text\": \"\\u0427\\u0442\\u043e\"}"

In V2, model_dump_json() does not escape non-ASCII characters by default, which is equivalent to ensure_ascii=False :). Let me know if you have any follow-up questions!
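
The first approach above can be wrapped in a small helper so that call sites stay short. The following is a minimal sketch, not part of Pydantic: the function name dump_json_ascii and the sent_at field are hypothetical and exist only for illustration. It assumes Pydantic V2 and passes mode="json" to model_dump() so that values such as datetime are converted to JSON-compatible types before json.dumps sees them.

```python
import json
from datetime import datetime

from pydantic import BaseModel


class TextMessage(BaseModel):
    text: str
    sent_at: datetime  # hypothetical extra field, added only for illustration


def dump_json_ascii(model: BaseModel, **dump_kwargs) -> str:
    """Serialize a model to JSON with non-ASCII characters escaped.

    Hypothetical helper: it is not provided by Pydantic itself.
    """
    # mode="json" makes model_dump() return only JSON-compatible values
    # (for example, datetime becomes a string).
    data = model.model_dump(mode="json", **dump_kwargs)
    return json.dumps(data, ensure_ascii=True)


message = TextMessage(text="Что", sent_at=datetime(2024, 2, 15, 12, 0))
result = dump_json_ascii(message, by_alias=True, exclude_unset=True)
print(result)
```

The keyword arguments are forwarded unchanged, so options such as by_alias and exclude_unset behave as they do with model_dump().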

FyZzyss commented Feb 15, 2024

@sydney-runkle Thank you for the quick answer. Are you planning to bring this functionality back?

Now it requires more boilerplate code, and the built-in json module is very slow, so I have to import third-party serializers :(
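
One possible way to avoid a second pass through a JSON encoder is to escape the non-ASCII characters directly in the string that model_dump_json() returns. The following is a minimal sketch using only the standard library. The helper name escape_non_ascii is hypothetical, and it assumes the input is valid JSON, where non-ASCII characters can only occur inside string values. No claim is made here about its speed relative to other serializers.

```python
import json
import re

# Matches any single character outside the ASCII range.
_NON_ASCII = re.compile(r"[^\x00-\x7f]")


def _escape(match: re.Match) -> str:
    code = ord(match.group())
    if code < 0x10000:
        return f"\\u{code:04x}"
    # Characters outside the Basic Multilingual Plane are written as a
    # UTF-16 surrogate pair, which is what json.dumps does as well.
    code -= 0x10000
    high = 0xD800 + (code >> 10)
    low = 0xDC00 + (code & 0x3FF)
    return f"\\u{high:04x}\\u{low:04x}"


def escape_non_ascii(json_text: str) -> str:
    """Replace every non-ASCII character in a JSON document with \\uXXXX escapes.

    Hypothetical helper: it is not provided by Pydantic itself.
    """
    return _NON_ASCII.sub(_escape, json_text)


raw = '{"text":"Что"}'  # the shape of string that model_dump_json() returns in V2
escaped = escape_non_ascii(raw)
print(escaped)
#> {"text":"\u0427\u0442\u043e"}

# Both forms decode to the same data.
assert json.loads(escaped) == json.loads(raw)
```

In practice the call would look like escape_non_ascii(model.model_dump_json(by_alias=True, exclude_unset=True)).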

sydney-runkle commented

@FyZzyss,

At the moment, we're not planning to add this functionality. Going through json.dumps is no slower than it was in V1, where we just had a catch-all **kwargs that passed those values on to json.dumps.

We could consider adding support for flags like this on a case-by-case basis, though I'm not sure how high the demand is for this specific flag. Thanks for following up!

HansBambel commented

This gave me some headache as well! I was using json.dumps before and wanted to switch to Pydantic's sleeker built-in functionality, but then input from German clients containing umlauts such as "ä", "ö", or "ü" was no longer escaped.

Now I have to use import json along with json.dumps again :(
