The misuse of smart union in pydantic #72

Closed
numb3r3 opened this issue Jan 21, 2022 · 3 comments

@numb3r3
Contributor

numb3r3 commented Jan 21, 2022

The string value "1" in doc.tags is automatically converted to the bool True when calling .to_pydantic_model(). This does not make sense: the int 1 also becomes True, and the string "2" becomes the float 2.0.

Reproduction script:

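from docarray import Document

doc = Document(tags={'a': '1', 'b': 1, 'c': '2'})
print('origin:')
print(doc.tags)

# Tags after converting the Document to its pydantic model
pd_doc_tags = doc.to_pydantic_model().tags
print('pydantic: ')
print(pd_doc_tags)

# Tags after serializing with the protobuf protocol
proto_tags = doc.to_dict(protocol='protobuf')['tags']
print('protobuf: ')
print(proto_tags)

# Tags after serializing with the jsonschema protocol
json_tags = doc.to_dict(protocol='jsonschema')['tags']
print('jsonschema')
# Note: the reported output came from printing proto_tags here,
# so its 'jsonschema' entry repeats the protobuf tags.
print(json_tags)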

The script yields the following output:

origin:
{'a': '1', 'b': 1, 'c': '2'}
pydantic: 
{'a': True, 'b': True, 'c': 2.0}
protobuf: 
{'a': '1', 'c': '2', 'b': 1.0}
jsonschema
{'a': '1', 'c': '2', 'b': 1.0}
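
The coercion seen in the pydantic output can be illustrated with a small standard-library sketch. It is a toy model, not pydantic's actual implementation: the member order (bool, float, str), the lenient `to_bool` rules, and the helper names `left_to_right` and `exact_type_first` are all assumptions chosen because they reproduce the output reported above.

```python
from typing import Any, Callable, List, Tuple

# Assumed sets of strings that a lenient bool coercion accepts.
TRUE_STRINGS = {'1', 'on', 't', 'true', 'y', 'yes'}
FALSE_STRINGS = {'0', 'off', 'f', 'false', 'n', 'no'}


def to_bool(value: Any) -> bool:
    """Lenient bool coercion (hypothetical rules for this sketch)."""
    if isinstance(value, bool):
        return value
    if isinstance(value, (int, float)) and value in (0, 1):
        return bool(value)
    if isinstance(value, str) and value.lower() in TRUE_STRINGS:
        return True
    if isinstance(value, str) and value.lower() in FALSE_STRINGS:
        return False
    raise ValueError(f'cannot interpret {value!r} as bool')


# Assumed member order of the union: bool, then float, then str.
MEMBERS: List[Tuple[type, Callable[[Any], Any]]] = [
    (bool, to_bool),
    (float, float),
    (str, str),
]


def left_to_right(value: Any) -> Any:
    """Return the result of the first member that accepts the value."""
    for _, coerce in MEMBERS:
        try:
            return coerce(value)
        except (TypeError, ValueError):
            continue
    raise ValueError(f'no union member accepts {value!r}')


def exact_type_first(value: Any) -> Any:
    """Keep the value if its type is exactly a member, else fall back."""
    for member_type, _ in MEMBERS:
        if type(value) is member_type:
            return value
    return left_to_right(value)


tags = {'a': '1', 'b': 1, 'c': '2'}
naive = {key: left_to_right(value) for key, value in tags.items()}
smart = {key: exact_type_first(value) for key, value in tags.items()}
print(naive)  # {'a': True, 'b': True, 'c': 2.0}
print(smart)  # {'a': '1', 'b': True, 'c': '2'}
```

In this toy model, checking for an exact type match first keeps the strings intact, while the int 1 still falls through to the bool member because int is not one of the assumed members.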
@hanxiao
Member

hanxiao commented Jan 21, 2022

To be honest, this is an upstream issue; I don't think we can solve it on our side?

@numb3r3
Contributor Author

numb3r3 commented Jan 21, 2022

One possible quick fix is to disable the smart union in the pydantic model definition by typing the tag values as Any instead of '_StructValueType':

class PydanticDocument(BaseModel):
    ...
    # before: tags: Optional[Dict[str, '_StructValueType']]
    tags: Optional[Dict[str, Any]]  # requires `from typing import Any`

https://github.com/jina-ai/docarray/blob/main/docarray/document/pydantic_model.py#L30-L50

But I'm not sure whether this change would break other APIs or functions.
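
The effect of typing the values as Any can be checked in isolation with a stand-in model. This is only a sketch: `TagsModel` is a hypothetical class with a single field, not docarray's PydanticDocument, and the `= None` default is an assumption.

```python
from typing import Any, Dict, Optional

from pydantic import BaseModel


class TagsModel(BaseModel):
    # Hypothetical stand-in for the tags field of PydanticDocument.
    # Any tells pydantic to accept each value without coercing it.
    tags: Optional[Dict[str, Any]] = None


model = TagsModel(tags={'a': '1', 'b': 1, 'c': '2'})
print(model.tags)  # {'a': '1', 'b': 1, 'c': '2'}
```

The trade-off is that Any also accepts values the original union would have rejected, so any validation of tag values would have to happen elsewhere.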

@numb3r3
Contributor Author

numb3r3 commented Jan 21, 2022

With PR #73, we get the following result:

origin:
{'a': '1', 'b': 1, 'c': '2'}
pydantic: 
{'a': '1', 'b': 1, 'c': '2'}
protobuf: 
{'b': 1.0, 'c': '2', 'a': '1'}
jsonschema
{'b': 1.0, 'c': '2', 'a': '1'}
