fix: make fields optional in Pydantic models for better flexibility #74
Pull Request Overview
This pull request enhances the robustness of Pydantic data models across three source model files by making numerous fields optional. This change allows the models to gracefully handle incomplete or missing data from external APIs (HAL, OAPEN, and TED), preventing validation errors when fields are absent.
- Made critical fields optional in HAL, OAPEN, and TED models to handle missing data
- Added proper type imports (`Optional`, `List`) for Python typing
- Set default values to `None` for optional fields to maintain backward compatibility
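
To illustrate the effect of the change, here is a minimal sketch contrasting a required field with an `Optional` field that defaults to `None`. The class names `StrictRecord` and `LenientRecord` and their fields are hypothetical and are not taken from the changed files.

```python
from typing import Optional

from pydantic import BaseModel, ValidationError


class StrictRecord(BaseModel):
    # Hypothetical "before" model: both fields are required.
    title: str
    language: str


class LenientRecord(BaseModel):
    # Hypothetical "after" model: fields are optional and default to None.
    title: Optional[str] = None
    language: Optional[str] = None


# A payload with a missing "language" key, as an external API might return.
payload = {"title": "Example"}

try:
    StrictRecord(**payload)
    strict_ok = True
except ValidationError:
    # The required "language" field is absent, so validation fails.
    strict_ok = False

# The lenient model accepts the same payload and fills in None.
lenient = LenientRecord(**payload)
```

The explicit `= None` default matters: in Pydantic v2, a field annotated `Optional[str]` without a default still has to be present in the input.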
Reviewed Changes
Copilot reviewed 3 out of 3 changed files in this pull request and generated 10 comments.
| File | Description |
|---|---|
| welearn_datastack/data/source_models/ted.py | Added Optional typing and List imports; made nested model fields optional throughout TED data structures |
| welearn_datastack/data/source_models/oapen.py | Converted most fields in CheckSum, Bitstream, Metadatum, and OapenModel classes to optional with None defaults |
| welearn_datastack/data/source_models/hal.py | Added Optional import; made author, language, document type, and date fields optional in Doc and HALModel classes |
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
This pull request makes the data models in `hal.py`, `oapen.py`, and `ted.py` more robust by allowing many fields to be optional. This improves compatibility with incomplete or missing data from external sources and prevents validation errors when fields are absent.

HAL model improvements:
- Made several fields in the `Doc` class `Optional`, including `authFullName_s`, `language_s`, `docType_s`, `producedDate_tdate`, and `publicationDate_tdate`, to handle missing data gracefully.
- Made the `nextCursorMark` field in the `HALModel` class optional.

OAPEN model improvements:
- Made most fields in the `CheckSum`, `Bitstream`, `Metadatum`, and `OapenModel` classes `Optional`, ensuring the model can handle absent or incomplete fields from the OAPEN data source.

TED model improvements:
- Added `Optional` typing to many fields in `Paragraph`, `Translation`, `TEDData`, and `TEDModel` to support cases where data may be missing. Also imported `Optional` and `List` for proper type hinting.
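
One consequence for calling code is that any of these fields may now be `None`. The sketch below shows how that could look for the HAL model. The field names come from the description above, but the field types, the `docs` attribute on `HALModel`, and the `author_line` helper are assumptions made for illustration and may not match `hal.py`.

```python
from typing import List, Optional

from pydantic import BaseModel


class Doc(BaseModel):
    # Field names are from the PR description; the types are assumed.
    authFullName_s: Optional[List[str]] = None
    language_s: Optional[str] = None
    docType_s: Optional[str] = None
    producedDate_tdate: Optional[str] = None
    publicationDate_tdate: Optional[str] = None


class HALModel(BaseModel):
    # "docs" is a hypothetical container field for this sketch.
    docs: List[Doc]
    nextCursorMark: Optional[str] = None


def author_line(doc: Doc) -> str:
    """Hypothetical helper: format authors, tolerating a missing list."""
    return ", ".join(doc.authFullName_s or []) or "Unknown author"


# A partial response: no cursor mark, and each document lacks some fields.
page = HALModel(
    **{"docs": [{"docType_s": "ART"}, {"authFullName_s": ["A. Author"]}]}
)
```

With these definitions, `page.nextCursorMark` is `None` and `author_line(page.docs[0])` falls back to the placeholder string instead of raising an error.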