Conversation

Collaborator

@lpi-tn lpi-tn commented Nov 21, 2025

This pull request updates the Pydantic data models for PressBooks and TED sources to improve handling of optional and required fields. The changes make the models more robust by accurately reflecting which fields may be missing in the source data, and ensure required fields are enforced where appropriate.

PressBooks model improvements

  • Made slug and type_ fields in EditorItem and AuthorItem optional to handle cases where these fields may be absent in the source data.
  • Updated the Publisher model to make type_, name, and address fields optional, allowing for incomplete publisher information.
  • Changed editor and author fields in PressBooksMetadataModel to be optional lists, supporting metadata records that may not have editors or authors.
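As a rough illustration of the PressBooks changes listed above, the models might look like the sketch below. Only the class and field names come from this description; the field types (plain strings, including `address`) and the `= None` defaults are assumptions, and the real models in `pressbooks.py` likely carry more fields and possibly aliases.

```python
from typing import List, Optional

from pydantic import BaseModel


class EditorItem(BaseModel):
    # Defaulting to None lets records without these keys still validate.
    slug: Optional[str] = None
    type_: Optional[str] = None


class AuthorItem(BaseModel):
    slug: Optional[str] = None
    type_: Optional[str] = None


class Publisher(BaseModel):
    # Field types are assumed; the real address may be a nested model.
    type_: Optional[str] = None
    name: Optional[str] = None
    address: Optional[str] = None


class PressBooksMetadataModel(BaseModel):
    # Either list may be absent from a metadata record.
    editor: Optional[List[EditorItem]] = None
    author: Optional[List[AuthorItem]] = None


# A sparse record: no editors at all, and one author without a slug.
record = PressBooksMetadataModel(author=[{"type_": "Person"}])
empty_publisher = Publisher()
```

With this shape, downstream code has to check for `None` before iterating over `record.editor` or reading `record.author[0].slug`.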

TED model improvements

  • Removed unnecessary import of Optional since all fields are now required.
  • Made cues in Paragraph, paragraphs in Translation, and video, translation, and data in TEDData and TEDModel required fields, enforcing stricter data integrity for TED models.
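A minimal sketch of what the stricter TED models imply, assuming `video` and `translation` live on `TEDData` and `data` lives on `TEDModel`. The `Cue` and `Video` classes and their fields are hypothetical stand-ins, since this description does not show them, and `is_valid` is a helper introduced only for the example.

```python
from typing import List

from pydantic import BaseModel, ValidationError


class Cue(BaseModel):  # hypothetical stand-in
    text: str


class Paragraph(BaseModel):
    cues: List[Cue]  # required: no Optional, no default


class Translation(BaseModel):
    paragraphs: List[Paragraph]


class Video(BaseModel):  # hypothetical stand-in
    title: str


class TEDData(BaseModel):
    video: Video
    translation: Translation


class TEDModel(BaseModel):
    data: TEDData


def is_valid(payload: dict) -> bool:
    """Return True if the payload validates against TEDModel."""
    try:
        TEDModel(**payload)
    except ValidationError:
        return False
    return True


complete = {
    "data": {
        "video": {"title": "Example talk"},
        "translation": {"paragraphs": [{"cues": [{"text": "Hello"}]}]},
    }
}
# Same payload without the translation block, which is now required.
missing_translation = {"data": {"video": {"title": "Example talk"}}}
```

Under these assumptions, a payload lacking `translation` (or `data` altogether) fails at validation time instead of surfacing later as a `None` deep inside the pipeline.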

Contributor

Copilot AI left a comment

Pull request overview

This PR refactors Pydantic models for PressBooks and TED data sources to better distinguish between required and optional fields. The changes ensure that models accurately reflect the possibility of missing data in source systems while enforcing stricter validation where data is guaranteed to be present.

Key changes:

  • TED models now require previously optional nested data structures (cues, paragraphs, video, translation, data)
  • PressBooks models now allow optional values for metadata fields that may be absent (slug, type_, editor, author, publisher details)

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 3 comments.

File | Description
welearn_datastack/data/source_models/ted.py | Removes Optional from nested structures, making all TED data fields required, and removes the now-unused import
welearn_datastack/data/source_models/pressbooks.py | Adds Optional to metadata fields that may be missing in source data (slugs, types, editors, authors, publisher info)

@lpi-tn lpi-tn merged commit 0888293 into main Nov 21, 2025
4 checks passed
@lpi-tn lpi-tn deleted the Fix/pydantic-model branch November 21, 2025 14:17