Skip to content

Add Paratext metadata to project data files#931

Open
Enkidu93 wants to merge 3 commits intomainfrom
add_paratext_metadata
Open

Add Paratext metadata to project data files#931
Enkidu93 wants to merge 3 commits intomainfrom
add_paratext_metadata

Conversation

@Enkidu93
Copy link
Copy Markdown
Collaborator

@Enkidu93 Enkidu93 commented Apr 24, 2026

Fixes #902.

Is there other data we'd like to put in here? Language code? Translation type (e.g. BackTranslation)?

I wasn't sure @ddaspit if you wanted a generic metadata field. I can do that if you prefer.

I had to avoid a naming collision with the type Guid thus the slightly verbose property names.


This change is Reviewable

@Enkidu93 Enkidu93 requested a review from ddaspit April 24, 2026 16:53
@codecov-commenter
Copy link
Copy Markdown

codecov-commenter commented Apr 24, 2026

Codecov Report

❌ Patch coverage is 13.04348% with 40 lines in your changes missing coverage. Please review.
✅ Project coverage is 66.16%. Comparing base (fc27e6f) to head (58d3cc2).

Files with missing lines Patch % Lines
...l/src/Serval.DataFiles/Services/DataFileService.cs 12.50% 33 Missing and 2 partials ⚠️
...al/src/Serval.DataFiles/Models/ParatextMetadata.cs 0.00% 5 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main     #931      +/-   ##
==========================================
- Coverage   66.23%   66.16%   -0.08%     
==========================================
  Files         367      368       +1     
  Lines       21540    21584      +44     
  Branches     2742     2744       +2     
==========================================
+ Hits        14268    14282      +14     
- Misses       6267     6293      +26     
- Partials     1005     1009       +4     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.
  • 📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

Copy link
Copy Markdown
Contributor

@ddaspit ddaspit left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

:lgtm:

@ddaspit reviewed 3 files and all commit messages, and made 1 comment.
Reviewable status: :shipit: complete! all files reviewed, all discussions resolved (waiting on Enkidu93).

@Enkidu93 Enkidu93 force-pushed the add_paratext_metadata branch from c3d9619 to 84f1ebe Compare April 28, 2026 06:31
@Enkidu93
Copy link
Copy Markdown
Collaborator Author

Enkidu93 commented Apr 28, 2026

:lgtm:

@ddaspit reviewed 3 files and all commit messages, and made 1 comment.
Reviewable status: :shipit: complete! all files reviewed, all discussions resolved (waiting on Enkidu93).

Thanks for the review 😁! Just making sure you saw my question above: While we're at it, is there any other Paratext information that would be helpful to stow here?

Copy link
Copy Markdown
Contributor

@ddaspit ddaspit left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I missed your questions in the description. I responded to them in the comments.

@ddaspit made 3 comments.
Reviewable status: all files reviewed, 2 unresolved discussions (waiting on Enkidu93).


src/Serval/src/Serval.DataFiles/Models/DataFile.cs line 11 at r1 (raw file):

    public string Filename { get; init; } = "";
    public required FileFormat Format { get; init; }
    public ParatextMetadata? ParatextMetadata { get; init; }

I would like this property to be extensible for different formats, but I also like the type-safety of a statically typed class. Since we only support metadata for the Paratext format, we can just use the ParatextMetadata as the type but rename the field to FileMetadata. In the future, if we want to support different metadata for other formats, we can create a custom serializer that deserializes to the correct metadata class based on the value of the Format field.


src/Serval/src/Serval.DataFiles/Models/ParatextMetadata.cs line 3 at r1 (raw file):

namespace Serval.DataFiles.Models;

public record ParatextMetadata

I think it would make sense to include the language code, translation type, and versification. I think that would be the most useful metadata. What do you think?

@pmachapman
Copy link
Copy Markdown
Collaborator

@Enkidu93 You may want to consider including the FullName and perhaps the Visibility? Some projects have a Visibility of Confidential, and in it can be helpful to know that we have to deal quite sensitively with that project (on SF we store Visibility in the sf_project Mongo document for that reason).

Copy link
Copy Markdown
Collaborator Author

@Enkidu93 Enkidu93 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, Peter! I went ahead and added those. The visibility will require a new Machine version. See sillsdev/machine#412.

I also added a test.

@Enkidu93 made 3 comments.
Reviewable status: 0 of 8 files reviewed, 2 unresolved discussions (waiting on ddaspit).


src/Serval/src/Serval.DataFiles/Models/DataFile.cs line 11 at r1 (raw file):

Previously, ddaspit (Damien Daspit) wrote…

I would like this property to be extensible for different formats, but I also like the type-safety of a statically typed class. Since we only support metadata for the Paratext format, we can just use the ParatextMetadata as the type but rename the field to FileMetadata. In the future, if we want to support different metadata for other formats, we can create a custom serializer that deserializes to the correct metadata class based on the value of the Format field.

Done. Let me know if this is what you wanted.


src/Serval/src/Serval.DataFiles/Models/ParatextMetadata.cs line 3 at r1 (raw file):

Previously, ddaspit (Damien Daspit) wrote…

I think it would make sense to include the language code, translation type, and versification. I think that would be the most useful metadata. What do you think?

I went ahead added the fields you suggested as well as those Peter mentioned.

Copy link
Copy Markdown
Contributor

@ddaspit ddaspit left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

:lgtm:

@ddaspit reviewed 8 files and all commit messages, made 1 comment, and resolved 2 discussions.
Reviewable status: :shipit: complete! all files reviewed, all discussions resolved (waiting on Enkidu93).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Consider logging/storing basic Paratext project information

4 participants