Add genre information to metadata of plays #120
Comments
Update: GerDraCor was fully marked up with basic genre information as described above (see dracor-org/gerdracor@9c2fcf5). So we could test the new function with this corpus first… |
@lehkost How do we deal with corpora that have Regarding the |
Speaking of which, RusDraCor will be updated with proper genre info in the next hour. 🙃 And I agree to your suggestion how to handle |
@lehkost Implemented in v0.75.0 deployed on staging. Please test! |
Works well for GerDraCor (metadata loaded into LibreOffice Calc). The only thing I would change is the handling of Libretto. Assigning "true" is fine, but every play without a "Libretto" indication should be "false", including plays that have no other genre information. Reason: if "Libretto" is not marked up, the play is not a libretto. |
I just added "Tragicomedy" to GerDraCor where appropriate, like this: <textClass>
<keywords>
<term type="genreTitle">Tragicomedy</term>
</keywords>
<classCode scheme="http://www.wikidata.org/entity/">Q192881</classCode>
</textClass> |
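A minimal sketch of how the markup above could be read back, assuming the fragment is parsed on its own and without a namespace, as it is shown in the comment. Real TEI files usually declare the TEI namespace, so the element paths would need adjusting there. The function names `class_codes` and `genre_titles` are hypothetical and not taken from the DraCor API.

```python
import xml.etree.ElementTree as ET

# The fragment quoted in the comment above, without a TEI namespace.
FRAGMENT = """
<textClass>
  <keywords>
    <term type="genreTitle">Tragicomedy</term>
  </keywords>
  <classCode scheme="http://www.wikidata.org/entity/">Q192881</classCode>
</textClass>
"""

WIKIDATA_SCHEME = "http://www.wikidata.org/entity/"


def class_codes(text_class):
    """Return the Wikidata IDs of all classCode children that use the Wikidata scheme."""
    return [
        (code.text or "").strip()
        for code in text_class.findall("classCode")
        if code.get("scheme") == WIKIDATA_SCHEME
    ]


def genre_titles(text_class):
    """Return the text of all <term type="genreTitle"> elements under <keywords>."""
    return [
        (term.text or "").strip()
        for term in text_class.findall("keywords/term[@type='genreTitle']")
    ]


text_class = ET.fromstring(FRAGMENT)
print(class_codes(text_class))
print(genre_titles(text_class))
```

Keeping the free-text `genreTitle` terms separate from the class codes means the human-readable label and the machine-readable identifier can be handled independently.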
Is the idea that the repertoire of accepted text classes will grow over time, or will it even be completely open to additions? This has implications for how the libretto flag is implemented. I assumed that for now |
Ideally, the text classes (genres) will grow over time. Tagging genre for such diverse corpora in a TEI document, though, is not the best solution. We might build on a drama genre ontology in the future, but there is no good candidate yet. For now I would say we have two levels of genre information: 1. libretto or not; 2. tragedy, comedy or tragicomedy, if applicable. The latter group (2.) might grow in the near future; the first group (1.) probably won't, because it only distinguishes libretti from dramas not written for music. I hope that makes sense? |
The libretto flag is now always false unless the libretto class code is among the text classes. See #120 (comment) and #120 (comment)
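The rule described here can be sketched roughly as follows. This is not the actual API implementation, only an illustration under stated assumptions: `Q192881` (Tragicomedy) is taken from the GerDraCor example in this thread, while the IDs used for libretto, comedy and tragedy are assumptions that should be checked against the corpus markup. The function name and dictionary keys are hypothetical.

```python
# Wikidata IDs used as class codes. Q192881 (Tragicomedy) appears in this thread;
# the other IDs are assumptions and should be verified against the corpora.
LIBRETTO_CODE = "Q131084"  # assumed ID for "libretto"
GENRE_CODES = {
    "Q40831": "Comedy",        # assumed ID
    "Q80930": "Tragedy",       # assumed ID
    "Q192881": "Tragicomedy",  # from the GerDraCor example
}


def genre_metadata(class_codes):
    """Derive a genre label and a libretto flag from a play's class codes.

    The flag is False whenever the libretto code is absent, including the
    case where the play has no genre information at all.
    """
    codes = set(class_codes)
    genre = next(
        (label for qid, label in GENRE_CODES.items() if qid in codes),
        None,
    )
    return {"genre": genre, "libretto": LIBRETTO_CODE in codes}


print(genre_metadata([]))                         # no markup at all
print(genre_metadata(["Q192881"]))                # tragicomedy, not a libretto
print(genre_metadata([LIBRETTO_CODE, "Q40831"]))  # libretto that is also a comedy
```

Because the genre label and the libretto flag are computed independently, a play can be both a libretto and a comedy, and a play without any class code still gets an explicit `False`.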
So for libretti there will never be a chance to distinguish between e.g. dramma giocoso, tragédie lyrique, Singspiel etc.? That looks like an unnecessary restriction to me. In fact, why would we need to couple genre attribution and identification of libretti at all? PR #122 tries to somewhat loosen this coupling while still following the main idea of genre attribution proposed above. |
Good points! Ideally, we would like to store all the genre information we can gather on each play, including the libretto subgenres you mention. The way we have started to implement genre markup doesn't prevent us from differentiating genres further in the future. On a side note, some examples from the emerging French Drama Corpus: <textClass>
<keywords>
<term type="genreTitle">Tragédie</term>
<term type="genreTitle">vers</term>
</keywords>
</textClass> <textClass>
<keywords>
<term type="genreTitle">Comédie</term>
<term type="genreTitle">prose</term>
</keywords>
</textClass> <textClass>
<keywords>
<term type="genreTitle">Monologue</term>
<term type="genreTitle">vers</term>
</keywords>
</textClass> <textClass>
<keywords>
<term type="genreTitle">Proverbe</term>
<term type="genreTitle">prose</term>
</keywords>
</textClass> Here, too, Tragedy, Comedy, Monologue and Proverb are not on the same level of genre attribution, just as "prose" and "verse" are a different (and much more formal) kind of genre description. We should probably try not to lose any of the information we inherit from other sources (and should try to add rich genre markup for the sources that don't have any). |
I would suggest that the transformation script for dracor-org/fredracor adds the appropriate |
After deciding on a mark-up strategy for genre declaration (dracor-org/dracor-schema#3), we are ready to add this info to the metadata of plays in a unified manner for all corpora. For the time being we only mark genre very coarsely on two levels:
1. comedy or tragedy
2. libretto or not
The API should add two columns to the metadata files:
1. one for the genre (Comedy, or Tragedy, or empty if no information is available).
2. one for the libretto flag (1 for yes, this is a libretto, and 0 for no, this is not a libretto).
The background for this feature is:
The German corpus already has some files with corresponding markup to test this feature once it's implemented (see dracor-org/gerdracor@98b713b).
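To make the two proposed columns concrete, here is a small sketch of how such metadata rows might be serialised to CSV. The column names `name`, `genre` and `libretto` and the play identifiers are hypothetical; only the value conventions (Comedy, Tragedy or empty; 1 or 0) follow the description above.

```python
import csv
import io

# Hypothetical column names; the issue only specifies the allowed values.
FIELDNAMES = ["name", "genre", "libretto"]


def metadata_row(name, genre, is_libretto):
    """Build one metadata row: an empty genre if unknown, 1/0 for the libretto flag."""
    return {
        "name": name,
        "genre": genre or "",
        "libretto": 1 if is_libretto else 0,
    }


def to_csv(rows):
    """Serialise metadata rows to a CSV string with a header line."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=FIELDNAMES, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


rows = [
    metadata_row("play-a", "Comedy", False),  # hypothetical play identifiers
    metadata_row("play-b", None, True),
]
print(to_csv(rows))
```

An empty cell for unknown genre keeps "no information" distinct from any actual genre value, while the libretto column is always filled.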