Adapt SHAWI data management workflows to include audio files #10
Comments
We can generate the labels in text format:
We can use this with Audacity 2 or 3. Audacity 3 moved away from any (visible) XML.
IMO we should go for the CSV/TSV format, which seems to take less effort to create.
General Workflow:
TEI > Audacity labels conversion
The TSV format is described here: https://manual.audacityteam.org/man/importing_and_exporting_labels.html
This should be generated by taking all
<timeline unit="ms">
<when xml:id="T0"/>
…
<when interval="197124" since="#T0" xml:id="T19"/>
<when interval="197256" since="#T0" xml:id="T20"/>
…
</timeline>
…
<annotationBlock>
<u xml:lang="ar-acm-x-shawi-vicav" xml:id="URFA-034_a20" who="#default" end="#T20" start="#T0">
…
</u>
</annotationBlock>
Instead of having the speaker name as the label, we should use the utterance's xml:id, so the exported audio snippet can be named after the utterance id.
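A minimal sketch of the conversion described above, in Python. It assumes that every `<when>` offset is given in milliseconds, that each `since` target appears earlier in the timeline, and that an Audacity label line is `start<TAB>end<TAB>label` with times in seconds. The function name `tei_to_audacity_labels` and the namespace-agnostic tag matching are illustrative choices, not part of the existing SHAWI tooling.

```python
import xml.etree.ElementTree as ET

# The xml: prefix is predefined, so xml:id is always exposed under this key.
XML_ID = "{http://www.w3.org/XML/1998/namespace}id"


def local_name(tag: str) -> str:
    """Drop a '{namespace}' prefix so the code works with or without the TEI namespace."""
    return tag.rsplit("}", 1)[-1]


def tei_to_audacity_labels(tei_xml: str) -> str:
    """Return Audacity label-track text, one line per <u>, labelled by xml:id."""
    root = ET.fromstring(tei_xml)

    # Map '#T19' -> offset in seconds (assumption: intervals are milliseconds).
    offsets = {}
    for el in root.iter():
        if local_name(el.tag) != "when":
            continue
        base = offsets.get(el.get("since"), 0.0)  # anchor point has no @since
        offsets["#" + el.get(XML_ID)] = base + int(el.get("interval", "0")) / 1000.0

    lines = []
    for el in root.iter():
        if local_name(el.tag) != "u":
            continue
        start = offsets[el.get("start")]
        end = offsets[el.get("end")]
        # Use the utterance id as label so exported snippets can be named after it.
        lines.append(f"{start:.6f}\t{end:.6f}\t{el.get(XML_ID)}")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Hypothetical, shortened sample modelled on the snippet in this thread.
    sample = (
        '<TEI><timeline unit="ms"><when xml:id="T0"/>'
        '<when interval="197124" since="#T0" xml:id="T19"/>'
        '<when interval="197256" since="#T0" xml:id="T20"/></timeline>'
        '<annotationBlock><u xml:id="URFA-034_a20" who="#default" '
        'start="#T19" end="#T20"/></annotationBlock></TEI>'
    )
    print(tei_to_audacity_labels(sample), end="")
```

The resulting text can be saved to a `.txt` file and loaded through Audacity's label import, as described in the manual page linked above.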
For some reason, the xml:id is missing from the @url attribute, e.g. https://github.com/acdh-oeaw/shawi-data/blob/main/010_manannot/Urfa-097_Three_Daughters-Harran-2010.xml#L210
As I said, I only inserted the two lines with the media tag. I guess the link to some data is missing.
I guess this issue can be closed, @dasch124?
We have wav files and we have timestamps in TEI XML.
We need a way to cut the wav files and also probably to export them as mp4.
One way to do this is to transform the TEI files to Audacity 2 project files, which also happen to be XML files.
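As an alternative to going through Audacity, the cutting step could also be scripted directly. The sketch below uses only Python's standard `wave` module and assumes uncompressed PCM wav input; the function names `cut_wav` and `cut_by_labels` are hypothetical, and the label text is assumed to be in the `start<TAB>end<TAB>label` form with times in seconds. Export to mp4 is not covered here and would need an external tool.

```python
import os
import wave


def cut_wav(src_path: str, dst_path: str, start_s: float, end_s: float) -> None:
    """Copy the frames between start_s and end_s (in seconds) into a new wav file."""
    with wave.open(src_path, "rb") as src:
        rate = src.getframerate()
        total = src.getnframes()
        # Convert seconds to frame indices and clamp to the file length.
        first = min(int(round(start_s * rate)), total)
        last = min(int(round(end_s * rate)), total)
        src.setpos(first)
        frames = src.readframes(max(last - first, 0))
        params = src.getparams()
    with wave.open(dst_path, "wb") as dst:
        dst.setparams(params)    # same channels, sample width and rate
        dst.writeframes(frames)  # the frame count in the header is fixed up on close


def cut_by_labels(src_path: str, labels_tsv: str, out_dir: str) -> list:
    """Write one snippet per label line, named after the label (e.g. an utterance id)."""
    os.makedirs(out_dir, exist_ok=True)
    written = []
    for line in labels_tsv.splitlines():
        if not line.strip():
            continue
        start, end, label = line.split("\t", 2)
        dst = os.path.join(out_dir, label + ".wav")
        cut_wav(src_path, dst, float(start), float(end))
        written.append(dst)
    return written
```

Naming each snippet after its label keeps the link between the audio file and the utterance in the TEI document.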