Metadata

Kalinda Pride edited this page Apr 25, 2021 · 15 revisions

What metadata does LingView display?

The website can show metadata for each story, including title, author, description, and other fields. On this page, "metadata" refers to all the information about an ELAN or FLEx document other than the transcription and annotations themselves.

Want more metadata than what's listed here? Contact kalinda.pride@gmail.com to request it as a feature.

From ELAN

If you input these pieces of information into ELAN, they will be automatically displayed by LingView after processing the ELAN file:

  • Name for each tier (e.g. "Spanish translation", "speaker 1 transcription", "gloss", "part of speech").
  • Language for each tier (often an ISO code).
  • Speaker name for each tier.
  • The display order for tiers. ELAN stores this information in a .pfsx file with the same name as the .eaf file. If you omit the .pfsx file from LingView's elan_files directory, LingView will use a default order instead.
  • Hidden tiers, which are completely omitted from the LingView output. ELAN stores this information in a .pfsx file with the same name as the .eaf file. If you omit the .pfsx file from LingView's elan_files directory, LingView won't omit any tiers.
  • The filename (shortened) becomes the default document title.
  • The URI of this ELAN file, for example "1ed3d641-acd9-4466-811d-17c8ed59844c", becomes its unique ID within LingView.
  • LingView can also use ELAN's info about associated media (audio or video) to help it know which files to display.
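Because a missing .pfsx file silently changes the tier order and un-hides tiers, it can help to check for missing companions before processing. The following is a minimal sketch, not part of LingView: the function name eafFilesWithoutPfsx is hypothetical, and it only assumes what is stated above, namely that the .pfsx file shares its name with the .eaf file.

```javascript
// Hypothetical helper (not part of LingView): given the filenames found in the
// elan_files directory, list every .eaf file that has no same-named .pfsx file.
function eafFilesWithoutPfsx(fileNames) {
  const present = new Set(fileNames);
  return fileNames
    .filter((name) => name.endsWith('.eaf'))
    .filter((eaf) => !present.has(eaf.slice(0, -'.eaf'.length) + '.pfsx'));
}

// Example with made-up filenames: only Intro.eaf lacks a .pfsx companion.
const missing = eafFilesWithoutPfsx(['Kuke_chiste.eaf', 'Kuke_chiste.pfsx', 'Intro.eaf']);
console.log(missing.join(', ')); // prints "Intro.eaf"
```

To run this against real data, you could pass in the result of Node's fs.readdirSync called on your elan_files directory.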

From FLEx

If you input these pieces of information into FLEx, they will be automatically displayed by LingView after processing the FLEx file:

  • A title in each of the document's languages. If none is specified, a shortened version of the filename will be used.
  • A source description in each of the document's languages.
  • Languages used in the document (often ISO codes).
  • Speaker name for each speaker.
  • The URI of this FLEx file, for example "1ed3d641-acd9-4466-811d-17c8ed59844c", becomes its unique ID within LingView.

From the edit.js script

Metadata can be added or changed using the edit.js script. To use this script, first open index.html in your browser, navigate to the story you want to edit, and locate the unique ID associated with that story. This is a 36-character string found at the end of the story URL. For example, for the URL https://brownclps.github.io/LingView/#/story/97b8ab3b-d2a5-428a-aa68-0aa304ba1c44, the ID is 97b8ab3b-d2a5-428a-aa68-0aa304ba1c44.
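If you would rather not copy the ID by hand, the extraction can be sketched in a few lines of JavaScript. This is an illustration only: storyIdFromUrl is a hypothetical helper, not something LingView provides, and it relies solely on the two facts given above (the ID is the last segment of the story URL and is 36 characters long).

```javascript
// Hypothetical helper (not part of LingView): pull the story ID out of a story URL.
function storyIdFromUrl(storyUrl) {
  // Drop surrounding whitespace and any trailing slashes.
  const trimmed = storyUrl.trim().replace(/\/+$/, '');
  // The ID is whatever follows the final "/".
  const candidate = trimmed.slice(trimmed.lastIndexOf('/') + 1);
  // Only accept a 36-character string; anything else is not a story ID.
  return candidate.length === 36 ? candidate : null;
}

const url = 'https://brownclps.github.io/LingView/#/story/97b8ab3b-d2a5-428a-aa68-0aa304ba1c44';
console.log(storyIdFromUrl(url)); // prints "97b8ab3b-d2a5-428a-aa68-0aa304ba1c44"
```

The function returns null for URLs that do not end in a 36-character segment, such as the site's home page.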

Now that you know the story ID, it's time to run the edit.js script. If you have already downloaded and installed LingView on your computer, we recommend running the script in your terminal. If you don't want to install LingView, use the Edit Story Metadata GitHub Action instead.

In your terminal

With this unique ID copied to your clipboard, return to your terminal and run the following command: node preprocessing/edit.js unique_id, replacing unique_id with the 36-character ID. The script shows a multiple-choice question that lets you choose which category to edit (for example title, description, or genre), then asks you to confirm your changes. After editing the file, run npm run quick-build so that the changes appear on the LingView site.

With the Edit Story Metadata GitHub Action

The “Edit Story Metadata” GitHub Action lets you run the edit.js script without cloning the repo or installing any of the node packages.

  1. Go to the main page of the site’s GitHub repo, then click the “Actions” tab at the top.
  2. Select “Edit Story Metadata” in the left panel.
  3. Click the “Run workflow” button on the right.
  4. After the workflow is triggered, you should see a new entry for the “Edit Story Metadata” action with a yellow circle next to it. The yellow circle means this workflow is running. Click on the entry “Edit Story Metadata”. You will land on a new page and should see a yellow circle next to the “edit-data” step on the left. Click on this entry.
  5. On the log screen that shows up after you click on “edit-data”, wait until the “Setup Debug Session” step starts running. Once it does, wait for a minute or so, and then you should see a URL listed after the line “To connect to this session copy-n-paste the following into a terminal”.
  6. Copy the ssh command that is listed here. It should look like ssh <long string>@<some string>.tmate.io. Open the terminal on your laptop, paste the command, and hit Enter. This should create an ssh connection to the GitHub Action that is currently running. If the terminal asks whether you want to trust this connection, type “yes” or “y”.
  7. In your terminal, you will see an intro page for tmate. Press Ctrl-C to quit the intro page, and you should see a terminal that is similar to a regular Linux terminal. If you type the command ls, you should see all the folders and files of the repo in which the GitHub Action is running.
  8. You can run the edit.js script as described in the previous section, using a story’s unique ID to edit its metadata.
  9. Note: the ssh connection has a default timeout of 15 minutes. If you think you need more time, you can run touch /tmp/keepalive after connecting to keep the ssh connection alive for longer.
  10. After you have edited the metadata files, run the usual git commands to commit and push the changes to the master branch of your repo. You might have to use git config --global to set your email and username before committing.
  11. Return to the GitHub Action workflow, which should still be running at this point. Cancel this workflow, which will terminate the ssh connection. You should be able to see the metadata edits in the repo now.
  12. If the Build & Deploy action doesn't run on its own, trigger it manually so that the metadata changes are reflected on the website. You can do this by adding or removing an extra space at the end of the README.md file.

Media files

LingView uses a mix of strategies to know which media files to display for each ELAN or FLEx file. Here's what LingView does each time you rebuild it:

  1. Try to use the same media file(s) as the last time LingView was rebuilt. If you run the edit.js script, there's an option to change what media files are used in this step. If that doesn't work, keep looking...
  2. Try to use the media file name(s) that are referenced by the ELAN or FLEx file. To see which media files are referenced by an ELAN file, you can open it in ELAN and then go to Edit > Linked Files... If a media file is referenced, but LingView can't find it in the data/media_files directory, keep looking...
  3. Look for media file(s) with the same filename as the ELAN or FLEx file. For example, if the ELAN file is named Kuke_chiste.eaf, look in data/media_files for a file named Kuke_chiste.mp3. LingView usually tries multiple variants of the filename. For example, if LingView is looking for an audio file, and the ELAN file is named Kuke_chiste.new.eaf, LingView will try Kuke_chiste.new.mp3, Kuke_chiste.new.wav, Kuke_chiste.mp3, and Kuke_chiste.wav.
  4. Give up. Print a message saying that LingView failed to find the media files for this text.
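
The lookup order above can be sketched as a chain of fallbacks. This is a rough illustration under stated assumptions, not LingView's actual code: the names candidateMediaNames and resolveMedia, the shape of the options object, and the warning text are all hypothetical. The sketch also assumes that a step "doesn't work" when none of its files exist in data/media_files, that step 3 stops at the first matching variant, and that filename variants are built by progressively dropping trailing dot-separated parts of the name, which reproduces the Kuke_chiste.new.eaf example.

```javascript
// Hypothetical sketch of the media lookup order; not LingView's actual code.

// Build candidate media filenames for a transcript such as "Kuke_chiste.new.eaf":
// try the full base name first, then drop trailing dot-separated parts.
function candidateMediaNames(transcriptName, extensions) {
  const parts = transcriptName.split('.');
  parts.pop(); // drop the transcript's own extension (e.g. "eaf")
  const candidates = [];
  while (parts.length > 0) {
    const base = parts.join('.');
    for (const ext of extensions) {
      candidates.push(`${base}.${ext}`);
    }
    parts.pop(); // assumed rule: shorten the name one part at a time
  }
  return candidates;
}

// Walk through steps 1-4. `available` is a Set of filenames in data/media_files.
function resolveMedia({ previous, referenced, transcriptName, extensions, available }) {
  // Step 1: reuse the media files chosen at the last rebuild, if they still exist.
  const kept = previous.filter((name) => available.has(name));
  if (kept.length > 0) return { step: 1, files: kept };

  // Step 2: use the media files referenced inside the ELAN or FLEx file.
  const linked = referenced.filter((name) => available.has(name));
  if (linked.length > 0) return { step: 2, files: linked };

  // Step 3: guess from the transcript's filename; take the first variant found.
  const guess = candidateMediaNames(transcriptName, extensions)
    .find((name) => available.has(name));
  if (guess !== undefined) return { step: 3, files: [guess] };

  // Step 4: give up and report the failure.
  console.warn(`Could not find media files for ${transcriptName}`);
  return { step: 4, files: [] };
}

console.log(candidateMediaNames('Kuke_chiste.new.eaf', ['mp3', 'wav']).join(', '));
// prints "Kuke_chiste.new.mp3, Kuke_chiste.new.wav, Kuke_chiste.mp3, Kuke_chiste.wav"
```

For instance, calling resolveMedia with no previous files, a referenced file that is missing, and only Kuke_chiste.wav available would fall through to step 3 and return Kuke_chiste.wav.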