tumarkin edited this page Mar 6, 2018 · 3 revisions

yente accepts comma-separated and tab-separated files. Each file should have the following format:

  • A header row.
  • One column called "name", listing the names of the entities.
  • One column called "id", listing identifiers for the entities.
  • One optional column "group", listing group identifiers for each entry to be used in sub-group matching.
  • All text should consist of standard English characters (Unicode).
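
The header requirements above can be checked before handing a file to yente. The following is a minimal Python sketch, not part of yente itself; the function name `check_header` and the sample rows are hypothetical and serve only to illustrate the expected layout.

```python
import csv
import io

# Column names taken from the format description above.
REQUIRED = {"name", "id"}

def check_header(text, delimiter=","):
    """Return the sorted list of required columns missing from the header row.

    Hypothetical helper for illustration; yente performs its own checks.
    """
    reader = csv.reader(io.StringIO(text), delimiter=delimiter)
    header = next(reader, [])
    columns = {col.strip() for col in header}
    return sorted(REQUIRED - columns)

# Made-up sample with the optional "group" column included.
sample = "id,name,group\n1,Acme Corp,A\n2,Beta LLC,B\n"
missing = check_header(sample)
print("missing columns:", missing)
```

For a tab-separated file, pass `delimiter="\t"`.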

By default, yente attempts to convert all Unicode characters to simplified equivalents (e.g. é is converted to e). Should this fail, yente will output an error when processing the file, indicating the line where an illegal character was found. To resolve the issue, edit that line and rerun the program.
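
To locate problem lines ahead of time, the conversion can be approximated outside yente. The sketch below assumes that "simplified equivalent" means dropping accents via Unicode decomposition; this is an assumption for illustration and may differ from what yente actually does. The names `simplify` and `first_bad_line` are hypothetical.

```python
import unicodedata

def simplify(text):
    """Strip combining marks after NFKD decomposition (e.g. é becomes e).

    An approximation only; yente's own conversion may behave differently.
    """
    decomposed = unicodedata.normalize("NFKD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))

def first_bad_line(lines):
    """Return the 1-based number of the first line that still contains
    non-ASCII characters after simplification, or None if all lines are clean."""
    for number, line in enumerate(lines, start=1):
        if not simplify(line).isascii():
            return number
    return None

# Made-up rows: the accent on line 2 simplifies cleanly, line 3 does not.
rows = ["name,id", "Café Noir,1", "日本商事,2"]
print(first_bad_line(rows))
```

`str.isascii` requires Python 3.7 or later.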