Hi Stefan
You are right, these entities were removed. We did so because they are so scarce in the training data. We first considered keeping OTHER, but then decided to keep things simple for HIPE 2022, given that all these datasets are already a bit nightmarish. Here are the stats that one can generate from the published data in WebAnno TSV format:
annotated_tsv $ cat *.tsv | grep -vP '^#' | cut -f 5| grep -Po '[A-Z]+' |sort |uniq -c | sort -rn
3470 LOC
891 BUILDING
406 STREET
5 OTHER
1 FICTION
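For anyone who would rather reproduce these counts without the shell pipeline, here is a minimal Python sketch of roughly the same steps: skip comment lines, take the fifth tab-separated column, and count every run of uppercase letters in it. The names `count_labels` and `column` are made up for this illustration and are not part of any HIPE or Flair tooling. It also skips lines with fewer than five fields, which `cut` handles slightly differently.

```python
import glob
import re
from collections import Counter
from typing import Iterable

# Same pattern as `grep -Po '[A-Z]+'`: every run of uppercase ASCII letters.
LABEL_RE = re.compile(r"[A-Z]+")


def count_labels(lines: Iterable[str], column: int = 4) -> Counter:
    """Count uppercase label tokens in one tab-separated column.

    `column` is zero-based, so 4 corresponds to `cut -f 5`.
    """
    counts: Counter = Counter()
    for line in lines:
        # Equivalent of `grep -vP '^#'`: drop header/comment lines.
        if line.startswith("#"):
            continue
        fields = line.rstrip("\n").split("\t")
        # Assumption: lines without enough columns carry no label.
        if len(fields) <= column:
            continue
        counts.update(LABEL_RE.findall(fields[column]))
    return counts


def read_lines(pattern: str) -> Iterable[str]:
    """Yield all lines of all files matching a glob pattern."""
    for path in sorted(glob.glob(pattern)):
        with open(path, encoding="utf-8") as handle:
            yield from handle


if __name__ == "__main__":
    # Run inside the directory holding the annotated .tsv files.
    for label, count in count_labels(read_lines("*.tsv")).most_common():
        print(f"{count:7d} {label}")
```

Like the shell version, this counts a label once per token line on which it appears, so the numbers are token-level occurrences rather than entity mentions.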
For the currently private data, the distribution is similar, and we will not add any further entity types.
Hi,
During the review of adding the HIPE-2022 dataset to Flair, we found that some of the listed entities do not exist in the actual dataset.
These entities are: ALIEN, OTHER, FICTION.
Could you please clarify what happened to these entities? Will they be added later, or will they appear in the final test dataset?
Many thanks,
Stefan