linkedtv/videocorpus

videocorpus - Resources for Named Entity Linking Performance Evaluation: Corpora Extracted from Transcripts

This project contains the following:

RBB150

RBB150 currently contains 150 documents collected from RBB transcripts.

It contains several folders:

  • tutorial - short GATE tutorial used by the annotators
  • guideline - annotation guideline
  • subs - raw subtitles / transcripts
  • ontology - original and enriched ontology (created by automatically adding subtypes to the original ontology)
  • gold - gold standards in various formats (CSV, NIF, etc.)
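
To see which gold-standard formats a local checkout actually provides, the folder layout above can be inspected with a short script. This is a minimal sketch: it assumes only that a `gold` folder exists at the repository root, as listed above. The function name `group_gold_files` is hypothetical, and nothing is assumed about file names or about the internal structure of the CSV or NIF files.

```python
from collections import defaultdict
from pathlib import Path


def group_gold_files(repo_root):
    """Group files under <repo_root>/gold by lowercase file extension.

    Hypothetical helper: it only assumes the `gold` folder named in the
    folder list, not any particular file names or file contents.
    """
    gold_dir = Path(repo_root) / "gold"
    if not gold_dir.is_dir():
        # No gold folder at this location: report nothing rather than fail.
        return {}

    by_format = defaultdict(list)
    for path in sorted(gold_dir.rglob("*")):
        if path.is_file():
            # "CSV" and "csv" are treated as the same format.
            ext = path.suffix.lower().lstrip(".") or "(no extension)"
            by_format[ext].append(path)
    return dict(by_format)


if __name__ == "__main__":
    # Run from the root of a local checkout of the repository.
    for ext, files in group_gold_files(".").items():
        print(f"{ext}: {len(files)} file(s)")
```

Run from the root of a checkout, the script prints one line per file extension found under `gold`, which is a quick way to pick the format to load for an evaluation.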

Regionality: content from Berlin and Brandenburg.

News topics: floods, traffic jams, immigration, sports and political events, and local administration.

Because the content is regional, the corpus frequently uses shortened names for entities and makes direct or indirect references to local events (e.g. elections) and historical events (e.g. anniversaries of the 1953 East German Uprising or of Kennedy's 1963 visit to Berlin).

LICENSE

This corpus is (c) 2015-2016, MODUL University Vienna

It is distributed under the CC-BY-NC-ND 3.0 license. Please refer to the LICENSE file for further information.

ACKNOWLEDGEMENTS

Transcripts were provided by Rundfunk Berlin-Brandenburg - RBB through the LinkedTV project.

The corpus belongs to MODUL University Vienna.

Annotations were created by anonymous annotators under the supervision of the development team.

Development was done by Adrian M.P. Brasoveanu under the supervision of Dr. Lyndon Nixon, Prof. Dr. Albert Weichselbraun, and Prof. DDr. Arno Scharl at MODUL University Vienna.

DISCLAIMER

This software, its code and documentation, is made available without guarantee of correctness or completeness. The software owner gives permission for the use of this software without liability.

This software, its code and documentation, is made available under a LICENSE. Redistributions of the software shall include the same LICENSE in the software package. Installations of the software should also refer to this LICENSE.
