LingConLab/data_oral_abaza_corpus

Spoken corpus of Abaza Data Repository

This repository hosts the curated data of the Spoken corpus of Abaza and offers a way to work with the corpus data locally, as an alternative to the online search platform. The data is stored in data_oral_abaza_corpus.csv, which has 4094 rows and 13 columns:

  • filename
  • time_start
  • time_end
  • speaker
  • sentence_id
  • text
  • translation
  • word_forms
  • morphonology
  • gloss
  • language
  • dataset_creator
  • dataset_provider
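
As an illustration of local access, the following is a minimal sketch of reading the file with Python's standard library. It assumes the CSV is comma-delimited, UTF-8 encoded, and has a header row with the column names listed above; none of this is stated in the repository description, so adjust as needed. The function names (`load_rows`, `load_corpus`, `sentences_per_speaker`) are hypothetical helpers, not part of the corpus tooling.

```python
import csv
from collections import Counter

# Column names as listed in this README.
EXPECTED_COLUMNS = [
    "filename", "time_start", "time_end", "speaker", "sentence_id",
    "text", "translation", "word_forms", "morphonology", "gloss",
    "language", "dataset_creator", "dataset_provider",
]


def load_rows(source):
    """Read rows from an open text stream and verify the header.

    Assumption: the file is comma-delimited with a header row.
    """
    reader = csv.DictReader(source)
    header = reader.fieldnames or []
    missing = [name for name in EXPECTED_COLUMNS if name not in header]
    if missing:
        raise ValueError(f"missing columns: {missing}")
    return list(reader)


def load_corpus(path="data_oral_abaza_corpus.csv"):
    """Open the CSV at `path` (assumed UTF-8) and return its rows as dicts."""
    with open(path, encoding="utf-8", newline="") as handle:
        return load_rows(handle)


def sentences_per_speaker(rows):
    """Count how many rows each value of the `speaker` column has."""
    return Counter(row["speaker"] for row in rows)
```

With the file in the working directory, `rows = load_corpus()` followed by `len(rows)` should give the row count stated above if the assumptions hold, and `sentences_per_speaker(rows)` gives a per-speaker tally.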

About corpus

The corpus contains oral texts in the Tapanta dialect of the Abaza language. The recordings were made during joint HSE University / RSUH expeditions to the village of Inzhich-Chukun in the Abazinsky district of the Karachay-Cherkess Republic in 2017-2019. Text analysis and glossing were done by participants in the research and study group “Aspects of Abaza Grammar” and in the RSF grant # 17-18-01184 “Communicative organization of natural discourse in spoken and signed languages.” The search function entered closed testing in December 2019.

How to cite the corpus and the data

If you use data from the Spoken corpus of Abaza in your research, please cite as follows:

Anastasia Panova, Anna Sorokina, Peter Arkadiev, Elena Sokur. Spoken corpus of Abaza. Moscow: School of Linguistics, HSE University; Linguistic Convergence Laboratory, HSE University. (Available online at: http://lingconlab.ru/spoken_abaza/, accessed on ...)

For questions about the corpus data, write to the address below or open an issue in this repository:

anastasia.b.panova@gmail.com (Anastasia Panova)

For questions about the search platform, write to the address below or open an issue in the platform's own repository:

elena.o.sokur@gmail.com (Elena Sokur)