The Spoken corpus of the dialects of Khakas Data Repository

This repository curates the data of the Spoken corpus of the dialects of Khakas and provides an alternative, local way to access the corpus data. The data is stored in data_oral_khakas_corpus.csv, which has 85107 rows and 14 columns:

filename
time_start
time_end
speaker
recorded
sentence_id
text
translation
word_forms
morphonology
gloss
language
dataset_creator
dataset_provider
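
To work with the file locally, it can be read with any CSV reader. The following is a minimal Python sketch, not part of the repository. It assumes the file is UTF-8 encoded, comma-delimited, and starts with a header row, none of which is stated above. The helper name load_corpus is hypothetical.

```python
import csv
from pathlib import Path

# Column names as listed in this README.
EXPECTED_COLUMNS = [
    "filename", "time_start", "time_end", "speaker", "recorded",
    "sentence_id", "text", "translation", "word_forms", "morphonology",
    "gloss", "language", "dataset_creator", "dataset_provider",
]


def load_corpus(path):
    """Read the corpus CSV into a list of dicts, one per row.

    Assumptions (not confirmed by the README): UTF-8 encoding,
    comma delimiter, header row present.
    """
    with open(path, newline="", encoding="utf-8") as handle:
        reader = csv.DictReader(handle)
        # Fail early if any documented column is absent from the header.
        missing = set(EXPECTED_COLUMNS) - set(reader.fieldnames or [])
        if missing:
            raise ValueError(f"missing columns: {sorted(missing)}")
        return list(reader)


if __name__ == "__main__":
    csv_path = Path("data_oral_khakas_corpus.csv")
    if csv_path.exists():
        rows = load_corpus(csv_path)
        print(len(rows), "rows loaded")
```

All values are returned as strings. Converting time_start and time_end to numbers is left to the reader, since their format is not described here.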

About the corpus

The Spoken corpus of the dialects of Khakas contains transcribed and annotated texts synchronized with the audio recordings. The texts were recorded in the 21st century from speakers born between 1916 and 1985, during several expeditions from Moscow to the Republic of Khakassia. All texts are translated into Russian. The texts were first analyzed with an automatic parser, then edited and synchronized with the audio using the ELAN software.

How to cite the corpus and the data

If you use data from the Spoken corpus of the dialects of Khakas in your research, please cite as follows:

Vera Maltseva, Elena Sokur. Spoken corpus of the dialects of Khakas. Moscow: Institute of Linguistics; Moscow: Linguistic Convergence Laboratory, NRU HSE. (Available online at http://lingconlab.ru/spoken_khakas/, accessed on ....)

For questions about the corpus data, write to the address below or open an issue in this repository:

malt.wh@gmail.com (Vera Maltseva)

For questions about the search platform, write to the address below or open an issue in the platform's own repository:

elena.o.sokur@gmail.com (Elena Sokur)
