Hierarchical annotation - line (phrase), syllable, phoneme annotations of the jingju (Beijing opera) a-cappella singing dataset
catalogue - dan.csv
catalogue - laosheng.csv


Jingju (Beijing opera) Phoneme Annotation


Authors: Rong Gong, Rafael Caro Repetto, Yile Yang, MTG-UPF, rong.gong@upf.edu, rafael.caro@upf.edu


This dataset is a collection of boundary annotations of a cappella singing performed by professional and amateur Beijing opera (jingju, 京剧) singers.

The boundaries are annotated hierarchically: line (phrase), syllable, and phoneme singing units are annotated on a jingju (Beijing opera) a cappella singing audio dataset.

The corresponding audio files are recordings of a cappella arias, in stereo or mono, sampled at 44.1 kHz and stored as wav files. Because of their large size, the audio files are not hosted here; they are available on Zenodo: http://doi.org/10.5281/zenodo.344932

The wav files were recorded by two institutes: files whose names end with ‘qm’ were recorded by C4DM, Queen Mary University of London; files whose names end with ‘upf’ or ‘lon’ were recorded by MTG-UPF. If you use this audio dataset in your work, please cite both of the following publications:

Rong Gong, Rafael Caro Repetto, & Yile Yang. (2017). Jingju a cappella singing dataset [Data set]. Zenodo. http://doi.org/10.5281/zenodo.344932

D. A. A. Black, M. Li, and M. Tian, “Automatic Identification of Emotional Cues in Chinese Opera Singing,” in 13th Int. Conf. on Music Perception and Cognition (ICMPC-2014), 2014, pp. 250–255.


Format: Praat TextGrid (see the official Praat page on TextGrid annotation)

Number of tiers: 5

Role types: dan, laosheng

Annotation units at the phoneme level

1. This table shows the annotation units used in the 'pinyin', 'dian', 'dianSilence' and 'details' tiers of each TextGrid.

2. Both Chinese pinyin and X-SAMPA formats are given.

3. The initials b, p, d, t, k, j, q, x, zh, ch, sh, z, c, s are grouped into a single representation, c, which is not a formal X-SAMPA symbol.

4. v, N, and J (X-SAMPA) are three special pronunciations that do not exist in pinyin.

| Structure | Category | Pinyin[X-SAMPA] |
|---|---|---|
| head | initials | m[m], f[f], n[n], l[l], g[k], h[x], r[r\'], y[j], w[w] |
| | | {b, p, d, t, k, j, q, x, zh, ch, sh, z, c, s} - group [c] |
| | | [v], [N], [J] - special pronunciations |
| medial | vowels | i[i], u[u], ü[y] |
| belly | simple finals | a[a"], o[O], e[7], ê[E], i[i], u[u], ü[y], i (zhi, chi, shi) [1], i (zi, ci, si) [M] |
| | compound finals | ai[aI^], ei[eI^], ao[AU^], ou[oU^] |
| | nasal finals | an[an], en[@n], in[in], ang[AN], eng[7N], ing[iN], ong[UN] |
| | retroflexed finals | er [@][r\'] |
| tail | | i[i], u[u], n[n], ng[N] |
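
The grouping of initials in the table above can be expressed as a small lookup. The following is a minimal sketch, not the authors' code: the names `GROUPED_INITIALS`, `INITIAL_TO_XSAMPA` and `initial_to_xsampa` are hypothetical, and only the symbol pairs themselves are taken from the table.

```python
# Initials that the annotation collapses into the single symbol "c"
# (per the table, "c" is not a formal X-SAMPA symbol).
GROUPED_INITIALS = {
    "b", "p", "d", "t", "k", "j", "q", "x",
    "zh", "ch", "sh", "z", "c", "s",
}

# Initials that keep an individual symbol, copied from the table.
INITIAL_TO_XSAMPA = {
    "m": "m", "f": "f", "n": "n", "l": "l", "g": "k",
    "h": "x", "r": "r\\'", "y": "j", "w": "w",
}


def initial_to_xsampa(initial):
    """Return the annotation symbol for a pinyin initial.

    Raises KeyError for strings that are not listed in the table.
    """
    if initial in GROUPED_INITIALS:
        return "c"
    return INITIAL_TO_XSAMPA[initial]
```

For example, `initial_to_xsampa("zh")` and `initial_to_xsampa("b")` both give `"c"`, while `initial_to_xsampa("g")` gives `"k"`.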

## Tier descriptions

* 1-line: line boundary, lyrics in Chinese characters
* 2-pinyin: written character (syllable, in pinyin) boundary, not including padding characters. Silence is annotated.
* 3-dian: written character (in pinyin) boundary, including padding characters. Silence is annotated.
* 4-dianSilence: written character (in pinyin) boundary. Silence is not annotated explicitly; it follows the previous dian syllable.
* 5-details: phoneme (in X-SAMPA) boundary
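
One way to read the relation between the 'dian' and 'dianSilence' tiers is that each silence interval is absorbed into the syllable before it. The sketch below illustrates that reading only; it is an interpretation, not the authors' procedure. The function name `merge_silence`, the `(start, end, label)` tuple layout, and the use of an empty label for silence are all assumptions.

```python
def merge_silence(intervals, silence_label=""):
    """Fold each silence interval into the preceding syllable interval.

    `intervals` is a list of (start, end, label) tuples in time order.
    The silence label is an assumption; pass whatever label the files use.
    A silence with no preceding syllable is kept unchanged.
    """
    merged = []
    for start, end, label in intervals:
        if label == silence_label and merged:
            # Extend the previous interval to cover the silence.
            prev_start, _, prev_label = merged[-1]
            merged[-1] = (prev_start, end, prev_label)
        else:
            merged.append((start, end, label))
    return merged


# Made-up timings, for illustration only.
dian = [(0.0, 0.5, "wo"), (0.5, 0.8, ""), (0.8, 1.2, "ni")]
dian_silence = merge_silence(dian)
```

With the made-up input above, the silence between 0.5 s and 0.8 s becomes part of the first syllable, which then spans 0.0 s to 0.8 s.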

## Usage

The annotation TextGrid files can be opened with Praat or with our parsing code; see this jupyter notebook for some parsing examples.
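
For readers who want to inspect a file without Praat, the following is a minimal sketch of a reader for Praat's long text TextGrid format. It is not the parsing code from this repository. It assumes interval tiers only, and the function name `parse_textgrid` and the sample content are made up for illustration.

```python
import re

# The first quoted name inside a tier block is the tier name.
TIER_NAME_RE = re.compile(r'name = "(?P<name>[^"]*)"')

# One interval entry: xmin, xmax, then the quoted text label.
# Praat writes a literal double quote inside a label as "".
INTERVAL_RE = re.compile(
    r'intervals \[\d+\]:\s*'
    r'xmin = (?P<xmin>[\d.eE+-]+)\s*'
    r'xmax = (?P<xmax>[\d.eE+-]+)\s*'
    r'text = "(?P<text>(?:[^"]|"")*)"'
)


def parse_textgrid(content):
    """Return {tier name: [(xmin, xmax, text), ...]} for interval tiers."""
    tiers = {}
    # Each tier starts at an 'item [n]:' header; drop the file header chunk.
    for chunk in re.split(r'item \[\d+\]:', content)[1:]:
        name_match = TIER_NAME_RE.search(chunk)
        if name_match is None:
            continue
        tiers[name_match["name"]] = [
            (float(m["xmin"]), float(m["xmax"]), m["text"].replace('""', '"'))
            for m in INTERVAL_RE.finditer(chunk)
        ]
    return tiers


# Made-up two-tier sample; the real files contain five tiers.
SAMPLE = '''File type = "ooTextFile"
Object class = "TextGrid"

xmin = 0
xmax = 1.5
tiers? <exists>
size = 2
item []:
    item [1]:
        class = "IntervalTier"
        name = "pinyin"
        xmin = 0
        xmax = 1.5
        intervals: size = 2
        intervals [1]:
            xmin = 0
            xmax = 0.5
            text = ""
        intervals [2]:
            xmin = 0.5
            xmax = 1.5
            text = "wo"
    item [2]:
        class = "IntervalTier"
        name = "details"
        xmin = 0
        xmax = 1.5
        intervals: size = 3
        intervals [1]:
            xmin = 0
            xmax = 0.5
            text = ""
        intervals [2]:
            xmin = 0.5
            xmax = 0.7
            text = "w"
        intervals [3]:
            xmin = 0.7
            xmax = 1.5
            text = "O"
'''

tiers = parse_textgrid(SAMPLE)
```

To use it on a real file, read the file into a string first. The text encoding of the files is not stated here, so the encoding passed to `open` has to be checked against the actual data.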

## License

This TextGrid annotation work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.