Sentence label is the transcripted sentence of a piece of audio, which is often used for autonomous speech recognition.
Each audio can be assigned with multiple sentence labels.
The structure of one sentence label is like:
{
"sentence": [
{
"text": <str>
"begin": <float>
"end": <float>
}
...
...
],
"spell": [
{
"text": <str>
"begin": <float>
"end": <float>
}
...
...
],
"phone": [
{
"text": <str>
"begin": <float>
"end": <float>
}
...
...
],
"attributes": {
<key>: <value>
...
...
}
}
To create a ~tensorbay.label.label_sentence.LabeledSentence
label:
>>> from tensorbay.label import LabeledSentence >>> from tensorbay.label import Word >>> sentence_label = LabeledSentence( ... sentence=[Word("text", 1.1, 1.6)], ... spell=[Word("spell", 1.1, 1.6)], ... phone=[Word("phone", 1.1, 1.6)], ... attributes={"attribute_name": "attribute_value"} ... ) >>> sentence_label LabeledSentence( (sentence): [ Word( (text): 'text', (begin): 1.1, (end): 1.6 ) ], (spell): [ Word( (text): 'text', (begin): 1.1, (end): 1.6 ) ], (phone): [ Word( (text): 'text', (begin): 1.1, (end): 1.6 ) ], (attributes): { 'attribute_name': 'attribute_value' }
The ~tensorbay.label.label_sentence.LabeledSentence.sentence
of a ~tensorbay.label.label_sentence.LabeledSentence
is a list of ~tensorbay.label.label_sentence.Word
, representing the transcripted sentence of the audio.
The ~tensorbay.label.label_sentence.LabeledSentence.spell
of a ~tensorbay.label.label_sentence.LabeledSentence
is a list of ~tensorbay.label.label_sentence.Word
, representing the spell within the sentence.
It is only for Chinese language.
The ~tensorbay.label.label_sentence.LabeledSentence.phone
of a ~tensorbay.label.label_sentence.LabeledSentence
is a list of ~tensorbay.label.label_sentence.Word
, representing the phone of the sentence label.
~tensorbay.label.label_sentence.Word
is the basic component of a phonetic transcription sentence, containing the content of the word, the start and the end time in the audio.
>>> from tensorbay.label import Word >>> Word("text", 1.1, 1.6) Word( (text): 'text', (begin): 1, (end): 2 )
~tensorbay.label.label_sentence.LabeledSentence.sentence
, ~tensorbay.label.label_sentence.LabeledSentence.spell
, and ~tensorbay.label.label_sentence.LabeledSentence.phone
of a sentence label all compose of ~tensorbay.label.label_sentence.Word
.
The attributes of the transcripted sentence. See reference/label_format/CommonLabelProperties:attributes
for details.
Before adding sentence labels to the dataset, ~tensorbay.label.label_sentence.SentenceSubcatalog
should be defined.
Besides reference/label_format/CommonSubcatalogProperties:attributes information
in ~tensorbay.label.label_sentence.SentenceSubcatalog
, it also has ~tensorbay.label.label_sentence.SentenceSubcatalog.is_sample
, ~tensorbay.label.label_sentence.SentenceSubcatalog.sample_rate
and ~tensorbay.label.label_sentence.SentenceSubcatalog.lexicon
. to describe the transcripted sentences of the audio.
The catalog with only Sentence subcatalog is typically stored in a json file as follows:
{
"SENTENCE": { <object>*
"isSample": <boolean>! -- Whether the unit of sampling points in Sentence label is the
number of samples. The default value is false and the units
are seconds.
"sampleRate": <number> -- Audio sampling frequency whose unit is Hz. It is required
when "isSample" is true.
"description": <string>! -- Subcatalog description, (default: "").
"attributes": [ <array> -- Attribute list, which contains all attribute information.
{
"name": <string>* -- Attribute name.
"enum": [...], <array> -- All possible options for the attribute.
"type": <string or array> -- Type of the attribute including "boolean", "integer",
"number", "string", "array" and "null". And it is not
required when "enum" is provided.
"minimum": <number> -- Minimum value of the attribute when type is "number".
"maximum": <number> -- Maximum value of the attribute when type is "number".
"items": { <object> -- Used only if the attribute type is "array".
"enum": [...], <array> -- All possible options for elements in the attribute array.
"type": <string or array> -- Type of elements in the attribute array.
"minimum": <number> -- Minimum value of elements in the attribute array when type is
"number".
"maximum": <number> -- Maximum value of elements in the attribute array when type is
"number".
},
"description": <string>! -- Attribute description, (default: "").
},
...
...
]
"lexicon": [ <array> -- A list consists all of text and phone.
[
text, <string> -- Word.
phone, <string> -- Corresponding phonemes.
phone, <string> -- Corresponding phonemes (A word can correspond to more than
one phoneme).
...
],
...
]
}
}
Note
*
indicates that the field is required. !
indicates that the field has a default value.
Besides giving the parameters while initializing ~tensorbay.label.label_sentence.SentenceSubcatalog
, it's also feasible to set them after initialization.
>>> from tensorbay.label import SentenceSubcatalog >>> sentence_subcatalog = SentenceSubcatalog() >>> sentence_subcatalog.is_sample = True >>> sentence_subcatalog.sample_rate = 5 >>> sentence_subcatalog.append_lexicon(["text", "spell", "phone"]) >>> sentence_subcatalog SentenceSubcatalog( (is_sample): True, (sample_rate): 5, (lexicon): [...] )
To add a ~tensorbay.label.label_sentence.LabeledSentence
label to one data:
>>> from tensorbay.dataset import Data >>> data = Data("local_path") >>> data.label.sentence = [] >>> data.label.sentence.append(sentence_label)
Note
One data may contain multiple Sentence labels, so the Data.label.sentence<tensorbay.dataset.data.Data.label.sentence>
must be a list.