-
Create a custom language model:
curl -X POST -u "<username>:<password>" --header "Content-Type: application/json" --data "{\"name\": \"Chile Custom Language Model Example\", \"base_model_name\": \"es-ES_NarrowbandModel\", \"dialect\": \"es-LA\", \"description\": \"Example language model for chilean speakers\" }" "https://stream.watsonplatform.net/speech-to-text/api/v1/customizations"
Returns: {"customization_id": "e4766090-ba51-11e7-be33-99bd3ac8fa93"}
-
Create a corpora - https://console.bluemix.net/docs/services/speech-to-text/language-resource.html#workingCorpora
For example, create a text file named "sample-phrases.txt" with the following content:
vale tienda horario Santiago Alameda O'Higgins Lomo Santa Lucia agarrar onda apretar cachete al tiro cachar Espérate un cachito
-
Add the Corpora to the Language Model
curl -X POST -u "<username>:<password>" --data-binary @sample-phrases.txt "https://stream.watsonplatform.net/speech-to-text/api/v1/customizations/<customization-id>/corpora/sample_phrases"
-
Monitor the corpora request
curl -X GET -u "<username>:<password>" "https://stream.watsonplatform.net/speech-to-text/api/v1/customizations/<customization-id>/corpora/sample_phrases"
Returns:
{ "out_of_vocabulary_words": 2, "total_words": 19, "name": "sample_phrases", "status": "analyzed" }
The status field has one of the following values: analyzed indicates that the service has successfully analyzed the corpus. being_processed indicates that the service is still analyzing the corpus. undetermined indicates that the service encountered an error while processing the corpus
-
Adding words to the custom language model (https://console.bluemix.net/docs/services/speech-to-text/language-create.html#addWords)
-
Train the model
This step should always be done whenever more corpora files or words are added. To train, just make a POST request to the custom language model:
curl -X POST -u "<username>:<password>" "https://stream.watsonplatform.net/speech-to-text/api/v1/customizations/<customization-id>/train"
-
Monitor the training process:
curl -X GET -u "<username>:<password>" "https://stream.watsonplatform.net/speech-to-text/api/v1/customizations/<customization-id>"
Returns:
{ "owner": "752f5526-2624-4e8e-aa4b-1814ffe4db0d", "base_model_name": "es-ES_NarrowbandModel", "customization_id": "e4766090-ba51-11e7-be33-99bd3ac8fa93", "dialect": "es-LA", "created": "2017-10-26T13:30:54.617Z", "name": "Chile Custom Language Model Example", "description": "Example language model for chilean speakers", "progress": 0, "language": "es-ES", "status": "training" }
The response includes status and progress fields that report the current state of the custom model; the meaning of the progress field depends on the model's status. The status field can have one of the following values:
pending
- indicates that the model was created but is waiting either for training data to be added or for the service to finish analyzing data that was added. The progress field is 0.ready
- indicates that the model is ready to be trained. The progress field is 0.training
- indicates that the model is currently being trained. The progress field indicates the progress of the training as a percentage complete.
Note: The progress field does not currently reflect the progress of the training. The field changes from 0 to 100 when training is complete. available indicates that the model is trained and ready to use. The progress field is 100. failed indicates that training of the model failed. The progress field is 0.
- Creating a custom Language Model - https://console.bluemix.net/docs/services/speech-to-text/language-create.html#createModel