This repository has been archived by the owner on Feb 4, 2021. It is now read-only.

A prototype voice interface for Almond, an open-source virtual assistant developed at Stanford.


stanford-oval/almond-voice


Almond Voice

Voice interface for Almond, an open-source virtual assistant developed at Stanford University by the Open Virtual Assistant Lab.

API

Almond Voice currently provides two REST endpoints that offer text-to-speech (TTS) and speech-to-text (STT) functionality for Almond-based services. Both are hosted at voice.almond.stanford.edu. Support for WebSocket-based streaming will be added in the future.

Note: This API is experimental and may be significantly modified in the future. Please use with caution.

Speech-to-text

Request

POST /rest/stt
Host: voice.almond.stanford.edu
Content-Type: multipart/form-data

The body of the request must contain a .wav file in a field named audio, sent with the MIME type audio/wav. The file must use 16-bit, little-endian samples. No particular sample rate is required, because the server automatically resamples submitted audio.

Response

{
    "status": "ok",
    "text": "Recognized text."
}
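As a rough illustration of the request described above, the following Python sketch builds the multipart body by hand using only the standard library. Several details are assumptions not stated in this README: the `https://` scheme, the helper names (`make_silent_wav`, `build_stt_request`, `recognize`), and the uploaded filename `speech.wav`. The hosted service may also no longer be reachable, so treat this as a sketch of the request shape rather than a tested client.

```python
import io
import json
import urllib.request
import uuid
import wave

# Assumption: the README gives only the host and path, not the scheme.
STT_URL = "https://voice.almond.stanford.edu/rest/stt"


def make_silent_wav(seconds=1.0, sample_rate=16000):
    """Return the bytes of a mono, 16-bit PCM WAV file containing silence."""
    buf = io.BytesIO()
    with wave.open(buf, "wb") as w:
        w.setnchannels(1)
        w.setsampwidth(2)  # 2 bytes per sample = 16-bit; WAV PCM is little-endian
        w.setframerate(sample_rate)
        w.writeframes(b"\x00\x00" * int(seconds * sample_rate))
    return buf.getvalue()


def build_stt_request(wav_bytes, filename="speech.wav", url=STT_URL):
    """Build a multipart/form-data POST with the WAV in a field named 'audio'."""
    boundary = uuid.uuid4().hex
    body = b"".join([
        f"--{boundary}\r\n".encode(),
        f'Content-Disposition: form-data; name="audio"; filename="{filename}"\r\n'.encode(),
        b"Content-Type: audio/wav\r\n\r\n",
        wav_bytes,
        f"\r\n--{boundary}--\r\n".encode(),
    ])
    headers = {"Content-Type": f"multipart/form-data; boundary={boundary}"}
    return urllib.request.Request(url, data=body, headers=headers, method="POST")


def recognize(wav_bytes):
    """Send the request and return the recognized text (requires network access)."""
    with urllib.request.urlopen(build_stt_request(wav_bytes)) as resp:
        payload = json.load(resp)
    if payload.get("status") != "ok":
        raise RuntimeError(f"unexpected response: {payload}")
    return payload["text"]
```

Calling `recognize(open("speech.wav", "rb").read())` would perform the upload. Because the server resamples audio, the 16 kHz rate used in `make_silent_wav` is only a placeholder choice.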

Text-to-speech

Request

POST /rest/tts
Host: voice.almond.stanford.edu

Parameters:

{
    "text": "Text to convert to speech."
}

Response

{
    "status": "ok",
    "audio": "/audio/<arbitrary_speech_filename>.wav"
}

The audio file linked in the response is not guaranteed to remain online for longer than an hour.
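A minimal Python sketch of a text-to-speech call might look like the following. The README does not say how the parameters are encoded, so this assumes a JSON request body with `Content-Type: application/json`. It also assumes the `https://` scheme and that the returned audio path is relative to the same host. The helper names (`build_tts_request`, `audio_url`, `synthesize`) are hypothetical.

```python
import json
import urllib.request
from urllib.parse import urljoin

# Assumption: the README gives only the host and path, not the scheme.
TTS_URL = "https://voice.almond.stanford.edu/rest/tts"


def build_tts_request(text, url=TTS_URL):
    """Build a POST request carrying the text as a JSON body (assumed encoding)."""
    body = json.dumps({"text": text}).encode("utf-8")
    headers = {"Content-Type": "application/json"}
    return urllib.request.Request(url, data=body, headers=headers, method="POST")


def audio_url(payload, base=TTS_URL):
    """Resolve the 'audio' path in a response against the service host."""
    if payload.get("status") != "ok":
        raise RuntimeError(f"unexpected response: {payload}")
    return urljoin(base, payload["audio"])


def synthesize(text):
    """Request speech and return the WAV bytes (requires network access)."""
    with urllib.request.urlopen(build_tts_request(text)) as resp:
        payload = json.load(resp)
    # Download right away: the file may disappear after about an hour.
    with urllib.request.urlopen(audio_url(payload)) as audio:
        return audio.read()
```

Fetching the audio immediately inside `synthesize`, rather than storing the link, avoids depending on how long the server keeps the file.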
