MeetDot: Speech Translation for Video Calling

MeetDot is a videoconferencing system with live translation captions overlaid onscreen. The system aims to facilitate conversation between people who speak different languages, reducing communication barriers between multilingual participants.

Currently, our system supports speech and captions in four languages (English, Chinese, Spanish, Portuguese) and combines automatic speech recognition (ASR) and machine translation (MT) in a cascade.

Key Features

  • Smooth scrolling captions to reduce flicker and therefore cognitive load
  • Modular architecture to integrate different ASR and MT services
  • Integrated evaluation suite to optimize metrics such as accuracy, latency and erasure
  • Word-guessing game for an extrinsic measure of end-to-end performance
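To illustrate how a cascade with swappable services can be organized, here is a minimal sketch. It is not MeetDot's actual code: the class names (SpeechRecognizer, Translator, CascadePipeline) and the two stand-in services are hypothetical and exist only to show ASR output feeding MT behind small interfaces.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass
from typing import Dict, Tuple


class SpeechRecognizer(ABC):
    """Hypothetical interface for an ASR service."""

    @abstractmethod
    def transcribe(self, audio: bytes, language: str) -> str:
        ...


class Translator(ABC):
    """Hypothetical interface for an MT service."""

    @abstractmethod
    def translate(self, text: str, source: str, target: str) -> str:
        ...


class EchoRecognizer(SpeechRecognizer):
    """Stand-in ASR: treats the audio bytes as UTF-8 text."""

    def transcribe(self, audio: bytes, language: str) -> str:
        return audio.decode("utf-8")


class DictionaryTranslator(Translator):
    """Stand-in MT: word-by-word lookup; unknown words pass through."""

    def __init__(self, table: Dict[Tuple[str, str], Dict[str, str]]):
        self.table = table

    def translate(self, text: str, source: str, target: str) -> str:
        words = self.table.get((source, target), {})
        return " ".join(words.get(word, word) for word in text.split())


@dataclass
class CascadePipeline:
    """Runs ASR first, then MT on the transcript (the cascade)."""

    asr: SpeechRecognizer
    mt: Translator

    def caption(self, audio: bytes, source: str, target: str) -> str:
        transcript = self.asr.transcribe(audio, source)
        if source == target:
            # No translation needed when speaker and viewer share a language.
            return transcript
        return self.mt.translate(transcript, source, target)


pipeline = CascadePipeline(
    asr=EchoRecognizer(),
    mt=DictionaryTranslator({("en", "es"): {"hello": "hola", "world": "mundo"}}),
)
print(pipeline.caption(b"hello world", "en", "es"))
```

Swapping in a different provider would then mean writing another subclass of the relevant interface, without touching the pipeline itself.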

Compatibility

  • 64-bit conda with Python 3.7+
  • Tested on Chrome and Firefox

Setup

Create Python Environment

Create a conda environment with Python 3.7 or 3.8 (newer versions will probably also work):

  1. conda create -n streaming_translation python=3.7.9
  2. conda activate streaming_translation

Set Up Backend

  1. cd backend
  2. pip install -r requirements.txt (run this from within your conda environment)
  3. pip install -e .
  4. From the repository's root directory, run python env_setup.py --copy-keys. This sets the environment variables necessary to run the APIs, copies credentials, and installs git commit hooks. If you are not running on our shared server, see python env_setup.py --help

Create Credentials

In MeetDot, we provide ASR/MT integrations with Google Cloud so it can be run without access to our servers or internal APIs. If not running on our shared server, do the following:

  1. Go to Google Cloud, create a custom Google API key, and copy it into backend/resources/credentials.json
  2. We use the daily.co API to optimize real-time video call quality. Create an account with daily.co, and add your URL and key to .env.

Configuration

In MeetDot, we use dotenv in the frontend and backend. Variables can be set by editing the .env file or by passing them on the command line, e.g. BACKEND_PORT=5888 python src/app.py. See .env.default for a list of possible keys.
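The sketch below shows the precedence this usually implies: a value already present in the process environment wins over the one in .env. It assumes the loader does not override existing variables, which is the default in python-dotenv but has not been checked against MeetDot's own setup. The helper names and the 5000 port value are hypothetical, and the parser handles only simple KEY=VALUE lines.

```python
import os
from typing import Dict, Mapping, Optional


def parse_env_file(text: str) -> Dict[str, str]:
    """Parse simple KEY=VALUE lines, skipping blanks and # comments."""
    values: Dict[str, str] = {}
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop surrounding quotes, if any.
        values[key.strip()] = value.strip().strip("'\"")
    return values


def resolve_settings(
    env_text: str, environ: Optional[Mapping[str, str]] = None
) -> Dict[str, str]:
    """Return .env values, letting the process environment take precedence."""
    environ = os.environ if environ is None else environ
    settings = parse_env_file(env_text)
    for key in settings:
        if key in environ:
            settings[key] = environ[key]
    return settings


# Hypothetical .env content; BACKEND_PORT=5888 mimics the command-line override.
env_text = "# backend settings\nBACKEND_PORT=5000\n"
print(resolve_settings(env_text, {"BACKEND_PORT": "5888"}))
```

With an empty environment the same call would return the value from the file instead.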

Run the Backend

  1. python src/app.py (run with the --debug flag to restart on code changes, useful for development)

Set Up Frontend

  1. cd frontend
  2. yarn install
  3. yarn run serve --port=8080 (You can use --mode=production to serve a minified bundle)
  4. Open localhost:8080 in your browser.

Testing

  1. cd backend/
  2. pytest

Committing

We use pre-commit for managing git hooks.

Run: pre-commit install to set it up.

This should run the formatter (Prettier for JavaScript) and syntax checks whenever you commit. It should update itself and handle library installation automatically, but see .pre-commit-config.yaml for more information.

Evaluation

  1. Download test data
  2. cd backend/
  3. python src/services/evaluation/evaluate.py
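For readers unfamiliar with the erasure metric mentioned under Key Features, the following sketch computes one common definition from the re-translation literature: each time a caption is updated, count the tokens of the previous caption that had to be deleted, i.e. everything after the longest common prefix. This is an illustration under that assumption; the definition used by evaluate.py may differ, and the function names are hypothetical. Whitespace tokenization is also a simplification that would not suit Chinese.

```python
from typing import List, Sequence


def common_prefix_length(a: Sequence[str], b: Sequence[str]) -> int:
    """Number of leading tokens shared by a and b."""
    count = 0
    for x, y in zip(a, b):
        if x != y:
            break
        count += 1
    return count


def erasure(updates: Sequence[str]) -> int:
    """Total tokens deleted across successive caption updates."""
    total = 0
    previous: List[str] = []
    for update in updates:
        current = update.split()
        # Tokens of the previous caption that did not survive this update.
        total += len(previous) - common_prefix_length(previous, current)
        previous = current
    return total


def normalized_erasure(updates: Sequence[str]) -> float:
    """Erasure divided by the length of the final caption."""
    final_length = len(updates[-1].split()) if updates else 0
    return erasure(updates) / final_length if final_length else 0.0


updates = ["the", "the cat", "the hat sat", "the hat sat down"]
print(erasure(updates))             # 1 ("cat" was replaced by "hat")
print(normalized_erasure(updates))  # 0.25
```

Lower values mean captions that rewrite themselves less often, which is the flicker the smooth scrolling captions aim to reduce.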

Security Notes

The first time you run MeetDot, the browser will ask for microphone permission, which the demo needs in order to run. Modern browsers prevent insecure websites from accessing the microphone.

Local environments (localhost) are considered secure. For remote deployment, you will need to do one of the following:

  1. Serve the website over HTTPS. Set the paths to your key and cert files in backend/resources/keys.json. The default filenames are those generated by Let's Encrypt.
  2. Use a self-signed certificate and have your users click through the browser warnings. (Not recommended)
  3. Have users enable experimental flags to bypass browser security settings. (Also not recommended.) In Chrome, this is chrome://flags#unsafely-treat-insecure-origin-as-secure. In Firefox, this is media.devices.insecure.enabled in about:config.
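As a rough way to predict whether a given deployment URL will be allowed microphone access, the sketch below approximates the rule described above: HTTPS origins and loopback hosts count as secure, everything else does not. It is a simplification of what browsers actually implement (it ignores the experimental flags and other special cases), and the function name is hypothetical.

```python
import ipaddress
from urllib.parse import urlsplit


def is_secure_origin(url: str) -> bool:
    """Approximate whether a browser would treat this origin as secure."""
    parts = urlsplit(url)
    if parts.scheme in ("https", "wss"):
        return True
    host = (parts.hostname or "").lower()
    if host == "localhost" or host.endswith(".localhost"):
        return True
    try:
        # Loopback addresses such as 127.0.0.1 and ::1 count as local.
        return ipaddress.ip_address(host).is_loopback
    except ValueError:
        # Not an IP literal, and not localhost: plain HTTP to a remote host.
        return False


for candidate in (
    "http://localhost:8080",
    "https://example.com",
    "http://example.com",
):
    print(candidate, is_secure_origin(candidate))
```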

Deployment

Our deployment script copies files into $DEPLOY_ROOT (e.g. /srv/example.com, set in GitLab variables). Static frontend files are served from $DEPLOY_ROOT/dist with nginx. Backend files are saved in $DEPLOY_ROOT/backend and served with a bare Flask-SocketIO server.

These processes (nginx, the backend) are managed by systemd scripts.

License

Apache License 2.0
