Translate Server

The Problem

In a service heavily relying on user generated content, scaling globally poses the challenge of seeding content for a new region, to initially bring a platform to self-sufficiency. One common growth hack to circumvent this issue is using machine translations to display content only available in other langauges in a user's native languages.

This can require using local APIs in markets with restricted access to mainstream services like Google Cloud Translation or Microsoft's Cognitive Platform. Therefore, I propose a service which abstracts other translation APIs behind a common HTTP API. The backend implementation uses Golang and allows for failover between any number of services configured. New upstream service connectors are implemented with one method. Additionally, this service optionally connects to a cache backend to store translations results. This accomodates upstream services that impose restrictive rate or total translation limits.

Scope

Currently, there are only two backend implementation. Furthermore, there is currently no backend for external cache services, and no expiry timer for cached items.

Future improvements planned are:

Integration tests for upstreams
Concurrently calling multiple upstreams
Enabling load-balancing to multiple upstreams / expanding failover options
Controlling the cache backend / settings at runtime
Limiting simultaneous upstream requests
Backfilling / cache warming for likely future translations
Implementing a streaming RPC endpoint
Rate-limiting by different user groups
Front-end to manually perform translations
API for injecting company-specific vocabulary
Allowing for templated translations / placeholders
Adding support for more translation services, and cache backends

Technical Choices

As a language, Go was a good fit, because it provides both concurrency primitives and a well-tested HTTP implementation. An added benefit is the speedup over Python / Java. This is a simple, low-level abstraction, so the faster the better. For the external interface, I decided to implement a single HTTP endpoint, and used Content-Langauge and Accept-Language headers. There is a spec, so this is easier to support.

The caching backend is implemented as a simple Put / Get / Has interface, allowing for straightforward expansion to use external key-value stores. Translation services are called asynchronously - this enables future changes toward calling multiple upstream services concurrently, and selecting the first one to answer. Currently one service is called at a time. Failover is currently implemented by pushing a service to the end of the queue for requests to run on.

Upstream credentials are passed as environment variables. Currently, there are two variables processed:

GOOGLE_API_KEY: If specified, enable the Google Cloud Translation backend with given key.
ENABLE_MOCK: If specified, enables the mock backend. This is used for testing upstream failure handling.

Testing Strategy

The cache package is fully unit tested. It plays a part in every request and is critical to reducing upstream load.
handler.go has full coverage as well. It handles every request, making it a critical piece of code.
sanity_test.go contains tests running gofmt and govet. Code is more often read than written, so this was a no-brainer.

About The Author

My name is Leonhard Kuboschek. This code uses libraries written by many people. Thank you. All code contained within this repository was authored solely by me. I'm currently working full-time at a startup (50 hrs / week), and we have a launch coming up. Thus I've been working on this after hours, and on the weekends only. Additionally, this is one of my earlier projects utilizing Go as a language.

Personal Homepage

HTTP Interface

Authentication

This service authenticates all requests by checking the presence of a JSON Web Token (JWT). There is currently no further authorization strategy, all authenticated requests are allowed.

JWTs are checked with the following parameters:

Algorithm: HS256
Secret Key: Set using environment variable SECRET_KEY

At present, no claims are required on the tokens.

Request Format

The server accepts HTTP POST requests to port 8080, the path is /.

The headers required are:

Content-Language specifying the language that the request content is assumed to be in.
Accept-Language specifying the target language.
Authorization containing the string Bearer followed by a JSON Web Token (see authentication section)

The text to be translated is sent in the request body; The content type shall be text/plain.

Response Format

The response will contain the Content-Language header for the target language, as well as the translated text in the response body. In case of an error, standard HTTP status codes are used for signaling.

Sample Deployment

There is a sample deployment running at translate dot leo dot codes. It authenticates requests by JSON Web Token. If you would like to receive testing credentials, please send me an email (see homepage links below).

Sample request using `cURL`

curl -X POST \
'http://localhost:8080/' \
-H 'accept-language: en' \
-H 'content-language: de' \
-d 'Also wirklich!'

Docker Image

`docker run -p 8080:8080 -e GOOGLE_API_KEY=$GOOGLE_API_KEY -e SECRET_KEY=yoursecretkey kuboschek/translate-server`

Name		Name	Last commit message	Last commit date
Latest commit History 54 Commits
cache		cache
upstream		upstream
.gitignore		.gitignore
.travis.yml		.travis.yml
Dockerfile		Dockerfile
LICENSE		LICENSE
README.md		README.md
handler.go		handler.go
handler_test.go		handler_test.go
main.go		main.go
sanity_test.go		sanity_test.go

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

Translate Server

The Problem

Scope

Technical Choices

Testing Strategy

About The Author

HTTP Interface

Authentication

Request Format

Response Format

Sample Deployment

Sample request using `cURL`

Docker Image

About

Releases

Packages

Languages

License

kuboschek/translate-server

Folders and files

Latest commit

History

Repository files navigation

Translate Server

The Problem

Scope

Technical Choices

Testing Strategy

About The Author

HTTP Interface

Authentication

Request Format

Response Format

Sample Deployment

Sample request using cURL

Docker Image

About

Topics

Resources

License

Stars

Watchers

Forks

Releases

Packages 0

Languages

Sample request using `cURL`

Packages