Backend Engineering Challenge 2.0

This project contains a solution to Unbabel's Backend Engineering Challenge, reformulated for a more realistic context in which translation events arrive in real time.

Idea

The goal of this project is to solve the original challenge under the assumption that translation events occur in real time. In this reformulated challenge, we also assume that the metrics are needed per client.
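As a rough illustration of what "metrics by client" could look like, here is a minimal plain-Python sketch that keeps a sliding time window per client and reports the average duration of the events inside it. It is not the repository's implementation: the field names (`timestamp`, `client_name`, `duration`), the 10-minute window, and the client names are assumptions made for the example, and events are assumed to arrive in timestamp order.

```python
from collections import defaultdict, deque
from datetime import datetime, timedelta


def moving_average_by_client(events, window_minutes=10):
    """Yield (timestamp, client, average duration) after each event.

    Assumes events arrive in timestamp order and carry the hypothetical
    fields "timestamp", "client_name" and "duration".
    """
    window = timedelta(minutes=window_minutes)
    recent = defaultdict(deque)  # client -> deque of (timestamp, duration)
    for event in events:
        ts = datetime.fromisoformat(event["timestamp"])
        client = event["client_name"]
        recent[client].append((ts, event["duration"]))
        # Drop events that have fallen out of this client's window.
        while recent[client][0][0] <= ts - window:
            recent[client].popleft()
        durations = [d for _, d in recent[client]]
        yield ts, client, sum(durations) / len(durations)


# Hypothetical sample events for two clients.
events = [
    {"timestamp": "2018-12-26 18:11:08", "client_name": "client_a", "duration": 20},
    {"timestamp": "2018-12-26 18:15:19", "client_name": "client_a", "duration": 31},
    {"timestamp": "2018-12-26 18:16:00", "client_name": "client_b", "duration": 54},
    {"timestamp": "2018-12-26 18:23:19", "client_name": "client_a", "duration": 40},
]
results = list(moving_average_by_client(events, window_minutes=10))
```

Each client has its own window, so the last `client_a` average only covers the two `client_a` events from the preceding 10 minutes and is unaffected by `client_b`.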

Apache Beam was chosen to solve this problem because the same framework can express both the batch and the streaming solution. It also lets us use GCP's Dataflow runner to run the pipeline as a fully managed, scalable service.

In summary, each folder contains:

batch: the solution to the batch problem, which processes an input JSON file and writes an output JSON file.
streaming: the solution to the streaming problem, which consists of multiple publishers (publisher.py) that simulate the clients' translation events (Unbabel's translation API), the streaming pipeline (streaming_pipeline.py) that processes the events, and a subscriber (subscriber.py) that reads the pipeline's output and prints it.
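The three streaming roles can be pictured with a small in-memory simulation. This sketch is only a stand-in: it uses `queue.Queue` objects in place of messaging topics and a running average in place of the real pipeline logic, and every function name, field name, and client name in it is hypothetical rather than taken from publisher.py, streaming_pipeline.py, or subscriber.py.

```python
import json
import queue


def publish(topic, client, durations):
    """Publisher stand-in: emit one JSON message per translation event."""
    for duration in durations:
        topic.put(json.dumps({"client_name": client, "duration": duration}))


def run_pipeline(input_topic, output_topic):
    """Pipeline stand-in: keep a running average duration per client."""
    totals = {}  # client -> (sum of durations, event count)
    while not input_topic.empty():
        event = json.loads(input_topic.get())
        total, count = totals.get(event["client_name"], (0, 0))
        total, count = total + event["duration"], count + 1
        totals[event["client_name"]] = (total, count)
        output_topic.put(json.dumps(
            {"client_name": event["client_name"], "average": total / count}
        ))


def subscribe(topic):
    """Subscriber stand-in: read every result, print it, and return the list."""
    received = []
    while not topic.empty():
        message = json.loads(topic.get())
        print(message)
        received.append(message)
    return received


# Two simulated clients publish events, the pipeline runs, the subscriber reads.
events_topic, metrics_topic = queue.Queue(), queue.Queue()
publish(events_topic, "client_a", [20, 40])
publish(events_topic, "client_b", [10])
run_pipeline(events_topic, metrics_topic)
results = subscribe(metrics_topic)
```

The roles run one after another here to keep the example deterministic; in the actual streaming setup they would run concurrently as separate processes.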

Note: Messaging in the streaming problem uses Cloud Pub/Sub.
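Pub/Sub message bodies are byte strings, so events have to be serialized before publishing and parsed again on the receiving side. A minimal sketch of that round trip, assuming JSON encoding and hypothetical field names (the repository may use a different format):

```python
import json


def encode_event(event):
    """Serialize an event dict to UTF-8 JSON bytes, suitable as a message body."""
    return json.dumps(event, sort_keys=True).encode("utf-8")


def decode_event(data):
    """Parse a UTF-8 JSON message body back into an event dict."""
    return json.loads(data.decode("utf-8"))


# Hypothetical event; the real schema is not shown in this README.
event = {"client_name": "client_a", "duration": 20}
payload = encode_event(event)
restored = decode_event(payload)
```

Only the standard library is used here; the actual publish and subscribe calls to Cloud Pub/Sub are deliberately left out.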

More details will be provided here in the following days.

pedrodeoliveira/unbabel-bec