MAT Fan Engagement Coding Challenge

Solution Overview

The Telemetry Solution for the coding challenge has been implemented in Java and orchestrated with docker-compose.

Solutions Architecture: Design and Implementation Decisions

  • Kafka as both the streaming platform and persistent data store (/archive):

    Motivation

    • Using Kafka for persistent storage may seem counter-intuitive at first. Kindly note, however, that Kafka is a streaming platform and not a "traditional" messaging system where messages are deleted from queues once they've been successfully consumed.
    • There are three primary differences between Kafka and traditional messaging systems:
      • Kafka stores a persistent log which can be re-read and kept indefinitely.
      • Kafka is built as a modern distributed system: it runs as a cluster, can expand or contract elastically, and replicates data internally for fault-tolerance and high-availability.
      • Kafka is built to allow real-time stream processing, not just processing of a single message at a time. This allows working with data streams at a much higher level of abstraction.

    Benefits

    • Stream processing jobs perform computation(s) on a stream of data coming in via Kafka. When the logic of the stream processing code changes, you often want to recompute the results. A very simple method to accomplish this is to reset the consumer offset (either to zero or to a specific starting time) so that the application can recompute the results with the new code (i.e. the Kappa Architecture).
    • We can have an in-memory cache in each instance of the application that is fed by updates from Kafka. A very simple way of building this is to make the Kafka topic log compacted, and have the application simply start from offset zero whenever it restarts in order to repopulate its cache (see the sketch after this list).
    • Kafka is often used to capture and distribute a stream of "database" updates (this is often called Change Data Capture, or CDC). Applications that consume this data in steady state only need the newest changes. New applications, however, need to start with a full dump or snapshot of the data. Performing a full dump of a large production database is often a delicate (performance impacts, etc.) and time-consuming operation. Enabling log compaction on the topic(s) containing the stream of changes allows consumers of this data to perform simple reloads by resetting the offset to zero.
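
    As an illustration of the log-compaction approach mentioned above, the sketch below creates a compacted topic with the plain Kafka AdminClient so that consumers can rebuild their in-memory state by re-reading it from offset zero. The topic name, partition count, and broker address are assumptions for this example and are not taken from the solution code.

```java
import java.util.Collections;
import java.util.Map;
import java.util.Properties;

import org.apache.kafka.clients.admin.AdminClient;
import org.apache.kafka.clients.admin.AdminClientConfig;
import org.apache.kafka.clients.admin.NewTopic;
import org.apache.kafka.common.config.TopicConfig;

public class CompactedTopicSketch {
    public static void main(String[] args) throws Exception {
        Properties props = new Properties();
        // assumed broker address for a local docker-compose environment
        props.put(AdminClientConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");

        try (AdminClient admin = AdminClient.create(props)) {
            // hypothetical topic: 1 partition, replication factor 1 (single-broker demo)
            NewTopic topic = new NewTopic("carStatusCache", 1, (short) 1);
            // enable log compaction so that only the latest value per key is retained
            topic.configs(Map.of(TopicConfig.CLEANUP_POLICY_CONFIG,
                                 TopicConfig.CLEANUP_POLICY_COMPACT));
            admin.createTopics(Collections.singleton(topic)).all().get();
        }
    }
}
```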

    Implications

    • Only the CarCoordinate MQTT source data will be published (and persisted) in the carCoordinate Kafka topic.
    • Stream processing outputs (i.e. CarStatus and Events MQTT messages) will not be stored in Kafka.
      • If this becomes a requirement in the future, the solution can be extended to assign session IDs and then store all of the stream processing outputs in Kafka topics as well.
    • This architecture enables the solution to reset the offset for the carCoordinate topic to zero (when required) in order to recompute the results (i.e. the CarStatus and Events MQTT messages) when code/logic changes need to be implemented (see the sketch after this list).
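
    The offset reset mentioned above could, for example, be performed with a plain Kafka consumer as sketched below (it could equally be done with Kafka's consumer-group tooling). The broker address and consumer group id are assumptions for this example; only the carCoordinate topic name comes from the solution.

```java
import java.time.Duration;
import java.util.List;
import java.util.Properties;
import java.util.stream.Collectors;

import org.apache.kafka.clients.consumer.ConsumerConfig;
import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.apache.kafka.common.TopicPartition;
import org.apache.kafka.common.serialization.StringDeserializer;

public class CarCoordinateReplaySketch {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put(ConsumerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092"); // assumed broker
        props.put(ConsumerConfig.GROUP_ID_CONFIG, "telemetry-replay");        // hypothetical group
        props.put(ConsumerConfig.KEY_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class.getName());
        props.put(ConsumerConfig.VALUE_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class.getName());

        try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props)) {
            // assign all partitions of the carCoordinate topic explicitly
            List<TopicPartition> partitions = consumer.partitionsFor("carCoordinate").stream()
                    .map(p -> new TopicPartition(p.topic(), p.partition()))
                    .collect(Collectors.toList());
            consumer.assign(partitions);
            // rewind to offset zero so that the stream processing results can be recomputed
            consumer.seekToBeginning(partitions);

            consumer.poll(Duration.ofSeconds(1))
                    .forEach(record -> System.out.printf("replayed offset %d: %s%n",
                            record.offset(), record.value()));
        }
    }
}
```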

    Alternatives

    • InfluxDB is one of the best time series databases. The enterprise version is, however, required for features such as DB clustering, which is a critical non-functional requirement for well-architected cloud-based solutions. As such, the FOSS version is not suitable for this solution.
  • Quarkus as the (Java) framework for implementing the application code:

    Motivation

    • Quarkus provides all of the required functionality to build the solution.
    • I haven't used Quarkus before and decided that this coding challenge presents me with a good opportunity to learn a new framework.

    Benefits

    • Quarkus is an alternative to similar but "older" frameworks; it provides an excellent developer experience and supports integration with cloud providers, containerisation technologies, and serverless (Lambda) application architectures.
    • Quarkus can compile Java bytecode to native OS binaries with extremely impressive results.

    Implications

    • Using Quarkus may require a relatively small investment of time to learn a new framework.
    • Building native binaries can be very slow. As such, it is recommended to build JVM (OpenJDK) based container images during development and to build native images for deployment to longer-lived environments.

Description of the core services and logic in the solution

Working from the architecture described above, the Telemetry Solution provides:

  • The CarCoordinateService that subscribes to the carCoordinate MQTT topic (provided by the broker service) and streams the consumed telemetry data to a carCoordinate topic in the kafka service (a minimal bridging sketch follows this list).

    • This service also subscribes to the carCoordinate topic (provided by the kafka service) and asynchronously consumes and processes the carCoordinate stream to perform the telemetry data processing.
  • The TelemetryService that processes the carCoordinate data stream and:

    • Calculates the speed and total distance that each Car has travelled using the Haversine formula (see the Haversine sketch after this list).
    • Maintains an ordered list of Cars (sorted by total distance travelled) and:
      • Calculates and publishes MQTT CarStatus topic updates (with the CarStatusService) for both the speed and overall position of each Car.
      • Detects and publishes MQTT Events topic updates (with the EventsService) for every Car that has overtaken another Car (see the overtake sketch after this list).
    • Calculates and publishes MQTT Events topic updates (with the EventsService) for each Car's lap times and the overall fastest lap.
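
To make the CarCoordinateService bridging more concrete, the following is a minimal sketch using MicroProfile Reactive Messaging (as supported by Quarkus via SmallRye). The channel names are assumptions for this example and would be mapped to the MQTT broker and the Kafka carCoordinate topic in application.properties; the javax CDI namespace is assumed, and the actual solution classes may be wired differently.

```java
import java.nio.charset.StandardCharsets;

import javax.enterprise.context.ApplicationScoped;

import org.eclipse.microprofile.reactive.messaging.Incoming;
import org.eclipse.microprofile.reactive.messaging.Outgoing;

@ApplicationScoped
public class CarCoordinateBridgeSketch {

    // Consumes the (assumed) "carCoordinate-mqtt" channel, bound to the MQTT broker's
    // carCoordinate topic, and forwards each payload to the (assumed) "carCoordinate-kafka"
    // channel, bound to the carCoordinate Kafka topic.
    @Incoming("carCoordinate-mqtt")
    @Outgoing("carCoordinate-kafka")
    public String bridge(byte[] mqttPayload) {
        // pass the raw JSON telemetry through unchanged
        return new String(mqttPayload, StandardCharsets.UTF_8);
    }
}
```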
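
The speed and total-distance calculation mentioned above is based on the Haversine great-circle distance. A minimal sketch of the underlying maths is shown below; the class and method names are illustrative and are not the actual solution classes.

```java
public final class HaversineSketch {

    private static final double EARTH_RADIUS_M = 6_371_000d; // mean Earth radius in metres

    /** Great-circle distance in metres between two latitude/longitude points (in degrees). */
    public static double distanceMetres(double lat1, double lon1, double lat2, double lon2) {
        double dLat = Math.toRadians(lat2 - lat1);
        double dLon = Math.toRadians(lon2 - lon1);
        double a = Math.sin(dLat / 2) * Math.sin(dLat / 2)
                 + Math.cos(Math.toRadians(lat1)) * Math.cos(Math.toRadians(lat2))
                 * Math.sin(dLon / 2) * Math.sin(dLon / 2);
        return EARTH_RADIUS_M * 2 * Math.atan2(Math.sqrt(a), Math.sqrt(1 - a));
    }

    /** Speed in km/h given the distance between two samples and their timestamps (epoch millis). */
    public static double speedKph(double distanceMetres, long prevTimestampMs, long currTimestampMs) {
        double hours = (currTimestampMs - prevTimestampMs) / 3_600_000d;
        return hours > 0 ? (distanceMetres / 1_000d) / hours : 0d;
    }
}
```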
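
Similarly, the position and overtake logic can be illustrated by comparing each Car's position in the distance-ordered list before and after an update. This is a simplified sketch; the event payloads and MQTT publishing are omitted, and the data structures are assumptions rather than the actual solution classes.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

public final class OvertakeSketch {

    /** Returns car indexes ordered by total distance travelled (leader first). */
    public static List<Integer> positionsByDistance(Map<Integer, Double> totalDistanceByCar) {
        List<Integer> order = new ArrayList<>(totalDistanceByCar.keySet());
        order.sort((a, b) -> Double.compare(totalDistanceByCar.get(b), totalDistanceByCar.get(a)));
        return order;
    }

    /** Prints a line for every car that has improved its position since the previous update. */
    public static void reportOvertakes(List<Integer> previousOrder, List<Integer> currentOrder) {
        for (int car : currentOrder) {
            int prevPos = previousOrder.indexOf(car);
            int currPos = currentOrder.indexOf(car);
            if (prevPos >= 0 && currPos < prevPos) {
                // the car that previously held this position has been passed
                int overtaken = previousOrder.get(currPos);
                System.out.printf("Car %d has overtaken Car %d%n", car, overtaken);
            }
        }
    }
}
```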

Building and running the code locally

Prerequisites

A bash script has been provided to automate the code compilation, testing and execution of docker-compose:

./buildAndRun.sh
# -- OR, use the following for the native binary:
./buildAndRunNative.sh

Open: http://localhost:8084

Press Control+C to stop docker-compose.

The code includes unit tests.

The source code can be compiled and tested by running:

cd solution
./build.sh
# -- OR, use the following for the native binary:
./buildNative.sh

Deploying and running the code on AWS

Note: I would normally automate the provisioning and deployment with CloudFormation, Terraform, the CLI/API, etc., but doing so in this case would require you to run third-party automation code and/or templates with privileged access to your AWS services and resources. That is obviously an extremely bad idea... so I've provided simple step-by-step instructions instead.

  1. Log in to your AWS Console and change to the London (eu-west-2) region: https://aws.amazon.com
  2. Click here to create a new EC2 instance
  3. Follow the onscreen prompts to provision a new EC2 instance with:
    • Image: Amazon Linux 2 AMI (HVM), SSD Volume Type (ami-05f37c3995fffb4fd)
    • Instance Type: t2.micro
    • On the Instance Details page, paste the following script into the User Data textbox under Advanced Settings:
#!/bin/bash
# update the system and install required packages
yum update -y
yum install -y docker git
# start docker on boot
systemctl enable docker --now
# install docker-compose
curl -L "https://github.com/docker/compose/releases/download/1.25.0/docker-compose-$(uname -s)-$(uname -m)" -o /usr/local/bin/docker-compose
chmod +x /usr/local/bin/docker-compose
usermod -aG docker ec2-user
# clone the solution's git repo
cd /home/ec2-user
git clone https://github.com/nicdesousa/MAT-Coding-Challenge.git
chown ec2-user:ec2-user -R /home/ec2-user/MAT-Coding-Challenge
# allocate some swap (file) memory to enable the solution to run on a t2.micro instance
# note: doing this is only suitable for a very small demo and certainly not recommended for anything else...
fallocate -l 1G /swapfile
chmod 600 /swapfile
mkswap /swapfile
swapon /swapfile
echo "/swapfile none swap sw 0 0" >> /etc/fstab
# run docker-compose when the instance boots
echo "runuser -l ec2-user -c 'cd /home/ec2-user/MAT-Coding-Challenge/; docker-compose up -d'" >> /etc/rc.local
chmod +x /etc/rc.d/rc.local
# "restart" to apply the instance changes
reboot
  4. Click Review and Launch
  5. Click Edit security groups and add the following rules with Add Rule:
Type     Protocol   Port Range   Source      Description
HTTP     TCP        80           0.0.0.0/0   MAT Fan App HTTP
HTTPS    TCP        443          0.0.0.0/0   MAT Fan App Websocket

Note: The solution (see docker-compose) enables using ports 80 and 443 instead of 8084 and 8080 to allow access for clients behind a restrictive (network or corporate) proxy server.

  6. Click Review and Launch and then Launch
  7. Click View Instances and wait ±5 minutes for the provisioning and deployment to complete.
  8. Select your new EC2 instance and copy the IP address from the Description panel's IPv4 Public IP field.
  9. Open a new browser tab/window, paste the copied IP address, and press Enter. (Refresh the page until the site content is shown.)

The Solution to the MAT Coding Challenge has now been deployed to a single EC2 instance.

  1. When you're done, please remember to terminate your EC2 instance:
    • Click here to view your AWS EC2 instances
    • Right-click on your EC2 instance and select: Instance State->Terminate

AWS Production Deployment Architecture

While it is out of scope for this challenge, the diagram below is an example of what a production solutions architecture could look like when applying the best practices from the AWS Well-Architected Framework.

In this case, we use the existing containerised services (i.e. no code changes) and leverage ECS clusters with auto-scaling groups to provide a low-maintenance, secure, reliable, performant, and cost-effective AWS-based solution.

Review of the Solution

The MAT Coding Challenge Dashboard

EC2 (t2.micro) Instance Metrics