This repository contains the new infrastructure architecture as described in the document prepared by Animeshon and approved by the owners of the project.
This repository contains the PoC for Open Data Hub's new architecture.
The PoC is designed and developed to run in two different environments:
- Locally for development purposes
  - In this configuration, the architecture is orchestrated with `docker-compose`
  - All `camel routes` are powered by Java Quarkus applications
  - The `notifier` is powered by a Node.js application
  - `RabbitMQ` is powered by the docker container `rabbitmq:3-management`, which provides the Admin Panel out of the box
- In the cloud as Staging environment
  - In this configuration, the architecture is orchestrated with `kubernetes`
  - All `camel routes` are powered by Camel K applications
  - The `notifier` is a dockerized Node.js application
  - `RabbitMQ` is deployed using RabbitMQ Cluster Operator for Kubernetes
Follow the data flow by starting from one of the inbound routes.
├── infrastructure
│   ├── inbound
│   │   └── src/main/java/opendatahub
│   │       ├── inbound
│   │       │   ├── mqtt/MqttRoute.java
│   │       │   └── rest/RestRoute.java
│   │       ├── pull/PullRoute.java
│   │       ├── writer/WriterRoute.java
│   │       ├── RabbitMQConnection.java
│   │       └── WrapperProcessor.java
│   ├── notifier
│   │   └── src
│   │       ├── changeStream.js
│   │       └── main.js
│   ├── router
│   │   └── src/main/java/opendatahub/outbound
│   │       ├── fastline/FastlineRoute.java
│   │       ├── update/UpdateRoute.java
│   │       └── router/RouterRoute.java
│   ├── transformer
│   │   └── src/main/java/opendatahub/transformer
│   │       ├── ConsumerImpl.java
│   │       ├── Main.java
│   │       └── Poller.java
We provide a `docker-compose` file to start the architecture locally.
To run the cluster, just run `docker-compose up` in the main folder. It will build and spin up all components.
The first time you run `docker-compose up`, you have to initialize MongoDB's replica set. To do so, first identify the MongoDB docker container by running `docker ps`, then copy the MongoDB container name (it should be something like `odh-infrastructure-v2_mongodb1_1` or `odh-infrastructure-v2-mongodb1-1`, depending on the version of `docker-compose`) and run:
docker exec <mongodb-container-name> mongosh --eval "rs.initiate({
_id : 'rs0',
members: [
{ _id : 0, host : 'mongodb1:27017' },
]
})"
Service | Address |
---|---|
Gateway Mosquitto | localhost:1883 |
Gateway Rest | localhost:8080 |
RabbitMQ Panel | localhost:15672 |
RabbitMQ AMQP 0-9-1 port | localhost:5672 |
MongoDB | localhost:27017 |
To publish to and subscribe to the MQTT brokers (gateway or internal), we suggest using MQTTX.
To make REST requests we suggest using Insomnia or any other REST client.
To connect to the MongoDB deployment we suggest using Compass. Be aware that, because the deployment is a `Replica Set`, the URI string must be properly configured (see the MongoDB connection string documentation) and you have to check Direct Connection in the Advanced Connection Options of Compass.
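As a rough illustration of such a URI, the sketch below builds a connection string for the local deployment. The helper name `buildLocalMongoUri` is hypothetical; the host and port come from the address table above. The `directConnection=true` option is the connection string counterpart of the Direct Connection checkbox. It is likely needed here because the replica set member is registered as `mongodb1:27017`, a name that normally resolves only inside the compose network.

```javascript
// Hypothetical helper (not part of the repository): builds a MongoDB URI
// for the local deployment exposed on localhost:27017.
function buildLocalMongoUri({ host = "localhost", port = 27017, direct = true } = {}) {
  const params = new URLSearchParams();
  // directConnection=true tells the driver to talk only to this host
  // instead of discovering the other replica set members.
  if (direct) params.set("directConnection", "true");
  const query = params.toString();
  return `mongodb://${host}:${port}/${query ? "?" + query : ""}`;
}

console.log(buildLocalMongoUri());
```

The resulting string can be pasted into Compass or passed to a MongoDB driver.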
Once all connections are established, you can subscribe to the `MQTT Brokers` to watch for or send messages, send `REST requests`, and connect to the `MongoDB` instance.
! All messages sent to the `Perimeter` must be valid JSON, otherwise, the `Integrator` will discard and log the message.

! All messages sent with MQTT must be sent with `QoS2`.
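The two notes above can be checked on the client side before anything is sent. The following is a minimal sketch, not code from the repository: `prepareMqttMessage` is a hypothetical helper that rejects payloads that are not valid JSON and attaches QoS 2 to the publish options. The topic name is also made up.

```javascript
// Hypothetical helper: validates a payload and pairs it with QoS 2 options.
function prepareMqttMessage(topic, payload) {
  // Objects are serialized; strings are assumed to already be JSON text.
  const text = typeof payload === "string" ? payload : JSON.stringify(payload);
  try {
    JSON.parse(text); // throws a SyntaxError if the text is not valid JSON
  } catch (err) {
    throw new Error(`Payload for topic "${topic}" is not valid JSON: ${err.message}`);
  }
  return { topic, message: text, options: { qos: 2 } };
}

const msg = prepareMqttMessage("example/topic", { temperature: 21.5 });
console.log(msg.message, msg.options);
```

With a client such as MQTT.js, the result could then be passed on as `client.publish(msg.topic, msg.message, msg.options)`.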
Being a PoC, the whole system is meant to give an overall insight and overview of the proposed architecture. Performance can be greatly improved using replicas, sharding, parallel programming, configuration tuning, and polish.
The `notifier`, written in JS and running on Node, is a good example of a possible bottleneck that can be greatly improved.
Instead of having a single instance subscribed to the whole `MongoDB deployment`, it could be split into multiple instances, each one subscribed to a particular `MongoDB Database` or even to single `Collections`.
This kind of polish can be done only once the team decides how to distribute the `RawData` coming from different Datasources.
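To make the split more concrete, here is a minimal sketch of how one notifier instance could open change streams at different scopes. It assumes `client` behaves like a connected `MongoClient` from the official `mongodb` Node.js driver, which offers `watch()` on the client, on a database, and on a collection. The helper `openChangeStreams` and the scope format are hypothetical.

```javascript
// Hypothetical helper: opens one change stream per configured scope.
// `client` is expected to behave like a connected MongoClient.
function openChangeStreams(client, scopes) {
  return scopes.map(({ db, collection }) => {
    if (!db) return client.watch(); // whole deployment
    if (!collection) return client.db(db).watch(); // a single database
    return client.db(db).collection(collection).watch(); // a single collection
  });
}

// Example scope lists (database and collection names are made up):
// instance A: [{ db: "datasourceA" }]
// instance B: [{ db: "datasourceB", collection: "rawdata" }]
```

Each instance would receive its own scope list through configuration, so the load is spread without changing the notifier logic.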
To use the Change Stream feature, the MongoDB deployment MUST be deployed as a ReplicaSet.
The notifier subscribes to the MongoDB deployment and starts listening for changes. If the MongoDB deployment restarts or goes offline, the notifier MUST detect that the connection is no longer alive and keep trying to reconnect until the MongoDB deployment is back online.
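One possible shape for such a reconnection mechanism is a retry loop with exponential backoff. The sketch below is illustrative only and is not the notifier's actual code: `connect` stands for a hypothetical async function that opens the MongoDB connection and the change stream, and the default delays are arbitrary assumptions.

```javascript
// Delay before retry number `attempt` (0-based): doubles each time, capped at maxMs.
function backoffDelayMs(attempt, baseMs = 500, maxMs = 30000) {
  return Math.min(maxMs, baseMs * 2 ** attempt);
}

const sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

// Keeps calling `connect` (hypothetical, supplied by the caller) until it
// resolves, waiting longer after each failure.
async function connectWithRetry(connect, { baseMs = 500, maxMs = 30000, wait = sleep } = {}) {
  for (let attempt = 0; ; attempt++) {
    try {
      return await connect();
    } catch (err) {
      const delay = backoffDelayMs(attempt, baseMs, maxMs);
      console.error(`MongoDB unavailable (${err.message}); retrying in ${delay} ms`);
      await wait(delay);
    }
  }
}
```

The notifier could call `connectWithRetry` at startup and again whenever the change stream reports an error or closes, so that it resumes listening once the deployment is reachable.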
The following section is dedicated to the troubleshooting of known or common issues.
There are a number of common issues related to `docker-compose` that might prevent the correct setup of the local environment. Verify that the following hold for your local runtime environment and operating system:
- The `docker` and `docker-compose` versions are up-to-date
- The `docker` caches are not interfering with a newer configuration (run `docker system prune --all` to remove stopped containers, unused networks, unused images, and the build cache; add `--volumes` to also remove unused volumes)
- The branch is up-to-date (run `git pull`)
- The permissions assigned to the mounted volumes (folders) on your host are valid (MongoDB is especially known to generate issues with the permissions)
- Windows line endings are not interfering with bash scripts and tools configurations (run `git rm --cached -r . && git reset --hard` to force line endings normalization) - this issue usually affects only builds on Windows systems; please visit the official Git documentation to learn more about gitattributes and line endings normalization.