datalake-query-pg-consumer
A Python microservice that consumes datalake query events from a Kafka topic and stores them in a relational database.
- Rationale
- Quick start
- Building
- Installation
- Contributions
- License
- Code of Conduct
- Security Vulnerability Reporting
The purpose of this service is to provide a way of consuming Kafka messages produced by datalake-query-ingester and storing them in a relational database for long-term storage and analysis.
This is part of a datalake query metadata ingestion and analysis pipeline.
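The consume-and-store flow described above can be sketched as follows. This is a minimal illustration, not the service's implementation: a plain list stands in for the Kafka topic, the standard-library `sqlite3` module stands in for the relational database, and the `query_events` table and the `queryId` / `user` event fields are hypothetical names chosen for the example.

```python
import json
import sqlite3


def create_schema(conn):
    # Hypothetical table: one row per query, keyed by query id.
    conn.execute(
        "CREATE TABLE IF NOT EXISTS query_events ("
        "query_id TEXT PRIMARY KEY, username TEXT, payload TEXT)"
    )


def store_event(conn, raw_message):
    """Decode one JSON message and upsert it as a row."""
    event = json.loads(raw_message)
    # INSERT OR REPLACE (SQLite syntax) keeps the write idempotent,
    # so a message delivered twice does not create a duplicate row.
    conn.execute(
        "INSERT OR REPLACE INTO query_events (query_id, username, payload) "
        "VALUES (?, ?, ?)",
        (event["queryId"], event.get("user"), json.dumps(event)),
    )
    conn.commit()


# Stand-in for messages read from the Kafka topic (assumed JSON-encoded).
messages = [
    b'{"queryId": "q1", "user": "alice"}',
    b'{"queryId": "q2", "user": "bob"}',
    b'{"queryId": "q1", "user": "alice"}',  # redelivered message
]

conn = sqlite3.connect(":memory:")
create_schema(conn)
for message in messages:
    store_event(conn, message)

row_count = conn.execute("SELECT COUNT(*) FROM query_events").fetchone()[0]
```

Keying rows by query id is one way to tolerate redelivered messages; whether the actual service does this is not stated here.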
To run the service locally, along with supporting services for testing, run `docker-compose up datalakequerydbconsumer`.
Similarly, to run the tests, run `docker-compose run tests`.
The database driver to install is selected with the `SQLALCHEMY_DEPENDENCIES` build argument.
To build the PostgreSQL image, run `docker build --build-arg SQLALCHEMY_DEPENDENCIES=psycopg2-binary -f Dockerfile -t bloomberg/datalakequerydbconsumer:latest-postgresql .`
Alternatively, run `docker-compose build datalakequerydbconsumer`.
This service is meant to be used with Trino and models its data on Trino's query metrics. It has been tested with Trino 363; backward and forward compatibility are not guaranteed.
The service is meant to be deployed with Kubernetes (k8s). Configuration is passed through environment variables:
- KAFKA_BROKERS
- DATALAKEQUERYDBCONSUMER_KAFKA_TOPIC
- DATALAKEQUERYDBCONSUMER_KAFKA_GROUP_ID
- DATALAKEQUERYDBCONSUMER_DB_URL
An example configuration can be found under the `datalakequerydbconsumer` service in `docker-compose.yaml`.
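As an illustration of how the four variables listed above could be read at startup, here is a minimal sketch. The `ConsumerConfig` class and `load_config` function are hypothetical names for this example and are not taken from the service's code; the sketch also assumes all four variables are required.

```python
import os
from dataclasses import dataclass

# Maps hypothetical config field names to the documented variable names.
ENV_VARS = {
    "kafka_brokers": "KAFKA_BROKERS",
    "kafka_topic": "DATALAKEQUERYDBCONSUMER_KAFKA_TOPIC",
    "kafka_group_id": "DATALAKEQUERYDBCONSUMER_KAFKA_GROUP_ID",
    "db_url": "DATALAKEQUERYDBCONSUMER_DB_URL",
}


@dataclass(frozen=True)
class ConsumerConfig:
    kafka_brokers: str
    kafka_topic: str
    kafka_group_id: str
    db_url: str


def load_config(env=None):
    """Build a ConsumerConfig from a mapping, defaulting to os.environ."""
    env = os.environ if env is None else env
    # Fail fast and report every missing or empty variable at once.
    missing = [var for var in ENV_VARS.values() if not env.get(var)]
    if missing:
        raise RuntimeError(
            "Missing required environment variables: " + ", ".join(missing)
        )
    return ConsumerConfig(**{field: env[var] for field, var in ENV_VARS.items()})
```

Accepting a mapping argument makes the loader easy to exercise in tests without touching the real environment, for example `load_config({"KAFKA_BROKERS": "localhost:9092", ...})`.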
We ❤️ contributions.
Have you had a good experience with this project? Why not share some love and contribute code, or just let us know about any issues you had with it?
We welcome issue reports here; be sure to choose the proper issue template for your issue, so that we can be sure you're providing the necessary information.
Before sending a Pull Request, please make sure you read our Contribution Guidelines.
Please read the LICENSE file.
This project has adopted a Code of Conduct. If you have any concerns about the Code, or behavior which you have experienced in the project, please contact us at opensource@bloomberg.net.
If you believe you have identified a security vulnerability in this project, please send email to the project team at opensource@bloomberg.net, detailing the suspected issue and any methods you've found to reproduce it.
Please do NOT open an issue in the GitHub repository, as we'd prefer to keep vulnerability reports private until we've had an opportunity to review and address them.