This repository contains the DAG code used in the "Use Cohere and OpenSearch to analyze customer feedback in an MLOps pipeline" use case.
The DAG in this repository uses the following packages:
This section explains how to run this repository with Airflow. Note that you will need to copy the contents of the .env_example file to a newly created .env file and provide your own value for <your-cohere-api-key>. You can find your Cohere API key in the Cohere dashboard; a free account is sufficient to run this example.
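The copy-and-fill step for the .env file can also be scripted. The following is a minimal sketch, assuming the placeholder `<your-cohere-api-key>` appears literally in .env_example; the helper name `create_env_file` is hypothetical and not part of this repository.

```python
from pathlib import Path


def create_env_file(project_dir, api_key, placeholder="<your-cohere-api-key>"):
    """Copy .env_example to .env, replacing the API key placeholder.

    Assumes the placeholder text appears literally in .env_example.
    """
    project = Path(project_dir)
    template = (project / ".env_example").read_text()
    if placeholder not in template:
        # Fail loudly rather than writing a .env file without a key.
        raise ValueError(f"placeholder {placeholder!r} not found in .env_example")
    env_path = project / ".env"
    env_path.write_text(template.replace(placeholder, api_key))
    return env_path
```

Call it with the path to your cloned repository and your own key, for example `create_env_file(".", "my-key")`. Copying the file by hand and editing it works just as well.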
Download the Astro CLI to run Airflow locally in Docker. astro is the only package you will need to install locally.
- Run git clone https://github.com/astronomer/airflow-pgvector-tutorial.git on your computer to create a local clone of this repository.
- Install the Astro CLI by following the steps in the Astro CLI documentation. Docker Desktop/Docker Engine is a prerequisite, but you don't need in-depth Docker knowledge to run Airflow with the Astro CLI.
- Run astro dev start in your cloned repository.
- After your Astro project has started, view the Airflow UI at localhost:8080.
In this project, astro dev start spins up 6 Docker containers:
- The Airflow webserver, which runs the Airflow UI and can be accessed at http://localhost:8080/.
- The Airflow scheduler, which is responsible for monitoring and triggering tasks.
- The Airflow triggerer, which is an Airflow component used to run deferrable operators.
- The Airflow metadata database, which is a Postgres database that runs on port 5432.
- A Python container running a mock API that generates synthetic customer feedback data, accessible at port 5000.
- A local OpenSearch instance, which runs on port 9200.
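To confirm that the containers listed above are up, you can probe their ports from the host. This is a minimal sketch, assuming the ports are published on localhost as listed; the service labels and function names are illustrative and not taken from this repository. The scheduler and triggerer expose no port in the list, so they are not checked.

```python
import socket

# Ports taken from the container list above; the labels are informal names.
SERVICE_PORTS = {
    "airflow-webserver": 8080,
    "postgres-metadata-db": 5432,
    "mock-feedback-api": 5000,
    "opensearch": 9200,
}


def port_is_open(port, host="localhost", timeout=1.0):
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, timeouts, and unreachable hosts.
        return False


def check_services(ports=SERVICE_PORTS, host="localhost"):
    """Map each service label to whether its port accepts connections."""
    return {name: port_is_open(port, host) for name, port in ports.items()}
```

Calling `check_services()` after astro dev start returns a dictionary of service labels and booleans. An open port only shows that something is listening, not that the service behind it is healthy.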
- Use Cohere and OpenSearch to analyze customer feedback in an MLOps pipeline use case.
- Orchestrate OpenSearch operations with Apache Airflow.
- Orchestrate Cohere LLMs with Apache Airflow.
- Airflow OpenSearch provider documentation.
- Airflow Cohere provider documentation.
- Cohere documentation.
- OpenSearch documentation.