An Ecuadorian supermarket publishes its prices on a webpage so that consumers can check them. That webpage does not let users effectively filter, sort, or search for the price of a particular product. The purpose of this app is to give consumers easier access to the product price information. The app can be accessed through the link. This repo is for testing only, so it extracts and loads data for 200 products, but the DAG can be easily modified to extract the whole dataset.
The application consists of an Airflow app that schedules daily updates to the webpage's database. The webpage is a Plotly Dash app that reads data from a PostgreSQL database and presents it in a table that supports filtering and searching. The architecture is described in the following figure.
The following instructions guide you through creating two separate apps in Heroku. One app runs the web interface of the table and its connection to a Postgres database. The other app runs the Airflow DAG that extracts data from the web API and then loads it into the database of the first app.
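The extract-and-load flow that the Airflow DAG performs can be sketched in plain Python. This is only an illustration under stated assumptions: the record fields (`name`, `price`), the `fetch_products` stub, and the `products` table are hypothetical, and an in-memory SQLite database stands in for the web app's Postgres database so the sketch runs without external services. The 200-product cap mirrors the testing limit mentioned above.

```python
import sqlite3
from typing import Dict, Iterable, List, Tuple

MAX_PRODUCTS = 200  # testing limit described in this README


def fetch_products() -> List[Dict[str, object]]:
    """Hypothetical stand-in for the call to the supermarket web API."""
    return [{"name": f"product-{i}", "price": 1.0 + i / 100} for i in range(500)]


def to_rows(records: Iterable[Dict[str, object]],
            limit: int = MAX_PRODUCTS) -> List[Tuple[str, float]]:
    """Keep at most `limit` records and shape them as (name, price) tuples."""
    rows: List[Tuple[str, float]] = []
    for record in records:
        if len(rows) >= limit:
            break
        rows.append((str(record["name"]), float(record["price"])))
    return rows


def load_rows(conn: sqlite3.Connection, rows: List[Tuple[str, float]]) -> int:
    """Replace the table contents with the latest extract; return the row count."""
    with conn:  # commits on success, rolls back on error
        conn.execute(
            "CREATE TABLE IF NOT EXISTS products (name TEXT PRIMARY KEY, price REAL)"
        )
        conn.execute("DELETE FROM products")
        conn.executemany("INSERT INTO products (name, price) VALUES (?, ?)", rows)
    return conn.execute("SELECT COUNT(*) FROM products").fetchone()[0]


# SQLite is an assumption made for this sketch; the real target is Postgres.
conn = sqlite3.connect(":memory:")
loaded = load_rows(conn, to_rows(fetch_products()))
print(loaded)
```

In a DAG, each of these functions would typically become its own task, and raising `MAX_PRODUCTS` (or removing the cap) is the kind of change needed to extract the whole dataset.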
-
Clone repository
-
Create a conda environment for the Python project (for local testing)
conda create -n myenv python=3.8.10
conda activate myenv
-
Install libraries
pip install dash pandas psycopg2-binary requests
pip install apache-airflow==2.1.0 apache-airflow-providers-postgres
-
Log in to the Heroku CLI (instructions for installation)
heroku login
-
Inside the dashApp directory, create a git repository
cd dashApp
git init
-
Create app in Heroku
heroku create --app <WEB-APP-NAME>
-
Add Heroku Postgres to app
Run from terminal
heroku addons:create heroku-postgresql:hobby-dev --app <WEB-APP-NAME>
-
Write down the DATABASE_URL config var to use it in step 20
heroku config
-
Commit changes
git add .
git commit -am "Initial commit"
git push heroku master
-
Inside the airflowApp directory, create a git repository
git init
-
Create an app in Heroku CLI
heroku create --app <AIRFLOW-APP-NAME>
-
Add Heroku Postgres to app
Run from terminal
heroku addons:create heroku-postgresql:hobby-dev --app <AIRFLOW-APP-NAME>
-
Create Fernet key
From Python terminal
>>> from cryptography import fernet
>>> fernet.Fernet.generate_key()
b'pZcwcoB8RQfjtE9n0Du5Weu8zLKoFphKkiGDBihOwcM='
>>>
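A Fernet key is 32 random bytes encoded as URL-safe base64, which gives a 44-character value like the one printed above. The following standard-library sketch produces a value of the same shape and checks it. The helper names are made up for illustration, and `Fernet.generate_key()` remains the straightforward way to create the key.

```python
import base64
import secrets


def generate_fernet_style_key() -> bytes:
    """Return 32 random bytes encoded as URL-safe base64 (the Fernet key format)."""
    return base64.urlsafe_b64encode(secrets.token_bytes(32))


def looks_like_fernet_key(key: bytes) -> bool:
    """Check that `key` decodes from URL-safe base64 to exactly 32 bytes."""
    try:
        return len(base64.urlsafe_b64decode(key)) == 32
    except ValueError:  # binascii.Error is a subclass of ValueError
        return False


key = generate_fernet_style_key()
print(len(key), looks_like_fernet_key(key))
```

A check like `looks_like_fernet_key` can help catch a key that was truncated when it was pasted into a config var.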
-
Create Config Vars
Create the following Config Vars
AIRFLOW__CORE__FERNET_KEY = <YOUR-FERNET-KEY>
AIRFLOW__CORE__LOAD_EXAMPLES = False
AIRFLOW__CORE__SQL_ALCHEMY_CONN = <SAME-AS-DATABASE_URL-CONFIG-VAR>
AIRFLOW__WEBSERVER__AUTHENTICATE = True
AIRFLOW_HOME = /app
They can be created in two ways:
- Heroku CLI as:
heroku config:set <CONFIG-VAR-NAME>=<CONFIG-VAR-VALUE>
- From Settings in the Heroku app webpage
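Because `heroku config:set` accepts several `KEY=VALUE` pairs in one call, the Config Vars listed above can be set with a single command. The sketch below only builds and prints that command. The function name is hypothetical, and the placeholder values are kept exactly as they appear in this README, so they must be replaced before running anything.

```python
import shlex
from typing import Dict, List


def build_config_set_command(app_name: str, config_vars: Dict[str, str]) -> List[str]:
    """Build the argument list for one `heroku config:set` call."""
    # Each pair is written as KEY=VALUE with no spaces around the equals sign.
    pairs = [f"{name}={value}" for name, value in config_vars.items()]
    return ["heroku", "config:set", *pairs, "--app", app_name]


config_vars = {
    "AIRFLOW__CORE__FERNET_KEY": "<YOUR-FERNET-KEY>",
    "AIRFLOW__CORE__LOAD_EXAMPLES": "False",
    "AIRFLOW__CORE__SQL_ALCHEMY_CONN": "<SAME-AS-DATABASE_URL-CONFIG-VAR>",
    "AIRFLOW__WEBSERVER__AUTHENTICATE": "True",
    "AIRFLOW_HOME": "/app",
}

command = build_config_set_command("<AIRFLOW-APP-NAME>", config_vars)
# shlex.join quotes each argument so the printed line can be pasted into a shell.
print(shlex.join(command))
```

To execute the command from Python instead of pasting it, the list can be passed to `subprocess.run(command, check=True)`.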
-
Set Procfile to initialize the Airflow database
- On Procfile write
web: airflow db init
-
Commit changes to heroku repo
git add . git commit -am "Initial commit" git push heroku master
-
SSH to Heroku instance
Run
heroku run bash
-
In heroku app terminal create User
airflow users create \
    --username admin \
    --firstname Peter \
    --lastname Parker \
    --role Admin \
    --email your-email@mail.org
-
Exit from heroku app terminal
-
Log in to the Airflow Web UI at <AIRFLOW-APP-NAME>.herokuapp.com and create a variable to store the web app's database URL from step 8
Name the variable db_url
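The `db_url` variable holds a standard Postgres connection URL. To illustrate what that value contains, the sketch below splits such a URL into keyword arguments of the kind `psycopg2.connect` accepts (`dbname`, `user`, `password`, `host`, `port`). The URL is a made-up example and `parse_db_url` is a hypothetical helper, not code from this repo. Inside a DAG the value would come from `Variable.get("db_url")` rather than from a literal.

```python
from typing import Dict, Union
from urllib.parse import unquote, urlsplit


def parse_db_url(db_url: str) -> Dict[str, Union[str, int]]:
    """Split a postgres:// or postgresql:// URL into connection keyword arguments."""
    parts = urlsplit(db_url)
    if parts.scheme not in ("postgres", "postgresql"):
        raise ValueError(f"unexpected scheme: {parts.scheme!r}")
    return {
        "dbname": parts.path.lstrip("/"),
        "user": unquote(parts.username or ""),  # undo percent-encoding
        "password": unquote(parts.password or ""),
        "host": parts.hostname or "",
        "port": parts.port or 5432,  # default Postgres port
    }


# Made-up URL for illustration only.
params = parse_db_url("postgres://alice:s3cret@db.example.com:5432/prices")
print(params["host"], params["dbname"])
```

With psycopg2 installed, the result could be used as `psycopg2.connect(**params)`. Passing the URL string directly to `psycopg2.connect` also works, so splitting is mainly useful for inspecting or logging the parts without the password.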
-
Set Procfile to run the Airflow webserver and scheduler
- On Procfile write
web: airflow webserver --port $PORT --daemon & airflow scheduler
-
Commit changes to heroku repo
git add .
git commit -am "Update Procfile"
git push heroku master
-
Log in to Airflow Web UI and run DAG