Airflow DAGs for libsys processes and migrating ILS data into FOLIO
Run `pip install -r requirements.txt` to install the dependencies.
Based on the documentation, *Running Airflow in Docker*.
**NOTE:** Make sure there is enough RAM available locally for the Docker daemon; we recommend at least 5 GB.
- Clone the repository: `git clone https://github.com/sul-dlss/libsys-airflow.git`
- If it's commented out, uncomment the line `- ./dags:/opt/airflow/dags` in `docker-compose.yaml` (under `volumes`, under `x-airflow-common`).
- Start up Docker locally.
- Build the Docker image with `docker build .`
- Create a `.env` file with the `AIRFLOW_UID` and `AIRFLOW_GROUP` values (see the sketch just after this list).
- Run `docker-compose build` to build the customized Airflow image. (Note: the `usermod` command may take a while to complete when running the build.)
- Run `docker compose up airflow-init` to initialize Airflow.
- Bring up Airflow: `docker compose up` runs the containers in the foreground; use `docker compose up -d` to run them as a daemon.
- Access Airflow locally at http://localhost:8080
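A minimal sketch for creating the `.env` file, assuming the upstream Airflow Docker convention of your local UID plus group `0` (both values are assumptions; set whatever your deployment expects):

```bash
# Populate .env with the IDs docker-compose.yaml expects.
# AIRFLOW_GROUP=0 follows the Airflow Docker docs convention; adjust as needed.
echo "AIRFLOW_UID=$(id -u)" > .env
echo "AIRFLOW_GROUP=0" >> .env
```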
- Log into the worker container using `docker exec -it libsys-airflow_airflow-worker-1 /bin/bash` to view the raw work files.
- Install `pip3` with `apt install python3-pip`
- Install Python virtual environments: `apt install python3.8-venv`
- Create the virtual environment in the home directory: `python3 -m venv virtual-env`
- Install docker-compose in the virtual environment: `source virtual-env/bin/activate && pip3 install docker-compose`
- List all the Airflow tasks using `cap -AT airflow`:

```bash
cap {stage} airflow:build
cap {stage} airflow:init
cap {stage} airflow:start
```
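For example, with a hypothetical stage named `prod` (substitute a stage actually defined in your Capistrano configuration):

```bash
# {stage} is a placeholder; pass a real deployment stage, e.g.:
cap prod airflow:build
cap prod airflow:init
cap prod airflow:start
```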
- In the Airflow UI under Admin > Connections, add `bib_path` with connection type `File (Path)`.
- In the Airflow UI under Admin > Variables, import the `folio-dev-variables.json` file from shared_configs.
- In the Airflow UI under Admin > Variables, import the `aeon-variables.json` and the `lobbytrack-variables.json` files from shared_configs.
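The same setup can also be scripted with the standard Airflow CLI instead of the UI; a sketch, assuming the `airflow-webserver` service from docker-compose.yaml and placeholder paths to the shared_configs files:

```bash
# "fs" is the Airflow connection-type id shown as "File (Path)" in the UI.
docker compose exec airflow-webserver airflow connections add bib_path --conn-type fs

# Import variables from the shared_configs JSON files (paths are placeholders).
docker compose exec airflow-webserver airflow variables import /path/to/folio-dev-variables.json
docker compose exec airflow-webserver airflow variables import /path/to/aeon-variables.json
docker compose exec airflow-webserver airflow variables import /path/to/lobbytrack-variables.json
```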
All FOLIO-related code should be in the folio plugin. When developing code in the plugin, you'll need to restart the airflow-webserver container by running `docker-compose restart airflow-webserver` to see changes in the running Airflow environment.
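A typical iteration loop when working on the plugin (the restart command is from this README; following the logs with `-f` is standard docker-compose usage):

```bash
# Restart the webserver so Airflow picks up plugin changes,
# then follow its logs to confirm a clean reload.
docker-compose restart airflow-webserver
docker-compose logs -f airflow-webserver
```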
To run the test suite, use pytest, setting `PYTHONPATH` to the location of your local clone of the folio_migration_tools repository, i.e.:

```bash
PYTHONPATH='{path-to-folio_migration_tools}' pytest
```

To run with code coverage:

```bash
PYTHONPATH=../folio_migration_tools/ coverage run -m pytest
```
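To view the results afterwards (standard coverage.py usage):

```bash
# Print per-file coverage, including which lines were missed.
coverage report -m
```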
MARC data to be converted will be mounted on the sul-libsys-airflow server under `/sirsi_prod`, which is a mount of `/s/SUL/Dataload/Folio` on the Symphony server.