The DataGov Harvest Orchestrator is a Flask application that manages the data harvesting process. It supports Create, Read, Update, and Delete (CRUD) operations on Harvest Source configurations and triggers harvest jobs.
This project uses Poetry for dependency management. To set up the project:
- Ensure Poetry is installed.
- Clone the repository and navigate to the project directory.
- Install dependencies using Poetry:
  poetry install
- Copy the sample environment file and set your local configurations:
  cp .env.sample .env
  Edit the .env file with your local settings.
- Use the Makefile to set up local Docker containers, including a PostgreSQL database and the Flask application:
  make build
  make up
  make test
  make clean
  This will start the necessary services and run the tests.
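For readers new to the .env convention, the sketch below shows one way an application could read KEY=VALUE pairs from such a file into a dictionary, using only the Python standard library. The keys in the sample (FLASK_APP, DATABASE_URI) and their values are hypothetical; check .env.sample for the real ones. The project itself may load the file differently, for example through Docker Compose.

```python
import os
import tempfile


def parse_env_file(path):
    """Parse a simple KEY=VALUE file, skipping blank lines and # comments."""
    settings = {}
    with open(path, encoding="utf-8") as handle:
        for raw_line in handle:
            line = raw_line.strip()
            if not line or line.startswith("#") or "=" not in line:
                continue
            key, _, value = line.partition("=")
            # Strip optional surrounding quotes from the value.
            settings[key.strip()] = value.strip().strip("'\"")
    return settings


# Demo with a throwaway file; the keys below are hypothetical examples.
sample = (
    "# local settings\n"
    "FLASK_APP=run.py\n"
    "DATABASE_URI='postgresql://user:pass@localhost:5432/db'\n"
)
with tempfile.NamedTemporaryFile("w", suffix=".env", delete=False) as tmp:
    tmp.write(sample)
env_settings = parse_env_file(tmp.name)
os.remove(tmp.name)
```

This parser handles only plain KEY=VALUE lines; it does not support multi-line values or variable interpolation.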
A database service is required for use on cloud.gov.
In a given Cloud Foundry space, a db can be created with
cf create-service <service offering> <plan> <service instance>.
In dev, for example, the db was created with
cf create-service aws-rds micro-psql harvesting-logic-db.
Creating databases for the other spaces should follow the same pattern, though the size may need to be adjusted (see available AWS RDS service offerings with cf marketplace -e aws-rds).
Any created service needs to be bound to an app with cf bind-service <app> <service>. With the above example, the db can be bound with
cf bind-service harvesting-logic harvesting-logic-db.
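Once a service is bound, Cloud Foundry exposes its credentials to the app through the VCAP_SERVICES environment variable, a JSON object keyed by service offering. Below is a minimal sketch of how the Flask app might look up the database connection string. The function name is illustrative, the sample payload is made up, and the presence of a uri field in the credentials is an assumption worth verifying with cf env harvesting-logic.

```python
import json
import os


def database_uri_from_vcap(instance_name, environ=os.environ):
    """Return the 'uri' credential of a bound service instance, or None."""
    services = json.loads(environ.get("VCAP_SERVICES", "{}"))
    # VCAP_SERVICES maps each service offering to a list of bound instances.
    for instances in services.values():
        for instance in instances:
            if instance.get("name") == instance_name:
                # Assumption: the broker exposes a connection string as "uri".
                return instance.get("credentials", {}).get("uri")
    return None


# Made-up payload shaped like what Cloud Foundry injects after cf bind-service.
fake_environ = {
    "VCAP_SERVICES": json.dumps({
        "aws-rds": [{
            "name": "harvesting-logic-db",
            "credentials": {"uri": "postgres://user:secret@db.example:5432/harvest"},
        }]
    })
}
uri = database_uri_from_vcap("harvesting-logic-db", fake_environ)
```

Looking the instance up by name rather than by offering keeps the code unchanged if the plan or offering differs between spaces.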
Accessing the service can be done with service keys. They can be created with cf create-service-key, listed with cf service-keys, and shown with
cf service-key <service instance> <service-key-name>.
- Ensure you have a manifest.yml and vars.yml file configured for your Flask application. The vars.yml file should include variables such as FLASK_APP and database service bindings.
- Deploy the application using Cloud Foundry's cf push command with the variable file:
  poetry export -f requirements.txt --output requirements.txt --without-hashes
  cf push --vars-file vars.yml
Set up a GitHub workflow for commits and deployment.