Executes work for a datastore by polling a job-service for available jobs. Uses subprocesses for asynchronous workloads.
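As a rough illustration of the poll-and-dispatch pattern described above, the sketch below fetches the available jobs and starts one subprocess per job. It is not this repository's implementation: the function names, the `jobId` field, and the five-second interval are all hypothetical, and `fetch_jobs` stands in for whatever call actually queries the job-service.

```python
import multiprocessing
import time
from typing import Callable, Iterable


def run_job(job: dict) -> None:
    """Placeholder worker; a real worker would process the job's dataset."""
    print(f"working on job {job['jobId']}")  # 'jobId' is a hypothetical field


def poll_once(
    fetch_jobs: Callable[[], Iterable[dict]],
    start_worker: Callable[[dict], object],
) -> list:
    """Fetch the currently available jobs and start one worker per job."""
    return [start_worker(job) for job in fetch_jobs()]


def start_subprocess(job: dict) -> multiprocessing.Process:
    """Run the job in a separate process so the polling loop is not blocked."""
    process = multiprocessing.Process(target=run_job, args=(job,))
    process.start()
    return process


def poll_forever(
    fetch_jobs: Callable[[], Iterable[dict]],
    interval_seconds: float = 5.0,  # hypothetical polling interval
) -> None:
    """Poll at a fixed interval; fetch_jobs would query the job-service."""
    while True:
        poll_once(fetch_jobs, start_subprocess)
        time.sleep(interval_seconds)
```

Passing the fetch and start functions in as arguments keeps `poll_once` easy to exercise in a test without a running job-service or real subprocesses.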
To work on this repository, you need to install Poetry:
# macOS / Linux / Bash on Windows
curl -sSL https://install.python-poetry.org | python3 -
# Windows PowerShell
(Invoke-WebRequest -Uri https://install.python-poetry.org -UseBasicParsing).Content | py -
Then create the virtual environment and install the dependencies from the root directory:
poetry install
In a JetBrains IDE such as PyCharm, install the Poetry plugin and add a Python interpreter of type "Poetry Environment". See https://plugins.jetbrains.com/plugin/14307-poetry
To run the tests, open a terminal in the root directory of the project and run:
poetry run pytest --cov=job_executor/
To build the Docker image, run:
docker build --tag job_executor .
To stub out collaborating services, start WireMock from the wiremock directory:
cd wiremock
docker run -it --rm \
-p 8080:8080 \
--name wiremock \
-v $PWD:/home/wiremock \
wiremock/wiremock:3x
Access http://localhost:8080/__admin/mappings to display the mappings.
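The same admin endpoint can also be read from a script. The sketch below assumes the WireMock container above is running on port 8080 and that the response follows WireMock's usual shape, a JSON object with a `mappings` list. The helper names are made up for this example.

```python
import json
from urllib.request import urlopen

ADMIN_MAPPINGS_URL = "http://localhost:8080/__admin/mappings"

# WireMock allows one of these URL matchers per stub.
URL_KEYS = ("url", "urlPath", "urlPattern", "urlPathPattern")


def summarize_mappings(payload: dict) -> list[tuple[str, str]]:
    """Return (method, URL matcher) for each stub in an admin API response."""
    summary = []
    for mapping in payload.get("mappings", []):
        request = mapping.get("request", {})
        url = next((request[key] for key in URL_KEYS if key in request), "(any URL)")
        summary.append((request.get("method", "ANY"), url))
    return summary


def fetch_mappings(url: str = ADMIN_MAPPINGS_URL) -> dict:
    """Fetch the mappings document from a running WireMock container."""
    with urlopen(url) as response:
        return json.load(response)
```

With the container up, `summarize_mappings(fetch_mappings())` gives a quick overview of which requests are currently stubbed.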
There is an initial set of mappings under wiremock/mappings. Feel free to add more if needed.
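A new mapping is a JSON file in that directory. As a minimal sketch, the helpers below build a stub in WireMock's mapping format and write it to disk. The helper names and any endpoint path passed in are placeholders, not endpoints the real services are known to expose.

```python
import json
from pathlib import Path


def build_mapping(method: str, url_path: str, body: object, status: int = 200) -> dict:
    """Build a WireMock stub mapping that returns `body` as JSON."""
    return {
        "request": {"method": method, "urlPath": url_path},
        "response": {
            "status": status,
            "headers": {"Content-Type": "application/json"},
            "jsonBody": body,
        },
    }


def write_mapping(directory: Path, name: str, mapping: dict) -> Path:
    """Write the mapping as <name>.json inside the given directory."""
    directory.mkdir(parents=True, exist_ok=True)
    path = directory / f"{name}.json"
    path.write_text(json.dumps(mapping, indent=2), encoding="utf-8")
    return path
```

For example, `write_mapping(Path("wiremock/mappings"), "example", build_mapping("GET", "/example", []))` would add a stub that answers `GET /example` with an empty JSON list. WireMock reads the directory at startup, so restart the container to pick up new files.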
Then set both PSEUDONYM_SERVICE_URL and JOB_SERVICE_URL to http://localhost:8080 and run the application.
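If the application is launched from a script, the two settings can be prepared in code. This sketch assumes both values are read from environment variables, which this README does not state explicitly. The helper name is hypothetical.

```python
import os
from typing import Mapping, Optional

STUB_URL = "http://localhost:8080"


def stubbed_environment(base: Optional[Mapping[str, str]] = None) -> dict:
    """Copy an environment and point both service URLs at WireMock."""
    # Assumption: the application reads these two names from the environment.
    env = dict(os.environ if base is None else base)
    env["PSEUDONYM_SERVICE_URL"] = STUB_URL
    env["JOB_SERVICE_URL"] = STUB_URL
    return env
```

The returned dictionary can be passed as the `env` argument of `subprocess.run`, together with whatever command starts the application.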
- Poetry - Python dependency and package management
- PyMongo - MongoDB Driver
- PyArrow - Apache Arrow
- Pandas - Data analysis and manipulation
- microdata-tools - Dataset packaging and validation