This project collects the metadata of CKAN-based Open Data Portals, in this specific case the Berlin Open Data Portal, and saves it into a database as-is for further analysis.
Run tests via `pytest`.

Install packages with `pip install -r requirements.txt`.

Create a `.env` file and run `local.py` to test it out locally.
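A minimal `.env` sketch: the `DB_*` names match the env parameters documented below, while `API_KEY` is an assumed name for the Berlin Open Data Portal key, and all values are placeholders.

```shell
# Hypothetical .env sketch; values are placeholders.
DB_USER=ckan_importer
DB_PASSWORD=changeme
DB_HOST=localhost
DB_PORT=5432
DB_NAME=ckan_metadata
API_KEY=your-berlin-open-data-api-key
```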
This project is meant to run on serverless infrastructure; the preferred target is AWS Lambda.
For AWS Lambda, the file `handler.py` is the entry point. The event object contains the information needed to proceed, especially the API key for the Berlin Open Data Portal, the URL, and the city name. The URL should be `https://datenregister.berlin.de` and `city_name` would be `berlin`.
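A plausible event payload, based on the fields named above (the exact key names are assumptions, not taken from the project's code):

```json
{
  "api_key": "your-api-key",
  "url": "https://datenregister.berlin.de",
  "city_name": "berlin"
}
```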
The `load_data` method/entry point will use the CKAN API to retrieve all packages in the Open Data Portal and will start new Lambda functions to import them into the DB. This method does not need the DB env parameters.
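A rough Python sketch of the flow just described: list all package IDs via CKAN's `package_list` action, then fan out one asynchronous Lambda invocation per package. Function and payload key names here are assumptions, not the project's actual code.

```python
import json
import urllib.request


def list_package_ids(url, api_key):
    """Fetch all package IDs via CKAN's package_list action."""
    req = urllib.request.Request(
        f"{url}/api/3/action/package_list",
        headers={"Authorization": api_key},
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)["result"]


def build_import_event(package_id, url, api_key):
    """Payload for one import_package invocation (key names assumed)."""
    return {"package_id": package_id, "url": url, "api_key": api_key}


def load_data(event, context=None):
    """Fan out one async Lambda invocation per package (names assumed)."""
    import boto3  # available in the Lambda runtime

    client = boto3.client("lambda")
    for package_id in list_package_ids(event["url"], event["api_key"]):
        client.invoke(
            FunctionName="import_package",  # assumed function name
            InvocationType="Event",  # fire-and-forget, one Lambda per package
            Payload=json.dumps(
                build_import_event(package_id, event["url"], event["api_key"])
            ),
        )
```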
`import_package` can be called individually for a `package_id`, but will also be called from `load_data` for every package in the Open Data Portal. Parameters are `package_id`, `url`, and `api_key`. It will use the CKAN API to retrieve the package information and save it, including resources, tags, and groups, to the DB. This method does need the DB env parameters:
- `DB_USER`
- `DB_PASSWORD`
- `DB_HOST`
- `DB_PORT`
- `DB_NAME`
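A sketch of the `import_package` flow under those env parameters: fetch the package via CKAN's `package_show` action and split it into the pieces to persist. The record layout and connection-string format are assumptions, not the project's actual schema.

```python
import json
import os
import urllib.request


def fetch_package(url, api_key, package_id):
    """Fetch full package metadata via CKAN's package_show action."""
    req = urllib.request.Request(
        f"{url}/api/3/action/package_show?id={package_id}",
        headers={"Authorization": api_key},
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)["result"]


def extract_records(package):
    """Split a CKAN package dict into the pieces this importer stores."""
    return {
        "package": {k: package.get(k) for k in ("id", "name", "title", "metadata_modified")},
        "resources": package.get("resources", []),
        "tags": [t["name"] for t in package.get("tags", [])],
        "groups": [g["name"] for g in package.get("groups", [])],
    }


def db_dsn():
    """Build a connection string from the documented DB_* env parameters."""
    return (
        f"postgresql://{os.environ['DB_USER']}:{os.environ['DB_PASSWORD']}"
        f"@{os.environ['DB_HOST']}:{os.environ['DB_PORT']}/{os.environ['DB_NAME']}"
    )
```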
Will create the DB tables needed by the importer method. This method does need the DB env parameters!
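The README does not spell out the schema; one plausible shape, given that packages, resources, tags, and groups are stored, might be the following. Every table and column name here is an assumption.

```sql
-- Hypothetical schema sketch; all names are assumptions.
CREATE TABLE packages (
    id TEXT PRIMARY KEY,
    name TEXT,
    title TEXT,
    metadata_modified TIMESTAMP
);

CREATE TABLE resources (
    id TEXT PRIMARY KEY,
    package_id TEXT REFERENCES packages (id),
    url TEXT,
    format TEXT
);

CREATE TABLE package_tags (
    package_id TEXT REFERENCES packages (id),
    name TEXT
);

CREATE TABLE package_groups (
    package_id TEXT REFERENCES packages (id),
    name TEXT
);
```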
Deployment can be done with the `serverless` npm package. The profile needs to be updated, and the necessary AWS rights are required.
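Typical commands with the Serverless Framework CLI; the profile name is a placeholder for a configured AWS profile with the required rights.

```shell
npm install -g serverless                     # install the Serverless Framework CLI
serverless deploy --aws-profile my-profile    # deploy using a configured AWS profile
```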
Thanks goes to these wonderful people (emoji key):
- Mila Frerichs 💻 📖
- Sebastian Meier 💻 📖
- Lucas Vogel 📖
This project follows the all-contributors specification. Contributions of any kind welcome!
Together with:

A project by:

Supported by: