This repository has been archived by the owner on Jun 5, 2021. It is now read-only.

simonwoerpel/memorious-sehrgutachten


ARCHIVED

This scraper has been incorporated into https://github.com/okfde/dokukratie/ and is maintained there.

memorious-sehrgutachten

A simple memorious extension to download documents from the Wissenschaftliche Dienste des Deutschen Bundestags.

Despite what the name suggests, it is not technically based on https://sehrgutachten.de but scrapes the Bundestag website directly.

It downloads the files and metadata into a local folder.

usage

The startdate and enddate parameters need to be set via the STARTDATE and ENDDATE environment variables:

STARTDATE=2021-05-01 ENDDATE=`date '+%Y-%m-%d'` memorious run sehrgutachten
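
Since both values are passed as plain strings, a small check before starting a run can catch typos early. The following is a minimal sketch, assuming the scraper expects the YYYY-MM-DD format shown above; the helper name is_iso_date is hypothetical and not part of memorious or this repository:

```shell
# Hypothetical helper: succeeds if $1 looks like YYYY-MM-DD.
# It only checks the shape, not whether the date exists in the calendar.
is_iso_date() {
    case "$1" in
        [0-9][0-9][0-9][0-9]-[0-9][0-9]-[0-9][0-9]) return 0 ;;
        *) return 1 ;;
    esac
}

STARTDATE=2021-05-01
ENDDATE=$(date '+%Y-%m-%d')

if is_iso_date "$STARTDATE" && is_iso_date "$ENDDATE"; then
    # Export so that a later memorious call in this shell sees both values.
    export STARTDATE ENDDATE
    echo "date window: $STARTDATE to $ENDDATE"
else
    echo "STARTDATE and ENDDATE must look like YYYY-MM-DD" >&2
fi
```

With both variables exported, memorious run sehrgutachten can then be started from the same shell.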

If running locally, make sure the memorious config path is set as well:

MEMORIOUS_CONFIG_PATH=src
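
Both settings can be combined in a single invocation. The wrapper below is a minimal sketch; the function name run_local is hypothetical, and it simply passes the values from the commands above through to memorious:

```shell
# Hypothetical wrapper: run_local STARTDATE ENDDATE
# Sets the config path and the date window for this one memorious call.
run_local() {
    MEMORIOUS_CONFIG_PATH=src STARTDATE="$1" ENDDATE="$2" \
        memorious run sehrgutachten
}
```

For example, run_local 2021-05-01 2021-05-31 would run the scraper with a date window covering May 2021, assuming memorious is installed and the relative path src resolves from the current directory.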

local installation / development

git clone https://github.com/simonwoerpel/memorious-sehrgutachten.git
cd memorious-sehrgutachten
pip install -e .

make changes

All the magic happens in src/sehrgutachten.py and src/sehrgutachten.yml.

production use / deployment

To use the scraper in production, a proper Redis and PostgreSQL setup should be used.

Please refer to the official documentation of memorious.
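
As a rough sketch of what such a setup might look like, memorious reads its service connections from environment variables. The variable names REDIS_URL and MEMORIOUS_DATASTORE_URI below are as described in the memorious documentation and should be checked against the installed version; the hosts, credentials and database name are placeholders:

```shell
# Placeholder connection settings; replace hosts and credentials with real ones.
# Variable names are assumed from the memorious documentation.
export REDIS_URL="redis://localhost:6379/0"
export MEMORIOUS_DATASTORE_URI="postgresql://memorious:secret@localhost:5432/memorious"
```

The STARTDATE and ENDDATE variables still need to be set for each run in addition to these.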

About

Scrape public documents of "Wissenschaftliche Dienste des Deutschen Bundestags" via memorious into aleph.
