This package was provided by "Die Post" for the public domain.
For more information about the license, see this link.
- Sign up here
- Go to the download center
- Download the resource.
Here is how to build this package.
python __main__.py --build=wordlist
python __main__.py --build=pos
python __main__.py --build=ner
The documentation about the resource can be found here.
As the documentation shows, we do not need all tables, since we only want to build text files.
These tables are not needed:
- NEW_HEA (00)
- NEW_GEB (06)
- NEW_GEBA (07)
- NEW_BOT_B (08)
- NEW_GEB_COM (12)
To save space, we only include a filtered version on GitHub.
The original file contains ~4'000'000 lines. After removing the unnecessary tables, it contains ~2'000'000 lines.
Place a current dump of the resources at data/adressdaten_raw.csv and run the script python /data/__main__.py. This will create a new adressdaten.csv file.
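
The filtering step described above can be sketched roughly as follows. This is a minimal illustration, not the actual contents of the script: it assumes that each line of the dump starts with the two-digit record-type code (the number in parentheses in the list above) followed by a semicolon, and the names filter_records and filter_file, the output path, and the latin-1 encoding are all hypothetical choices made for this example.

```python
# Record-type codes of the tables listed above as not needed.
EXCLUDED_CODES = {"00", "06", "07", "08", "12"}


def filter_records(src, dst, excluded=EXCLUDED_CODES, delimiter=";"):
    """Copy lines from src to dst, skipping excluded record types.

    Assumption: the first delimiter-separated field of every line is the
    two-digit record-type code. Returns a (kept, dropped) tuple of counts.
    """
    kept = dropped = 0
    for line in src:
        code = line.split(delimiter, 1)[0].strip()
        if code in excluded:
            dropped += 1
        else:
            dst.write(line)
            kept += 1
    return kept, dropped


def filter_file(raw_path="data/adressdaten_raw.csv",
                out_path="data/adressdaten.csv",
                encoding="latin-1"):
    """Filter a raw dump on disk (paths and encoding are assumptions)."""
    with open(raw_path, encoding=encoding) as src, \
         open(out_path, "w", encoding=encoding) as dst:
        return filter_records(src, dst)
```

Streaming line by line keeps memory use flat even for a dump with several million lines, and returning the counts makes it easy to check the result against the line numbers quoted above.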