cardinality.py is a small tool written in Python for examining the cardinality of a relation between two datasets. All the code is contained in a single file that can be imported as a Python module or used as a command-line tool.
The code has been tested with Python 3.7.
Source repository: https://github.com/naturhistoriska/cardinality.py
Prerequisites:
- Python 3
- The Python library pandas
An easy way to get Python working on your computer is to install the free Anaconda distribution.
The project is hosted at <https://github.com/naturhistoriska/cardinality.py> and can be downloaded using git:
$ git clone https://github.com/naturhistoriska/cardinality.py
$ ./cardinality.py --help
usage: cardinality.py [-h] [-V] [-v] [-p column [column ...]] [-f column [column ...]] pk-file fk-file

Command-line utility for examining the cardinality of the relation between two TSV-files.

positional arguments:
  pk-file               TSV-file with primary keys
  fk-file               TSV-file with foreign keys

optional arguments:
  -h, --help            show this help message and exit
  -V, --version         show program's version number and exit
  -v, --verbose         show verbose output
  -p column [column ...]
                        primary key columns
  -f column [column ...]
                        foreign key columns
Examine the relation between two example datasets included in this repository.
$ ./cardinality.py test_files/pk-data.tsv test_files/fk-data.tsv -p pk -f fk
0,1 to 0,3
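The result line appears to give a minimum,maximum pair for each side of the relation. Assuming that reading (this README does not spell it out), the idea can be sketched in plain Python. The helper `cardinality_sketch` and the key values below are hypothetical and made up for illustration; the sketch uses `collections.Counter` rather than pandas and is not the implementation used by cardinality.py.

```python
from collections import Counter


def cardinality_sketch(pk_values, fk_values):
    """Return ((pk_min, pk_max), (fk_min, fk_max)) for two key columns.

    Hypothetical helper for illustration only; it is not part of
    cardinality.py and does not reflect how the tool is implemented.
    """
    # Count how many rows carry each distinct key on either side.
    pk_counts = Counter(pk_values)
    fk_counts = Counter(fk_values)

    # Number of primary-key rows matched by each distinct foreign-key value.
    pk_per_fk = [pk_counts.get(key, 0) for key in fk_counts]
    # Number of foreign-key rows referring to each distinct primary-key value.
    fk_per_pk = [fk_counts.get(key, 0) for key in pk_counts]

    return (
        (min(pk_per_fk, default=0), max(pk_per_fk, default=0)),
        (min(fk_per_pk, default=0), max(fk_per_pk, default=0)),
    )


# Made-up keys, not the contents of the repository's test files.
pk_side, fk_side = cardinality_sketch([1, 2, 3, 4], [1, 1, 1, 2, 5])
summary = "{},{} to {},{}".format(*pk_side, *fk_side)
print(summary)  # prints "0,1 to 0,3"
```

In this made-up data, the foreign key 5 has no matching primary key (minimum 0 on the primary-key side), and the primary key 1 is referenced three times (maximum 3 on the foreign-key side).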
Testing is carried out with pytest:
$ pytest -v test_cardinality.py
Test coverage can be calculated with Coverage.py using the following commands:
$ coverage run -m pytest
$ coverage report -m cardinality.py
The code follows the style conventions in PEP 8, which can be checked with pycodestyle:
$ pycodestyle cardinality.py test_cardinality.py
cardinality.py is distributed under the MIT license.
Markus Englund, markus.englund@nrm.se