przemek83/data-explorer-python

About project

A small tool for aggregating and grouping data, written in Python and mimicking the functionality of my older C++ data-explorer project. It was created to refresh my Python skills, have some fun, practice TDD, and compare the speed of the C++, Go, and Python versions.

Usage

data_explorer.py [-h] file {avg,min,max} aggregation grouping
Where:

  • file - name of file with data to load,
  • {avg,min,max} - type of operation, use one of those,
  • aggregation - name of column used for aggregating data,
  • grouping - name of column used for grouping data.
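The usage line above has the shape that Python's standard argparse module produces, so the command line could be declared roughly as follows. This is a minimal sketch, not the project's actual code: the function name `build_parser` and the destination name `operation` are assumptions made for illustration.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a parser for: file {avg,min,max} aggregation grouping."""
    parser = argparse.ArgumentParser(prog="data_explorer.py")
    parser.add_argument("file", help="name of file with data to load")
    # "operation" is a hypothetical name; only the three choices come from the README.
    parser.add_argument("operation", choices=["avg", "min", "max"],
                        help="type of operation")
    parser.add_argument("aggregation",
                        help="name of column used for aggregating data")
    parser.add_argument("grouping",
                        help="name of column used for grouping data")
    return parser


# Parse an explicit argument list instead of sys.argv to keep the sketch self-contained.
args = build_parser().parse_args(["sample.txt", "avg", "score", "first_name"])
print(args.file, args.operation, args.aggregation, args.grouping)
```

Restricting the operation with `choices` lets argparse reject an unsupported operation before any data is loaded.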

Example usage:
python data_explorer.py sample.txt avg score first_name

Example output:

Data loading completed in 0.000227s
Operation completed in 0.000011s
Results:  {'tim': 8.0, 'tamas': 5.5, 'dave': 8.0}
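The result above can be reproduced with a small grouping routine. The following is a sketch under stated assumptions rather than the tool's real implementation: the names `aggregate`, `average`, `OPERATIONS`, and `ROWS` are hypothetical, and the rows are the `first_name` and `score` values from the example data in this README.

```python
from collections import defaultdict


def average(values):
    """Arithmetic mean as a float."""
    return sum(values) / len(values)


# Hypothetical mapping from operation name to the function applied per group.
OPERATIONS = {"avg": average, "min": min, "max": max}


def aggregate(rows, operation, aggregation, grouping):
    """Group rows by the grouping column and reduce the aggregation column."""
    groups = defaultdict(list)
    for row in rows:
        groups[row[grouping]].append(row[aggregation])
    func = OPERATIONS[operation]
    return {key: func(values) for key, values in groups.items()}


# Only the two columns needed for this example are kept.
ROWS = [
    {"first_name": "tim", "score": 8},
    {"first_name": "tim", "score": 8},
    {"first_name": "tamas", "score": 7},
    {"first_name": "tamas", "score": 4},
    {"first_name": "dave", "score": 8},
    {"first_name": "dave", "score": 8},
]

print(aggregate(ROWS, "avg", "score", "first_name"))
# {'tim': 8.0, 'tamas': 5.5, 'dave': 8.0}
```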

Input data format

Input data needs to have the following structure:

<column 1 name>;<column 2 name>;<column 3 name>  
<column 1 type>;<column 2 type>;<column 3 type>  
<data 1 1>;<data 2 1>;<data 3 1> 
...  
<data 1 n>;<data 2 n>;<data 3 n> 

Each column type must be either string or integer.

Example data:

first_name;age;movie_name;score
string;integer;string;integer
tim;26;inception;8
tim;26;pulp_fiction;8
tamas;44;inception;7
tamas;44;pulp_fiction;4
dave;0;inception;8
dave;0;ender's_game;8

This simple, strict format was chosen to keep parsing straightforward.
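To illustrate how little parsing the format needs, here is a minimal loader sketch. It assumes the layout described above (a names row, a types row, then data rows, all separated by semicolons); the names `load`, `CONVERTERS`, and `SAMPLE` are hypothetical and not taken from the project.

```python
import io

# Hypothetical mapping from the type names in the second row to Python converters.
CONVERTERS = {"string": str, "integer": int}


def load(lines):
    """Parse the names row, the types row, and the data rows into a list of dicts."""
    stripped = [line.strip() for line in lines if line.strip()]
    names = stripped[0].split(";")
    converters = [CONVERTERS[type_name] for type_name in stripped[1].split(";")]
    rows = []
    for line in stripped[2:]:
        fields = line.split(";")
        rows.append({name: convert(field)
                     for name, convert, field in zip(names, converters, fields)})
    return rows


SAMPLE = """\
first_name;age;movie_name;score
string;integer;string;integer
tim;26;inception;8
tim;26;pulp_fiction;8
tamas;44;inception;7
tamas;44;pulp_fiction;4
dave;0;inception;8
dave;0;ender's_game;8
"""

rows = load(io.StringIO(SAMPLE))
print(rows[0])
# {'first_name': 'tim', 'age': 26, 'movie_name': 'inception', 'score': 8}
```

An unknown type name raises a `KeyError` and a non-numeric value in an integer column raises a `ValueError`, which fits the strictness of the format.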

Tox

Static analysis and tests run through Tox in Python 3.7 and Python 3.11 environments. Each Tox run invokes the following tools:

  • flake8
  • mypy
  • pylint
  • prospector
  • bandit
  • pytest code coverage report

To run the full sequence of checks, install Tox and run tox in the project directory.
