This is a small ETL service written in Python (3.7+). It is a lightweight CLI tool that uses sqlite3 as its database and the Flask web framework to expose a web API.
- Python 3.7 or above
- pip3
- Install pipenv using pip
pip install pipenv
- Change directory into the application folder 'yieldify-etl' where you should see two files named Pipfile and Pipfile.lock
- Create a virtual environment and install the packages pinned in Pipfile.lock into it
pipenv sync
For help on the available commands, use the -h flag where relevant, e.g.:
pipenv run main.py -h
- Create database and extract input file (input_data.gz)
pipenv run main.py -c main.conf rebuild-database <gzipped input data>
- Print stats to stdout
pipenv run main.py -c main.conf run -t stdout
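This README does not describe the database schema or the queries behind the stats. As a rough illustration only, the sketch below shows one way a per-browser share could be computed with sqlite3. The `page_views` table, its columns, the `top_browsers` helper and the top-5 limit are all assumptions made for this example, not the application's actual code.

```python
import sqlite3

# Hypothetical schema (an assumption, not taken from the application):
# one row per page view with a timestamp string and a browser name.
SCHEMA = "CREATE TABLE page_views (ts TEXT NOT NULL, browser TEXT NOT NULL)"

# Share of each browser among all page views inside the date range.
TOP_BROWSERS_SQL = """
SELECT browser,
       COUNT(*) * 1.0 / (SELECT COUNT(*) FROM page_views
                         WHERE ts BETWEEN :start_date AND :end_date) AS percentage
FROM page_views
WHERE ts BETWEEN :start_date AND :end_date
GROUP BY browser
ORDER BY percentage DESC, browser
LIMIT :limit
"""


def top_browsers(conn, start_date, end_date, limit=5):
    """Return the most common browsers in the range as a list of dicts."""
    params = {"start_date": start_date, "end_date": end_date, "limit": limit}
    rows = conn.execute(TOP_BROWSERS_SQL, params)
    return [{"browser": browser, "percentage": share} for browser, share in rows]


def demo():
    """Run the query against a tiny in-memory database with made-up rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute(SCHEMA)
    conn.executemany(
        "INSERT INTO page_views VALUES (?, ?)",
        [
            ("2014-10-11 09:00:00", "Chrome"),
            ("2014-10-11 10:00:00", "Chrome"),
            ("2014-10-12 11:00:00", "Firefox"),
            ("2014-10-12 12:00:00", "IE"),
            ("2014-10-14 08:00:00", "Chrome"),  # outside the queried range
        ],
    )
    try:
        return top_browsers(conn, "2014-10-11 00:00:00", "2014-10-13 18:01:01")
    finally:
        conn.close()
```

Timestamps stored as `YYYY-MM-DD HH:MM:SS` strings sort chronologically, which is why a plain `BETWEEN` on text works in this sketch.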
- Create the database, extract the input data and print stats to stdout
pipenv run main.py -c main.conf rebuild-database <gzipped input data> run -t stdout
- Provide stats through the API
pipenv run main.py -c main.conf run -t api
This starts Flask's development web server on localhost:8080. Open a browser or any other HTTP tool to make a GET request. For example:
localhost:8080/stats/browser?start_date=2014-10-11%2000:00:00&end_date=2014-10-13%2018:01:01
Result:
[ {"browser":"Mobile Safari","percentage":0.2980790074}, {"browser":"Chrome","percentage":0.2649029181}, {"browser":"IE","percentage":0.1788866294}, {"browser":"Firefox","percentage":0.082486039}, {"browser":"Safari","percentage":0.0690256717} ]
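The same request can also be made from a script. The following is a minimal sketch using only the Python standard library. The helper names `build_stats_url` and `fetch_browser_stats` are made up for this example, and the `http://` scheme is an assumption, since the address above is given without one.

```python
import json
from urllib.parse import quote, urlencode
from urllib.request import urlopen

# Assumed base address: the Flask dev server from this README, over plain HTTP.
BASE_URL = "http://localhost:8080"


def build_stats_url(start_date, end_date, base_url=BASE_URL):
    """Build the /stats/browser URL, percent-encoding spaces in the dates."""
    query = urlencode(
        {"start_date": start_date, "end_date": end_date},
        safe=":",          # keep colons readable, as in the example URL
        quote_via=quote,   # encode spaces as %20 rather than '+'
    )
    return f"{base_url}/stats/browser?{query}"


def fetch_browser_stats(start_date, end_date, base_url=BASE_URL):
    """Send the GET request and decode the JSON list from the response."""
    with urlopen(build_stats_url(start_date, end_date, base_url)) as response:
        return json.loads(response.read().decode("utf-8"))
```

With the server running (`run -t api`), `fetch_browser_stats("2014-10-11 00:00:00", "2014-10-13 18:01:01")` should return a list of objects with `browser` and `percentage` keys, in the shape shown in the result above.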