Public Data Science Utilities @ reBuy
- Documentation: https://rebuy-de.github.io/ds-pub-utils/
- Write to us: datascience@rebuy.com
This package is under active development, so nothing here should be taken for granted. It is intended to be used as part of other, internal workflows, so changes are very likely. It is available under the MIT license.
Lastly, this is a public repository; DO NOT INCLUDE ANY BUSINESS LOGIC NOR DATA NOR ANYTHING CONFIDENTIAL!
Along the design lines of scikit-learn, the classes in this module provide `fit`, `transform` and `fit_transform` functionalities.
However, when transforming data using these classes, new features are added, and nothing is removed.
Inspired by sklearn-pandas, this module provides preprocessing functionalities for the columns of a DataFrame. In contrast to the feature engineering module, this one doesn't append columns to the data, but rather replaces them.
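The difference between appending and replacing can be sketched as follows. This is only an illustration, not the package's actual classes: `AddLength` and `Lowercase` are hypothetical names, and a plain dict of columns stands in for a DataFrame to keep the example dependency-free.

```python
class AddLength:
    """Hypothetical feature-engineering step: appends `<column>_length`."""

    def __init__(self, column):
        self.column = column

    def fit(self, data):
        # This step is stateless, so there is nothing to learn.
        return self

    def transform(self, data):
        out = data.copy()  # leave the caller's data untouched
        out[self.column + "_length"] = [len(v) for v in data[self.column]]
        return out

    def fit_transform(self, data):
        return self.fit(data).transform(data)


class Lowercase(AddLength):
    """Hypothetical preprocessing step: replaces the column's values."""

    def transform(self, data):
        out = data.copy()
        out[self.column] = [v.lower() for v in data[self.column]]
        return out


data = {"title": ["Dune", "Emma"]}
added = AddLength("title").fit_transform(data)     # original column kept, new one added
replaced = Lowercase("title").fit_transform(data)  # same columns, new values
print(sorted(added), sorted(replaced))
```

The first step returns both `title` and `title_length`, while the second returns only `title` with lowercased values.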
Utilities for data fetching
- (Optional but recommended) Start a new virtual environment.
  - Either using `conda create --name test-this python=3`. The package needs Python 3.x.
  - Or, use the provided `environment.yml`.
- Clone the repository
- Run `pip install -e .` from the directory of the package
- (Optional) you can run `pytest` from the root of the package and see if all tests pass
The function `data_fetch.from_sql_sever` uses `pymssql`, which in turn depends on `freetds`.
If you want to use this function, make sure you install `pymssql`.
This SO thread might be helpful as well.
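To check up front whether `pymssql` is available in the current environment, something like the following could be used. It is a small sketch using only the standard library; `is_installed` is a hypothetical helper name, not part of this package.

```python
import importlib.util


def is_installed(module_name):
    """Return True if a top-level module can be located without importing it."""
    return importlib.util.find_spec(module_name) is not None


if not is_installed("pymssql"):
    print("pymssql is missing; install it before using the SQL Server fetch function")
```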
In `{virtualenv}/lib/python2.7/site-packages/` (if not using virtualenv, then `{system_dir}/lib/python2.7/dist-packages/`), remove the egg file (e.g. `pubdsutils-0.6.34-py2.7.egg`) if there is one. Replace `python2.7` with the Python version of your environment.
From the file `easy-install.pth`, remove the corresponding line (it should be a path to the source directory or to an egg file).
Source: SO answer.
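The two manual steps above could be scripted roughly as follows. This is a sketch under assumptions rather than a supported tool: `remove_develop_install` and its parameters are hypothetical, and the matching rule (an egg name starting with the package name, or a line equal to the source directory) is a guess at what "the corresponding line" looks like.

```python
import shutil
from pathlib import Path


def remove_develop_install(site_packages, package, source_dir):
    """Hypothetical helper: undo a develop/egg install by hand.

    Removes `<package>-*.egg` from `site_packages` and drops the matching
    lines from easy-install.pth. Returns the lines that were dropped.
    """
    site_packages = Path(site_packages)

    # Step 1: remove the egg file (or egg directory), if there is one.
    for egg in site_packages.glob(f"{package}-*.egg"):
        if egg.is_dir():
            shutil.rmtree(egg)
        else:
            egg.unlink()

    # Step 2: drop the line pointing at the source directory or the egg.
    pth = site_packages / "easy-install.pth"
    if not pth.exists():
        return []

    def points_at_package(line):
        entry = line.strip()
        if not entry:
            return False
        is_egg = Path(entry).name.startswith(f"{package}-") and entry.endswith(".egg")
        return is_egg or Path(entry) == Path(source_dir)

    lines = pth.read_text().splitlines()
    dropped = [line for line in lines if points_at_package(line)]
    kept = [line for line in lines if not points_at_package(line)]
    pth.write_text("".join(line + "\n" for line in kept))
    return dropped
```

It is worth keeping a backup of `easy-install.pth` before running anything like this, since the file may list other packages as well.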
- Use `flake8 --exclude=build` to check that the code is well styled
- Use `pytest --cov-report term-missing --cov=pubdsutils tests/` to check the test coverage
- Execute `sphinx-apidoc -f -o . ../pubdsutils/` from `./docs` when adding/removing modules/packages
- Documentation: `make html` from `./docs` will generate the documentation.
- After building the docs, you can publish them (`./docs/_build/html`) to the `gh-pages` branch. Most easily, this can be done by running `ghp-import -n -p docs/_build/html` from the project's root.