Release/0.8.0 (#80)
* Merge back to develop

* Simplifying viz.draw syntax in tutorial notebook (#46)

* Add non negativity constraint in numpy lasso (#41)

* Add plotting tutorial to the documentation (#47)

* Unpin some requirements

* Mixed type data generation (#55)

Added DAG-based synthetic data generator for mixed types (binary, categorical, continuous) using a linear SEM approach.

* Merge back to develop (#59)

* Pytorch NOTEARS (#63)

* NoTears as ScoreSolver

* refactor continuous solver

* adding attribute to access weight matrix

* refactoring continuous solver

* Adding fit_lasso method

* add data_gen_continuous.py and tests (#38)

* add data_gen.py

* rename

* wrap SM

* move data_gen_continuous, create test

* more coverage

* test fixes

* move discrete sem to another file

* node list dupe check test

* ValueError tests

* replace dag and sem functions with Ben's versions

* add Ben's tests

* fix fstring

* to_numpy_array coverage

* Ben's comments

* remove unreachable ValueError for coverage

* remove unused fixture

* remove redundant test

* remove extensions

Co-Authored-By: Ben Horsburgh <Ben.Horsburgh@quantumblack.com>

* docstring

Co-Authored-By: Ben Horsburgh <Ben.Horsburgh@quantumblack.com>

* docstring

Co-Authored-By: Ben Horsburgh <Ben.Horsburgh@quantumblack.com>

* docs

Co-Authored-By: Ben Horsburgh <Ben.Horsburgh@quantumblack.com>

* doc

Co-Authored-By: Ben Horsburgh <Ben.Horsburgh@quantumblack.com>

* rename file, g_dag rename to sm

* add new tests for equal weights

* docstring

* steve docstring, leq fix

* steve comments + docstrings

Co-authored-by: Ben Horsburgh <Ben.Horsburgh@quantumblack.com>

* Adding check input and removing some inner functions

* Removing attribute original_ndarray

* Aligning from pandas with new implementation

* Adding tests for fit_lasso

* More tests for lasso

* wrapping tabu params in a dict

* Aligning tests with new tabu params

* Aligning from_pandas with new tabu_params

* Adding fit_intercept option to _fit method

* Adding scaling option

* fixing lasso tests

* Adding a test for fit_intercept

* scaling option only with mean

* Correction in lasso bounds

* Fix typos

* Remove duplicated bounds function

* adding comments

* add torch files from xunzheng

* add from_numpy_torch function that works like from_numpy_lasso

* lint

* add requirements

* add debug functionality

* add visual debug test

* add license

* allow running as main for viz, comments

* move to contrib

* make multi layer work a bit better

* add comment for multi layer

* use polynomial dag constraint for better speed comparison

* revert unnecessary changes to keep PR lean

* revert unnecessary changes to keep PR lean

* revert unnecessary changes to keep PR lean

* fixes

* refactor

* Integrated tests

* Checkpoint

* Refactoring

* Finished initial refactoring

* All tests passed

* Cleaning

* Git add testing

* Get adjacency matrix

* Done cleaning

* Revert change to original notears

* Revert change to original structuremodel

* Revert change to pylintrc

* Undo deletion

* Apply suggestions from Zain

Co-authored-by: Zain Patel <zain.patel@quantumblack.com>

* Addressed Zain comments

* Migrated from_numpy

* Delete contrib test

* Migrated w_threshold

* Some linting

* Change to None

* Undo deletion

* List comprehension

* Refactoring scipy and remove scipy optimiser

* Refactoring

* Refactoring

* Refactoring complete

* change from np to torch tensor

* More refactoring

* Remove hnew equal to None

* Refactor again and remove commented line

* Minor change

* change to params

* Addressing Philip's comment

* Add property

* Add fc2 property weights

* Change to weights

* Docstring

* Linting

* Linting completed

* Add gpu code

* Add gpu to from_numpy and from_pandas

* cuda 0 run out of memory

* Debugging

* put 5

* debugging gpu

* shift to inner loop

* debugging not in place

* Use cuda instead of to

* Support both interfaces

* Benchmarking gpu

* Minor fix

* correct import path for test

* change gpu from 5 to 1

* Debugging

* Debugging

* Experimenting

* Linting

* Remove hidden layer and gpu

* Linting

* Testing and linting

* Correct pytorch to torch

* Add init zeros

* Change weight threshold to 0.25

* Revert requirements.txt

* Update release.md

* Address comments

* Corrected release.md

* fc1 to adjacency

Co-authored-by: Ben Horsburgh <Ben.Horsburgh@quantumblack.com>
Co-authored-by: LiseDiagneQB <60981366+LiseDiagneQB@users.noreply.github.com>
Co-authored-by: Casey Juanxi Li <50737712+caseyliqb@users.noreply.github.com>
Co-authored-by: qbphilip <philip.pilgerstorfer@quantumblack.com>
Co-authored-by: Zain Patel <zain.patel@quantumblack.com>

* Pinned sphinx-auto-doc-typehints (#66)

* Corrected a spelling/grammar mistake (#55)

* Fix/lint (#73)

* Hotfix/0.4.3 (#7) - Address broken links and grammar

* Fix documentation links in README (#2)

* Fix links in README

* library -> libraries

* Fix github link in docs

* Clean up grammar and consistency in documentation (#4)

* Clean up grammar and consistency in `README` files

* Add esses, mostly

* Reword feature description to not appear automatic

* Update docs/source/05_resources/05_faq.md

Co-Authored-By: Ben Horsburgh <benhorsburgh@outlook.com>

Co-authored-by: Ben Horsburgh <benhorsburgh@outlook.com>

* hotfix/0.4.3: fix broken links

Co-authored-by: Zain Patel <30357972+mzjp2@users.noreply.github.com>
Co-authored-by: Nikos Tsaousis <tsanikgr@users.noreply.github.com>
Co-authored-by: Deepyaman Datta <deepyaman.datta@utexas.edu>

* Release/0.5.0

* Plotting now backed by pygraphviz. This allows:
   * More powerful layout manager
   * Cleaner fully customisable theme
   * Out-the-box styling for different node and edge types
* Can now get subgraphs from StructureModel containing a specific node
* Bugfix to resolve issue when fitting CPDs with some missing states in data
* Minor documentation fixes and improvements

* Release/0.6.0

* Release/0.7.0 (#57)

* Added plotting tutorial to the documentation
* Updated `viz.draw` syntax in tutorial notebooks
* Bugfix on notears lasso (`from_numpy_lasso` and `from_pandas_lasso`) where the non-negativity constraint was not being set
* Added DAG-based synthetic data generator for mixed types (binary, categorical, continuous) using a linear SEM approach.
* Unpinned some requirements

* black

* pin pytorch version

* pin pytorch version

Co-authored-by: Ben Horsburgh <Ben.Horsburgh@quantumblack.com>
Co-authored-by: Zain Patel <30357972+mzjp2@users.noreply.github.com>
Co-authored-by: Nikos Tsaousis <tsanikgr@users.noreply.github.com>
Co-authored-by: Deepyaman Datta <deepyaman.datta@utexas.edu>

* Structure learning regressor (#68)

* initial commit (local copy-paste)

* fixed minor comments

* minor bugfix

* impute from children initial commit

* bugfixes and method option

* auto thresholding

* autothreshold and bugfix

* make threshold removal explicit

* add l1 argument

* remove child imputation

* feat importance fix and tabu logic

* moved threshold till dag

* restructure with base class

* coef mask

* recipe

* enable bias fitting

* persist bias as node attribute

* allow fit_intercept

* minor PR comment fixes

* minor comment adjustment

* test coverage and l1 clarification

* recipe

* minor test fixes

* more tests

* full test coverage

* remove python 3.5/3.6 unsupported import

* add normalization option

* idiomatic typing

* correct pylint errors

* update some tests

* more typing updates

* more pylint requirements

* more pylint disable

* python 3.5 support

* try to get to work with 3.5

* full coverage and 3.5 support

* remove base class to pass test

* remove unneeded suppression

* black formatting changes

* remove unused import

* pylint suppression

* minor reformat change

* isort fix

* better defensive programming

* fix unittests

* docstring update

* do Raises docstring properly

* action SWE suggestions

* hotfixes

* minor update

* minor black formatting change

* final merge checkbox

* fix end of file

* Data Gen root node initialisation fix (#72)

* Hotfix/0.4.3 (#7) - Address broken links and grammar

* Fix documentation links in README (#2)

* Fix links in README

* library -> libraries

* Fix github link in docs

* Clean up grammar and consistency in documentation (#4)

* Clean up grammar and consistency in `README` files

* Add esses, mostly

* Reword feature description to not appear automatic

* Update docs/source/05_resources/05_faq.md

Co-Authored-By: Ben Horsburgh <benhorsburgh@outlook.com>

Co-authored-by: Ben Horsburgh <benhorsburgh@outlook.com>

* hotfix/0.4.3: fix broken links

Co-authored-by: Zain Patel <30357972+mzjp2@users.noreply.github.com>
Co-authored-by: Nikos Tsaousis <tsanikgr@users.noreply.github.com>
Co-authored-by: Deepyaman Datta <deepyaman.datta@utexas.edu>

* Release/0.5.0

* Plotting now backed by pygraphviz. This allows:
   * More powerful layout manager
   * Cleaner fully customisable theme
   * Out-the-box styling for different node and edge types
* Can now get subgraphs from StructureModel containing a specific node
* Bugfix to resolve issue when fitting CPDs with some missing states in data
* Minor documentation fixes and improvements

* Release/0.6.0

* Release/0.7.0 (#57)

* Added plotting tutorial to the documentation
* Updated `viz.draw` syntax in tutorial notebooks
* Bugfix on notears lasso (`from_numpy_lasso` and `from_pandas_lasso`) where the non-negativity constraint was not being set
* Added DAG-based synthetic data generator for mixed types (binary, categorical, continuous) using a linear SEM approach.
* Unpinned some requirements

* fix for continuous normal data

* generalise across all dtypes

* support fit_intercept

* fixed many test errors

* test logic fixes

* lint test fixes

* python 3.5 failure change

* minor test bugfix

* black

* pin pytorch version

* pin pytorch version

* additional test parameter

* black formatting

* requested changes

* test updates and docstring

* black format change

* disable too many lines

* change

* move recipe to tutorial folder

* releaseMD changes

Co-authored-by: Ben Horsburgh <Ben.Horsburgh@quantumblack.com>
Co-authored-by: Zain Patel <30357972+mzjp2@users.noreply.github.com>
Co-authored-by: Nikos Tsaousis <tsanikgr@users.noreply.github.com>
Co-authored-by: Deepyaman Datta <deepyaman.datta@utexas.edu>
Co-authored-by: Philip Pilgerstorfer <34248114+qbphilip@users.noreply.github.com>
Co-authored-by: qbphilip <philip.pilgerstorfer@quantumblack.com>

* [1/2] Poisson data for data gen (#61)

* Hotfix/0.4.3 (#7) - Address broken links and grammar

* Fix documentation links in README (#2)

* Fix links in README

* library -> libraries

* Fix github link in docs

* Clean up grammar and consistency in documentation (#4)

* Clean up grammar and consistency in `README` files

* Add esses, mostly

* Reword feature description to not appear automatic

* Update docs/source/05_resources/05_faq.md

Co-Authored-By: Ben Horsburgh <benhorsburgh@outlook.com>

Co-authored-by: Ben Horsburgh <benhorsburgh@outlook.com>

* hotfix/0.4.3: fix broken links

Co-authored-by: Zain Patel <30357972+mzjp2@users.noreply.github.com>
Co-authored-by: Nikos Tsaousis <tsanikgr@users.noreply.github.com>
Co-authored-by: Deepyaman Datta <deepyaman.datta@utexas.edu>

* Release/0.5.0

* Plotting now backed by pygraphviz. This allows:
   * More powerful layout manager
   * Cleaner fully customisable theme
   * Out-the-box styling for different node and edge types
* Can now get subgraphs from StructureModel containing a specific node
* Bugfix to resolve issue when fitting CPDs with some missing states in data
* Minor documentation fixes and improvements

* Release/0.6.0

* Release/0.7.0 (#57)

* Added plotting tutorial to the documentation
* Updated `viz.draw` syntax in tutorial notebooks
* Bugfix on notears lasso (`from_numpy_lasso` and `from_pandas_lasso`) where the non-negativity constraint was not being set
* Added DAG-based synthetic data generator for mixed types (binary, categorical, continuous) using a linear SEM approach.
* Unpinned some requirements

* refactor & docstring

* remove unused helper object

* add data gen to init

* make test more robust

* add count data and test, use logs for poisson samples for stability

* fix tests

* duplicate fixtures

* remove unused fixtures

* refactor data_generators into package with core and wrappers

* move wrapper to test_wrapper

* variable name change bugfix

* fix tests

Co-authored-by: Ben Horsburgh <Ben.Horsburgh@quantumblack.com>
Co-authored-by: Zain Patel <30357972+mzjp2@users.noreply.github.com>
Co-authored-by: Nikos Tsaousis <tsanikgr@users.noreply.github.com>
Co-authored-by: Deepyaman Datta <deepyaman.datta@utexas.edu>
Co-authored-by: angeldrothqb <angel.droth@quantumblack.com>

* [2/2] Nonlinear Data gen (#60)

* Hotfix/0.4.3 (#7) - Address broken links and grammar

* Fix documentation links in README (#2)

* Fix links in README

* library -> libraries

* Fix github link in docs

* Clean up grammar and consistency in documentation (#4)

* Clean up grammar and consistency in `README` files

* Add esses, mostly

* Reword feature description to not appear automatic

* Update docs/source/05_resources/05_faq.md

Co-Authored-By: Ben Horsburgh <benhorsburgh@outlook.com>

Co-authored-by: Ben Horsburgh <benhorsburgh@outlook.com>

* hotfix/0.4.3: fix broken links

Co-authored-by: Zain Patel <30357972+mzjp2@users.noreply.github.com>
Co-authored-by: Nikos Tsaousis <tsanikgr@users.noreply.github.com>
Co-authored-by: Deepyaman Datta <deepyaman.datta@utexas.edu>

* Release/0.5.0

* Plotting now backed by pygraphviz. This allows:
   * More powerful layout manager
   * Cleaner fully customisable theme
   * Out-the-box styling for different node and edge types
* Can now get subgraphs from StructureModel containing a specific node
* Bugfix to resolve issue when fitting CPDs with some missing states in data
* Minor documentation fixes and improvements

* Release/0.6.0

* Release/0.7.0 (#57)

* Added plotting tutorial to the documentation
* Updated `viz.draw` syntax in tutorial notebooks
* Bugfix on notears lasso (`from_numpy_lasso` and `from_pandas_lasso`) where the non-negativity constraint was not being set
* Added DAG-based synthetic data generator for mixed types (binary, categorical, continuous) using a linear SEM approach.
* Unpinned some requirements

* refactor & docstring

* remove unused helper object

* add data gen to init

* make test more robust

* add count data and test, use logs for poisson samples for stability

* add nonlinear

* fix tests

* duplicate fixtures

* remove unused fixtures

* refactor data_generators into package with core and wrappers

* move wrapper to test_wrapper

* add nonlinear to init

* change order in all

* change release.md

* root node fix on core + count

* nonlinear support to wrappers

* docstring update

* bugfix and reproducibility fix

* many tests and test updates

* poiss bugfix and test fix

* moar test coverage

* categorical dataframe test coverage

* full test coverage and linting

* fix linting and fstring

* black reformat

* fix unused pylint argument

* pytest fix

* FINAL linting fix

* Fix stuff (#75)

CircleCI fixes

Co-authored-by: Ben Horsburgh <Ben.Horsburgh@quantumblack.com>
Co-authored-by: Zain Patel <30357972+mzjp2@users.noreply.github.com>
Co-authored-by: Nikos Tsaousis <tsanikgr@users.noreply.github.com>
Co-authored-by: Deepyaman Datta <deepyaman.datta@utexas.edu>
Co-authored-by: angeldrothqb <angel.droth@quantumblack.com>
Co-authored-by: Zain Patel <zain.patel@quantumblack.com>

* update black version (#76)

* fix black

* Fix/check for NA or Infinity when notears is used  (#54)

* update scipy version (#77)

* add DYNOTEARS implementation (#50)

Adds DYNOTEARS and corresponding data generator (for testing)

* Pytorch NOTEARS extension - Non-Linear/Hidden Layer (#65)

* NoTears as ScoreSolver

* refactor continuous solver

* adding attribute to access weight matrix

* refactoring continuous solver

* Adding fit_lasso method

* add data_gen_continuous.py and tests (#38)

* add data_gen.py

* rename

* wrap SM

* move data_gen_continuous, create test

* more coverage

* test fixes

* move discrete sem to another file

* node list dupe check test

* ValueError tests

* replace dag and sem functions with Ben's versions

* add Ben's tests

* fix fstring

* to_numpy_array coverage

* Ben's comments

* remove unreachable ValueError for coverage

* remove unused fixture

* remove redundant test

* remove extensions

Co-Authored-By: Ben Horsburgh <Ben.Horsburgh@quantumblack.com>

* docstring

Co-Authored-By: Ben Horsburgh <Ben.Horsburgh@quantumblack.com>

* docstring

Co-Authored-By: Ben Horsburgh <Ben.Horsburgh@quantumblack.com>

* docs

Co-Authored-By: Ben Horsburgh <Ben.Horsburgh@quantumblack.com>

* doc

Co-Authored-By: Ben Horsburgh <Ben.Horsburgh@quantumblack.com>

* rename file, g_dag rename to sm

* add new tests for equal weights

* docstring

* steve docstring, leq fix

* steve comments + docstrings

Co-authored-by: Ben Horsburgh <Ben.Horsburgh@quantumblack.com>

* Adding check input and removing some inner functions

* Removing attribute original_ndarray

* Aligning from pandas with new implementation

* Adding tests for fit_lasso

* More tests for lasso

* wrapping tabu params in a dict

* Aligning tests with new tabu params

* Aligning from_pandas with new tabu_params

* Adding fit_intercept option to _fit method

* Adding scaling option

* fixing lasso tests

* Adding a test for fit_intercept

* scaling option only with mean

* Correction in lasso bounds

* Fix typos

* Remove duplicated bounds function

* adding comments

* add torch files from xunzheng

* add from_numpy_torch function that works like from_numpy_lasso

* lint

* add requirements

* add debug functionality

* add visual debug test

* add license

* allow running as main for viz, comments

* move to contrib

* make multi layer work a bit better

* add comment for multi layer

* use polynomial dag constraint for better speed comparison

* revert unnecessary changes to keep PR lean

* revert unnecessary changes to keep PR lean

* revert unnecessary changes to keep PR lean

* fixes

* refactor

* Integrated tests

* Checkpoint

* Refactoring

* Finished initial refactoring

* All tests passed

* Cleaning

* Git add testing

* Get adjacency matrix

* Done cleaning

* Revert change to original notears

* Revert change to original structuremodel

* Revert change to pylintrc

* Undo deletion

* Apply suggestions from Zain

Co-authored-by: Zain Patel <zain.patel@quantumblack.com>

* Addressed Zain comments

* Migrated from_numpy

* Delete contrib test

* Migrated w_threshold

* Some linting

* Change to None

* Undo deletion

* List comprehension

* Refactoring scipy and remove scipy optimiser

* Refactoring

* Refactoring

* Refactoring complete

* change from np to torch tensor

* More refactoring

* Remove hnew equal to None

* Refactor again and remove commented line

* Minor change

* change to params

* Addressing Philip's comment

* Add property

* Add fc2 property weights

* Change to weights

* Docstring

* Linting

* Linting completed

* Add gpu code

* Add gpu to from_numpy and from_pandas

* cuda 0 run out of memory

* Debugging

* put 5

* debugging gpu

* shift to inner loop

* debugging not in place

* Use cuda instead of to

* Support both interfaces

* Benchmarking gpu

* Minor fix

* correct import path for test

* change gpu from 5 to 1

* Debugging

* Debugging

* Experimenting

* Linting

* Remove hidden layer and gpu

* Linting

* Testing and linting

* Correct pytorch to torch

* Add init zeros

* Change weight threshold to 0.25

* Revert requirements.txt

* Add hidden layer

* small refactor

* directional adj

* minor edits

* fix bias issues

* breaking changes update to the interface

* typo

* new regressor regularisation interface

* update forward method

* forward(X) predictions work

* working!

* bugfix data normalisation

* some fixes

* average regularisation and adj calc at end

* give credit!

Co-authored-by: Philip Pilgerstorfer <34248114+qbphilip@users.noreply.github.com>

* loc lin docstring update

Co-authored-by: Philip Pilgerstorfer <34248114+qbphilip@users.noreply.github.com>

* docstring + fc1/fc2 name updates

* moar docstring updates

* more minor updates

* remove normalize option

* plotting util

* rename to DAGRegressor

* rename and checks

* more util functions

* fix bias

* fix bias with no intercept

* fix linear adj

* add tests

* minor fix

* minor fixes

* extend interface to bias

* differentiate coef_ and feature_importances

* separate bias element

* tests

* more test coverage

* nonlinear test coverage

* test hotfix

* more test coverage

* test requirements update

* more test coverage

* formatting changes

* final pylint change

* more linting

* more bestpractice structuring

* more minor fixes

* FINAL linting updates

* actual last change

* update to reg defaults, additions to the tutorial

* nonlinear regularisation updates

* regressor tutorial

* almost finishing touches

* gradient based h function!

* soft clamp and coef feature importance separation

* small api update, closer to batchnorm

* docstring updates

* stronger soft clamping

* gradient L1 rather than L2

* fcpos neg removal, gradient optim

* revert back to create_graph=True for 2nd derivative

* remove print and test fix

* black reformatting

* new black version

* full test coverage

* isort fix

* pylint fix

* first layer h(W) for speed optimization

* fix batch norm system

* add nonlinear test

* test hotfix

* black reformat

* isort fix

* remove X requirement from h_func

* regressor tutorial final commit and black update

* LayerNorm replacement

Co-authored-by: Philip Pilgerstorfer <34248114+qbphilip@users.noreply.github.com>

* major changes

* add standardization

* minor changes

* fix tests

* rename reg parameters

* linting

* test coverage, docstring

* check array for infs

* fix isinstance to base type

* fix isort, add test coverage

* new tutorial

* docstring fix

Co-authored-by: Zain Patel <zain.patel@quantumblack.com>

* test string match

Co-authored-by: Zain Patel <zain.patel@quantumblack.com>

* assert improvement

Co-authored-by: Zain Patel <zain.patel@quantumblack.com>

* SWE suggestions

* minor bugfix

* more test fixing

Co-authored-by: Ben Horsburgh <Ben.Horsburgh@quantumblack.com>
Co-authored-by: LiseDiagneQB <60981366+LiseDiagneQB@users.noreply.github.com>
Co-authored-by: Casey Juanxi Li <50737712+caseyliqb@users.noreply.github.com>
Co-authored-by: qbphilip <philip.pilgerstorfer@quantumblack.com>
Co-authored-by: Zain Patel <zain.patel@quantumblack.com>
Co-authored-by: angeldrothqb <angel.droth@quantumblack.com>
Co-authored-by: angeldrothqb <67913551+angeldrothqb@users.noreply.github.com>
Co-authored-by: Philip Pilgerstorfer <34248114+qbphilip@users.noreply.github.com>

* release.md, version bump, docs

Co-authored-by: Ben Horsburgh <Ben.Horsburgh@quantumblack.com>
Co-authored-by: GabrielAzevedoFerreiraQB <57528979+GabrielAzevedoFerreiraQB@users.noreply.github.com>
Co-authored-by: Philip Pilgerstorfer <34248114+qbphilip@users.noreply.github.com>
Co-authored-by: stevelersl <55385183+SteveLerQB@users.noreply.github.com>
Co-authored-by: LiseDiagneQB <60981366+LiseDiagneQB@users.noreply.github.com>
Co-authored-by: Casey Juanxi Li <50737712+caseyliqb@users.noreply.github.com>
Co-authored-by: qbphilip <philip.pilgerstorfer@quantumblack.com>
Co-authored-by: Zain Patel <zain.patel@quantumblack.com>
Co-authored-by: KING-SID <sidhantbendre22@gmail.com>
Co-authored-by: Zain Patel <30357972+mzjp2@users.noreply.github.com>
Co-authored-by: Nikos Tsaousis <tsanikgr@users.noreply.github.com>
Co-authored-by: Deepyaman Datta <deepyaman.datta@utexas.edu>
Co-authored-by: Jebq <jb.oger2312@gmail.com>
14 people committed Sep 10, 2020
1 parent 595907c commit 8265f64
Showing 39 changed files with 7,148 additions and 685 deletions.
4 changes: 4 additions & 0 deletions .circleci/config.yml
@@ -27,12 +27,16 @@ utils:
echo ". /home/circleci/miniconda/etc/profile.d/conda.sh" >> $BASH_ENV
echo "conda deactivate; conda activate causalnex_env" >> $BASH_ENV
# needed to control numpy multithreading code since circleci gives incorrect CPU counts
echo "export MKL_NUM_THREADS=1 && export OMP_NUM_THREADS=1 && export NUMEXPR_NUM_THREADS=1" >> $BASH_ENV
setup_requirements: &setup_requirements
name: Install PIP dependencies
command: |
echo "Python version: $(python --version 2>&1)"
pip install -r requirements.txt -U
pip install -r test_requirements.txt -U
pip install ".[pytorch]"
conda install -y virtualenv
setup_pre_commit: &setup_pre_commit
name: Install pre-commit hooks
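The thread-count exports in this hunk have a Python-side equivalent that can be useful outside CI. The following is a minimal sketch, assuming the variables need to be in the environment before the numerical libraries are first imported; `limit_numeric_threads` is a hypothetical helper and not part of CausalNex or CircleCI.

```python
import os

# The same variables that the CI config exports.
THREAD_ENV_VARS = ("MKL_NUM_THREADS", "OMP_NUM_THREADS", "NUMEXPR_NUM_THREADS")


def limit_numeric_threads(n_threads: int = 1) -> dict:
    """Pin the thread-pool sizes read by MKL, OpenMP and numexpr.

    Hypothetical helper: call it before importing numpy (or anything
    built on it), since these libraries typically read the variables
    once, when they are loaded.
    """
    if n_threads < 1:
        raise ValueError("n_threads must be a positive integer")
    for name in THREAD_ENV_VARS:
        os.environ[name] = str(n_threads)
    # Return what was set so callers can log or assert on it.
    return {name: os.environ[name] for name in THREAD_ENV_VARS}


settings = limit_numeric_threads(1)
# Import numerical libraries only after this point, e.g. `import numpy as np`.
```

Setting the values in-process only affects the current interpreter and its children, whereas the `$BASH_ENV` approach in the config applies to every step of the job.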
2 changes: 1 addition & 1 deletion .pylintrc
@@ -269,7 +269,7 @@ contextmanager-decorators=contextlib.contextmanager
# List of members which are set dynamically and missed by pylint inference
# system, and so shouldn't trigger E1101 when accessed. Python regular
# expressions are accepted.
-generated-members=
+generated-members=torch.*

# Tells whether missing members accessed in mixin class should be ignored. A
# mixin class is detected if its name ends with "mixin" (case insensitive).
2 changes: 1 addition & 1 deletion CONTRIBUTING.md
@@ -16,7 +16,7 @@ The CausalNex team pledges to foster and maintain a welcoming and friendly commu

We use [GitHub Issues](https://github.com/quantumblacklabs/causalnex/issues) to keep track of known bugs. We keep a close eye on them and try to make it clear when we have an internal fix in progress. Before reporting a new issue, please do your best to ensure your problem hasn't already been reported. If so, it's often better to just leave a comment on an existing issue, rather than create a new one. Old issues also can often include helpful tips and solutions to common problems.

-If you are looking for help with your code in our documentation haven't helped you, please consider posting a question on [Stack Overflow](https://stackoverflow.com/questions/tagged/causalnex). If you tag it `causalnex` and `python`, more people will see it and may be able to help. We are unable to provide individual support via email. In the interest of community engagement we also believe that help is much more valuable if it's shared publicly, so that more people can benefit from it.
+If you are looking for help with your code and our documentation hasn't helped you, please consider posting a question on [Stack Overflow](https://stackoverflow.com/questions/tagged/causalnex). If you tag it `causalnex` and `python`, more people will see it and may be able to help. We are unable to provide individual support via email. In the interest of community engagement we also believe that help is much more valuable if it's shared publicly, so that more people can benefit from it.

If you're over on Stack Overflow and want to boost your points, take a look at the `causalnex` tag and see if you can help others out by sharing your knowledge. It's another great way to contribute.

14 changes: 12 additions & 2 deletions RELEASE.md
@@ -1,9 +1,19 @@
# Upcoming release

+# Release 0.8.0
+
+* Add DYNOTEARS (`from_numpy_dynamic`, an algorithm for structure learning on Dynamic Bayesian Networks).
+* Added Pytorch implementation for NOTEARS MLP (`pytorch.from_numpy`) which is much faster and allows nonlinear modelling.
+* Added `DAGRegressor` sklearn interface using the Pytorch NOTEARS implementation.
+* Add non-linear data generators for multiple data types.
+* Add a count data type to the data generator using a zero-inflated Poisson.
+* Set bounds/max class imbalance for binary features for the data generators.
+* Bugfix to resolve issue when applying NOTEARS on data containing NaN.
+* Bugfix for data_gen system. Fixes issues with root node initialization.

# Release 0.7.0

-* Added plottting tutorial to the documentation
+* Added plotting tutorial to the documentation
* Updated `viz.draw` syntax in tutorial notebooks
* Bugfix on notears lasso (`from_numpy_lasso` and `from_pandas_lasso`) where the non-negativity constraint was not being set
* Added DAG-based synthetic data generator for mixed types (binary, categorical, continuous) using a linear SEM approach.
@@ -42,6 +52,6 @@ The initial release of CausalNex.

## Thanks for supporting contributions
CausalNex was originally designed by [Paul Beaumont](https://www.linkedin.com/in/pbeaumont/) and [Ben Horsburgh](https://www.linkedin.com/in/benhorsburgh/) to solve challenges they faced in inferencing causality in their project work. This work was later turned into a product thanks to the following contributors:
-[Yetunde Dada](https://github.com/yetudada), [Wesley Leong](https://www.linkedin.com/in/wesleyleong/), [Steve Ler](https://www.linkedin.com/in/song-lim-steve-ler-380366106/), [Viktoriia Oliinyk](https://www.linkedin.com/in/victoria-oleynik/), [Roxana Pamfil](https://www.linkedin.com/in/roxana-pamfil-1192053b/), [Nisara Sriwattanaworachai](https://www.linkedin.com/in/nisara-sriwattanaworachai-795b357/) and [Nikolaos Tsaousis](https://www.linkedin.com/in/ntsaousis/).
+[Yetunde Dada](https://github.com/yetudada), [Wesley Leong](https://www.linkedin.com/in/wesleyleong/), [Steve Ler](https://www.linkedin.com/in/song-lim-steve-ler-380366106/), [Viktoriia Oliinyk](https://www.linkedin.com/in/victoria-oleynik/), [Roxana Pamfil](https://www.linkedin.com/in/roxana-pamfil-1192053b/), [Nisara Sriwattanaworachai](https://www.linkedin.com/in/nisara-sriwattanaworachai-795b357/), [Nikolaos Tsaousis](https://www.linkedin.com/in/ntsaousis/), [Angel Droth](https://www.linkedin.com/in/angeldroth/), and [Zain Patel](https://www.linkedin.com/in/zain-patel/).

CausalNex would also not be possible without the generous sharing from leading researchers in the field of causal inference and we are grateful to everyone who advised and supported us, filed issues or helped resolve them, asked and answered questions or simply been part of inspiring discussions.
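The 0.8.0 notes in this file mention a count data type based on a zero-inflated Poisson. As a rough illustration of what that distribution means, here is a minimal standard-library sketch. It is not CausalNex's data generator: the function names `sample_poisson` and `sample_zero_inflated_poisson` and the parameters `rate` and `zero_prob` are assumptions made for this example, and the real generator derives its rates from the graph structure.

```python
import math
import random


def sample_poisson(rate: float, rng: random.Random) -> int:
    """Draw one Poisson(rate) sample with Knuth's multiplication method.

    Suitable for small rates only: exp(-rate) underflows for very large ones.
    """
    if rate < 0:
        raise ValueError("rate must be non-negative")
    threshold = math.exp(-rate)
    count, product = 0, rng.random()
    # Multiply uniforms until the running product drops to the threshold.
    while product > threshold:
        count += 1
        product *= rng.random()
    return count


def sample_zero_inflated_poisson(
    rate: float, zero_prob: float, rng: random.Random
) -> int:
    """Emit a structural zero with probability zero_prob, else a Poisson draw."""
    if not 0.0 <= zero_prob <= 1.0:
        raise ValueError("zero_prob must lie in [0, 1]")
    if rng.random() < zero_prob:
        return 0
    return sample_poisson(rate, rng)


rng = random.Random(42)  # fixed seed so the sketch is reproducible
samples = [sample_zero_inflated_poisson(3.0, 0.3, rng) for _ in range(10_000)]
zero_share = samples.count(0) / len(samples)
mean = sum(samples) / len(samples)
```

With `rate=3.0` and `zero_prob=0.3`, the share of zeros should land near 0.3 + 0.7 * exp(-3), roughly 0.33, and the mean near 0.7 * 3 = 2.1. The excess of zeros over a plain Poisson with the same rate is what "zero-inflated" refers to.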
2 changes: 1 addition & 1 deletion causalnex/__init__.py
@@ -30,6 +30,6 @@
causalnex toolkit for causal reasoning (Bayesian Networks / Inference)
"""

-__version__ = "0.7.0"
+__version__ = "0.8.0"

__all__ = ["structure", "discretiser", "evaluation", "inference", "network", "plots"]
4 changes: 1 addition & 3 deletions causalnex/inference/inference.py
@@ -284,9 +284,7 @@ def template() -> float:
# initially there are none present, but caller will add appropriate arguments to the function
# getargvalues was "inadvertently marked as deprecated in Python 3.5"
# https://docs.python.org/3/library/inspect.html#inspect.getfullargspec
-arg_spec = inspect.getargvalues( # pylint: disable=deprecated-method
-inspect.currentframe()
-)
+arg_spec = inspect.getargvalues(inspect.currentframe())

return self._cpds[arg_spec.args[0]][ # target name
arg_spec.locals[arg_spec.args[0]]
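The `inspect.getargvalues(inspect.currentframe())` call in this hunk lets a function read its own argument names and values at run time. Below is a minimal standalone sketch of that mechanism, using only the standard library. `lookup` and its parameters are hypothetical names chosen for illustration; they are not part of CausalNex, whose real code builds the function signature dynamically.

```python
import inspect


def lookup(target, parent_a=None, parent_b=None):
    """Report this call's own arguments by inspecting the current frame."""
    arg_spec = inspect.getargvalues(inspect.currentframe())
    # arg_spec.args   -> parameter names in declaration order
    # arg_spec.locals -> the frame's local variables, including argument values
    target_name = arg_spec.args[0]
    return {
        "target_name": target_name,
        "target_value": arg_spec.locals[target_name],
        "parents": {name: arg_spec.locals[name] for name in arg_spec.args[1:]},
    }


result = lookup("heavy", parent_a=True, parent_b="Bad")
```

Here `result["target_name"]` is the parameter name `"target"` and `result["target_value"]` is the value passed in. The hunk uses the same pattern, indexing first by `arg_spec.args[0]` and then by the matching entry in `arg_spec.locals`.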
124 changes: 62 additions & 62 deletions causalnex/network/network.py
@@ -46,67 +46,67 @@

class BayesianNetwork:
"""
Base class for Bayesian Network (BN), a probabilistic weighted DAG where nodes represent variables,
edges represent the causal relationships between variables.
``BayesianNetwork`` stores nodes with their possible states, edges and
conditional probability distributions (CPDs) of each node.
``BayesianNetwork`` is built on top of the ``StructureModel``, which is an extension of ``networkx.DiGraph``
(see :func:`causalnex.structure.structuremodel.StructureModel`).
In order to define the ``BayesianNetwork``, users should provide a relevant ``StructureModel``.
Once ``BayesianNetwork`` is initialised, no changes to the ``StructureModel`` can be made
and CPDs can be learned from the data.
The learned CPDs can be then used for likelihood estimation and predictions.
Example:
::
>>> # Create a Bayesian Network with a manually defined DAG.
>>> from causalnex.structure import StructureModel
>>> from causalnex.network import BayesianNetwork
>>>
>>> sm = StructureModel()
>>> sm.add_edges_from([
>>> ('rush_hour', 'traffic'),
>>> ('weather', 'traffic')
>>> ])
>>> bn = BayesianNetwork(sm)
>>> # A created ``BayesianNetwork`` stores nodes and edges defined by the ``StructureModel``
>>> bn.nodes
['rush_hour', 'traffic', 'weather']
>>>
>>> bn.edges
[('rush_hour', 'traffic'), ('weather', 'traffic')]
>>> # A ``BayesianNetwork`` doesn't store any CPDs yet
>>> bn.cpds
>>> {}
>>>
>>> # Learn the nodes' states from the data
>>> import pandas as pd
>>> data = pd.DataFrame({
>>> 'rush_hour': [True, False, False, False, True, False, True],
>>> 'weather': ['Terrible', 'Good', 'Bad', 'Good', 'Bad', 'Bad', 'Good'],
>>> 'traffic': ['heavy', 'light', 'heavy', 'light', 'heavy', 'heavy', 'heavy']
>>> })
>>> bn = bn.fit_node_states(data)
>>> bn.node_states
{'rush_hour': {False, True}, 'weather': {'Bad', 'Good', 'Terrible'}, 'traffic': {'heavy', 'light'}}
>>> # Learn the CPDs from the data
>>> bn = bn.fit_cpds(data)
>>> # Use the learned CPDs to make predictions on the unseen data
>>> test_data = pd.DataFrame({
>>> 'rush_hour': [False, False, True, True],
>>> 'weather': ['Good', 'Bad', 'Good', 'Bad']
>>> })
>>> bn.predict(test_data, "traffic").to_dict()
>>> {'traffic_prediction': {0: 'light', 1: 'heavy', 2: 'heavy', 3: 'heavy'}}
>>> bn.predict_probability(test_data, "traffic").to_dict()
{'traffic_prediction': {0: 'light', 1: 'heavy', 2: 'heavy', 3: 'heavy'}}
{'traffic_light': {0: 0.75, 1: 0.25, 2: 0.3333333333333333, 3: 0.3333333333333333},
'traffic_heavy': {0: 0.25, 1: 0.75, 2: 0.6666666666666666, 3: 0.6666666666666666}}
"""
Base class for Bayesian Network (BN), a probabilistic weighted DAG where nodes represent variables,
edges represent the causal relationships between variables.
``BayesianNetwork`` stores nodes with their possible states, edges and
conditional probability distributions (CPDs) of each node.
``BayesianNetwork`` is built on top of the ``StructureModel``, which is an extension of ``networkx.DiGraph``
(see :func:`causalnex.structure.structuremodel.StructureModel`).
In order to define the ``BayesianNetwork``, users should provide a relevant ``StructureModel``.
Once ``BayesianNetwork`` is initialised, no changes to the ``StructureModel`` can be made
and CPDs can be learned from the data.
The learned CPDs can be then used for likelihood estimation and predictions.
Example:
::
>>> # Create a Bayesian Network with a manually defined DAG.
>>> from causalnex.structure import StructureModel
>>> from causalnex.network import BayesianNetwork
>>>
>>> sm = StructureModel()
>>> sm.add_edges_from([
>>> ('rush_hour', 'traffic'),
>>> ('weather', 'traffic')
>>> ])
>>> bn = BayesianNetwork(sm)
>>> # A created ``BayesianNetwork`` stores nodes and edges defined by the ``StructureModel``
>>> bn.nodes
['rush_hour', 'traffic', 'weather']
>>>
>>> bn.edges
[('rush_hour', 'traffic'), ('weather', 'traffic')]
>>> # A ``BayesianNetwork`` doesn't store any CPDs yet
>>> bn.cpds
>>> {}
>>>
>>> # Learn the nodes' states from the data
>>> import pandas as pd
>>> data = pd.DataFrame({
>>> 'rush_hour': [True, False, False, False, True, False, True],
>>> 'weather': ['Terrible', 'Good', 'Bad', 'Good', 'Bad', 'Bad', 'Good'],
>>> 'traffic': ['heavy', 'light', 'heavy', 'light', 'heavy', 'heavy', 'heavy']
>>> })
>>> bn = bn.fit_node_states(data)
>>> bn.node_states
{'rush_hour': {False, True}, 'weather': {'Bad', 'Good', 'Terrible'}, 'traffic': {'heavy', 'light'}}
>>> # Learn the CPDs from the data
>>> bn = bn.fit_cpds(data)
>>> # Use the learned CPDs to make predictions on the unseen data
>>> test_data = pd.DataFrame({
>>> 'rush_hour': [False, False, True, True],
>>> 'weather': ['Good', 'Bad', 'Good', 'Bad']
>>> })
>>> bn.predict(test_data, "traffic").to_dict()
>>> {'traffic_prediction': {0: 'light', 1: 'heavy', 2: 'heavy', 3: 'heavy'}}
>>> bn.predict_probability(test_data, "traffic").to_dict()
{'traffic_prediction': {0: 'light', 1: 'heavy', 2: 'heavy', 3: 'heavy'}}
{'traffic_light': {0: 0.75, 1: 0.25, 2: 0.3333333333333333, 3: 0.3333333333333333},
'traffic_heavy': {0: 0.25, 1: 0.75, 2: 0.6666666666666666, 3: 0.6666666666666666}}
"""

def __init__(self, structure: StructureModel):
"""
@@ -573,7 +573,7 @@ def _predict_probability_from_incomplete_data(
cols = []
pattern = re.compile("^{node}_[0-9]+$".format(node=node))
# disabled open pylint issue (https://github.com/PyCQA/pylint/issues/2962)
for col in probability.columns: # pylint: disable=E1133
for col in probability.columns:
if pattern.match(col):
cols.append(col)
probability = probability[cols]
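To illustrate the column filter in the hunk above, here is a small standalone sketch using the same pattern. The column names are made up for the example, and plain strings stand in for `probability.columns`:

```python
import re

node = "traffic"

# Same pattern as in the hunk: the node name, an underscore, then digits only.
pattern = re.compile("^{node}_[0-9]+$".format(node=node))

# Hypothetical column names, standing in for probability.columns.
columns = [
    "traffic_0",
    "traffic_1",
    "traffic_heavy",  # suffix is not numeric
    "weather_0",      # different node
    "traffic_10",
    "xtraffic_2",     # prefix does not match because of the ^ anchor
]

cols = [col for col in columns if pattern.match(col)]
print(cols)
```

Note that the node name is interpolated into the pattern unescaped, so a name containing regex metacharacters (for example `.` or `+`) would change the pattern's meaning; wrapping it in `re.escape` would be one way to guard against that, though the commit does not do so.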
3 changes: 2 additions & 1 deletion causalnex/structure/__init__.py
@@ -30,6 +30,7 @@
``causalnex.structure`` provides functionality to define or learn structure.
"""

__all__ = ["StructureModel", "notears"]
__all__ = ["StructureModel", "notears", "dynotears", "data_generators", "DAGRegressor"]

from .sklearn import DAGRegressor
from .structuremodel import StructureModel
6 changes: 4 additions & 2 deletions causalnex/structure/categorical_variable_mapper.py
@@ -49,7 +49,7 @@ class VariableFeatureMapper:
attribute ``PERMISSIBLE_TYPES``.
"""

PERMISSIBLE_TYPES = {"binary", "categorical", "continuous"}
PERMISSIBLE_TYPES = {"binary", "categorical", "continuous", "count"}
EXPANDABLE_TYPE = "categorical"

def __init__(self, schema: Dict[Hashable, str]):
@@ -81,10 +81,11 @@ def __init__(self, schema: Dict[Hashable, str]):
)
cat_feature_list = list(self._cat_fte_var_dict.keys())

# we put them together with the cont + binayr in a feature list
# we put them together with the cont + binary in a feature list
self.feature_list = (
self.variable_type_dict["binary"]
+ self.variable_type_dict["continuous"]
+ self.variable_type_dict["count"]
+ cat_feature_list
)

@@ -98,6 +99,7 @@ def __init__(self, schema: Dict[Hashable, str]):
var: [self._fte_index_dict[var]]
for var in self.variable_type_dict["continuous"]
+ self.variable_type_dict["binary"]
+ self.variable_type_dict["count"]
}
self.var_indices_dict.update(
{
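The hunks above add `"count"` as a fourth permissible type and slot count variables into the feature list between the continuous and categorical features. The sketch below shows that ordering in isolation. It is not the `VariableFeatureMapper` implementation: the function `build_feature_list`, the `categories` argument, and the `"{var}_{i}"` naming for expanded categorical features are all assumptions made for illustration.

```python
from collections import defaultdict

PERMISSIBLE_TYPES = {"binary", "categorical", "continuous", "count"}


def build_feature_list(schema, categories):
    """Hypothetical sketch: order features as binary, continuous, count, categorical.

    schema:     maps variable name -> type string
    categories: maps categorical variable name -> number of levels (assumed input)
    """
    unknown = set(schema.values()) - PERMISSIBLE_TYPES
    if unknown:
        raise ValueError("Unsupported variable types: {}".format(sorted(unknown)))

    # Group variable names by declared type, preserving schema order.
    by_type = defaultdict(list)
    for var, var_type in schema.items():
        by_type[var_type].append(var)

    # Assumed naming: one expanded feature per categorical level.
    cat_features = [
        "{}_{}".format(var, i)
        for var in by_type["categorical"]
        for i in range(categories[var])
    ]

    return by_type["binary"] + by_type["continuous"] + by_type["count"] + cat_features


schema = {
    "age": "continuous",
    "smoker": "binary",
    "visits": "count",
    "colour": "categorical",
}
features = build_feature_list(schema, categories={"colour": 3})
print(features)
```

Only the categorical variable expands into several features; binary, continuous and count variables each contribute a single entry, which matches the single-index mapping the second hunk builds for those three types.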
59 changes: 59 additions & 0 deletions causalnex/structure/data_generators/__init__.py
@@ -0,0 +1,59 @@
# Copyright 2019-2020 QuantumBlack Visual Analytics Limited
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
# EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES
# OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE, AND
# NONINFRINGEMENT. IN NO EVENT WILL THE LICENSOR OR OTHER CONTRIBUTORS
# BE LIABLE FOR ANY CLAIM, DAMAGES, OR OTHER LIABILITY, WHETHER IN AN
# ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF, OR IN
# CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
#
# The QuantumBlack Visual Analytics Limited ("QuantumBlack") name and logo
# (either separately or in combination, "QuantumBlack Trademarks") are
# trademarks of QuantumBlack. The License does not grant you any right or
# license to the QuantumBlack Trademarks. You may not use the QuantumBlack
# Trademarks or any confusingly similar mark as a trademark for your product,
# or use the QuantumBlack Trademarks in any other manner that might cause
# confusion in the marketplace, including but not limited to in advertising,
# on websites, or on software.
#
# See the License for the specific language governing permissions and
# limitations under the License.

"""
Data generators using DAGs for benchmarking and synthetic data generation.
"""

__all__ = [
"generate_structure",
"nonlinear_sem_generator",
"sem_generator",
"generate_binary_data",
"generate_binary_dataframe",
"generate_categorical_dataframe",
"generate_continuous_data",
"generate_continuous_dataframe",
"generate_count_dataframe",
"gen_stationary_dyn_net_and_df",
"generate_dataframe_dynamic",
"generate_structure_dynamic",
]

from .core import generate_structure, nonlinear_sem_generator, sem_generator
from .wrappers import (
gen_stationary_dyn_net_and_df,
generate_binary_data,
generate_binary_dataframe,
generate_categorical_dataframe,
generate_continuous_data,
generate_continuous_dataframe,
generate_count_dataframe,
generate_dataframe_dynamic,
generate_structure_dynamic,
)
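The new package exposes generators that draw synthetic data from a DAG. As a rough illustration of the underlying idea only, the sketch below samples from a linear structural equation model in plain Python. It does not call or reproduce the CausalNex generators; `sample_linear_sem`, its parameters, and the edge weights are hypothetical.

```python
import random


def sample_linear_sem(edges, order, n_samples, noise_scale=1.0, seed=0):
    """Hypothetical sketch: sample rows from a linear SEM with Gaussian noise.

    edges: list of (source, destination, weight) tuples describing the DAG
    order: node names in topological order (parents before children)
    """
    rng = random.Random(seed)

    # Collect the weighted parents of every node.
    parents = {node: [] for node in order}
    for src, dst, weight in edges:
        parents[dst].append((src, weight))

    rows = []
    for _ in range(n_samples):
        row = {}
        for node in order:
            # Each node is a weighted sum of its parents plus independent noise.
            mean = sum(weight * row[src] for src, weight in parents[node])
            row[node] = mean + rng.gauss(0.0, noise_scale)
        rows.append(row)
    return rows


edges = [("rush_hour", "traffic", 0.8), ("weather", "traffic", -0.5)]
order = ["rush_hour", "weather", "traffic"]
rows = sample_linear_sem(edges, order, n_samples=5, seed=42)
print(rows[0])
```

Visiting nodes in topological order guarantees that every parent value exists before its children are computed, which is why the sketch asks the caller to supply `order` rather than deriving it.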

0 comments on commit 8265f64
