PC tutorial using ASIA data (py-why#67)
* Create PC algo tutorial
* Add updated poetry lock file
* Fix notebook and update docs for CI. Fix code spell

Signed-off-by: Robert Ness <robertness@gmail.com>
Co-authored-by: Adam Li <adam2392@gmail.com>
Signed-off-by: Adam Li <adam2392@gmail.com>
robertness and adam2392 committed Jan 6, 2023
1 parent 0e821fe commit e980c12
Showing 15 changed files with 1,704 additions and 279 deletions.
4 changes: 2 additions & 2 deletions .circleci/config.yml
@@ -46,8 +46,8 @@ jobs:
- run:
name: Install the latest version of Poetry
command: |
curl -sSL https://raw.githubusercontent.com/python-poetry/poetry/master/get-poetry.py | POETRY_UNINSTALL=1 python -
curl -sSL https://install.python-poetry.org | python3 - --version 1.2.2
curl -sSL https://install.python-poetry.org | python3 - --version 1.3.0
poetry --version
- run:
name: Set BASH_ENV
command: |
1 change: 1 addition & 0 deletions .codespellignore
@@ -1 +1,2 @@
 raison
+wee
4 changes: 4 additions & 0 deletions .gitignore
@@ -8,6 +8,10 @@ __pycache__/
 # C extensions
 *.so

+# possibly produced from drawing graphs
+*.gv
+*.png
+
 junit-results.xml

 # Distribution / packaging
3 changes: 3 additions & 0 deletions README.md
@@ -49,5 +49,8 @@ To install the package from github, clone the repository and then `cd` into the
 # for graph functionality
 poetry install --extras graph_func

+# to load datasets used in tutorials
+poetry install --extras data
+
 # if you would like an editable install of dodiscover for dev purposes
 pip install -e .
4 changes: 4 additions & 0 deletions doc/conf.py
@@ -263,6 +263,10 @@
     "image_scrapers": scrapers,
 }

+# prevent jupyter notebooks from being run even if empty cell
+# nbsphinx_execute = "never"
+nbsphinx_allow_errors = True
+
 # Custom sidebar templates, maps document names to template names.
 html_sidebars = {
     "index": ["search-field.html"],
4 changes: 2 additions & 2 deletions doc/index.rst
@@ -15,13 +15,13 @@ Contents
 --------

 .. toctree::
-   :maxdepth: 1
+   :maxdepth: 2
    :caption: Getting started:

    installation
    api
    use
-   tutorials
+   tutorials/index
    whats_new

 .. toctree::
12 changes: 0 additions & 12 deletions doc/tutorials.rst

This file was deleted.

18 changes: 18 additions & 0 deletions doc/tutorials/index.rst
@@ -0,0 +1,18 @@
+*********
+Tutorials
+*********
+
+.. _models_tutorials:
+
+Basic causal discovery models without latent confounders
+========================================================
+The first tutorial presents an algorithm for causal discovery without latent confounding: the Peter and Clark (PC) algorithm.
+Such models provide a basis for learning causal structure from data under the **assumption** that there are
+no latent confounders.
+
+.. toctree::
+   :maxdepth: 1
+   :titlesonly:
+
+   markovian/example-pc-algo
+
899 changes: 899 additions & 0 deletions doc/tutorials/markovian/example-pc-algo.ipynb

Large diffs are not rendered by default.
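
The notebook diff is not rendered on this page. As a rough illustration of the idea behind the tutorial, the sketch below shows the kind of conditional-independence check that constraint-based methods such as PC rely on: in a chain A -> B -> C, A and C are correlated, but their partial correlation given B is close to zero, so a PC-style search would drop the A - C edge. It uses synthetic data rather than the Asia dataset, and the helper names (`pearson`, `partial_corr`, `simulate_chain`) are hypothetical; none of them are part of dodiscover.

```python
import math
import random


def pearson(xs, ys):
    """Sample Pearson correlation between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def partial_corr(xs, ys, zs):
    """First-order partial correlation of xs and ys given zs."""
    r_xy = pearson(xs, ys)
    r_xz = pearson(xs, zs)
    r_yz = pearson(ys, zs)
    return (r_xy - r_xz * r_yz) / math.sqrt((1 - r_xz**2) * (1 - r_yz**2))


def simulate_chain(n, seed=0):
    """Draw n samples from a linear-Gaussian chain A -> B -> C (assumed toy model)."""
    rng = random.Random(seed)
    a = [rng.gauss(0, 1) for _ in range(n)]
    b = [x + rng.gauss(0, 1) for x in a]
    c = [y + rng.gauss(0, 1) for y in b]
    return a, b, c


a, b, c = simulate_chain(20000)
marginal = pearson(a, c)             # A and C are dependent through B
conditional = partial_corr(a, c, b)  # dependence should vanish given B
print(f"corr(A, C) = {marginal:.3f}, corr(A, C | B) = {conditional:.3f}")
```

Partial correlation is only a valid independence test for roughly linear-Gaussian data; discrete data such as Asia would need a different test.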

2 changes: 1 addition & 1 deletion doc/use.rst
@@ -3,7 +3,7 @@
 Using dodiscover
 =====================

-To be able to effectively use dodiscover, look at some of the examples here
+To be able to effectively use dodiscover, look at some of the basic examples here
 to learn everything you need!


1 change: 1 addition & 0 deletions doc/whats_new/_contributors.rst
@@ -22,3 +22,4 @@

 .. _Adam Li: https://adam2392.github.io
 .. _Chris Trevino: https://py-why.github.io
+.. _Robert Osazuwa Ness: https://py-why.github.io
2 changes: 2 additions & 0 deletions doc/whats_new/v0.1.rst
@@ -37,6 +37,7 @@ Changelog
 - |Feature| Implement FCI algorithm, :class:`dodiscover.constraint.FCI` for learning causal structure from observational data with latent confounders under the ``dodiscover.constraint`` submodule, by `Adam Li`_ (:pr:`52`)
 - |Feature| Implement Structural Hamming Distance metric to compare directed graphs, :func:`dodiscover.metrics.structure_hamming_dist`, by `Adam Li`_ (:pr:`55`)
 - |Fix| Update dependency on networkx, which removes a PR branch dependency with pywhy-graphs having the MixedEdgeGraph class that was causing a dependency conflict, by `Adam Li`_ (:pr:`74`)
+- |Enhancement| Add tutorial for PC algorithm with Asia data, by `Robert Osazuwa Ness`_ (:pr:`67`)

 Code and Documentation Contributors
 -----------------------------------
@@ -46,3 +47,4 @@ the project since version inception, including:

 * `Adam Li`_
 * `Chris Trevino`_
+* `Robert Osazuwa Ness`_
51 changes: 25 additions & 26 deletions dodiscover/constraint/pcalg.py
@@ -1,5 +1,5 @@
 import logging
-from itertools import combinations, permutations
+from itertools import combinations
 from typing import Optional

 import networkx as nx
@@ -134,37 +134,36 @@ def orient_edges(self, graph: EquivalenceClass) -> None:
             A skeleton graph. If ``None``, then will initialize PC using a
             complete graph. By default None.
         """
-        node_ids = graph.nodes
-
         # For all the combination of nodes i and j, apply the following
         # rules.
         idx = 0
         finished = False
         while idx < self.max_iter and not finished:  # type: ignore
             change_flag = False
-            for (i, j) in permutations(node_ids, 2):
-                if i == j:
-                    continue
-                # Rule 1: Orient i-j into i->j whenever there is an arrow k->i
-                # such that k and j are nonadjacent.
-                r1_add = self._apply_meek_rule1(graph, i, j)
-
-                # Rule 2: Orient i-j into i->j whenever there is a chain
-                # i->k->j.
-                r2_add = self._apply_meek_rule2(graph, i, j)
-
-                # Rule 3: Orient i-j into i->j whenever there are two chains
-                # i-k->j and i-l->j such that k and l are nonadjacent.
-                r3_add = self._apply_meek_rule3(graph, i, j)
-
-                # Rule 4: Orient i-j into i->j whenever there are two chains
-                # i-k->l and k->l->j such that k and j are nonadjacent.
-                #
-                # However, this rule is not necessary when the PC-algorithm
-                # is used to estimate a DAG.
-
-                if any([r1_add, r2_add, r3_add]) and not change_flag:
-                    change_flag = True
+            for i in graph.nodes:
+                for j in graph.neighbors(i):
+                    if i == j:
+                        continue
+                    # Rule 1: Orient i-j into i->j whenever there is an arrow k->i
+                    # such that k and j are nonadjacent.
+                    r1_add = self._apply_meek_rule1(graph, i, j)
+
+                    # Rule 2: Orient i-j into i->j whenever there is a chain
+                    # i->k->j.
+                    r2_add = self._apply_meek_rule2(graph, i, j)
+
+                    # Rule 3: Orient i-j into i->j whenever there are two chains
+                    # i-k->j and i-l->j such that k and l are nonadjacent.
+                    r3_add = self._apply_meek_rule3(graph, i, j)
+
+                    # Rule 4: Orient i-j into i->j whenever there are two chains
+                    # i-k->l and k->l->j such that k and j are nonadjacent.
+                    #
+                    # However, this rule is not necessary when the PC-algorithm
+                    # is used to estimate a DAG.
+
+                    if any([r1_add, r2_add, r3_add]) and not change_flag:
+                        change_flag = True
             if not change_flag:
                 finished = True
                 logger.info(f"Finished applying R1-3, with {idx} iterations")
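
The loop above can be illustrated with a minimal, self-contained sketch. It assumes a toy graph representation (a set of directed `(u, v)` tuples and a set of undirected `frozenset` edges) rather than dodiscover's `EquivalenceClass`, shows only Rules 1 and 2, and uses hypothetical function names; it is not the library's implementation.

```python
def adjacent(directed, undirected, a, b):
    """Return True if a and b share any edge, directed or undirected."""
    return (a, b) in directed or (b, a) in directed or frozenset((a, b)) in undirected


def orient(directed, undirected, i, j):
    """Replace the undirected edge i - j with the directed edge i -> j."""
    undirected.discard(frozenset((i, j)))
    directed.add((i, j))


def meek_rule1(nodes, directed, undirected, i, j):
    """Orient i - j into i -> j if some k -> i exists with k and j nonadjacent."""
    if frozenset((i, j)) not in undirected:
        return False
    for k in nodes:
        if k != j and (k, i) in directed and not adjacent(directed, undirected, k, j):
            orient(directed, undirected, i, j)
            return True
    return False


def meek_rule2(nodes, directed, undirected, i, j):
    """Orient i - j into i -> j if there is a chain i -> k -> j."""
    if frozenset((i, j)) not in undirected:
        return False
    for k in nodes:
        if (i, k) in directed and (k, j) in directed:
            orient(directed, undirected, i, j)
            return True
    return False


def apply_meek_rules(nodes, directed, undirected, max_iter=1000):
    """Apply Rules 1 and 2 repeatedly until a full pass changes nothing."""
    directed = set(directed)
    undirected = {frozenset(edge) for edge in undirected}
    for _ in range(max_iter):
        changed = False
        for i in nodes:
            # Visit only the undirected neighbors of i (snapshot, since edges
            # may be oriented inside the loop), not every ordered node pair.
            neighbors = [j for j in nodes if frozenset((i, j)) in undirected]
            for j in neighbors:
                r1 = meek_rule1(nodes, directed, undirected, i, j)
                r2 = meek_rule2(nodes, directed, undirected, i, j)
                changed = changed or r1 or r2
        if not changed:
            break
    return directed, undirected


# a -> b is already oriented; b - c, b - d and c - d are still undirected.
directed, undirected = apply_meek_rules(
    ["a", "b", "c", "d"],
    {("a", "b")},
    {("b", "c"), ("b", "d"), ("c", "d")},
)
print(sorted(directed), [sorted(e) for e in undirected])
```

In this example Rule 1 orients b -> c and b -> d, because a is adjacent to neither c nor d, while c - d stays undirected since no rule applies to it. Iterating over neighbors instead of all ordered pairs skips pairs that have no edge to orient.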