In [1]:
import pandas as pd

#### Load the example dataset

In [2]:
# URL of the tab-separated example dataset
data_url = "https://raw.githubusercontent.com/bd2kccd/py-causal/master/data/charity.txt"
df = pd.read_table(data_url, sep="\t")

#### Start Java VM

In [4]:
from pycausal.pycausal import pycausal as pc
# Note: this rebinds the name `pc` from the class to an instance of it
pc = pc()
pc.start_vm()

#### Create the Prior Knowledge Object

In [5]:
from pycausal import prior as p
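# Forbid the direct edge TangibilityCondition --> Impact
forbid = [['TangibilityCondition', 'Impact']]
# Require the direct edge Sympathy --> TangibilityCondition
require = [['Sympathy', 'TangibilityCondition']]
# Temporal tiers, earliest first; ForbiddenWithin marks a tier whose
# members may not have edges among themselves (shown as "0*" in the output)
tempForbid = p.ForbiddenWithin(['TangibilityCondition', 'Imaginability'])
temporal = [tempForbid, ['Sympathy', 'AmountDonated'], ['Impact']]
prior = p.knowledge(forbiddirect = forbid, requiredirect = require, addtemporal = temporal)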
prior

Instance of edu.cmu.tetrad.data.Knowledge2: /knowledge
addtemporal

0* Imaginability TangibilityCondition 
1 AmountDonated Sympathy 
2 Impact 

forbiddirect

TangibilityCondition ==> Impact 
requiredirect

Sympathy ==> TangibilityCondition 
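
The temporal tiers above can be read as a set of forbidden cause/effect pairs. The following is a minimal pure-Python sketch of that reading, assuming the usual Tetrad convention that a variable in a later tier may not cause a variable in an earlier tier, and that a starred tier also forbids edges among its own members. The helper `forbidden_by_tiers` is hypothetical and is not part of py-causal.

```python
from itertools import product


def forbidden_by_tiers(tiers, forbidden_within=()):
    """Return the (cause, effect) pairs ruled out by temporal tiers.

    Assumption: later tiers may not cause earlier tiers; tiers whose index
    is in `forbidden_within` also forbid edges among their own members.
    """
    forbidden = set()
    for idx, tier in enumerate(tiers):
        # No edge from this tier back to any earlier tier.
        for earlier in tiers[:idx]:
            forbidden.update(product(tier, earlier))
        # Optionally, no edge between two members of the same tier.
        if idx in forbidden_within:
            forbidden.update((a, b) for a, b in product(tier, tier) if a != b)
    return forbidden


tiers = [
    ["TangibilityCondition", "Imaginability"],  # tier 0 (starred)
    ["Sympathy", "AmountDonated"],              # tier 1
    ["Impact"],                                 # tier 2
]
forbidden = forbidden_by_tiers(tiers, forbidden_within={0})
print(len(forbidden), "directed pairs are ruled out")
print(("Sympathy", "TangibilityCondition") in forbidden)
```

Under this reading, the pair (Sympathy, TangibilityCondition) is ruled out by the tiers even though `requiredirect` lists it, so the two constraints may be worth double-checking against each other.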

#### Load causal algorithms from the py-causal library and Run FGES Continuous

In [6]:
from pycausal import search as s
tetrad = s.tetradrunner()
tetrad.getAlgorithmDescription(algoId = 'fges')

FGES is an optimized and parallelized version of an algorithm developed by Meek [Meek, 1997] called the Greedy Equivalence Search (GES). The algorithm was further developed and studied by Chickering [Chickering, 2002]. GES is a Bayesian algorithm that heuristically searches the space of CBNs and returns the model with highest Bayesian score it finds. In particular, GES starts its search with the empty graph. It then performs a forward stepping search in which edges are added between nodes in order to increase the Bayesian score. This process continues until no single edge addition increases the score. Finally, it performs a backward stepping search that removes edges until no single edge removal can increase the score. For more information see http://www.ccd.pitt.edu/pdfs/fgesc.pdf. The reference is Ramsey et al., 2017.

The algorithms requires a decomposable score—that is, a score that for the entire DAG model is a sum of logged scores of each variables given its parents in the model.
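
The forward and backward phases described above can be illustrated with a small greedy search over a decomposable score. This is a deliberately simplified sketch, not the FGES implementation: it searches over DAGs rather than equivalence classes, and the score table `TOY_SCORES` and the function `toy_local_score` are made-up values chosen only so that both phases do some work.

```python
from itertools import permutations


def creates_cycle(parents, src, dst):
    """True if adding src -> dst would close a directed cycle."""
    stack, seen = [src], set()
    while stack:  # walk the ancestors of src
        node = stack.pop()
        if node == dst:
            return True
        if node not in seen:
            seen.add(node)
            stack.extend(parents[node])
    return False


def greedy_search(nodes, local_score):
    """Forward edge additions, then backward edge removals."""
    parents = {v: set() for v in nodes}

    def gain(dst, new_parents):
        # Decomposability: changing dst's parents only changes dst's term.
        old = local_score(dst, frozenset(parents[dst]))
        return local_score(dst, frozenset(new_parents)) - old

    # Forward phase: add the best single edge while the score improves.
    while True:
        candidates = [
            (gain(dst, parents[dst] | {src}), src, dst)
            for src, dst in permutations(nodes, 2)
            if src not in parents[dst] and not creates_cycle(parents, src, dst)
        ]
        best = max(candidates, key=lambda c: c[0], default=None)
        if best is None or best[0] <= 0:
            break
        parents[best[2]].add(best[1])

    # Backward phase: remove the best single edge while the score improves.
    while True:
        candidates = [
            (gain(dst, parents[dst] - {src}), src, dst)
            for dst in nodes
            for src in parents[dst]
        ]
        best = max(candidates, key=lambda c: c[0], default=None)
        if best is None or best[0] <= 0:
            break
        parents[best[2]].discard(best[1])
    return parents


# Made-up local scores for node D given each parent set (illustration only).
TOY_SCORES = {
    frozenset("A"): 3, frozenset("B"): 2, frozenset("C"): 1,
    frozenset("AB"): 5, frozenset("AC"): 4, frozenset("BC"): 7,
    frozenset("ABC"): 6,
}


def toy_local_score(node, parent_set):
    if node == "D" and parent_set in TOY_SCORES:
        return TOY_SCORES[parent_set]
    return -len(parent_set)  # every other parent set is penalized


result = greedy_search(["A", "B", "C", "D"], toy_local_score)
print(result)
```

With these toy scores the forward phase adds A, B and C as parents of D one at a time, and the backward phase then drops A because the parent set {B, C} scores higher than {A, B, C}.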

In [7]:
tetrad.getAlgorithmParameters(algoId = 'fges', scoreId = 'fisher-z')

alpha: Cutoff for p values (alpha) (min = 0.0) (java.lang.Double) [default:0.01]
faithfulnessAssumed: Yes if (one edge) faithfulness should be assumed (java.lang.Boolean) [default:true]
symmetricFirstStep: Yes if the first step step for FGES should do scoring for both X->Y and Y->X (java.lang.Boolean) [default:false]
maxDegree: The maximum degree of the graph (min = -1) (java.lang.Integer) [default:100]
verbose: Yes if verbose output should be printed or logged (java.lang.Boolean) [default:false]
bootstrapSampleSize: The number of bootstraps (min = 0) (java.lang.Integer) [default:0]
bootstrapEnsemble: Ensemble method: Preserved (0), Highest (1), Majority (2) (java.lang.Integer) [default:1]
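
To make the `alpha` cutoff concrete, here is a minimal sketch of the textbook Fisher z test for a (partial) correlation, using only the standard library. It illustrates what a p-value cutoff of 0.01 means in general; how Tetrad uses Fisher z internally is not shown in this notebook, and the function name `fisher_z_p_value` is hypothetical.

```python
import math


def fisher_z_p_value(r, n, k=0):
    """Two-sided p-value for the null hypothesis of zero (partial) correlation.

    r: sample (partial) correlation, with -1 < r < 1
    n: sample size
    k: number of conditioning variables (0 for a plain correlation)
    """
    z = math.atanh(r) * math.sqrt(n - k - 3)  # Fisher z statistic
    return math.erfc(abs(z) / math.sqrt(2))   # equals 2 * (1 - Phi(|z|))


ALPHA = 0.01  # same value as the default shown above

for r in (0.0, 0.1, 0.5):
    p_value = fisher_z_p_value(r, n=103)
    verdict = "dependent" if p_value < ALPHA else "independent"
    print(f"r = {r:.1f}: judged {verdict} at alpha = {ALPHA}")
```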


In [8]:
tetrad.run(algoId = 'fges', dfs = df, priorKnowledge = prior,
           alpha = 0.01, maxDegree = -1, faithfulnessAssumed = True, verbose = True)

#### FGES Continuous Result: Nodes

In [9]:
tetrad.getNodes()

['TangibilityCondition',
 'AmountDonated',
 'Sympathy',
 'Imaginability',
 'Impact']

#### FGES Continuous Result: Edges

In [10]:
tetrad.getEdges()

['Sympathy --> TangibilityCondition',
 'Sympathy --> Impact',
 'Sympathy --- AmountDonated']
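
The edge strings above are easy to split into their parts for further processing. A small sketch, assuming every edge has the form `source mark target` separated by single spaces, as in this output; `parse_edge` is a hypothetical helper, not a py-causal function.

```python
def parse_edge(edge):
    """Split an edge string such as 'X --> Y' into (source, mark, target)."""
    source, mark, target = edge.split()
    return source, mark, target


edges = [
    "Sympathy --> TangibilityCondition",
    "Sympathy --> Impact",
    "Sympathy --- AmountDonated",
]

parsed = [parse_edge(e) for e in edges]
# '-->' is a directed edge; '---' is an edge whose direction was left undetermined.
directed = [(s, t) for s, mark, t in parsed if mark == "-->"]
undirected = [(s, t) for s, mark, t in parsed if mark == "---"]

print("directed:", directed)
print("undirected:", undirected)
```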

In [None]:
from pycausal import pycausal 

#### Plot the Result Graph

In [16]:
import pydot

In [18]:
from IPython.display import SVG

In [19]:
dot_str = pc.tetradGraphToDot(tetrad.getTetradGraph())

In [20]:
graphs = pydot.graph_from_dot_data(dot_str)

In [21]:
# create_svg() runs Graphviz's "dot" program, which must be installed and on PATH
svg_str = graphs[0].create_svg()

FileNotFoundError: [WinError 2] "dot.exe" not found in path.

In [17]:
SVG(svg_str)

FileNotFoundError: [WinError 2] "dot.exe" not found in path.
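
The error above indicates that the Graphviz `dot` executable could not be found. One way to check for it before rendering is sketched below, using only the standard library; the helper name `find_dot_executable` is hypothetical.

```python
import shutil


def find_dot_executable():
    """Return the full path to Graphviz's `dot` program, or None if it is not on PATH."""
    return shutil.which("dot")


dot_path = find_dot_executable()
if dot_path is None:
    print("Graphviz 'dot' was not found on PATH; install Graphviz before calling create_svg().")
else:
    print("Using Graphviz at", dot_path)
```

If the check reports that `dot` is missing, installing Graphviz and adding its `bin` directory to PATH (then restarting the notebook kernel) is the usual remedy.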

#### Stop Java VM

In [11]:
pc.stop_vm()