# Causal Structures
Using Halerium Causal Structures

Author: {{ cookiecutter.author_name }}
Created: {{ cookiecutter.timestamp }}

In [0]:
# Link to project experiments folder hypothesis_experiment_learnings.board (refresh and hit enter on this line to see the link)

## How to use the notebook

The following cells:
- specify the objective, variables, and variable types,
- read the dataset,
- set up the causal structure,
- run the manual and automatic models,
- present the results from the tests.

By default, the notebook is set up to run with an example (wine quality). To see how it works, run the notebook without changing the code.

For your project, adjust the code in the linked cells to match your objectives, variables, and dataset, then execute all cells in order.

Please refer to causal_structure.board for detailed instructions.

In [0]:
# <halerium id="f36f9c9c-baff-4dc9-b6d4-1151f86f4f5c">
# Link to causal_structure.board
# </halerium id="f36f9c9c-baff-4dc9-b6d4-1151f86f4f5c">


### 1. Imports

In [0]:
import numpy as np
import pandas as pd

### 2. Import the Dataset

In [0]:
# <halerium id="0125f07e-fb0d-47e8-90ff-fd5b9771bece">
time_series = False
test_size = 0.25
path = 'default example' # Specify the path of the data
# </halerium id="0125f07e-fb0d-47e8-90ff-fd5b9771bece">
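How the Halerium helpers use `test_size` and `time_series` internally is not shown in this notebook. As a rough illustration only, the sketch below shows two common ways of holding out 25% of the rows: a random sample for ordinary data and the most recent rows for time series. The function `split_frame`, its `date_column` and `seed` parameters, and the `demo` data are all hypothetical.

```python
import pandas as pd

def split_frame(frame, test_size, time_series=False, date_column='date', seed=0):
    """Hypothetical helper: return (train, test) with about test_size of the rows held out."""
    n_test = int(round(len(frame) * test_size))
    if time_series:
        # Keep chronological order so the test set is strictly the latest rows
        ordered = frame.sort_values(date_column)
        cut = len(ordered) - n_test
        return ordered.iloc[:cut], ordered.iloc[cut:]
    # Otherwise draw the test rows at random (assumes a unique index)
    test = frame.sample(n=n_test, random_state=seed)
    train = frame.drop(test.index)
    return train, test

# Made-up data, not taken from WineQT.csv
demo = pd.DataFrame({
    'date': pd.date_range('2024-01-01', periods=8, freq='D'),
    'pH': [3.1, 3.2, 3.3, 3.4, 3.5, 3.6, 3.7, 3.8],
})
train_ts, test_ts = split_frame(demo, test_size=0.25, time_series=True)
train_rand, test_rand = split_frame(demo, test_size=0.25)
```

With a chronological split, every test row is later than every training row, which avoids evaluating a model on data that precedes its training period.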


Importing the dataset

In [0]:
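# Fall back to the hosted wine quality example when no path is given
if path == 'default example':
    path = 'https://raw.githubusercontent.com/erium/halerium-example-data/main/hypothesis_testing/WineQT.csv'

if time_series:
    # Time series data must provide a 'date' column
    df = pd.read_csv(path, parse_dates=['date'])
else:
    # sep=None lets pandas detect the delimiter; this requires the Python engine
    df = pd.read_csv(path, sep=None, engine='python')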

Previewing the dataset

In [0]:
df
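Before modelling, it can help to confirm that the columns named later in the notebook exist and are numeric. The following is a minimal sketch on a small stand-in DataFrame; `check_columns` is a hypothetical helper and not part of Halerium, and the `sample` values are made up.

```python
import pandas as pd

def check_columns(frame, required):
    """Hypothetical helper: report required columns that are missing or non-numeric."""
    missing = [name for name in required if name not in frame.columns]
    non_numeric = [
        name for name in required
        if name in frame.columns and not pd.api.types.is_numeric_dtype(frame[name])
    ]
    return {'missing': missing, 'non_numeric': non_numeric}

# Stand-in data using the feature names from the example
sample = pd.DataFrame({
    'fixed acidity': [7.4, 7.8, 11.2],
    'volatile acidity': [0.70, 0.88, 0.28],
    'pH': [3.51, 3.20, 3.16],
    'label': ['a', 'b', 'c'],
})
report = check_columns(sample, ['fixed acidity', 'volatile acidity', 'pH'])
```

In the notebook itself, the same call could be made with `df` and the feature lists defined in the next section.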

### 3. Model Causal Structure

#### Manual Modelling
Manually specify the directed dependencies in the causal structure.

In [0]:
# Directed dependencies
# <halerium id="aa8af7c8-f96c-4333-9da1-3769def1afe1">
dependencies = [['fixed acidity', 'pH'], ['volatile acidity', 'pH']]
manual_features_input = ['fixed acidity', 'volatile acidity']
manual_features_output = ['pH']
# </halerium id="aa8af7c8-f96c-4333-9da1-3769def1afe1">
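A manually specified structure is only a valid causal graph if it contains no directed cycle. The sketch below checks this with the standard library's `graphlib` (Python 3.9+), assuming each pair in `dependencies` reads `[cause, effect]`. `is_acyclic` is a hypothetical helper and not part of Halerium.

```python
from graphlib import TopologicalSorter, CycleError

def is_acyclic(dependencies):
    """Hypothetical helper: True if the [cause, effect] pairs contain no directed cycle."""
    sorter = TopologicalSorter()
    for cause, effect in dependencies:
        # add(node, *predecessors): register the cause as a predecessor of the effect
        sorter.add(effect, cause)
    try:
        # prepare() raises CycleError if the graph has a cycle
        sorter.prepare()
    except CycleError:
        return False
    return True

example_dependencies = [['fixed acidity', 'pH'], ['volatile acidity', 'pH']]
acyclic = is_acyclic(example_dependencies)
```

Running such a check before calling the modelling function catches accidental loops such as `[['a', 'b'], ['b', 'a']]` early.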


#### Automatic Modelling
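Generate all possible directed acyclic graphs (DAGs) over the listed features.
This becomes computationally slow with more than 3 features, because the number of possible DAGs grows very quickly with the number of features.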

In [0]:
# <halerium id="aa8af7c8-f96c-4333-9da1-3769def1afe1">
features = ['fixed acidity', 'volatile acidity', 'pH']
auto_features_input = ['fixed acidity', 'volatile acidity']
auto_features_output = ['pH']
# </halerium id="aa8af7c8-f96c-4333-9da1-3769def1afe1">
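To give a sense of why enumerating DAGs gets slow, the sketch below counts the labelled DAGs on `n` nodes using Robinson's recurrence. It is independent of Halerium, and `count_dags` is a hypothetical name. The automatic modelling function may well search a smaller set, for example by respecting the input and output features, so these counts are an upper bound for illustration.

```python
from functools import lru_cache
from math import comb

@lru_cache(maxsize=None)
def count_dags(n):
    """Number of labelled directed acyclic graphs on n nodes (Robinson's recurrence)."""
    if n == 0:
        return 1
    # Inclusion-exclusion over the k nodes that have no incoming edges
    return sum(
        (-1) ** (k + 1) * comb(n, k) * 2 ** (k * (n - k)) * count_dags(n - k)
        for k in range(1, n + 1)
    )

counts = {n: count_dags(n) for n in range(1, 6)}
# counts == {1: 1, 2: 3, 3: 25, 4: 543, 5: 29281}
```

Three features give 25 candidate graphs, while five already give 29281, each of which would need to be fitted and evaluated.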


### 4. Run the Model

Manual Model

In [0]:
from functions.causal_structure import manual_causal_structure

# <halerium id="80f0fbe8-c939-463b-9771-535cd4918c96">
manual_results = manual_causal_structure(df, dependencies, manual_features_input, manual_features_output, test_size)
# </halerium id="80f0fbe8-c939-463b-9771-535cd4918c96">


Automatic Model

In [0]:
from functions.causal_structure import auto_causal_structure

# <halerium id="80f0fbe8-c939-463b-9771-535cd4918c96">
auto_results = auto_causal_structure(df, features, auto_features_input, auto_features_output, test_size)
# </halerium id="80f0fbe8-c939-463b-9771-535cd4918c96">


### 5. Interpret the Results

Manual Model Results

In [0]:
from functions.causal_structure import model_manual_results

if 'manual_results' not in globals():
    print("Manual causal structure not modelled")
else:
# <halerium id="4cfc52f6-8ef5-4434-9d64-16505d9ed731">
    model_manual_results(dependencies, manual_features_input, manual_features_output, manual_results)
# </halerium id="4cfc52f6-8ef5-4434-9d64-16505d9ed731">


Automatic Model Results

In [0]:
from functions.causal_structure import model_auto_results

if 'auto_results' not in globals():
    print("Automatic causal structure not modelled")
else:
# <halerium id="4cfc52f6-8ef5-4434-9d64-16505d9ed731">
    model_auto_results(features, auto_features_input, auto_features_output, auto_results)
# </halerium id="4cfc52f6-8ef5-4434-9d64-16505d9ed731">
