# Creating a Causal Reasoning Dataset with Reasoning Core

This notebook demonstrates how to use the `reasoning_core` library to generate a synthetic dataset for Causal Reasoning tasks (specifically `BayesianAssociation`). We will show how to use the `.config.set_level()` method to generate problems of increasing difficulty and how to prepare the final dataset for uploading to the Hugging Face Hub.

In [3]:
import pandas as pd
import reasoning_core as rcr

## 1. Initialize the Task

We start by instantiating the `BayesianAssociation` task. This task generates problems where the goal is to compute the probability distribution of a target variable given some evidence in a Bayesian network. It corresponds to `Rung 1` (association) of Pearl's ladder of causation.

In [4]:
task_r1 = rcr.get_task("BayesianAssociation")
print("Task initialized:", task_r1.task_name)

Task initialized: bayesian_association


We can also use the `Rung 2` (intervention) version of the task, which corresponds to the second rung of Pearl's ladder of causation.

In [5]:
task_r2 = rcr.get_task("BayesianIntervention")
print("Task initialized:", task_r2.task_name)

Task initialized: bayesian_intervention
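
To make the difference between the two rungs concrete, here is a minimal hand-written sketch that does not use `reasoning_core`. It assumes a hypothetical two-node network `X_0 -> X_1` with made-up probabilities, and contrasts observing `X_1` (Rung 1) with forcing `X_1` (Rung 2). The function names are illustrative only.

```python
# Hypothetical two-node network X_0 -> X_1 (both binary); numbers are made up.
p_x0 = {0: 0.9, 1: 0.1}
p_x1_given_x0 = {0: {0: 0.8, 1: 0.2}, 1: {0: 0.3, 1: 0.7}}


def observe_x1(x1):
    """Rung 1: P(X_0 | X_1 = x1), obtained with Bayes' rule."""
    joint = {x0: p_x0[x0] * p_x1_given_x0[x0][x1] for x0 in p_x0}
    total = sum(joint.values())
    return {x0: value / total for x0, value in joint.items()}


def do_x1(x1):
    """Rung 2: P(X_0 | do(X_1 = x1)).

    Forcing X_1 removes the edge X_0 -> X_1, so X_1 carries no
    information about X_0 and X_0 keeps its prior distribution.
    """
    return dict(p_x0)


print("observe:", observe_x1(1))
print("do:     ", do_x1(1))
```

Observing `X_1 = 1` raises the probability of `X_0 = 1` from 0.1 to 0.28, whereas intervening on `X_1` leaves the prior of `X_0` unchanged.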


## 2. Understanding Difficulty Levels

The `reasoning_core` library allows you to control the difficulty of the generated problems using the `.config.set_level(level)` method. Increasing the level typically increases the complexity of the underlying graph (e.g., more nodes, larger domains, more complex dependencies).

### 2.1 Generation

In [6]:
# Demonstrate Level 0
task_r1.config.set_level(0)
print(f"Level 0 Config:\n {task_r1.config}\n")
example_l0 = task_r1.generate_example()
print("Level 0 Question excerpt: \n", example_l0.prompt,"\n\nAnswer: ", example_l0.answer)

Level 0 Config:
 Rung12Config(c=1.0, level=0, seed=None, size=None, n_nodes=3, max_domain_size=2, edge_prob=0.5, graph_generation_mode='erdos', n_round=1, cpt_relative_threshold=0, cot_scientific_notation=False)



Level 0 Question excerpt: 
 System:
P(X_0) = {'0': 0.9, '1': 0.1} 
X_2 ~ Noisy-OR(leak=0.0, weights={'X_0': 0.8, 'X_1': 0.7}) 
P(X_1) = {'0': 0.3, '1': 0.7}
Observed conditions:
Without further Observation/Knowledge of other variable.
Task: Compute probability distribution for X_1 (possible values: [0, 1]).

Output: Python dict mapping each value to its probability, rounded to 1 decimals.
Example: {0: 0.1, 1: 0.9} 

Answer:  {0: 0.3, 1: 0.7}


In [7]:
# Demonstrate Level 3
task_r1.config.set_level(3)
print(f"\nLevel 3 Config:\n {task_r1.config}\n")
example_l3 = task_r1.generate_example()
print("Level 3 Question excerpt: \n", example_l3.prompt,"\n\nAnswer: ", example_l3.answer)


Level 3 Config:
 Rung12Config(c=1.0, level=3, seed=None, size=None, n_nodes=4, max_domain_size=4, edge_prob=0.5, graph_generation_mode='erdos', n_round=2, cpt_relative_threshold=1.5, cot_scientific_notation=False)



Level 3 Question excerpt: 
 System:
P(X_0) = {'0': 0.478, '1': 0.158, '2': 0.364} 
P(X_3|X_0=0, X_2=0) = {'0': 0.357, '1': 0.643} 
P(X_3|X_0=0, X_2=1) = {'0': 0.755, '1': 0.245} 
P(X_3|X_0=1, X_2=0) = {'0': 0.239, '1': 0.761} 
P(X_3|X_0=1, X_2=1) = {'0': 0.495, '1': 0.505} 
P(X_3|X_0=2, X_2=0) = {'0': 0.566, '1': 0.434} 
P(X_3|X_0=2, X_2=1) = {'0': 0.367, '1': 0.633} 
P(X_2) = {'0': 0.728, '1': 0.272} 
P(X_1) = {'0': 0.156, '1': 0.044, '2': 0.8}
Observed conditions:
Observing/Knowing that the state X_1 is equal to 0, and the state X_2 is equal to 0
Task: Compute probability distribution for X_0 (possible values: [0, 1, 2]).

Output: Python dict mapping each value to its probability, rounded to 3 decimals.
Example: {0: 0.123, 1: 0.877} 

Answer:  {0: 0.478, 1: 0.158, 2: 0.364}


### 2.2 CoT

In [8]:
print(f"Level 0 Cot example: \n{example_l0.metadata.cot}\n")

print(f"Level 3 Cot example: \n{example_l3.metadata.cot}\n")

Level 0 Cot example: 
Result: P(X_1) = {0: 0.3, 1: 0.7}
Result: P(X_1) = {0: 0.3, 1: 0.7}

Level 3 Cot example: 
Result: P(X_0) = {0: 0.478, 1: 0.158, 2: 0.364}
Result: P(X_0) = {0: 0.478, 1: 0.158, 2: 0.364}



### 2.3 Score Answer

The score is computed from the Jensen-Shannon divergence: $\mathrm{score}(p, q) = \left(1 - \frac{\mathrm{JS}(p, q)}{\log 2}\right)^{\mathrm{power}}$. The power was calibrated and initially set to 128.
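
The formula can be sketched in plain Python as follows. This is an illustrative re-implementation of the formula as written above, not the library's code: `js_divergence` and `js_score` are hypothetical names, and the library may differ in details (log base, normalization, answer parsing), so the values need not match the outputs shown in this notebook.

```python
import math


def js_divergence(p, q):
    """Jensen-Shannon divergence (natural log) between two dicts value -> probability."""
    keys = set(p) | set(q)
    # Mixture distribution m = (p + q) / 2
    m = {k: 0.5 * (p.get(k, 0.0) + q.get(k, 0.0)) for k in keys}

    def kl(a):
        # KL(a || m); terms with a[k] == 0 contribute nothing
        return sum(
            a.get(k, 0.0) * math.log(a.get(k, 0.0) / m[k])
            for k in keys
            if a.get(k, 0.0) > 0
        )

    return 0.5 * kl(p) + 0.5 * kl(q)


def js_score(p, q, power=128):
    """score(p, q) = (1 - JS(p, q) / log 2) ** power, clipped at 0."""
    ratio = js_divergence(p, q) / math.log(2)
    return max(0.0, 1.0 - ratio) ** power


truth = {0: 0.478, 1: 0.158, 2: 0.364}
print(js_score(truth, truth))            # identical distributions
print(js_score({0: 1.0}, {1: 1.0}))      # disjoint supports
```

Because the divergence is divided by $\log 2$, its maximum, the base of the power lies in $[0, 1]$, and a large exponent makes the score drop quickly as the two distributions move apart.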

In [9]:
print("""\nScore of the ground truth:\n""", rcr.score_answer(example_l3.answer, example_l3))


Score of the ground truth:
 1.0


In [10]:
print("""\nScore of a close distribution:\n""", rcr.score_answer({0: 0.35, 1: 0.4, 2: 0.25}, example_l3))


Score of a close distribution:
 9.666472921759194e-10


In [11]:
print("""\nScore of a mildly shifted distribution:\n""", rcr.score_answer({0: 0.3, 1: 0.4, 2: 0.3}, example_l3))


Score of a mildly shifted distribution:
 3.9929972989703943e-10


In [12]:
print("""\nScore of a shifted distribution:\n""", rcr.score_answer({0: 0.5, 1: 0.1, 2: 0.4}, example_l3))


Score of a shifted distribution:
 0.12848099158049142


## 3. Other features

### 3.1 You can also seed the generation to get twin problems across both rungs

In [13]:
# Set both tasks to level 2
task_r1.config.set_level(2)
task_r2.config.set_level(2)

# Use the same seeds on both rungs so that they share the same system
task_r1.config.set_seed(graph_seed=42, conditionning_seed=24)
task_r2.config.set_seed(graph_seed=42, conditionning_seed=24)

twin_r1_example = task_r1.generate_example()
print("Rung1 Question excerpt: \n", twin_r1_example.prompt,"\n\nAnswer: ", twin_r1_example.answer)

print("\n ------------------------------ \n")

twin_r2_example = task_r2.generate_example()
print("Rung2 Question excerpt: \n", twin_r2_example.prompt,"\n\nAnswer: ", twin_r2_example.answer)

Rung1 Question excerpt: 
 System:
P(X_0) = {'0': 0.64, '1': 0.36} 
X_2 ~ Noisy-MAX(leak=None, influences={'X_0': {'1': [0.61, 0.38, 0.01]}, 'X_1': {'1': [0.79, 0.17, 0.04], '2': [0.56, 0.34, 0.1]}}) 
X_3 ~ Noisy-MIN(leak=None, influences={'X_0': {'1': [0.0, 1.0]}, 'X_1': {'1': [0.2, 0.8], '2': [0.0, 1.0]}, 'X_2': {'1': [0.23, 0.77], '2': [0.0, 1.0]}}) 
P(X_1) = {'0': 0.37, '1': 0.21, '2': 0.42}
Observed conditions:
Observing/Knowing that the state X_0 is equal to 0, and the state X_3 is equal to 0, and the state X_2 is equal to 2
Task: Compute probability distribution for X_1 (possible values: [0, 1, 2]).

Output: Python dict mapping each value to its probability, rounded to 2 decimals.
Example: {0: 0.12, 1: 0.88} 

Answer:  {0: 0.0, 1: 0.2, 2: 0.8}

 ------------------------------ 



Rung2 Question excerpt: 
 System:
P(X_0) = {'0': 0.64, '1': 0.36} 
X_2 ~ Noisy-MAX(leak=None, influences={'X_0': {'1': [0.61, 0.38, 0.01]}, 'X_1': {'1': [0.79, 0.17, 0.04], '2': [0.56, 0.34, 0.1]}}) 
X_3 ~ Noisy-MIN(leak=None, influences={'X_0': {'1': [0.0, 1.0]}, 'X_1': {'1': [0.2, 0.8], '2': [0.0, 1.0]}, 'X_2': {'1': [0.23, 0.77], '2': [0.0, 1.0]}}) 
P(X_1) = {'0': 0.37, '1': 0.21, '2': 0.42}
Observed conditions:
Doing/Imposing that the state X_2 is equal to 2. Observing/Knowing that the state X_0 is equal to 0, and the state X_3 is equal to 0
Task: Compute probability distribution for X_1 (possible values: [0, 1, 2]).

Output: Python dict mapping each value to its probability, rounded to 2 decimals.
Example: {0: 0.12, 1: 0.88} 

Answer:  {0: 0.42424242424242425, 1: 0.22222222222222224, 2: 0.35353535353535354}


### 3.2 You can also fine-tune individual configuration settings

### 3.2.1 Noisy mode

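Noisy-OR/AND gates model interactions between binary variables, and their extensions Noisy-MAX/MIN handle multi-valued variables. They aim to keep the number of parameters of a node small, since a full conditional probability table grows exponentially with the number of parents.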
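
As an illustration, the sketch below evaluates a Noisy-OR node under the textbook definition, $P(X = 1 \mid \text{parents}) = 1 - (1 - \text{leak}) \prod_{i \,\text{active}} (1 - w_i)$. It reuses the weights printed in the Level 0 example (`leak=0.0`, `weights={'X_0': 0.8, 'X_1': 0.7}`). Whether `reasoning_core` uses exactly this parameterization is an assumption, and `noisy_or_prob` is a hypothetical helper.

```python
def noisy_or_prob(weights, active_parents, leak=0.0):
    """P(child = 1 | parents) under the textbook Noisy-OR model.

    weights: dict parent name -> probability that this parent alone turns the child on
    active_parents: names of the parents currently equal to 1
    """
    p_off = 1.0 - leak
    for name in active_parents:
        # Each active parent independently fails to turn the child on
        p_off *= 1.0 - weights[name]
    return 1.0 - p_off


weights = {"X_0": 0.8, "X_1": 0.7}
for active in ([], ["X_0"], ["X_1"], ["X_0", "X_1"]):
    print(active, round(noisy_or_prob(weights, active), 2))
```

Two weights are enough to describe all four parent configurations, whereas a full table would need one row per configuration.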

In [14]:
task_r1.config.Noisy_mode, task_r2.config.Noisy_mode = False, False  # disable Noisy interactions

twin_r1_example = task_r1.generate_example()
print("Rung1 Question excerpt: \n", twin_r1_example.prompt,"\n\nAnswer: ", twin_r1_example.answer)

print("\n ------------------------------ \n")

twin_r2_example = task_r2.generate_example()
print("Rung2 Question excerpt: \n", twin_r2_example.prompt,"\n\nAnswer: ", twin_r2_example.answer)

Rung1 Question excerpt: 
 System:
P(X_0) = {'0': 0.64, '1': 0.36} 
P(X_2|X_0=0, X_1=0) = {'0': 0.35, '1': 0.35, '2': 0.3} 
P(X_2|X_0=0, X_1=1) = {'0': 0.22, '1': 0.38, '2': 0.4} 
P(X_2|X_0=0, X_1=2) = {'0': 0.6, '1': 0.09, '2': 0.31} 
P(X_2|X_0=1, X_1=0) = {'0': 0.51, '1': 0.33, '2': 0.16} 
P(X_2|X_0=1, X_1=1) = {'0': 0.09, '1': 0.36, '2': 0.55} 
P(X_2|X_0=1, X_1=2) = {'0': 0.5, '1': 0.47, '2': 0.03} 
P(X_3|X_0=0, X_1=0, X_2=0) = {'0': 0.48, '1': 0.52} 
P(X_3|X_0=0, X_1=0, X_2=1) = {'0': 0.41, '1': 0.59} 
P(X_3|X_0=0, X_1=0, X_2=2) = {'0': 0.53, '1': 0.47} 
P(X_3|X_0=0, X_1=1, X_2=0) = {'0': 0.66, '1': 0.34} 
P(X_3|X_0=0, X_1=1, X_2=1) = {'0': 0.09, '1': 0.91} 
P(X_3|X_0=0, X_1=1, X_2=2) = {'0': 0.52, '1': 0.48} 
P(X_3|X_0=0, X_1=2, X_2=0) = {'0': 0.49, '1': 0.51} 
P(X_3|X_0=0, X_1=2, X_2=1) = {'0': 0.8, '1': 0.2} 
P(X_3|X_0=0, X_1=2, X_2=2) = {'0': 0.22, '1': 0.78} 
P(X_3|X_0=1, X_1=0, X_2=0) = {'0': 0.91, '1': 0.09} 
P(X_3|X_0=1, X_1=0, X_2=1) = {'0': 0.71, '1': 0.29} 
P(X_3|X_0=1, X

Rung2 Question excerpt: 
 System:
P(X_0) = {'0': 0.64, '1': 0.36} 
P(X_2|X_0=0, X_1=0) = {'0': 0.35, '1': 0.35, '2': 0.3} 
P(X_2|X_0=0, X_1=1) = {'0': 0.22, '1': 0.38, '2': 0.4} 
P(X_2|X_0=0, X_1=2) = {'0': 0.6, '1': 0.09, '2': 0.31} 
P(X_2|X_0=1, X_1=0) = {'0': 0.51, '1': 0.33, '2': 0.16} 
P(X_2|X_0=1, X_1=1) = {'0': 0.09, '1': 0.36, '2': 0.55} 
P(X_2|X_0=1, X_1=2) = {'0': 0.5, '1': 0.47, '2': 0.03} 
P(X_3|X_0=0, X_1=0, X_2=0) = {'0': 0.48, '1': 0.52} 
P(X_3|X_0=0, X_1=0, X_2=1) = {'0': 0.41, '1': 0.59} 
P(X_3|X_0=0, X_1=0, X_2=2) = {'0': 0.53, '1': 0.47} 
P(X_3|X_0=0, X_1=1, X_2=0) = {'0': 0.66, '1': 0.34} 
P(X_3|X_0=0, X_1=1, X_2=1) = {'0': 0.09, '1': 0.91} 
P(X_3|X_0=0, X_1=1, X_2=2) = {'0': 0.52, '1': 0.48} 
P(X_3|X_0=0, X_1=2, X_2=0) = {'0': 0.49, '1': 0.51} 
P(X_3|X_0=0, X_1=2, X_2=1) = {'0': 0.8, '1': 0.2} 
P(X_3|X_0=0, X_1=2, X_2=2) = {'0': 0.22, '1': 0.78} 
P(X_3|X_0=1, X_1=0, X_2=0) = {'0': 0.91, '1': 0.09} 
P(X_3|X_0=1, X_1=0, X_2=1) = {'0': 0.71, '1': 0.29} 
P(X_3|X_0=1, X

Without the Noisy nodes, the number of parameters increases sharply, because every combination of parent values needs its own row. We might therefore want to reduce the connectivity of the graph.
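
A rough count makes the growth visible. The sketch below compares the number of probabilities in a full table with the number in a Noisy-MAX/MIN description, taking the domain sizes from the seeded example above (`X_0`: 2 values, `X_1`: 3, `X_2`: 3, `X_3`: 2). The Noisy count assumes one influence vector per non-zero parent state, which is how the printed descriptions look; the helper names are hypothetical.

```python
from math import prod


def full_cpt_entries(child_size, parent_sizes):
    """Probabilities in a full table: one row per combination of parent values."""
    return child_size * prod(parent_sizes)


def noisy_entries(child_size, parent_sizes):
    """Probabilities in a Noisy-MAX/MIN description.

    Assumption: one vector of length child_size per non-zero state of each parent.
    """
    return child_size * sum(size - 1 for size in parent_sizes)


# X_3 (2 values) with parents X_0, X_1, X_2 of sizes 2, 3, 3
print("full :", full_cpt_entries(2, [2, 3, 3]))
print("noisy:", noisy_entries(2, [2, 3, 3]))
```

The full table grows with the product of the parent domain sizes, while the Noisy description grows only with their sum, so lowering `edge_prob` (fewer parents per node) mainly helps in the non-Noisy setting.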

In [15]:
task_r1.config.edge_prob, task_r2.config.edge_prob = 0.25, 0.25  # reduce the connectivity of the graph

twin_r1_example = task_r1.generate_example()
print("Rung1 Question excerpt: \n", twin_r1_example.prompt,"\n\nAnswer: ", twin_r1_example.answer)

print("\n ------------------------------ \n")

twin_r2_example = task_r2.generate_example()
print("Rung2 Question excerpt: \n", twin_r2_example.prompt,"\n\nAnswer: ", twin_r2_example.answer)

Rung1 Question excerpt: 
 System:
P(X_0) = {'0': 0.64, '1': 0.36} 
P(X_2|X_0=0, X_1=0) = {'0': 0.35, '1': 0.35, '2': 0.3} 
P(X_2|X_0=0, X_1=1) = {'0': 0.22, '1': 0.38, '2': 0.4} 
P(X_2|X_0=0, X_1=2) = {'0': 0.6, '1': 0.09, '2': 0.31} 
P(X_2|X_0=1, X_1=0) = {'0': 0.51, '1': 0.33, '2': 0.16} 
P(X_2|X_0=1, X_1=1) = {'0': 0.09, '1': 0.36, '2': 0.55} 
P(X_2|X_0=1, X_1=2) = {'0': 0.5, '1': 0.47, '2': 0.03} 
P(X_3|X_0=0, X_1=0, X_2=0) = {'0': 0.48, '1': 0.52} 
P(X_3|X_0=0, X_1=0, X_2=1) = {'0': 0.41, '1': 0.59} 
P(X_3|X_0=0, X_1=0, X_2=2) = {'0': 0.53, '1': 0.47} 
P(X_3|X_0=0, X_1=1, X_2=0) = {'0': 0.66, '1': 0.34} 
P(X_3|X_0=0, X_1=1, X_2=1) = {'0': 0.09, '1': 0.91} 
P(X_3|X_0=0, X_1=1, X_2=2) = {'0': 0.52, '1': 0.48} 
P(X_3|X_0=0, X_1=2, X_2=0) = {'0': 0.49, '1': 0.51} 
P(X_3|X_0=0, X_1=2, X_2=1) = {'0': 0.8, '1': 0.2} 
P(X_3|X_0=0, X_1=2, X_2=2) = {'0': 0.22, '1': 0.78} 
P(X_3|X_0=1, X_1=0, X_2=0) = {'0': 0.91, '1': 0.09} 
P(X_3|X_0=1, X_1=0, X_2=1) = {'0': 0.71, '1': 0.29} 
P(X_3|X_0=1, X

Rung2 Question excerpt: 
 System:
P(X_0) = {'0': 0.64, '1': 0.36} 
P(X_2|X_0=0, X_1=0) = {'0': 0.35, '1': 0.35, '2': 0.3} 
P(X_2|X_0=0, X_1=1) = {'0': 0.22, '1': 0.38, '2': 0.4} 
P(X_2|X_0=0, X_1=2) = {'0': 0.6, '1': 0.09, '2': 0.31} 
P(X_2|X_0=1, X_1=0) = {'0': 0.51, '1': 0.33, '2': 0.16} 
P(X_2|X_0=1, X_1=1) = {'0': 0.09, '1': 0.36, '2': 0.55} 
P(X_2|X_0=1, X_1=2) = {'0': 0.5, '1': 0.47, '2': 0.03} 
P(X_3|X_0=0, X_1=0, X_2=0) = {'0': 0.48, '1': 0.52} 
P(X_3|X_0=0, X_1=0, X_2=1) = {'0': 0.41, '1': 0.59} 
P(X_3|X_0=0, X_1=0, X_2=2) = {'0': 0.53, '1': 0.47} 
P(X_3|X_0=0, X_1=1, X_2=0) = {'0': 0.66, '1': 0.34} 
P(X_3|X_0=0, X_1=1, X_2=1) = {'0': 0.09, '1': 0.91} 
P(X_3|X_0=0, X_1=1, X_2=2) = {'0': 0.52, '1': 0.48} 
P(X_3|X_0=0, X_1=2, X_2=0) = {'0': 0.49, '1': 0.51} 
P(X_3|X_0=0, X_1=2, X_2=1) = {'0': 0.8, '1': 0.2} 
P(X_3|X_0=0, X_1=2, X_2=2) = {'0': 0.22, '1': 0.78} 
P(X_3|X_0=1, X_1=0, X_2=0) = {'0': 0.91, '1': 0.09} 
P(X_3|X_0=1, X_1=0, X_2=1) = {'0': 0.71, '1': 0.29} 
P(X_3|X_0=1, X

### 3.2.2 Verbose mode

#### 3.2.2.0 Noisy setting

Unseed the generators and set them to level 1.

In [16]:
task_r1.config.set_seed(), task_r2.config.set_seed()
task_r1.config.set_level(1), task_r2.config.set_level(1)


(Rung12Config(c=1.0, level=1, seed=None, size=None, n_nodes=3, max_domain_size=3, edge_prob=0.5, graph_generation_mode='erdos', n_round=1, cpt_relative_threshold=0.5, cot_scientific_notation=False),
 Rung12Config(c=1.0, level=1, seed=None, size=None, n_nodes=3, max_domain_size=2, edge_prob=0.5, graph_generation_mode='erdos', n_round=2, cpt_relative_threshold=0.5, cot_scientific_notation=False))

Set their mode to verbose.

In [17]:
task_r1.config.is_verbose, task_r2.config.is_verbose = True, True
task_r1.config.concise_cot, task_r2.config.concise_cot = False, False

#### 3.2.2.1 Verbose prompts

In [18]:
verbose_example_1 = task_r1.generate_example()
print(f"Verbose rung1 example: \n{verbose_example_1.prompt} \n")

verbose_example_2 = task_r2.generate_example()
print(f"Verbose rung2 example: \n{verbose_example_2.prompt} \n")

Verbose rung1 example: 
System:
The probability of X_0 = 0 is 0.3 and The probability of X_0 = 1 is 0.7. 
If X_0 = 0 and X_1 = 0, then The probability of X_2 = 0 is 0.8 and The probability of X_2 = 1 is 0.2. 
If X_0 = 0 and X_1 = 1, then The probability of X_2 = 0 is 0.6 and The probability of X_2 = 1 is 0.4. 
If X_0 = 1 and X_1 = 0, then The probability of X_2 = 0 is 0.8 and The probability of X_2 = 1 is 0.2. 
If X_0 = 1 and X_1 = 1, then The probability of X_2 = 0 is 0.4 and The probability of X_2 = 1 is 0.6. 
The probability of X_1 = 0 is 0.7 and The probability of X_1 = 1 is 0.3.
Observed conditions:
Observing/Knowing that the state X_2 is equal to 0
Task: Compute probability distribution for X_0 (possible values: [0, 1]).

Output: Python dict mapping each value to its probability, rounded to 1 decimals.
Example: {0: 0.1, 1: 0.9} 



Verbose rung2 example: 
System:
The probability of X_0 = 0 is 0.01 and The probability of X_0 = 1 is 0.99. 
If X_0 = 0 and X_1 = 0, then The probability of X_2 = 0 is 0.54 and The probability of X_2 = 1 is 0.18 and The probability of X_2 = 2 is 0.28. 
If X_0 = 0 and X_1 = 1, then The probability of X_2 = 0 is 0.25 and The probability of X_2 = 1 is 0.29 and The probability of X_2 = 2 is 0.46. 
If X_0 = 0 and X_1 = 2, then The probability of X_2 = 0 is 0.69 and The probability of X_2 = 1 is 0.27 and The probability of X_2 = 2 is 0.04. 
If X_0 = 1 and X_1 = 0, then The probability of X_2 = 0 is 0.37 and The probability of X_2 = 1 is 0.34 and The probability of X_2 = 2 is 0.29. 
If X_0 = 1 and X_1 = 1, then The probability of X_2 = 0 is 0.54 and The probability of X_2 = 1 is 0.11 and The probability of X_2 = 2 is 0.35. 
If X_0 = 1 and X_1 = 2, then The probability of X_2 = 0 is 0.32 and The probability of X_2 = 1 is 0.38 and The probability of X_2 = 2 is 0.3. 
The probability of X_1 = 0 is

#### 3.2.2.2 Verbose CoT

In [19]:
print(f"Verbose rung1 example: \n{verbose_example_1.metadata.cot} \n")

print(f"Verbose rung2 example: \n{verbose_example_2.metadata.cot} \n")

Verbose rung1 example: 
Initialization: Selected Elimination Order = ['X_1']

--- Step: Eliminate Variable 'X_1' ---
1. Retrieve relevant factors containing 'X_1':
   Table for P(X_2=0 | X_0, X_1):
     [X_0=0, X_1=0] = 0.8
     [X_0=0, X_1=1] = 0.6
     [X_0=1, X_1=0] = 0.8
     [X_0=1, X_1=1] = 0.4
   Table for P(X_1):
     [X_1=0] = 0.7
     [X_1=1] = 0.3

2. Compute the Intermediate Joint (Product):
   Formula: P(X_1, X_2=0 | X_0) = P(X_2=0 | X_0, X_1) * P(X_1)
   Table for P(X_1, X_2=0 | X_0):
     [X_0=0, X_1=0] = 0.6
     [X_0=0, X_1=1] = 0.2
     [X_0=1, X_1=0] = 0.6
     [X_0=1, X_1=1] = 0.1

3. Marginalize (Sum) out variable 'X_1':
   Formula: P(X_2=0 | X_0) = ∑_{X_1} P(X_1, X_2=0 | X_0)
   Table for P(X_2=0 | X_0):
     [X_0=0] = 0.7
     [X_0=1] = 0.7


--- Final Step: Normalization ---
1. Gather all remaining factors (Query Variables + Priors):
   Table for P(X_0):
     [X_0=0] = 0.3
     [X_0=1] = 0.7
   Table for P(X_2=0 | X_0):
     [X_0=0] = 0.7
     [X_0=1] = 0.7

2. 
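
The elimination steps in the rung 1 trace above can be replayed by hand. The following sketch recomputes $P(X_0 \mid X_2 = 0)$ from the probabilities printed in the verbose rung 1 prompt, without using the library. The intermediate values in the trace appear to be rounded to one decimal for display; the sketch keeps full precision and rounds only at the end. Variable names are illustrative.

```python
# Probabilities copied from the verbose rung 1 prompt
p_x0 = {0: 0.3, 1: 0.7}
p_x1 = {0: 0.7, 1: 0.3}
# P(X_2 = 0 | X_0, X_1), indexed by (x0, x1)
p_x2_is_0 = {(0, 0): 0.8, (0, 1): 0.6, (1, 0): 0.8, (1, 1): 0.4}

# Step 1: eliminate X_1
# P(X_2=0 | X_0) = sum over x1 of P(X_2=0 | X_0, x1) * P(x1)
p_x2_is_0_given_x0 = {
    x0: sum(p_x2_is_0[(x0, x1)] * p_x1[x1] for x1 in p_x1)
    for x0 in p_x0
}

# Step 2: multiply by the prior P(X_0) and normalize
unnormalized = {x0: p_x0[x0] * p_x2_is_0_given_x0[x0] for x0 in p_x0}
z = sum(unnormalized.values())
posterior = {x0: round(value / z, 1) for x0, value in unnormalized.items()}

print(p_x2_is_0_given_x0)  # roughly 0.74 and 0.68 before rounding
print(posterior)
```

With these numbers the sketch yields `{0: 0.3, 1: 0.7}` after rounding to one decimal, the precision requested in the prompt.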

## 4. Generating the Dataset

We will now generate a dataset.

### 4.1 Generate a dataset with the previous config settings (verbose in this case)

In [20]:
task_r1.generate_balanced_batch(batch_size=2)

[---Prompt:System:
 The probability of X_0 = 0 is 0.5 and The probability of X_0 = 1 is 0.3 and The probability of X_0 = 2 is 0.2. 
 If X_0 = 0, then The probability of X_2 = 0 is 0.9 and The probability of X_2 = 1 is 0.1. 
 If X_0 = 1, then The probability of X_2 = 0 is 0.3 and The probability of X_2 = 1 is 0.7. 
 If X_0 = 2, then The probability of X_2 = 0 is 0.1 and The probability of X_2 = 1 is 0.9. 
 The probability of X_1 = 0 is 0.5 and The probability of X_1 = 1 is 0.5. 
 If X_1 = 0 and X_2 = 0, then The probability of X_3 = 0 is 0.5 and The probability of X_3 = 1 is 0.5 and The probability of X_3 = 2 is 0.0. 
 If X_1 = 0 and X_2 = 1, then The probability of X_3 = 0 is 0.3 and The probability of X_3 = 1 is 0.4 and The probability of X_3 = 2 is 0.3. 
 If X_1 = 1 and X_2 = 0, then The probability of X_3 = 0 is 0.3 and The probability of X_3 = 1 is 0.2 and The probability of X_3 = 2 is 0.5. 
 If X_1 = 1 and X_2 = 1, then The probability of X_3 = 0 is 0.3 and The probability of X_3 

### 4.2 Generating with specific level

Generate problems at a specific level, regardless of the generator's current configuration.

In [21]:
task_r1.generate_balanced_batch(batch_size=2, level=1)

[---Prompt:System:
 The probability of X_0 = 0 is 0.8 and The probability of X_0 = 1 is 0.1 and The probability of X_0 = 2 is 0.1. 
 If X_0 = 0, then The probability of X_1 = 0 is 0.7 and The probability of X_1 = 1 is 0.3. 
 If X_0 = 1, then The probability of X_1 = 0 is 0.5 and The probability of X_1 = 1 is 0.5. 
 If X_0 = 2, then The probability of X_1 = 0 is 0.8 and The probability of X_1 = 1 is 0.2. 
 If X_0 = 0 and X_1 = 0, then The probability of X_2 = 0 is 0.7 and The probability of X_2 = 1 is 0.3. 
 If X_0 = 0 and X_1 = 1, then The probability of X_2 = 0 is 0.6 and The probability of X_2 = 1 is 0.4. 
 If X_0 = 1 and X_1 = 0, then The probability of X_2 = 0 is 0.6 and The probability of X_2 = 1 is 0.4. 
 If X_0 = 1 and X_1 = 1, then The probability of X_2 = 0 is 0.6 and The probability of X_2 = 1 is 0.4. 
 If X_0 = 2 and X_1 = 0, then The probability of X_2 = 0 is 0.6 and The probability of X_2 = 1 is 0.4. 
 If X_0 = 2 and X_1 = 1, then The probability of X_2 = 0 is 0.5 and The 
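
Since the level can be passed per call, a curriculum of increasing difficulty can be assembled with a simple loop. The sketch below is a hypothetical helper, `build_curriculum`, that accepts any callable with the call shape `generate_batch(batch_size, level=...)`. A stand-in generator is used so that the block runs without `reasoning_core`.

```python
def build_curriculum(generate_batch, levels, per_level):
    """Collect problems level by level and tag each one with its level."""
    records = []
    for level in levels:
        for problem in generate_batch(per_level, level=level):
            records.append({"level": level, "problem": problem})
    return records


# Stand-in generator (not part of the library) so the sketch is self-contained
def fake_generate_batch(batch_size, level):
    return [f"problem-L{level}-{i}" for i in range(batch_size)]


curriculum = build_curriculum(fake_generate_batch, levels=[0, 1, 2], per_level=2)
print(len(curriculum), curriculum[0])
```

In this notebook one would presumably pass `task_r1.generate_balanced_batch` in place of `fake_generate_batch`, assuming it accepts the batch size as its first positional argument.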

### 4.3 Converting to Hugging Face Dataset

In [22]:
import datasets
import pandas as pd

# 1. Generate data as a list of Problem objects
data = task_r1.generate_balanced_batch(5, level=1)

# 2. Create DataFrame
df = pd.DataFrame(data)
display(df.head())

# 3. Convert to a Hugging Face Dataset
hf_dataset = datasets.Dataset.from_pandas(df)

# 4. Split into train and test
dataset_dict = hf_dataset.train_test_split(test_size=0.2)
print("\nDataset Structure:", dataset_dict)

| | answer | metadata | prompt | task |
|---|---|---|---|---|
| 0 | {0: 0.82, 1: 0.18} | {'target_var_values': [0, 1], 'bif_description... | System:\nThe probability of X_0 = 0 is 0.69 an... | bayesian_association |
| 1 | {0: 0.3, 1: 0.7} | {'target_var_values': [0, 1], 'bif_description... | System:\nThe probability of X_0 = 0 is 0.2 and... | bayesian_association |
| 2 | {0: 0.47, 1: 0.53} | {'target_var_values': [0, 1], 'bif_description... | System:\nThe probability of X_0 = 0 is 0.47 an... | bayesian_association |
| 3 | {0: 0.1, 1: 0.7, 2: 0.2} | {'target_var_values': [0, 1, 2], 'bif_descript... | System:\nThe probability of X_1 = 0 is 0.4 and... | bayesian_association |
| 4 | {0: 0.37, 1: 0.63} | {'target_var_values': [0, 1], 'bif_description... | System:\nThe probability of X_0 = 0 is 0.37 an... | bayesian_association |



Dataset Structure: DatasetDict({
    train: Dataset({
        features: ['answer', 'metadata', 'prompt', 'task'],
        num_rows: 4
    })
    test: Dataset({
        features: ['answer', 'metadata', 'prompt', 'task'],
        num_rows: 1
    })
})


### 4.4 Uploading to Hugging Face Hub

In [23]:
# DATASET_NAME = "your-username/causal-reasoning-curriculum"
# dataset_dict.push_to_hub(DATASET_NAME)
# print(f"Dataset uploaded to https://huggingface.co/datasets/{DATASET_NAME}")