# Recreating the RW17 Dataset from Scratch

This notebook first explains the data structure used to represent the materials of Rehder and Waldmann (2017), hereafter RW17,
and then demonstrates how to **construct the RW17 domain components** using helper functions
from `dataset_creation` and **convert them into a structured DataFrame**.

We will:
1. **Create a domain dictionary** using `create_domain_dict`
2. **Expand it into a DataFrame** using `expand_domain_to_dataframe`
3. **Add inference tasks** to extend the dataset
4. **Generate verbalized prompts** for human evaluation


This step will:

- Load and examine `rw_17_domain_components` (the predefined dataset components).
- Break down its structure to understand:
  - Domain dictionary (specification of causal variables).
  - Graph structure (causal relationships).
  - Inference tasks (reasoning scenarios).
- Explain how these elements combine to generate prompts for LLMs.


In [1]:
import os
import sys
import pprint
import pandas as pd

# Ensure Python finds the `src` directory
sys.path.append(os.path.abspath("../../src"))

# Import everything defined in `__all__`
from causalalign.dataset_creation import (
    rw_17_domain_components,
    graph_structures,
    inference_tasks_rw17,
    generate_prompt_dataframe,
    expand_domain_to_dataframe,
    expand_df_by_task_queries,
    create_domain_dict,
    verbalize_domain_intro,
    verbalize_causal_mechanism,
    verbalize_inference_task,
    append_dfs,
)


econ_config_rw17 = rw_17_domain_components["economy"]["variables"]
socio_config_rw17 = rw_17_domain_components["sociology"]["variables"]
weather_config_rw17 = rw_17_domain_components["weather"]["variables"]
print("Domain components loaded successfully!")


print("Dataset creation module imported successfully!")


Domain components loaded successfully!
Dataset creation module imported successfully!


# 1. Understanding the RW17 Dataset Structure

Before generating new datasets, we need to understand the structure of **RW17 domain components**.

## 🔹 What Is RW17?
RW17 presented human participants with causal inference tasks and asked for their likelihood judgements. Each causal inference task was presented on four subsequent screens.
Below, I describe how I translated the experimental materials used by RW17 into nested dictionaries. These dictionaries serve as the backbone for algorithmically generating the RW17 materials in *textual form*, so that we can prompt LLMs and compare their causal judgements. This notebook explains how to algorithmically re-create the textual form used by RW17 and how to easily create new prompts.

RW17 used three knowledge domains (economy, sociology, and weather) in which the inference tasks were thematically embedded.
Each domain specifies:
- **Variables (`C1`, `C2`, `E`)**: The causal variables / graph nodes, e.g., `C1`: interest rates
- **Variable sense for the binary values 0 and 1 of (`C1`, `C2`, `E`)**: e.g., for `C1=1`: *high* interest rates
- **Counterbalance-dependent sense assignments (`p`/`m`)**: How we represent conditions in counterbalanced ways (optional, but used in RW17). Essentially, this flips the sense of what it means for a variable to be on (1) or off (0).

The verbalization of the prompt depends on the domain and on:
- **Causal Mechanisms**: How the variables influence each other (e.g., specified by a collider, fork, or chain graph)
- **Inference Tasks**: The reasoning problems we ask an LLM to solve, each specified by an observation and a query

By combining these components, we **generate structured natural language prompts** that can be used for causal reasoning tasks with LLMs.


##  Understanding the RW17 Domain Dictionary

Each domain in `rw_17_domain_components` contains:
1. **Domain Name & Introduction**:  
   - Explains the overall knowledge structure / cover story the inference task is embedded in.
   
2. **Causal Variables (`C1`, `C2`, `E`)**:  
   - `C1` and `C2` are **causes**, and `E` is the **effect**. In the code, they are stored under the keys `X`, `Y`, and `Z`.
   - Each variable has:
     - A **name** and **detailed description**.
     - `p_value` and `m_value`: Counterbalanced values.
     - **Explanation mappings**: How specific conditions lead to outcomes (optional, and graph-dependent).

3. **Example: Economy Domain**
   - `C1`: **Interest Rates** (low vs. high)
   - `C2`: **Trade Deficit** (small vs. large)
   - `E`: **Retirement Savings** (high vs. low)
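
To make the counterbalancing concrete, here is a minimal sketch of a single variable entry and a lookup of its verbalized sense. The field names `name`, `detailed`, `p_value`, and `m_value` follow the description above, but the entry is a simplified stand-in for the real one in `constants.py`, the `m_value` mapping is assumed to be the flipped `p_value` mapping, and the helper `lookup_sense` is hypothetical and not part of `causalalign`.

```python
# Simplified, illustrative variable entry (the real entries hold more fields).
interest_rates = {
    "name": "interest rates",
    "detailed": "Interest rates are the rates banks charge to loan money.",
    "p_value": {"1": "low", "0": "high"},
    # Assumption: the 'm' assignment simply flips the senses of the 'p' assignment.
    "m_value": {"1": "high", "0": "low"},
}


def lookup_sense(variable, cntbl, value):
    """Return the verbal sense of a binary value under counterbalance code 'p' or 'm'."""
    if cntbl not in ("p", "m"):
        raise ValueError(f"unknown counterbalance code: {cntbl!r}")
    key = "p_value" if cntbl == "p" else "m_value"
    return variable[key][str(value)]


sense_p = lookup_sense(interest_rates, "p", 1)
sense_m = lookup_sense(interest_rates, "m", 1)
print(sense_p, interest_rates["name"])
print(sense_m, interest_rates["name"])
```

Under these assumptions the same state `1` is verbalized differently depending on the counterbalance code, which is the flip described above.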


#### Make sure you understand the building blocks of the dictionary. 

In ``src/causalalign/dataset_creation/constants.py``, there are dictionaries that define the domain building blocks, the causal mechanisms for three graph topologies (collider, fork, and chain), and the inference tasks. Before re-creating the prompts used in RW17, let's load them and get a feeling for the prompt structure:




In [2]:
# load domain components
rw_17 = rw_17_domain_components


# list the domains in the dataset
print("Domains in the dataset:")
print(rw_17.keys())


# Pretty-print RW17 dataset structure

pprint.pprint(rw_17)

Domains in the dataset:
dict_keys(['economy', 'sociology', 'weather'])
{'economy': {'domain_name': 'economy',
             'introduction': 'Economists seek to describe and predict the '
                             'regular patterns of economic fluctuation. To do '
                             'this, they study some important variables or '
                             'attributes of economies. They also study how '
                             'these attributes are responsible for producing '
                             'or causing one another.',
             'variables': {'X': {'detailed': 'Interest rates are the rates '
                                             'banks charge to loan money.',
                                 'explanations': {'m_m': 'A lot of people are '
                                                         'making large monthly '
                                                         'interest payments on '
                                                  

##  How Graph Structures Define Causal Mechanisms

A **graph structure** specifies how causal variables (`C1`, `C2`, `E`) relate to each other.

### Example Graph Structures:
1. **Collider** (`C1 → E ← C2`)
   - `C1` and `C2` both cause `E`.

2. **Fork** (`C1 ← E → C2`)
   - `E` causes both `C1` and `C2`.

3. **Chain** (`C1 → C2 → E`)
   - `C1` causes `C2`, which then affects `E`.

Let's look at the graph structures dictionary that is predefined in ``src/causalalign/dataset_creation/constants.py``:


In [3]:
# Pretty-print available graph structures
pprint.pprint(graph_structures)


{'chain': {'causal_template': '{x_sense} {x_name} causes {y_sense} {y_name}. '
                              'And {y_sense} {y_name} causes {z_sense} '
                              '{z_name}.',
           'description': 'A→B→C'},
 'collider': {'causal_template': '{x_sense} {x_name} causes {z_sense} '
                                 '{z_name}. Also, {y_sense} {y_name} causes '
                                 '{z_sense} {z_name}.',
              'description': 'X→Z←Y'},
 'fork': {'causal_template': '{x_sense} {x_name} causes {y_sense} {y_name}. '
                             'Also, {x_sense} {x_name} causes {z_sense} '
                             '{z_name}.',
          'description': 'B←A→C'}}
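
As a rough illustration of how such a `causal_template` turns into text, the sketch below fills the collider template with Python's built-in `str.format`. The template string is copied from the output above; the sense and name values are illustrative economy values chosen for this example. How `verbalize_causal_mechanism` actually fills and post-processes the template is not shown in this notebook, so treat this as an assumption.

```python
# Collider template copied from the `graph_structures` output.
collider_template = (
    "{x_sense} {x_name} causes {z_sense} {z_name}. "
    "Also, {y_sense} {y_name} causes {z_sense} {z_name}."
)

# Illustrative values for one counterbalance condition (assumed for this sketch).
fields = {
    "x_sense": "low", "x_name": "interest rates",
    "y_sense": "small", "y_name": "trade deficits",
    "z_sense": "high", "z_name": "retirement savings",
}

# str.format replaces each named placeholder with the matching dictionary value.
mechanism = collider_template.format(**fields)
print(mechanism)
```

The placeholders `{z_sense}` and `{z_name}` appear twice in the collider template, so both causes are stated to produce the same effect sense.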


##  How Inference Tasks Define the Final Prompt

Inference tasks specify **what the LLM needs to predict** given certain observations.

For example:
- `"a": {"query_node": "Ci", "observation": "Cj=1", "query": "Ci=?"}`
  - **Ask:** Given that `Cj=1`, what is the likely value of `Ci`?
  - This corresponds to **a causal reasoning question**.

Inference tasks work together with **domain dictionaries** and **graph structures** to create **verbalized prompts**.

### 🔗 How It All Connects:
1. **Domain Dictionary** → Specifies **variables and values**.
2. **Graph Structure** → Defines **causal relationships**.
3. **Inference Tasks** → Frame **the reasoning problem**.
4. **Prompt Verbalization** → Converts this into **natural language for LLMs**.
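
The task definitions in `inference_tasks_rw17` use the role placeholders `Xi` and `Yj` rather than concrete variable names. The sketch below shows one way such placeholders could be resolved into the two possible role assignments of the causes `X` and `Y`. The task entry is copied from `inference_tasks_rw17`; the helper `resolve_placeholders` is hypothetical, and the actual logic in `expand_df_by_task_queries` may differ.

```python
import re

# Task 'c' as stored in `inference_tasks_rw17`.
task_c = {
    "observation": "Z=1, Yj=0",
    "query": "p(Xi=1|Z=1, Yj=0)",
    "query_node": "Xi=1",
}

PLACEHOLDER = re.compile(r"Xi|Yj")


def resolve_placeholders(task, xi, yj):
    """Replace the role placeholders Xi and Yj with concrete variable names."""
    mapping = {"Xi": xi, "Yj": yj}
    return {
        key: PLACEHOLDER.sub(lambda match: mapping[match.group(0)], text)
        for key, text in task.items()
    }


# Each cause can play either role, which gives two concrete queries per task.
variants = [
    resolve_placeholders(task_c, xi="X", yj="Y"),
    resolve_placeholders(task_c, xi="Y", yj="X"),
]
for variant in variants:
    print(variant["query"])
```

A single regular-expression pass is used instead of chained `str.replace` calls so that a replacement value can never be re-substituted by a later replacement.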



In [4]:
pprint.pprint(inference_tasks_rw17)

{'a': {'observation': 'Z=1, Yj=1',
       'query': 'p(Xi=1|Z=1, Yj=1)',
       'query_node': 'Xi=1'},
 'b': {'observation': 'Z=1', 'query': 'p(Xi=1|Z=1)', 'query_node': 'Xi=1'},
 'c': {'observation': 'Z=1, Yj=0',
       'query': 'p(Xi=1|Z=1, Yj=0)',
       'query_node': 'Xi=1'},
 'd': {'observation': 'Yj=1', 'query': 'p(Xi=1|Yj=1)', 'query_node': 'Xi=1'},
 'e': {'observation': 'Yj=0', 'query': 'p(Xi=1|Yj=0)', 'query_node': 'Xi=1'},
 'f': {'observation': 'Z=0, Yj=1',
       'query': 'p(Xi=1|Z=0, Yj=1)',
       'query_node': 'Xi=1'},
 'g': {'observation': 'Z=0', 'query': 'p(Xi=1|Z=0)', 'query_node': 'Xi=1'},
 'h': {'observation': 'Z=0, Yj=0',
       'query': 'p(Xi=1|Z=0, Yj=0)',
       'query_node': 'Xi=1'},
 'i': {'observation': 'Xi=0, Yj=0',
       'query': 'p(Z=1|Xi=0, Yj=0)',
       'query_node': 'Z=1'},
 'j': {'observation': 'Xi=0, Yj=1',
       'query': 'p(Z=1|Xi=0, Yj=1)',
       'query_node': 'Z=1'},
 'k': {'observation': 'Xi=1, Yj=1',
       'query': 'p(Z=1|Xi=1, Yj=1)',
       

Now, let's explore how these components are combined by re-creating the prompts used in RW17, starting with the dictionaries that are stored in ``constants.py``.


# Step 1: Create Domain Dictionary


In [5]:
# # Example usage with enforcement of "normal" for zero values
# try:
#     economy_test_dict = create_domain_dict(
#         domain="economy",
#         introduction="Economists seek to describe and predict the regular patterns of economic fluctuation. To do this, they study some important variables or attributes of economies. They also study how these variables are responsible for producing or causing one another.",
#         X_name="interest rates",
#         X_detailed="Interest rates are the rates banks charge to loan money.",
#         X_range_intro="Some economies have <tag> interest rates. Others have <tag> interest rates.",
#         X_values={"1": "low", "0": "high"},
#         Y_name="trade deficits",
#         Y_detailed="A country's trade deficit...",
#         Y_values={"1": "small", "0": "large"},
#         Z_name="retirement savings",
#         Z_detailed="Retirement savings is the money people save for their retirement.",
#         Z_values={"1": "high", "0": "low"},
#         counterbalance_enabled=True,
#         enforce_zero_label=True,  # Ensures '0' is verbalized as 'normal'
#         zero_label="normal",
#     )

#     # Convert dictionary to a readable JSON format for display
#     import json

#     formatted_output = json.dumps(economy_test_dict, indent=4)
#     print(formatted_output)

# except Exception as e:
#     print(f"Error: {e}")

## Create Domain dicts

In [6]:
econ_dict_rw17 = create_domain_dict(
    domain_name="economy",
    introduction="Economists seek to describe and predict the regular patterns of economic fluctuation. To do this, they study some important variables or attributes of economies. They also study how these attributes are responsible for producing or causing one another.",
    variables_config=econ_config_rw17,
    graph_type="collider",
)
# Generate prompts based on dict created in this notebook
econ_prompts_df = generate_prompt_dataframe(
    econ_dict_rw17, inference_tasks_rw17, "collider", graph_structures
)
print(f"shape: {econ_prompts_df.shape}")


weather_dict_rw17 = create_domain_dict(
    domain_name="weather",
    introduction="Meteorologists seek to describe and predict the regular patterns that govern weather systems. To do this, they study some important variables or attributes of weather systems. They also study how these attributes are responsible for producing or causing one another.",
    variables_config=weather_config_rw17,
    graph_type="collider",
)
# Generate prompts based on dict created in this notebook
weather_prompts_df = generate_prompt_dataframe(
    weather_dict_rw17, inference_tasks_rw17, "collider", graph_structures
)
weather_prompts_df.shape

sociology_dict_rw17 = create_domain_dict(
    domain_name="sociology",
    introduction="Sociologists seek to describe and predict the regular patterns of societal interactions. To do this, they study some important variables or attributes of societies. They also study how these attributes are responsible for producing or causing one another.",
    variables_config=socio_config_rw17,
    graph_type="collider",
)
# Generate prompts based on dict created in this notebook
sociology_prompts_df = generate_prompt_dataframe(
    sociology_dict_rw17, inference_tasks_rw17, "collider", graph_structures
)
sociology_prompts_df.shape

shape: (160, 24)


(160, 24)

Now append the individual DataFrames:

In [7]:
rw_17_over_complete_df = append_dfs(
    econ_prompts_df, weather_prompts_df, sociology_prompts_df
)
# add a unique id to each row
rw_17_over_complete_df["id"] = range(1, len(rw_17_over_complete_df) + 1)


# subset for the counterbalance conditions used in the RW17 study
select_contbl_cond_xs = ["ppp", "pmm", "mmp", "mpm"]
print(
    f"unique counterbalance conditions used: {rw_17_over_complete_df['cntbl_cond'].unique()}"
)

rw_17_complete_df = rw_17_over_complete_df[
    rw_17_over_complete_df["cntbl_cond"].isin(select_contbl_cond_xs)
]


unique counterbalance conditions used: ['ppp' 'ppm' 'pmp' 'pmm' 'mpp' 'mpm' 'mmp' 'mmm']
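
The eight condition codes printed above are all combinations of `p` and `m` over three variables. The sketch below enumerates them with `itertools.product` and keeps the four used in RW17. It assumes that the letters of a code refer to `X`, `Y`, and `Z` in that order; the helper `split_condition` is hypothetical.

```python
from itertools import product

variables = ("X", "Y", "Z")

# All p/m assignments over the three variables: 2**3 = 8 codes.
all_conditions = ["".join(combo) for combo in product("pm", repeat=len(variables))]

# The subset used in the RW17 study, as selected in the cell above.
rw17_conditions = {"ppp", "pmm", "mmp", "mpm"}
selected = [cond for cond in all_conditions if cond in rw17_conditions]


def split_condition(cond):
    """Map a three-letter code to one counterbalance letter per variable."""
    return dict(zip(variables, cond))


print(all_conditions)
print(selected)
print(split_condition("pmm"))
```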


In [8]:
# this has already been saved in the collider numeric-certainty notebook
# drop text verbalization columns from merged df
# merged_df = merged_df.drop(
#     columns=["C1_detailed", "C2_detailed", "E_detailed", "prompt"]
# )
# # save merged data
# merged_df.to_csv(
#     f"../../../results/17_rw/humans/humans_responses_w_prompt_id_{graph_type}_{prompt_type}.csv",
#     index=False,
#     sep=";",
# )
# print(merged_df.columns)


### Saving the Prompts for LLMs

Numeric-only version

In [9]:
version = "5"
# save for prompting LLMs
LLM_prompting_df = rw_17_complete_df[
    ["id", "prompt", "prompt_category", "graph", "domain", "cntbl_cond", "task"]
]
prompt_category = LLM_prompting_df["prompt_category"].unique()[0]
graph_type = LLM_prompting_df["graph"].unique()[0]
print(prompt_category, graph_type)
# LLM_prompting_df.to_csv(
#     f"../datasets/17_rw/prompts_for_LLM_api/{version}_v_{prompt_category}_LLM_prompting_{graph_type}.csv",
#     index=False,
# )

print(
    f"Prompts for LLMs for version {version}, graph = {graph_type}, and prompt type = {prompt_category} saved successfully!"
)
print("prompt data file has the following columns:")
print(LLM_prompting_df.columns)

single_numeric_response collider
Prompts for LLMs for version 5, graph = collider, and prompt type = single_numeric_response saved successfully!
prompt data file has the following columns:
Index(['id', 'prompt', 'prompt_category', 'graph', 'domain', 'cntbl_cond',
       'task'],
      dtype='object')


### New Human Collider Data (non-aggregated)

In [10]:
human_data_new_ce = pd.read_csv(
    "../data/17_rw/human_data/rw17_collider_ce.csv", sep=";"
)
# rename the column `letter.type` to `label`
human_data_new_ce.rename(columns={"letter.type": "label"}, inplace=True)
# convert the labels to lower case
human_data_new_ce["label"] = human_data_new_ce["label"].str.lower()
# rename `attr.polarity` to `ppp` (the counterbalance condition)
human_data_new_ce.rename(columns={"attr.polarity": "ppp"}, inplace=True)
# drop the original `task` column; `label` is renamed to `task` in a later cell
human_data_new_ce.drop(columns=["task"], inplace=True)
print(human_data_new_ce.columns)

print(human_data_new_ce["label"].unique())
human_data_new_ce.head()


Index(['s', 'domain', 'diagram', 'ppp', 'label', 'study.type', 'trial', 'rt',
       'y', 'type', 'aggr.type', 'betw.factors'],
      dtype='object')
['d' 'j' 'b' 'f' 'c' 'k' 'h' 'e' 'a' 'g' 'i']


Unnamed: 0,s,domain,diagram,ppp,label,study.type,trial,rt,y,type,aggr.type,betw.factors
0,0,economy,a,ppp,d,Y|X=1,0,38788,65,CB|CA=1,Ci|Cj=1,"domain, attr.polarity, diagram, letter.type"
1,0,economy,a,ppp,j,"Z|X=0,Y=1",1,16532,80,"E|CA=0,CB=1","E|Ci=0,Cj=1","domain, attr.polarity, diagram, letter.type"
2,0,economy,a,ppp,b,X|Z=1,2,10979,90,CA|E=1,Ci|E=1,"domain, attr.polarity, diagram, letter.type"
3,0,economy,a,ppp,f,"X|Y=1,Z=0",3,17956,55,"CA|E=0,CB=1","Ci|E=0,Cj=1","domain, attr.polarity, diagram, letter.type"
4,0,economy,a,ppp,c,"Y|X=0,Z=1",4,954,60,"CB|E=1,CA=0","Ci|E=1,Cj=0","domain, attr.polarity, diagram, letter.type"


### We need to get the human data into the following format (mostly column renaming):

(['human_subj_id', 'domain', 'cntbl_cond', 'response',
       'num_responses_agg', 'task', 'subject', 'graph'],
      dtype='object')

In [11]:
# rename the columns
human_data_new_ce.rename(
    columns={
        "y": "response",
        "label": "task",
        "s": "human_subj_id",
        "ppp": "cntbl_cond",
    },
    inplace=True,
)

# keep only the columns we need; copy so that later assignments do not act on a view

human_data_new_ce = human_data_new_ce[
    [
        "response",
        "task",
        "human_subj_id",
        "cntbl_cond",
        # "type",
        "domain",
    ]
].copy()


# renaming and new columns
human_data_new_ce["domain"] = human_data_new_ce["domain"].replace(
    {"society": "sociology"}
)

human_data_new_ce["subject"] = "human"
# add a column graph that contains collider everywhere
human_data_new_ce["graph"] = "collider"


# turn response into floats

human_data_new_ce["response"] = (
    human_data_new_ce["response"].replace(",", ".", regex=True).astype(float)
)
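
If the raw file stores decimals with a comma, which the `replace(",", ".")` call above suggests but which is not verified here, the conversion can also happen at load time through the `decimal` argument of `pd.read_csv`. The sketch below uses a small in-memory CSV with made-up values rather than the real data file.

```python
import io

import pandas as pd

# Made-up sample in the same layout as the human data: ';' separated, ',' as decimal mark.
raw_csv = "s;y\n0;65,5\n1;80\n"

# `decimal=","` tells the parser to read '65,5' as the float 65.5.
sample = pd.read_csv(io.StringIO(raw_csv), sep=";", decimal=",")
responses = sample["y"].tolist()
print(sample["y"].dtype, responses)
```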

In [12]:
print(f"unique domain values: {human_data_new_ce['domain'].unique()}")
print(f"unique cntbl_cond values: {human_data_new_ce['cntbl_cond'].unique()}")
print(f"unique task values: {human_data_new_ce['task'].unique()}")
# print(f"unique type values: {human_data_new_ce['type'].unique()}")
print(f"unique response values: {human_data_new_ce['response'].unique()}")
# preview the cleaned human data
human_data_new_ce.head()

unique domain values: ['economy' 'sociology' 'weather']
unique cntbl_cond values: ['ppp' 'pmm' 'mmp' 'mpm']
unique task values: ['d' 'j' 'b' 'f' 'c' 'k' 'h' 'e' 'a' 'g' 'i']
unique response values: [ 65.  80.  90.  55.  60.  75.  85.  45.  50.  95.  40.  30.  20. 100.
  70.  15.   0.  10.  35.  25.   5.]


Unnamed: 0,response,task,human_subj_id,cntbl_cond,domain,subject,graph
0,65.0,d,0,ppp,economy,human,collider
1,80.0,j,0,ppp,economy,human,collider
2,90.0,b,0,ppp,economy,human,collider
3,55.0,f,0,ppp,economy,human,collider
4,60.0,c,0,ppp,economy,human,collider


In [None]:
print(f"NEW: human data shape: {human_data_new_ce.shape}")
# print(f"Old human data shape: {human_data.shape}")

NEW: human data shape: (960, 7)


### Merge new human data (non-aggregated) with the prompts

In [15]:
# merge the prompts (rw_17_complete_df) with the human data
# on the columns: domain, task, cntbl_cond, graph
merged_df_new = pd.merge(
    rw_17_complete_df, human_data_new_ce, on=["domain", "task", "cntbl_cond", "graph"]
)
# merged_df_new["response"] = merged_df_new["response"].str.replace(",", ".").astype(float)
# print the columns
print(merged_df_new.columns)

Index(['domain', 'X', 'X_values', 'X_cntbl', 'X_sense', 'X_detailed', 'Y',
       'Y_values', 'Y_cntbl', 'Y_sense', 'Y_detailed', 'Z', 'Z_values',
       'Z_cntbl', 'Z_sense', 'Z_detailed', 'cntbl_cond', 'task', 'query_node',
       'observation', 'query', 'graph', 'prompt', 'prompt_category', 'id',
       'response', 'human_subj_id', 'subject'],
      dtype='object')
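
Because the merge keys need not be unique on either side, it can help to check how many rows a merge produces. The sketch below uses two tiny made-up frames (not the real data) to show that `pd.merge` pairs every matching left row with every matching right row, and that the `validate` argument can flag non-unique keys. Whether this applies to the actual RW17 frames is not checked here.

```python
import pandas as pd

keys = ["domain", "task", "cntbl_cond"]

# Made-up prompts: two prompt ids share the same key combination.
prompts = pd.DataFrame(
    {"id": [1, 2], "domain": ["economy"] * 2, "task": ["a"] * 2, "cntbl_cond": ["ppp"] * 2}
)
# Made-up human responses: two participants answered that key combination.
humans = pd.DataFrame(
    {
        "human_subj_id": [0, 25],
        "domain": ["economy"] * 2,
        "task": ["a"] * 2,
        "cntbl_cond": ["ppp"] * 2,
        "response": [80.0, 75.0],
    }
)

# Each of the 2 prompt rows is paired with each of the 2 response rows.
merged = pd.merge(prompts, humans, on=keys)
print(len(merged))

# `validate="one_to_many"` raises if the keys are not unique in the left frame.
try:
    pd.merge(prompts, humans, on=keys, validate="one_to_many")
    keys_unique_in_prompts = True
except pd.errors.MergeError:
    keys_unique_in_prompts = False
print(keys_unique_in_prompts)
```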


In [16]:
# drop duplicate rows
merged_df_new = merged_df_new.drop_duplicates()

In [17]:
merged_df_new[merged_df_new["id"] == 66]

Unnamed: 0,domain,X,X_values,X_cntbl,X_sense,X_detailed,Y,Y_values,Y_cntbl,Y_sense,...,query_node,observation,query,graph,prompt,prompt_category,id,response,human_subj_id,subject
202,economy,interest rates,1,p,low,Interest rates are the rates banks charge to l...,trade deficits,1,m,large,...,Y=1,"Z=1, X=0","p(Y=1|Z=1, X=0)",collider,Economists seek to describe and predict the re...,single_numeric_response,66,100.0,13,human
204,economy,interest rates,1,p,low,Interest rates are the rates banks charge to l...,trade deficits,1,m,large,...,Y=1,"Z=1, X=0","p(Y=1|Z=1, X=0)",collider,Economists seek to describe and predict the re...,single_numeric_response,66,90.0,40,human
205,economy,interest rates,1,p,low,Interest rates are the rates banks charge to l...,trade deficits,1,m,large,...,Y=1,"Z=1, X=0","p(Y=1|Z=1, X=0)",collider,Economists seek to describe and predict the re...,single_numeric_response,66,60.0,40,human
206,economy,interest rates,1,p,low,Interest rates are the rates banks charge to l...,trade deficits,1,m,large,...,Y=1,"Z=1, X=0","p(Y=1|Z=1, X=0)",collider,Economists seek to describe and predict the re...,single_numeric_response,66,20.0,112,human
207,economy,interest rates,1,p,low,Interest rates are the rates banks charge to l...,trade deficits,1,m,large,...,Y=1,"Z=1, X=0","p(Y=1|Z=1, X=0)",collider,Economists seek to describe and predict the re...,single_numeric_response,66,70.0,112,human
208,economy,interest rates,1,p,low,Interest rates are the rates banks charge to l...,trade deficits,1,m,large,...,Y=1,"Z=1, X=0","p(Y=1|Z=1, X=0)",collider,Economists seek to describe and predict the re...,single_numeric_response,66,40.0,119,human
209,economy,interest rates,1,p,low,Interest rates are the rates banks charge to l...,trade deficits,1,m,large,...,Y=1,"Z=1, X=0","p(Y=1|Z=1, X=0)",collider,Economists seek to describe and predict the re...,single_numeric_response,66,45.0,119,human
210,economy,interest rates,1,p,low,Interest rates are the rates banks charge to l...,trade deficits,1,m,large,...,Y=1,"Z=1, X=0","p(Y=1|Z=1, X=0)",collider,Economists seek to describe and predict the re...,single_numeric_response,66,85.0,133,human


In [None]:
# check whether any duplicate rows remain in merged_df_new
duplicates = merged_df_new[merged_df_new.duplicated()]
# print the number of duplicates
print(f"number of duplicates: {len(duplicates)}")
# print the number of rows in merged_df_new
print(f"number of rows in the merged_df_new: {len(merged_df_new)}")


number of duplicates: 0
number of rows in the merged_df_new: 1540


In [20]:
duplicates

Unnamed: 0,domain,X,X_values,X_cntbl,X_sense,X_detailed,Y,Y_values,Y_cntbl,Y_sense,...,query_node,observation,query,graph,prompt,prompt_category,id,response,human_subj_id,subject


In [None]:
# for each column, print the number of unique values
for col in merged_df_new.columns:
    print(
        f"{col}:  {merged_df_new[col].nunique()} unique values: {merged_df_new[col].unique()}"
    )
merged_df_new.head()

domain:  3 unique values: ['economy' 'weather' 'sociology']
X:  3 unique values: ['interest rates' 'ozone levels' 'urbanization']
X_values:  1 unique values: ['1']
X_cntbl:  2 unique values: ['p' 'm']
X_sense:  2 unique values: ['low' 'high']
X_detailed:  3 unique values: ['Interest rates are the rates banks charge to loan money.'
 'Ozone is a gaseous allotrope of oxygen (O3) and is formed by exposure to UV radiation.'
 'Urbanization is the degree to which the members of a society live in urban environments (i.e., cities) versus rural environments.']
Y:  3 unique values: ['trade deficits' 'air pressure' 'interest in religion']
Y_values:  1 unique values: ['1']
Y_cntbl:  2 unique values: ['p' 'm']
Y_sense:  4 unique values: ['small' 'large' 'low' 'high']
Y_detailed:  3 unique values: ["A country's trade deficit is the difference between the value of the goods that a country imports and the value of the goods that a country exports."
 'Air pressure is force exerted due to concentrations 

Unnamed: 0,domain,X,X_values,X_cntbl,X_sense,X_detailed,Y,Y_values,Y_cntbl,Y_sense,...,query_node,observation,query,graph,prompt,prompt_category,id,response,human_subj_id,subject
0,economy,interest rates,1,p,low,Interest rates are the rates banks charge to l...,trade deficits,1,p,small,...,X=1,"Z=1, Y=1","p(X=1|Z=1, Y=1)",collider,Economists seek to describe and predict the re...,single_numeric_response,1,80.0,0,human
1,economy,interest rates,1,p,low,Interest rates are the rates banks charge to l...,trade deficits,1,p,small,...,X=1,"Z=1, Y=1","p(X=1|Z=1, Y=1)",collider,Economists seek to describe and predict the re...,single_numeric_response,1,85.0,0,human
2,economy,interest rates,1,p,low,Interest rates are the rates banks charge to l...,trade deficits,1,p,small,...,X=1,"Z=1, Y=1","p(X=1|Z=1, Y=1)",collider,Economists seek to describe and predict the re...,single_numeric_response,1,75.0,25,human
3,economy,interest rates,1,p,low,Interest rates are the rates banks charge to l...,trade deficits,1,p,small,...,X=1,"Z=1, Y=1","p(X=1|Z=1, Y=1)",collider,Economists seek to describe and predict the re...,single_numeric_response,1,70.0,25,human
4,economy,interest rates,1,p,low,Interest rates are the rates banks charge to l...,trade deficits,1,p,small,...,X=1,"Z=1, Y=1","p(X=1|Z=1, Y=1)",collider,Economists seek to describe and predict the re...,single_numeric_response,1,65.0,58,human


In [None]:
# save merged data
# merged_df.to_csv("../datasets/17_rw/merged_data/humans_prompts_coll.csv", index=False)
# print(merged_df.columns)

# merged_df_new.to_csv(
#     "../../../results/17_rw/humans/humans_responses_w_prompt_id_coll.csv",
#     index=False,
#     sep=";",
# )
print(merged_df_new.columns)


Index(['domain', 'X', 'X_values', 'X_cntbl', 'X_sense', 'X_detailed', 'Y',
       'Y_values', 'Y_cntbl', 'Y_sense', 'Y_detailed', 'Z', 'Z_values',
       'Z_cntbl', 'Z_sense', 'Z_detailed', 'cntbl_cond', 'task', 'query_node',
       'observation', 'query', 'graph', 'prompt', 'prompt_category', 'id',
       'response', 'human_subj_id', 'subject'],
      dtype='object')


## Next, save the LLM version (prompts)


In [None]:
version = "5"
# save for prompting LLMs
LLM_prompting_df = rw_17_complete_df[
    ["id", "prompt", "prompt_category", "graph", "domain", "cntbl_cond", "task"]
]
prompt_category = LLM_prompting_df["prompt_category"].unique()[0]
graph_type = LLM_prompting_df["graph"].unique()[0]
print(prompt_category, graph_type)
LLM_prompting_df.to_csv(
    f"../datasets/17_rw/prompts_for_LLM_api/{version}_v_{prompt_category}_LLM_prompting_{graph_type}.csv",
    index=False,
)

print(
    f"Prompts for LLMs for version {version}, graph = {graph_type}, and prompt type = {prompt_category} saved successfully!"
)
print("prompt data file has the following columns:")
print(LLM_prompting_df.columns)

single_numeric_response collider
Prompts for LLMs for version 5, graph = collider, and prompt type = single_numeric_response saved successfully!
prompt data file has the following columns:
Index(['id', 'prompt', 'prompt_category', 'graph', 'domain', 'cntbl_cond',
       'task'],
      dtype='object')
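
`DataFrame.to_csv` does not create missing directories, so it can be useful to build the output path explicitly and create its parent folder first. The sketch below reproduces the file-naming pattern from the cell above with `pathlib`; the helper `build_prompt_path` is hypothetical, and a temporary directory with a placeholder file stands in for the real output folder and DataFrame.

```python
import tempfile
from pathlib import Path


def build_prompt_path(base_dir, version, prompt_category, graph_type):
    """Return the CSV path using the naming pattern from the save cell."""
    filename = f"{version}_v_{prompt_category}_LLM_prompting_{graph_type}.csv"
    return Path(base_dir) / filename


with tempfile.TemporaryDirectory() as tmp:
    out_path = build_prompt_path(
        Path(tmp) / "prompts_for_LLM_api", "5", "single_numeric_response", "collider"
    )
    # Create the target folder if it does not exist yet.
    out_path.parent.mkdir(parents=True, exist_ok=True)
    # In the notebook one would now call: LLM_prompting_df.to_csv(out_path, index=False)
    out_path.write_text("id,prompt\n")  # placeholder so the sketch runs without pandas
    file_name = out_path.name
    was_written = out_path.exists()

print(file_name, was_written)
```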


In [29]:
LLM_prompting_df.head()

Unnamed: 0,id,prompt,prompt_category,graph,domain,cntbl_cond,task
0,1,Economists seek to describe and predict the re...,single_numeric_response,collider,economy,ppp,a
1,2,Economists seek to describe and predict the re...,single_numeric_response,collider,economy,ppp,a
2,3,Economists seek to describe and predict the re...,single_numeric_response,collider,economy,ppp,b
3,4,Economists seek to describe and predict the re...,single_numeric_response,collider,economy,ppp,b
4,5,Economists seek to describe and predict the re...,single_numeric_response,collider,economy,ppp,c
