# CSV to JS Converter
Created on Dec 18, 2024

This notebook was developed for internal use by the Mind in Society Lab (P.I. Nick Camp).

### Features

1. **Dynamic Hierarchical Structure Generation**  
   - Automatically generates a hierarchical array structure in the `groups.js` file based on the number of specified **independent variables**.
   - Each array is populated with the corresponding stimulus file paths.

2. **Automatic Annotation**  
   The generated `groups.js` file begins with a header comment that reports:
   - Total number of stimuli  
   - Number of levels for each independent variable  
   - Number of stimuli in the smallest unit arrays  
   - A hierarchical structure diagram summarizing the data
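
As a rough illustration of the hierarchical structure described above, the sketch below hand-builds the nested object for two hypothetical independent variables (`family`, then `city`) and serializes it as a `var groups = ...;` assignment. The level names and file paths are made up for illustration and are not taken from the lab's data.

```python
import json

# Hypothetical example: two independent variables (family -> city).
# Each innermost array holds the stimulus paths for one combination of levels.
groups = {
    "familyA": {
        "city1": ["stim/a1_01.jpg", "stim/a1_02.jpg"],
        "city2": ["stim/a2_01.jpg", "stim/a2_02.jpg"],
    },
    "familyB": {
        "city1": ["stim/b1_01.jpg", "stim/b1_02.jpg"],
        "city2": ["stim/b2_01.jpg", "stim/b2_02.jpg"],
    },
}

# Serialize the nested dictionary as a JavaScript variable assignment.
js_source = "var groups = " + json.dumps(groups, indent=2) + ";"
print(js_source)
```

With three independent variables, the same pattern would simply gain one more level of nesting before the arrays.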

### How to Use

1. **Prepare the CSV File**  
   - The CSV file must include one or more columns representing the **independent variables** and one column containing the **file paths** of the stimuli.


2. **Define Independent Variables (`level_keys`)**  
   - Specify the list of keys (CSV column names) representing the **independent variables** in the `level_keys` variable, ordered from the outermost level to the innermost.  
   - Changing the number of independent variables changes the depth of the hierarchical structure and the number of stimuli in the smallest unit arrays.  


3. **Define the Stimulus Path Column (`file_path_key`)**  
   - Assign the name of the column containing the **stimulus file paths** to the `file_path_key` variable.


4. **Run the Notebook**  
   - Execute the notebook to generate the `groups.js` file.  
   - The file will include a header comment summarizing the following:  
     - Total number of stimuli  
     - Number of levels for each independent variable  
     - Number of stimuli in the smallest unit arrays  
     - A hierarchical structure diagram
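
The steps above can be sketched with a tiny in-memory CSV. The column names (`family`, `city`, `path`), the file paths, and the `_example` variable names are assumptions chosen for illustration; a real CSV file would be loaded from disk instead.

```python
import io

import pandas as pd

# Hypothetical CSV contents: two independent-variable columns and one path column.
csv_text = """family,city,path
familyA,city1,stim/a1_01.jpg
familyA,city1,stim/a1_02.jpg
familyA,city2,stim/a2_01.jpg
familyA,city2,stim/a2_02.jpg
"""

df_example = pd.read_csv(io.StringIO(csv_text))

# Independent variables, ordered from the outermost to the innermost level.
level_keys_example = ["family", "city"]
# Column holding the stimulus file paths.
file_path_key_example = "path"

print(df_example[level_keys_example + [file_path_key_example]])
```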

## 1. Load Packages

In [184]:
import pandas as pd
import json

## 2. Helper Functions

### (1) Utility functions

In [178]:
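def add_to_nested_dict(data_dict, keys, value):
    """
    Append `value` to the list stored at the nested path given by `keys`,
    creating intermediate dictionaries (and the final list) as needed.
    """
    current_dict = data_dict
    # Walk down every key except the last, creating dictionaries on the way
    for key in keys[:-1]:
        current_dict = current_dict.setdefault(key, {})
    # The last key maps to the list of stimulus paths
    current_dict.setdefault(keys[-1], []).append(value)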



def calculate_smallest_group_stimuli(groups):
    """
    Return the size of the smallest innermost (leaf) array in the nested structure.
    """
    if isinstance(groups, list):
        return len(groups)
    elif isinstance(groups, dict):
        # default=0 avoids a ValueError when a dictionary is empty
        return min((calculate_smallest_group_stimuli(v) for v in groups.values()), default=0)
    return 0
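
A short usage sketch may help show what the two helpers do. The functions are restated here only so the snippet runs on its own; the level names and file names are hypothetical.

```python
def add_to_nested_dict(data_dict, keys, value):
    # Mirrors the helper above: walk/create nested dicts, append to the leaf list.
    current_dict = data_dict
    for key in keys[:-1]:
        current_dict = current_dict.setdefault(key, {})
    current_dict.setdefault(keys[-1], []).append(value)


def calculate_smallest_group_stimuli(groups):
    # Mirrors the helper above: minimum length over all leaf lists.
    if isinstance(groups, list):
        return len(groups)
    elif isinstance(groups, dict):
        return min(calculate_smallest_group_stimuli(v) for v in groups.values())
    return 0


demo = {}
add_to_nested_dict(demo, ["familyA", "city1"], "a1_01.jpg")
add_to_nested_dict(demo, ["familyA", "city1"], "a1_02.jpg")
add_to_nested_dict(demo, ["familyA", "city2"], "a2_01.jpg")

print(demo)
# {'familyA': {'city1': ['a1_01.jpg', 'a1_02.jpg'], 'city2': ['a2_01.jpg']}}
print(calculate_smallest_group_stimuli(demo))
# 1 (city2 holds a single path, so the minimum over the leaf arrays is 1)
```

Note that the second helper reports the minimum leaf size, so an unbalanced design is summarized by its smallest array.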

### (2) Data processing function

In [179]:
def process_data(df, level_keys, file_path_key):
    """
    Build the nested structure from the DataFrame and compute the counts
    reported in the header comment.
    """
    # Count the distinct levels of each independent variable
    total_counts = {key: df[key].nunique() for key in level_keys}

    # Create nested dictionary
    groups = {}
    for _, row in df.iterrows():
        keys = [row[level_key] for level_key in level_keys]
        add_to_nested_dict(groups, keys, row[file_path_key])

    # Calculate stimuli count in the smallest groups
    stimuli_count = calculate_smallest_group_stimuli(groups)

    return groups, total_counts, stimuli_count
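
Because only the smallest array size is reported, it can be useful to check whether every combination of levels holds the same number of stimuli. The following is a minimal sketch of such a check using `groupby(...).size()`; the data frame and its column names are hypothetical, and this check is not part of the notebook's pipeline.

```python
import pandas as pd

# Hypothetical data: city2 has one stimulus fewer than city1.
df_demo = pd.DataFrame({
    "family": ["familyA", "familyA", "familyA"],
    "city": ["city1", "city1", "city2"],
    "path": ["a1_01.jpg", "a1_02.jpg", "a2_01.jpg"],
})

# Number of rows (stimuli) in each observed combination of levels.
cell_sizes = df_demo.groupby(["family", "city"]).size()

# The design is balanced when all combinations share one size.
is_balanced = cell_sizes.nunique() == 1

print(cell_sizes)
print("balanced:", is_balanced)
```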

### (3) Structure generation function

In [180]:
def create_structure_diagram(level_keys, total_counts, stimuli_per_smallest_group):
    """
    Create a structured diagram summarizing the hierarchical structure and stimuli counts.
    """
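    # Multiply the level counts to get the number of innermost arrays.
    # Note: this matches the data only when every combination of levels
    # occurs (fully crossed design) and all innermost arrays are the same size.
    total_groups = 1
    for key in level_keys:
        total_groups *= total_counts[key]
    total_stimuli = total_groups * stimuli_per_smallest_group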


    summary = "// -------------- S U M M A R Y --------------\n" 
    summary += f"// Total {total_stimuli} stimuli = \n"
    summary += "//      " + " * ".join([f"{total_counts[key]} {key}" for key in level_keys])
    summary += f" * {stimuli_per_smallest_group} stimuli per smallest array\n\n"


    diagram = "// ------------ S T R U C T U R E ------------\n" 
    diagram += "// groups = {\n"
    indent = "  "
    for i, key in enumerate(level_keys):
        if i == len(level_keys) - 1:
            # Innermost level: an array of stimulus paths
            diagram += f"// {indent * (i + 1)}'{key}': [\n"
            diagram += f"// {indent * (i + 2)}'path_to_stimuli_1.jpg',\n"
            diagram += f"// {indent * (i + 2)}'path_to_stimuli_2.jpg'\n"
            diagram += f"// {indent * (i + 1)}],\n"
        else:
            diagram += f"// {indent * (i + 1)}'{key}': {{\n"
    # Close all open braces, innermost first, using the same "// " prefix as the openers
    for i in range(len(level_keys) - 1, -1, -1):
        diagram += f"// {indent * i}}},\n" if i > 0 else f"// {indent * i}}}\n"
    diagram += "\n\n"
    return summary + diagram
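
The total in the summary is a product of level counts and the smallest array size. The arithmetic can be sketched as follows, with hypothetical counts, to show when that product may differ from the actual number of stimuli (for example, when one variable is nested inside another rather than fully crossed).

```python
from math import prod

# Hypothetical level counts and per-array size.
total_counts = {"family": 2, "city": 3}
stimuli_per_smallest_group = 4

# Product used in the summary line: 2 * 3 * 4.
summary_total = prod(total_counts.values()) * stimuli_per_smallest_group
print(summary_total)  # 24

# Suppose the 3 distinct cities are nested in families (family A has 2,
# family B has 1): only 3 combinations exist in the data, not 2 * 3 = 6.
actual_cells = 3
actual_total = actual_cells * stimuli_per_smallest_group
print(actual_total)  # 12

print(summary_total == actual_total)  # False
```

Comparing the summary total against the number of rows in the CSV is one simple way to spot this situation.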

### (4) Main pipeline function

In [181]:
def convert(df, level_keys, file_path_key, output_file):
    """
    Process the entire pipeline from DataFrame to JavaScript file generation.
    """
    try:
        # Process data
        groups, total_counts, stimuli_count = process_data(df, level_keys, file_path_key)

        # Create structure diagram
        structure_diagram = create_structure_diagram(level_keys, total_counts, stimuli_count)

        # Write to JavaScript file
        with open(output_file, 'w') as js_file:
            js_file.write(structure_diagram)
            js_file.write("var groups = ")
            js_file.write(json.dumps(groups, indent=2))
            js_file.write(";")

        # Print success message
        print(f"Conversion successful! Output saved to '{output_file}'.")

    except Exception as e:
        # Print error message
        print(f"An error occurred during conversion: {e}")
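
One way to sanity-check the generated file from Python is to strip the `var groups = ` prefix and the trailing semicolon and parse the remainder as JSON. The sketch below does this on a hypothetical string shaped like the output written above; it assumes the prefix text appears exactly once, before the JSON payload.

```python
import json

# Hypothetical file contents in the shape written above:
# comment lines first, then "var groups = <JSON>;".
js_text = (
    "// -------------- S U M M A R Y --------------\n"
    "// Total 2 stimuli = \n"
    "\n"
    'var groups = {\n  "familyA": [\n    "a_01.jpg",\n    "a_02.jpg"\n  ]\n};'
)

prefix = "var groups = "
# Everything after the prefix, minus trailing whitespace and the semicolon.
start = js_text.index(prefix) + len(prefix)
payload = js_text[start:].rstrip().rstrip(";")

groups_back = json.loads(payload)
print(groups_back)
# {'familyA': ['a_01.jpg', 'a_02.jpg']}
```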

## 3. Convert CSV file to JS file

### (1) Configuration

In [182]:
# Load the CSV file
df = pd.read_csv('stim/example_stim.csv')  ## TODO: Set the path to your CSV file.

# Define the column names used to build the hierarchy and locate the stimuli
level_keys = ['family', 'city', 'cluster']  ## TODO: Specify the column names for the hierarchical levels (outermost first).
file_path_key = 'path'  ## TODO: Specify the column name that contains the paths to the stimulus files.
output_file = './groups.js'  # Output file path
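
Before converting, it may help to confirm that the configured column names actually exist in the CSV. The following is a minimal sketch; `missing_columns` is a hypothetical helper, not part of the notebook, and the example frame is made up.

```python
import pandas as pd


def missing_columns(df, level_keys, file_path_key):
    """Return the configured column names absent from df (hypothetical helper)."""
    wanted = list(level_keys) + [file_path_key]
    return [name for name in wanted if name not in df.columns]


# Hypothetical frame that lacks the 'cluster' column.
df_check = pd.DataFrame({"family": ["a"], "city": ["x"], "path": ["a_x.jpg"]})

print(missing_columns(df_check, ["family", "city", "cluster"], "path"))
# ['cluster']
```

An empty list means every configured column was found.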

### (2) Convert!

In [183]:
convert(df, level_keys, file_path_key, output_file)

Conversion successful! Output saved to './groups.js'.
