<a href="https://colab.research.google.com/github/MoAljam/autora-closed-loop-g-1/blob/develop/AutoRA_Closed_Loop_Challenge_g_1.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Introduction

In this challenge, you are tasked to document your closed-loop case study.








## Grading

- Due date: **August 30**
- Submission: Through ``Stud.IP -> Tasks -> Closed-Loop Challenge``

The grading is independent of the outcome of the benchmarking challenge. In total, you can obtain **30 points**, which are detailed in the graded sections below.

Teams must submit a link to this notebook, along with a link to their repository on the respective Stud.IP task. In addition, teams must outline the contributions of each team member in their submission on ``Stud.IP -> Tasks -> Closed-Loop Challenge``.

# Research Question (2 points)

Please outline in the text block below the overarching research question you are addressing with your closed-loop study. What aspects of human cognition are you studying?

With our closed-loop experiment, we aim to replicate a study conducted by Tsushima et al. (2006). In particular, we want to look into how stimuli which are not consciously perceived -- so-called subthreshold or invisible stimuli -- and which are irrelevant to a task influence task performance. In general, the authors propose two different hypotheses regarding this influence: first, that irrelevant, invisible stimuli disrupt task performance more as their intensity increases; or second, that there is no effect of this type of stimulus on task performance, as they are being filtered out or suppressed.

# Paradigm (2 points)

Provide a brief description of the overall paradigm (psychophysics paradigm, multi-arm bandit paradigm, task switching paradigm, etc.) you are using. (1 point)

The overall paradigm used in this study is a psychophysics task -- namely, a rapid serial visual presentation (RSVP) task. Participants see a small circle in the middle of the screen, in which a sequence of six letters and two numbers is shown one item at a time. Each trial therefore consists of eight screens presented one after another, separated by a 45 ms blank screen. This sequence is task-relevant: participants have to attend to it and, after each trial, report the two numbers that appeared in the sequence. In addition, there is a task-irrelevant component: a dynamic random-dot (DRD) display in the background, which appears within a larger circle surrounding the central circle. The DRD display varies across trials in both the coherence ratio of the dots and their motion direction.

Outline below why you think the paradigm is well suited to address the research question. (1 point)

This paradigm is well suited to address the research question at hand, since it incorporates both a task-relevant and task-irrelevant component. Further, it allows for assessing the participants' performance on one task while giving the opportunity to manipulate a second task and its difficulty level; for example, the irrelevant task can be made more difficult by lowering the coherence ratio of the moving dots.

# Experiment Variables (4 points)

Describe the **independent** variables of your experiment. What do these variables represent, which values can they take? (2 points)

There are two key independent variables:
1. **Stimulus Strength**: this variable manipulates **the visibility of the stimulus** presented to participants.

2. **Task-relevance** (Irrelevant vs. relevant stimuli): this variable manipulates whether **the presented stimulus is relevant or irrelevant** to the task being performed by the participants.
*   **Irrelevant stimuli** are those that the participants are not supposed to pay attention to during the task.
*   **Relevant stimuli** refer to those directly related to the participants' task.

What these variables represent:
* **Stimulus Strength** represents the degree of perceptual visibility and awareness of the stimuli (whether the stimuli are consciously seen or remain unnoticed).
* **Task-Relevance** represents whether the stimulus should be attended to or ignored by participants during the cognitive task.

The experiment aimed to study how different types of stimuli affect brain activity and task performance, particularly when inhibitory control fails (i.e., how well participants can ignore or suppress irrelevant stimuli).






Describe the **dependent** variables of your experiment. What do these variables represent, which values can they take? (2 points)

There are two dependent variables of the experiment: **Reaction time** and **Accuracy**.
* **Reaction time** refers to how quickly participants respond to the task, while **accuracy** reflects how correct their responses are.
* **Values:** Reaction time (in milliseconds) and accuracy (percentage of correct responses).
* **What it represents:** This variable assesses the **cognitive disruption** caused by different types of distractors. If participants perform slower or less accurately when exposed to certain stimuli (especially subthreshold, irrelevant stimuli), it suggests that these stimuli are causing cognitive interference, despite being outside of conscious awareness.

Measuring these dependent variables lets us examine how cognitive and neural processes are affected by subthreshold vs. suprathreshold stimuli, and how well the brain can inhibit irrelevant distractors.




# Counterbalancing (4 points)

Describe how you counterbalanced the independent variables of the experiment into an experimental sequence. Make sure to mention all constraints that were used to generate the experiment sequence (2 points).

To counterbalance the independent variables in the experimental sequence, we followed a structured approach to ensure balance and control across trials.

**Independent Variables**
* Coherence Ratio: This refers to the ratio of coherent motion signals presented in the experiment.
* Motion Direction: Represents the direction of movement during the task.
* Repetition: Each combination of coherence and direction was repeated a set number of times (default: 5 repetitions).
* Trial Items: For each trial, six letters and two numbers were selected. One number was presented among the first four items of the sequence and the second number among the last four, maintaining consistent difficulty across trials.

**Design**
Crossing: Each experimental block crossed these factors to ensure that each combination of coherence ratio and motion direction appeared multiple times across repetitions.

**Constraints**
The number distribution ensured that one number was displayed in the first half of the trial, and another number was displayed in the second half, aligning with the procedure of the original study.
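
The full crossing described above can also be illustrated without SweetPea. The following is a minimal sketch using only the standard library; the level values in `coherence_ratios` and `motion_directions` and the helper name `full_crossing` are hypothetical placeholders, not the values or code used in the study.

```python
import itertools
import random

# Hypothetical factor levels, chosen only for illustration.
coherence_ratios = [0, 50, 100]
motion_directions = [0, 90, 180, 270]
num_repetitions = 5


def full_crossing(coherence_ratios, motion_directions, num_repetitions, seed=None):
    """Return every (coherence, direction, repetition) combination once, in random order."""
    trials = [
        {"coherence_ratio": c, "motion_direction": d, "repetition": r}
        for c, d, r in itertools.product(
            coherence_ratios, motion_directions, range(1, num_repetitions + 1)
        )
    ]
    # Shuffle with a local generator so the global random state is untouched.
    random.Random(seed).shuffle(trials)
    return trials


trials = full_crossing(coherence_ratios, motion_directions, num_repetitions, seed=0)
print(len(trials))  # 3 * 4 * 5 = 60 trials
```

With no additional constraints, every combination appears exactly once per repetition, which is what the crossing in the pasted code is meant to guarantee.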

Please paste the code you used for counterbalancing (e.g., SweetPea code) here. You don't have to execute this code here, so you can omit import statements. (2 points)

In [3]:

def generate_trial_items():
  """
  generates one sequence while making sure that one number is displayed throughout the first four trials and the second number throughout
  the last four trials. this is in accordance with the procedure of the original study and ensures that the difficulty level remains similar
  across all trials.

  input:
      --

  output:
      list: one sequence, containing six letters and two numbers
  """

    # create list of all possible letters (A-Z)
    letters = list("ABCDEFGHIJKLMNOPQRSTUVWXYZ")

    # create list of all possible numbers
    numbers = ["1", "2", "3", "4"]

    # make 2 lists of 4 items with each list containing 3 letters and 1 number
    items_first = random.sample(letters, 3) + random.sample(numbers, 1)
    items_second = random.sample(letters, 3) + random.sample(numbers, 1)

    # shuffle the items of each list
    random.shuffle(items_first)
    random.shuffle(items_second)

    # concatenate both lists to a single one, resulting in a whole sequence of one experimental trial
    items = items_first + items_second

    return items


def shuffle_chunks(group):
    # Shuffle the group to randomize data before chunking
    group = group.sample(frac=1).reset_index(drop=True)
    n = 8  # Define chunk size
    chunks = [group.iloc[i : i + n] for i in range(0, len(group), n)]

    return chunks


def assign_items_shuffle_chunk(group):
    # Shuffle the group to randomize data before chunking
    group = group.sample(frac=1).reset_index(drop=True)
    items = generate_trial_items()
    if len(group) != len(items):
        raise ValueError("Group and items must have the same length")
    # add one item to each trial from the generated items
    for idx, _ in group.iterrows():
        group.at[idx, "item"] = items[idx]

    n = len(group)
    chunks = [group.iloc[i : i + n] for i in range(0, len(group), n)]
    return chunks


def trial_sequences(
    coherence_ratios: list,
    motion_directions: list,
    sequence_type="target",
    all_items_in_one_trial=True,
    num_repetitions=5,
):

    coherence_ratio = Factor("coherence_ratio", coherence_ratios)
    motion_direction = Factor("motion_direction", motion_directions)
    repetition = Factor("repetetion", list(range(1, num_repetitions + 1)))

    # used in case of repeating the items of each trial
    _num = Factor("_num", list(range(1, 9)))

    # counterbalance experimental variables by coherence ratio and motion direction
    # design = [coherence_ratio, motion_direction, repetetion, *items]
    design = [coherence_ratio, motion_direction, repetition]
    crossing = [coherence_ratio, motion_direction, repetition]

    if not all_items_in_one_trial:
        design.append(_num)
        crossing.append(_num)

    constraints = []

    block = CrossBlock(design, crossing, constraints)

    # synthesize trial sequence
    experiments = synthesize_trials(block, 1, CMSGen)

    experiments_dicts = experiments_to_dicts(block, experiments)

    # display(experiments_dicts[0])
    if all_items_in_one_trial:
        # extend the experiment with the trial items
        for experiment in experiments_dicts:
            for trial in experiment:
                trial_items = generate_trial_items()
                items_dict = {f"item_{i}": trial_items[i - 1] for i in range(1, 9)}
                trial.update(items_dict)
                # add the correct response to each trial
                numbers = [int(item) for item in trial_items if item.isdigit()]
                # trial["correct_choice"] = [chr(n) for n in numbers]
                trial["correct_choice"] = numbers
                # add the trial type
                trial["sequence_type"] = sequence_type

    else:
        raise NotImplementedError(
            "Not implemented; use all_items_in_one_trial=True, as it fits the current implementation"
        )

    return experiments_dicts


# Experiment Implementation (4 points)

Describe the sequence of events in your experiment. Make sure to describe the structure of the experiment (e.g., how many and which blocks were used) and make sure to describe the sequence of events within each trial (along with the timings of those events). In addition, you may describe all stimuli used in the experiment. (2 points)

There are 4 separate blocks in our experiment.

1. Introduction Block:

    Participants are greeted and presented with instructions.
    Three instruction screens explain the task: a sequence of 8 items (6 letters, 2 numbers), how to focus on recalling the digits, and ignoring the moving dots in the background.

2. Training Block:

    Participants undergo training rounds with example trials to get accustomed to the task.
    The stimuli and sequence structure in the training are identical to the actual experiment to help participants familiarize themselves with the task.

3. Experiment Block:

    The actual experiment follows the same structure as the training but with real data collection.
    Participants perform multiple trials where they are asked to focus on recalling the two digits from each sequence.

4. Debriefing Block:

    After completing the experiment, participants receive a debrief message and can provide feedback.



**The Sequence of the Events Within Each Trial**

Each trial consists of a sequence of events, designed to measure the participant’s recall of two digits from a series of items presented on the screen.

1. Fixation Onset: A blank screen (680 ms) to prepare participants.

2. Fixation Cross: A fixation cross ("+") appears for 915 ms, helping participants focus on the center of the screen.

3. Fixation Offset: A short blank screen (400 ms) before the presentation of the stimulus items.

4. Presentation of Items:
   * RSVP Task: The eight items (6 letters and 2 numbers) are displayed in the middle of the screen, one at a time.
   * Each item is presented for 75 ms, followed by a brief blank screen (45 ms) before the next item appears.
   * Background motion with dots moves during the sequence, but participants are instructed to ignore it.

5. Participant Response:
   * After all items are shown, participants are asked to recall the two digits in the sequence.
   * They provide their responses by typing the numbers on their keyboard. If they cannot remember the digits, they can press "x" to skip.
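
The timings listed above add up to a fixed stimulus duration per trial. As a quick check, the sketch below sums them; the constant and function names are hypothetical, and the self-paced response screens are left out because they have no fixed duration.

```python
# Durations in milliseconds, taken from the trial description above.
FIXATION_ONSET_MS = 680
FIXATION_CROSS_MS = 915
FIXATION_OFFSET_MS = 400
ITEM_MS = 75
BETWEEN_ITEMS_MS = 45
NUM_ITEMS = 8


def trial_duration_ms(num_items=NUM_ITEMS):
    """Total stimulus time of one trial, excluding the self-paced response screens."""
    fixation = FIXATION_ONSET_MS + FIXATION_CROSS_MS + FIXATION_OFFSET_MS
    # A blank screen separates consecutive items, so there are num_items - 1 gaps.
    rsvp = num_items * ITEM_MS + (num_items - 1) * BETWEEN_ITEMS_MS
    return fixation + rsvp


print(trial_duration_ms())  # 2910
```

Each trial therefore shows stimuli for just under three seconds before the first response screen appears.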


**Stimuli Used**

 1. Visual Items: Sequences contain six letters from the alphabet and two digits randomly selected from [1, 2, 3, 4]. These are presented in rapid succession.
 2. Background Motion: Random dot motion pattern (RDP) is shown in the background during the trials, but participants are told to ignore it.



Please paste (SweetBean) code you used for creating your experiment. If you didn't use SweetBean, you can paste the JavaScript code here. (2 points)

In [2]:
def stimulus_sequence(experiment_timeline, training_timeline, to_html=False):
  """
  input:
      experiment_timeline:
      training_timeline:
      to_html: specifies wether html file of experiment should be created (False by default)

  output:
    whole experiment as JavaScript code
  """
    ############################################################################
    # define timeline variables
    ############################################################################
    coherence_ratio = TimelineVariable("coherence_ratio", [])
    motion_direction = TimelineVariable("motion_direction", [])

    item_1 = TimelineVariable("item_1", [])
    item_2 = TimelineVariable("item_2", [])
    item_3 = TimelineVariable("item_3", [])
    item_4 = TimelineVariable("item_4", [])
    item_5 = TimelineVariable("item_5", [])
    item_6 = TimelineVariable("item_6", [])
    item_7 = TimelineVariable("item_7", [])
    item_8 = TimelineVariable("item_8", [])

    correct_choice = TimelineVariable("correct_choice", [])
    sequence_type = TimelineVariable("sequence_type", [])
    choices = TimelineVariable("choices", [])

    ############################################################################
    # create all stimulus sequences needed for whole experiment
    ############################################################################

    # write all instructions
    # introduction
    introduction = TextStimulus(
        text="<p>Welcome to the experiment!<br>Let's look into the instructions of the experiment first.</p>Press SPACE to continue.",
        choices=[" "],
    )

    # instructions on experiment divided into three different screens
    instruction_1 = TextStimulus(
        text="<p>In this experiment, you will see a sequence of <strong>eight items</strong>, which consist of <strong>six letters</strong> " +
        "out of the whole alphabet and <strong>two digits</strong> out of '1', '2', '3', and '4'.<br>The items will be presented one after " +
        "another in the middle of your screen.</p>Press SPACE to continue.",
        choices=[" "],
    )

    instruction_2 = TextStimulus(
        text="<p>Your task will be to focus on the two digits and remember them. You will then also be asked to report them by pressing the" +
        "according keys on your keyboard. The order in which they were presented doesn't matter here – just the two digits themselves." +
        "</p>Press SPACE to continue.",
        choices=[" "],
    )

    instruction_3 = TextStimulus(
        text="<p>There will also be some pattern with moving dots displayed in the background, but you don't have to pay attention to that." +
        "<br>Try focusing on the sequence, and in particular, the two digits instead.</p>Press SPACE to continue.",
        choices=[" "],
    )

    # training onboarding
    training_boarding = TextStimulus(
        text="<p>Now let's look at some examples.<br>In the following, we have a couple of training rounds for you to get accustomed to the " +
        "experiment and the task.</p>Press SPACE to start the training.",
        choices=[" "],
    )

    # experiment onboarding
    experiment_boarding = TextStimulus(
        text="<p>Now that you know how the experiment works and what you have to do, let's start with the actual experiment!" +
        "<br>It will look exactly like the training you just had. Again, try to focus on the sequence and the two digits in the middle of " +
        "your screen. Good luck!</p>Press SPACE to start the experiment.",
        choices=[" "],
    )

    # break
    pause = TextStimulus(
        text="<p>Feel free to take a short break now!</p>Press SPACE when you are ready to continue the experiment.",
        choices=[" "],
    )

    # debriefing
    debriefing = TextStimulus(
        text="<p>Congratulations, you finished the experiment!<br>Thank you for participating and for sticking until the end!" +
        "I hope you had at least some fun during all of it.</p>Press SPACE to continue.",
        choices=[" "],
    )

    # feedback
    feedback = TextSurveyStimulus(
        prompts=[
            "<p>If you have any feedback for us, we would appreciate hearing about it. Did you encounter any issues during the experiment?" +
            "Do you have any suggestions for improvement? Or do you have any other comments you want to share with us? Let us know here!</p>"
        ]
    )

    # closure
    closure = TextStimulus(
        text="<p>Thank you once again for your participation, and have a great day!</p>Press SPACE to end the experiment.",
        choices=[" "],
    )


    # screen with fixation cross and blank screens around it
    fixation_onset = BlankStimulus(duration=680)
    fixation = TextStimulus(duration=915, text="+")
    fixation_offset = BlankStimulus(duration=400)

    # blank screen in between items of a sequence within a trial
    between_items = BlankStimulus(duration=45)

    # participant response, i.e. input of two numbers within previously displayed sequence
    response_1 = TextStimulus(
        text="<p>Use your keyboard to enter a digit which you recall seeing out of<br>'1', '2', '3', '4'</p>" +
        "If you cannot remember any number, press 'x'.",
        choices=["1", "2", "3", "4", "x"],
        correct_key=correct_choice,
    )
    response_2 = TextStimulus(
        text="<p>Use your keyboard to enter the other digit you recall seeing out of<br>'1', '2', '3', '4'.</p>" +
        "If you cannot remember another number, press 'x'.",
        choices=["1", "2", "3", "4", "x"],
        correct_key=correct_choice,
    )

    ############################################################################
    # function for creating screens for individual items of a trial sequence, i.e. of RSVP task
    ############################################################################
    def rsvp_maker(
        item,
        coherence_ratio=coherence_ratio,
        motion_direction=motion_direction,
        correct_choice=correct_choice,
        sequence_type=sequence_type,
    ):
    """
    input:
        item:
        coherence_ratio:
        motion_direction:
        correct_choice:
        sequence_type:

    output:


    """
        rdp = rdp_rsvp_stimulus(
            duration=75,
            number_of_oobs=20,
            number_of_apertures=1,
            movement_speed=40,
            coherence_movement=coherence_ratio,
            coherent_movement_direction=motion_direction,
            oob_color="white",
            background_color="black",
            aperture_height=300,
            aperture_width=300,
            stimulus_type=1,  # 1 is for circles
            text=sequence_type,
            prompt=item,
            color="black",
            correct_key=correct_choice,
        )
        return rdp

    ############################################################################
    # create all lists of stimulus sequences and individual blocks
    ############################################################################
    introduction_list = [introduction]
    introduction_block = Block(introduction_list)

    instruction_list = [instruction_1, instruction_2, instruction_3]
    instruction_block = Block(instruction_list)

    training_boarding_list = [training_boarding]
    training_boarding_block = Block(training_boarding_list)

    # training
    training_list = [
        fixation_onset,
        fixation,
        fixation_offset,
        rsvp_maker(item_1),
        between_items,
        rsvp_maker(item_2),
        between_items,
        rsvp_maker(item_3),
        between_items,
        rsvp_maker(item_4),
        between_items,
        rsvp_maker(item_5),
        between_items,
        rsvp_maker(item_6),
        between_items,
        rsvp_maker(item_7),
        between_items,
        rsvp_maker(item_8),
        response_1,
        response_2,
    ]

    training_block = Block(training_list, training_timeline)

    experiment_boarding_list = [experiment_boarding]
    experiment_boarding_block = Block(experiment_boarding_list)

    # actual experiment
    experiment_list = [
        fixation_onset,
        fixation,
        fixation_offset,
        rsvp_maker(item_1),
        between_items,
        rsvp_maker(item_2),
        between_items,
        rsvp_maker(item_3),
        between_items,
        rsvp_maker(item_4),
        between_items,
        rsvp_maker(item_5),
        between_items,
        rsvp_maker(item_6),
        between_items,
        rsvp_maker(item_7),
        between_items,
        rsvp_maker(item_8),
        response_1,
        response_2,
    ]
    experiment_block = Block(experiment_list, experiment_timeline)

    debriefing_list = [debriefing, feedback, closure]
    debriefing_block = Block(debriefing_list)

    ############################################################################
    # set up the final experiment consisting of all blocks
    ############################################################################
    block_list = [
        introduction_block,
        instruction_block,
        training_boarding_block,
        training_block,
        experiment_boarding_block,
        experiment_block,
        debriefing_block,
    ]

    # create final, whole experiment
    experiment = Experiment(block_list)

    # create html file of experiment if specified to do so based on function input
    if to_html:
        return experiment.to_html("test_experiment.html")

    return experiment.to_js_string(as_function=True, is_async=True)


# Preprocessing (2 points)

Describe how you are pre-processing the raw data from the experiment for further analysis. (1 point)

 The pre-processing pipeline:

 1. Load raw data.
 2. Separate rok_trials and response_trials.
 3. Deduplicate and structure trials.
 4. Merge stimuli conditions and responses.
 5. Calculate hits and misses for each trial.
 6. Group by condition and compute the dependent variable (d_prime).




Paste the relevant preprocessing code below. (1 point)

In [None]:

# excerpt from inside the runner function, hence the bare `return` below
data_raw = experiment_runner()  # returns observations for each condition as jsPsych data
print("## got raw data ##")
print("data length", len(data_raw))
print("data type", type(data_raw), "type of first element", type(data_raw[0]))
# print("data_raw[0]", data_raw[0])

# process the experiment data
experiment_data = pd.DataFrame()
_df = trial_list_to_experiment_data(data_raw)
experiment_data = pd.concat([experiment_data, _df], axis=0)
print("processed experiment_data:")
display(experiment_data)
return Delta(experiment_data=experiment_data)


def trial_list_to_experiment_data(trial_sequence):
    """
    Parse a trial sequence (from jsPsych) into dependent and independent variables
    independent: coherence_ratio, motion_direction
    dependent: d_prime
    """
    trial_sequence = pd.DataFrame(trial_sequence).fillna(pd.NA)
    # display(trial_sequence.head())

    # target cleaned up data:
    # index(actual trial index), coherence_ratio, motion_direction, response, bean_correct_key
    # inference: hit, miss, d_prime
    # final: index(actual trial index), coherence_ratio, motion_direction, d_prime

    # get all rok trials
    rok_trials = trial_sequence[trial_sequence["trial_type"] == "rok"]
    # get only relevant columns
    rok_trials = rok_trials.loc[:, ["bean_text", "coherence_movement", "coherent_movement_direction"]]
    rok_trials = rok_trials.rename(
        columns={
            "bean_text": "type",
            "coherence_movement": "coherence_ratio",
            "coherent_movement_direction": "motion_direction",
        }
    )
    # get rid of duplicated information (each 8 rows are one actual trial)
    rok_trials = rok_trials.reset_index(drop=True)
    rok_trials = rok_trials[::8]
    rok_trials = rok_trials.reset_index(drop=True)
    # display(rok_trials)

    # get all response trials
    # all the html-keyboard-response trials where bean_correct_key is not null, empty, or NA / NaN
    response_trials = trial_sequence[
        (trial_sequence["trial_type"] == "html-keyboard-response") & (trial_sequence["bean_correct_key"].notna())
    ]

    # get only relevant columns
    response_trials = response_trials.loc[:, ["response", "bean_correct_key"]]
    response_trials = response_trials.rename(columns={"bean_correct_key": "correct_response"})
    # put each 2 response trials after each others into one row
    # pair the responses
    responses = response_trials["response"].values.reshape(-1, 2).tolist()
    # make sure responses are floats; the skip key "x" is not numeric, so map it to NaN
    responses = [[float(r) if str(r).isdigit() else np.nan for r in response] for response in responses]
    # get rid of the duplicates
    response_trials = response_trials[::2]
    response_trials["response"] = responses
    # convert correct_response from a string list "[1,2]" to a list [1,2]
    response_trials["correct_response"] = response_trials["correct_response"].apply(lambda x: json.loads(x))
    response_trials = response_trials.reset_index(drop=True)
    # display(response_trials)

    # merge the two dataframes
    trials = pd.concat([rok_trials, response_trials], axis=1)

    # infer the hit and miss per trial
    # hit: number of elements in the correct_response that are also in the response
    # miss: number of elements in the correct_response that are not in the response
    def num_hits(array_1, array_2):
        sorted_array_1 = np.sort(array_1)
        sorted_array_2 = np.sort(array_2)
        return np.sum(sorted_array_1 == sorted_array_2)

    def num_misses(array_1, array_2):
        sorted_array_1 = np.sort(array_1)
        sorted_array_2 = np.sort(array_2)
        return np.sum(sorted_array_1 != sorted_array_2)

    trials["hit"] = trials.apply(lambda x: num_hits(x["response"], x["correct_response"]), axis=1)
    trials["miss"] = trials.apply(lambda x: num_misses(x["response"], x["correct_response"]), axis=1)
    display(trials)

    # get only target trials (type: "target")
    trials = trials[trials["type"] == "target"]

    # group the trials on condition (coherence_ratio, motion_direction)
    trials_grouped = trials.groupby(["coherence_ratio", "motion_direction"]).agg({"hit": "sum", "miss": "sum"})
    # calculate d_prime
    # d_prime: d_prime(hit, miss)
    # where hit and misses are aggregated over all trials with the same conditions
    trials_grouped["d_prime"] = trials_grouped.apply(lambda x: d_prime(x["hit"], x["miss"]), axis=1)
    trials_grouped = trials_grouped.reset_index()
    display(trials_grouped)

    # select only the experiment data columns (drop hit and miss and type)
    trials_grouped = trials_grouped.loc[:, ["coherence_ratio", "motion_direction", "d_prime"]]
    # display(trials_grouped)
    return trials_grouped
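
One caveat on `num_hits`: the sorted element-wise comparison counts a digit as a hit only if it lands in the same position after sorting, so a response of `[2, 3]` against a correct answer of `[1, 2]` scores zero hits. If a hit is meant to be any reported digit that was actually presented, regardless of position (an assumption about the intended scoring), a multiset intersection is one option. The helper names `count_hits` and `count_misses` are hypothetical and not part of the pipeline above.

```python
from collections import Counter


def count_hits(response, correct_response):
    """Number of reported digits that also appear among the correct digits.

    Uses a multiset intersection, so order does not matter and a digit
    reported twice only counts twice if it was also presented twice.
    """
    overlap = Counter(response) & Counter(correct_response)
    return sum(overlap.values())


def count_misses(response, correct_response):
    """Number of correct digits that were not reported."""
    return len(correct_response) - count_hits(response, correct_response)


print(count_hits([2, 3], [1, 2]))    # 1
print(count_misses([2, 3], [1, 2]))  # 1
print(count_hits([2, 1], [1, 2]))    # 2
```

Whether this scoring or the position-wise one is preferable depends on how a hit is defined for the analysis; the sketch only shows the order-independent variant.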


# Model Discovery Method (3 points)

Describe the model discovery method you are using here. How did you parameterize it and why? (2 points)

*YOUR ANSWER GOES HERE*

Paste the code that instantiates and parameterizes the model discovery method (this may be a single line defining an ``sklearn`` regressor). (1 point)

In [None]:
# YOUR CODE GOES HERE

# Experimental Sampling Method (4 points)

Describe the experimental sampling method you are using here. What is the design space you are searching over? What are the inputs to the experimental sampling method? And how is your method selecting the new experiment conditions? (3 points)

Random sampling randomly selects experimental conditions from the design space of independent variables, aiming to sample without bias or specific optimization strategies.

Design Space

The design space we search over consists of two independent variables:
1. Coherence Ratio: Values range from 0% to 100% (in increments of 1%).
2. Motion Direction: Values range from 0° to 360° (in increments of 1°).

This forms a finely discretized 2D grid in which each combination of a coherence ratio and a motion direction defines a possible experimental condition.

The inputs to the pool method include:

1. Variable Collection: A set of independent variables with defined ranges (coherence_ratio and motion_direction) and a dependent variable (d_prime).
2. Number of Samples: num_samples=2 is passed, meaning two new conditions are selected randomly each time.



The pool method selects new experimental conditions by randomly sampling from the allowed values of the independent variables. For each sample, it selects a pair of coherence_ratio and motion_direction from their respective ranges, generating a condition to be tested in the next iteration of the experiment.

This sampling method does not account for prior knowledge or the results of previous experiments. It simply ensures that different combinations of the independent variables are explored without favoring any particular region of the design space.
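
The behaviour described above can be imitated with the standard library. This sketch is a stand-in for illustration and is not AutoRA's `pool` implementation; the function name `random_conditions` and the integer grids are assumptions based on the 1% and 1° increments of the design space.

```python
import random

# Allowed values, assuming integer steps as described in the design space.
COHERENCE_RATIOS = list(range(0, 101))   # 0% .. 100%
MOTION_DIRECTIONS = list(range(0, 361))  # 0° .. 360°


def random_conditions(num_samples=2, seed=None):
    """Draw conditions uniformly at random, ignoring all previous results."""
    rng = random.Random(seed)
    return [
        {
            "coherence_ratio": rng.choice(COHERENCE_RATIOS),
            "motion_direction": rng.choice(MOTION_DIRECTIONS),
        }
        for _ in range(num_samples)
    ]


conditions = random_conditions(num_samples=2, seed=42)
print(conditions)
```

Because each draw is independent of the data collected so far, the same function can be called unchanged in every cycle of the loop.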

Paste the function implementing your experimental sampling method. If you are using an off-the-shelf AutoRA experimentalist, you may instead just paste the declaration of the experimentalist. (1 point)

In [4]:
from autora.experimentalist.random import pool

# AutoRA Workflow (5 points)

Describe the flow of events in your closed-loop discovery loop. Make sure to explain, for each AutoRA component you are using, which inputs are provided to the component, and what outputs the component is producing. Make also sure to outline the sequence of events executed in the AutoRA loop. (3 points)

AutoRA component

1. **Experimentalist:**

* Input: **The experimentalist_on_state function** takes in the current state, the variables (independent and dependent), and a specified number of conditions to sample (num_samples).
* Component: **The pool function** randomly samples the space of independent variables (coherence_ratio, motion_direction), selecting new conditions to test.
* Output: The output is a set of conditions (independent variable values) that will be used in the next experiment run.
* Role: **The experimentalist** creates new combinations of conditions for the experiment runner.

2. **Experiment Runner:**

* Input: **The runner_on_state function** uses the newly generated conditions (from the experimentalist) to run an experiment. The conditions are transformed into a sequence of trials using the trial_sequences and stimulus_sequence functions, which generate the experimental stimuli.
* Component: **The experiment** is conducted, generating a set of raw data observations.
* Output: This raw data is processed by **the trial_list_to_experiment_data function** to calculate the dependent variable (d_prime) from the responses in the experiment.
* Role: The **experiment runner** tests the conditions generated by the experimentalist and returns the corresponding outcomes.

3. **Theorist:**

* Input: **The theorist_on_state function** takes the processed experimental data (the independent variables and the dependent variable d_prime) and the theorist model (a linear regression model).
* Output: The fitted model, which is returned as part of the updated state.
* Role: **The theorist** refines its estimate of the relationship between the independent and dependent variables based on the new experimental data.



**Outline of the Sequence of Events in the AutoRA Loop**

Each iteration begins with the experimentalist randomly sampling conditions. These conditions are used to generate trials, and the experiment runner gathers participant responses. The raw data is then processed into a clean dataset in which the dependent variable (d_prime) is computed from the participants' performance. Finally, the theorist fits a model to this data. The loop runs for a set number of iterations (three in the code below), and each component reads from and writes to the shared state. Because the experimentalist samples randomly, only the theorist's model can benefit from the collected data; the choice of new conditions does not depend on it.

By cycling through these components, the system incrementally builds up a model of the relationship between the independent variables and d_prime.
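
The sequence of events can be illustrated with a toy loop in plain Python. This is only a sketch of the control flow, not the AutoRA API: the state is an ordinary dictionary, the functions `experimentalist`, `runner`, and `theorist` are hypothetical stand-ins, the runner returns random placeholder numbers instead of participant data, and the "model" is simply the mean d_prime observed so far.

```python
import random


def experimentalist(state, num_samples, rng):
    # Random sampling: previous data in the state is deliberately ignored.
    conditions = [(rng.randint(0, 100), rng.randint(0, 360)) for _ in range(num_samples)]
    return {"conditions": conditions}


def runner(state, rng):
    # Placeholder for data collection. The d_prime values are random
    # numbers for illustration only, NOT experimental results.
    rows = [(c, m, rng.uniform(0.0, 3.0)) for c, m in state["conditions"]]
    return {"experiment_data": state["experiment_data"] + rows}


def theorist(state):
    # Stand-in "model": the mean d_prime over all data collected so far.
    values = [d for _, _, d in state["experiment_data"]]
    return {"models": state["models"] + [sum(values) / len(values)]}


state = {"conditions": [], "experiment_data": [], "models": []}
rng = random.Random(0)

for i in range(3):
    # Each component returns only the fields it changes; merging them
    # into the previous state yields the new state.
    state = {**state, **experimentalist(state, num_samples=2, rng=rng)}
    state = {**state, **runner(state, rng)}
    state = {**state, **theorist(state)}

print(len(state["conditions"]), len(state["experiment_data"]), len(state["models"]))
```

After three iterations with two conditions each, the state in this sketch holds six observations and three models, while `conditions` contains only the most recently sampled pair. Replacing conditions while accumulating data and models is a design choice of the sketch.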




    

Paste the code implementing the AutoRA workflow below. (2 points)

In [None]:
"""
Basic Workflow
    Two condition variables (coherence_ratio, motion_direction), one observation variable (d_prime)
    Theorist: LinearRegression
    Experimentalist: Random Sampling
    Runner: Firebase Runner (no prolific recruitment); replaced by a local CSV reader during development
"""

import json

from autora.variable import VariableCollection, Variable
from autora.experimentalist.random import pool
from autora.experiment_runner.firebase_prolific import firebase_runner
from autora.state import StandardState, on_state, Delta

import numpy as np
import pandas as pd
from sklearn.linear_model import LinearRegression
from sweetbean.sequence import Block, Experiment
from sweetbean.stimulus import TextStimulus

from trial_sequence import trial_sequences
from stimulus_sequence import stimulus_sequence
from utils import update_html_script
from methods import d_prime
from IPython.display import display
import os


def psudo_experiment_runner():
    # DEV stand-in for the Firebase runner: load previously recorded data from myexperiment.csv
    if not os.path.exists("myexperiment.csv"):
        raise FileNotFoundError("myexperiment.csv not found")

    raw_data = pd.read_csv("myexperiment.csv")
    return raw_data.to_dict(orient="records")


def run_experiment_once():
    experiment_seq = trial_sequences(
        coherence_ratios=[100],
        motion_directions=[0],
        num_repetitions=2,
        sequence_type="target",
    )

    training_seq = trial_sequences(
        coherence_ratios=[90],
        motion_directions=[45],
        num_repetitions=1,
        sequence_type="training",
    )

    print("len training sequence: ", len(training_seq[0]))
    print("len experiment sequence: ", len(experiment_seq[0]))

    display(experiment_seq[0])
    display(training_seq[0])

    js_code = stimulus_sequence(experiment_seq[0], training_seq[0], to_html=True)
    # print(stimulus_seq)
    update_html_script("test_experiment.html")


# To use the theorist on the state object, we wrap it with the on_state functionality and return a
# Delta object.
# Note: If the input arguments of the theorist_on_state function are state fields like
# experiment_data, variables, ... , then using this function on a state object will automatically
# use those state fields.
# The output of these functions is always a Delta object. The keyword argument in this case tells
# the state object which field to update.


@on_state()
def theorist_on_state(experiment_data, variables, theorist):
    ivs = [iv.name for iv in variables.independent_variables]
    dvs = [dv.name for dv in variables.dependent_variables]
    x = experiment_data[ivs]
    y = experiment_data[dvs]
    return Delta(models=[theorist.fit(x, y)])


# ** Experimentalist ** #
# Here, we use a random pool and use the wrapper to create an on_state function
# Note: The argument num_samples is not a state field. Instead, we pass it in when calling
# the function


@on_state()
def experimentalist_on_state(variables, num_samples, experimentalist=pool):
    return Delta(conditions=experimentalist(variables, num_samples))


# Again, we need to wrap the runner to use it on the state. Here, we send the raw conditions.
@on_state()
def runner_on_state(conditions):
    # Here, we convert conditions into sweet bean code to send the complete experiment code
    # directly to the server

    coherence_ratios_list = list(conditions["coherence_ratio"])
    motion_directions_list = list(conditions["motion_direction"])
    conditions_to_send = conditions.copy()

    # global training_seq
    # experiment_timeline = trial_sequences(coherence_ratios_list, motion_directions_list, all_items_in_one_trial=True)[0]
    # js_code = stimulus_sequence(experiment_timeline, training_timeline=training_seq, to_html=False)

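    # DEV: the target sequence uses fixed values here; the sampled coherence_ratios_list and
    # motion_directions_list above are not yet passed to trial_sequences.
    experiment_seq = trial_sequences(
        coherence_ratios=[100],
        motion_directions=[0],
        num_repetitions=2,
        sequence_type="target",
    )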

    training_seq = trial_sequences(
        coherence_ratios=[90],
        motion_directions=[45],
        num_repetitions=1,
        sequence_type="training",
    )

    print("len training sequence: ", len(training_seq[0]))
    print("len experiment sequence: ", len(experiment_seq[0]))

    display(pd.DataFrame(experiment_seq[0]).head())
    display(pd.DataFrame(training_seq[0]).head())

    js_code = stimulus_sequence(experiment_seq[0], training_seq[0], to_html=False)
    conditions_to_send["experiment_code"] = js_code

    # DEV: experiment_runner is a module-level name assigned in the __main__ block below
    data_raw = experiment_runner()  # returns observations for each condition as jsPsych data
    print("## got raw data ##")
    print("data length", len(data_raw))
    print("data type", type(data_raw), "type of first element", type(data_raw[0]))
    # print("data_raw[0]", data_raw[0])

    # process the experiment data
    experiment_data = trial_list_to_experiment_data(data_raw)
    print("processed experiment_data:")
    display(experiment_data)
    return Delta(experiment_data=experiment_data)


def trial_list_to_experiment_data(trial_sequence):
    """
    Parse a trial sequence (from jsPsych) into dependent and independent variables
    independent: coherence_ratio, motion_direction
    dependent: d_prime
    """
    trial_sequence = pd.DataFrame(trial_sequence).fillna(pd.NA)
    # display(trial_sequence.head())

    # target cleaned up data:
    # index(actual trial index), coherence_ratio, motion_direction, response, bean_correct_key
    # inference: hit, miss, d_prime
    # final: index(actual trial index), coherence_ratio, motion_direction, d_prime

    # get all rok trials
    rok_trials = trial_sequence[trial_sequence["trial_type"] == "rok"]
    # get only relevant columns
    rok_trials = rok_trials.loc[:, ["bean_text", "coherence_movement", "coherent_movement_direction"]]
    rok_trials = rok_trials.rename(
        columns={
            "bean_text": "type",
            "coherence_movement": "coherence_ratio",
            "coherent_movement_direction": "motion_direction",
        }
    )
    # get rid of duplicated information (each 8 rows are one actual trial)
    rok_trials = rok_trials.reset_index(drop=True)
    rok_trials = rok_trials[::8]
    rok_trials = rok_trials.reset_index(drop=True)
    # display(rok_trials)

    # get all response trials
    # all the html-keyboard-response trials where bean_correct_key is not null, empty, or NA / NaN
    response_trials = trial_sequence[
        (trial_sequence["trial_type"] == "html-keyboard-response") & (trial_sequence["bean_correct_key"].notna())
    ]

    # get only relevant columns
    response_trials = response_trials.loc[:, ["response", "bean_correct_key"]]
    response_trials = response_trials.rename(columns={"bean_correct_key": "correct_response"})
    # put each 2 response trials after each others into one row
    # pair the responses
    responses = response_trials["response"].values.reshape(-1, 2).tolist()
    # make sure responses are floats
    responses = [[float(r) for r in response] for response in responses]
    # get rid of the duplicates
    response_trials = response_trials[::2]
    response_trials["response"] = responses
    # convert correct_response from a string list "[1,2]" to a list [1,2]
    response_trials["correct_response"] = response_trials["correct_response"].apply(lambda x: json.loads(x))
    response_trials = response_trials.reset_index(drop=True)
    # display(response_trials)

    # merge the two dataframes
    trials = pd.concat([rok_trials, response_trials], axis=1)

    # infer the hits and misses per trial
    # hit: number of elements in the correct_response that are also in the response
    # miss: number of elements in the correct_response that are not in the response
    # (membership is used instead of comparing the sorted arrays position by position, which
    # would score response [2, 3] against correct_response [3, 7] as zero hits)
    def num_hits(response, correct_response):
        return int(np.sum(np.isin(correct_response, response)))

    def num_misses(response, correct_response):
        return int(np.sum(~np.isin(correct_response, response)))

    trials["hit"] = trials.apply(lambda x: num_hits(x["response"], x["correct_response"]), axis=1)
    trials["miss"] = trials.apply(lambda x: num_misses(x["response"], x["correct_response"]), axis=1)
    display(trials)

    # get only target trials (type: "target")
    trials = trials[trials["type"] == "target"]

    # group the trials by condition (coherence_ratio, motion_direction)
    trials_grouped = trials.groupby(["coherence_ratio", "motion_direction"]).agg({"hit": "sum", "miss": "sum"})
    # calculate d_prime
    # d_prime: d_prime(hit, miss)
    # where hit and misses are aggregated over all trials with the same conditions
    trials_grouped["d_prime"] = trials_grouped.apply(lambda x: d_prime(x["hit"], x["miss"]), axis=1)
    trials_grouped = trials_grouped.reset_index()
    display(trials_grouped)

    # select only the experiment data columns (drop hit and miss and type)
    trials_grouped = trials_grouped.loc[:, ["coherence_ratio", "motion_direction", "d_prime"]]
    # display(trials_grouped)
    return trials_grouped


if __name__ == "__main__":

    # run_experiment_once()
    # to run the experiment once locally and download the data
    # -> either uncomment the above line
    # OR
    # -> run it from terminal
    # python
    # >>> from autora_workflow import run_experiment_once
    # >>> run_experiment_once()

    # *** Set up variables *** #
    # independent variables: coherence ratio in percent (0 - 100) and motion direction in degrees (0 - 360)
    # dependent variable: d_prime
    variables = VariableCollection(
        independent_variables=[
            # 101 and 361 points give steps of exactly 1% and 1 degree
            Variable(name="coherence_ratio", allowed_values=np.linspace(0, 100, 101)),
            Variable(name="motion_direction", allowed_values=np.linspace(0, 360, 361)),
        ],
        dependent_variables=[Variable(name="d_prime", value_range=(0, 10000))],
    )

    # *** State *** #
    # With the variables, we can set up a state. The state object represents the state of our
    # closed loop experiment.

    state = StandardState(
        variables=variables,
    )

    # *** Components/Agents *** #
    # Components are functions that run on the state. The main components are:
    # - theorist
    # - experiment-runner
    # - experimentalist
    # See more about components here: https://autoresearch.github.io/autora/

    # ** Theorist ** #
    # Here we use a linear regression as theorist, but you can use other theorists included in
    # autora (for a list: https://autoresearch.github.io/autora/theorist/)

    theorist = LinearRegression()

    # ** Experiment Runner ** #
    # We will run our experiment on firebase and need credentials. You will find them here:
    # (https://console.firebase.google.com/)
    #   -> project -> project settings -> service accounts -> generate new private key

    firebase_credentials = {}

    # simple experiment runner that runs the experiment on firebase
    # experiment_runner = firebase_runner(firebase_credentials=firebase_credentials, time_out=100, sleep_time=5)
    # DEV
    experiment_runner = psudo_experiment_runner

    # Now, we can run our components
    # this is the cycle!
    print("## Start the cycle ##")
    for i in range(3):
        print(f"## Iteration {i} ##")
        state = experimentalist_on_state(
            state, num_samples=2, experimentalist=pool
        )  # Collect 2 conditions per iteration
        print("## experimentalist done")
        state = runner_on_state(state)
        print("## runner done")
        state = theorist_on_state(state, theorist=theorist)
        print("## theorist done")
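
The aggregation step in trial_list_to_experiment_data can be illustrated on a few made-up trials. The sketch below follows the definition given in the code comments (a miss is an element of correct_response that is not in the response, a hit is one that is) and sums hits and misses per condition. The trial values are hypothetical and `count_hits` is an illustrative helper, not part of our workflow; the conversion from hits and misses to d_prime is left to the d_prime function from our methods module and is not reproduced here.

```python
from collections import defaultdict


def count_hits(response, correct):
    """Number of correct digits that also appear in the response."""
    return sum(1 for digit in correct if digit in response)


# Hypothetical trials for illustration only:
# (coherence_ratio, motion_direction, response, correct_response)
trials = [
    (100, 0, [3.0, 7.0], [3, 7]),   # both digits reported
    (100, 0, [7.0, 2.0], [3, 7]),   # one digit reported
    (50, 90, [1.0, 4.0], [5, 6]),   # no digit reported
]

# Sum hits and misses over all trials that share the same condition.
totals = defaultdict(lambda: {"hit": 0, "miss": 0})
for coherence, direction, response, correct in trials:
    hits = count_hits(response, correct)
    totals[(coherence, direction)]["hit"] += hits
    totals[(coherence, direction)]["miss"] += len(correct) - hits

print(dict(totals))
# prints {(100, 0): {'hit': 3, 'miss': 1}, (50, 90): {'hit': 0, 'miss': 2}}
```

Each condition ends up with one row of summed hits and misses, which is the form in which the workflow passes the counts on to the d_prime calculation.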



# Results (2 bonus points)

You may describe any results from your scientific discovery process here.