# ISC ablation experiments

This notebook contains bar graphs of the results of the ablation experiments. A short description of each experiment precedes its bar graph.

## Replication of paper

First, I replicated the results from Giallanza et al. (2024).

In [4]:
from make_plot import make_model_plot

make_model_plot(
    [('ISC Model', 'data/isc_simulation_data_0200.csv')],
    include_human=True
)

## Fully ablated MLP

I next looked at the performance of a fully connected MLP with one hidden layer and no pretraining, and was surprised to see that its results matched those of the ISC model.

In [5]:
models = [
    ('ISC Model', 'data/isc_simulation_data_0200.csv'),
    ('MLP Model', 'data/mlp_simulation_data_0200.csv'),
]

make_model_plot(
    models,
    include_human=False
)

## Varying batch sizes

Next, I decreased the batch size of the model to 1, anticipating that this would yield identical results for blocked and interleaved models (since training blocks of size 1 are effectively interleaved data). Decreasing the batch size did not change the results (the blocked and interleaved models still differed), implying that the blocked/interleaved paradigms were encoded in some way other than the actual training curriculum.

In [6]:
models = [
    ('MLP Model', 'data/mlp_simulation_data_0200.csv'),
    ('MLP (Batch size 1)', 'data/batch1_simulation_data_0200.csv'),
]

make_model_plot(
    models,
    include_human=False
)
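
The expectation behind this experiment can be illustrated with a toy ordering example. This is only a sketch, not the code used for the simulations: `blocked_order` and `interleaved_order` are hypothetical helpers, a block is assumed to be a run of consecutive trials from one category, and interleaving is assumed to be strict alternation (a real curriculum might shuffle instead).

```python
def blocked_order(trials_by_category, block_size):
    """Emit trials in runs of `block_size` per category, cycling through categories."""
    order = []
    longest = max(len(trials) for trials in trials_by_category.values())
    position = 0
    while position < longest:
        for trials in trials_by_category.values():
            # Slicing past the end of a list is safe and yields fewer items.
            order.extend(trials[position:position + block_size])
        position += block_size
    return order


def interleaved_order(trials_by_category):
    """Alternate between categories one trial at a time (assumes equal-length lists)."""
    rounds = zip(*trials_by_category.values())
    return [trial for round_ in rounds for trial in round_]


# Hypothetical toy trials for two categories.
trials = {
    "A": ["a1", "a2", "a3", "a4"],
    "B": ["b1", "b2", "b3", "b4"],
}

blocks_of_two = blocked_order(trials, block_size=2)
blocks_of_one = blocked_order(trials, block_size=1)
alternating = interleaved_order(trials)
```

Under these assumptions, `blocks_of_one` and `alternating` are the same sequence, which is why a block size of 1 was expected to erase the blocked/interleaved difference if the difference came from trial order alone.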

## Removing category context

I then noticed that the blocked/interleaved distinction was encoded via additional context given to the blocked (but not interleaved) models. I therefore wanted to see what would happen if this context were removed. I expected the results to be identical across blocked and interleaved models, and this time the prediction held.

In [7]:
models = [
    ('MLP Model', 'data/mlp_simulation_data_0200.csv'),
    ('MLP (No context)', 'data/ablated_simulation_data_0200.csv'),
]

make_model_plot(
    models,
    include_human=False
)
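
The ablation can be pictured as follows. This is a minimal sketch under stated assumptions, not the actual model code: it assumes each input row is a stimulus vector followed by a few context units, that interleaved inputs carry zeros in those units, and that the ablation zeroes them for the blocked inputs. The helper `ablate_context` and the toy values are hypothetical.

```python
def ablate_context(inputs, n_context):
    """Return a copy of `inputs` with the trailing `n_context` units of each row set to 0."""
    if n_context <= 0:
        return [list(row) for row in inputs]
    return [list(row[:-n_context]) + [0.0] * n_context for row in inputs]


# Hypothetical layout: two stimulus features followed by two context units.
blocked_inputs = [
    [0.2, 0.9, 1.0, 0.0],  # context marks category A
    [0.7, 0.1, 0.0, 1.0],  # context marks category B
]
interleaved_inputs = [
    [0.2, 0.9, 0.0, 0.0],  # assumed: no context signal
    [0.7, 0.1, 0.0, 0.0],
]

ablated_inputs = ablate_context(blocked_inputs, n_context=2)
```

With the context units zeroed, the blocked inputs in this toy example become indistinguishable from the interleaved ones, so any remaining difference between conditions would have to come from trial order.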

## Temporal blocks

Out of curiosity, what happens when the ISC model is given the temporally blocked learning curriculum rather than the annotated, categorically blocked curriculum (as in the original experiment)? Perhaps surprisingly, we once again see identical results, implying that the main driver of the qualitative change observed in Giallanza et al. (2024) is the annotated category context.

In [8]:
models = [
    ('ISC Model', 'data/isc_simulation_data_0200.csv'),
    ('ISC (Temporal)', 'data/isc_temporal_data_0200.csv'),
]

make_model_plot(
    models,
    include_human=False
)