In [1]:
from causality_simulation2 import *

# Add picture of truffula tree

# Part 1

The truffula tree bears fruit every year during summer. Hans, a botanist, wants to know whether the number of fruits that a truffula tree bears affects the neighbouring bee population. Out of 500 trees in his truffula orchard, he carefully records the average daily number of bees that land on each tree, as well as the total number of fruits during the whole fruit-bearing season.

Run the following two boxes. In the "Tree Group Assignment" plot, you will see an aerial view of his orchard, where each tree is represented by a blue dot.

In [9]:
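import numpy as np  # explicit import; the star import above may already provide np

# 50 x 50 lattice of candidate tree positions spanning 0 to 1000 on each axis
x = np.linspace(0, 1000, 50)
y = np.linspace(0, 1000, 50)
grid = np.transpose([np.tile(x, len(y)), np.repeat(y, len(x))])
# Pick 500 distinct positions; replace=False stops two trees sharing one spot
ids = np.sort(np.random.choice(len(grid), 500, replace=False))
coords = grid[ids]
init_data = {
    'Longitude': coords[:, 0],
    'Latitude': coords[:, 1]
}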

In [10]:
assignment_observation = {
    'name': 'Observation',
    'samples_str': '1-500'
}
experiment_observation = Experiment(truffula, init_data)
experiment_observation.assignment(config=[assignment_observation], hide_random=True)

Run the following box to see how Hans set up his observational experiment. Note that he is not performing any intervention on the number of bees or the number of fruits. He simply sits back and watches nature unfold.

Click on "Perform experiment" to perform the experiment.

In [11]:
experiment_observation.setting(show=['Number of Bees', 'Number of Fruits'], disable='all')

# Warning: "You have not performed the experiment yet!"

In the following aerial view of the orchard, each tree has been colour-coded according to the number of bees that Hans has recorded. If you wish to see the colour-coding according to the number of fruits instead, use the "Gradient" dropdown menu and select "Number of Fruits".

In [12]:
experiment_observation.plotOrchard(gradient='Number of Bees', show=['Number of Bees', 'Number of Fruits'])

Run the following box to plot the data that Hans has recorded. Remember that he is trying to see if there is any relationship between the number of bees and the number of fruits. Select "Number of Fruits" for the "x-Axis Variable" and "Number of Bees" for the "y-Axis Variable".

In [13]:
experiment_observation.plot(show=['Number of Bees', 'Number of Fruits'])

## Question 1

__What is the correlation (r) between the number of fruits and the number of bees?__

__Is the correlation statistically significant?__ Recall that the p-value is the probability of seeing a correlation at least this strong purely by random chance, if the two quantities were in fact unrelated. The smaller the p-value, the more statistically significant the correlation is.

__What does that tell you about the causal relationship between the number of fruits and the number of bees? In other words, what conclusion can we draw about the hypothesis "having more truffula fruits causes the neighbouring bee population to increase"?__
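
For readers who want to see what r and the p-value measure, here is a minimal sketch using only the Python standard library. It is not how the notebook's plotting widget computes its numbers (that is not shown here); the helper names `pearson_r` and `permutation_p_value` are hypothetical, and the data are made up rather than taken from Hans's records. The p-value is estimated by shuffling one variable many times and counting how often chance alone produces a correlation at least as strong as the observed one.

```python
import math
import random


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def permutation_p_value(xs, ys, n_shuffles=1000, seed=0):
    """Estimate how often shuffled data gives |r| >= the observed |r|."""
    rng = random.Random(seed)
    observed = abs(pearson_r(xs, ys))
    shuffled = list(ys)
    hits = 0
    for _ in range(n_shuffles):
        rng.shuffle(shuffled)  # breaks any real link between xs and ys
        if abs(pearson_r(xs, shuffled)) >= observed:
            hits += 1
    # Add-one correction so the estimate is never exactly zero
    return (hits + 1) / (n_shuffles + 1)


# Made-up illustration data (NOT Hans's measurements)
rng = random.Random(1)
fruits = [rng.gauss(100, 20) for _ in range(200)]
bees = [0.5 * f + rng.gauss(0, 10) for f in fruits]

r = pearson_r(fruits, bees)
p = permutation_p_value(fruits, bees)
print(f"r = {r:.2f}, p = {p:.4f}")
```

Note that a small p-value only says the correlation is unlikely to be a fluke; on its own it says nothing about which variable, if either, is the cause.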

# Part 2

Hans then decides to perform an experiment. Over the course of the following summer, he randomly selects 250 out of the 500 truffula trees, and diligently plucks out all the fruits that are starting to grow on these trees. The other 250 trees are left alone. He continues to record the average daily number of bees that land on each of the 500 trees.
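
The random split Hans performs can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions: the function name `randomise_assignment` is hypothetical and is not part of `causality_simulation2`, and the "Randomise assignment" button in the notebook may work differently internally.

```python
import random


def randomise_assignment(n_trees=500, n_groups=2, seed=None):
    """Shuffle tree IDs 1..n_trees and deal them evenly into n_groups lists."""
    rng = random.Random(seed)
    ids = list(range(1, n_trees + 1))
    rng.shuffle(ids)
    # Every n_groups-th shuffled ID goes to the same group
    return [sorted(ids[g::n_groups]) for g in range(n_groups)]


control, no_fruits = randomise_assignment(seed=42)
print(len(control), len(no_fruits))  # 250 250
```

Because chance alone decides which tree lands in which group, anything else that might influence the bees (location, soil, sunlight) is expected to be balanced across the two groups on average.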

Run the following two boxes. In the "Tree Group Assignment" plot, trees that are assigned to the control group are marked by blue dots, while those assigned to the intervention (no fruits) group are marked by red squares.

In [15]:
assignment_nofruits_control = {
    'name': 'Control',
    'samples_str': ''
}
assignment_nofruits_intervene = {
    'name': 'Intervention (no fruits)',
    'samples_str': ''
}
assignment = [assignment_nofruits_control, assignment_nofruits_intervene]
experiment_nofruits = Experiment(truffula, init_data)
experiment_nofruits.assignment(config=assignment)

Run the following box to see how Hans set up his experiment. In the control group, he performs no intervention, letting nature run its course. In the intervention group, he fixes the number of fruits to 0 (by nipping them in the bud).

Click on "Perform experiment" to perform the experiment.

In [16]:
intervention_nofruits_control = {
    'name': 'Control',
    'intervention': {
    }
}
intervention_nofruits_intervene = {
    'name': 'Intervention (no fruits)',
    'intervention': {
        'Number of Fruits': ['fixed', 0]
    }
}
intervention = [intervention_nofruits_control, intervention_nofruits_intervene]
experiment_nofruits.setting(config=intervention, show=['Number of Bees', 'Number of Fruits'], disable='all')

Run the following box to plot the data. Choose "Number of Bees" for the "x-Axis Variable" and "None (Distribution Only)" for the "y-Axis Variable". This way, we can see two separate histograms of the number of bees on each tree, one for the control group and one for the intervention (no fruits) group.

In [17]:
experiment_nofruits.plot(show=['Number of Bees', 'Number of Fruits'])

## Question 2

__Is there any strong correlation between removing the fruits and the number of bees?__ (You can answer yes, no, or can't really tell)

__What does that tell you about the causal relationship between the number of fruits and the number of bees? In other words, what conclusion can we draw about the hypothesis "having more truffula fruits causes the neighbouring bee population to increase"?__
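
If you would like a number to go with the two histograms, one possible approach (an assumption on our part, not something the notebook itself does) is to compare the group means with a permutation test. The sketch below uses made-up bee counts, not the data from Hans's experiment, and the function name `mean_difference_p_value` is hypothetical.

```python
import random


def mean_difference_p_value(group_a, group_b, n_shuffles=1000, seed=0):
    """Estimate how often random relabelling gives a mean gap >= the observed one."""
    rng = random.Random(seed)
    n_a = len(group_a)
    observed = abs(sum(group_a) / n_a - sum(group_b) / len(group_b))
    pooled = list(group_a) + list(group_b)
    hits = 0
    for _ in range(n_shuffles):
        rng.shuffle(pooled)  # pretend the group labels carry no information
        mean_a = sum(pooled[:n_a]) / n_a
        mean_b = sum(pooled[n_a:]) / (len(pooled) - n_a)
        if abs(mean_a - mean_b) >= observed:
            hits += 1
    # Add-one correction so the estimate is never exactly zero
    return (hits + 1) / (n_shuffles + 1)


# Made-up bee counts for illustration only
rng = random.Random(7)
control = [rng.gauss(50, 10) for _ in range(250)]
same = [rng.gauss(50, 10) for _ in range(250)]     # drawn like the control group
shifted = [rng.gauss(40, 10) for _ in range(250)]  # drawn with a lower mean

p_same = mean_difference_p_value(control, same)
p_shifted = mean_difference_p_value(control, shifted)
print(f"same distribution: p = {p_same:.3f}")
print(f"shifted mean:      p = {p_shifted:.3f}")
```

A large p-value means the gap between the two groups is the kind of gap that random relabelling produces routinely; a small one means the intervention groups genuinely differ.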

# Part 3

Now it's your turn! Design an experiment to further investigate the causal relationship between the number of bees and the number of fruits that a truffula tree bears.

First run the following box. Then, create any number of experimental groups by clicking "Add another group". You can name the groups anything you like, and you can write down the IDs (1 through 500) of the trees you wish to assign to each group into the "Assigned samples" box. If you wish to randomly assign the trees evenly to all the groups, click on the "Randomise assignment" button. To save and visualise the assignment, click on "Visualise assignment".

In [18]:
experiment = Experiment(truffula, init_data)
experiment.assignment()

Run the following box. For each group, set up the corresponding intervention (or no intervention). If you are confused, take another look at Hans' setup in Part 2 of this notebook.

Remember to click on "Perform experiment" once you've completed the setup!

In [19]:
experiment.setting(show=['Number of Bees', 'Number of Fruits'])

Run the following box to plot the data. Use the dropdown menus to select the appropriate "x-Axis Variable" and "y-Axis Variable".

In [8]:
experiment.plot(show=['Number of Bees', 'Number of Fruits'])

## Question 3

__From the above experiment that you have designed and performed, what can you conclude about the causal relationship between the number of fruits and the number of bees? Please justify your conclusion using the definition of "causality" introduced in class.__

__(Optional) Can you come up with a situation (causal pathway) that is consistent with all the experimental data so far, but makes the conclusion you made above invalid?__ Be creative!