# Mosquito arms



In [1]:
# Don't change this cell; just run it.
import numpy as np  # The array library.

import pandas as pd
# Safe setting for Pandas.
pd.set_option('mode.copy_on_write', True)

import matplotlib.pyplot as plt
%matplotlib inline

## The data

You will analyze data from a study involving mosquitoes and beer. The paper describing the data is [Beer Consumption Increases Human Attractiveness to Malaria Mosquitoes](https://doi.org/10.1371/journal.pone.0009546).

The first author, Dr [Thierry Lefèvre](https://sites.google.com/site/thierryelefevre), kindly sent the original data.

He released the data and derivatives under the [CC-BY](https://creativecommons.org/licenses/by/4.0) license.  Specifically, you should attribute any copies of these data to Dr Thierry Lefèvre, and reference the paper above.

The processed data are in `./data/mosquito_beer.csv`.

Variables in that file are:

* `volunteer`: 43 levels corresponding to the id of the 43
  volunteers.
* `group`: 2 levels, "beer" or "water" (volunteers were
  assigned to either the beer treatment (volunteers 1 to 25) or the
  water treatment (volunteers 26 to 43)).
* `test`: 2 levels "after" or "before"  (the attractiveness of
  each volunteer was tested twice: before drinking and 15 min
  after drinking either water or beer).
* `nb_released`: number of released mosquitoes (n=50 for each test
  and group).
* `no_odour`: number of mosquitoes caught in the no-odour control
  trap.
* `volunt_odour`: number of mosquitoes caught in the volunteer odour
  trap.
* `activated`: number of trapped mosquitoes (= `no_odour` +
  `volunt_odour`).
* `co2no`: CO2 concentration in the no odour trap.
* `co2od`: CO2 concentration in the volunteer odour trap.
* `temp`: body temperature of the volunteer.
* `trapside`: 2 levels (A or B); the side of the
  volunteer odour treatment in the Y-olfactometer (volunteer
  odour on the right side: A, or on the left side: B).
* `datetime`: date / time of the corresponding test run.

To read in the data:

In [3]:
mosquitoes = pd.read_csv("./data/mosquito_beer.csv")
mosquitoes.head()

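| Unnamed: 0 | volunteer | group | test | nb_released | no_odour | volunt_odour | activated | co2no | co2od | temp | trapside | datetime |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 0 | subj1 | beer | before | 50 | 7 | 9 | 16 | 305.0 | 321.0 | 36.1 | A | 2007-08-28 19:00:00 |
| 1 | subj2 | beer | before | 50 | 26 | 7 | 33 | 338.0 | 720.0 | 35.3 | B | 2007-08-28 21:00:00 |
| 2 | subj3 | beer | before | 50 | 5 | 10 | 15 | 348.0 | 355.0 | 36.1 | B | 2007-09-15 19:00:00 |
| 3 | subj4 | beer | before | 50 | 3 | 7 | 10 | 349.0 | 437.0 | 35.6 | A | 2007-09-25 17:00:00 |
| 4 | subj5 | beer | before | 50 | 2 | 8 | 10 | 396.0 | 475.0 | 37.0 | B | 2007-09-25 18:00:00 |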
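
As a quick sanity check on the variable descriptions, the sketch below confirms that `activated` equals `no_odour` + `volunt_odour` for the five rows displayed above. The values are typed in by hand from that output; `head_rows`, `trap_total` and `counts_match` are names made up for this illustration.

```python
import pandas as pd

# The five rows from the displayed output above, typed in by hand.
# Only the columns needed for the check are included.
head_rows = pd.DataFrame({
    'volunteer': ['subj1', 'subj2', 'subj3', 'subj4', 'subj5'],
    'no_odour': [7, 26, 5, 3, 2],
    'volunt_odour': [9, 7, 10, 7, 8],
    'activated': [16, 33, 15, 10, 10],
})

# According to the variable list, `activated` should be the sum of
# the counts in the two traps.
trap_total = head_rows['no_odour'] + head_rows['volunt_odour']
counts_match = (trap_total == head_rows['activated']).all()
print('activated equals no_odour + volunt_odour:', counts_match)
```

You could run the same comparison on the full `mosquitoes` data frame to check every row.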


## Experimental procedure

These variables were derived from a full experimental setup that was quite sophisticated. Here is the graphic from the paper:

![](experimental_setup.png)

For each trial, there were two tents.

* One tent was empty (the *control* tent).
* The other tent contained a person (the *volunteer* tent).
* A tube led from each tent to a corresponding *trap* box. Thus, there was a
  *control trap* box and a *volunteer trap* box.
* A tube from each trap box fed into an arm of a Y connector.
* The remaining, third arm of the Y led to a *downwind box* containing 50
  mosquitoes.
* At the beginning of the trial, the experimenters opened the *downwind box*
  of mosquitoes, so the mosquitoes could fly out into the Y connector, and
  thence, into either of the *trap* boxes.
* The number of mosquitoes who flew into the *control trap* box gives the
  values for the `no_odour` column.
* The number of mosquitoes who flew into the *volunteer trap* box gives the
  values for the `volunt_odour` column.
* The total number of mosquitoes who flew into either trap box gives
  the values for the `activated` column.

## Research question

The authors studied **whether people who had drunk beer were more attractive to mosquitoes.**

You too will study this. First, you will filter the data frame to contain only the "after" treatment rows. Each of these rows corresponds to one person in the study. The number for each subject is the number of mosquitoes flying towards them. The subjects were from two groups: people who had just drunk beer, and people who had just drunk water. There were 25 subjects who had drunk beer, and therefore 25 mosquito counts corresponding to the "beer" group. There were 18 subjects who had drunk water, and therefore 18 counts corresponding to the "water" group.

Get the numbers of mosquitoes flying towards the beer drinkers, and towards the water drinkers, after they had drunk their beer or water.

In [6]:
after_rows = mosquitoes[mosquitoes['test'] == 'after']
beer_rows = ... 
beer_activated = ...
water_rows = ... 
water_activated = ...

(43, 12)
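
If the row-selection pattern is unfamiliar, here is a minimal sketch of Boolean indexing on a toy data frame. The data, the column names (`drink`, `count`) and the variable names are all hypothetical and have nothing to do with the study; the point is only the select-rows-then-select-column pattern.

```python
import pandas as pd

# Hypothetical toy data, invented for illustration; not from the study.
toy = pd.DataFrame({
    'drink': ['tea', 'coffee', 'tea', 'coffee', 'tea'],
    'count': [3, 8, 5, 6, 4],
})

# A Boolean Series: True where the row belongs to the "tea" group.
is_tea = toy['drink'] == 'tea'
# Indexing with the Boolean Series keeps only the True rows.
tea_rows = toy[is_tea]
# Select a single column from the filtered rows.
tea_counts = tea_rows['count']
print(len(tea_counts), 'tea rows; mean count', tea_counts.mean())
```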

Check that there are 25 values in the beer group, and 18 in the water group:

In [None]:
print('Number in beer group:', len(beer_activated))
print('Number in water group:', len(water_activated))

We are interested in the difference between the means of these numbers, which you can check here:

In [None]:
observed_difference = np.mean(beer_activated) - np.mean(water_activated)
observed_difference

## Testing for a difference

Your task is to conduct a relevant statistical test to address the research question "does drinking beer make people more attractive to mosquitoes?"

For this, you should:
1) state your hypotheses
2) specify any assumptions
3) conduct the test
4) report the results
5) draw a conclusion

To achieve full marks, you should also comment on how you could better answer the question, making use of variables already provided in the dataset.

Add code/markdown cells as needed.
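
One possible approach, offered only as a sketch, is a permutation test on the difference in means. It is not necessarily the test you are expected to use. The example below runs on two small arrays of invented numbers (`group_a`, `group_b`), not on the study data. The random seed, the number of iterations and the one-sided direction are all assumptions you would need to choose and justify yourself.

```python
import numpy as np

# Seeded random number generator, so the sketch is reproducible.
rng = np.random.default_rng(42)

# Hypothetical counts, invented for illustration; NOT the study data.
group_a = np.array([27, 20, 21, 26, 27, 31, 24])
group_b = np.array([21, 22, 15, 12, 21, 16])


def mean_difference(a, b):
    """Difference between the mean of `a` and the mean of `b`."""
    return np.mean(a) - np.mean(b)


observed = mean_difference(group_a, group_b)

# Under the null hypothesis the group labels are arbitrary, so we
# pool the values and repeatedly shuffle them into two fake groups.
pooled = np.concatenate([group_a, group_b])
n_a = len(group_a)
n_iters = 10_000
fake_differences = np.zeros(n_iters)
for i in range(n_iters):
    shuffled = rng.permutation(pooled)
    fake_differences[i] = mean_difference(shuffled[:n_a], shuffled[n_a:])

# One-sided p-value: the proportion of fake differences at least as
# large as the observed difference.
p_value = np.mean(fake_differences >= observed)
print('Observed difference:', observed)
print('Permutation p-value:', p_value)
```

The permutation approach needs few distributional assumptions, but it does assume that the observations are exchangeable between groups under the null hypothesis.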