# Fruit fly simulation

This notebook uses simulation to test the null hypothesis that a new
fruit fly is equally likely to be male or female.

#### 21.2.1.1 `paste` to concatenate strings

We are going to make a rather verbose label for the plot below. To do
this, we need to combine several strings and numbers into a single
string.

One useful R function to do this is `paste`.

We pass `paste` the strings and values we want it to combine, as
arguments. It converts each argument to a string, and concatenates the
resulting strings, separating them with a space (by default).

For example, to make a new string that concatenates the strings
`'resampling'`, `'is'` and `'better'`, separated by spaces, we could
use:

In [None]:
paste('resampling', 'is', 'better')

If we want to insert a number into this sequence, we can pass the number
as an argument. `paste` converts the number to a string before
concatenation.

In [None]:
paste('resampling', 'is', 100, 'times', 'better')

`paste` is useful for compiling messages to print, or labels for plots,
as you will see below.
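The separator does not have to be a space. As a minimal sketch, the cell below compares the default behavior with the `sep` argument and with the related base R function `paste0`, which joins its arguments with no separator. The strings and the number 20 here are illustrative values only; they are not used elsewhere in this notebook.

```r
# Default: the pieces are joined with a single space.
with_space <- paste('sample', 'size', 20)

# The sep argument sets the separator placed between the pieces.
with_underscore <- paste('sample', 'size', 20, sep='_')

# paste0 joins the pieces with no separator at all.
no_separator <- paste0('n=', 20)

print(with_space)
print(with_underscore)
print(no_separator)
```

The plot title later in this notebook relies on the default space separator, so it does not need `sep`.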

#### 21.2.1.2 Simulation for irradiation and sex of fruit-flies



In [None]:
# Set the number of trials
n_trials <- 10000

# set the sample size for each trial
sample_size <- 20

# A vector of zeros to store the result of each trial
scores <- numeric(n_trials)

# Repeat the procedure n_trials times
for (i in 1:n_trials) {
    # Generate 20 simulated fruit flies, where each has an equal chance of being
    # male or female
    a <- sample(c('male', 'female'),
                size=sample_size,
                prob=c(0.5, 0.5),
                replace=TRUE)

    # count the number of males in the sample
    b <- sum(a == 'male')

    # store the result of this trial
    scores[i] <- b
}

# Produce a histogram of the trial results
title_of_plot <- paste("Number of males in", n_trials,
                       "samples of", sample_size,
                       "simulated fruit flies")
hist(scores, xlab='Number of Males', main=title_of_plot)

In the histogram above, we see that in about 12 percent of the trials,
the number of males was 14 or more, or 6 or fewer. Instead of reading
the result from the histogram, we can calculate it by adding the
following commands to the program above:

In [None]:
# Determine the number of trials in which we had 14 or more males.
j <- sum(scores >= 14)

# Determine the number of trials in which we had 6 or fewer males.
k <- sum(scores <= 6)

# Add the two results together.
m <- j + k

# Convert to a proportion.
mm <- m/n_trials

# Print the results.
print(mm)
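As a cross-check, the same proportion can be computed in one step and compared with the exact binomial probability. The cell below is a sketch under stated assumptions, not part of the original procedure: it reruns the simulation with `replicate` instead of a `for` loop, fixes the random seed so the run is repeatable, and uses the hypothetical names `scores_check`, `p_sim` and `p_exact`.

```r
# Assumption: fix the seed so that repeated runs give the same answer.
set.seed(1)

n_trials <- 10000
sample_size <- 20

# Same simulation as the loop above, written with replicate().
# scores_check is a hypothetical name, used to avoid overwriting scores.
scores_check <- replicate(n_trials, {
    flies <- sample(c('male', 'female'), size=sample_size, replace=TRUE)
    sum(flies == 'male')
})

# Proportion of trials with 14 or more males, or 6 or fewer males.
p_sim <- mean(scores_check >= 14 | scores_check <= 6)

# Exact probability of the same event for 20 flies with P(male) = 0.5:
# P(X <= 6) + P(X > 13), which is the same as P(X <= 6) + P(X >= 14).
p_exact <- pbinom(6, size=sample_size, prob=0.5) +
    pbinom(13, size=sample_size, prob=0.5, lower.tail=FALSE)

print(p_sim)
print(p_exact)
```

The exact value is about 0.115, which agrees with the roughly 12 percent read from the histogram. The simulated value will vary a little from run to run if the seed is changed or removed.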