###  Today's topics 
- Maths theory vs simulation  
- Functions - what are they?
- How do I write them?

- Using functions to solve problems
    - Pay gaps
    - Rate of inflation
    - Lost pay
- Looking toward the 3 girls problem


#### Note:  the first lecture from this week is on vLab

### First, today's data scientist...
<img src=KatherineJohnson.jpeg align=right width=400>

### In honour of Black History Month...
### Katherine Johnson

- Was in high school at age 13 (usual age 15-16)
- After university, she was one of the first three black students to attend West Virginia's graduate schools
- National Advisory Committee for Aeronautics (NACA) Langley Laboratory with Dorothy Vaughan
- Analyzed data from flight tests
- After Sputnik, began working for NASA which evolved from NACA
- Did calculations for John Glenn's trajectory when he orbited the earth
- Did calculations for syncing the Lunar Module (moon lander) and Command Module (orbiting module) for the Apollo moon missions
- Worked on space shuttle and Landsat satellites (lots of earth data)  
    - see https://landsat.gsfc.nasa.gov/
<br>

- See her story in the movie "Hidden Figures"
<br>

Johnson will have used _exact_ (or analytic) calculations for the most part (I think)  
Her work is an example where this may be quite important.  Computers only started to be used
part way through her career.

> As a part of the preflight checklist, Glenn asked engineers to “get the girl”—Johnson—to run the same numbers through the same equations that had been programmed into the computer, but by hand, on her desktop mechanical calculating machine. “If she says they’re good,’” Katherine Johnson remembers the astronaut saying, “then I’m ready to go.” Glenn’s flight was a success, and marked a turning point in the competition between the United States and the Soviet Union in space. (https://spacecenter.org/women-in-stem-katherine-johnson/)

From the same source (and connecting to our gender pay-gap discussion):

> In 1953, a “computer” was someone who performed computations, or mathematical formulas. So, Johnson joining a computer group at NACA meant she was working as someone who could perform the precise calculations needed for spaceflight. The term computer being used for a human had been in use since the 17th century.

> At some point in the 19th century, work as a computer became gendered. This was partly to reduce costs. During inequitable times, women were paid far less for this essential work than a man in a similar role would be.




### Major theme of the course: Maths vs simulation
### Or "Do it with formulas" vs "Do it with code"

- Answering questions about data:  Two ways

- First way, **Maths theory**
    - Understand the mathematical properties of the data
    - Turn the question into a set of equations
    - Solve the equations


- Second way, **Simulation (using code)**
    - Break the process for answering the question into steps
        - E.g.
        - Shuffle mosquito numbers for beer/water drinkers
        - Calculate means for shuffled groups
        - Take the difference
        - Store the difference
    - Make the computer do this many times
    - Plot the answer compared to the answer from the real world
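
The simulation steps above can be sketched in code. This is a minimal sketch, not the course's own solution: the mosquito counts are made up for illustration, the helper `mean` and all variable names are assumptions, and the final step counts the shuffled differences instead of plotting them.

```python
import random

random.seed(1)  # fixed seed so the sketch gives the same numbers on every run

# Made-up mosquito counts, for illustration only (NOT the real study data)
beer = [27, 20, 21, 26, 27, 31, 24, 21]
water = [21, 22, 15, 12, 21, 16, 19, 15]

def mean(values):
    return sum(values) / len(values)

# The answer from the (made-up) real world
real_difference = mean(beer) - mean(water)

all_counts = beer + water
shuffled_differences = []

# Make the computer do the steps many times
for _ in range(1000):
    random.shuffle(all_counts)                        # shuffle the numbers
    shuffled_beer = all_counts[:len(beer)]            # first group
    shuffled_water = all_counts[len(beer):]           # second group
    difference = mean(shuffled_beer) - mean(shuffled_water)  # means, then difference
    shuffled_differences.append(difference)           # store the difference

# Compare the shuffled answers with the real-world answer
as_large = sum(1 for d in shuffled_differences if d >= real_difference)
print("Real difference:", real_difference)
print("Shuffles at least as large:", as_large, "out of", len(shuffled_differences))
```

If only a few shuffles produce a difference as large as the real one, the real difference is unlikely to be down to chance alone.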

###  Both ways have advantages and disadvantages

<img src=CentralLimitProof.png align=right width=400>

- Maths theory
<br>

- Advantages
    - proof
    - certainty
    - often an exact answer that never changes
    
- Disadvantages
    - Lots of work to get to the point where you understand the theory
    - Years of study
    - Harder for others to understand
    

### Advantages of simulation

<img src=Babbage.png align=right width=400>

- Easier to understand
- Can be adapted to changing situations
- Can be realized in code (simpler than maths theory)
- Can be communicated more easily (often with visualizations)
<br>

Disadvantages
- The answer changes slightly each time 
- Seems less **certain**
    - but certainty may be an illusion
 
### There is definitely a role for both
- sometimes you need to be certain
- sometimes you can't be certain even if you use formulas (because the world isn't certain)



### To develop our examples today we will look at the following problems
### - How long will it take the gender pay gap to close?
### - How much effect does low inflation have on pay?

We can use either maths or 'simulation' for these

### Gender pay gap - Basic facts

https://www.ons.gov.uk/employmentandlabourmarket/peopleinwork/earningsandworkinghours/bulletins/genderpaygapintheuk/2024

2017 - mean pay 18.4% lower for women
<br>

2024 - mean pay 13.1% lower for women
<br>

Solve using maths in code below

In [1]:
# Use this window for your maths solution
gap_in_2017 = 18.4
gap_in_2024 = 13.1

avg_one_year_change = (gap_in_2017 - gap_in_2024) / (2024 - 2017)

# Now we have a choice -- the algebra way or the successive subtraction way
# Algebra first
# Question:  How many avg_one_year_change steps are there in 13.1?

years_to_parity = gap_in_2024 / avg_one_year_change
print("Change in the gap that happens every year: ",avg_one_year_change)
print("Number of years to equality after 2024: ",years_to_parity)


Change in the gap that happens every year:  0.757142857142857
Number of years to equality after 2024:  17.301886792452834


We can simulate the decrease year by year:

In [2]:
# Use this window for your 'simulation' solution

# set the starting conditions first
current_gap = 13.1
year_counter = 0
current_year = 2024

In [3]:
# do this over and over until the current gap is <= 0
current_gap = current_gap - avg_one_year_change
year_counter = year_counter+1
current_year = current_year+1

print("Current gap: ",current_gap," year: ",year_counter," current year: ",current_year)
avg_one_year_change

Current gap:  12.342857142857143  year:  1  current year:  2025


0.757142857142857

As you can see above, the values change each time you run the code block. **This demonstrates a potential problem: if you accidentally run a block twice, the results may change.** Therefore, in your own project, run the whole notebook from start to finish to derive your final results, and avoid running a single block twice.
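
One way to guard against the re-run problem is to keep the starting conditions and the step in the same block. The sketch below is only an illustration of that idea, reusing the variable names and numbers from the cells above; it is not a replacement for the step-by-step exercise.

```python
avg_one_year_change = (18.4 - 13.1) / (2024 - 2017)

# Starting conditions and the step live in the same block,
# so re-running it always starts again from 13.1 in 2024.
current_gap = 13.1
current_year = 2024

current_gap = current_gap - avg_one_year_change
current_year = current_year + 1

print("Current gap: ", current_gap, " current year: ", current_year)
```

However many times this block is run, it reports the gap for 2025, because nothing it depends on was changed by an earlier run.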

### Are there some things you would like the computer to do for you?

### Next problem - pay and inflation

### _functions_ can help

### Functions - What are they?

<img src=cake.jpg align=right width=300>

Functions are like a recipe for getting a result that always involves the same set of calculations  

Recipes
- ingredients
- steps
- result!  (CAKE!)

Advantage of a recipe?
- Does the same thing each time
- Does it right
- Good CAKE!

Functions
- ingredients - our inputs (called _arguments_)
- steps - the calculations that produce our result - this is what is inside the function (a series of steps)
- result - what you send back (good cake!)


### Your first function: A simple python function to calculate the sum of two numbers

In [4]:
def sum_two(value1, value2):
    answer = value1 + value2
    return(answer)


In [5]:
# Now try it below - does it work?
# use: sum_two([number],[number])
print(sum_two(1, 3))


4


### Explanation

```python
def sum_two(value1,value2):
```

_def_
- _def_ means you are now going to define a function
- every function starts with _def_
<br>

_sum\_two_
- _sum\_two_ is the function name
- you use this name to make the function do its work
- i.e. you use this name to "call" the function
<br>

_(value1, value2)_
- the variables inside the two () are the inputs/arguments
- in this example these are the two numbers you want to sum
<br>

_answer = ..._
- the _answer = ..._  part contains the steps
- you can have more than one line here
<br>

_return(answer)_
- This sends back the result that you have calculated

### Using the function

```python
my_sum = sum_two(10,20)
```

That is a 'toy' example where the calculation (the steps) is not very complex.  You can imagine, however, a set of steps that are much more complicated.  
<br>
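
As an example of a function with more than one step, the pay-gap calculation from earlier could be wrapped up as below. This is a sketch only: the function name `calc_years_to_parity` and its argument names are made up for illustration, and it assumes the gap keeps shrinking by the same amount every year.

```python
def calc_years_to_parity(earlier_gap, later_gap, earlier_year, later_year):
    # Step 1: how much the gap shrinks in an average year
    avg_one_year_change = (earlier_gap - later_gap) / (later_year - earlier_year)
    # Step 2: how many of those yearly changes fit into the remaining gap
    years_left = later_gap / avg_one_year_change
    return years_left

years_left = calc_years_to_parity(18.4, 13.1, 2017, 2024)
print("Number of years to equality after 2024: ", years_left)
```

The steps are written once inside the function, so trying different years or gap values only means changing the arguments in the call.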


## 'Scope' - What is it? and what does this have to do with functions?

- **Scope is about where a variable is visible (where it exists and can be seen/changed)**
<br>

- REMEMBER: A function is always 'called' from a main program (or another function)

- The main program is the outer-most world
- A function is like a locked room with no windows

- A function is its own world _inside_ the main program world
  <br>
  
- The variables in the function are not visible to the main program
- However, variables in the main program are visible to the function. **Relying on this is very likely to cause confusion and unexpected errors, so it is not recommended**
<br>

Good practice is to keep the function self-contained and depend on the input arguments only
- The 'arguments' to the function are the entrance into the function world
- The 'return' value(s) is the exit out to the outside world

In [6]:
# Example 1: the main program cannot access variables inside the function.

def example_function_1(value1, value2):
    value3 = 2 / (1/value1 + 1/value2)
    return(value3)

a = 1
b = 2

print('function 1 output = ', example_function_1(a, b))
print('value 1 = ', value1)  # this raises a NameError because the main program can't access value1


function 1 output =  1.3333333333333333


NameError: name 'value1' is not defined

In [7]:
# Example 2. A bad design: the function takes values from a global variable outside it. 
# This is allowed but should be avoided whenever possible.


def example_function_2(value1, value2):
    output = value1 + value2 + a
    return(output)

a = 1

print('output of example function 2 = ', example_function_2(3, 4))
print('a = ', a)


output of example function 2 =  8
a =  1
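
To see why the design in Example 2 is risky, the sketch below calls the same function twice with the same arguments and gets two different answers, because the global variable changed in between. The function names `uses_global` and `uses_arguments` are made up for this illustration.

```python
# Bad design (as in Example 2): the result silently depends on the global variable a
def uses_global(value1, value2):
    return value1 + value2 + a

# Self-contained alternative: everything the function needs comes in as an argument
def uses_arguments(value1, value2, value3):
    return value1 + value2 + value3

a = 1
first_result = uses_global(3, 4)    # 3 + 4 + 1

a = 100                             # somewhere else in the notebook, a gets changed
second_result = uses_global(3, 4)   # same call, different answer: 3 + 4 + 100

safe_result = uses_arguments(3, 4, 1)  # depends only on what is passed in

print(first_result, second_result, safe_result)
```

With the self-contained version, the same call always gives the same result, whatever else has happened in the notebook.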


### Write a function to calculate how much pay you lose to inflation

- Inputs: pay, inflation rate
- Outputs: pay after inflation

If starting pay is 100 and the inflation rate is 5%, after one year it is worth less.  How much less?

In [8]:
# Write your function here
def value_after_inflation(starting_pay, inflation_rate):
    pay_after_inflation = starting_pay / (1+inflation_rate)
    return(pay_after_inflation)

In [9]:
# Test your function here. Fill in the ... part below and fix any bugs

# set some starting values
start_pay = 100 # save this unchanged for percent loss calculation below
infl_rate  = 0.05

pay_after_infl = value_after_inflation(start_pay, infl_rate )
loss = start_pay - pay_after_infl

print("after one year of inflation of " , infl_rate , "there is a loss of ", round(loss, 2), "out of ", start_pay )
# round() function is used to round to 2 decimals


after one year of inflation of  0.05 there is a loss of  4.76 out of  100
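
The same calculation written out as a worked equation may help explain why the function divides by $(1 + \text{rate})$. The symbols $p$ (starting pay) and $r$ (inflation rate) are introduced here for illustration only:

$$
\text{pay after inflation} = \frac{p}{1 + r} = \frac{100}{1.05} \approx 95.24,
\qquad
\text{loss} = 100 - 95.24 = 4.76
$$

Prices rise by a factor of $1 + r$, so the same pay buys only $1/(1 + r)$ as much as before. Multiplying by $(1 - r)$ instead would give 95, which slightly overstates the loss.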


### How much pay do you lose as a result of inflation? - Use real numbers

#### Retail price index

- numbers that change every year
- how much a basket of goods costs
- increases each year with inflation

https://www.statista.com/statistics/374890/rpi-rate-forecast-uk/


- From these values you can calculate the rate of inflation for each year (we show how below)
- Here are the rates of inflation calculated from the RPI values

#### Yearly rate of inflation from RPI

<img src=rpi_inflation_rate_2024.png align=center>
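
A yearly inflation rate can be worked out from two consecutive index values as the rise in the index divided by the earlier value. The sketch below assumes that definition; the function name `inflation_rate_from_rpi` is hypothetical and the index values are made up, not real RPI figures.

```python
def inflation_rate_from_rpi(rpi_previous_year, rpi_current_year):
    # The rise in the index, as a fraction of the previous year's index
    rate = (rpi_current_year - rpi_previous_year) / rpi_previous_year
    return rate

# Made-up index values, for illustration only (NOT real RPI figures)
rpi_year_1 = 200.0
rpi_year_2 = 210.0

example_rate = inflation_rate_from_rpi(rpi_year_1, rpi_year_2)
print("Inflation rate:", round(example_rate, 3))
```

The result is a fraction (0.05 means 5%), which is the form that `value_after_inflation` expects for its `inflation_rate` argument.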

### Now we want to answer a more complicated question

Given the inflation rate above

- If you had £100 at the start of 2020, how much was your money worth at the beginning of 2023?

###  steps

- Use the function we created previously: value_after_inflation(starting_pay, inflation_rate)
- Each call gives the worth of your money one year later
- Run the function 3 times in a row, feeding each result into the next call
<br>


In [10]:
# fill in the ... part below and run this block of code
value_2020 = 100
value_2021 = value_after_inflation(value_2020, 0.015)
value_2022 = value_after_inflation(value_2021, 0.04)
value_2023 = value_after_inflation(value_2022, 0.116)

print(value_2023)

84.88606931321058
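
As a cross-check (the maths way), the three consecutive calls are equivalent to dividing once by the product of the three yearly factors. The sketch below repeats the function so that it runs on its own; the names `combined_factor`, `value_2023_direct` and `percent_lost` are introduced here for illustration.

```python
def value_after_inflation(starting_pay, inflation_rate):
    pay_after_inflation = starting_pay / (1 + inflation_rate)
    return pay_after_inflation

# Step by step: feed each year's result into the next call
value_2021 = value_after_inflation(100, 0.015)
value_2022 = value_after_inflation(value_2021, 0.04)
value_2023 = value_after_inflation(value_2022, 0.116)

# All at once: divide by the three yearly factors multiplied together
combined_factor = (1 + 0.015) * (1 + 0.04) * (1 + 0.116)
value_2023_direct = 100 / combined_factor

# The starting value is 100, so the loss is already a percentage
percent_lost = 100 - value_2023

print(round(value_2023, 2), round(value_2023_direct, 2), round(percent_lost, 2))
```

Both routes give about 84.89, so roughly 15% of the money's value was lost over the three years.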


### Summary
- We have been developing ideas about simulation (contrasted with mathematical formulas)
- One useful tool for simulation will be functions, and we have learned to write them
    - Break a simulation into steps, write functions for logical steps in the process
- Practice writing functions and solving problems with repeated operations
    - _I wish there was some way to automate that_ (hang on to that thought)
- Preparing to analyze the 3 girls problem