# Self-study try-it activity 3.2: Analysing the Imperial College London maths department salaries

Manipulating normal variables in probability theory involves understanding how changes to the mean and variance affect the distribution. Some key manipulations are:

- Shifting the mean

- Scaling the variance

- Combining normal variables

- Standardising

## Shifting the mean

If you add a constant 'c' to a normal variable 'X' with mean 'μ' and variance $σ^2$, the new mean becomes 'μ + c', but the variance remains unchanged. Continuing from the video, if the salaries of all staff are increased by £12,000, what are the new mean (µ') and new standard deviation (σ')?

**Let's review the mathematical solution.**

The salaries at Imperial College London's Maths department are normally distributed with mean (μ) = £70,000 and standard deviation (σ) = £30,000. If everyone in the department gets a £12,000 raise, what are the new mean and standard deviation?

When everyone gets a £12,000 raise, the mean of the salaries increases by £12,000. Therefore, the new mean is:

    μ′ = μ + 12000 = 70000 + 12000 = 82000

Adding a constant to all values shifts the distribution but does not change its spread. Therefore, the standard deviation remains unchanged:

    σ′ = σ = 30000

Let's visualise this calculation.

Hint:

The x-axis uses the `np.linspace` function to create an array of evenly spaced values (say, 100 points) over a specified range. The start value is `mean_original - 3 * std_dev_original`, and the end value is `mean_original + 3 * std_dev_original`. Three standard deviations are used on each side because, in a normal distribution, about 99.7% of values fall within three standard deviations of the mean.

The y-axis is computed using the probability density function (PDF):

$$
f(x) = \frac{1}{\sqrt{2\pi \sigma^2}} \exp\left( - \frac{(x - \mu)^2}{2\sigma^2} \right)
$$


In [None]:
#import the necessary libraries
import numpy as np
import matplotlib.pyplot as plt


In [None]:

# Original parameters
mean_original = 70000
std_dev_original = 30000

# New parameters after the raise
mean_new = mean_original + 12000
std_dev_new = std_dev_original

# Generate x values for plotting
x_original = np.linspace(mean_original - 3*std_dev_original, mean_original + 3*std_dev_original, 100)
x_new = np.linspace(mean_new - 3*std_dev_new, mean_new + 3*std_dev_new, 100)

# Calculate y values (probability density)
y_original = np.exp(-((x_original - mean_original) / std_dev_original)**2 / 2) / (std_dev_original * np.sqrt(2 * np.pi))
y_new = np.exp(-((x_new - mean_new) / std_dev_new)**2 / 2) / (std_dev_new * np.sqrt(2 * np.pi))

# The PDF formula above is already normalised (the area under each curve is 1),
# so no rescaling of y_original or y_new is needed before plotting.

# Plotting
plt.figure(figsize=(10, 6))
plt.plot(x_original, y_original, label='Original Distribution')
plt.plot(x_new, y_new, label='Distribution After Raise')
plt.title('Effect of £12,000 Raise on Salary Distribution')
plt.xlabel('Salary (£)')
plt.ylabel('Probability Density')
plt.legend()
plt.grid(True)
plt.show()

print(f"Original Mean: £{mean_original}")
print(f"Original Standard Deviation: £{std_dev_original}")
print(f"New Mean: £{mean_new}")
print(f"New Standard Deviation: £{std_dev_new}")


## Scaling the variance

If you multiply a normal variable 'X' by a constant 'a', the new mean becomes 'aμ' and the new variance becomes $a^2 \sigma^2$, so the standard deviation becomes '|a|σ'. Continuing from the video, if the salaries of all staff are doubled, what are the new mean and new standard deviation?



**Let's review the mathematical solution.** 

The salaries at Imperial College London's Maths department are normally distributed with mean (μ) = £70,000 and standard deviation (σ) = £30,000. If everyone in the department gets their salary doubled, what are the new mean and standard deviation?

When each salary is doubled, the new mean and standard deviation are affected as follows:

    μ′ = 2 × μ = 2 × 70000 = 140000

    σ′ = 2 × σ = 2 × 30000 = 60000

Let's visualise this calculation.

Hint: Set the scaling factor, compute the new mean and new standard deviation, and use them for plotting.

In [None]:
# Original parameters
mean_original = 70000
std_dev_original = 30000

# Scaling factor
scaling_factor = 2
#Compute the mean, standard deviation and graph
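One possible way to complete this cell is sketched below. It is not an official solution: the helper `normal_pdf` and the names `mean_scaled` and `std_dev_scaled` are introduced here for illustration, and the plot follows the same approach as the shifting example.

```python
import numpy as np
import matplotlib.pyplot as plt


def normal_pdf(x, mean, std_dev):
    """Normal probability density, matching the PDF formula given earlier."""
    return np.exp(-((x - mean) / std_dev) ** 2 / 2) / (std_dev * np.sqrt(2 * np.pi))


# Original parameters
mean_original = 70000
std_dev_original = 30000

# Scaling factor
scaling_factor = 2

# Multiplying X by a gives mean a*mu and standard deviation |a|*sigma
mean_scaled = scaling_factor * mean_original
std_dev_scaled = abs(scaling_factor) * std_dev_original

# One shared x-axis that covers three standard deviations of both curves
x_min = min(mean_original - 3 * std_dev_original, mean_scaled - 3 * std_dev_scaled)
x_max = max(mean_original + 3 * std_dev_original, mean_scaled + 3 * std_dev_scaled)
x = np.linspace(x_min, x_max, 400)

plt.figure(figsize=(10, 6))
plt.plot(x, normal_pdf(x, mean_original, std_dev_original), label='Original Distribution')
plt.plot(x, normal_pdf(x, mean_scaled, std_dev_scaled), label='Distribution After Doubling')
plt.title('Effect of Doubling Salaries on Salary Distribution')
plt.xlabel('Salary (£)')
plt.ylabel('Probability Density')
plt.legend()
plt.grid(True)
plt.show()

print(f"New Mean: £{mean_scaled}")
print(f"New Standard Deviation: £{std_dev_scaled}")
```

The doubled curve is centred further to the right and is wider and lower than the original, because the same total area of 1 is spread over twice the range.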


## Combining normal variables

If 'X1' and 'X2' are independent normal variables with means 'µ1' and 'µ2' and standard deviations 'σ1' and 'σ2', then the sum 'X1 + X2' is also normal, with mean 'µ1 + µ2' and variance $\sigma_1^2 + \sigma_2^2$. The standard deviation of the sum is therefore $\sqrt{\sigma_1^2 + \sigma_2^2}$. Note that variances add, whereas standard deviations do not.

Continuing from the video, if a salary from the maths department and a salary from the English department are added together, what are the new mean (µ') and new standard deviation (σ')?

In [None]:
# Original parameters
mean_maths = 70000
mean_english = 50000
std_dev_maths = 30000
std_dev_english = 20000

#Compute mean, standard deviation and graph
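A minimal sketch of the calculation follows, assuming the two departments' salaries are independent. The names `mean_sum` and `std_dev_sum`, the sample size, and the random seed are illustrative choices, not part of the original activity. Instead of a plot, the sketch checks the formula against a simulation.

```python
import numpy as np

# Original parameters
mean_maths = 70000
mean_english = 50000
std_dev_maths = 30000
std_dev_english = 20000

# Means add; variances add (independence assumed), so take the square root
mean_sum = mean_maths + mean_english
std_dev_sum = np.sqrt(std_dev_maths**2 + std_dev_english**2)

print(f"Mean of the sum: £{mean_sum}")
print(f"Standard deviation of the sum: £{std_dev_sum:.2f}")

# Simulation check: add independent draws and compare with the formulas
rng = np.random.default_rng(seed=0)  # seed chosen arbitrarily for reproducibility
n_samples = 200_000
samples = (rng.normal(mean_maths, std_dev_maths, n_samples)
           + rng.normal(mean_english, std_dev_english, n_samples))

print(f"Simulated mean: £{samples.mean():.2f}")
print(f"Simulated standard deviation: £{samples.std():.2f}")
```

The simulated values should land close to the formula values, and the standard deviation is noticeably smaller than the £50,000 that adding the two standard deviations directly would give.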



## Standardising

Standardisation, also known as z-score transformation, converts a normal variable 'X' with mean 'µ' and standard deviation 'σ' into a standard normal variable 'Z' with mean 0 and variance 1 using

$$
z = \frac{x - \mu}{\sigma}.
$$

Continuing from the video, the salaries at Imperial College London's Maths department are normally distributed with mean (μ) = £70,000 and standard deviation (σ) = £30,000. The z-score for a person with a salary of £60,000 is:

    z-score = (60000 - 70000)/30000 ≈ -0.333

In [None]:
# Original parameters
mean_original = 70000
std_dev_original = 30000

# Range of values for which to calculate Z-scores
value_range = np.arange(50000, 90001, 1000)  # Increment by £1,000

# Compute the z-score for each value in the range and graph
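The z-score formula can be applied to the whole range at once with NumPy, as in the sketch below. The names `z_scores` and `z_60000` are illustrative, and only every tenth value is printed to keep the output short.

```python
import numpy as np

# Original parameters
mean_original = 70000
std_dev_original = 30000

# Range of values for which to calculate z-scores
value_range = np.arange(50000, 90001, 1000)  # Increment by £1,000

# Apply z = (x - mu) / sigma element-wise
z_scores = (value_range - mean_original) / std_dev_original

# The worked example from the text: a salary of £60,000
z_60000 = (60000 - mean_original) / std_dev_original
print(f"z-score for £60,000: {z_60000:.3f}")

# Print every tenth salary with its z-score
for salary, z in zip(value_range[::10], z_scores[::10]):
    print(f"£{int(salary):,}: z = {z:+.3f}")
```

Because the transformation is linear, plotting `z_scores` against `value_range` gives a straight line that crosses zero at the mean.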