# Digging deeper: Generating and plotting synthetic data

If you finished Module 02 and are up for a challenge, this <b>extra</b> activity demonstrates how we might produce <i>synthetic</i> data, which is more fun to visualize. This activity uses some advanced approaches like [writing and executing functions](https://www.geeksforgeeks.org/python/python-functions/), which you won't need just yet in this course but may serve you well down the line. 

## Functions

In Python, functions are reusable blocks of code that perform specific tasks. You define a function with the `def` keyword, followed by the function's name and parentheses, which can include <b>parameters</b> (inputs) that the function will use. After defining the function, you "call" it by writing its name followed by parentheses, optionally passing in arguments (values) for the parameters. Functions can also return a value using the `return` statement, allowing you to capture the result and use it elsewhere in your code.

Here is an example of a super-simple function:

In [None]:
def add_value(data, value):
    return data + value

In [None]:
import numpy as np  # needed for every np.* call in this notebook

x = np.arange(0,20,1)

# Do you remember what np.arange() does?

x_new = add_value(x, 3)

print(x_new)
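As a quick reminder, `np.arange(start, stop, step)` returns evenly spaced values from `start` up to, but not including, `stop`. The sketch below is only an illustration, and the variable names `small` and `evens` are made up for this example:

```python
import numpy as np

# np.arange(start, stop, step): the stop value is excluded from the result
small = np.arange(0, 5, 1)
print(small)  # [0 1 2 3 4]

# A step of 2 skips every other value
evens = np.arange(0, 10, 2)
print(evens)  # [0 2 4 6 8]

# Arithmetic on an array applies to every element at once,
# which is why add_value(x, 3) shifts the whole array by 3
print(small + 3)  # [3 4 5 6 7]
```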

Next, consider how you can use the `random` module within `numpy` to artificially add noise to an existing signal:

In [None]:
x = np.arange(0,20,1)
y = np.arange(10,30,1)

def add_noise(data, magnitude):
    # the * in front of np.shape unpacks the tuple (don't worry about that for now)
    return data + (np.random.rand(*np.shape(data)) * magnitude)

y_noise = add_noise(y, 1)

In [None]:
print(y_noise)
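One thing to notice: `np.random.rand` draws from a uniform distribution on [0, 1), so `add_noise` only ever pushes values upward. If you want noise that scatters both above and below the signal, one option is to draw from a normal distribution centered on zero. This is a minimal sketch, not part of the original activity: the helper name `add_centered_noise` and the `seed` argument are hypothetical choices made for illustration.

```python
import numpy as np

def add_centered_noise(data, magnitude, seed=None):
    # Hypothetical helper: zero-mean Gaussian noise whose
    # standard deviation is set by `magnitude`
    rng = np.random.default_rng(seed)
    return data + rng.normal(loc=0.0, scale=magnitude, size=np.shape(data))

y = np.arange(10, 30, 1)
y_centered = add_centered_noise(y, 1, seed=42)

# Uniform noise from np.random.rand is never negative
uniform_noise = np.random.rand(1000)
print(uniform_noise.min() >= 0, uniform_noise.max() < 1)

# Gaussian noise lands on both sides of the original signal
residual = y_centered - y
print(residual.min(), residual.max())
```

Passing a fixed `seed` makes the "random" values repeatable, which is handy when you want the same plot every time you rerun the cell.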

What happens if you change the `magnitude` argument in the `add_noise` function?

In [None]:
# Your answer here

Now I am going to get tricky here and model annual temperature as a sine wave for us to play with. 

In [None]:
def annual_temperature(day, amplitude, avg_temp, hottest_day_of_year):
    """
    Here's some docs for this function:
    This simulates annual temperature! You can change these parameters:

    amplitude = 10  # Half the annual temperature range, in C (the curve swings +/- amplitude around avg_temp)
    avg_temp = 15  # Annual average temperature, in C
    hottest_day_of_year = 201  # July 20th is the hottest day of the year
    """
    return amplitude * np.sin((2 * np.pi / 365) * (day - hottest_day_of_year + 365/4)) + avg_temp

Let's spin this neat "model" up!

In [None]:
day = np.arange(1,366,1) # array of days of the year (1 through 365)

amplitude = 10  # Half the annual temperature range, in C
avg_temp = 15  # Annual average temperature, in C
hottest_day_of_year = 201  # July 20th is the hottest day of the year

temperature_data = annual_temperature(day, amplitude, avg_temp, hottest_day_of_year)


Now you have a big long array of temperatures!

In [None]:
temperature_data
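Before building on a model like this, it can help to check that it behaves the way the parameters say it should. The sketch below repeats the same sine function so that it runs on its own, then confirms where the peak falls and how far the curve swings. It is a sanity check under the parameter values used above, nothing more.

```python
import numpy as np

def annual_temperature(day, amplitude, avg_temp, hottest_day_of_year):
    # Same sine model as above, repeated so this cell runs on its own
    return amplitude * np.sin((2 * np.pi / 365) * (day - hottest_day_of_year + 365 / 4)) + avg_temp

day = np.arange(1, 366, 1)
temperature_data = annual_temperature(day, 10, 15, 201)

# The warmest value should fall on the day we asked for
print(day[np.argmax(temperature_data)])  # 201

# The curve swings `amplitude` degrees above and below the average,
# so the full annual range is about 2 * amplitude
print(temperature_data.max(), temperature_data.min())

# Averaged over a full year, the sine term cancels out
print(temperature_data.mean())
```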

## Mini-assignment

Using the examples above, let's make the synthetic temperature data a little more realistic by introducing random noise to the daily temperature. Generate a `daily_temperature` array with realistic-looking variability!

In [None]:
# Your code here

And now, generate a year's worth of random precipitation data. Get creative - is there a seasonality to your synthetic data?

In [None]:
# Your code here
precipitation = add_noise(np.ones_like(day), 3)

In [None]:
import matplotlib.pyplot as plt

plt.plot(day, temperature_data)

## Mini-assignment

Create a [scatter](https://matplotlib.org/stable/api/_as_gen/matplotlib.pyplot.scatter.html) plot of your realistic temperature data (with the noise) over time, with the points colored by precipitation. `plt.scatter` accepts a `c` keyword argument that colors each data point by a third variable, which makes for data-rich plots. When you do that, add a [colorbar](https://matplotlib.org/stable/api/_as_gen/matplotlib.pyplot.colorbar.html) so your viewers know what the colors mean.
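To show the idea of the `c` keyword without giving away the exercise, here is a generic sketch on unrelated, made-up data. The names `x_demo`, `y_demo`, and `z_demo` and the choice of the `viridis` colormap are assumptions for illustration only.

```python
import numpy as np
import matplotlib.pyplot as plt

# Made-up data purely for illustration (not the exercise data)
rng = np.random.default_rng(0)
x_demo = np.arange(0, 50, 1)
y_demo = rng.random(50)
z_demo = x_demo * 0.1  # third variable, shown as color

# `c` takes one value per point; `cmap` picks the color scale
points = plt.scatter(x_demo, y_demo, c=z_demo, cmap="viridis")

# Passing the scatter result tells colorbar which colors to describe
plt.colorbar(points, label="z_demo (arbitrary units)")
plt.xlabel("x_demo")
plt.ylabel("y_demo")
```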

I'll get you started:

In [None]:
# Specify the "c" keyword to add precipitation data as colors!
plt.scatter(???, ???, c=???)


# Call the "colorbar()" function to add a colorbar!
plt.colorbar(label="precipitation")

# Set the title of the plot
plt.title('??')

# Label the x-axis
plt.xlabel('??')

# Label the y-axis
plt.ylabel('???')

## Mini-assignment

Make two separate plots for the simple and noisy temperature data. 

In [None]:
# Create a fig and ax object
fig, ax = plt.subplots(1,2, figsize=(10,6), sharey=True)

# Now you have an ax array with two Axes objects in it
# ax[0] is the zeroth (first) element, ax[1] is the second element

# Plot something on the ax object
ax[0].plot(day, ???,  color='???', marker='o', linestyle='-', markersize=4)

# Plot something on the ax object
ax[1].plot(day, ???,  color='???', marker='o', linestyle='-', markersize=4)

# Set the title of each axis

# Label the x-axes

# Label the y-axis
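If indexing into `ax` feels unfamiliar, the sketch below shows the pattern on made-up data: `plt.subplots(1, 2)` returns a NumPy array of two Axes objects, and each one has its own `set_title`, `set_xlabel`, and `set_ylabel` methods. The variable names and labels here are placeholders, not the answer to the exercise.

```python
import numpy as np
import matplotlib.pyplot as plt

# Made-up data purely for illustration
t = np.arange(0, 10, 1)

# One row, two columns: `axes` is a NumPy array holding two Axes objects
fig, axes = plt.subplots(1, 2, figsize=(10, 4), sharey=True)
print(axes.shape)  # (2,)

# Each element is styled independently
axes[0].plot(t, t, marker="o")
axes[0].set_title("Left panel")
axes[0].set_xlabel("t")
axes[0].set_ylabel("value")  # with sharey=True, one y label is usually enough

axes[1].plot(t, t[::-1], marker="o")
axes[1].set_title("Right panel")
axes[1].set_xlabel("t")
```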


In [None]:
# Note that they can be displayed on the same plot
# It just depends on what you want to show

# Create a fig and ax object
fig, ax = plt.subplots(figsize=(10, 6))

# Plot something on the ax object
ax.plot(day, ???, color='???', marker='o', linestyle='-', markersize=4)
ax.plot(day, ???, color='???', marker='o', linestyle='-', markersize=4)