# Outcomes

By the end of this notebook, you will be able to...
* Calculate statistical features of a data set in Python.
* Produce new data sets from an existing data set in Python.
* Add a new data set to a database.
* Use basic mathematical operations in Python.
* Use existing functions in Python.

# Read in our data

First let's import some data. This code cell is a copy-paste of what we did in CIT 1.1, with the URL changed to https://docs.google.com/spreadsheets/d/1Z4pTRxwg1ZCAcaN5DNRuL4oLCR1MsD29gHv8toSZBKY/edit?usp=sharing, where we have some position, velocity, and acceleration data for a 500-gram cart going up and down a ramp inclined by $6^\circ$.

In [None]:
import pandas as pd
import matplotlib.pyplot as plt
# Load the shared Google Sheet into a pandas DataFrame
database = pd.read_excel("https://docs.google.com/spreadsheets/d/1Z4pTRxwg1ZCAcaN5DNRuL4oLCR1MsD29gHv8toSZBKY/export")
# Make one scatter plot per quantity, each with time on the horizontal axis
database.plot.scatter('time','position')
database.plot.scatter('time','velocity')
database.plot.scatter('time','acceleration')

# Getting statistics of your data

You're probably accustomed to using Excel to calculate various statistics (average, standard deviation, etc.) of lab data. We can perform all these operations using the `numpy` library, which we'll import in the code cell below. 

The code cell below currently prints the average of our database's acceleration values using the `np.average()` function. The function accepts an array of numbers as an input and outputs the average of those numbers. Add code to use each of the following functions with our acceleration values as an input. Run the code after each addition.

1. `np.std()` This calculates the standard deviation. 
2. `np.max()` This identifies the maximum value.
3. `np.min()` This identifies the minimum value. 
4. How might you set up a line of code to calculate the ratio of the minimum value to the maximum value (min / max)?
5. How might you set up a line of code to calculate the percent deviation (standard deviation / average * 100)?
6. How could you use these statistics in a lab activity to help students evaluate whether the acceleration of the cart was constant?

In [None]:
import numpy as np
print( np.average(database['acceleration']) )






In [None]:
#@title Click to see solution
print( np.std(database['acceleration']) )
print( np.max(database['acceleration']) )
print( np.min(database['acceleration']) )
max_a = np.max(database['acceleration'])
min_a = np.min(database['acceleration'])
ratio = min_a / max_a
print(ratio)
average = np.average(database['acceleration'])
standard_deviation = np.std(database['acceleration'])
print(standard_deviation / average * 100)
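
To see how these statistics behave before applying them to the cart data, here is a minimal sketch on a small made-up array. The name `sample_a` and its values are assumptions for illustration only; they are not taken from the spreadsheet.

```python
import numpy as np

# Hypothetical acceleration readings in m/s^2 (made up for illustration)
sample_a = np.array([1.0, 1.1, 0.9, 1.0])

average = np.average(sample_a)             # mean of the readings
standard_deviation = np.std(sample_a)      # spread of the readings about the mean
ratio = np.min(sample_a) / np.max(sample_a)            # min / max
percent_deviation = standard_deviation / average * 100  # spread as a percentage of the mean

print(average, standard_deviation, ratio, percent_deviation)
```

For readings that are nearly constant, the min/max ratio tends to sit close to 1 and the percent deviation tends to be small, which is one way these numbers could support a discussion of whether a quantity stayed constant.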

# Generating New Data

Often we need to calculate new information from an existing data set. For example, our position data begins at an arbitrary point along the track. Suppose we wanted to **offset** the position data by the initial position of $0.039$. Python can take care of this in one line of code. If we enter `database['displacement'] = database['position']-number`, then two things happen:
1. Python takes each element in the array `database['position']` and **subtracts** `number` from it.
2. Python **stores** these offset values in a **new array** called `database['displacement']`.
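
The two steps above can be sketched on a tiny made-up table. The DataFrame `demo`, its values, and the variable `offset` are hypothetical stand-ins for the notebook's `database` and `number`.

```python
import pandas as pd

# Hypothetical table standing in for the notebook's `database`
demo = pd.DataFrame({
    'time': [0.0, 0.1, 0.2],
    'position': [0.039, 0.050, 0.070],
})

offset = 0.039  # the value to subtract from every position

# Step 1: subtract `offset` from each element of the 'position' column.
# Step 2: store the results in a new column called 'displacement'.
demo['displacement'] = demo['position'] - offset

print(demo)
```

The original `position` column is left unchanged; only the new `displacement` column holds the shifted values.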

We've implemented this process in the code cell below. Add a `print()` function to line 2 to show the contents of the new array `database['displacement']`. Did the offset work? Add a command to line 3 to plot `database['displacement']` versus `database['time']` to see.

In [None]:
database['displacement'] = database['position']-0.039


Now let's try making more arrays to help us explore what's going on in this experiment. Add code to the code cell above to carry out each of the following. You can add new lines to the code cell with the Enter key, just like in a document.

1. Calculate the **kinetic energy** of the cart ($K = \frac{1}{2} m v^2$). You can square the velocity data by using `database['velocity']**2`. Note that `**` is Python's way of raising a number to a power. You'll also need the multiplication operator `*` and the division operator `/`. Store these kinetic energy values in a new array called `database['kinetic energy']`. Make a graph of kinetic energy versus time.
2. Calculate the **height** of the cart by multiplying the displacement array by the sine of the ramp angle. (You can access the sine function using `np.sin(angle)`, with `angle` replaced by the ramp angle in radians.) Store these height values in a new array called `database['height']`. Make a graph of the height versus time.
3. Calculate the **potential energy** of the cart ($U = mgh$). Store these potential energy values in a new array called `database['potential energy']`. Make a graph of the potential energy versus time.
4. Now calculate and graph the total energy. What do you notice about the total energy when comparing it to the kinetic and potential energy?
5. You could accomplish all of these tasks in a spreadsheet. In what ways do you think it is easier to carry out these tasks in Python? In what ways do you find it more challenging?

[Add your answers here]
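
If the operators and the sine function in tasks 1 and 2 are unfamiliar, the sketch below tries them on made-up numbers first. The DataFrame `demo` and its values are hypothetical; the mass and ramp angle come from the experiment description, and `np.radians()` is used here as one way to convert the angle from degrees to radians.

```python
import numpy as np
import pandas as pd

mass = 0.5              # kg (the 500-gram cart)
angle = np.radians(6)   # ramp angle of 6 degrees, converted to radians

# Hypothetical velocity (m/s) and displacement (m) values for illustration
demo = pd.DataFrame({
    'velocity': [0.4, 0.2, 0.0],
    'displacement': [0.0, 0.03, 0.04],
})

# K = (1/2) m v^2, using /, * and ** on the whole column at once
demo['kinetic energy'] = 1 / 2 * mass * demo['velocity']**2

# Height is the displacement along the ramp times the sine of the ramp angle
demo['height'] = demo['displacement'] * np.sin(angle)

print(demo)
```

The same pattern should carry over to the real data by replacing `demo` with `database`.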

# References

This notebook is based on Brian Lane's [LetsCodePhysics tutorials](https://www.youtube.com/c/LetsCodePhysics/featured) and Adam LaMee's [CODINGinK12 tutorials](https://adamlamee.github.io/CODINGinK12/).