# Mathematical Operations on Pandas Frames
***

If you haven't read the guides to [Pandas](0%20Quick%20Start%20Guide%20-%20Pandas.ipynb) and [NumPy](../2%20Python%20Basics/1%20Quick%20Start%20Guide%20-%20NumPy.ipynb), it would be useful to do so now! We will be using lots of mathematical operations from **NumPy** on pandas data frames.

First, let's load in some test data.

In [9]:
import numpy  as np                                # Import numpy for the math functions
import pandas as pd                                # Import pandas
data  = pd.read_csv("../data/pandas-masses.csv")   # Load in the csv file
data  = data[data["Object"] == "golf ball"].copy() # This data frame has other junk too, so keep just the golf balls (.copy() avoids a possible SettingWithCopyWarning when we add a column later)
data.head(2)                                       # Display the top 2 rows

Unnamed: 0,Object,Trial,Volume (cm3),dV (cm3),Mass (g),dM (g),Extra notes
0,golf ball,1.0,41.0,1.0,62.0,1.0,measured
1,golf ball,2.0,39.0,1.0,58.0,1.0,measured


## Simple NumPy operations on a column
***

Usually, you don't need to apply a math operation to an entire data frame (although you can). Instead, you [choose a column (see the selection guide!)](2%20Indexing%20%26%20Slicing.ipynb) and apply the operation to that column.

Some operations you can use are:
* `np.power(column_name, x)` : raises every value to the power of `x`
* `np.sin()`, `np.cos()`, ... : a whole suite of trig operations
* `np.exp()`, `np.log()`, `np.log10()`, ... : exponentials and logarithms (`np.log()` is the natural log)
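To make the list above concrete, here is a minimal sketch of these functions applied element-wise to a column. The series `values` is a made-up example and is not taken from `pandas-masses.csv`.

```python
import numpy as np
import pandas as pd

# Hypothetical example column, not taken from pandas-masses.csv
values = pd.Series([1.0, 4.0, 9.0], name="x")

squares = np.power(values, 2)      # element-wise x**2
roots   = np.power(values, 0.5)    # element-wise square root
logs    = np.log10(values)         # element-wise base-10 logarithm

print(type(squares))               # the result is still a pandas Series
print(squares.tolist())
print(roots.tolist())
print(logs.tolist())
```

Each call returns a new series with the same index as the input, so the result can be assigned straight back to a data frame column.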

For example, suppose we want the radius of each object in the data frame (assuming it's a sphere). We can calculate it using $R = \sqrt[3]{\frac{3V}{4\pi}}$ and **save it to the data frame**.

In [10]:
# First, save the volumes column to a new variable for simplicity
volumes = data["Volume (cm3)"]

# Now, calculate the radii using the formula
radii   = np.power( 3*volumes / (4*np.pi) , 1/3)

# Save the radii to a new column in the data frame, called Radius (cm)
data["Radius (cm)"] = radii

# Display the data frame again
data.head(2)

Unnamed: 0,Object,Trial,Volume (cm3),dV (cm3),Mass (g),dM (g),Extra notes,Radius (cm)
0,golf ball,1.0,41.0,1.0,62.0,1.0,measured,2.139103
1,golf ball,2.0,39.0,1.0,58.0,1.0,measured,2.10374


As you can see, the data frame now has a new column called **Radius (cm)**!
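As a quick sanity check on a calculation like this, you can invert the formula and confirm that you get the original volumes back. The sketch below rebuilds the two volumes shown in the output above by hand, and also compares against `np.cbrt()`, NumPy's cube-root function, as an alternative to `np.power(..., 1/3)`.

```python
import numpy as np
import pandas as pd

# The two volumes shown in the output above, in cm^3
volumes = pd.Series([41.0, 39.0], name="Volume (cm3)")

radii_power = np.power(3 * volumes / (4 * np.pi), 1 / 3)   # formula as written above
radii_cbrt  = np.cbrt(3 * volumes / (4 * np.pi))           # same formula using the cube-root function

# Invert R = (3V / 4pi)^(1/3) to recover V = (4/3) pi R^3
volumes_back = 4 / 3 * np.pi * radii_power**3

print(np.allclose(radii_power, radii_cbrt))    # do both approaches agree?
print(np.allclose(volumes_back, volumes))      # do we recover the original volumes?
```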

## Simple statistical operations
***

Pandas provides a few simple functions to calculate common statistics on a column (or on the entire data frame, if you want). You call them as `columnName.function()`, e.g. `radii.mean()`.

* `mean()` : the average
* `median()` : the middle value
* `std()` : the (**[unbiased](../6%20SciPy%20%26%20Statistics/3%20Unbiased%20STDEV.ipynb)**) standard deviation
* `sem()` : the (**[unbiased](../6%20SciPy%20%26%20Statistics/3%20Unbiased%20STDEV.ipynb)**) standard error, or standard deviation of the mean, given by $\sigma_{\bar{x}} = \sigma / \sqrt{N}$
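To see what "unbiased" means in practice, the sketch below compares `std()` and `sem()` against the same quantities computed by hand. The measurements in `x` are made up for illustration. Note that pandas divides by $N-1$ by default, while `np.std()` on a plain array divides by $N$.

```python
import numpy as np
import pandas as pd

# Hypothetical measurements, chosen only for illustration
x = pd.Series([2.0, 4.0, 4.0, 6.0])
n = x.count()                                   # number of non-missing values

std_pandas = x.std()                            # pandas default: ddof=1 (divide by N-1)
std_manual = np.sqrt(((x - x.mean())**2).sum() / (n - 1))
std_numpy  = np.std(x.to_numpy())               # NumPy default on an array: ddof=0 (divide by N)

sem_pandas = x.sem()                            # standard error of the mean
sem_manual = std_pandas / np.sqrt(n)            # sigma / sqrt(N), as in the formula above

print(std_pandas, std_manual, std_numpy)
print(sem_pandas, sem_manual)
```

The two pandas values match the hand calculations, and the NumPy value comes out slightly smaller because of the different divisor.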

For example, let's calculate the average radius of the objects in the data frame.

In [12]:
# Save the radii column to a new variable (not really necessary because we already did this above)
radii    = data["Radius (cm)"]

mean_r   = radii.mean()
median_r = radii.median()
std_r    = radii.std()
stdom_r  = radii.sem()

print("Mean radius = %2.2f +/- %2.2f cm" % (mean_r, stdom_r))
print("Standard Deviation = %2.2f cm" % std_r)
print("Median = %2.2f cm" % median_r)

Mean radius = 2.12 +/- 0.02 cm
Standard Deviation = 0.03 cm
Median = 2.12 cm
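The same functions also work on an entire data frame, returning one value per column. Here is a minimal sketch, using a small frame rebuilt by hand from the two rows displayed earlier so that it runs on its own. Passing `numeric_only=True` tells pandas to skip text columns such as **Object**.

```python
import pandas as pd

# The two rows shown earlier, rebuilt by hand so the example is self-contained
frame = pd.DataFrame({
    "Object":       ["golf ball", "golf ball"],
    "Volume (cm3)": [41.0, 39.0],
    "Mass (g)":     [62.0, 58.0],
})

# numeric_only=True skips text columns such as Object
column_means = frame.mean(numeric_only=True)
column_sems  = frame.sem(numeric_only=True)

print(column_means)
print(column_sems)
```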
