Basic Mathematical Operations
==============================

<ol> 
<li> <code>sum</code> and <code>cumsum</code>
<li> <code>prod</code> and <code>cumprod</code>
<li> the <code>diff</code> function
<li> Logarithms and Exponentials
<li> Rounding functions
</ol>


In [16]:
from __future__ import division
from __future__ import print_function
# Preliminaries
import scipy
import scipy.stats as stats
import sys
from pylab import *
from numpy import *
# End Imports


### 1.1 `sum` and `cumsum`
`sum` adds up the elements of an array. By default it sums every element in the array,
so the second argument is normally used to specify the axis: 0 to sum down
columns, 1 to sum across rows. 

`cumsum` produces the cumulative sum of the values in
the array, and is also usually used with the second argument to indicate the axis to use.

In [17]:
x = randn(3,4)
print(x)

[[-0.25600702  1.98814712 -0.34082453 -0.90064304]
 [ 2.37244575 -0.48624868 -1.37991357 -0.55191162]
 [-0.46229085 -0.73612159 -0.12632763 -0.49540312]]


In [18]:
print( sum(x) ) # all 12 elements

print( sum(x, 0) ) # Column sums (summing down the rows), 4 elements

print( sum(x, 1) ) # Row sums (summing across the columns), 3 elements

# Answers printed here in order

-1.37509875383
[ 1.65414789  0.76577685 -1.84706572 -1.94795778]
[ 0.49067254 -0.0456281  -1.82014319]


In [19]:
cumsum(x)

array([-0.25600702,  1.73214011,  1.39131558,  0.49067254,  2.86311829,
        2.37686962,  0.99695605,  0.44504444, -0.01724641, -0.753368  ,
       -0.87969563, -1.37509875])

In [20]:
cumsum(x,0) # Cumulative sum down each column

array([[-0.25600702,  1.98814712, -0.34082453, -0.90064304],
       [ 2.11643874,  1.50189845, -1.72073809, -1.45255466],
       [ 1.65414789,  0.76577685, -1.84706572, -1.94795778]])

`sum` and `cumsum` can both be used as functions or as methods. When used as methods,
the first argument is the axis, so `sum(x,0)` is the same as `x.sum(0)`.
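
The function and method forms can be compared directly. The sketch below is illustrative only: it uses a small fixed array instead of `randn` so the results can be checked by hand, and it imports NumPy as `np` (an assumption made here so that the Python built-in `sum` is not shadowed).

```python
import numpy as np

# A small fixed array so the results are easy to verify by hand
x = np.array([[1.0, 2.0, 3.0],
              [4.0, 5.0, 6.0]])

col_sums = np.sum(x, 0)       # function form: sum down each column -> 5, 7, 9
col_sums_m = x.sum(0)         # method form: same result
row_sums = x.sum(axis=1)      # sum across each row -> 6, 15

running = x.cumsum(0)         # cumulative sum down each column

print(col_sums)
print(row_sums)
print(running)
```

The last row of `x.cumsum(0)` holds the same values as `x.sum(0)`, since the cumulative sum ends at the full column total.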

### 1.2 `prod` and `cumprod`

 - `prod` and `cumprod` behave similarly to `sum` and `cumsum` except that the product and cumulative product are returned. 
 - `prod` and `cumprod` can be called as functions or methods.

In [21]:
print(prod(x))
print(prod(x,0))
print(prod(x,1))


0.00292335800859
[ 0.28077824  0.7116337  -0.05941294 -0.24625268]
[-0.15623701 -0.87856954  0.02129717]
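
The cell above only exercises `prod`. As a minimal sketch of `cumprod`, the example below uses a small integer-valued array (chosen here for illustration, not taken from the session above) and the `np` import alias, which is an assumption of this example.

```python
import numpy as np

x = np.array([[1.0, 2.0],
              [3.0, 4.0]])

total = np.prod(x)         # product of all elements: 1*2*3*4 = 24
col_prod = x.prod(0)       # product down each column: 1*3 and 2*4
flat_cum = np.cumprod(x)   # no axis: the array is flattened first -> 1, 2, 6, 24
col_cum = x.cumprod(0)     # cumulative product down each column

print(total)
print(col_prod)
print(flat_cum)
print(col_cum)
```

As with `cumsum`, calling `cumprod` without an axis works on the flattened array, and its final element equals `prod(x)`.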


### 1.3 the `diff` function

- `diff` computes the finite difference of a vector (or array) and returns an **n-1** element
vector when used on an n element vector. 
- `diff` operates on the last axis by default, and
so `diff(x)` operates across columns and returns `x[:,1:size(x,1)] - x[:,:size(x,1)-1]`
for a 2-dimensional array.
- `diff` takes an optional keyword argument `axis` so that `diff(x, axis=0)` will operate
across rows. 
- `diff` can also be used to produce higher order differences (e.g. double
difference) by passing the order as the second argument.
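
The relationship between `diff` and plain slicing can be checked directly. This is a small sketch on a fixed array; the values and the `np` alias are choices made for this example.

```python
import numpy as np

x = np.array([[1.0, 4.0, 9.0, 16.0],
              [2.0, 3.0, 5.0, 8.0]])

# Default: difference along the last axis (across columns)
d_last = np.diff(x)
manual_last = x[:, 1:] - x[:, :-1]   # each column minus the one before it

# axis=0: difference between consecutive rows
d_rows = np.diff(x, axis=0)
manual_rows = x[1:, :] - x[:-1, :]   # each row minus the one before it

print(d_last)   # shape (2, 3): one fewer column than x
print(d_rows)   # shape (1, 4): one fewer row than x
```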

In [22]:
diff(x) # Same as diff(x,1): the second positional argument is the order, not the axis

array([[ 2.24415414, -2.32897165, -0.55981851],
       [-2.85869443, -0.89366489,  0.82800195],
       [-0.27383075,  0.60979396, -0.36907549]])

In [23]:
diff(x,0)  # The second positional argument is the order, not the axis. Order 0 returns x unchanged.

array([[-0.25600702,  1.98814712, -0.34082453, -0.90064304],
       [ 2.37244575, -0.48624868, -1.37991357, -0.55191162],
       [-0.46229085, -0.73612159, -0.12632763, -0.49540312]])

In [24]:
diff(x, axis = 0)

array([[ 2.62845277, -2.4743958 , -1.03908904,  0.34873142],
       [-2.8347366 , -0.24987292,  1.25358594,  0.05650849]])

In [25]:
diff(x, 2, axis=0) # Double difference, column by column
# Same as diff(diff(x, axis=0), axis=0)

array([[-5.46318937,  2.22452288,  2.29267497, -0.29222293]])
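
To make the meaning of the second positional argument concrete, the sketch below compares a second-order difference with two nested first-order differences. The array values and the `np` alias are assumptions of this example.

```python
import numpy as np

x = np.array([[1.0, 2.0],
              [4.0, 8.0],
              [9.0, 18.0]])

# Second positional argument is the order n of the difference
second = np.diff(x, 2, axis=0)

# Applying a first-order difference twice along the same axis
nested = np.diff(np.diff(x, axis=0), axis=0)

# Order 0 leaves the array as it is
unchanged = np.diff(x, 0)

print(second)      # one row left: each differencing step removes a row
print(unchanged)
```

Each order of differencing removes one element along the chosen axis, so a 3-row array yields a single row after a second-order difference along `axis=0`.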

### 1.4 Some Mathematical Functions

- `exp`

exp returns the element-by-element exponential for an array.

- `log`

log returns the element-by-element natural logarithm (*ln(x)*) for an array.

- `log10`

log10 returns the element-by-element base-10 logarithm for an array.

In [26]:
print(exp(1))
print(exp(2))
print(log(5))
print(log10(5))

2.71828182846
7.38905609893
1.60943791243
0.698970004336
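
The cell above applies these functions to scalars. A short sketch of the element-by-element behaviour on an array follows; the input values and the `np` alias are chosen for this example.

```python
import numpy as np

v = np.array([1.0, 10.0, 100.0])

natural = np.log(v)          # natural logarithm of each element
base10 = np.log10(v)         # base-10 logarithm: approximately 0, 1, 2
roundtrip = np.exp(natural)  # exp undoes log, up to floating-point error

# Change of base: log10(v) = ln(v) / ln(10)
via_ln = np.log(v) / np.log(10.0)

print(natural)
print(base10)
print(roundtrip)
```

Because the results are floating-point values, comparisons such as `roundtrip` against `v` are best made with `np.allclose` instead of exact equality.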


### 1.5 Rounding

Functions: `around` and `round`
    
`around` rounds to the nearest integer, or to a particular decimal place when called with
two arguments.

In [27]:
x = randn(3)
x


array([ 0.48193971,  0.08574694,  0.40960621])

In [28]:
around(x)


array([ 0.,  0.,  0.])

In [29]:
around(x, 2)

array([ 0.48,  0.09,  0.41])

- `around` can also be used as a method on an ndarray, except that the method is
named `round`. 
- For example, `x.round(2)` is identical to `around(x, 2)`. 
- The change of names is needed to avoid conflicting with the Python built-in function `round`.
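
The equivalence of the function and method forms can be sketched as below. The example also shows how NumPy treats values exactly halfway between two integers, which are rounded to the nearest even value. The input values and the `np` alias are assumptions of this example.

```python
import numpy as np

x = np.array([0.5, 1.5, 2.5, 1.2345])

nearest = np.around(x)      # halfway cases go to the nearest even value: 0, 2, 2
two_dp = np.around(x, 2)    # function form, two decimal places
method = x.round(2)         # method form, named round

print(nearest)
print(two_dp)
print(np.array_equal(two_dp, method))
```

Note that the results are still floating-point arrays; `around` does not convert the values to an integer type.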

#### `floor` and `ceil`
- `floor` rounds down to the nearest integer less than or equal to the value.
- `ceil` rounds up to the nearest integer greater than or equal to the value (i.e. the ceiling).

In [30]:
x = randn(3)
print(x)

print(floor(x))
print(ceil(x))

[-0.09669392  0.94108466  0.11908116]
[-1.  0.  0.]
[-0.  1.  1.]
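
The behaviour on negative values is easier to see with fixed inputs. This sketch uses hand-picked values and the `np` alias, both assumptions of this example.

```python
import numpy as np

x = np.array([-1.5, -0.1, 0.0, 0.1, 1.5])

down = np.floor(x)   # toward negative infinity: -2, -1, 0, 0, 1
up = np.ceil(x)      # toward positive infinity: -1, -0, 0, 1, 2

print(down)
print(up)

# ceil of a small negative number is negative zero, which compares equal to 0
print(up[1] == 0.0, np.signbit(up[1]))
```

This explains the `-0.` in the output above: the ceiling of a value between -1 and 0 keeps its sign bit, but it is numerically equal to zero.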
