# A brief introduction to Numpy
[Numpy](http://numpy.scipy.org/) is the fundamental library for scientific computing in Python. It provides list-like objects that behave like arrays, matrices, and data tables, which is how scientists typically expect data to behave. Numpy also provides linear algebra, Fourier transforms, random number generation, and tools for integrating C/C++ and Fortran code.

If you primarily want to work with tables of data, [Pandas](pandas.ipynb), which depends on Numpy, is probably the module that you want to start with.

## Numpy Array Basics
#### Creating a Numpy array

In [None]:
!pip install numpy

In [None]:
import numpy as np

example_array = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
example_array

#### Indexing an array

In [None]:
example_array[1, 1]

#### Slicing an array

In [None]:
example_array[:, 0]

In [None]:
example_array[1, :]

In [None]:
example_array[1:3, 1:3]

#### Subsetting an array

In [None]:
array1 = np.array([1, 1, 1, 2, 2, 2, 1])
array2 = np.array([1, 2, 3, 4, 5, 6, 7])
array2[array1==1]

In [None]:
array3 = np.array(['a', 'a', 'a', 'b', 'b', 'b', 'b'])
array2[(array1==1) & (array3=='a')]
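
Subsetting works because a comparison such as ``array1==1`` is itself an array of True/False values, one per element. The following is a minimal sketch of that intermediate step; the names ``mask`` and ``selected`` are introduced here only for illustration.

```python
import numpy as np

array1 = np.array([1, 1, 1, 2, 2, 2, 1])
array2 = np.array([1, 2, 3, 4, 5, 6, 7])

# The comparison is evaluated elementwise and returns a boolean array.
mask = array1 == 1
print(mask.tolist())      # [True, True, True, False, False, False, True]

# Indexing with the mask keeps only the positions where the mask is True.
selected = array2[mask]
print(selected.tolist())  # [1, 2, 3, 7]
```

Masks are combined with ``&`` (and) or ``|`` (or), and each comparison needs its own parentheses because those operators bind more tightly than ``==``.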

## Math
### Arrays
Math on arrays is vectorized: operations are applied element by element, which is how most scientists would expect them to behave.

In [None]:
array1 = np.array([1, 1, 1, 2, 2, 2, 1])
array2 = np.array([1, 2, 3, 4, 5, 6, 7])

array1 * 2 + 1

In [None]:
array1 * array2
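
To make "vectorized" concrete, this small sketch compares the array expression with an explicit loop over plain Python lists. The names ``vectorized`` and ``looped`` are illustrative only.

```python
import numpy as np

array1 = np.array([1, 1, 1, 2, 2, 2, 1])
array2 = np.array([1, 2, 3, 4, 5, 6, 7])

# Vectorized: one expression, applied to every pair of elements.
vectorized = array1 * array2

# The same calculation written as an explicit loop over Python lists.
looped = [a * b for a, b in zip(array1.tolist(), array2.tolist())]

print(vectorized.tolist())  # [1, 2, 3, 8, 10, 12, 7]
print(looped)               # [1, 2, 3, 8, 10, 12, 7]
```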

### Linear algebra (matrices)
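Linear algebra is done here using a different data structure called a matrix, for which ``*`` means matrix multiplication rather than elementwise multiplication. Note that current Numpy documentation no longer recommends ``np.matrix`` and suggests using regular arrays instead.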

In [None]:
matrix1 = np.matrix([[1, 2, 3], [4, 5, 6]])
matrix2 = np.matrix([1, 2, 3])
matrix1 * matrix2.transpose()
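
As an alternative sketch, the same product can be computed with regular arrays and the ``@`` operator (Python 3.5 or later). The names ``a``, ``b``, and ``product`` are illustrative; the result is a one-dimensional array rather than a 2 x 1 matrix.

```python
import numpy as np

a = np.array([[1, 2, 3], [4, 5, 6]])  # shape (2, 3)
b = np.array([1, 2, 3])               # shape (3,)

# @ is matrix multiplication; * on regular arrays would be elementwise.
product = a @ b
print(product.tolist())  # [14, 32]
```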

## Importing and Exporting Data
The Numpy function ``genfromtxt`` is a powerful way to import text data.
It can use different delimiters, skip header rows, control the type of imported data, give columns of data names, and a number of other useful goodies. See the [documentation](http://docs.scipy.org/doc/numpy/reference/generated/numpy.genfromtxt.html) for a full list of features, or run ``help(np.genfromtxt)`` from the Python shell (after importing the module, of course).

### Basic Import and Export
#### Import
Basic imports using Numpy will treat all data as floats.
If we're doing a basic import we'll typically want to skip the header row (since it's generally not composed of numbers).

In [None]:
data = np.genfromtxt('./data/examp_data.txt', delimiter=',', skip_header=1)
data
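
The effect of treating everything as floats can be seen without a file on disk. This sketch reads from an in-memory string; the file contents are invented for illustration and are not the contents of ``examp_data.txt``.

```python
import io
import numpy as np

# Hypothetical file contents, made up for this example.
text = "site,species,mass\n1,DM,42\n2,PP,17\n"

# genfromtxt accepts file-like objects as well as file names.
data = np.genfromtxt(io.StringIO(text), delimiter=',', skip_header=1)

print(data.dtype)   # float64
print(data.shape)   # (2, 3)
# The text column cannot be parsed as a float, so it is read as nan.
print(np.isnan(data[:, 1]).all())   # True
```

If a column of text matters, a basic import is not enough, because its values are lost as ``nan``.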

#### Export

In [None]:
np.savetxt('./data/examp_output.txt', data, delimiter=',')
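
A quick way to check an export is to read the file back in. The sketch below writes a small made-up array to a temporary directory, adds a header row through the ``header`` and ``comments`` arguments of ``savetxt``, and re-imports it; the file name and column names are hypothetical.

```python
import os
import tempfile
import numpy as np

data = np.array([[1.0, 2.5], [3.0, 4.5]])

with tempfile.TemporaryDirectory() as tmpdir:
    path = os.path.join(tmpdir, 'roundtrip.txt')  # hypothetical file name
    # comments='' stops savetxt from prefixing the header line with '# '.
    np.savetxt(path, data, delimiter=',', header='a,b', comments='')
    reloaded = np.genfromtxt(path, delimiter=',', skip_header=1)

print(np.array_equal(data, reloaded))  # True
```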

### Importing Data Tables
Lots of scientific data comes in the form of tables, with one row per observation, and one column per thing observed.
Often the different columns have different types (including text).
The best way to work with this type of data is in a Structured Array.

#### Import
To do this we let Numpy automatically detect the data types in each column using the optional argument ``dtype=None``.
We can also use an existing header row as the names for the columns using the optional argument ``names=True``.

In [None]:
data = np.genfromtxt('./data/examp_data_species_mass.txt', dtype=None, names=True, delimiter=',')
data
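
The sketch below shows what a structured import looks like, using invented in-memory data with the column names ``site``, ``species``, and ``mass``. Passing ``encoding='utf-8'`` is an addition not used in the call above; it is included so that text columns are read as ordinary strings regardless of the Numpy version's default.

```python
import io
import numpy as np

# Hypothetical file contents, made up for this example.
text = "site,species,mass\n1,DM,42.5\n2,PP,17.0\n1,DM,40.1\n"

data = np.genfromtxt(io.StringIO(text), dtype=None, names=True,
                     delimiter=',', encoding='utf-8')

# Column names come from the header row.
print(data.dtype.names)       # ('site', 'species', 'mass')

# Each column keeps its own detected type.
print(data['site'].dtype.kind, data['species'].dtype.kind,
      data['mass'].dtype.kind)   # i U f
print(data['mass'].tolist())  # [42.5, 17.0, 40.1]
```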

#### Export
The easiest way to export a structured array is to treat it like a list of lists and write it out with the ``csv`` module, using a function like this.

In [None]:
import csv

def export_to_csv(data, filename):
    # Text mode with newline='' lets the csv module control line endings (Python 3).
    with open(filename, 'w', newline='') as outputfile:
        datawriter = csv.writer(outputfile)
        datawriter.writerows(data)
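
An export like this does not write the column names. One possible variation, sketched here with a made-up structured array and an in-memory buffer standing in for an open file, writes the field names as a header row first.

```python
import csv
import io
import numpy as np

# Hypothetical structured array, made up for this example.
data = np.array([(1, 'DM', 42.5), (2, 'PP', 17.0)],
                dtype=[('site', 'i8'), ('species', 'U2'), ('mass', 'f8')])

buffer = io.StringIO()  # stands in for a file opened with newline=''
writer = csv.writer(buffer)
writer.writerow(data.dtype.names)  # header row taken from the field names
writer.writerows(data.tolist())    # each record becomes a tuple of Python values

print(buffer.getvalue())
```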

## Structured Arrays
If we import data into a Structured Array we can do a lot of things that we often want to do with scientific data.

#### Selecting columns by name

In [None]:
data = np.genfromtxt('./data/examp_data_species_mass.txt', dtype=None, names=True, delimiter=',')
print(data)
data['species']

#### Subset columns based on the values in other columns

In [None]:
data['mass'][data['species'] == 'DM']

In [None]:
data['mass'][(data['species'] == 'DM') & (data['site'] == 1)]
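
The same subsetting can be tried on invented in-memory data, as in the sketch below. Depending on the Numpy version, text columns may be imported as bytes when no encoding is given, in which case a comparison with ``'DM'`` would not match; ``encoding='utf-8'`` is passed here as a precaution and is an assumption, not part of the calls above.

```python
import io
import numpy as np

# Hypothetical file contents, made up for this example.
text = ("site,species,mass\n"
        "1,DM,42.5\n"
        "2,PP,17.0\n"
        "1,DM,40.1\n"
        "2,DM,39.0\n")

data = np.genfromtxt(io.StringIO(text), dtype=None, names=True,
                     delimiter=',', encoding='utf-8')

# Masses of species DM at any site.
mass_dm = data['mass'][data['species'] == 'DM']
print(mass_dm.tolist())        # [42.5, 40.1, 39.0]

# Masses of species DM at site 1 only.
mass_dm_site1 = data['mass'][(data['species'] == 'DM') & (data['site'] == 1)]
print(mass_dm_site1.tolist())  # [42.5, 40.1]
```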

## Random number generation
#### Random uniform (0 to 1)

In [None]:
np.random.rand(3, 5)

#### Random normal (mean=0, stdev=1)

In [None]:
np.random.randn(4, 2)

#### Random integers

In [None]:
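low = 10   # lower bound (inclusive)
high = 20  # upper bound (exclusive); these names avoid shadowing the built-ins min and max
np.random.randint(low, high, [10, 2])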
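
For reproducible results, newer Numpy versions (1.17 and later) also offer a seeded generator object through ``np.random.default_rng``. This is a sketch of the three draws above using that interface; the seed value is arbitrary.

```python
import numpy as np

# A generator seeded with a fixed value produces the same numbers on every run.
rng = np.random.default_rng(seed=42)

uniform = rng.random((3, 5))                   # uniform on [0, 1)
normal = rng.standard_normal((4, 2))           # mean 0, standard deviation 1
integers = rng.integers(10, 20, size=(10, 2))  # 10 inclusive, 20 exclusive

print(uniform.shape, normal.shape, integers.shape)  # (3, 5) (4, 2) (10, 2)
```

Creating a second generator with the same seed and repeating the same calls in the same order gives identical arrays.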