In [1]:
import numpy as np
import pandas as pd
import string

### Creating a Series using a numpy array

Create a Series using `np.arange(10)`

0    0
1    1
2    2
3    3
4    4
5    5
6    6
7    7
8    8
9    9
dtype: int32

### Creating a Series using a dictionary

Using the `string` Python library, create a Series as follows:

```
A    Z
B    Y
C    X
D    W
E    V
F    U
G    T
H    S
I    R
J    Q
K    P
L    O
M    N
N    M
O    L
P    K
Q    J
R    I
S    H
T    G
U    F
V    E
W    D
X    C
Y    B
Z    A
```

In [None]:
# Your code goes here

In [12]:
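# Uppercase letters in alphabetical order: A, B, ..., Z
letters = list(string.ascii_uppercase)
# The same letters in reverse order: Z, Y, ..., A
letters_reversed = sorted(letters, reverse=True)
# Pair each letter with its mirror: {"A": "Z", "B": "Y", ...}
data = dict(zip(letters, letters_reversed))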

In [18]:
# Your code goes here

### Creating a DataFrame using Series objects

Using the examples above, create a DataFrame using three Series objects, one for each column in the table. 


||Order|?Vowel|Example of Fruit|
|-|-|-|-|
|A|1|True|Apple|
|B|2|False|Banana|
|C|3|False|Cantaloupe|

### Accessing specific cells, series, or slices of a dataframe

Access the `Order` Series in the `example` DataFrame to receive the following output:

```
A    1
B    2
C    3
Name: Order, dtype: object
```

In [19]:
# Your code goes here

Can you access row `A` by using `"A"` as the subscript, that is, by writing `example["A"]`?

In [20]:
# Your code goes here

To get all the columns from a particular row, you can use `DataFrame.loc[row_index, :]`. Get all of the columns from row `"A"`:

In [None]:
# Your code goes here

Order                   1
?Vowel               True
Example of Fruit    Apple
Name: A, dtype: object

The `:` symbol says "give me everything you've got".

You can provide a list of the columns that you want to access: `DataFrame[[col1, col2, ...]]`. Get all rows of the `"Order"` and `"?Vowel"` columns:

In [20]:
# Your code goes here

You can also select a range of rows together with a column: `DataFrame.loc[row_start:row_end, column]`. Unlike ordinary Python slices, label slices in `.loc` include `row_end`. Get rows `"A"` and `"B"` from column `"Order"`:

In [21]:
# Your code goes here

The `.iloc` indexer selects by integer position rather than by label. This can be useful, but it is easy to confuse with `.loc` when the index labels aren't integers, so beware! To access all columns of the 0th row: `DataFrame.iloc[0,:]`. Access all columns of the 0th row for the `example` DataFrame:

In [22]:
# Your code goes here
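
As a rough illustration of how `.loc` and `.iloc` differ, here is a minimal sketch on a made-up DataFrame. The name `toy` and its columns are assumptions for this example only and are not the `example` DataFrame from the exercises.

```python
import pandas as pd

# Hypothetical frame, used only for illustration (not the exercise's `example`).
toy = pd.DataFrame(
    {"count": [10, 20, 30], "flag": [True, False, True]},
    index=["x", "y", "z"],
)

# .loc slices by label and includes BOTH endpoints.
by_label = toy.loc["x":"y", "count"]

# .iloc slices by position and excludes the stop position, like a Python list.
by_position = toy.iloc[0:2, 0]

print(by_label.tolist())     # [10, 20]
print(by_position.tolist())  # [10, 20]
```

Both selections return the same two rows here, but the first is written in terms of labels and the second in terms of positions.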

### Boolean Indexing

This is only a quick introduction. Because dataframes are built on numpy arrays, you can pass a boolean array as the subscript (the value inside the square brackets) and get back just the rows you need.

The `example` DataFrame has three rows. You can access a given row by passing a list with `True` at the corresponding position.

In [None]:
# Access the first row only
example[[True, False, False]]

||Order|?Vowel|Example of Fruit|
|-|-|-|-|
|A|1|True|Apple|


In [23]:
# Access the second row only


In [24]:
# Access the third row only


In [25]:
# Access all rows


In [26]:
# Access the first two rows


You can also use the values in a Series object for Boolean indexing. In the following example, I access all rows where the `?Vowel` Series has a `True` value.

In [None]:
example[example["?Vowel"]]

||Order|?Vowel|Example of Fruit|
|-|-|-|-|
|A|1|True|Apple|


The function `np.logical_not` flips every `True` to `False` and vice versa, so it can be used to access rows where `example["?Vowel"]` is `False`.

In [27]:
# Access all rows where the ?Vowel Series has a False value
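
For reference, here is a small sketch of inverting a Boolean Series on made-up data. The Series `is_even` is an assumption for this example only. The `~` operator is shown next to `np.logical_not` because the two give the same result on Boolean data.

```python
import numpy as np
import pandas as pd

# Hypothetical Boolean Series, used only for illustration.
is_even = pd.Series([True, False, True], index=["a", "b", "c"])

flipped_np = np.logical_not(is_even)  # NumPy function form
flipped_op = ~is_even                 # inversion operator form

print(flipped_np.tolist())            # [False, True, False]
print(flipped_np.equals(flipped_op))  # True
```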


The boolean array must have the same length as the dataframe (that is, one entry per row), so that every row is marked as either in or out. Here's a simple example using a numpy array to illustrate.

In [None]:
arr = np.arange(10)
arr

array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])

In [None]:
# Is each element divisible by 2?
arr%2==0

array([ True, False,  True, False,  True, False,  True, False,  True,
       False])

In [None]:
# Access all elements divisible by 2
arr[arr%2==0]

array([0, 2, 4, 6, 8])
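
Two follow-up points can be sketched with the same kind of array: conditions can be combined into one mask, and a mask of the wrong length is rejected. The names `mask` and `short_mask` are assumptions introduced for this sketch.

```python
import numpy as np

arr = np.arange(10)

# Combine two conditions with & (and) or | (or); the parentheses are required.
mask = (arr % 2 == 0) & (arr > 3)
print(arr[mask])  # [4 6 8]

# A mask whose length does not match the array raises an error.
short_mask = np.array([True, False, True])
try:
    arr[short_mask]
    mismatch_rejected = False
except IndexError as err:
    mismatch_rejected = True
    print("IndexError:", err)
```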

# Bare Minimum Introduction to Scientific Computing

## Numpy

Numpy is a comprehensive mathematical library ("numerical python") with matrix math and all kinds of goodies.  The standard way to import numpy is

`import numpy as np`

It's so common that all the searches you do on Stack Overflow will refer to numpy as 'np'.

### ndarrays

The basic unit of numpy is the ndarray, that is, the _n-dimensional_ array. You can construct one from a single value, as an empty array, as an array filled with one repeated value, or from a list, a list of lists, a list of lists of lists, and so on. You index into multiple dimensions with a tuple, one entry per dimension.

In [None]:
a = [[1,2,3],[4,5,6]]
arr = np.array(a)
arr

array([[1, 2, 3],
       [4, 5, 6]])

In [None]:
arr[0,2]

3

That's the first row, third element in that row (or third column if you prefer).

This continues if you have 3 or more dimensions.  3 dimensions are common for image data.  In most libraries that load images as numpy arrays, the first indexer is the row (the y location of a pixel), the second indexer is the column (the x location), and the third indexer indicates what layer you're on.  A typical CMYK image will have the shape (h,w,4).
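
To make three-dimensional indexing concrete, here is a minimal sketch with a tiny made-up array that has four layers per pixel. The name `img` and the sizes are assumptions for illustration, not real image data.

```python
import numpy as np

# Hypothetical tiny "image": 2 x 3 pixels with 4 layers per pixel.
img = np.zeros((2, 3, 4))

# Two indexers pick one pixel; assign all four of its layer values at once.
img[1, 2] = [0.1, 0.2, 0.3, 0.4]

print(img.shape)           # (2, 3, 4)
print(img[1, 2, 3])        # 0.4
print(img[:, :, 0].shape)  # (2, 3)
```

The last line slices out a single layer, which is an ordinary two-dimensional array.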



#### Useful attributes of the numpy array

In [None]:
arr.shape

(2, 3)

In [None]:
arr.max()

6

In [None]:
arr.min()

1

In [None]:
arr.mean()

3.5

In [None]:
arr.mean(axis=1) #This is the mean of each row.

array([2., 5.])

In [None]:
arr.mean(axis=0) #This is the mean of each column.

array([2.5, 3.5, 4.5])

Another helpful feature of numpy is that you can do arithmetic operations on arrays as if they were just single values.

In [None]:
arr + 1

array([[2, 3, 4],
       [5, 6, 7]])

In [None]:
arr**2

array([[ 1,  4,  9],
       [16, 25, 36]], dtype=int32)

If you have two arrays of the same shape, you can do element-wise calculations on them.

In [None]:
arr2 = np.array([[3,2,1],[7,8,9]])
arr+arr2

array([[ 4,  4,  4],
       [11, 13, 15]])

In [None]:
arr*arr2

array([[ 3,  4,  3],
       [28, 40, 54]])

If you want to do matrix math, however, you need to call functions that specify that.  The default for numpy is to treat the values as individual elements, not whole matrices.

In [None]:
a = np.array([2,3,5])
b = np.array([7,11,13])
a*b #element-wise each value multiplied by the value with the same index in the second array

array([14, 33, 65])

In [None]:
a.dot(b) #dot (inner) product of two 1-D vectors: 2*7 + 3*11 + 5*13

112
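
For two-dimensional arrays, matrix multiplication can be sketched with the `@` operator or `np.matmul`. The arrays below repeat the values of `arr` and `arr2` from earlier so the block runs on its own. The transpose is needed only because both arrays have shape (2, 3).

```python
import numpy as np

arr = np.array([[1, 2, 3], [4, 5, 6]])   # shape (2, 3)
arr2 = np.array([[3, 2, 1], [7, 8, 9]])  # shape (2, 3)

# (2, 3) @ (3, 2) -> (2, 2); arr2.T is the transpose of arr2.
product = arr @ arr2.T
print(product)
# [[ 10  50]
#  [ 28 122]]

# np.matmul is the function form of the @ operator.
print(np.array_equal(product, np.matmul(arr, arr2.T)))  # True
```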

### Functions I use a lot

#### np.zeros and np.ones



Numpy arrays have a fixed size.  Their elements can be changed in place, but growing an array (for example with `np.append`) copies everything into a new array, so if you're creating data on the fly, you're continually creating and trashing arrays.  It's often good practice to create the container that your data are going into ahead of time.  Creating an array of zeros and then filling it in is faster than repeatedly appending (if you don't know what size your data is going to be ahead of time, build a list of lists and then convert it to a numpy array).

An easy way to do that is np.zeros.  np.zeros takes a shape tuple and gives you a numpy array with those dimensions, initialized with the zero value that's appropriate for the data type (the default is float).

In [None]:
np.zeros((4,4))

array([[0., 0., 0., 0.],
       [0., 0., 0., 0.],
       [0., 0., 0., 0.],
       [0., 0., 0., 0.]])

Now you have a 4x4 matrix that you can put values in.
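
Here is a minimal sketch of both approaches: filling a preallocated array, and building a list that is converted once at the end. The names `table`, `rows`, and `as_array`, and the values stored, are assumptions chosen for illustration.

```python
import numpy as np

n = 4
table = np.zeros((n, n))  # allocate once

# Fill the array in place; no new array is created inside the loop.
for i in range(n):
    for j in range(n):
        table[i, j] = i * j

print(table[3])  # [0. 3. 6. 9.]

# When the final size is unknown, collect rows in a list and convert once.
rows = []
for i in range(3):
    rows.append([i, i ** 2])
as_array = np.array(rows)
print(as_array.shape)  # (3, 2)
```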

np.zeros also takes a `dtype` keyword argument for the data type.

In [None]:
np.zeros((4,4),dtype=bool)

array([[False, False, False, False],
       [False, False, False, False],
       [False, False, False, False],
       [False, False, False, False]])

Related function:  np.zeros_like

np.zeros_like takes another array as its parameter and makes you an array of zeros with the same shape and data type as _that_ array.

In [None]:
np.zeros_like(arr)

array([[0, 0, 0],
       [0, 0, 0]])

There is also an np.ones function that does everything np.zeros does but fills with, you guessed it, 1.  This really only looks even a tiny bit unexpected when you use a boolean data type, so I'll show that.

In [None]:
np.ones((4,4),dtype=bool)

array([[ True,  True,  True,  True],
       [ True,  True,  True,  True],
       [ True,  True,  True,  True],
       [ True,  True,  True,  True]])