# NumPy

## By Scott Parr

<a id="0"></a> <br>
 # Contents  
 
 1. [NumPy Basics](#1)
 1. [NumPy vs Python Arrays](#2)
 1. [Creating Arrays from Scratch](#3)
 1. [Multidimensional Arrays](#4)
 1. [Indexing Arrays](#5)
 1. [Slicing Arrays](#6)
 1. [Multi-Dimensional Subarrays](#7)
 1. [Subarrays as No-Copy Views](#8)
 1. [Reshaping Arrays](#9)
 1. [NumPy Aggregations](#10)
 1. [Concatenating and Splitting Arrays](#11)
 1. [UFuncs](#12)
 1. [Fancy Indexing](#13)
 1. [Sorting Arrays](#14)
 1. [Dictionaries](#15)
 1. [NumPy and Structured Arrays](#16)


[back to top](#Contents)

<a id="1"></a> 
# 1.  NumPy Basics

NumPy, short for "Numerical Python", is the fundamental package for scientific computing with Python.

It is a library built on top of Python.

Learning resources: numpy.org/learn/

### More than they appear

Python objects contain more information than initially appears.

For example, even a plain integer carries a full set of methods and documentation:

In [6]:
import numpy as np

In [7]:
# even a plain integer stores a lot of information, which the built-in help() function displays
x = 1000
help(x)

Help on int object:

class int(object)
 |  int([x]) -> integer
 |  int(x, base=10) -> integer
 |  
 |  Convert a number or string to an integer, or return 0 if no arguments
 |  are given.  If x is a number, return x.__int__().  For floating point
 |  numbers, this truncates towards zero.
 |  
 |  If x is not a number or if base is given, then x must be a string,
 |  bytes, or bytearray instance representing an integer literal in the
 |  given base.  The literal can be preceded by '+' or '-' and be surrounded
 |  by whitespace.  The base defaults to 10.  Valid bases are 0 and 2-36.
 |  Base 0 means to interpret the base from the string as an integer literal.
 |  >>> int('0b100', base=0)
 |  4
 |  
 |  Built-in subclasses:
 |      bool
 |  
 |  Methods defined here:
 |  
 |  __abs__(self, /)
 |      abs(self)
 |  
 |  __add__(self, value, /)
 |      Return self+value.
 |  
 |  __and__(self, value, /)
 |      Return self&value.
 |  
 |  __bool__(self, /)
 |      True if self else False
 |

In [8]:
# as we saw, lists can hold a single type (all ints, all strings) or a mix of types

L = list(range(10))
print(L)
type(L)

[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]


list

In [9]:
L2 = [str(c) for c in L]
print(L2)
type(L2[0])

['0', '1', '2', '3', '4', '5', '6', '7', '8', '9']


str

In [10]:
L3 = [True, "2", 3.0, 4]
[type(i) for i in L3]

[bool, str, float, int]

[back to top](#Contents)

<a id="2"></a> 

# 2. NumPy vs Python Arrays

Python has its own fixed-type arrays, provided by the built-in array module.

Both Python arrays and NumPy arrays can be created from lists.

Python arrays are created with the array module's array() constructor, which takes a type code (such as 'i' for integer) and the data.

NumPy arrays can be n-dimensional.

A NumPy array holds only one data type. If the input mixes data types, the values are "upcast", or converted to the most flexible data type present.

Notice the decimals introduced in npa4: mixing a float with integers upcasts every value to float.

npa3 mixes a string with numbers, so every value is upcast to a string (dtype '<U32').

We can also set the data type manually with the dtype argument.

We can also create arrays within arrays, called multidimensional arrays.

In [11]:
# this imports Python array

import array

L = list(range(10))
A = array.array('i', L)
A

array('i', [0, 1, 2, 3, 4, 5, 6, 7, 8, 9])

In [12]:
# this is a numpy array

npa = np.array(L)
npa

array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])

In [13]:
npa2 = np.array([1, 4, 2 ,5 ,3])
npa2

array([1, 4, 2, 5, 3])

In [14]:
npa4 = np.array([4.1, 3, 6])
npa4

array([4.1, 3. , 6. ])

In [15]:
npa3 = np.array(["a", 1, 3.2])
npa3

array(['a', '1', '3.2'], dtype='<U32')

In [16]:
np.array([1, 2, 3, 4], dtype = 'float')

array([1., 2., 3., 4.])

In [17]:
z = np.array([[1, 2, 3], [4, 5, 6]])
print(z)

[[1 2 3]
 [4 5 6]]
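To check which type an array was upcast to, one option is to inspect its dtype attribute. The sketch below rebuilds arrays like the ones above; the variable names are illustrative, and the exact integer type depends on the platform.

```python
import numpy as np

ints = np.array([1, 4, 2, 5, 3])     # all integers
floats = np.array([4.1, 3, 6])       # one float upcasts every value to float
strings = np.array(["a", 1, 3.2])    # one string upcasts every value to string

print(ints.dtype)     # an integer type (the exact size depends on the platform)
print(floats.dtype)   # float64
print(strings.dtype)  # a Unicode string type
```

The dtype.kind attribute gives a one-letter summary: 'i' for integer, 'f' for float, 'U' for Unicode string.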


[back to top](#Contents)

<a id="3"></a> 
# 3. Creating Arrays from Scratch

There are many ways to create NumPy arrays from scratch.  Here are some examples:

np.zeros()

np.ones()

np.full()

np.arange()

np.linspace()

np.random.random()

np.eye()




In [18]:
np.zeros(10, dtype=int)

array([0, 0, 0, 0, 0, 0, 0, 0, 0, 0])

In [19]:
np.ones((3, 5), dtype=float)

array([[1., 1., 1., 1., 1.],
       [1., 1., 1., 1., 1.],
       [1., 1., 1., 1., 1.]])

In [20]:
np.full((3, 5), 3.1415)

array([[3.1415, 3.1415, 3.1415, 3.1415, 3.1415],
       [3.1415, 3.1415, 3.1415, 3.1415, 3.1415],
       [3.1415, 3.1415, 3.1415, 3.1415, 3.1415]])
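The cells above cover np.zeros(), np.ones(), and np.full(). Here is a minimal sketch of the remaining functions from the list; the variable names and argument values are only illustrative.

```python
import numpy as np

# values from 0 up to (not including) 20, in steps of 2
evens = np.arange(0, 20, 2)

# 5 evenly spaced values from 0 to 1, both ends included
points = np.linspace(0, 1, 5)

# 3x3 identity matrix: ones on the diagonal, zeros elsewhere
identity = np.eye(3)

# 2x2 array of random floats in the interval [0, 1)
noise = np.random.random((2, 2))

print(evens)
print(points)
print(identity)
print(noise.shape)
```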

[back to top](#Contents)

<a id="4"></a> 

# 4.  Multidimensional Arrays

We can make arrays that are multidimensional.

Here are two- and three-dimensional arrays.

In [22]:
z = np.array([[1,2,3], [4,5,6]])
print(z)

[[1 2 3]
 [4 5 6]]


In [23]:
z2 = np.array([[1,2,3], [4,5,6], [7,8,9]])
print(z2)

[[1 2 3]
 [4 5 6]
 [7 8 9]]


In [28]:
x3 = np.random.randint(10, size = (3,4))
x4 = np.random.randint(10, size = (3, 4, 5))

In [29]:
# 2-dimensional: 3 rows (arrays) of 4 elements each

print(x3)

[[4 3 9 1]
 [4 0 2 9]
 [1 5 5 1]]


In [30]:
# 3-dimensional: 3 arrays, each with 4 rows and 5 columns
print(x4)

[[[2 7 6 6 1]
  [5 0 9 6 1]
  [7 8 5 4 7]
  [3 6 3 2 8]]

 [[6 0 7 7 5]
  [2 4 6 1 0]
  [7 6 5 1 3]
  [1 4 2 4 4]]

 [[6 2 1 3 8]
  [3 9 6 8 2]
  [1 6 8 4 1]
  [7 7 7 0 4]]]


In [35]:
# checking dimensions, shape, size.  Just playing with attributes.

print("x4 ndim:", x4.ndim)
print("x4 shape:", x4.shape)
print("x4 size:", x4.size)


x4 ndim: 3
x4 shape: (3, 4, 5)
x4 size: 60


In [34]:
x4.size

60

We can leave out the keyword name, like size =, and pass the shape as a positional argument

In [36]:
np.random.random((3,3))

array([[0.90664122, 0.34753752, 0.27479456],
       [0.51558815, 0.92744609, 0.96436822],
       [0.09837894, 0.56931608, 0.38696337]])

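A note about randomness in Python

The values are not truly random; they are pseudorandom, generated by a deterministic algorithm.

To get reproducible values, you can set the random seed.

Setting the same seed before calling the random function gives the same values every time.

The seed sets NumPy's global random state, so it affects every random call that follows, not only the current cell. To repeat the same values in another cell, set the seed again there.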

In [37]:
np.random.seed(100)
np.random.random((3,3))

array([[0.54340494, 0.27836939, 0.42451759],
       [0.84477613, 0.00471886, 0.12156912],
       [0.67074908, 0.82585276, 0.13670659]])
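As a quick check of how seeding behaves, the sketch below draws values twice after one seed and then reseeds. The variable names are illustrative. The second half uses np.random.default_rng(), NumPy's newer generator interface, as an alternative that keeps the random state in its own object.

```python
import numpy as np

np.random.seed(100)
first = np.random.random(3)
second = np.random.random(3)   # continues the same stream, so it differs from first

np.random.seed(100)            # reseeding restarts the stream
again = np.random.random(3)

print(np.array_equal(first, again))   # True
print(np.array_equal(first, second))  # False

# Alternative: a generator object with its own seed and state
rng_a = np.random.default_rng(100)
rng_b = np.random.default_rng(100)
a = rng_a.random(3)
b = rng_b.random(3)
print(np.array_equal(a, b))           # True
```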

[back to top](#Contents)

<a id="5"></a> 

# 5.  Indexing Arrays

You can index one-dimensional arrays just like lists.

Multidimensional arrays can be indexed, too.

Use a comma-separated tuple of indices to do so.

In a 2D array, the first index selects the row (the inner array you want) and the second selects the item within it.


In [40]:
np.random.seed(0)
x1 = np.random.randint(10, size = 6)
x2 = np.random.randint(10, size = (3,4))
x3 = np.random.randint(10, size = (3,4,5))
print(x1)



[5 0 3 3 7 9]


In [39]:
# indexing 1D array

print(x1[0])
print(x1[4])
print(x1[-1])

5
7
9


In [42]:
x2

array([[3, 5, 2, 4],
       [7, 6, 8, 8],
       [1, 6, 7, 7]])

In [43]:
# indexing 2D array

x2[2, 3]

7

In [44]:
x3

array([[[8, 1, 5, 9, 8],
        [9, 4, 3, 0, 3],
        [5, 0, 2, 3, 8],
        [1, 3, 3, 3, 7]],

       [[0, 1, 9, 9, 0],
        [4, 7, 3, 2, 7],
        [2, 0, 0, 4, 5],
        [5, 6, 8, 4, 1]],

       [[4, 9, 8, 1, 1],
        [7, 9, 9, 3, 6],
        [7, 2, 0, 3, 5],
        [9, 4, 4, 6, 4]]])

In [45]:
# indexing 3D array

x3[2, 3, 4]

4
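Indexing also works for assignment. The sketch below uses a small hand-written array (named grid here only for illustration) and shows one thing to watch for: assigning a float into an integer array silently drops the decimal part.

```python
import numpy as np

grid = np.array([[3, 5, 2, 4],
                 [7, 6, 8, 8],
                 [1, 6, 7, 7]])

print(grid[2, 3])    # 7: row index 2, column index 3
print(grid[-1, -1])  # 7: negative indices count from the end

grid[0, 0] = 12      # replace a single value
grid[0, 1] = 3.99    # the array holds integers, so this is stored as 3
print(grid[0])
```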

[back to top](#Contents)

<a id="6"></a> 

# 6. Slicing Arrays

Slicing arrays is a lot like slicing lists.

One-dimensional arrays are sliced exactly the same way.

One aspect to highlight: steps

You can add a step to your slice to skip elements

The syntax is: name of array [start: stop: step]

If unspecified, start = 0, stop = end of array, and step = 1

(So if start and stop are unspecified, we are looking at the whole array.)

In [48]:
x = np.arange(10)
x

array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])

In [49]:
print(x[:5])
print(x[5:])
print(x[4:7])

[0 1 2 3 4]
[5 6 7 8 9]
[4 5 6]


In [51]:
# slicing whole array but in steps of 2 

x[::2]

array([0, 2, 4, 6, 8])

In [53]:
# same as above but starting at index 1
x[1::2]

array([1, 3, 5, 7, 9])

In [55]:
# from index 1 up to (not including) index 5, in steps of 2

x[1:5:2]

array([1, 3])

In [57]:
# going by steps of 3

x[::3]

array([0, 3, 6, 9])
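The step can also be negative, which walks through the array backwards. A short sketch (variable names are illustrative):

```python
import numpy as np

x = np.arange(10)

# a step of -1 with no start or stop reverses the whole array
reversed_x = x[::-1]

# start at index 5 and move backwards two at a time
every_other_back = x[5::-2]

print(reversed_x.tolist())        # [9, 8, 7, 6, 5, 4, 3, 2, 1, 0]
print(every_other_back.tolist())  # [5, 3, 1]
```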

[back to top](#Contents)

<a id="7"></a> 

# 7. Multi-Dimensional Subarrays

Slicing works the same as for one-dimensional arrays, but with a comma separating the slice for each dimension.

The syntax is: name of array [slicing for rows, slicing for items in each row]

Or: which arrays (rows) we are looking at, then where in each one we are looking.

In [58]:
np.random.seed(0)
x2 = np.random.randint(10, size = (3, 4))
print(x2)
           

[[5 0 3 3]
 [7 9 3 5]
 [2 4 7 6]]


In [59]:
x2[:2, :3]

array([[5, 0, 3],
       [7, 9, 3]])

In [60]:
x2[1:2, :3]

array([[7, 9, 3]])

In [61]:
x2[:1, :]

array([[5, 0, 3, 3]])

In [62]:
x2[0, :]

array([5, 0, 3, 3])

In [63]:
x2[0:1, 0:]

array([[5, 0, 3, 3]])

In [65]:
# as you can see, there are often multiple ways to get the same results
x2[0:1, 0:4]

array([[5, 0, 3, 3]])

In [66]:
np.random.seed(1)
j = np.random.randint(10, size = (3,4))
print(j)

[[5 8 9 5]
 [0 0 1 7]
 [6 9 2 4]]


In [67]:
j[1:2, 1:3]

array([[0, 1]])

In [72]:
j[1, 1:3]

array([0, 1])

In [69]:
j[2, :3]

array([6, 9, 2])

In [70]:
j[2:3, 0:3]

array([[6, 9, 2]])

In [73]:
j[:2, 1:]

array([[8, 9, 5],
       [0, 1, 7]])

In [74]:
j[0:2, 1:4]

array([[8, 9, 5],
       [0, 1, 7]])

In [77]:
j[:, 0]

array([5, 0, 6])

In [76]:
j[0:3, 0:1]

array([[5],
       [0],
       [6]])
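The last two cells return the same values in different shapes. A small sketch, with the array j typed in by hand so it does not depend on the random seed, makes the difference explicit: a plain index drops a dimension, while a slice keeps it.

```python
import numpy as np

j = np.array([[5, 8, 9, 5],
              [0, 0, 1, 7],
              [6, 9, 2, 4]])

first_col_1d = j[:, 0]     # integer index: result is 1D
first_col_2d = j[:, 0:1]   # slice: result stays 2D

print(first_col_1d.shape)  # (3,)
print(first_col_2d.shape)  # (3, 1)

# slices with negative steps work in every dimension:
# this reverses the rows and the columns at once
flipped = j[::-1, ::-1]
print(flipped)
```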

[back to top](#Contents)

<a id="8"></a> 

# 8. Subarrays as No-Copy Views

Slices (subarrays) are not copies of your array; they are views.

This means that if you alter a subarray, it will alter the original.

You can alter values in arrays just like you alter lists, by assigning to an index.

How/when would this be helpful?

When you have a large data set but you just want to look at and manipulate a small part of it, without copying the data.

In [78]:
np.random.seed(0)
x2 = np.random.randint(10, size = (3, 4))
print(x2)

[[5 0 3 3]
 [7 9 3 5]
 [2 4 7 6]]


In [79]:
x2_sub = x2[:2, :2]
print(x2_sub)

[[5 0]
 [7 9]]


In [80]:
x2_sub[0, 0]=99


In [81]:
# the original is now altered

x2

array([[99,  0,  3,  3],
       [ 7,  9,  3,  5],
       [ 2,  4,  7,  6]])

### Copies of Arrays

To avoid the risk of altering the original, you can use the .copy() method to make an independent copy


In [83]:
x2_sub_copy = x2[:2, :2].copy()
print(x2_sub_copy)

[[99  0]
 [ 7  9]]


In [84]:
x2_sub_copy[0, 0]=42
print(x2_sub_copy)

[[42  0]
 [ 7  9]]


In [85]:
print(x2)

[[99  0  3  3]
 [ 7  9  3  5]
 [ 2  4  7  6]]
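If you are unsure whether you are holding a view or a copy, one way to check is np.shares_memory(). The sketch below uses its own small array; the names original, view, and copy are only illustrative.

```python
import numpy as np

original = np.arange(12).reshape((3, 4))
view = original[:2, :2]          # a slice: shares data with original
copy = original[:2, :2].copy()   # an independent copy

print(np.shares_memory(original, view))  # True
print(np.shares_memory(original, copy))  # False

view[0, 0] = 99   # changes original as well
copy[0, 1] = -1   # leaves original untouched

print(original[0, 0])  # 99
print(original[0, 1])  # 1
```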


## Challenge!

Create a random 4x4 array

Change the 2nd value of the 3rd array (row)

Change the 4th value of the 2nd array (row)

Create a copy of the 3rd array (row)

Reshape the copy into a column vector

In [86]:
np.random.seed(0)
x = np.random.randint(5, size =(4, 4))
x

array([[4, 0, 3, 3],
       [3, 1, 3, 2],
       [4, 0, 0, 4],
       [2, 1, 0, 1]])

In [87]:
x[2, 1] = 5
x

array([[4, 0, 3, 3],
       [3, 1, 3, 2],
       [4, 5, 0, 4],
       [2, 1, 0, 1]])

In [88]:
x[1, 3] = 20
x

array([[ 4,  0,  3,  3],
       [ 3,  1,  3, 20],
       [ 4,  5,  0,  4],
       [ 2,  1,  0,  1]])

In [90]:
x_copy = x[2].copy()
x_copy

array([4, 5, 0, 4])

In [92]:
x_copy.reshape((4, 1))


array([[4],
       [5],
       [0],
       [4]])

[back to top](#Contents)

<a id="9"></a> 

# 9. Reshaping Arrays

We can reshape arrays with the reshape method



In [94]:
arr = np.arange(1, 10)  # named arr to avoid shadowing the array module imported earlier
print(arr)

[1 2 3 4 5 6 7 8 9]


In [96]:
grid = arr.reshape((3, 3))
print(grid)

[[1 2 3]
 [4 5 6]
 [7 8 9]]


In [98]:
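# the new shape must hold exactly the same number of elements

# this wouldn't work: 9 elements cannot fill a 3x4 grid (12 slots), so reshape raises a ValueError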

#array2 = np.arange(1, 10)
#grid2 = array2.reshape((3, 4))
#print(array2)
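To see the failure without stopping the notebook, the mismatched reshape can be wrapped in try/except. The sketch below also shows -1 as a dimension, which asks NumPy to work out that size from the number of elements. Names such as failed are illustrative.

```python
import numpy as np

arr = np.arange(1, 10)   # 9 elements

failed = False
try:
    arr.reshape((3, 4))  # needs 12 elements, so this raises ValueError
except ValueError as err:
    failed = True
    print("reshape failed:", err)

# -1 lets NumPy infer the missing dimension: 9 elements / 3 rows = 3 columns
grid = arr.reshape((3, -1))
print(grid.shape)  # (3, 3)
```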

We can also reshape a 1D array into a 2D row or column matrix

Notice the extra [    ] in the output

In [100]:
x = np.array([1, 2, 3])
print(x)

[1 2 3]


In [101]:
# this makes an array within an array: a 2D array with 1 row and 3 columns

x.reshape((1, 3))

array([[1, 2, 3]])

In [104]:
# np.newaxis in the first position adds a new leading axis: the 1D array becomes a 2D row vector, same as reshape((1, 3))

x[np.newaxis, :]

array([[1, 2, 3]])

In [103]:
# np.newaxis in the second position takes our 1D array and makes it 3 arrays with 1 element each
# also known as a column vector

x[:, np.newaxis]

array([[1],
       [2],
       [3]])

In [106]:
# this does same thing (3 arrays, with 1 element in each)

x.reshape(3, 1)

array([[1],
       [2],
       [3]])

In [110]:
x2 = np.array([1, 2, 3])
print(x2)
x2[: ,np.newaxis]

[1 2 3]


array([[1],
       [2],
       [3]])

In [111]:
x2[np.newaxis, :]

array([[1, 2, 3]])

In [112]:
x[: ,np.newaxis]

array([[1],
       [2],
       [3]])
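The outputs above differ only in their brackets, so it can help to print the shape attribute directly. A minimal sketch (variable names are illustrative):

```python
import numpy as np

x = np.array([1, 2, 3])

row = x[np.newaxis, :]      # new axis first: 1 row, 3 columns
column = x[:, np.newaxis]   # new axis second: 3 rows, 1 column

print(x.shape)       # (3,)
print(row.shape)     # (1, 3)
print(column.shape)  # (3, 1)

# reshape gives the same results
print(np.array_equal(row, x.reshape((1, 3))))     # True
print(np.array_equal(column, x.reshape((3, 1))))  # True
```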

[back to top](#Contents)

<a id="10"></a> 

# 10. NumPy Aggregations

The first step in data analysis is exploratory data analysis.

This includes visualization and descriptive statistics.

Start by creating a large array of random values.

Then time how long Python's built-in sum function takes on it versus NumPy's sum function.

NumPy is much faster: about 1 ms versus about 124 ms in the timings below.

For big data sets, this is a big advantage.

In [115]:
big_array = np.random.rand(1000000)

%timeit sum(big_array)

124 ms ± 12.6 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)


In [116]:
%timeit np.sum(big_array)

1.18 ms ± 313 µs per loop (mean ± std. dev. of 7 runs, 1,000 loops each)


In [166]:
np.sum(big_array)

495815

In [168]:
# same result, different notation: sum called as a method of the array

big_array.sum()

495815

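The same goes for other aggregations.

Python's built-in min() and max() vs. np.min() and np.max()

mean and std have no built-in Python function; NumPy provides np.mean() and np.std()

Each NumPy aggregation is also available as an array method: .min(), .max(), .mean(), .std(), etc.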
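Here is a small sketch of these aggregations on a hand-picked array (the values are chosen only so that the results come out as round numbers):

```python
import numpy as np

values = np.array([2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0])

# function form
print(np.min(values), np.max(values))   # 2.0 9.0
print(np.mean(values))                  # 5.0
print(np.std(values))                   # 2.0

# method form gives the same results
print(values.min(), values.max())
print(values.mean(), values.std())
```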




### Multidimensional aggregates 

By default, the sum function returns a single result for a multidimensional array, because it adds up every number across all the inner arrays.

If you want, you can specify the axis to sum along.

axis = 1 sums each row

axis = 0 sums each column



In [117]:
import numpy as np
np.random.seed(0)
m = np.random.random((3, 4))
print(m)

[[0.5488135  0.71518937 0.60276338 0.54488318]
 [0.4236548  0.64589411 0.43758721 0.891773  ]
 [0.96366276 0.38344152 0.79172504 0.52889492]]


In [118]:
m.sum()

7.478282790980994

In [124]:
m.sum(axis = 1)

array([2.41164943, 2.39890912, 2.66772424])

In [125]:
np.sum(m)

7.478282790980994

In [126]:
np.sum(m, axis = 1)

array([2.41164943, 2.39890912, 2.66772424])

In [127]:
m.sum(axis = 0)

array([1.93613106, 1.744525  , 1.83207563, 1.9655511 ])

In [128]:
np.sum(m, axis = 0)

array([1.93613106, 1.744525  , 1.83207563, 1.9655511 ])
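With random floats it is hard to verify the axis sums by eye. The sketch below repeats the idea on a small integer array (named m_small here only for illustration) where the arithmetic is easy to check. The axis argument works the same way for other aggregations such as max.

```python
import numpy as np

m_small = np.array([[1, 2, 3],
                    [4, 5, 6]])

print(m_small.sum())        # 21: every element
print(m_small.sum(axis=0))  # one value per column: 1+4, 2+5, 3+6
print(m_small.sum(axis=1))  # one value per row: 1+2+3, 4+5+6
print(m_small.max(axis=1))  # largest value in each row
```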


[back to top](#Contents)

<a id="11"></a> 


# 11. Concatenating and Splitting Arrays 


### Concatenating 

concatenation = joining

In [129]:
x = np.array([1, 2, 3])
y = np.array([4, 5, 6])
np.concatenate([x, y])

array([1, 2, 3, 4, 5, 6])

In [130]:
z = [7, 8, 9]
np.concatenate([x, y, z])

array([1, 2, 3, 4, 5, 6, 7, 8, 9])

In [131]:
# 2D arrays are joined along the first axis by default, so the second grid is stacked below the first

grid = np.array([[1, 2, 3],
                [4, 5, 6]])
np.concatenate([grid, grid])

array([[1, 2, 3],
       [4, 5, 6],
       [1, 2, 3],
       [4, 5, 6]])

In [132]:
# can also stack arrays of mixed dimensions 

x = np.array([7, 8 ,9])
np.vstack([x, grid])

array([[7, 8, 9],
       [1, 2, 3],
       [4, 5, 6]])

In [133]:
# and if they have the same number of rows, we can stack them side by side:

y = np.array([[99],
            [99]])
np.hstack([grid, y])

array([[ 1,  2,  3, 99],
       [ 4,  5,  6, 99]])

In [135]:
# but this will not work: x has 3 columns and y has 1, so they cannot be stacked vertically

#np.vstack([x, y])
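A short sketch of both points: np.concatenate() can join side by side when given axis=1, and the failing np.vstack() call raises a ValueError that can be caught. The variable names side_by_side and failed are illustrative.

```python
import numpy as np

grid = np.array([[1, 2, 3],
                 [4, 5, 6]])

# axis=1 joins along the columns instead of the rows
side_by_side = np.concatenate([grid, grid], axis=1)
print(side_by_side.shape)  # (2, 6)

x = np.array([7, 8, 9])
y = np.array([[99],
              [99]])

failed = False
try:
    np.vstack([x, y])      # row lengths 3 and 1 do not match
except ValueError as err:
    failed = True
    print("vstack failed:", err)
```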

### Splitting 

splitting = the opposite of concatenation

We use np.split()

Pass the array and a list of the indices at which to split

In [136]:
x= [0,1,2,3,4,5,6,7,8,9]
x1, x2 = np.split(x, [3])
print(x1, x2)

[0 1 2] [3 4 5 6 7 8 9]


In [137]:
x3, x4, x5 = np.split(x, [3, 5])
print(x3, x4, x5)

[0 1 2] [3 4] [5 6 7 8 9]


We can also split vertically or horizontally 

In [138]:
grid = np.arange(16).reshape((4, 4))
grid

array([[ 0,  1,  2,  3],
       [ 4,  5,  6,  7],
       [ 8,  9, 10, 11],
       [12, 13, 14, 15]])

In [139]:
upper, lower = np.vsplit(grid, [2])
print(upper)

[[0 1 2 3]
 [4 5 6 7]]


In [140]:
print(lower)

[[ 8  9 10 11]
 [12 13 14 15]]


In [141]:
left, right = np.hsplit(grid, [2])
print(left)

[[ 0  1]
 [ 4  5]
 [ 8  9]
 [12 13]]


In [142]:
print(right)

[[ 2  3]
 [ 6  7]
 [10 11]
 [14 15]]
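Besides a list of split points, np.split() also accepts a single integer meaning "this many equal parts". When the array does not divide evenly, np.array_split() is one option. A minimal sketch with illustrative names:

```python
import numpy as np

# an integer splits into that many equal parts
thirds = np.split(np.arange(9), 3)
print([part.tolist() for part in thirds])
# [[0, 1, 2], [3, 4, 5], [6, 7, 8]]

# 10 elements cannot be split into 3 equal parts;
# array_split allows parts of unequal length
uneven = np.array_split(np.arange(10), 3)
print([len(part) for part in uneven])   # [4, 3, 3]
```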



[back to top](#Contents)

<a id="12"></a> 

# 12. UFuncs


Analyzing arrays with plain Python loops is very slow. Python's flexibility is what makes it slow.

NumPy speeds it up.

It provides vectorized operations through universal functions (ufuncs).

Let's see how slow loops can be by computing reciprocals two ways.



In [143]:


def compute_reciprocals(values):
    output = np.empty(len(values))
    for i in range(len(values)):
        output[i] = 1.0 / values[i]
    return output

values = np.random.randint(1, 10, size = 5)
print(values)
compute_reciprocals(values)
        

[9 2 6 9 5]


array([0.11111111, 0.5       , 0.16666667, 0.11111111, 0.2       ])

In [146]:
big_array = np.random.randint(1 ,100, size = 10000)
%timeit compute_reciprocals(big_array)

24.5 ms ± 693 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)


In [147]:
# Now using a ufunc

print(compute_reciprocals(values))
print(1.0 / values)
%timeit (1.0 / big_array)

[0.11111111 0.5        0.16666667 0.11111111 0.2       ]
[0.11111111 0.5        0.16666667 0.11111111 0.2       ]
24.7 µs ± 1.38 µs per loop (mean ± std. dev. of 7 runs, 10,000 loops each)


The ufunc is much faster!

When Python runs a loop on an array, it checks the type of each element and looks up the correct function to use for that type.

When ufuncs are used, the same operation is applied to every item in the array, with no need to check the type each time.

What ufuncs do is much closer to fast, compiled code than to the classic Python interpreter.
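Division is not the only vectorized operation. The ordinary arithmetic operators on arrays are shorthand for ufuncs, as this small sketch suggests (the variable names are illustrative):

```python
import numpy as np

x = np.arange(1, 6)   # 1, 2, 3, 4, 5

plus_two = x + 2              # operator form
plus_two_func = np.add(x, 2)  # the ufunc behind the + operator
doubled = x * 2
squared = x ** 2
absolute = np.abs(np.array([-2, -1, 0, 1]))

print(plus_two.tolist())   # [3, 4, 5, 6, 7]
print(doubled.tolist())    # [2, 4, 6, 8, 10]
print(squared.tolist())    # [1, 4, 9, 16, 25]
print(absolute.tolist())   # [2, 1, 0, 1]
```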


[back to top](#Contents)

<a id="13"></a> 

# 13. Fancy Indexing 

We have already indexed in a few ways: single indices, negative indices, and slices.



In [148]:

rand = np.random.RandomState(42)

x = rand.randint(100, size = 10)
print(x)

[51 92 14 71 60 20 82 86 74 74]


In [149]:
print(x[0])
print(x[-1])
print(x[5:])

51
74
[20 82 86 74 74]


### New ways!

In [150]:
[x[3], x[7], x[2]]

[71, 86, 14]

In [152]:
ind = [3, 7 ,2]
x[ind]

array([71, 86, 14])

In [154]:
# With fancy indexing, the shape of the result follows the shape of the index array
# The key is passing a list or array of indices inside the square brackets


ind = np.array([[3, 7],
               [2, 5]])
y = x[ind]
y

array([[71, 86],
       [14, 20]])

In [155]:
rand=np.random.RandomState(6)
j=rand.randint(100, size = 20)
print(j)

[10 73 99 84 79 80 62 25  1 75 77 57 26 33 68 33  8  2 76 84]


In [156]:
[j[3], j[11]]

[84, 57]

In [157]:
ind = [3, 11]
j[ind]

array([84, 57])

In [158]:
ind = np.array([[3, 11],
               [-2, 7]])
j[ind]

array([[84, 57],
       [76, 25]])

### Combined Indexing

we can combine multiple aspects of indexing 

In [159]:
X = np.arange(12).reshape ((3, 4))
X

array([[ 0,  1,  2,  3],
       [ 4,  5,  6,  7],
       [ 8,  9, 10, 11]])

In [160]:
# selects the third row (index 2) and then picks items 2, 0, 1 from it
X[2, [2, 0, 1]]


array([10,  8,  9])

In [161]:
X[1:, [2, 0, 1]]

array([[ 6,  4,  5],
       [10,  8,  9]])
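Two more uses of fancy indexing, sketched with illustrative names: passing one index array per dimension pairs the indices up element by element, and a fancy index on the left of = modifies several values at once.

```python
import numpy as np

X = np.arange(12).reshape((3, 4))

rows = np.array([0, 1, 2])
cols = np.array([2, 1, 3])

# pairs up the indices: X[0, 2], X[1, 1], X[2, 3]
picked = X[rows, cols]
print(picked.tolist())   # [2, 5, 11]

# assignment through a fancy index changes all the listed positions
x = np.arange(10)
x[[2, 1, 8, 4]] = 99
print(x.tolist())        # [0, 99, 99, 3, 99, 5, 6, 7, 99, 9]
```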


[back to top](#Contents)

<a id="14"></a> 

# 14.  Sorting Arrays

Python has built-in sorting functions, but they're not as efficient as NumPy's for arrays


In [162]:
# NumPy sort function: returns a sorted copy and leaves x unchanged

x = np.array([2, 1, 4, 3, 5])

np.sort(x)

array([1, 2, 3, 4, 5])

In [163]:
# NumPy array sort method: sorts the array in place

x.sort()
print(x)

[1 2 3 4 5]


### Sorting rows or columns 

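The axis argument lets us sort along columns or rows.

axis = 0 sorts the values within each column

axis = 1 sorts the values within each row

Each row or column is sorted independently, so any relationship between the values in a row or column is lost.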

In [169]:
rand = np.random.RandomState(42)
X =rand.randint(0, 10, (4, 6))
print(X)

[[6 3 7 4 6 9]
 [2 6 7 4 3 7]
 [7 2 5 4 1 7]
 [5 1 4 0 9 5]]


In [170]:
np.sort(X, axis = 0)

array([[2, 1, 4, 0, 1, 5],
       [5, 2, 5, 4, 3, 7],
       [6, 3, 7, 4, 6, 7],
       [7, 6, 7, 4, 9, 9]])

In [171]:
np.sort(X, axis = 1)

array([[3, 4, 6, 6, 7, 9],
       [2, 3, 4, 6, 7, 7],
       [1, 2, 4, 5, 7, 7],
       [0, 1, 4, 5, 5, 9]])
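A related tool is np.argsort(), which returns the indices that would sort the array rather than the sorted values. The sketch below also confirms the difference between np.sort() and the in-place sort method; variable names are illustrative.

```python
import numpy as np

x = np.array([2, 1, 4, 3, 5])

sorted_copy = np.sort(x)   # new array; x is unchanged
print(x.tolist())          # [2, 1, 4, 3, 5]

order = np.argsort(x)      # index of the smallest value, then the next, ...
print(order.tolist())      # [1, 0, 3, 2, 4]
print(x[order].tolist())   # [1, 2, 3, 4, 5]

x.sort()                   # in place; x itself is now sorted
print(x.tolist())          # [1, 2, 3, 4, 5]
```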

[back to top](#Contents)

<a id="15"></a> 

# 15. Dictionaries 

Another basic Python data structure

What is a dictionary, as in the book or webpage?

Word + definition

In a Python dictionary, these are referred to as key:value pairs

Dictionaries are created using curly braces {} and colons :

Additional key:value pairs are separated with commas

In [177]:
dict_1 = {
    "name": "Scott"
}

dict_1

{'name': 'Scott'}

In [173]:
dict_2= {
    "amount": 1
}
dict_2

{'amount': 1}

In [178]:
dict_3 = {
    "name": "Scott", 
    "age": 34
}
dict_3

{'name': 'Scott', 'age': 34}

You can index dictionaries with the key, not with a numerical index

To add a value, simply assign it to a new key

To delete an entry, use the del statement

We can update a value by assigning to an existing key

In [179]:
dict_3["name"]

'Scott'

In [180]:
dict_3["Univeristy"] = "Eastern University"
dict_3

{'name': 'Scott', 'age': 34, 'Univeristy': 'Eastern University'}

In [181]:
del(dict_3["Univeristy"])
dict_3

{'name': 'Scott', 'age': 34}

In [182]:
dict_3["University"] = "Eastern"

dict_3["University"] = "Eastern University"
dict_3

{'name': 'Scott', 'age': 34, 'University': 'Eastern University'}


### Looping through dictionaries 

If you loop through a dictionary, it goes by keys


In [183]:
for i in dict_3:
    print(i)

name
age
University


In [184]:
#To loop through values, you can index them or use the values method

for i in dict_3:
    print(dict_3[i])

Scott
34
Eastern University
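The values method mentioned in the comment above, along with the items method for keys and values together, could look like this. The dictionary is rebuilt here under the illustrative name person so the sketch runs on its own.

```python
person = {"name": "Scott", "age": 34, "University": "Eastern University"}

# values() loops over the values only
for value in person.values():
    print(value)

# items() gives each key together with its value
for key, value in person.items():
    print(key, "->", value)
```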


### Nested dictionaries 

dictionaries within dictionaries 

In [185]:
family = {
    'father': {
        'name': 'Greg',
        'age' : 35
    },
    'mother': {
        'name':'Heather',
        'age': 33
    },
    'daughter1': {
        'name': 'Mary',
        'age' : 12
    }
    
} 

family

{'father': {'name': 'Greg', 'age': 35},
 'mother': {'name': 'Heather', 'age': 33},
 'daughter1': {'name': 'Mary', 'age': 12}}
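To reach a value inside a nested dictionary, chain the keys: the first key picks the inner dictionary, the second picks the entry in it. A short sketch using the same family data (the names role and ages are illustrative):

```python
family = {
    "father": {"name": "Greg", "age": 35},
    "mother": {"name": "Heather", "age": 33},
    "daughter1": {"name": "Mary", "age": 12},
}

# outer key first, then inner key
print(family["father"]["name"])   # Greg

# loop over the outer dictionary and read from each inner one
for role, person in family.items():
    print(role, person["name"], person["age"])

ages = [person["age"] for person in family.values()]
print(ages)   # [35, 33, 12]
```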

[back to top](#Contents)

<a id="16"></a> 

# 16. NumPy and Structured Arrays

Sometimes data are not homogeneous 

NumPy arrays normally only hold a single data type.  Structured arrays give us more flexibility.


In [186]:
name = ['Manute Bol', "Shaquille O'Neil", 'Michael Jordan', 'Muggsy Bogues']
height = [91, 85, 78, 63]
weight = [201.0, 324.0, 216.0, 134.0]

In [187]:
data = np.zeros(4, dtype = {'names':('name', 'height', 'weight'),
                           'formats': ('U20', 'i4', 'f8')})
print(data.dtype)

[('name', '<U20'), ('height', '<i4'), ('weight', '<f8')]


The formats are NumPy data type codes:

U20 = Unicode string of maximum length 20

i4 = 4-byte integer

f8 = 8-byte float

In [188]:
data['name'] = name
data['height'] = height
data['weight'] = weight
print(data)

[('Manute Bol', 91, 201.) ("Shaquille O'Neil", 85, 324.)
 ('Michael Jordan', 78, 216.) ('Muggsy Bogues', 63, 134.)]


Now we can index by field name (columns), by position (rows), or with a Boolean mask

In [189]:
print(data['name'])
print(data[0])

['Manute Bol' "Shaquille O'Neil" 'Michael Jordan' 'Muggsy Bogues']
('Manute Bol', 91, 201.)


In [190]:
data[data['weight'] > 215]

array([("Shaquille O'Neil", 85, 324.), ('Michael Jordan', 78, 216.)],
      dtype=[('name', '<U20'), ('height', '<i4'), ('weight', '<f8')])

In [192]:
# another example 

data = np.zeros(3, dtype = {'names':('movie', 'release', 'stars'),
                           'formats': ('U20', 'i4', 'f8')})
movie = ['Tombstone', 'Batman Forever', 'Willow']
release = [1993, 1995, 1988]
stars = [4.5, 5.0, 4.0]

data['movie'] = movie
data['release'] = release
data['stars']= stars 

print(data)

[('Tombstone', 1993, 4.5) ('Batman Forever', 1995, 5. )
 ('Willow', 1988, 4. )]
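Filtering and field access can be chained on a structured array. The sketch below rebuilds the player data from earlier in this section and pulls out only the names that match a condition; short_names and tallest are illustrative variable names.

```python
import numpy as np

data = np.zeros(4, dtype={'names': ('name', 'height', 'weight'),
                          'formats': ('U20', 'i4', 'f8')})
data['name'] = ['Manute Bol', "Shaquille O'Neil", 'Michael Jordan', 'Muggsy Bogues']
data['height'] = [91, 85, 78, 63]
data['weight'] = [201.0, 324.0, 216.0, 134.0]

# Boolean mask on one field, then select another field from the matches
short_names = data[data['height'] < 80]['name']
print(short_names)

# position of the largest height, then the name stored in that row
tallest = data[np.argmax(data['height'])]['name']
print(tallest)
```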
