# NumPy Introduction
NumPy is a powerful library for numerical computing in Python. It provides support for large, multidimensional arrays, along with a collection of mathematical functions that operate on them.

## Lists Recap
The Python list is pretty powerful. A list can hold values of any type, and can hold different types at the same time. You can also change, add, and remove elements. This is wonderful, but one feature is missing, and it is an important one for aspiring data scientists like yourself. When analyzing data, you'll often want to carry out operations over entire collections of values, and you want to do this fast. With lists, this is a problem: arithmetic operators do not work element by element on a list.
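To make the limitation concrete, here is a minimal sketch that reuses the `height_in` values appearing later in this notebook. The helper names `list_supports_scaling`, `height_m_list`, and `height_m_array` are illustrative and not part of the original notebook.

```python
import numpy as np

height_in = [74, 74, 72, 72, 75, 71, 73, 70]

# Multiplying a list by a float is not defined and raises a TypeError.
try:
    height_in * 0.0254
    list_supports_scaling = True
except TypeError:
    list_supports_scaling = False

# Multiplying a list by an int repeats the list instead of scaling its values.
repeated = height_in * 2

# With a plain list, every element has to be converted explicitly.
height_m_list = [h * 0.0254 for h in height_in]

# A NumPy array applies the multiplication to every element at once.
height_m_array = np.array(height_in) * 0.0254

print(list_supports_scaling)
print(len(repeated))
print(height_m_array)
```

Both approaches give the same heights in meters, but the array version is shorter and, for large collections, typically much faster.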

In [1]:
# Import the numpy package as np
import numpy as np

baseball = [180, 215, 210, 210, 188, 176, 209, 200]

# Create a numpy array from baseball: np_baseball
np_baseball = np.array(baseball)

# Print out np_baseball
print(np_baseball)

[180 215 210 210 188 176 209 200]


In [3]:
# Import numpy
import numpy as np
height_in = [74, 74, 72, 72, 75, 71, 73, 70]
# Create a numpy array from height_in: np_height_in
np_height_in = np.array(height_in)

# Print out np_height_in
print(np_height_in)

# Convert np_height_in to m: np_height_m
np_height_m = np_height_in * 0.0254

# Print np_height_m
print(np_height_m)

[74 74 72 72 75 71 73 70]
[1.8796 1.8796 1.8288 1.8288 1.905  1.8034 1.8542 1.778 ]


In [4]:
import numpy as np
weight_lb = [180, 215, 210, 210, 188, 176, 209, 200]
np_weight_lb = np.array(weight_lb)
np_height_in = np.array(height_in)

# Print out all elements of np_weight_lb
print(np_weight_lb[:])

# Print out sub-array of np_height_in: index 0 up to and including index 3
print(np_height_in[0:4])

[180 215 210 210 188 176 209 200]
[74 74 72 72]
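The cell above uses slices only. As a small sketch of how single-element indexing and slicing relate, the following example uses the same height values; the variable names `first`, `last`, and `head` are illustrative.

```python
import numpy as np

np_height_in = np.array([74, 74, 72, 72, 75, 71, 73, 70])

# A single integer index returns one element.
first = np_height_in[0]

# Negative indices count from the end of the array.
last = np_height_in[-1]

# A slice start:stop includes start but excludes stop,
# so 0:4 selects the elements at indices 0, 1, 2 and 3.
head = np_height_in[0:4]

print(first, last)
print(head)
```

Because the stop index is excluded, the length of a slice `a:b` is simply `b - a` as long as both indices fall inside the array.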


In [5]:
import numpy as np

baseball = [[180, 78.4],
            [215, 102.7],
            [210, 98.5],
            [188, 75.2]]

# Create a 2D numpy array from baseball: np_baseball
np_baseball = np.array(baseball)

# Print out the type of np_baseball
print(type(np_baseball))

# Print out the shape of np_baseball
print(np_baseball.shape)

<class 'numpy.ndarray'>
(4, 2)


In [6]:
import numpy as np

# Create a 2D numpy array from baseball: np_baseball
np_baseball = np.array(baseball)

# Print out the shape of np_baseball
print(np_baseball.shape)

(4, 2)
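The shape `(4, 2)` means 4 rows and 2 columns. A brief sketch of how rows, columns, and single elements can be selected from such an array follows; the names `second_row`, `first_column`, and `element` are illustrative.

```python
import numpy as np

np_baseball = np.array([[180, 78.4],
                        [215, 102.7],
                        [210, 98.5],
                        [188, 75.2]])

# A single index selects a whole row.
second_row = np_baseball[1]

# A colon in the row position selects every row, so this is the first column.
first_column = np_baseball[:, 0]

# Two indices (row, column) select a single element.
element = np_baseball[2, 1]

print(second_row.shape)    # one row holds 2 values
print(first_column.shape)  # one column holds 4 values
print(element)
```

Note that the array mixes integers and floats in its input lists, so NumPy stores every value as a float.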


In [9]:
import numpy as np


np_baseball = np.array([
    [180, 78.4, 25],
    [215, 102.7, 30],
    [210, 98.5, 28],
    [188, 75.2, 26]
])

# Create a new array: updated
updated = np.array([
    [1, 0.5, 1],
    [0, -1.2, 0],
    [2, 0.3, -1],
    [1, 0.8, 0]
])

# Print out addition of np_baseball and updated
print(np_baseball + updated)

# Create numpy array: conversion
conversion = np.array([0.0254, 0.453592, 1])

# Print out product of np_baseball and conversion
print(np_baseball * conversion)

[[181.   78.9  26. ]
 [215.  101.5  30. ]
 [212.   98.8  27. ]
 [189.   76.   26. ]]
[[ 4.572     35.5616128 25.       ]
 [ 5.461     46.5838984 30.       ]
 [ 5.334     44.678812  28.       ]
 [ 4.7752    34.1101184 26.       ]]
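The product above works even though `conversion` has only 3 elements while `np_baseball` has shape `(4, 3)`: NumPy broadcasts the 1D array across every row. The sketch below illustrates this by comparing the broadcast product with an explicitly repeated version; `tiled`, `explicit`, and `mismatch_raises` are illustrative names.

```python
import numpy as np

np_baseball = np.array([
    [180, 78.4, 25],
    [215, 102.7, 30],
    [210, 98.5, 28],
    [188, 75.2, 26]
])
conversion = np.array([0.0254, 0.453592, 1])

# Broadcasting: the shape (3,) array is applied to each of the 4 rows.
product = np_baseball * conversion

# Equivalent explicit form: repeat conversion 4 times to get shape (4, 3).
tiled = np.tile(conversion, (4, 1))
explicit = np_baseball * tiled

# An array whose length does not match the number of columns cannot be broadcast.
try:
    np_baseball * np.array([1, 2, 3, 4])
    mismatch_raises = False
except ValueError:
    mismatch_raises = True

print(product.shape)
print(np.array_equal(product, explicit))
print(mismatch_raises)
```

Broadcasting avoids building the repeated array in memory, which matters once the data has many rows.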


## NumPy Statistics
NumPy provides a wide range of statistical functions that can be applied to arrays.

In [10]:
import numpy as np

# Create np_height_in from the first column of np_baseball
np_height_in = np_baseball[:,0]

# Print out the mean of np_height_in
print(np.mean(np_height_in))

# Print out the median of np_height_in
print(np.median(np_height_in))

198.25
199.0


In [12]:
# Print mean height
avg = np.mean(np_baseball[:,0])
print("Average: " + str(avg))

# Print median height
med = np.median(np_baseball[:,0])
print("Median: " + str(med))

# Print out the standard deviation on height
stddev = np.std(np_baseball[:,0])
print("Standard Deviation: " + str(stddev))

# Print out correlation between first and second column
corr = np.corrcoef(np_baseball[:, 0], np_baseball[:, 1])
print("Correlation: " + str(corr))

Average: 198.25
Median: 199.0
Standard Deviation: 14.635146053251399
Correlation: [[1.         0.95865738]
 [0.95865738 1.        ]]
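`np.corrcoef` returns a 2x2 correlation matrix rather than a single number, and `np.std` computes the population standard deviation by default. The following sketch shows how to pull out the single coefficient and how the `ddof` argument changes the standard deviation; the names `first_col`, `second_col`, `r`, `pop_std`, and `sample_std` are illustrative.

```python
import numpy as np

np_baseball = np.array([
    [180, 78.4, 25],
    [215, 102.7, 30],
    [210, 98.5, 28],
    [188, 75.2, 26]
])
first_col = np_baseball[:, 0]
second_col = np_baseball[:, 1]

# The diagonal holds each column's correlation with itself (always 1);
# the off-diagonal entries hold the correlation between the two columns.
corr_matrix = np.corrcoef(first_col, second_col)
r = corr_matrix[0, 1]

# Default ddof=0 divides by n (population standard deviation).
pop_std = np.std(first_col)

# ddof=1 divides by n - 1 (sample standard deviation).
sample_std = np.std(first_col, ddof=1)

print(r)
print(pop_std, sample_std)
```

With only four observations the two standard deviations differ noticeably; the gap shrinks as the number of observations grows.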
