## 2D NumPy Arrays
You can create a 2D numpy array from a regular Python list of lists, for example one numpy array holding all height and weight data of your family. If you print out such an array (call it np_2d), you'll see that it is a rectangular data structure: each sublist in the list corresponds to a row in the two-dimensional numpy array.
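As a minimal sketch of the idea above, the snippet below builds np_2d from a list of lists. The name family and the height and weight values are made up for illustration and are not taken from the course data.

```python
import numpy as np

# Hypothetical family data (illustrative values only):
# the first sublist holds heights in meters, the second holds weights in kilograms
family = [[1.73, 1.68, 1.71, 1.89, 1.79],
          [65.4, 59.2, 63.6, 88.4, 68.7]]

# Each sublist becomes one row of the 2D array
np_2d = np.array(family)

print(np_2d)
print(np_2d.shape)  # (2, 5): 2 rows, 5 columns
```

Because both sublists have the same length, the result is a proper rectangle with 2 rows and 5 columns.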

### Your First 2D NumPy Array
Before working on the actual MLB data, let's try to create a 2D numpy array from a small list of lists.

In this exercise, baseball is a list of lists. The main list contains 4 elements, one per baseball player. Each of these elements is a list containing the height and the weight of that player, in this order. baseball is already coded for you in the script.

In [1]:
# Create baseball, a list of lists
baseball = [[180, 78.4],
            [215, 102.7],
            [210, 98.5],
            [188, 75.2]]

# Import numpy
import numpy as np

# Create a 2D numpy array from baseball: np_baseball
np_baseball = np.array(baseball)

# Print out the type of np_baseball
print(type(np_baseball))

# Print out the shape of np_baseball
print(np_baseball.shape)

<class 'numpy.ndarray'>
(4, 2)
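Besides shape, a few other attributes are worth inspecting. The sketch below assumes baseball is the four-player list from the cell above. It shows that a numpy array holds a single data type, so the integer heights are stored as floats alongside the float weights.

```python
import numpy as np

baseball = [[180, 78.4],
            [215, 102.7],
            [210, 98.5],
            [188, 75.2]]
np_baseball = np.array(baseball)

print(np_baseball.ndim)   # number of dimensions: 2
print(np_baseball.size)   # total number of elements: 8
print(np_baseball.dtype)  # float64, because ints and floats are mixed
```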


### Baseball data in 2D form
You have another look at the MLB data and realize that it makes more sense to restructure all this information in a 2D numpy array. This array should have 1015 rows, corresponding to the 1015 baseball players you have information on, and 2 columns (for height and weight).

The MLB was, again, very helpful and passed you the data in a different structure, a Python list of lists. In this list of lists, each sublist represents the height and weight of a single baseball player. The name of this embedded list is baseball.

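Can you store the data as a 2D array to unlock numpy's extra functionality? Note that in this notebook baseball is still the four-player sample list defined above, so the printed shape is (4, 2) rather than (1015, 2).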

In [2]:
# baseball is available as a regular list of lists

# Import numpy package
import numpy as np

# Create a 2D numpy array from baseball: np_baseball
np_baseball = np.array(baseball)

# Print out the shape of np_baseball
print(np_baseball.shape)

(4, 2)


## Subsetting 2D NumPy Arrays
If your 2D numpy array has a regular structure, i.e. each row and column has a fixed number of values, complicated ways of subsetting become very easy. Have a look at the code below where the elements "a" and "c" are extracted from a list of lists.

# regular list of lists
x = [["a", "b"], ["c", "d"]]
[x[0][0], x[1][0]]

# numpy
import numpy as np
np_x = np.array(x)
np_x[:, 0]

For regular Python lists, this is a real pain. For 2D numpy arrays, however, it's pretty intuitive! The indexes before the comma refer to the rows, while those after the comma refer to the columns. The : is for slicing; in this example, it tells Python to include all rows.
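The row-before-comma, column-after-comma rule can be sketched as follows. The variable names first_col_list, first_col_np, second_row and element are introduced here only for illustration.

```python
import numpy as np

x = [["a", "b"], ["c", "d"]]
np_x = np.array(x)

# First column with plain lists: visit every row by hand
first_col_list = [row[0] for row in x]

# First column with numpy: all rows (:), column 0
first_col_np = np_x[:, 0]

# Second row: row 1, all columns (:)
second_row = np_x[1, :]

# Single element: row 1, column 0
element = np_x[1, 0]

print(first_col_list)  # ['a', 'c']
print(first_col_np)    # ['a' 'c']
print(second_row)      # ['c' 'd']
print(element)         # c
```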

The code that converts the pre-loaded baseball list to a 2D numpy array is already in the script. The first column contains the players' height in inches and the second column holds player weight, in pounds. Add some lines to make the correct selections. Remember that in Python, the first element is at index 0!

In [7]:
# baseball is available as a regular list of lists

# Import numpy package
import numpy as np

# Create np_baseball (2 cols)
np_baseball = np.array(baseball)

# Print out the 4th row of np_baseball
print(np_baseball[3,:])

# Select the entire second column of np_baseball: np_weight_lb
np_weight_lb = np_baseball[:,1]

# Print out height of 3rd player
print(np_baseball[2,0])

[188.   75.2]
210.0
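Slices can also take a start and stop index, just as with regular lists. The sketch below assumes the same four-player data as above and uses the hypothetical name middle for the selected block. It also shows that selecting a single row or column returns a 1D array.

```python
import numpy as np

np_baseball = np.array([[180, 78.4],
                        [215, 102.7],
                        [210, 98.5],
                        [188, 75.2]])

# Rows 1 and 2 (the stop index 3 is excluded), all columns
middle = np_baseball[1:3, :]
print(middle.shape)  # (2, 2)

# A single row or a single column comes back as a 1D array
print(np_baseball[3, :].shape)  # (2,)
print(np_baseball[:, 1].shape)  # (4,)
```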


## 2D Arithmetic
Remember how you calculated the Body Mass Index for all baseball players? numpy was able to perform all calculations element-wise (i.e. element by element). For 2D numpy arrays this isn't any different! You can combine matrices with single numbers, with vectors, and with other matrices.

Execute the code below in the IPython shell and see if you understand:

import numpy as np

np_mat = np.array([[1, 2],
                   [3, 4],
                   [5, 6]])

np_mat * 2

np_mat + np.array([10, 10])

np_mat + np_mat
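For reference, here is what those three expressions evaluate to, written as a small self-contained sketch. The names doubled, shifted and summed are introduced here for illustration only.

```python
import numpy as np

np_mat = np.array([[1, 2],
                   [3, 4],
                   [5, 6]])

# Matrix with a single number: every element is doubled
doubled = np_mat * 2
# [[ 2  4]
#  [ 6  8]
#  [10 12]]

# Matrix with a vector: [10, 10] is added to every row
shifted = np_mat + np.array([10, 10])
# [[11 12]
#  [13 14]
#  [15 16]]

# Matrix with a matrix of the same shape: element-by-element sum
summed = np_mat + np_mat
# same values as np_mat * 2
```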

np_baseball is coded for you; it's again a 2D numpy array, here with 2 columns representing height (in inches) and weight (in pounds).

In [16]:
import numpy as np

# Define two 2D numpy arrays
a = np.array([[1, 2], [3, 4]])
b = np.array([[5, 6], [7, 8]])

# Add the two arrays
c = a + b

# Subtract the two arrays
d = a - b

# Multiply the two arrays element-wise
e = a * b

# Divide the two arrays element-wise
f = a / b

print("a + b = \n", c)
print("a - b = \n", d)
print("a * b = \n", e)
print("a / b = \n", f)

a + b = 
 [[ 6  8]
 [10 12]]
a - b = 
 [[-4 -4]
 [-4 -4]]
a * b = 
 [[ 5 12]
 [21 32]]
a / b = 
 [[0.2        0.33333333]
 [0.42857143 0.5       ]]


In [17]:
# Import numpy package
import numpy as np

# Example list of lists
baseball = [[180, 78.4], [215, 102.7], [210, 98.5], [188, 75.2]]

# Define updated as a numpy array
updated = np.array(baseball)

# Create np_baseball (2 cols)
np_baseball = np.array(baseball)

# Print out addition of np_baseball and updated
print(np_baseball + updated)

# Create numpy array: conversion (one factor per column: inches to meters, pounds to kilograms)
conversion = np.array([0.0254, 0.453592])

# Print out product of np_baseball and conversion
print(np_baseball * conversion)

[[360.  156.8]
 [430.  205.4]
 [420.  197. ]
 [376.  150.4]]
[[ 4.572     35.5616128]
 [ 5.461     46.5838984]
 [ 5.334     44.678812 ]
 [ 4.7752    34.1101184]]
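The shape of the vector decides whether it is applied per column or per row. The sketch below illustrates the difference under the assumption of the same four-player data. The factors in per_column and per_row are arbitrary illustrative values, not unit conversions.

```python
import numpy as np

np_baseball = np.array([[180, 78.4],
                        [215, 102.7],
                        [210, 98.5],
                        [188, 75.2]])

# Shape (2,): one factor per column, applied to every row
per_column = np.array([0.5, 2.0])
by_column = np_baseball * per_column
print(by_column[0])  # first row: height halved (90.0), weight doubled (156.8)

# Shape (4, 1): one factor per row, applied to every column
per_row = np.array([1, 0, 1, 0]).reshape(4, 1)
by_row = np_baseball * per_row
print(by_row[1])  # second row: both values multiplied by 0
```

A vector with one entry per column is usually what you want for unit conversions, since each column holds one kind of measurement.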
