### Import Libraries

In [3]:
import numpy as np  # Numerical Python

In [1]:
presidents_heights = [
    189, 170, 189, 163, 183, 171, 185, 168, 173, 183, 173, 173, 175, 178, 183, 193, 178, 173, 174, 183, 183, 180,
    168, 180, 170, 178, 182, 180, 183, 178, 182, 188, 175, 179, 183, 193, 182, 183, 177, 185, 188, 188, 182, 185, 191
]

How many presidents are taller than 188 cm?

In [2]:
cnt = 0
for height in presidents_heights:
    if height > 188:
        cnt += 1
print(cnt)

5


The NumPy way:

In [5]:

heights_arr = np.array(presidents_heights)
print((heights_arr > 188).sum())

5
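
To see why the one-line count works, here is a minimal sketch on a small made-up sample (the `sample` values and the name `mask` are illustrative, not part of the presidents data). Comparing an array with a number gives an array of booleans, and summing it counts each `True` as 1:

```python
import numpy as np

# A small illustrative sample, not the real presidents data
sample = np.array([189, 170, 193, 188])

# The comparison is applied element by element and returns booleans
mask = sample > 188
print(mask.tolist())  # [True, False, True, False]

# True counts as 1 and False as 0, so the sum is the number of matches
print(mask.sum())  # 2
```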


NumPy's array class is called `ndarray`, short for **n-dimensional array**:

In [7]:
print(type(heights_arr))

<class 'numpy.ndarray'>


### Size and Shape

In [6]:
print(heights_arr.size)

45


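The `size` attribute gives the total number of elements in the array. For a 1D array this matches the built-in Python function `len`, which returns the length of objects such as str, list, and dict. For arrays with more than one dimension, `len` returns only the length along axis 0.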

In [10]:
print(len(heights_arr))

45
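
The two values agree here only because `heights_arr` is one-dimensional. A short sketch with a hypothetical 2 by 3 array (the name `grid` is made up for illustration) shows where they differ:

```python
import numpy as np

# A hypothetical 2 by 3 array holding the numbers 0 to 5
grid = np.arange(6).reshape((2, 3))

print(grid.size)  # 6, the total number of elements
print(len(grid))  # 2, the number of rows (length along axis 0)
```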


The `shape` attribute tells us the length of the array along each dimension:

In [9]:
print(heights_arr.shape)

(45,)


The output is a **tuple** (recall that the built-in tuple type is immutable, whereas a list is mutable). It contains a single value, indicating that there is only one dimension, i.e., axis 0. Along axis 0 there are 45 elements, one for each president. Here, `heights_arr` is a 1D array.
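
As a small sketch of how the shape tuple relates to the number of dimensions, the example below uses a made-up three-element array (`one_d` is an illustrative name). The `ndim` attribute reports the number of axes, which equals the length of the shape tuple:

```python
import numpy as np

# A made-up 1D array with three elements
one_d = np.array([189, 170, 189])

print(one_d.shape)     # (3,)  a tuple with one entry
print(one_d.ndim)      # 1     one axis, matching len(one_d.shape)
print(one_d.shape[0])  # 3     the number of elements along axis 0
```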

### Reshape

In [11]:
presidents_ages = [
    57, 61, 57, 57, 58, 57, 61, 54, 68, 51, 49, 64, 50, 48, 65, 52, 56, 46,
    54, 49, 51, 47, 55, 55, 54, 42, 51, 56, 55, 51, 54, 51, 60, 62, 43, 55,
    56, 61, 52, 69, 64, 46, 54, 47, 70
]

In [12]:
heights_and_ages = presidents_heights + presidents_ages 
# convert a list to a numpy array
heights_and_ages_arr = np.array(heights_and_ages)
print(heights_and_ages_arr.shape)

(90,)


This produces one long array. It would be clearer if we could align height and age for each president and reorganize the data into a 2 by 45 matrix where the first row contains all heights and the second row contains ages. To achieve this, a new array can be created by calling `numpy.ndarray.reshape` with new dimensions specified in a tuple:

In [21]:
heights_and_ages_arr = heights_and_ages_arr.reshape((2,45))

In [22]:
print(heights_and_ages_arr)

[[189 170 189 163 183 171 185 168 173 183 173 173 175 178 183 193 178 173
  174 183 183 180 168 180 170 178 182 180 183 178 182 188 175 179 183 193
  182 183 177 185 188 188 182 185 191]
 [ 57  61  57  57  58  57  61  54  68  51  49  64  50  48  65  52  56  46
   54  49  51  47  55  55  54  42  51  56  55  51  54  51  60  62  43  55
   56  61  52  69  64  46  54  47  70]]


NumPy can infer one dimension for us if we pass -1 in its place; the total number of elements must stay the same. For example, given a 2D array `arr` of shape (3,4), `arr.reshape(-1)` would output a 1D array of shape (12,), while `arr.reshape((-1,2))` would generate a 2D array of shape (6,2).
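
The example in the paragraph above can be checked with a short sketch. The array `arr` is filled with the numbers 0 to 11 purely for illustration:

```python
import numpy as np

# A 3 by 4 array filled with 0..11 for illustration
arr = np.arange(12).reshape((3, 4))

# -1 asks NumPy to work out the missing dimension from the element count
flat = arr.reshape(-1)
pairs = arr.reshape((-1, 2))

print(flat.shape)   # (12,)
print(pairs.shape)  # (6, 2)
```

If the element count cannot be divided evenly, for example `arr.reshape((-1, 5))`, NumPy raises a `ValueError`.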

### Data Type

Another characteristic of a NumPy array is that it is **homogeneous**, meaning every element must be of the same data type.

In [14]:
print(heights_arr.dtype)

int64


#### Type coercion

In [17]:
heights_float = [
    189.0,  # we mixed a float number in
    170, 189, 163, 183, 171, 185, 168, 173, 183, 173, 173, 175, 178, 183,
    193, 178, 173, 174, 183, 183, 180, 168, 180, 170, 178, 182, 180, 183, 178, 182,
    188, 175, 179, 183, 193, 182, 183, 177, 185, 188, 188, 182, 185, 191
]
heights_float_arr = np.array(heights_float)
print(heights_float_arr)
print("\n")
print("Type of heights_float_arr:", heights_float_arr.dtype)

[189. 170. 189. 163. 183. 171. 185. 168. 173. 183. 173. 173. 175. 178.
 183. 193. 178. 173. 174. 183. 183. 180. 168. 180. 170. 178. 182. 180.
 183. 178. 182. 188. 175. 179. 183. 193. 182. 183. 177. 185. 188. 188.
 182. 185. 191.]


Type of heights_float_arr: float64


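Because the list contained one float, NumPy converted every element to `float64` so that the array stays homogeneous. NumPy supports several data types, such as `int` (integer), `float` (floating point), and `bool` (Boolean values, True and False). The number after the type name, e.g. the 64 in `int64`, is the **bit size**: the number of bits used to store each element.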
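
A minimal sketch of these ideas, using made-up two-element arrays (`ints` and `mixed` are illustrative names). The integer type is set explicitly here because the default integer size can depend on the platform and NumPy version:

```python
import numpy as np

# Request 64-bit integers explicitly
ints = np.array([189, 170], dtype=np.int64)

# One float in the input is enough to make the whole array float64
mixed = np.array([189.0, 170])

print(ints.dtype)   # int64
print(mixed.dtype)  # float64

# itemsize is the size of one element in bytes; 8 bytes = 64 bits
print(ints.itemsize * 8)  # 64

# astype returns a converted copy and leaves the original unchanged
as_float = ints.astype(np.float64)
print(as_float.dtype)  # float64
```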

### Indexing

We can use array indexing to select individual elements from arrays. Like Python lists, NumPy **indexing starts from 0**.

To access the height of the **3rd** president, Thomas Jefferson, in the 1D array `heights_arr`:

In [18]:
print(heights_arr[2])

189


In a **2D array**, there are two axes, axis 0 and axis 1. Axis 0 runs vertically down the **rows**, whereas axis 1 runs horizontally across the **columns**.

Recall that the 2D array `heights_and_ages_arr` has shape (2, 45). To find Thomas Jefferson's age at the beginning of his presidency, you need to access the second row, where ages are stored, and the third column:

In [23]:
print(heights_and_ages_arr[1,2])

57


In a 2D array, the **row is axis 0** and the **column is axis 1**, so NumPy reads the first index as the position along the rows and the second as the position along the columns. In our example, `heights_and_ages_arr[1,2]` accesses the second row (index 1, the ages) and the third column (index 2, the third president) to find Thomas Jefferson's age.
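
The row-then-column order can be sketched on a cut-down table that holds only the first three presidents from the lists above (the name `table` is made up for this example):

```python
import numpy as np

# Row 0: heights, row 1: ages, for the first three presidents only
table = np.array([
    [189, 170, 189],
    [57, 61, 57],
])

print(table[0, 2])  # 189, row index 0 (heights), column index 2
print(table[1, 2])  # 57,  row index 1 (ages), column index 2

# Indexing in two steps gives the same element: pick the row, then the column
print(table[1][2])  # 57
```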