# Introduction to NumPy

### Outline
* [Getting started](#getting-started)
* [ndarray](#ndarray)
* [Creating single-dimensional arrays with a Python list](#single-dimentional-arrays-with-python-list)
* [Array attributes](#array-attributes)
    * dtype
    * ndim
    * size
    * shape
    * itemsize
    * data
* [Creating multi-dimensional arrays with Python sequences](#multi-dimentional-arrays-with-python-sequences)
* [More creational routines](#more-creational-routines)
   * zeros
   * ones
   * arange
* Manipulating arrays
    * reshape
* More routines
    * linspace
    * random
* Basic Operations
    * Scalar addition
    * Scalar multiplication
    * Element-wise addition
    * Element-wise subtraction
    * Element-wise multiplication
    * Matrix dot product
* Universal functions
* Aggregate functions
    * sum
    * min 
    * max 
    * mean 
    * std
* Indexing
* Slicing
* Iteration
* Conditions and boolean arrays
* Shape manipulation
    * reshape
    * ravel
* Array manipulation
    * joining arrays
        * vstack
        * hstack
        * column_stack
        * row_stack
    * splitting arrays
        * hsplit
        * vsplit
        * split
* Vectorization
* Broadcasting
* Structured Arrays
* Reading and writing data on files
    * saving data as binary
    * loading binary data
    * reading tabular data
 

<a id="getting-started"></a>
### Getting Started

By convention, NumPy is imported under the alias `np`.

In [17]:
import numpy as np

<a id="ndarray"></a>
### ndarray

An [ndarray](https://numpy.org/devdocs/reference/arrays.ndarray.html#) is a multidimensional homogeneous array with a predetermined number of items.

* Homogeneous means that all items have the same [dtype](https://numpy.org/doc/1.17/reference/arrays.dtypes.html?highlight=dtype) and therefore the same size in bytes.
* The data type is specified by another NumPy object called dtype (data-type).
* Each ndarray is associated with exactly one data-type.
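
To make the homogeneity point concrete, here is a small sketch. The variable name and sample values are illustrative only and are not part of the data used later. When the input mixes integers and a float, NumPy settles on a single dtype that can represent every item:

```python
import numpy as np

# A list mixing ints and a float: NumPy picks one dtype for the whole array.
mixed = np.array([1, 2.5, 3])

print(mixed.dtype)     # float64: the ints were converted to floats
print(mixed.itemsize)  # 8: every item occupies the same number of bytes
```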

<a id="single-dimentional-arrays-with-python-list"></a>
### Creating single-dimensional arrays with a Python list

Arrays can be constructed using [creational routines](https://numpy.org/devdocs/reference/generated/numpy.array.html#numpy.array).

[Data source](https://www.metacritic.com/movie/serenity/critic-reviews) for the scores below.

In [18]:
serenity_movie_critic_scores = [
    100, 91, 90, 90, 88, 88, 80, 80, 80, 80, 
     80, 80, 78, 75, 75, 75, 75, 75, 75, 75, 
     75, 75, 75, 75, 70, 70, 70, 70, 70, 70,
     63, 50, 50, 50
]

The list can be converted to an ndarray with [numpy.array](https://numpy.org/devdocs/reference/generated/numpy.array.html#numpy.array).

In [19]:
x = np.array(serenity_movie_critic_scores, dtype=np.int32)

In [20]:
x

array([100,  91,  90,  90,  88,  88,  80,  80,  80,  80,  80,  80,  78,
        75,  75,  75,  75,  75,  75,  75,  75,  75,  75,  75,  70,  70,
        70,  70,  70,  70,  63,  50,  50,  50], dtype=int32)

In [21]:
type(x)

numpy.ndarray

<a id="array-attributes"></a>
### Array Attributes

[Array attributes](https://numpy.org/doc/1.17/reference/arrays.ndarray.html#array-attributes) reflect information that is intrinsic to the array itself. 

In [22]:
# Data-type of the array’s elements.
x.dtype

dtype('int32')

In [23]:
# Number of array dimensions.
x.ndim

1

In [24]:
# Number of elements in the array.
x.size

34

In [25]:
# Tuple of array dimensions.
x.shape

(34,)

In [26]:
# Length of one array element in bytes.
x.itemsize

4

**Knowledge check**  
When we created the array, we specified that each item should have a datatype of np.int32.  
How would the value of the itemsize attribute change if we changed the datatype to np.int64?  
Try it out!
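
One way to check your answer is the sketch below. It builds a small array with each dtype and compares `itemsize`. The three sample values and the variable names are arbitrary choices for illustration:

```python
import numpy as np

scores = [100, 91, 90]  # arbitrary sample values

as_int32 = np.array(scores, dtype=np.int32)
as_int64 = np.array(scores, dtype=np.int64)

print(as_int32.itemsize)  # 4 bytes per item (32 bits)
print(as_int64.itemsize)  # 8 bytes per item (64 bits)

# nbytes is the total memory used by the items: size * itemsize.
print(as_int32.nbytes, as_int64.nbytes)  # 12 24
```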

In [27]:
# Python buffer object pointing to the start of the array’s data.
x.data

<memory at 0x110e8f390>

<a id="multi-dimentional-arrays-with-python-sequences"></a>
### Creating Multi-Dimensional Arrays With Python Sequences

Matrix resources:
* [Khan Academy - Introduction to matrices](https://www.khanacademy.org/math/precalculus/x9e81a4f98389efdf:matrices/x9e81a4f98389efdf:mat-intro/v/introduction-to-the-matrix)
* [Wikipedia - Matrix](https://en.wikipedia.org/wiki/Matrix_(mathematics))

Let's create a 2D matrix containing the quarterly US sales for Harley-Davidson motorcycles.  

|  Year  | Q1     | Q2     | Q3     |
| ------ | ------ | ------ | ------ |
| 2018   | 29309  | 46490  | 36220  |
| 2019   | 28091  | 42762  | 34903  |

Data sources:
* [2019 Q1](https://investor.harley-davidson.com/static-files/34d087e4-95d5-45ff-9098-fe8c09bee292)
* [2019 Q2](https://investor.harley-davidson.com/static-files/2a5df0f5-6ea5-4860-803e-bc31828e0526)
* [2019 Q3](https://investor.harley-davidson.com/static-files/51bb4f70-77c6-4526-9c87-046c3a7c0f5e)



In [53]:
# We could also have used a list of lists or a tuple of tuples.
hd_us_sales = [(29309, 46490, 36220), (28091, 42762, 34903)]

In [44]:
hd_us_sales

[(29309, 46490, 36220), (28091, 42762, 34903)]

In [62]:
# Notice that we did not specify the dtype when creating the array.
# NumPy automatically chooses the minimum type required to hold the objects in the sequence.
X = np.array(hd_us_sales)

In [46]:
X

array([[29309, 46490, 36220],
       [28091, 42762, 34903]])

In [54]:
type(X)

numpy.ndarray
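
The kind of nesting used for the input does not affect the result. As a minimal sketch (the variable names are illustrative), the same sales figures passed as a list of tuples, a list of lists, or a tuple of tuples all produce equal arrays:

```python
import numpy as np

as_list_of_tuples = np.array([(29309, 46490, 36220), (28091, 42762, 34903)])
as_list_of_lists = np.array([[29309, 46490, 36220], [28091, 42762, 34903]])
as_tuple_of_tuples = np.array(((29309, 46490, 36220), (28091, 42762, 34903)))

print(np.array_equal(as_list_of_tuples, as_list_of_lists))    # True
print(np.array_equal(as_list_of_tuples, as_tuple_of_tuples))  # True
print(as_tuple_of_tuples.shape)                               # (2, 3)
```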

Once again, let's have a look at the attributes associated with this ndarray.

In [49]:
X.dtype

dtype('int64')

In [50]:
X.ndim

2

In [51]:
X.size

6

In [52]:
X.shape

(2, 3)

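X has two axes, so `X.ndim` is 2: the first axis has length 2 (one row per year) and the second has length 3 (one column per quarter). Older NumPy documentation calls the number of axes the array's rank. The call below computes something different, the linear-algebra rank of the matrix (its number of linearly independent rows), which for X also happens to be 2.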

In [60]:
np.linalg.matrix_rank(X)

2
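
The number of axes (`ndim`) and the value returned by `np.linalg.matrix_rank` need not agree. A minimal sketch, using an illustrative matrix `A` that is not part of the sales data, where the two differ:

```python
import numpy as np

# Every row is identical, so only one row is linearly independent.
A = np.ones((3, 3))

print(A.ndim)                    # 2 (number of axes)
print(np.linalg.matrix_rank(A))  # 1 (linear-algebra rank)
```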

<a id="more-creational-routines"></a>
### More Creational Routines

There are times when we want to create arrays initialized with default values.

#### Zeros
Let's create an array of a given shape and data type filled with zeros.  
The [zeros creation routine](https://numpy.org/devdocs/reference/generated/numpy.zeros.html#numpy.zeros) expects a shape. It also accepts two optional parameters, `dtype` and `order`.

numpy.zeros(shape, dtype=float, order='C')

In [64]:
x = np.zeros((2, 3), dtype=int)

In [65]:
x

array([[0, 0, 0],
       [0, 0, 0]])
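
The two optional parameters in the signature above can be explored with a short sketch (the variable names are illustrative). It shows the default dtype when none is given and what `order` changes:

```python
import numpy as np

# Without dtype, zeros() produces floats.
default_zeros = np.zeros((2, 3))
print(default_zeros.dtype)  # float64

# order='F' requests column-major (Fortran-style) memory layout.
# The values and shape are the same; only the layout in memory differs.
fortran_zeros = np.zeros((2, 3), order='F')
print(fortran_zeros.flags['F_CONTIGUOUS'])  # True
print(default_zeros.flags['C_CONTIGUOUS'])  # True
```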

#### Ones

Similarly, the [ones creational routine](https://numpy.org/devdocs/reference/generated/numpy.ones.html#numpy.ones) will generate an array where every element's value is one.

In [66]:
x = np.ones((4, 4))

In [67]:
x

array([[1., 1., 1., 1.],
       [1., 1., 1., 1.],
       [1., 1., 1., 1.],
       [1., 1., 1., 1.]])
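
The output above shows floats (`1.`) because `ones`, like `zeros`, defaults to a floating-point dtype. A minimal sketch of requesting integers instead, with an illustrative variable name:

```python
import numpy as np

# Passing dtype=int yields integer ones rather than floats.
int_ones = np.ones((2, 3), dtype=int)

print(int_ones)
print(int_ones.sum())  # 6: one per element in a 2 x 3 array
```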

#### Range

The [arange creational routine](https://numpy.org/devdocs/reference/generated/numpy.arange.html#numpy.arange) is useful when we need to create an array with evenly spaced values within a given interval.

numpy.arange([start, ]stop, [step, ]dtype=None)

In [70]:
x = np.arange(10)
y = np.arange(20, 30)
z = np.arange(40, 50, 2)

In [69]:
x

array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])

In [71]:
y

array([20, 21, 22, 23, 24, 25, 26, 27, 28, 29])

In [72]:
z

array([40, 42, 44, 46, 48])
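
In each of the examples above, `stop` is excluded from the result. The following sketch, with illustrative variable names, shows that the same rule holds for a fractional step and for a negative step:

```python
import numpy as np

# Fractional step: the values are 0, 0.25, 0.5, 0.75 (1.0 is excluded).
quarters = np.arange(0, 1, 0.25)
print(quarters)

# Negative step counts down: the values are 10, 7, 4, 1 (0 is excluded).
countdown = np.arange(10, 0, -3)
print(countdown)
```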