# NumPy - Lesson 01

## Lesson Objectives
* Understand what a NumPy **array** is
* Experiment with **built-in methods** that generate arrays
* Use array **methods and attributes**

<img src="https://raw.githubusercontent.com/numpy/numpy/181f273a59744d58f90f45d953a3285484c72cba/branding/logo/primary/numpylogo.svg" width="25%" height="25%" />

* NumPy is a library for the Python programming language, adding support for large, multi-dimensional arrays and matrices, along with a large collection of high-level mathematical functions to operate on these arrays.
* It is a **Linear Algebra** Library and almost all of the libraries used in Data Science rely on NumPy as one of their main building blocks.

---

## Import Package

* Colab offers a session with a set of packages already installed. To check which packages are installed, type **!pip freeze** in a code cell and run it. If NumPy is not installed, type and run **!pip install numpy** in a code cell.
* NumPy should already be included in this set of packages, so you only need to import it.

In [None]:
import numpy as np

---

## Array

* An **array** is the foundation of NumPy. It is a grid of values that all share one data type, such as numbers or strings. A 1-d array is often called a vector and a 2-d array a matrix; arrays can also have more dimensions (n-d). A matrix can also have just 1 row x 1 column.
* An array is **useful** because it helps to organize data. With this structure, elements can easily be sorted or searched.
* We can create a 1-d array based on a list:


In [None]:
my_list = [7,9,88,4621]
my_list

* To create an array from a list, use **np.array()** and pass the list as the argument
* Please note the single opening bracket before the first item in the output below. That indicates it is a **1-d array**.

In [None]:
arr = np.array(my_list)
arr

* Just a side note
  * You don't necessarily have to pass the list as a variable to the np.array() function
  * You can write the list directly if you prefer, for example: np.array([7,9,88,4621])
  * Both will create the same array

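* An array can **handle** numbers (integer, float etc), strings (text), timestamps (dates). All elements of one array share a single data type, so in the mixed list below the numbers are stored as strings.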

In [None]:
my_list = ['text','label_example',55,150,'final_text_example']
arr = np.array(my_list)
arr
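
The mixed list above can be inspected to see how NumPy stores it. The sketch below is only an illustration; the variable name `mixed` is made up for this example.

```python
import numpy as np

# Hypothetical name `mixed`: same values as the list in the cell above
mixed = np.array(['text', 'label_example', 55, 150, 'final_text_example'])

# dtype.kind 'U' means the array holds Unicode strings
print(mixed.dtype.kind)

# The integer 55 was converted to the string '55'
print(mixed[2] == '55')   # True
print(mixed[2] == 55)     # False
```

If you need to do arithmetic on the numbers, keep them in a separate, purely numeric array.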

* If your list has more than 1 dimension, you can create a **2-d array**, or matrix.
* Please note the 2 brackets before the first item. That indicates it is a 2-d array.


In [None]:
my_matrix = [[10,80,77], [99,99,99], ["this is good","string1","example"]]
my_matrix

In [None]:
arr = np.array(my_matrix)
arr
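
Counting brackets works, but you can also ask the array directly. A minimal sketch, assuming the illustrative names `vector` and `matrix` (not used elsewhere in this lesson), that reads the number of dimensions from the `.ndim` attribute:

```python
import numpy as np

vector = np.array([7, 9, 88, 4621])               # built from a flat list
matrix = np.array([[10, 80, 77], [99, 99, 99]])   # built from a list of lists

print(vector.ndim, vector.shape)   # 1 (4,)
print(matrix.ndim, matrix.shape)   # 2 (2, 3)
```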

---

## Built-in methods to generate arrays

* You can **generate data for your arrays** using built-in methods
* You can quickly create an **evenly spaced** array of numbers using np.arange(). 
    * You may provide 3 arguments: the start, the stop and the step size of the interval.
    * The stop value is not included in the result.
    * Play around with different step values to see the effect. You may try 0.5, 1, 2, and 5.

In [None]:
arr = np.arange(start=1,stop=9,step=1)
arr

In [None]:
arr = np.arange(start=1,stop=9,step=0.5)
arr

In [None]:
arr = np.arange(start=1,stop=9,step=5)
arr
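
To make the exclusive stop concrete, here is a short sketch with the same start and stop as above. The name `half_steps` is only for illustration.

```python
import numpy as np

# With step=5 only 1 and 6 fit before the (excluded) stop of 9
print(np.arange(start=1, stop=9, step=5))   # [1 6]

# With step=2 the last value is 7, because 9 itself is never included
print(np.arange(start=1, stop=9, step=2))   # [1 3 5 7]

# With step=0.5 there are 16 values and the last one is 8.5
half_steps = np.arange(start=1, stop=9, step=0.5)
print(len(half_steps), half_steps[-1])      # 16 8.5
```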

* You can also create an **array of zeros using np.zeros()**. Just pass the shape of the desired array. The example below has a shape of 2 x 3 and is a 2-d array
* Please note the 2 brackets before the first item. That indicates it is a 2-d array.

In [None]:
arr = np.zeros((2,3))
arr

* The example below has shape of 2 x 2 x 3 and is a 3-d array of zeros
* Please note the 3 brackets before the first item. That indicates it is a 3-d array.

In [None]:
arr = np.zeros((2,2,3))
arr

* You can also create an **array of all ones using np.ones()**. The example below is a 2-d array with shape 2 x 5.

In [None]:
arr = np.ones((2,5))
arr

* You can also create an **identity matrix using np.eye()**, that is, a square matrix with ones along its main diagonal and zeros everywhere else.
* The example below is an identity matrix of shape 4 x 4.

In [None]:
arr = np.eye(4)
arr

* A similar function to np.arange() is **np.linspace()**, but instead of a step argument it takes a num argument. **num** is the number of evenly spaced samples to generate over the interval.
  * You provide the start, the stop and **how many points you want in total, including both endpoints**.
  * The stop value is inclusive here (by default)

In [None]:
arr = np.linspace(start=10,stop=50,num=5)
arr

* If you want 10 numbers evenly spaced from 10 to 50, you may change num to 10.

In [None]:
arr = np.linspace(start=10,stop=50,num=10)
arr
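
The sketch below checks the two properties described above: both endpoints are included, and the spacing follows from num. The names `five`, `ten` and `spacing` are illustrative only.

```python
import numpy as np

five = np.linspace(start=10, stop=50, num=5)
print(five)                 # [10. 20. 30. 40. 50.]

ten = np.linspace(start=10, stop=50, num=10)
print(ten[0], ten[-1])      # 10.0 50.0

# 10 points leave 9 gaps, so the spacing is (50 - 10) / 9
spacing = ten[1] - ten[0]
print(np.isclose(spacing, 40 / 9))   # True
```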

* You can also create an array of a given shape with **random values from 0 (inclusive) to 1 (exclusive) using np.random.rand()**
    * The arguments are the dimensions of the shape you are interested in

In [None]:
arr = np.random.rand(2,2)  # 2-d array
arr

In [None]:
arr = np.random.rand(3,4,2) # 3-d array
arr

* You can also create an array of a given shape from a **"standard normal" distribution using np.random.randn()**
  * The arguments are the dimensions of the shape you are interested in
  * A standard normal distribution is a normal distribution with a mean of zero and a standard deviation of 1. We will get back to that in future sections of the course

In [None]:
arr = np.random.randn(8,2) # 2-d array with 8 rows and 2 columns
arr

In [None]:
arr = np.random.randn(25) # 1-d array with 25 elements
arr
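
One way to see the "mean of zero and standard deviation of 1" claim is to draw a large sample and measure it. This is a rough sketch: the sample size and the name `sample` are arbitrary choices, and the printed values will only be close to 0 and 1, not exact.

```python
import numpy as np

np.random.seed(seed=0)                 # fixed seed so the run is repeatable
sample = np.random.randn(100_000)      # 1-d array with 100,000 elements

print(sample.mean())   # close to 0
print(sample.std())    # close to 1
```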

* You can also create random integers setting the interval and size using **np.random.randint()**
  * The arguments for the interval are low (inclusive) and high (exclusive). Size is the output shape; it can be an integer or a tuple.

In [None]:
arr = np.random.randint(low=10,high=50,size=5) # 1-d array, 5 elements
arr 

In [None]:
arr = np.random.randint(low=250,high=888,size=(4,3)) # 2-d array, 4 rows x 3 columns
arr 
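
A narrow interval makes the inclusive low and exclusive high easy to observe. The sketch below uses a hypothetical name `draws` and an arbitrary interval of 10 to 13.

```python
import numpy as np

np.random.seed(seed=0)
draws = np.random.randint(low=10, high=13, size=1000)

# Only 10, 11 and 12 can occur; 13 is never drawn because high is exclusive
print(np.unique(draws))
print(draws.min(), draws.max())
```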

---

* You may want to generate reproducible ("constant") random values
  * **Run the example below multiple times**
  * **Note that the random values change on every run**

In [None]:
arr = np.random.randint(low=10,high=50,size=5)
arr

* You need to set the **NumPy seed** in order to get the same random values every time
  * In a Jupyter notebook code cell, you just have to add **np.random.seed()** before defining your array(s)
  * The argument is seed, an integer. You can set any integer
  * **Run the cell below multiple times and note that the array values are random but the same on every run**

In [None]:
np.random.seed(seed=123)
arr = np.random.randint(low=10,high=50,size=5)
arr
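
The effect of the seed can be checked in a single cell by seeding twice and comparing the results. The second half of the sketch uses `np.random.default_rng()`, a newer NumPy interface that this lesson does not otherwise cover; it is shown only as an alternative, and all variable names here are illustrative.

```python
import numpy as np

# Same seed, same call -> same "random" values
np.random.seed(seed=123)
first = np.random.randint(low=10, high=50, size=5)
np.random.seed(seed=123)
second = np.random.randint(low=10, high=50, size=5)
print(np.array_equal(first, second))   # True

# Alternative (not used in this lesson): independent generators with their own seed
rng_a = np.random.default_rng(123)
rng_b = np.random.default_rng(123)
draw_a = rng_a.integers(low=10, high=50, size=5)
draw_b = rng_b.integers(low=10, high=50, size=5)
print(np.array_equal(draw_a, draw_b))  # True
```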

---

## Array Methods

* You can reshape an array without changing the data within it using the **.reshape()** method.
  * The example below shows a 1-d array with 40 elements.

In [None]:
np.random.seed(seed=0)
arr = np.random.randint(low=1,high=150,size=40)
arr

* You can reshape it as 4 rows x 10 columns, transforming it into a 2-d array
  * The argument is the shape you are interested in

In [None]:
arr = arr.reshape(4,10)  # 2-d array, 4 x 10
arr

In [None]:
arr.reshape(1,-1)  # 2-d array, 1 x 40; -1 lets NumPy infer the number of columns
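
The `-1` in a reshape call means "work this dimension out from the others". A small sketch on a 40-element array (built with `np.arange` here just to keep the values predictable) shows what is inferred and what happens when the sizes do not fit:

```python
import numpy as np

arr = np.arange(40)                 # 1-d array with 40 elements

print(arr.reshape(4, -1).shape)     # (4, 10)
print(arr.reshape(-1, 8).shape)     # (5, 8)
print(arr.reshape(1, -1).shape)     # (1, 40)

# 40 elements cannot be split into 3 equal rows, so NumPy raises an error
try:
    arr.reshape(3, -1)
except ValueError as err:
    print("Cannot reshape:", err)
```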

* The opposite situation can also occur: you have a multidimensional array and want to transform it into a 1-d array. You can use **.flatten()** for that
* Consider a 2-d array

In [None]:
np.random.seed(seed=0)
arr = np.random.randint(low=1,high=150,size=(2,5))
arr

* You can flatten it; you will notice it becomes a 1-d array

In [None]:
arr_flatten = arr.flatten()
arr_flatten

* Alternatively, you can get the same effect using **.reshape()** with -1 as the argument

In [None]:
arr_reshape = arr.reshape(-1)
arr_reshape
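
Both calls give the same values, but they differ in one detail that can matter later: `.flatten()` always returns a copy, while `.reshape(-1)` returns a view of the original data when it can. The sketch below illustrates this for a simple contiguous array; the names `flat_copy` and `flat_view` are made up for the example.

```python
import numpy as np

arr = np.arange(10).reshape(2, 5)

flat_copy = arr.flatten()     # independent copy of the data
flat_view = arr.reshape(-1)   # view on the same data (for a contiguous array like this one)

print(np.shares_memory(arr, flat_copy))   # False
print(np.shares_memory(arr, flat_view))   # True

# Changing the view changes the original array; the copy is unaffected
flat_view[0] = 99
print(arr[0, 0])      # 99
print(flat_copy[0])   # 0
```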

---

* Min and max values can be accessed using the .min() and .max() methods.

In [None]:
arr.max()

In [None]:
arr.min()

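* You can find the index position of the minimum or maximum value in the ndarray using the argmin() and argmax() methods. Called without arguments, as below, they return the index into the flattened array; pass the axis argument to search along a particular axis instead.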

In [None]:
arr.argmax()

In [None]:
arr.argmin()
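
For a 2-d array, the single number returned by argmax() counts positions row by row. The sketch below uses a small hand-written array, so the results are easy to verify, and `np.unravel_index()` to turn the flat index back into a row and column. The names `flat_index`, `row` and `col` are illustrative.

```python
import numpy as np

arr = np.array([[3, 80, 7],
                [42, 5, 61]])

flat_index = arr.argmax()          # position in the flattened array
print(flat_index)                  # 1

# Convert the flat position into (row, column)
row, col = np.unravel_index(flat_index, arr.shape)
print(row, col)                    # 0 1

print(arr.argmax(axis=0))          # [1 0 1] -> row of the max in each column
print(arr.argmax(axis=1))          # [1 2]   -> column of the max in each row
```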

---

## Array Attributes

* You can check the shape and type of a NumPy array using, respectively, the attributes **.shape** and **.dtype**
* Consider a 2-d array (5 x 2) made using arange()

In [None]:
arr = np.arange(start=1,stop=11,step=1).reshape(5,2)

* You can check its shape and dtype

In [None]:
print(
    f"* Array:\n {arr} \n\n"
    f"* Array shape: \n {arr.shape} \n\n"
    f"* Array type: \n {arr.dtype}"
    )
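
The integer dtype reported above can differ between platforms and NumPy versions. If you need a specific type, you can request one. The sketch below assumes two common ways of doing so, the `dtype` argument and the `.astype()` method; the names `arr_float` and `arr_small` are illustrative.

```python
import numpy as np

arr = np.arange(start=1, stop=11, step=1).reshape(5, 2)
print(arr.dtype)         # an integer dtype; the exact width may vary by platform

# Convert an existing array to another type (returns a new array)
arr_float = arr.astype(np.float64)
print(arr_float.dtype)   # float64

# Or ask for the type when the array is created
arr_small = np.arange(start=1, stop=11, step=1, dtype=np.int32).reshape(5, 2)
print(arr_small.dtype)   # int32
```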

---

## Challenges


### 1

* Set the NumPy seed to 1.
* In a variable called **arr**, create a 2-d array, 4 x 4, with random integers, where the lowest possible value is 1 and the highest is 100 (inclusive).
* In a print() statement, display the array

In [None]:
# Place here your code




In [None]:
np.random.seed(seed=1)
arr = np.random.randint(low=1,high=101,size=(4,4))
print(arr)

### 2

* Using your NumPy knowledge, together with your Python knowledge of the print() function and of displaying variables in an f-string, create the following statement:
  * **"The max value for the array is 80 and its index location is 6. The min value for the array is 2 and its index location is 9."**

In [None]:
# Place here your code




In [None]:
print(
    f"The max value for the array is {arr.max()} and its index location is {arr.argmax()}. "
    f"The min value for the array is {arr.min()} and its index location is {arr.argmin()}."
    )

---

## Well done! Proceed to the next lesson!



<img src="https://www.learningpeople.com/static/1e2d1b1046220ce9fa0222b5790b86bc/b3853/code-institute.png" width="20%" height="20%" />