***
## 3.1 Numpy Introduction

***
### Python3.1 Numpy Introduction
### Python3.2 Numpy DataTypes, Functions, and Random Module
### Python3.3 Numpy Iterating Over Arrays
### Python3.4 Numpy Manipulating Arrays
### Python3.5 Numpy Operations
### Python3.6 Numpy File Input and Output and Data Processing
### Python3.7 Numpy-Sort, Argsort, Nonzero, and Extract Functions
### Python3.8 Numpy BreakoutGroupExercises
### Python3.8 Numpy BreakoutGroupExercises - Solutions
***

## Numpy Introduction - Table of Contents 
### 1. Introduction to Numpy
### 2. Core Python vs. Numpy Comparison:
#### 1). A Simple Calculation Example in Core Python vs. Numpy
#### 2). Functionalities in Core Python vs. Numpy
#### 3). A Quick Numerical Programming Performance Comparison in Core Python vs. Numpy
#### 4). A Few Common Numpy Functions
### 3. Why Use Numpy 
### 4. Help
***

### 1. Introduction to Numpy


**`NumPy or Numpy, which stands for Numerical Python`**, is a Linear Algebra library consisting of multidimensional array objects and a collection of routines for processing those arrays. Using NumPy, mathematical and logical operations on arrays can be performed. Almost all of the data science libraries in PyData Ecosystem rely on NumPy as one of their main building blocks: 

- `Numpy` Introduces objects for multidimensional arrays and matrices, as well as functions that allow to easily broadcast and perform advanced mathematical and statistical operations on those objects 

- `Numpy` provides vectorization of mathematical operations on arrays and matrices which significantly improves the performance 

- `Numpy` has built-in linear algebra, statistical distributions, trigonometric, and random number capabilities, etc. 
    - In the `Numpy` package the terminology used for vectors, matrices, and higher-dimensional data sets is **Array**

- Numpy array is alternative to Python List and the calculations over entire arrays so **Easy and Fast**

- Many other python libraries are built on Numpy 
  - http://www.numpy.org/
  - https://docs.scipy.org/doc/numpy-1.11.0/numpy-user-1.11.0.pdf


#### Operations using Numpy:

Using Numpy, a developer can perform the following operations:

- Mathematical and logical operations on arrays.

- Fourier transforms and routines for shape manipulation.

- Operations related to linear algebra. NumPy has in-built functions for linear algebra and random number generation.

#### Numpy – A Replacement for MatLab

NumPy is often used along with packages like Pandas, SciPy (Scientific Python Library) and Matplotlib (Data Visualization Library). This combination is widely used as a replacement for MatLab, a popular platform for technical computing. However, Python alternative to MatLab is now seen as a more modern and complete programming language.

#### Numpy Arrays are the main way we will use and come in two ways:

- Vectors: 1 dimensional

- Matrices: 2 dimentional

(Note: a matrix can have one row or one column)

![image.png](attachment:image.png)

### 2. Core Python vs. Numpy Comparison:
- In `Core Python` (Python without external libraries), working with collections of numbers requires the use of loops, list comprehensions, or map/filter/reduce functions. 
- In `Numpy`, collections of numbers are the default and easy to work with since it is a package that provides high-performance vector, matrix, and higher-dimensional data structures for Python. 

#### Installation Instructions
It is highly recommended you install Python using the Anaconda distribution to make sure all underlying dependencies (such as Linear Algebra libraries) all sync up with the use of a conda install. If you have Anaconda, install Numpy by going to your terminal or command prompt and typing:
```markdown
conda install numpy

pip install numpy
```
If you do not have Anaconda and can not install it, please refer to Numpy's official documentation on various installation instructions.

In [34]:
# ! pip install numpy

#### 1). A Simple Calculation Example in Core Python vs. Numpy

In [35]:
# In Core Python:
height = [1.81, 1.79, 1.90, 2.0]
weight = [65.4, 34, 63.6, 6]

In [36]:
Ratio = weight / height    # Don't work

TypeError: unsupported operand type(s) for /: 'list' and 'list'

In [37]:
# In Numpy 
# import numpy library:
from numpy import *  # Less common way to import
import numpy as np   # "np" is the standard abbreviation for numpy - this is the common way to import

In [38]:
# Create Numpy array based on list (cast a normal Python list as a Numpy array)
np_height = np.array(height)
np_height

array([1.81, 1.79, 1.9 , 2.  ])

In [39]:
np_weight = np.array(weight)
np_weight

array([65.4, 34. , 63.6,  6. ])

In [40]:
Ratio = np_weight / np_height   # Element-wise calculations work in Numpy
Ratio

array([36.13259669, 18.99441341, 33.47368421,  3.        ])

In [41]:
Ratio[0]

36.13259668508287

In [42]:
# Numpy Subsetting
Ratio > 33

array([ True, False,  True, False])

In [43]:
Ratio[Ratio > 33]

array([36.13259669, 33.47368421])

In [44]:
Ratio[Ratio > 33][0]

36.13259668508287

#### 2). Functionalities in Core Python vs. Numpy

`Numpy` is great for doing vector arithmetic. If you compare its functionality with regular Python lists, however, some things have changed.

1) First of all, **`Numpy` arrays cannot contain elements with different types.** If you try to build such a list, some of the elements' types are changed to end up with a homogeneous list. This is known as **type coercion**.

2) Second, the typical arithmetic operators, such as `+, -, * `and `/` have **a different meaning for regular `Python lists` and `Numpy arrays`**:
- `Python list + Python list` --> concatenation
- `Numpy array + Numpy array` --> summation
- `Python list + Numpy array` --> summation (Python list convert to Numpy array)

In [45]:
# NumPy arrays: contain only one type
np.array([1.0, "is", True])

array(['1.0', 'is', 'True'], dtype='<U32')

In [46]:
python_list = [1, 2, 3]
numpy_array = np.array([1, 2, 3])

In [47]:
python_list + python_list

[1, 2, 3, 1, 2, 3]

In [48]:
# Different types: different behavior
numpy_array + numpy_array

array([2, 4, 6])

In [49]:
numpy_array2 = np.array([1, 2, 3, 4])

In [50]:
numpy_array + numpy_array2

ValueError: operands could not be broadcast together with shapes (3,) (4,) 

In [51]:
# Mixed types--> convert to array
python_list + numpy_array

array([2, 4, 6])

#### 3). A Quick Numerical Programming Performance in Core Python vs. Numpy

In the early 90s, some programmers wanted to use Python for their scientific work, but couldn't do so for several reasons:
1. Python is extremely slow, compared to faster languages. Numpy mitigates this problem by compiling important code in C, but exposing a Python API
2. Numeric code, which can consist of multi-dimensional matrices, had no counterpart in Python. As we have seen above, Numpy can handle an array of numbers just fine (later we will see examples of multiple dimensions)
3. Python, at the time, had no collection of high quality functions to operate on arrays or matrices of numbers. Numpy is that collection of functions of numbers.


In [52]:
core_python = list(range(0,10000))
numpy_python = np.arange(0,10000)

In [53]:
core_python[0:10]

[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]

In [54]:
numpy_python

array([   0,    1,    2, ..., 9997, 9998, 9999])

In [55]:
%timeit sum(core_python)
%timeit np.sum(numpy_python)

402 µs ± 16.9 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
6.22 µs ± 58.5 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)


#### Note: Numpy is an order of magnitude much faster than core Python!

#### 4). A Few Common Numpy Functions

In [56]:
# a vector: the argument to the array function is a Python list
v = np.array([1,2,3,4])
v

array([1, 2, 3, 4])

In [57]:
# a matrix: the argument to the array function is a nested Python list
M = np.array([[1, 2], [3, 4]])
M.shape

(2, 2)

The v and M objects are both of the type ndarray that the Numpy provides.

In [58]:
type(v), type(M)

(numpy.ndarray, numpy.ndarray)

The difference between the v and M arrays is only their shapes. We can get information about the shape of an array by using the ndarray.shape property.

#### The shape attribute for numpy arrays returns the dimensions of the array. If Y has `n` rows and `m` columns, then:

- Y.shape is `(n,m)`

- Y.shape[0] is `n`: will display number of rows

- Y.shape[1] is `m`: will display number of columns

In [59]:
v.shape, M.shape

((4,), (2, 2))

In [60]:
np.shape(v), np.shape(M)

((4,), (2, 2))

The number of elements in the array is available through the **ndarray.size** property:

In [61]:
v.size, M.size

(4, 4)

Equivalently, we could use the function numpy.shape() and numpy.size()

In [62]:
np.shape(M)

(2, 2)

In [63]:
np.size(M)

4

In [64]:
x=np.arange(0,36)
x.reshape(3,12)

array([[ 0,  1,  2,  3,  4,  5,  6,  7,  8,  9, 10, 11],
       [12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23],
       [24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35]])

In [65]:
x[::]

array([ 0,  1,  2,  3,  4,  5,  6,  7,  8,  9, 10, 11, 12, 13, 14, 15, 16,
       17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33,
       34, 35])

In [66]:
x[::6]

array([ 0,  6, 12, 18, 24, 30])

### 3. Why Use Numpy 

1) The Numpy library is the core library for scientific computing in Python. It provides a high-performance multidimensional array object, and tools for working with these arrays.

2) While Numpy structures look similar to standard Python lists, Numpy `ndarray` is a much more efficient way of storing and manipulating “numerical data” than the built-in Python data structures. 

3) The broadcasting capabilities are also extremely useful for quickly applying functions to the datasets we are working on Libraries written in lower-level languages, such as C, can operate on data stored in Numpy ‘ndarray’ without copying any data.

4) `Numpy arrays` vs. `Python lists` in Core Python

- Python lists are very general. They can contain any kind of object. They are `dynamically typed`. They do not support mathematical functions such as matrix and dot multiplications, etc. Implementing such functions for Python lists would not be very efficient due to the dynamic typing.

- Numpy arrays are `statically typed and homogeneous`. The type of the elements is determined when the array is created. Because of the static typing, fast implementation of mathematical functions such as multiplication and addition of numpy arrays can be implemented in a compiled language (C and Fortran are used). Numpy arrays are `memory efficient`.


### 4. Help

In [67]:
import numpy as np
np.info(np.ndarray.dtype)

Data-type of the array's elements.

Parameters
----------
None

Returns
-------
d : numpy dtype object

See Also
--------
numpy.dtype

Examples
--------
>>> x
array([[0, 1],
       [2, 3]])
>>> x.dtype
dtype('int32')
>>> type(x.dtype)
<type 'numpy.dtype'>


## Further reading

- http://numpy.scipy.org
- http://scipy.org/Tentative_NumPy_Tutorial
- http://scipy.org/NumPy_for_Matlab_Users - A Numpy guide for MATLAB users.

#### Note: The course materials are developed mainly based on personal experience and contributions from the Python learning community
Referred Books: 
- Learning Python, 5th Edition by Mark Lutz
- Python Data Science Handbook, Jake, VanderPlas
- Python for Data Analysis, Wes McKinney    

Copyright ©2023 Mei Najim. All rights reserved. 