In [2]:
# Enable code formatting using external plugin nb_black
%reload_ext nb_black


![Numpy](numpy.png)

# NumPy Tutorial - PART 1

**Ref: [Python NumPy Tutorial for Beginners](https://youtu.be/QUT1VHiLmmI) by [freeCodeCamp.org](https://www.freecodecamp.org/)**

[NumPy](https://numpy.org/) is a library for the Python programming language, adding support for large, multi-dimensional arrays and matrices, along with a large collection of high-level mathematical functions to operate on these arrays. **For numerical work on large arrays, NumPy operations are typically much faster than equivalent loops over Python lists**.

Why is NumPy faster than lists?

1. NumPy arrays have a fixed type: every element shares one datatype.
2. No per-element type checking is needed when iterating over the array.
3. NumPy uses contiguous memory, which allows it to benefit from:
   * SIMD (_Single Instruction Multiple Data_) vector processing.
   * Effective cache utilization.
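As a rough illustration of the points above, the sketch below squares the same numbers once with a list comprehension and once with a vectorized NumPy operation. The names `py_list` and `np_arr` and the array size are arbitrary choices for this example, and the measured times depend on the machine, so no timings are quoted here.

```python
import timeit

import numpy as np

n = 1_000_000  # arbitrary size chosen for this illustration
py_list = list(range(n))
np_arr = np.arange(n, dtype=np.int64)

# Pure Python: every element is a separate int object handled one at a time.
list_time = timeit.timeit(lambda: [x * x for x in py_list], number=5)

# NumPy: one vectorized operation over a contiguous, fixed-type buffer.
numpy_time = timeit.timeit(lambda: np_arr * np_arr, number=5)

print(f"List comprehension: {list_time:.4f} s")
print(f"NumPy vectorized:   {numpy_time:.4f} s")

# Both approaches compute the same values.
squares_list = [x * x for x in py_list[:5]]
squares_arr = (np_arr * np_arr)[:5]
print(squares_list, squares_arr.tolist())
```

On most machines the NumPy version should come out noticeably faster, but the exact ratio varies with hardware and NumPy version.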

### Applications of NumPy 

1. Mathematics (MATLAB replacement)
2. Plotting (Matplotlib)
3. Backend of other libraries and applications (pandas, Connect 4, digital photography).
4. Widely used in machine learning.
5. Helps in understanding tensor libraries such as [TensorFlow](https://www.tensorflow.org/), whose tensors are similar to NumPy arrays.

> **Note: [SciPy](https://scipy.org/) provides many more mathematical functions than NumPy**.

## Load NumPy library

Run one of the commands below if NumPy is not already installed.
```cmd
pip install numpy
```
or
```cmd
pip3 install numpy
```

In [2]:
import numpy as np

print("NumPy version:", np.__version__)

np.set_printoptions(linewidth=130)  # Line width: at most 130 characters per output line, after which the output wraps to the next line.

NumPy version: 1.20.3


## The Basics

### Create NumPy arrays

Every element of a NumPy array has the **same datatype**. If the input values have different datatypes, NumPy converts them to a single common datatype.

In [22]:
# Create an array of integers.
i = np.array([1, 2, 3])
i

array([1, 2, 3])


In [23]:
# Create an array of floating point numbers.
f = np.array([0.1, 2.9, 3.14, 0.004])
f

array([0.1  , 2.9  , 3.14 , 0.004])


In [24]:
# Create an array of boolean values.
b = np.array([True, False, True, True, False])
b

array([ True, False,  True,  True, False])


In [25]:
# Create an array from mixed datatypes; NumPy converts every element to a string here.
mixed = np.array([3.14, True, 1, "Hello"])
mixed

array(['3.14', 'True', '1', 'Hello'], dtype='<U32')
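The conversion to a common datatype can be seen in a few more cases. The following is a minimal sketch; the variable names `int_float`, `bool_int`, and `objects` are made up for this example.

```python
import numpy as np

# Integers and floats are promoted to a common floating point type.
int_float = np.array([1, 2.5])
print(int_float.dtype)  # float64

# Booleans mixed with integers become integers.
bool_int = np.array([True, 2])
print(bool_int)  # [1 2]

# dtype=object stores references to the original Python objects instead of converting them.
objects = np.array([3.14, True, 1, "Hello"], dtype=object)
print([type(x).__name__ for x in objects])  # ['float', 'bool', 'int', 'str']
```

An `object` array keeps the original values intact, but it gives up most of the speed and memory benefits of a fixed-type array.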


### Metadata of NumPy arrays

#### Print _Dimension_, _Shape_, _Datatype_ of a NumPy array.

In [26]:
a = np.array([2, 4, 6, 8])

print("Dimension:", a.ndim)  # ndim: number of dimensions
print("Shape:", a.shape)
print("Datatype:", a.dtype)  # dtype: datatype

Dimension: 1
Shape: (4,)
Datatype: int32



In [27]:
b = np.array(
    [
        [1.0, 2.0, 3.0, 4.0],
        [5.0, 6.0, 7, 8.0],
        [9.0, 10.0, 11.0, 12.0],
    ]
)

print("Dimension:", b.ndim)  # Number of dimensions
print("Shape:", b.shape)
print("Datatype:", b.dtype)  # datatype

Dimension: 2
Shape: (3, 4)
Datatype: float64
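The same attributes extend to higher dimensions. As a small sketch (the array `c` is invented for this example), a three-dimensional array built from nested lists reports one shape entry per nesting level:

```python
import numpy as np

# Three blocks, each with two rows and two columns.
c = np.array(
    [
        [[1, 2], [3, 4]],
        [[5, 6], [7, 8]],
        [[9, 10], [11, 12]],
    ]
)

print("Dimension:", c.ndim)  # 3
print("Shape:", c.shape)  # (3, 2, 2)
print("Length of shape:", len(c.shape))  # 3, always equal to ndim
```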



We can explicitly specify the datatype when initializing an array. This can reduce memory usage: for example, if we are sure that we are dealing with a small range of integers, we can specify the datatype as `int16` or `int8`.

In [28]:
int16_arr = np.array([123, 8723, 907], dtype="int16")
print("Datatype of `int16_arr` array:", int16_arr.dtype)

dflt_arr = np.array([1, 2, 3])
print("Default datatype for integer array:", dflt_arr.dtype)

Datatype of `int16_arr` array: int16
Default datatype for integer array: int32
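Before choosing a small integer type, it helps to check that the values actually fit. One possible way, sketched below, is to compare them against the limits reported by `np.iinfo`; the helper variables `fits_int16` and `fits_int8` are made up for this example.

```python
import numpy as np

# Print the value range of each signed integer type.
for dtype in (np.int8, np.int16, np.int32, np.int64):
    info = np.iinfo(dtype)
    print(f"{info.dtype}: {info.min} to {info.max}")

values = [123, 8723, 907]

# Check whether every value lies inside the range of the candidate type.
fits_int16 = np.iinfo(np.int16).min <= min(values) and max(values) <= np.iinfo(np.int16).max
fits_int8 = np.iinfo(np.int8).min <= min(values) and max(values) <= np.iinfo(np.int8).max

print("Fits in int16:", fits_int16)  # True
print("Fits in int8:", fits_int8)  # False
```

Here `int16` is safe for these values, while `int8` is not, because 8723 and 907 exceed its maximum of 127.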



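The default integer datatype depends on the platform: the outputs in this notebook show `int32` (the default on Windows with NumPy 1.x), while most Linux and macOS installations default to `int64`. The datatype also depends on the values in the array. For example, an array with at least one element greater than **2147483647** (the largest `int32` value) forces NumPy to create an array of datatype `int64` instead of `int32`.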

In [29]:
int32_arr = np.array([1, 2, 2147483647])
print("Datatype of int32_arr:", int32_arr.dtype)

int64_arr = np.array([1, 2, 2147483648])
print("Datatype of int64_arr:", int64_arr.dtype)

Datatype of int32_arr: int32
Datatype of int64_arr: int64
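When the result should not depend on the platform default, the datatype can be requested explicitly. A minimal sketch follows; the names `explicit_arr` and `big_arr` are chosen only for this example.

```python
import numpy as np

# Requesting the datatype explicitly gives the same result on every platform.
explicit_arr = np.array([1, 2, 2147483647], dtype=np.int64)
print("Datatype of explicit_arr:", explicit_arr.dtype)  # int64

# A value beyond the int32 range is stored as int64 even without an explicit dtype.
big_arr = np.array([1, 2, 2147483648])
print("Datatype of big_arr:", big_arr.dtype)  # int64

# The int32 upper limit.
print(np.iinfo(np.int32).max)  # 2147483647
```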



#### Print _size_ (of each element in the array) and _total size_ (of an array)

In [30]:
a = np.array([1, 100, 2147483647, 987, 0])

print("Datatype of each element in array `a`:", a.dtype)
print("Total number of elements in array `a`:", a.size)
print("Memory allocated to each element in array `a`:", a.itemsize, "Bytes")
print("Memory taken by array `a`:", a.nbytes, "Bytes")  # nbytes: number of bytes

Datatype of each element in array `a`: int32
Total number of elements in array `a`: 5
Memory allocated to each element in array `a`: 4 Bytes
Memory taken by array `a`: 20 Bytes



1. The `itemsize` value depends on the datatype of the elements.
   1. `itemsize` is 2 bytes for `int16`.
   2. `itemsize` is 4 bytes for `int32`.
2. The total memory taken by a NumPy array can also be obtained using `a.size * a.itemsize`.

In [31]:
print("Memory taken by `a` array:", a.size * a.itemsize, "Bytes")

Memory taken by `a` array: 20 Bytes
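To see how the datatype drives memory usage, the sketch below stores the same values with several integer types and prints `itemsize` and `nbytes` for each. The values and the `grid` array are invented for this example.

```python
import numpy as np

values = [1, 100, 127, 98, 0]  # small enough to fit in every integer type below

for dtype in (np.int8, np.int16, np.int32, np.int64):
    arr = np.array(values, dtype=dtype)
    print(f"{arr.dtype}: itemsize={arr.itemsize}, nbytes={arr.nbytes}")

# The relation nbytes == size * itemsize also holds for multi-dimensional arrays.
grid = np.array([[1, 2, 3], [4, 5, 6]], dtype=np.int16)
print(grid.size, grid.itemsize, grid.nbytes)  # 6 2 12
```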

