![](https://upload.wikimedia.org/wikipedia/commons/thumb/3/31/NumPy_logo_2020.svg/121px-NumPy_logo_2020.svg.png)

# NumPy Introduction

## What is NumPy?

- NumPy is a Python library (collection of modules).
- NumPy is used for working with arrays.
- NumPy is short for "Numerical Python".

## Why do we use NumPy?

- Python lists can serve the purpose of arrays, but they are slow to process for numerical work.
- NumPy aims to provide an array object that is up to 50x faster than traditional Python lists; the actual speedup depends on the operation and the array size.
- The array object in NumPy is called `ndarray`; it provides many supporting functions that make working with arrays easy.
- Arrays are used very frequently in data science, where speed and memory use matter.

## Why is NumPy faster than a Python list?
- **An array is a collection of homogeneous data-types** stored in one contiguous block of memory. A Python list, on the other hand, holds references to objects that may be of different types and may be scattered across memory.
![](https://res.cloudinary.com/practicaldev/image/fetch/s--3HyFj1_a--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/i/767ysmx0hpyq62ba89uh.png)
- NumPy applies an operation to a whole array at once in compiled code instead of looping over the elements one by one in Python. The compiled loop can also use CPU instructions that process several elements **in parallel**. This concept is called "vectorization":
![](https://lappweb.in2p3.fr/~paubert/ASTERICS_HPC/images/vectorization.png)
- NumPy runs its heavy computation in compiled C, C++, and Fortran code that it calls from Python. Compiled code in these languages executes much faster than equivalent Python code.
    - **Why is C++ faster than Python?** *C++ is statically typed and compiled ahead of time to machine code, so types are resolved before the program runs. Python is dynamically typed and interpreted, so types are checked and operations are dispatched at run time, which adds overhead to every operation.*
    
    C++ (static typing):
    ```cpp
    #include <string>

    int myNum = 15;
    double myFloatNum = 5.99;
    std::string myText = "Hello";
    ```
    Python (dynamic typing):
    ```python
    myNum = 15
    myFloatNum = 5.99
    myText = "Hello"
    ```
    - NumPy is a Python library and is written partially in Python, but most of the parts that require fast computation are written in C or C++.

These are the main reasons why NumPy is faster than lists. It is also optimized to work with the latest CPU architectures.
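
The homogeneous, contiguous layout described above can be inspected on any array. The following is a minimal sketch; it assumes an explicit `dtype=np.int64` so that the numbers are the same on every platform, and the variable names are only illustrative:

```python
import numpy as np

# Fix the dtype explicitly so the item size does not depend on the platform.
arr = np.array([1, 2, 3, 4, 5], dtype=np.int64)

print(arr.dtype)                  # int64: every element has the same type
print(arr.itemsize)               # 8: bytes used by each element
print(arr.nbytes)                 # 40: 5 elements * 8 bytes of raw data
print(arr.strides)                # (8,): the next element is 8 bytes further on
print(arr.flags['C_CONTIGUOUS'])  # True: one unbroken block of memory

# A list stores references, so its elements can be of any type.
mixed = [1, "two", 3.0]
print([type(x).__name__ for x in mixed])  # ['int', 'str', 'float']
```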

[The complete tutorial is on w3schools.com](https://www.w3schools.com/python/numpy/default.asp)

## Installation

To install the NumPy package, open a terminal (cmd on Windows) and type:
```
pip install numpy
```
- Note: With Anaconda, common packages such as `numpy` and `pandas` come pre-installed, so this step is not needed.
- `pip install` is only needed for external packages that do not come with Anaconda.

# Getting started

In [3]:
#import numpy
import numpy as np

In [4]:
arr = np.array([1, 2, 3, 4, 5])
print(arr)

[1 2 3 4 5]


In [1]:
arr2 = [1, 2, 3, 4, 5]
arr2

[1, 2, 3, 4, 5]

In [5]:
type(arr)

numpy.ndarray

Checking the version:

In [6]:
np.__version__

'1.23.1'

# Creating arrays

- NumPy is used to work with arrays. The array object in NumPy is called `ndarray`.
- We can create a NumPy `ndarray` object by using the `array()` function

[More on this and practice material](https://www.w3schools.com/python/numpy/numpy_creating_arrays.asp)

In [7]:
arr = np.array([1, 2, 3, 4, 5])
print(arr)
print(type(arr))

[1 2 3 4 5]
<class 'numpy.ndarray'>


We can also use a tuple to initialize the array:

In [9]:
arr = np.array((1, 2, 3, 4, 5))
print(arr)
print(type(arr))

[1 2 3 4 5]
<class 'numpy.ndarray'>


## Dimensions in Arrays

Nested arrays are arrays that have arrays as their elements.

<img src="https://i.stack.imgur.com/Tbe9W.png" width=500 height=500 align="left"/>

### 0-D arrays

In [11]:
arr = np.array(42)
print(arr)
print(arr.shape)

42
()


### 1-D Arrays

In [12]:
arr = np.array([1, 2, 3, 4, 5])
print(arr)
print(arr.shape)

[1 2 3 4 5]
(5,)


### 2D arrays

In [7]:
arr = np.array([
    [1, 2, 3, 4, 5],
    [1, 2, 3, 4, 5]
])
print(arr)
print(arr.shape)

[[1 2 3 4 5]
 [1 2 3 4 5]]
(2, 5)


### 3D arrays
<img src="https://media.geeksforgeeks.org/wp-content/uploads/3D-array.jpg" height=500 width=500>

In [14]:
arr = np.array([
    [
        [1, 2, 3], 
        [4, 5, 6]
    ], 
    [
        [1, 2, 3], 
        [4, 5, 6]
    ]
])
print(arr)
print(arr.shape)

[[[1 2 3]
  [4 5 6]]

 [[1 2 3]
  [4 5 6]]]
(2, 2, 3)


### Checking the dimension of given array

In [15]:
a = np.array(42)
b = np.array([1, 2, 3, 4, 5])
c = np.array([[1, 2, 3], [4, 5, 6]])
d = np.array([[[1, 2, 3], [4, 5, 6]], [[1, 2, 3], [4, 5, 6]]])

print(a.ndim)
print(b.ndim)
print(c.ndim)
print(d.ndim)

0
1
2
3


In [16]:
arr = np.array([1, 2, 3, 4], ndmin=5)

print(arr)
print('number of dimensions :', arr.ndim)

[[[[[1 2 3 4]]]]]
number of dimensions : 5
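
As a small sketch of how these attributes relate: `ndim` is the length of `shape`, `size` is the total number of elements, and `ndmin` pads the shape with leading axes of length 1 until the array has at least that many dimensions. The arrays reuse the values from the cells above.

```python
import numpy as np

c = np.array([[1, 2, 3], [4, 5, 6]])
print(c.ndim)             # 2
print(c.shape)            # (2, 3)
print(len(c.shape))       # 2, always equal to ndim
print(c.size)             # 6 elements in total (2 * 3)

# ndmin=5 prepends axes of length 1 until there are 5 dimensions.
arr = np.array([1, 2, 3, 4], ndmin=5)
print(arr.shape)          # (1, 1, 1, 1, 4)
```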


# Array Indexing

Accessing the elements of a given array

## Indexing 1D array

In [8]:
arr = np.array([1, 2, 3, 4])
print(arr[0])

1


In [9]:
arr = np.array([1, 2, 3, 4])
print(arr[1])

2


In [10]:
print(arr[2] + arr[3])

7


## Indexing 2D array
<img src="https://media.geeksforgeeks.org/wp-content/uploads/two-d.png" width=500 height=500 align="left"/>

In [27]:
arr = np.array([[1,2,3,4,5], [6,7,8,9,10]])
print(arr)
print('2nd element on 1st row: ', arr[0, 1])

[[ 1  2  3  4  5]
 [ 6  7  8  9 10]]
2nd element on 1st row:  2


Access the element on the 2nd row, 5th column:

In [29]:
arr = np.array([[1,2,3,4,5], [6,7,8,9,10]])
print('5th element on 2nd row: ', arr[1, 4])

5th element on 2nd row:  10


## Accessing 3D array

![](https://www.pythoninformer.com/img/numpy/3d-array.png)

In [12]:
arr = np.array([[[10, 11, 12], [13, 14, 15], [16, 17, 18]],
               [[20, 21, 22], [23, 24, 25], [26, 27, 28]],
               [[30, 31, 32], [33, 34, 35], [36, 37, 38]]])
print(arr)
print(arr[2, 2, 1])

[[[10 11 12]
  [13 14 15]
  [16 17 18]]

 [[20 21 22]
  [23 24 25]
  [26 27 28]]

 [[30 31 32]
  [33 34 35]
  [36 37 38]]]
37
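
A brief sketch of a few equivalent ways to reach the same element, using the same 3D array as above. Comma-separated indices, chained brackets, and negative indices (which count from the end of an axis) all work; giving fewer indices than dimensions returns a sub-array.

```python
import numpy as np

arr = np.array([[[10, 11, 12], [13, 14, 15], [16, 17, 18]],
                [[20, 21, 22], [23, 24, 25], [26, 27, 28]],
                [[30, 31, 32], [33, 34, 35], [36, 37, 38]]])

print(arr[2, 2, 1])     # 37: block 2, row 2, column 1
print(arr[2][2][1])     # 37: the same element with chained brackets
print(arr[-1, -1, 1])   # 37: -1 means the last block and the last row
print(arr[0, 1])        # [13 14 15]: two indices return a whole row
```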


# Array slicing

- Slicing in Python means taking elements from one given index up to another given index.
- We pass a slice instead of an index, like this: `[start:end]`. The element at `start` is included; the element at `end` is not.
- We can also define the step, like this: `[start:end:step]`
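
The rules above can be sketched on a small array. The values are hypothetical and chosen so that each value differs from its index; omitted parts of a slice fall back to the start, the end, and a step of 1.

```python
import numpy as np

arr = np.array([10, 20, 30, 40, 50, 60, 70])

print(arr[1:5])     # [20 30 40 50]: index 1 up to, but not including, index 5
print(arr[:3])      # [10 20 30]: start omitted, so begin at index 0
print(arr[-3:])     # [50 60 70]: negative start counts from the end
print(arr[1:6:2])   # [20 40 60]: every second element between index 1 and 5
print(arr[::-1])    # [70 60 50 40 30 20 10]: a step of -1 reverses the array
```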

## Slicing list

In [32]:
a = [1,2,3,4,5,6,7]
a[1:3]

[2, 3]

In [19]:
a = [1,2,3,4,5,6,7]
a[::2]

[1, 3, 5, 7]

## Slicing 1D array

In [34]:
arr = np.array([1, 2, 3, 4, 5, 6, 7])
print(arr[1:5])

[2 3 4 5]


In [35]:
arr = np.array([1, 2, 3, 4, 5, 6, 7])
print(arr[4:])

[5 6 7]


In [22]:
arr = np.array([1, 2, 3, 4, 5, 6, 7])
print(arr[3:1:-1])

[4 3]


Slicing with `step`:

In [23]:
arr = np.array([1, 2, 3, 4, 5, 6, 7])
print(arr[1:-1:2])

[2 4 6]


In [24]:
arr = np.array([1, 2, 3, 4, 5, 6, 7])
print(arr[::2])

[1 3 5 7]


## Slicing 2D arrays

![](https://www.pythoninformer.com/img/numpy/2d-array-col-1.png)

In [26]:
v = np.array([[1, 2, 3],
     [4, 5, 6],
     [7, 8, 9]])
print(v)

[[1 2 3]
 [4 5 6]
 [7 8 9]]


In [27]:
v[:,1]

array([2, 5, 8])

In [28]:
v[1,:]

array([4, 5, 6])

In [29]:
v[:]

array([[1, 2, 3],
       [4, 5, 6],
       [7, 8, 9]])

## Slicing 3D arrays

![](https://www.pythoninformer.com/img/numpy/3d-array-col-1.png)

- i = array number
- j = row number
- k = column number

To index:

```python
a = array[i,j,k]
```

In [30]:
## i=array no., j=row, k=column

a3 = np.array([[[10, 11, 12], [13, 14, 15], [16, 17, 18]],
               [[20, 21, 22], [23, 24, 25], [26, 27, 28]],
               [[30, 31, 32], [33, 34, 35], [36, 37, 38]]])
a3[1,2,:]

array([26, 27, 28])

![](https://www.pythoninformer.com/img/numpy/3d-array-col-3.png)

In [31]:
a3[:, 1, 2]

array([15, 25, 35])
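
As a rough sketch of how the result shape follows from the index: each axis indexed with a single integer disappears from the result, while each axis indexed with a slice is kept. The array is the same `a3` as above.

```python
import numpy as np

a3 = np.array([[[10, 11, 12], [13, 14, 15], [16, 17, 18]],
               [[20, 21, 22], [23, 24, 25], [26, 27, 28]],
               [[30, 31, 32], [33, 34, 35], [36, 37, 38]]])

print(a3[1].shape)              # (3, 3): one integer index drops the first axis
print(a3[:, 1, 2].shape)        # (3,): two integer indices leave a 1D result
print(a3[0:2, 1:, 0:2].shape)   # (2, 2, 2): slices keep every axis

# Column 0 of every array: i and j are sliced, k is fixed.
print(a3[:, :, 0])
# [[10 13 16]
#  [20 23 26]
#  [30 33 36]]
```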

## Practice more

1. [For more practice with quiz](https://www.w3schools.com/python/numpy/numpy_array_slicing.asp)
2. [Detailed explanation on array indexing](https://www.pythoninformer.com/python-libraries/numpy/index-and-slice/)

# Data Types

**An array is a collection of homogeneous data-types**

NumPy identifies data types by one-character codes:

- i - integer
- b - boolean
- u - unsigned integer
- f - float
- c - complex float
- m - timedelta
- M - datetime
- O - object
- S - byte string
- U - unicode string
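
These character codes can be read back from any array through `dtype.kind`. A minimal sketch, in which the `samples` dictionary and its labels are only illustrative:

```python
import numpy as np

# One small array per data type; the labels are just for printing.
samples = {
    "integer": np.array([1, 2, 3]),
    "unsigned integer": np.array([1, 2], dtype=np.uint8),
    "float": np.array([1.5, 2.5]),
    "boolean": np.array([True, False]),
    "complex float": np.array([1 + 2j]),
    "unicode string": np.array(["apple"]),
    "byte string": np.array([b"apple"]),
}

# dtype.kind holds the one-character code of the array's data type.
kinds = {label: a.dtype.kind for label, a in samples.items()}
for label, kind in kinds.items():
    print(f"{label:>16}: {kind}")
```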

In [32]:
print("34 \u20ac")

34 €


In [33]:
a1 = [1,2,"3",4.0]
print(type(a1))

<class 'list'>


In [66]:
arr = np.array([1, 2, 3, 4])
print(arr.dtype)

int32


In [67]:
arr = np.array([1, 2, 3, 4.5])
print(arr.dtype)

float64


In [35]:
arr = np.array(['apple', 'banana', 'cherry', "34 \u20ac"])
print(arr.dtype)

<U6


In [36]:
arr = np.array([1, 2, 3, 4], dtype='U')
print(arr)
print(arr.dtype)

['1' '2' '3' '4']
<U1


In [None]:
arr = np.array([1, 2, 3, 4], dtype='S')
print(arr)
print(arr.dtype)

## Converting dtypes

In [72]:
arr = np.array(['1', '2', '3'], dtype='i')
arr

array([1, 2, 3], dtype=int32)

**Caution**

An array cannot hold different types (it cannot be heterogeneous), so the conversion fails if any element cannot be converted to the requested dtype:

```python
arr = np.array(['a', '2', '3'], dtype='i')
```
Output:

```python
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-70-4101754d9810> in <module>
----> 1 arr = np.array(['a', '2', '3'], dtype='i')

ValueError: invalid literal for int() with base 10: 'a'
```

In [37]:
arr = np.array([1,2.2,3], dtype='i')
arr

array([1, 2, 3], dtype=int32)

In [38]:
arr = np.array([1,2.2135731099032809531,3.141092751209712097512], dtype='i')
arr

array([1, 2, 3], dtype=int32)

In [39]:
np.array(['b', 2, 3])

array(['b', '2', '3'], dtype='<U11')
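
The cells above set the dtype while the array is created. For an array that already exists, `astype()` returns a converted copy. The sketch below uses hypothetical values and also shows that converting floats to integers drops the fractional part rather than rounding, which is why `2.2` became `2` above.

```python
import numpy as np

floats = np.array([1.0, 2.7, -2.7])

# astype returns a new array; the fractional part is dropped, not rounded.
truncated = floats.astype(np.int64)
print(truncated)                      # [ 1  2 -2]

# Round explicitly first if rounding is what you want.
rounded = np.rint(floats).astype(np.int64)
print(rounded)                        # [ 1  3 -3]

# Numeric strings can be converted the same way.
from_text = np.array(['1', '2', '3']).astype(np.int64)
print(from_text)                      # [1 2 3]

print(floats.dtype)                   # float64: the original array is unchanged
```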

# Reshaping arrays

In [79]:
arr = np.array([1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12])
newarr = arr.reshape(4, 3)
print(newarr.shape)
newarr

(4, 3)


array([[ 1,  2,  3],
       [ 4,  5,  6],
       [ 7,  8,  9],
       [10, 11, 12]])

In [80]:
arr = np.array([1, 2, 3, 4, 5, 6])
newarr = np.array_split(arr, 3)
newarr

[array([1, 2]), array([3, 4]), array([5, 6])]
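
Two details are worth a short sketch, assuming the same kind of 1D input as above: `reshape` accepts `-1` for one dimension and works out its length, but it raises a `ValueError` when the element count does not fit the new shape; `array_split` accepts a length that does not divide evenly and makes the first pieces one element longer.

```python
import numpy as np

arr = np.arange(1, 13)              # the values 1..12

# -1 asks NumPy to infer that dimension: 12 elements / 3 rows = 4 columns.
auto = arr.reshape(3, -1)
print(auto.shape)                   # (3, 4)

# 12 elements cannot fill a 5 x 3 array, so reshape raises ValueError.
try:
    arr.reshape(5, 3)
    reshape_failed = False
except ValueError as err:
    reshape_failed = True
    print("reshape failed:", err)

# 7 elements split into 3 parts: the first part gets the extra element.
parts = np.array_split(np.arange(1, 8), 3)
print([p.tolist() for p in parts])  # [[1, 2, 3], [4, 5], [6, 7]]
```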

# Initializing random arrays

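Random does NOT mean a different number every time; it means a value that cannot be predicted logically. NumPy generates pseudo-random numbers: an algorithm produces them, so the same starting seed reproduces the same sequence.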
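
A small sketch of that point: setting a seed makes the "random" output repeatable. The seed value `42` is an arbitrary choice. The second half uses `np.random.default_rng`, NumPy's newer generator interface, as an alternative to the `np.random.*` functions used in this notebook.

```python
import numpy as np

# Same seed, same sequence of "random" integers.
np.random.seed(42)
first = np.random.randint(100, size=5)

np.random.seed(42)
second = np.random.randint(100, size=5)

print(np.array_equal(first, second))   # True

# The newer Generator interface behaves the same way.
rng_a = np.random.default_rng(42)
rng_b = np.random.default_rng(42)
draw_a = rng_a.integers(100, size=5)
draw_b = rng_b.integers(100, size=5)
print(np.array_equal(draw_a, draw_b))  # True
```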

To generate a random integer:

In [58]:
np.random.randint(1000)

187

To generate a random 1D array:

In [73]:
np.random.randint(100, size=10)

array([35, 76, 98, 86, 14,  3, 38, 52, 28, 41])

To generate a random 2D array:

In [76]:
np.random.randint(100, size=(5,5))

array([[90, 49, 74,  7, 55],
       [43, 65, 20, 91, 48],
       [97, 64, 41,  9, 91],
       [31, 31,  8,  1, 62],
       [81, 58, 86, 80, 15]])

In [92]:
np.random.randint(100, size=(3,5,5))

array([[[77, 55, 34, 87, 38],
        [14, 85, 73, 48, 45],
        [25, 97,  6, 21, 98],
        [92, 52, 91, 41, 83],
        [62,  3, 19, 87, 41]],

       [[60, 95, 10, 97, 90],
        [76, 63, 41,  0,  8],
        [89, 81, 41, 22, 57],
        [19, 38, 73, 67, 38],
        [46, 93, 55, 95, 14]],

       [[47, 51, 69, 52, 29],
        [74, 48, 26, 28, 59],
        [73, 14, 44, 66, 81],
        [ 3, 66, 85, 55, 67],
        [71, 76, 41, 66, 18]]])

Generating random float arrays:

In [104]:
np.random.rand(5)

array([0.47473668, 0.56557257, 0.48246194, 0.37295702, 0.79305699])

In [111]:
np.random.rand(3,5,3)

array([[[4.70558205e-01, 1.14832974e-01, 3.19021841e-01],
        [6.08018361e-01, 1.20036855e-01, 7.97444571e-03],
        [2.37359181e-01, 2.57073945e-01, 6.62765491e-01],
        [1.24385708e-01, 8.69726307e-01, 7.46877695e-01],
        [2.85212015e-01, 1.36933217e-01, 3.00344580e-01]],

       [[5.94558286e-02, 2.04563821e-01, 9.97846820e-01],
        [9.23912322e-01, 4.64192564e-01, 9.64979563e-01],
        [3.63043001e-01, 4.20511382e-01, 2.16511999e-01],
        [8.44756740e-01, 4.16081741e-01, 2.90563994e-01],
        [6.49237771e-01, 1.37498993e-02, 8.65207093e-01]],

       [[1.34090644e-02, 9.64139488e-01, 9.43209546e-01],
        [6.24644639e-01, 5.68131624e-01, 7.20865288e-02],
        [7.67304913e-01, 5.31824875e-01, 4.26950878e-02],
        [8.97762494e-01, 4.83112070e-01, 7.63493891e-04],
        [8.18896381e-01, 7.26503164e-02, 2.78470399e-01]]])

Choosing randomly from given array:

In [125]:
np.random.choice([1,2,3,4,5,6,7,8,9,0])

4

Creating a 2D array from given set of values:

In [136]:
np.random.choice([1,2,3,4,5,6,7,8,9,0], size=(5,5))

array([[6, 7, 1, 1, 0],
       [4, 2, 5, 2, 8],
       [5, 2, 9, 3, 8],
       [9, 7, 2, 7, 4],
       [8, 0, 3, 5, 8]])

# For loop vs vectorization

This exercise demonstrates the advantage of NumPy's vectorized operations over ordinary `for` loops for computation.

Finding the square of every element of a given array:

In [138]:
from tqdm.notebook import tqdm

Using a `for` loop to compute:

In [131]:
type(10e6)

float

In [146]:
%%time
arr = np.random.rand(10000000)
for i in tqdm(arr):
    a = i**2

  0%|          | 0/10000000 [00:00<?, ?it/s]

Wall time: 6.82 s


Using a NumPy `array` for computation:

In [145]:
[1,2,3,4,5,6,7,8,9,0]**2

TypeError: unsupported operand type(s) for ** or pow(): 'list' and 'int'

In [144]:
arr = np.array([1,2,3,4,5,6,7,8,9,0])
arr**2

array([ 1,  4,  9, 16, 25, 36, 49, 64, 81,  0])

In [143]:
%%time
a = arr**2

Wall time: 37.9 ms
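
The `%%time` magic only works inside a notebook. As a rough sketch, the same comparison can be run as a plain script with `time.perf_counter`. The array here is smaller (one million elements, an arbitrary choice), and the timings vary from machine to machine, so no expected times are shown.

```python
import time
import numpy as np

arr = np.random.rand(1_000_000)

# Python-level loop: one element at a time.
start = time.perf_counter()
loop_result = [x ** 2 for x in arr]
loop_seconds = time.perf_counter() - start

# Vectorized: the whole array in one call to compiled code.
start = time.perf_counter()
vec_result = arr ** 2
vec_seconds = time.perf_counter() - start

print(f"loop:       {loop_seconds:.4f} s")
print(f"vectorized: {vec_seconds:.4f} s")

# Both approaches produce the same values.
print(np.allclose(loop_result, vec_result))
```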


In [150]:
arr1 = np.array([1,2,3,4,5,6,7,8,9,0])
arr2 = np.array([1,2,3,4,5,6,7,8,9,0])
arr1*arr2

array([ 1,  4,  9, 16, 25, 36, 49, 64, 81,  0])

In [152]:
arr1 = np.array([1,2,3,4,5,6,7,8,9,10])
np.log(arr1)

array([0.        , 0.69314718, 1.09861229, 1.38629436, 1.60943791,
       1.79175947, 1.94591015, 2.07944154, 2.19722458, 2.30258509])

In [157]:
arr = np.random.rand(300,500,300)
results = arr**2

In [156]:
np.cos(arr)

array([[[0.96531683, 0.9968438 , 0.99930644],
        [0.79016318, 0.77912918, 0.90842127],
        [0.72624874, 0.91795737, 0.71155859],
        [0.97216517, 0.98991975, 0.97888034],
        [0.58588503, 0.71943422, 0.94399197]],

       [[0.60411838, 0.92320472, 0.68522676],
        [0.83372487, 0.89851928, 0.99961378],
        [0.55702871, 0.68745823, 0.72427966],
        [0.87975913, 0.61657978, 0.90193196],
        [0.60071562, 0.9145836 , 0.74248405]],

       [[0.71595594, 0.87908406, 0.87130765],
        [0.8851659 , 0.54620814, 0.5533441 ],
        [0.59421684, 0.93925522, 0.88626956],
        [0.91388132, 0.80649009, 0.54392109],
        [0.78960269, 0.82786572, 0.99926178]]])

# Searching in arrays

In [85]:
arr = np.array([1, 2, 3, 4, 5, 4, 4])
np.where(arr == 1)

(array([0], dtype=int64),)

In [84]:
arr = np.array([1, 2, 3, 4, 5, 4, 4])
np.where(arr == 4)

(array([3, 5, 6], dtype=int64),)

In [160]:
marks = np.array([34,22,50,60])
names = np.array(["A", "B", "C", "D"])

index = np.where(marks>30)
print(index)
names[index]

(array([0, 2, 3], dtype=int64),)


array(['A', 'C', 'D'], dtype='<U1')

Finding all values that leave a remainder of 10 when divided by 20:

In [161]:
arr = np.array([10, 20, 30, 40, 50, 60, 70])
indices = np.where(arr%20==10)
print(indices)
print(arr[indices])

(array([0, 2, 4, 6], dtype=int64),)
[10 30 50 70]
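
A condition such as `marks > 30` is itself an array of booleans, and it can be used directly as an index without going through `np.where`. A minimal sketch, reusing `marks` and `names` from above; the last array is hypothetical and selects the even values:

```python
import numpy as np

marks = np.array([34, 22, 50, 60])
names = np.array(["A", "B", "C", "D"])

# The comparison yields one boolean per element.
mask = marks > 30
print(mask)             # [ True False  True  True]

# Indexing with the mask keeps the positions where it is True.
print(names[mask])      # ['A' 'C' 'D']

# Even values are the ones with remainder 0 when divided by 2.
arr = np.array([1, 2, 3, 4, 5, 6])
evens = arr[arr % 2 == 0]
print(evens)            # [2 4 6]
```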
