This article originally appeared at [raturi.in]().


Table of contents

- [What is NumPy](#)
    - [Difference between NumPy and Python standard List](#)
    - [Why use NumPy: Computation time](#)
- [Installing NumPy](#)
- [Basics of NumPy](#)
    - [Array Creation](#)
    - [Basic Operations](#)
    - [Indexing, Slicing and Iterating](#)

## What is NumPy
<img alt="ndarry" height="400" src="./img/numpy-3.png" title="Numpy Description" width="600"/>

**[NumPy](https://numpy.org/), which stands for 'Numerical Python', is an [open-source](https://github.com/numpy/numpy) library for storing large amounts of data in less memory and for performing a wide range of operations (mathematical, logical, shape manipulation, sorting, selecting, I/O, discrete Fourier transforms, basic linear algebra, basic statistical operations, random simulation, etc.) on homogeneous one-dimensional and multidimensional arrays.**

The basic data structure of NumPy is the [ndarray](https://numpy.org/doc/stable/reference/generated/numpy.ndarray.html?highlight=ndarray#numpy.ndarray), which resembles a Python list but holds elements of a single data type.
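
To make the memory claim concrete, here is a minimal sketch that compares the bytes used by a list of 1,000 integers with an equivalent NumPy array. The variable names are illustrative, and the list figure is only an approximation: it adds the size of the list object to the sizes of the integer objects it references, and the exact numbers depend on the Python build.

```python
import sys
import numpy as np

n = 1000
py_list = list(range(n))
np_array = np.arange(n, dtype=np.int64)

# A list stores pointers to separate int objects, so count both parts.
list_bytes = sys.getsizeof(py_list) + sum(sys.getsizeof(x) for x in py_list)

# An ndarray stores the raw values in one contiguous buffer.
array_bytes = np_array.nbytes  # n elements * 8 bytes each

print("list :", list_bytes, "bytes (approximate)")
print("array:", array_bytes, "bytes")
```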

> 💡 An array in NumPy is a data structure organized as a grid of values (rows and columns in the 2-D case), all of the same data type, that can be indexed and manipulated efficiently.

## Difference between NumPy and Python standard List
The three most important differences between NumPy arrays and standard [Python sequences](https://docs.python.org/library/stdtypes.html#sequence-types-list-tuple-range) are:

|              | NumPy Array                            | Python Sequences (list, tuple, range)  |
|--------------|----------------------------------------|----------------------------------------|
| **Size**     | Fixed when the array is created        | A Python list can grow dynamically     |
| **Datatype** | All elements share the same datatype   | Elements can be of multiple datatypes  |
| **Speed**    | Fast, as it is partially written in [C](https://en.wikipedia.org/wiki/C_(programming_language)) | Slower than NumPy for numerical work |
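
The size and datatype rows of the table can be checked with a short sketch (the variable names are illustrative). It shows a list growing in place and holding mixed types, while `np.array` converts mixed numbers to one common dtype and `np.append` returns a new array rather than resizing the original.

```python
import numpy as np

# A list grows in place and may mix types.
items = [1, 2.5, "three"]
items.append(4)

# An array has a single dtype: the int and the float are both stored as float64.
arr = np.array([1, 2.5, 3])
print(arr.dtype)   # float64

# np.append does not resize `arr`; it builds and returns a new array.
bigger = np.append(arr, 4)
print(arr.size, bigger.size)   # 3 4
```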


## Why use NumPy: Computation time

A [Python](https://raturi.in/blog/category/python-tutorials/) list can perform all the operations that NumPy arrays perform; NumPy arrays are simply faster and more convenient for large, complex computations.

Let's add two arrays of roughly 9 million elements each and compare the computation time.

In [27]:
import time
import numpy as np

# python standard list
list_A = [i for i in range(1, 9000000)]
list_B = [j**2 for j in range(1, 9000000)]

t0 = time.time()
sum_list = list(map(lambda x, y: x + y, list_A, list_B))  # element-wise addition
t1 = time.time()
list_time = t1 - t0
print("Time taken by Python standard list is ", list_time)

# numpy arrays holding the same values as the lists above
array_A = np.arange(1, 9000000, dtype=np.int64)
array_B = np.arange(1, 9000000, dtype=np.int64) ** 2

t0 = time.time()
sum_numpy = array_A + array_B  # vectorized element-wise addition
t1 = time.time()
numpy_time = t1 - t0
print("Time taken by NumPy array is ", numpy_time)

print("The ratio of time taken is {}".format(list_time // numpy_time))


As you can see, NumPy is a lot faster than the list. The table below compares the computation speed of a standard Python list and NumPy for different operations.


| Size of each matrix | Type of operation  | Time taken by list | Time taken by numpy | Ratio (List Time / Numpy Time) |
|---------------------|--------------------|--------------------|---------------------|--------------------------------|
| 9 million           | Addition (+)       | 0.56s              | 0.017s              | 32.0                           |
| 9 million           | Subtraction (-)    | 0.61s              | 0.016s              | 36.0                           |
| 9 million           | Multiplication (*) | 0.69s              | 0.016s              | 42.0                           |
| 9 million           | Division (/)       | 0.51s              | 0.022s              | 23.0                           |

From the above table, we can conclude that NumPy is much faster than a standard Python list for these element-wise operations. In the real world, where datasets are larger and the operations more complex, the gap can matter even more.

## Installing NumPy

To start working with NumPy, you need to install it. The instructions on the [official NumPy website](https://numpy.org/install/) cover the common setups.

## Basics of NumPy
As a prerequisite, you will need to know beginner-level [Python](https://raturi.in/blog/category/python-tutorials/). See this [Python tutorial](https://docs.python.org/tutorial/) to refresh the concepts.

<img alt="ndarry" height="400" src="./img/numpy-1.png" title="Numpy Description" width="600"/>

In the above image, the array is an object of the [ndarray](https://numpy.org/doc/stable/reference/generated/numpy.ndarray.html?highlight=ndarray#numpy.ndarray) class of the NumPy library.

Whenever you work with a dataset, the first step is to get an idea of the array that holds it. Four important [attributes](https://numpy.org/doc/stable/reference/generated/numpy.ndarray.html#numpy-ndarray) of a NumPy array that describe the dataset are:

- [.ndim](https://numpy.org/doc/stable/reference/generated/numpy.ndarray.ndim.html#numpy.ndarray.ndim): returns the number (int) of dimensions (axes) of the array.
- [.shape](https://numpy.org/doc/stable/reference/generated/numpy.shape.html#numpy.shape): returns a tuple with the size of the array along each dimension. For a 2-D array with **n** rows and **m** columns, it is (n, m).
- [.size](https://numpy.org/doc/stable/reference/generated/numpy.ndarray.size.html#numpy.ndarray.size): returns the total number (int) of elements in the array.
- [.dtype](https://numpy.org/doc/stable/reference/generated/numpy.dtype.html#numpy.dtype): returns a **numpy.dtype** object that describes the type of the elements in the array.

Below is a code snippet of the attributes described above.


In [None]:
array = np.array([[1,2,3],[4,5,6]]) # Creating numpy array from list

print("Dimension: ",array.ndim, type(array.ndim))
print("Shape: ",array.shape, type(array.shape))
print("Size: ",array.size, type(array.size))
print("Datatype: ",array.dtype, type(array.dtype))
print("Itemsize: ",array.itemsize, type(array.itemsize))
print("Data: ",array.data, type(array.data))
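
Note that `.shape` has one entry per dimension, so it is a (rows, columns) pair only for 2-D arrays. A small sketch with illustrative arrays of other dimensions:

```python
import numpy as np

vector = np.array([1, 2, 3, 4])   # 1-D array
cube = np.zeros((2, 3, 4))        # 3-D array: 2 blocks of 3 rows and 4 columns

print(vector.ndim, vector.shape, vector.size)   # 1 (4,) 4
print(cube.ndim, cube.shape, cube.size)         # 3 (2, 3, 4) 24
```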

### Array Creation
<img alt="ndarry" height="400" src="./img/numpy-2.png" title="Numpy Description" width="600"/>

A NumPy array is created by passing an array-like data structure, such as a [Python](https://raturi.in/blog/category/python-tutorials/) list or tuple, to `np.array()`.

Let's create a 0-D array from a single number, and 1-D, 2-D, and 3-D [arrays](https://numpy.org/devdocs/user/absolute_beginners.html#what-is-an-array) from (nested) lists.

- 0-D array: ```np.array(11)```
- 1-D array: ```np.array([1, 2, 3, 4, 5])```
- 2-D array: ```np.array([[1, 2, 3], [4, 5, 6]])```
- 3-D array: ```np.array([[[1, 2, 3], [4, 5, 6]], [[1, 2, 3], [4, 5, 6]]])```

In [None]:
array_0D = np.array(11)
array_1D = np.array([1, 2, 3, 4, 5])
array_2D = np.array([[1, 2, 3], [4, 5, 6]])
array_3D = np.array([[[1, 2, 3], [4, 5, 6]], [[1, 2, 3], [4, 5, 6]]])

print(array_0D)
print(array_1D)
print(array_2D)
print(array_3D)

Like a standard Python list, a NumPy array can be created in several ways. Here are 7 of them.

- [.array(\[1,2,3\])](https://numpy.org/doc/stable/reference/generated/numpy.array.html): Returns an array built from a list
- [.array((1.1,2.2,3.3))](https://numpy.org/doc/stable/reference/generated/numpy.array.html): Returns an array built from a tuple
- [.zeros((2,3))](https://numpy.org/doc/stable/reference/generated/numpy.zeros.html): Returns an array filled with zeros (2 rows, 3 columns)
- [.ones((2,3))](https://numpy.org/doc/stable/reference/generated/numpy.ones.html): Returns an array filled with ones (2 rows, 3 columns)
- [.empty((2,4))](https://numpy.org/doc/stable/reference/generated/numpy.empty.html): Returns an uninitialized array of the given shape and type; its contents are arbitrary
- [.arange(2,10,2)](https://numpy.org/doc/stable/reference/generated/numpy.arange.html): Returns evenly spaced values within a given range (here 2, 4, 6, 8). Similar to Python's [range()](https://docs.python.org/library/stdtypes.html?highlight=range#range)
- [.linspace(2,4,9)](https://numpy.org/doc/stable/reference/generated/numpy.linspace.html): Returns 9 evenly spaced numbers from 2 to 4, both endpoints included

In [None]:
array_list = np.array([1,2,3], dtype=int) # From List
array_tuple = np.array((1.1,2.2,3.3)) # From Tuple
array_zeroes = np.zeros((2,3)) # Array of zeroes: 2 rows and 3 columns
array_ones = np.ones((2,3)) # Array of ones: 2 rows and 3 columns
array_empty = np.empty((2,4)) # Uninitialized array (arbitrary values): 2 rows and 4 columns
array_arange = np.arange(2,10,2) # Similar to python range()
array_linspace = np.linspace(2,4,9) # Array of 9 numbers between 2 and 4

Just like the **dtype=int** parameter, you can make use of other parameters such as **copy**, **order**, **subok**, **ndmin**, and **like**. You can explore the other `np.array` [parameters](https://numpy.org/doc/stable/reference/generated/numpy.array.html#numpy-array) in the documentation.
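
As a minimal sketch of how a few of these `np.array` parameters behave (the variable names are illustrative), the snippet below uses `dtype` to force floats, `ndmin` to guarantee a minimum number of dimensions, and `copy=True` to get an array that does not share data with its source.

```python
import numpy as np

# dtype: store the values as floats instead of the inferred integers.
as_float = np.array([1, 2, 3], dtype=float)
print(as_float)            # [1. 2. 3.]

# ndmin: guarantee at least this many dimensions.
at_least_2d = np.array([1, 2, 3], ndmin=2)
print(at_least_2d.shape)   # (1, 3)

# copy=True (the default): the result does not share data with the source array.
source = np.array([1, 2, 3])
duplicate = np.array(source, copy=True)
duplicate[0] = 99
print(source[0])           # 1
```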

Let's practice some methods to create arrays

> 💡 Tip: Use **help** to see syntax when required



In [None]:
help(np.zeros)

Create a 1D array of ones

In [None]:
arr = np.ones(9)
print(arr)
print(arr.dtype)

Notice that, by default, NumPy creates the array with the float64 data type. Let's provide the [dtype](https://numpy.org/doc/stable/reference/generated/numpy.ndarray.html) explicitly.

In [None]:
arr = np.ones(9, dtype=int)
print(arr)
print(arr.dtype)

Create a 4x3 array of zeroes

In [None]:
arr = np.zeros((4,3), dtype=int)
print(arr)

Create an array of the integers strictly between 3 and 7 (that is, 4, 5, and 6; the stop value of `arange` is excluded)

In [None]:
arr = np.arange(4,7)
print(arr)

Create an array of integers from 5 to 20 with a step of 2

In [None]:
arr = np.arange(5,21,2)
print(arr)

Create an array of 10 random integers from 0 to 4 (the upper bound 5 is excluded)

In [None]:
arr = np.random.randint(5,size=10)
print(arr)

Create an array of 10 random integers strictly between 6 and 9 (that is, 7 or 8; the upper bound is excluded)

In [None]:
arr = np.random.randint(7,9,size=10)
print(arr)

Create a 2x3 array of random floats in the interval [0, 1)

In [None]:
arr = np.random.random([2,3])
print(arr)

Create an array of 10 evenly spaced numbers from 1.5 to 2 (both endpoints included)

In [None]:
arr = np.linspace(1.5,2,10)
print(arr)

That's all for the basic ways of creating arrays. You can also explore these 4 functions for creating arrays:

- [.full()](https://numpy.org/doc/stable/reference/generated/numpy.full.html): Create a constant array of any number ‘n’
- [.tile()](https://numpy.org/doc/stable/reference/generated/numpy.tile.html): Create a new array by repeating an existing array for a particular number of times
- [.eye()](https://numpy.org/doc/stable/reference/generated/numpy.eye.html): Create an identity matrix of any dimension
- [.random.randint()](https://numpy.org/doc/stable/reference/random/generated/numpy.random.randint.html): Create a random array of integers within a particular range
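
A brief sketch of the first three functions in this list, with illustrative arguments (`np.random.randint` is already shown in the practice cells above):

```python
import numpy as np

constant = np.full((2, 3), 7)      # 2x3 array where every element is 7
repeated = np.tile([1, 2], 3)      # the pattern 1, 2 repeated three times
identity = np.eye(3, dtype=int)    # 3x3 identity matrix

print(constant)
print(repeated)
print(identity)
```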

### Basic Operations

NumPy can perform a variety of operations; the most basic ones include addition, subtraction, and multiplication. Below are a few basic operations that can be done in NumPy without using loops.

**Create** a NumPy array to store marks of 5 students.

In [None]:
marks = [1, 2, 3, 4, 5]
marks_np = np.array(marks)
print(marks_np)

**Add** marks of 5 subjects of two different students.

In [None]:
marks_A = [10,20,10,20,14]
marks_B = [23,12,43,12,43]

marks_np_A = np.array(marks_A)
marks_np_B = np.array(marks_B)

total = marks_np_A + marks_np_B # Add using + operator
print(total)

**Convert** the weights of 5 students from kilograms to grams.

In [None]:
weight = [45, 55, 53, 63, 60] # In KG
weight_np = np.array(weight)

weight_in_gram = weight_np * 1000 # 1kg = 1000gm
print(weight_in_gram)

**Calculate** the BMI of 5 students. To calculate BMI, we need to:

- Create two arrays, one of heights and one of weights
- Apply the formula `weight_in_kg / (height_in_m ** 2)`

In [28]:
heights_in_inch = [71,72,73,74,75]
weights_in_lbs = [195, 180, 250, 230, 200]

First, let's convert the heights from inches to meters and the weights from pounds (lbs) to kilograms.

In [29]:
height_in_m = np.array(heights_in_inch) * 0.0254
weight_in_kg = np.array(weights_in_lbs) * 0.453592

Now that we have converted the arrays into the right units, let's calculate the BMI.

In [30]:
BMI = weight_in_kg / (height_in_m ** 2)
print("BMI",BMI)

BMI [27.19667848 24.41211827 32.98315848 29.52992539 24.99800911]


Here is a list of 5 common basic functions of the NumPy ndarray:

- [.sum](https://numpy.org/doc/stable/reference/generated/numpy.sum.html#numpy.sum): returns the sum of elements over a given axis.
- [.min](https://numpy.org/doc/stable/reference/generated/numpy.ndarray.min.html): returns the minimum along a given axis.
- [.max](https://numpy.org/doc/stable/reference/generated/numpy.ndarray.max.html): returns the maximum along a given axis.
- [.cumsum](https://numpy.org/doc/stable/reference/generated/numpy.ndarray.cumsum.html): returns the cumulative sum of elements along a given axis.
- [.mean](https://numpy.org/doc/stable/reference/generated/numpy.ndarray.mean.html): returns the average of elements along a given axis.
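
The following sketch applies these five functions to a small, made-up 2x3 array of marks. Without an `axis` argument they work on all elements; with `axis=0` they work down the columns and with `axis=1` across the rows.

```python
import numpy as np

marks = np.array([[10, 20, 30],
                  [40, 50, 60]])

total = marks.sum()                  # 210, over all elements
column_totals = marks.sum(axis=0)    # 50, 70, 90
row_totals = marks.sum(axis=1)       # 60, 150
lowest, highest = marks.min(), marks.max()   # 10 and 60
average = marks.mean()               # 35.0
running_total = marks.cumsum()       # 10, 30, 60, 100, 150, 210 (flattened order)

print(total, column_totals, row_totals)
print(lowest, highest, average)
print(running_total)
```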

NumPy also provides universal functions such as **sin**, **cos**, and **exp**, which operate element by element. These are also called **[ufuncs](https://numpy.org/doc/stable/reference/ufuncs.html)**.
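
As a short illustration with made-up input values, each universal function below is applied to every element of the array in a single call, without an explicit loop.

```python
import numpy as np

angles = np.array([0.0, np.pi / 2, np.pi])

sines = np.sin(angles)      # approximately 0, 1, 0
cosines = np.cos(angles)    # approximately 1, 0, -1
powers = np.exp(np.array([0.0, 1.0]))   # 1 and e

print(sines)
print(cosines)
print(powers)
```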

### Indexing, Slicing and Iterating

In [31]:
bmi_first_element = BMI[0] # first element
bmi_last_element = BMI[-1] # last element
bmi_first_five_elements = BMI[0:5] # elements 1-5
bmi_last_five_elements = BMI[-5:] # last five elements
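
The same ideas extend to arrays with more dimensions, where each axis gets its own index or slice separated by commas. A minimal sketch with an illustrative 3x3 array:

```python
import numpy as np

grid = np.array([[1, 2, 3],
                 [4, 5, 6],
                 [7, 8, 9]])

element = grid[1, 2]         # row 1, column 2 -> 6
first_row = grid[0]          # 1, 2, 3
second_column = grid[:, 1]   # 2, 5, 8
top_left = grid[:2, :2]      # first two rows and first two columns

print(element)
print(first_row)
print(second_column)
print(top_left)
```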

Filter the BMI array to keep only the values greater than 23

In [32]:
# Conditional Filter

BMI_filtered = BMI[BMI > 23]
print(BMI_filtered)

[27.19667848 24.41211827 32.98315848 29.52992539 24.99800911]
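
Iterating works much like it does for lists. The sketch below uses an illustrative 2-D array: a plain `for` loop yields one row at a time, and `.flat` visits every element. For arithmetic, a vectorized expression is usually preferable to an explicit loop.

```python
import numpy as np

table = np.array([[1, 2, 3],
                  [4, 5, 6]])

# Iterating over a 2-D array yields one row (a 1-D array) at a time.
for row in table:
    print(row)

# .flat visits every element in row-major order.
flat_values = [int(value) for value in table.flat]
print(flat_values)   # [1, 2, 3, 4, 5, 6]

# A vectorized expression usually replaces an explicit loop.
doubled = table * 2
print(doubled)
```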


Now you know the basics of working with NumPy arrays, and you should be able to create arrays and perform operations on them.

