# NumPy

- NumPy is a Python library used for working with arrays.

- It also has functions for working in the domains of linear algebra, Fourier transforms, and matrices.

- NumPy was created in 2005 by Travis Oliphant. It is an open source project and you can use it freely.

- NumPy stands for Numerical Python.

- NumPy (Numerical Python) is an open source Python library that’s widely used in science and engineering.

- The NumPy library contains multidimensional array data structures, such as the homogeneous, N-dimensional `ndarray`, and a large library of functions that operate efficiently on these data structures.

## Installation of NumPy

- If you have Python and pip already installed on a system, then installing NumPy is very easy.

- Install it using this command:

In [3]:
!pip install numpy



## How to import NumPy

- Once NumPy is installed, import it in your applications by adding the `import` keyword:

In [2]:
import numpy

In [3]:
import numpy

arr = numpy.array([1,2,3,4,5,6,7])

arr

array([1, 2, 3, 4, 5, 6, 7])

## NumPy as np

- NumPy is usually imported under the `np` alias.

- __alias__: In Python, an alias is an alternate name for referring to the same thing.

- Create an alias with the `as` keyword while importing:

In [5]:
import numpy as np

arr = np.array([1, 2, 3, 4, 5])

print(arr)

[1 2 3 4 5]


This widespread convention allows access to NumPy features with a short, recognizable prefix (`np.`) while distinguishing NumPy features from others that have the same name.

## Why use NumPy?

Python lists are excellent, general-purpose containers. They can be “heterogeneous”, meaning that they can contain elements of a variety of types, and they are quite fast when used to perform individual operations on a handful of elements.

Depending on the characteristics of the data and the types of operations that need to be performed, other containers may be more appropriate; by exploiting these characteristics, we can improve speed, reduce memory consumption, and offer a high-level syntax for performing a variety of common processing tasks. NumPy shines when there are large quantities of “homogeneous” (same-type) data to be processed on the CPU.
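
As a rough illustration of the high-level syntax mentioned above, the sketch below doubles every value in a small collection, first with a list comprehension and then with a single array expression. The variable names and sample values are made up for this example.

```python
import numpy as np

# Hypothetical sample data: a handful of same-type (float) values.
prices = [1.5, 2.0, 3.25]

# With a plain list, each element is handled explicitly.
doubled_list = [p * 2 for p in prices]

# With an array, the operation is applied to every element at once.
doubled_arr = np.array(prices) * 2

print(doubled_list)
print(doubled_arr)
```

Both approaches give the same numbers here; the array form tends to pay off as the amount of same-type data grows.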

## What is an “array”?

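In computer programming, an array is a structure for storing and retrieving data. We often talk about an array as if it were a grid in space, with each cell storing one element of the data. For instance, if each element of the data were a number, we might picture a one-dimensional array as a single row of numbers and a two-dimensional array as a table of rows and columns.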

## Dimensions in Arrays

- A dimension in arrays is one level of array depth (nested arrays).

- __nested array__: an array that has arrays as its elements.

### 0-D Arrays

0-D arrays, or Scalars, are the elements in an array. Each value in an array is a 0-D array.

In [5]:
import numpy as np

arr = np.array(42)

print(arr)

42


### 1-D Arrays

An array that has 0-D arrays as its elements is called a uni-dimensional or 1-D array.

These are the most common and basic arrays.

In [6]:
import numpy as np

arr = np.array([1, 2, 3, 4, 5])

print(arr)

[1 2 3 4 5]


### 2-D Arrays

An array that has 1-D arrays as its elements is called a 2-D array.

These are often used to represent matrices or 2nd-order tensors.

In [7]:
import numpy as np

arr = np.array([[1, 2, 3], [4, 5, 6]])

print(arr)

[[1 2 3]
 [4 5 6]]


### 3-D arrays

An array that has 2-D arrays (matrices) as its elements is called a 3-D array.

These are often used to represent a 3rd order tensor.

In [8]:
import numpy as np

arr = np.array([[[1, 2, 3], [4, 5, 6]], [[1, 2, 3], [4, 5, 6]]])

print(arr)

[[[1 2 3]
  [4 5 6]]

 [[1 2 3]
  [4 5 6]]]


## NumPy Array Indexing

- Array indexing is the same as accessing an array element.

- You can access an array element by referring to its index number.

- The indexes in NumPy arrays start with 0, meaning that the first element has index 0, and the second has index 1 etc.

In [9]:
import numpy as np

arr = np.array([1, 2, 3, 4])

print(arr[0])

1


In [10]:
import numpy as np

arr = np.array([1, 2, 3, 4])

print(arr[2])

3


### Access 2-D Arrays

To access elements from 2-D arrays we can use comma separated integers representing the dimension and the index of the element.

Think of 2-D arrays like a table with rows and columns, where the dimension represents the row and the index represents the column.

In [11]:
import numpy as np

arr = np.array([[1,2,3,4,5], [6,7,8,9,10]])

print('2nd element on 1st row: ', arr[0, 1])

2nd element on 1st row:  2


In [12]:
import numpy as np

arr = np.array([[1,2,3,4,5], [6,7,8,9,10]])

print('5th element on 2nd row: ', arr[1, 4])

5th element on 2nd row:  10
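
Following the table picture above, a whole row or a whole column can be selected as well. This is a minimal sketch using the same array; the variable names `first_row` and `second_column` are only illustrative.

```python
import numpy as np

arr = np.array([[1, 2, 3, 4, 5], [6, 7, 8, 9, 10]])

first_row = arr[0]         # a single index selects a whole row
second_column = arr[:, 1]  # ':' keeps every row, 1 picks the column

print(first_row)
print(second_column)
```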


### Access 3-D Arrays

To access elements from 3-D arrays we can use comma separated integers representing the dimensions and the index of the element.

In [13]:
import numpy as np

arr = np.array([[[1, 2, 3],
                 [4, 5, 6]],
                
                [[7, 8, 9],
                 [10, 11, 12]]])

print(arr[0, 1, 2]) # (2-D block, row, col)

6


### Negative Indexing

Use negative indexing to access an array from the end.

In [14]:
import numpy as np

arr = np.array([[1,2,3,4,5], [6,7,8,9,10]])

print('Last element from 2nd dim: ', arr[1, -1])

Last element from 2nd dim:  10


## NumPy Array Slicing

- Slicing in Python means taking elements from one given index to another given index.

- We pass a slice instead of an index like this: `[start:end]`.

- We can also define the step, like this: `[start:end:step]`.

- If we don't pass start, it's considered 0.

- If we don't pass end, it's considered the length of the array in that dimension.

- If we don't pass step, it's considered 1.

In [15]:
import numpy as np

arr = np.array([1, 2, 3, 4, 5, 6, 7])

print(arr[1:5])

[2 3 4 5]


In [16]:
import numpy as np

arr = np.array([1, 2, 3, 4, 5, 6, 7])

print(arr[4:])

[5 6 7]


In [17]:
import numpy as np

arr = np.array([1, 2, 3, 4, 5, 6, 7])

print(arr[-3:-1])

[5 6]


In [21]:
# Use the step value to determine the step of the slicing:

import numpy as np

arr = np.array([1, 2, 3, 4, 5, 6, 7])

print(arr[1:7:2])

[2 4 6]


In [22]:
import numpy as np

arr = np.array([1, 2, 3, 4, 5, 6, 7])

print(arr[::2])

[1 3 5 7]


In [24]:
# Slicing 2-D arrays: from the second row, slice elements from index 1 to index 4 (not included):

import numpy as np

arr = np.array([[1, 2, 3, 4, 5], [6, 7, 8, 9, 10]])

print(arr[1, 1:4])

[7 8 9]


In [25]:
import numpy as np

arr = np.array([[1, 2, 3, 4, 5], [6, 7, 8, 9, 10]])

print(arr[0:2, 1:4])

[[2 3 4]
 [7 8 9]]


## Array attributes

This section covers the `ndim`, `shape`, `size`, and `dtype` attributes of an array.

- The number of dimensions of an array is contained in the `ndim` attribute.

In [26]:
import numpy as np

arr = np.array([1, 2, 3, 4])

arr.ndim

1

- The shape of an array is a tuple of non-negative integers that specify the number of elements along each dimension.

In [27]:
a = np.array([[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12]])
a.shape

(3, 4)

- The fixed, total number of elements in an array is contained in the `size` attribute.

In [28]:
a.size

12

In [29]:
import math
a.size == math.prod(a.shape)

True

- Arrays are typically “homogeneous”, meaning that they contain elements of only one “data type”. The data type is recorded in the `dtype` attribute.

In [31]:
a.dtype

dtype('int32')
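
The `int32` shown above depends on the platform and the NumPy version; on many systems the default integer type is `int64` instead. A minimal sketch, assuming you want the same integer width everywhere, is to pass `dtype` explicitly:

```python
import numpy as np

default_arr = np.array([1, 2, 3])                  # integer width chosen by NumPy
fixed_arr = np.array([1, 2, 3], dtype=np.int64)    # integer width chosen by you

print(default_arr.dtype)  # may be int32 or int64, depending on the system
print(fixed_arr.dtype)
```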

## How to create a basic array

This section covers `np.zeros()`, `np.ones()`, `np.empty()`, `np.arange()`, `np.linspace()`.

- Besides creating an array from a sequence of elements, you can easily create an array filled with `0`’s:

In [32]:
np.zeros(2)

array([0., 0.])

In [33]:
#Create an array of zeros
np.zeros((3,4)) #creating 2D array

array([[0., 0., 0., 0.],
       [0., 0., 0., 0.],
       [0., 0., 0., 0.]])

- Or an array filled with `1`’s:

In [34]:
np.ones(2)

array([1., 1.])

In [35]:
#Create an array of ones
np.ones((2,3,4))

array([[[1., 1., 1., 1.],
        [1., 1., 1., 1.],
        [1., 1., 1., 1.]],

       [[1., 1., 1., 1.],
        [1., 1., 1., 1.],
        [1., 1., 1., 1.]]])

Or even an empty array! The function `empty` creates an array whose initial content is arbitrary and depends on the state of the memory. The reason to use `empty` over `zeros` (or something similar) is speed; just make sure to fill every element afterwards!

In [41]:
# Create an empty array with 2 elements
np.empty(2)

array([1., 1.])
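
Because the values returned by `np.empty()` are arbitrary, the output shown above may differ from run to run. The sketch below, with made-up values, allocates a buffer and then assigns every element before reading it:

```python
import numpy as np

buf = np.empty(3)            # contents are uninitialized at this point
buf[:] = [0.5, 1.5, 2.5]     # overwrite every element before using the array

print(buf)
```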

You can also create an array that contains a range of evenly spaced values. To do this, specify the first number, the stop value, and the step size. The stop value itself is not included in the result.

In [42]:
#Create an array of evenly spaced values (step value)
d = np.arange(10,25,2)
print(d)

[10 12 14 16 18 20 22 24]


You can also use `np.linspace()` to create an array with values that are spaced linearly in a specified interval:

In [43]:
np.linspace(0, 10, num=5)

array([ 0. ,  2.5,  5. ,  7.5, 10. ])
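
One difference worth noting: `np.linspace()` includes the end value by default, while `np.arange()` stops before it. A small sketch of this, using the `endpoint` parameter of `np.linspace()`:

```python
import numpy as np

with_end = np.linspace(0, 10, num=5)                      # 10 is included
without_end = np.linspace(0, 10, num=5, endpoint=False)   # 10 is excluded
stepped = np.arange(0, 10, 2)                             # 10 is excluded

print(with_end)
print(without_end)
print(stepped)
```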

## Data Types in NumPy

NumPy has some extra data types, and refers to data types with one character, like `i` for integers, `u` for unsigned integers, etc.

Below is a list of all data types in NumPy and the characters used to represent them.

- `i` - integer
  
- `b` - boolean
  
- `u` - unsigned integer
  
- `f` - float
  
- `c` - complex float
  
- `m` - timedelta
  
- `M` - datetime
  
- `O` - object
  
- `S` - string
  
- `U` - unicode string
  
- `V` - fixed chunk of memory for other type (void)

In [45]:
# String
import numpy as np

arr = np.array([1, 2, 3, 4], dtype='S')

print(arr)
print(arr.dtype)

[b'1' b'2' b'3' b'4']
|S1


In [46]:
# Integer
import numpy as np

arr = np.array([1, 2, 3, 4], dtype='i4')

print(arr)
print(arr.dtype)

[1 2 3 4]
int32
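
The one-character codes listed above can be read back from any array through `dtype.kind`. The sketch below uses arbitrary sample values, and the dictionary keys are just labels chosen for this example:

```python
import numpy as np

samples = {
    "integers": np.array([1, 2, 3]),
    "floats": np.array([1.0, 2.5]),
    "booleans": np.array([True, False]),
    "complex": np.array([1 + 2j]),
    "text": np.array(["a", "bc"]),
}

# Map each sample name to the kind code of its data type.
kinds = {name: arr.dtype.kind for name, arr in samples.items()}
print(kinds)
```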


### Converting Data Type on Existing Arrays

- The best way to change the data type of an existing array is to make a copy of the array with the `astype()` method.

- The `astype()` method creates a copy of the array, and allows you to specify the data type as a parameter.

- The data type can be specified using a string, like `'f'` for float or `'i'` for integer, or you can use the data type directly, like `float` for float and `int` for integer.

In [47]:
#Change data type from float to integer by using 'i' as parameter value:
arr = np.array([1.1, 2.1, 3.1])

newarr = arr.astype('i')

print(newarr)
print(newarr.dtype)

[1 2 3]
int32


In [6]:
#Change data type from integer to boolean:
arr = np.array([-1, 0, 3])

newarr = arr.astype(bool)

print(newarr)
print(newarr.dtype)

[ True False  True]
bool
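
When converting floats to integers, `astype()` drops the fractional part rather than rounding. A short sketch with made-up values, using `np.rint()` for the case where rounding is what you want:

```python
import numpy as np

arr = np.array([1.9, -1.9, 2.6])

truncated = arr.astype(int)          # fractional part is discarded
rounded = np.rint(arr).astype(int)   # round to the nearest integer first

print(truncated)
print(rounded)
```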


## NumPy Array Copy vs View

The main difference between a copy and a view of an array is that the copy is a new array, and the view is just a view of the original array.

The copy owns the data and any changes made to the copy will not affect original array, and any changes made to the original array will not affect the copy.

The view does not own the data and any changes made to the view will affect the original array, and any changes made to the original array will affect the view.

### COPY:

In [49]:
import numpy as np

arr = np.array([1, 2, 3, 4, 5])
x = arr.copy()
arr[0] = 42

print(arr)
print(x)

[42  2  3  4  5]
[1 2 3 4 5]


- __Note__: The copy SHOULD NOT be affected by the changes made to the original array.

### VIEW:

In [50]:
import numpy as np

arr = np.array([1, 2, 3, 4, 5])
x = arr.view()
arr[0] = 42

print(arr)
print(x)

[42  2  3  4  5]
[42  2  3  4  5]


- __Note__: The view SHOULD be affected by the changes made to the original array.

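Every NumPy array has a `base` attribute. It is `None` if the array owns its data; otherwise it refers to the original object.

In [51]: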
import numpy as np

arr = np.array([1, 2, 3, 4, 5])

x = arr.copy()
y = arr.view()

print(x.base)
print(y.base)

None
[1 2 3 4 5]
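
Basic slicing also returns a view, so the same rule applies to slices. The sketch below uses `np.shares_memory()` to check this; the values are arbitrary and the variable names are only illustrative.

```python
import numpy as np

arr = np.array([1, 2, 3, 4, 5])

part = arr[1:4]                 # basic slice: a view onto arr
part[0] = 99                    # writes through to arr

independent = arr[1:4].copy()   # explicit copy: owns its data
independent[0] = -1             # arr is not affected

print(arr)
print(np.shares_memory(arr, part))
print(np.shares_memory(arr, independent))
```

If a slice needs to be modified without touching the original, calling `.copy()` on it is one straightforward option.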


## Adding, removing, and sorting elements

This section covers `np.sort()` and `np.concatenate()`.

- Sorting an array is simple with `np.sort()`. You can specify the axis, kind, and order when you call the function.

In [52]:
arr = np.array([2, 1, 5, 3, 7, 4, 6, 8])
np.sort(arr)

array([1, 2, 3, 4, 5, 6, 7, 8])
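
`np.sort()` returns a sorted copy and leaves the original array untouched. A brief sketch of the `axis` argument on a 2-D array, plus one common way to get descending order by reversing the sorted result (the data is made up):

```python
import numpy as np

grid = np.array([[3, 7, 2],
                 [9, 1, 8]])

by_row = np.sort(grid, axis=1)                # sort within each row
by_column = np.sort(grid, axis=0)             # sort within each column
descending = np.sort(grid, axis=1)[:, ::-1]   # reverse each sorted row

print(by_row)
print(by_column)
print(descending)
```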

- You can concatenate arrays with `np.concatenate()`.

In [53]:
a = np.array([1, 2, 3, 4])
b = np.array([5, 6, 7, 8])
np.concatenate((a, b))

array([1, 2, 3, 4, 5, 6, 7, 8])

In [11]:
x = np.array([[1, 2], [3, 4]])
y = np.array([[5, 6],[4,5]])
np.concatenate((x, y), axis=0)

array([[1, 2],
       [3, 4],
       [5, 6],
       [4, 5]])
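
The `axis` argument decides the direction in which the arrays are joined. As a sketch with the same `x` and `y`, `axis=0` stacks rows on top of each other, while `axis=1` places the arrays side by side:

```python
import numpy as np

x = np.array([[1, 2], [3, 4]])
y = np.array([[5, 6], [4, 5]])

stacked_rows = np.concatenate((x, y), axis=0)   # shape (4, 2)
side_by_side = np.concatenate((x, y), axis=1)   # shape (2, 4)

print(stacked_rows.shape)
print(side_by_side)
```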

## How do you know the shape and size of an array?

- This section covers `ndarray.ndim`, `ndarray.size`, and `ndarray.shape`.

- `ndarray.ndim` will tell you the number of axes, or dimensions, of the array.

- `ndarray.size` will tell you the total number of elements of the array. This is the product of the elements of the array’s shape.

- `ndarray.shape` will display a tuple of integers that indicate the number of elements stored along each dimension of the array. If, for example, you have a 2-D array with 2 rows and 3 columns, the shape of your array is `(2, 3)`.

In [55]:
array_example = np.array([[[0, 1, 2, 3],
                           [4, 5, 6, 7]],

                          [[0, 1, 2, 3],
                           [4, 5, 6, 7]],

                          [[0 ,1 ,2, 3],
                           [4, 5, 6, 7]]])

In [56]:
array_example.ndim

3

In [57]:
array_example.size

24

In [58]:
array_example.shape

(3, 2, 4)

## Can you reshape an array?

Using `arr.reshape()` will give a new shape to an array without changing the data. Just remember that when you use the reshape method, the array you want to produce needs to have the same number of elements as the original array. If you start with an array with 12 elements, you’ll need to make sure that your new array also has a total of 12 elements.

In [13]:
a = np.arange(6)
print(a)

[0 1 2 3 4 5]


In [19]:
b = a.reshape(2, 3)
print(b)

[[0 1 2]
 [3 4 5]]


In [22]:
#Convert the following 1-D array with 14 elements into a 2-D array with 2 rows and 7 columns.
arr = np.array([1, 2, 3, 4, 5, 6, 7, 11, 9, 10, 11, 12, 13, 16])

newarr = arr.reshape(2, 7)

print(newarr)

[[ 1  2  3  4  5  6  7]
 [11  9 10 11 12 13 16]]


In [28]:
#Convert the following 1-D array with 12 elements into a 3-D array.
#The outermost dimension will have 1 array that contains 12 arrays,
#each with 1 element:
arr = np.array([1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12])

newarr = arr.reshape(1, 12, 1)

print(newarr)

[[[ 1]
  [ 2]
  [ 3]
  [ 4]
  [ 5]
  [ 6]
  [ 7]
  [ 8]
  [ 9]
  [10]
  [11]
  [12]]]
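
Two details of `reshape()` are worth a quick sketch: one dimension can be given as `-1` so that NumPy infers it from the total size, and a shape with the wrong number of elements raises a `ValueError`. The variable names here are only illustrative.

```python
import numpy as np

arr = np.arange(12)

# -1 asks NumPy to work out that dimension from the total size.
inferred = arr.reshape(3, -1)
print(inferred.shape)

# A shape whose product is not 12 is rejected.
try:
    arr.reshape(5, 3)
    error_message = None
except ValueError as exc:
    error_message = str(exc)
print(error_message)
```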


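## Basic array operations

Arithmetic operators and their function equivalents, such as `np.add()`, work element by element on arrays of the same shape.

In [65]:
# Create two 2-D float arrays to use in the examples below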
x = np.array([[1,2],[3,4]], dtype=np.float64)
y = np.array([[5,6],[7,8]], dtype=np.float64)

In [66]:
# Addition of two Arrays
print(x + y)
print(np.add(x, y))

[[ 6.  8.]
 [10. 12.]]
[[ 6.  8.]
 [10. 12.]]


In [67]:
# Subtraction of two arrays
print(x - y)
print(np.subtract(x, y))

[[-4. -4.]
 [-4. -4.]]
[[-4. -4.]
 [-4. -4.]]


In [68]:
# Element-wise multiplication (product)
print(x * y)
print(np.multiply(x, y))

[[ 5. 12.]
 [21. 32.]]
[[ 5. 12.]
 [21. 32.]]
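
Note that `*` and `np.multiply()` work element by element. For the matrix product of linear algebra, NumPy provides the `@` operator and `np.matmul()`. A sketch using the same `x` and `y` values as above:

```python
import numpy as np

x = np.array([[1, 2], [3, 4]], dtype=np.float64)
y = np.array([[5, 6], [7, 8]], dtype=np.float64)

elementwise = x * y      # multiplies matching positions
matrix_product = x @ y   # rows of x times columns of y

print(elementwise)
print(matrix_product)
```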


In [69]:
#division
print(x / y)
print(np.divide(x, y))

[[0.2        0.33333333]
 [0.42857143 0.5       ]]
[[0.2        0.33333333]
 [0.42857143 0.5       ]]


In [70]:
#square root
print(np.sqrt(x))
print(np.sqrt(y))

[[1.         1.41421356]
 [1.73205081 2.        ]]
[[2.23606798 2.44948974]
 [2.64575131 2.82842712]]


In [71]:
arr = np.array([3, 2, 0, 1])
print(np.sort(arr))

[0 1 2 3]


In [72]:
#Slice elements from index 1 to index 5 (not included) from the following array:
arr = np.array([1, 2, 3, 4, 5, 6, 7])
print(arr[1:5])

[2 3 4 5]


In [73]:
#Slice elements from index 4 to the end of the array:
arr = np.array([1, 2, 3, 4, 5, 6, 7])

print(arr[4:])

[5 6 7]


In [74]:
#Slice elements from the beginning to index 4 (not included):
arr = np.array([1, 2, 3, 4, 5, 6, 7])

print(arr[:4])

[1 2 3 4]


### Filling missing values with pandas

In [29]:
import pandas as pd
import numpy as np

# Sample data with missing values
data = {'values': [10, 15, np.nan, 1000, 20, 25, np.nan]}
df = pd.DataFrame(data)

# Fill missing values with mean
df['mean_filled'] = df['values'].fillna(df['values'].mean())

# Fill missing values with median
df['median_filled'] = df['values'].fillna(df['values'].median())

print(df)


   values  mean_filled  median_filled
0    10.0         10.0           10.0
1    15.0         15.0           15.0
2     NaN        214.0           20.0
3  1000.0       1000.0         1000.0
4    20.0         20.0           20.0
5    25.0         25.0           25.0
6     NaN        214.0           20.0
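
The same fill can be sketched with NumPy alone. This is only an illustration, assuming the values live in a plain float array: `np.nanmean()` and `np.nanmedian()` ignore the missing entries, and `np.where()` substitutes the chosen statistic. Note how the single large value (1000) pulls the mean far above the typical values, while the median stays close to them.

```python
import numpy as np

values = np.array([10, 15, np.nan, 1000, 20, 25, np.nan])

mean_value = np.nanmean(values)       # mean of the non-missing entries
median_value = np.nanmedian(values)   # median of the non-missing entries

missing = np.isnan(values)
mean_filled = np.where(missing, mean_value, values)
median_filled = np.where(missing, median_value, values)

print(mean_filled)
print(median_filled)
```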


In [4]:
import pandas as pd
import numpy as np


data = {'Values':[10,15,np.nan,1000,20,25,np.nan]}

df = pd.DataFrame(data)

df['mean_value'] = df['Values'].fillna(df['Values'].mean())
df['median_value'] = df['Values'].fillna(df['Values'].median())

print(df)

   Values  mean_value  median_value
0    10.0        10.0          10.0
1    15.0        15.0          15.0
2     NaN       214.0          20.0
3  1000.0      1000.0        1000.0
4    20.0        20.0          20.0
5    25.0        25.0          25.0
6     NaN       214.0          20.0
