<h3>NumPy</h3>
<p><font size=2>NumPy, short for Numerical Python, is a fundamental library for numerical and scientific computing in Python.
It provides support for arrays, matrices, and a wide range of mathematical functions to operate on these data structures.
</font>
</p>

In [1]:
import numpy as np          # import NumPy and pandas under their usual aliases
import pandas as pd

<h3>Arrays in Numpy</h3>
<p>In NumPy, an array is a powerful and versatile data structure used for numerical computing. NumPy arrays are more efficient than Python lists for numerical operations and come with a wide range of functionality.
<br>
<font size=2>Features of NumPy Arrays:
<br>

- Homogeneity: All elements in a NumPy array must be of the same data type. This homogeneity allows for efficient storage and operations.

- Multidimensional: While Python lists are primarily one-dimensional, NumPy arrays can be multidimensional. For example, a 2D array (matrix) or even higher-dimensional arrays are possible.

- Element-wise Operations: NumPy supports element-wise operations, allowing you to perform mathematical operations on arrays in a straightforward and efficient manner without needing explicit loops.

- Broadcasting: NumPy arrays support broadcasting, which allows arithmetic between arrays of different but compatible shapes by conceptually stretching missing or size-1 dimensions to match, without copying the data.

- Vectorization: Operations on NumPy arrays are implemented in a vectorized form, meaning they are processed in bulk and are highly optimized, often implemented in C or Fortran under the hood.

- Array Methods: NumPy provides a wide range of methods and functions to perform operations on arrays, such as reshaping, slicing, and aggregating data.</font></p>
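The features listed above can be sketched in a few lines. This is a minimal illustration only; the array names (`prices`, `quantities`, `grid`, `offsets`) are hypothetical and do not appear elsewhere in this notebook.

```python
import numpy as np

# Homogeneity: every element shares one dtype (float64 here)
prices = np.array([10.0, 20.0, 30.0])
quantities = np.array([1, 2, 3])

# Element-wise operation: no explicit loop is needed
totals = prices * quantities

# Broadcasting: a (2, 3) array combined with a (3,) array
grid = np.array([[1, 2, 3],
                 [4, 5, 6]])
offsets = np.array([10, 20, 30])
shifted = grid + offsets          # offsets is added to each row of grid

# Array methods: aggregate along an axis and reshape
row_sums = shifted.sum(axis=1)    # one sum per row
flat = shifted.reshape(-1)        # flatten to 1D

print(prices.dtype)
print(totals)
print(shifted)
print(row_sums)
print(flat)
```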

#### Creating Arrays

1D Array

In [2]:
lst = [1,2,3,4,5]  # a Python list to convert into a 1D array
arr = np.array(lst)         # 1d array
print(arr)

[1 2 3 4 5]


2D Array

In [3]:
lst2 = [[1,2,3,4,5],[6,7,8,9,0]]   # creating a nested list for 2d array
arr2 = np.array(lst2)           # 2d array
print(arr2)

[[1 2 3 4 5]
 [6 7 8 9 0]]


Checking the number of dimensions of the array (e.g. 1D, 2D, 3D)

In [4]:
print(arr.ndim)
print(arr2.ndim)

1
2


Checking the shape of the array (the number of rows and columns for a 2D array)

In [5]:
print(arr.shape)
print(arr2.shape)

(5,)
(2, 5)


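Checking the data type of the array (the default integer type is platform-dependent: it is int32 in this run, and int64 on most Linux and macOS systems)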

In [6]:
print(arr.dtype)
print(arr2.dtype)
# a string array gets a fixed-width Unicode dtype (<U7 = at most 7 characters)
str_arr = np.array(['Apple','Banana','Mangoes'])
print(str_arr.dtype)

int32
int32
<U7


Creating an array filled with zeroes or ones

In [7]:
zeroes = np.zeros((3,4))     # pass the shape of the array as a parameter
ones = np.ones((3,4))       
print("Zeroes Array:",zeroes)
print("Ones Array:",ones)

Zeroes Array: [[0. 0. 0. 0.]
 [0. 0. 0. 0.]
 [0. 0. 0. 0.]]
Ones Array: [[1. 1. 1. 1.]
 [1. 1. 1. 1.]
 [1. 1. 1. 1.]]


Creating an empty array: np.empty allocates memory without initializing it, so the array holds arbitrary leftover values (here they happen to be zeros) until you assign to it

In [8]:
empty = np.empty((2,3))
empty

array([[0., 0., 0.],
       [0., 0., 0.]])
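Because the contents of an `np.empty` array are arbitrary, it is safest to write to every element before reading any of them. A minimal sketch, with the hypothetical names `buffer` and `filled`:

```python
import numpy as np

buffer = np.empty((2, 3))        # uninitialized: do not rely on these values
buffer[:] = 7.5                  # overwrite every element before reading
print(buffer)

# np.full allocates and fills in a single step
filled = np.full((2, 3), 7.5)
print(np.array_equal(buffer, filled))
```

If a known starting value is needed, `np.zeros`, `np.ones`, or `np.full` are usually the clearer choice; `np.empty` mainly saves the initialization cost when every element will be overwritten anyway.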

np.arange creates an ndarray of evenly spaced values; it is the NumPy version of Python's range

In [9]:
arange = np.arange(15)     # pass a stop value to get the integers 0..14
arange                    # (the stop value itself is excluded)

array([ 0,  1,  2,  3,  4,  5,  6,  7,  8,  9, 10, 11, 12, 13, 14])
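Like `range`, `np.arange` also accepts a start value and a step. A short sketch (the variable names are illustrative only):

```python
import numpy as np

from_two = np.arange(2, 10)        # start=2, stop=10 (excluded)
by_fives = np.arange(0, 20, 5)     # step of 5
countdown = np.arange(10, 0, -2)   # a negative step counts down

print(from_two)
print(by_fives)
print(countdown)
```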

Changing the data type of array

In [10]:
# use astype() with a target dtype to convert
# an existing array to another data type
print(arange.dtype)
arange = arange.astype(np.float64)   # changing to float   
print(arange)                           
print(arange.dtype)

arange = arange.astype(np.bytes_)    # changing to byte strings (np.string_ was removed in NumPy 2.0)
print(arange)                           
print(arange.dtype)

int32
[ 0.  1.  2.  3.  4.  5.  6.  7.  8.  9. 10. 11. 12. 13. 14.]
float64
[b'0.0' b'1.0' b'2.0' b'3.0' b'4.0' b'5.0' b'6.0' b'7.0' b'8.0' b'9.0'
 b'10.0' b'11.0' b'12.0' b'13.0' b'14.0']
|S32
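Two details of `astype` are worth a quick check: it returns a new array and leaves the original untouched, and converting floats to integers drops the fractional part rather than rounding. A minimal sketch with hypothetical arrays:

```python
import numpy as np

floats = np.array([1.7, -2.9, 3.2])
ints = floats.astype(np.int64)     # fractional part is dropped (truncation toward zero)
print(ints)                        # values 1, -2, 3
print(floats)                      # the original array is unchanged

# Strings that hold numbers can be converted to floats as well
numeric_strings = np.array(['1.25', '3.5', '42'])
parsed = numeric_strings.astype(np.float64)
print(parsed)
```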


<h3>Operations between an Array and a Scalar</h3>
<font size=2>An operation between an array and a scalar applies a mathematical operation (such as addition, subtraction, multiplication, or division) between each element of the array and the scalar value. NumPy treats this as the simplest case of broadcasting: the scalar is conceptually stretched to the shape of the array.
<br>
For example, consider an array and a scalar:
<br>
Array: [2, 4, 6, 8]
<br>
Scalar: 3
<br>
Adding the scalar to each element of the array gives:
<br>
[2+3, 4+3, 6+3, 8+3] = [5, 7, 9, 11]
</font>

In [11]:
add_scalar = np.arange(1,6) # array([1,2,3,4,5])
print(add_scalar)
print(add_scalar + 5)

[1 2 3 4 5]
[ 6  7  8  9 10]
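The other arithmetic operators behave the same way with a scalar. The sketch below uses the array from the worked example above; the variable names are illustrative.

```python
import numpy as np

values = np.array([2, 4, 6, 8])

minus = values - 3      # -1, 1, 3, 5
times = values * 3      # 6, 12, 18, 24
halves = values / 2     # true division always produces floats
floors = values // 3    # floor division keeps the integer dtype

print(minus, times, halves, floors)
```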


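Arithmetic also works between two arrays. Here the 1D array arr (shape (5,)) is broadcast across each row of the 2D array arr2 (shape (2, 5)), and the ** operator raises each element to a power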

In [12]:
arr*arr2

array([[ 1,  4,  9, 16, 25],
       [ 6, 14, 24, 36,  0]])

In [13]:
arr**0.5

array([1.        , 1.41421356, 1.73205081, 2.        , 2.23606798])
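Whether two arrays can be combined depends on their shapes: compared from the last dimension backwards, each pair of sizes must be equal or one of them must be 1. The following sketch shows two compatible cases and one incompatible case; `row`, `table`, and `column` are hypothetical names.

```python
import numpy as np

row = np.array([1, 2, 3, 4, 5])              # shape (5,)
table = np.array([[1, 2, 3, 4, 5],
                  [6, 7, 8, 9, 0]])          # shape (2, 5)

product = row * table          # row is stretched across both rows of table
print(product.shape)           # (2, 5)

column = np.array([[10], [20]])              # shape (2, 1)
shifted = table + column       # column is stretched across the 5 columns
print(shifted.shape)           # (2, 5)

try:
    np.array([1, 2, 3]) * table              # (3,) vs (2, 5): last sizes differ
except ValueError as err:
    print("shapes do not broadcast:", err)
```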

<h3>Indexing in Arrays</h3>
<font size=2>
Indexing lets you access specific elements or slices of an array. The most common way to work with arrays in Python is through libraries like NumPy, which provide powerful array-handling capabilities.
<br>
There are two ways to index a NumPy array:
<br>

- Positive: from 0 (the first element) to n - 1 (the last element), where n is the number of elements
- Negative: from -1 (the last element) to -n (the first element)

<h3>Slicing in arrays</h3>
Slicing accesses several elements at once by selecting part of the array with the start:stop syntax; the element at index stop is excluded.
</font>

In [14]:
index = np.array([10,20,30,40,50,60,70,80,90,100])
print(index[4])     # element at index 4
print(index[-9])    # element at index -9 (ninth from the end)

50
20


In [15]:
print(index[5:8])   # slice from index 5 up to index 8 (excluded)
print(index[5:])    # omitting the stop index runs to the end of the array
print(index[:6])    # omitting the start index begins at index 0
                    # and stops before the given stop index
print(index[:])     # omitting both indexes selects the complete array

[60 70 80]
[ 60  70  80  90 100]
[10 20 30 40 50 60]
[ 10  20  30  40  50  60  70  80  90 100]
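Negative indexes and a step value can be used in slices too. The sketch below also illustrates one behaviour to keep in mind: a basic slice of a NumPy array is a view onto the same data, so writing to the slice changes the original unless `.copy()` is used. The `original`, `window`, and `safe` names are hypothetical.

```python
import numpy as np

index = np.array([10, 20, 30, 40, 50, 60, 70, 80, 90, 100])

last = index[-1]             # last element
last_three = index[-3:]      # final three elements
every_other = index[::2]     # step of 2
backwards = index[::-1]      # negative step reverses the array

# Slices are views: modifying the slice modifies the source array
original = np.arange(5)      # 0, 1, 2, 3, 4
window = original[1:4]
window[0] = 99               # also changes original[1]

# An explicit copy is independent of the source
safe = original[1:4].copy()
safe[1] = -1                 # original[2] is unaffected

print(last, last_three, every_other, backwards)
print(original)
```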


Accessing a single element of a 2D or 3D array requires one index per dimension (row and column for a 2D array). If only one index is given, it selects along the first axis, which for a 2D array is an entire row


In [16]:
arr2d = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
print(arr2d[2])
print(arr2d[0][2])

[7 8 9]
3


A comma-separated subscript ([row, col]) can also be used

In [17]:
arr2d[1,2]

6

Slicing also works on 2D arrays, with one slice (or index) per axis

In [18]:
print(arr2d[1,:2])
print(arr2d[:2, 1:])

[4 5]
[[2 3]
 [5 6]]
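A few more row and column selections on the same 3x3 array may help fix the pattern. This is only a sketch; the result names are illustrative.

```python
import numpy as np

arr2d = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])

first_column = arr2d[:, 0]     # every row, column 0
last_row = arr2d[-1, :]        # last row, every column
lower_left = arr2d[1:, :2]     # rows 1 onward, columns 0 and 1

print(first_column)
print(last_row)
print(lower_left)
```

Mixing an integer index with a slice, as in `arr2d[:, 0]`, drops that axis and returns a 1D array, while two slices keep the result 2D.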


<h3>Boolean Indexing</h3>
<font size=2>Boolean indexing (also known as masking) is a powerful NumPy feature. It selects elements from an array based on the result of a boolean condition applied to that array. Instead of using explicit loops or conditionals, you can use boolean indexing to filter or modify elements efficiently.
<br>
How Boolean Indexing Works:

- Create a Boolean Array: Apply a condition to an array, which results in a boolean array where each element is True or False, depending on whether the condition is met.
- Use Boolean Array for Indexing: Use this boolean array to index the original array, selecting only the elements where the boolean array has True.
</font>

In [19]:
bool_index = np.array([10,20,30,40,50])         # creating an array
bool_index

array([10, 20, 30, 40, 50])

In [20]:
condition = bool_index > 25     # creating boolean condition
print(condition)

[False False  True  True  True]


In [21]:
filtered_arr = bool_index[condition]    # applying bool condition
print(filtered_arr)                     # as index on array to filter

[30 40 50]


In [22]:
bool_index[bool_index>40]      # directly applying condition as index

array([50])
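Boolean indexing can modify elements as well as filter them: assigning through a mask changes only the positions where the mask is True. A minimal sketch with hypothetical names (`scores`, `capped`):

```python
import numpy as np

scores = np.array([10, 20, 30, 40, 50])

capped = scores.copy()              # work on a copy so scores stays intact
capped[capped > 25] = 25            # assign only where the condition holds

above = (scores > 25).sum()         # True counts as 1, so this counts matches

print(capped)
print(above)
```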

<font size=2>Consider an example with two arrays: an array of names that contains duplicates, and an array of data generated with the randn function from numpy.random, which draws normally distributed random values.</font>

In [23]:
names = np.array(['Bob', 'Joe', 'Will', 'Bob', 'Will', 'Joe', 'Joe'])
names

array(['Bob', 'Joe', 'Will', 'Bob', 'Will', 'Joe', 'Joe'], dtype='<U4')

In [25]:
data = np.random.randn(7, 4)
data

array([[-1.4267011 ,  0.13252478,  1.03923994,  1.02748659],
       [ 0.04207949, -1.40418122,  0.99024009, -2.3153865 ],
       [ 0.60849859, -0.680123  ,  1.09553484,  0.34940146],
       [ 0.21365149, -0.60072994, -1.22101717, -0.52420417],
       [ 0.0543096 , -0.39105914,  0.06218221, -0.24265439],
       [ 0.00365229, -0.0932944 ,  0.60338276, -1.49923569],
       [ 0.70819585,  1.30744786, -0.55718   ,  0.50678139]])

<font size=2>Suppose each name corresponds to a row in the data array, and we want to select all
the rows with the corresponding name 'Bob'. Like arithmetic operations, comparisons
(such as ==) with arrays are also vectorized. Thus, comparing names with the string
'Bob' yields a boolean array:</font>

In [26]:
names == 'Bob'

array([ True, False, False,  True, False, False, False])

In [27]:
data[names == 'Bob']
# selects every row of data whose position holds True
# in the boolean array (rows 0 and 3 here)

array([[-1.4267011 ,  0.13252478,  1.03923994,  1.02748659],
       [ 0.21365149, -0.60072994, -1.22101717, -0.52420417]])

In [28]:
mask = (names == 'Bob') | (names == 'Will')

In [30]:
print(mask)
data[mask]

[ True False  True  True  True False False]


array([[-1.4267011 ,  0.13252478,  1.03923994,  1.02748659],
       [ 0.60849859, -0.680123  ,  1.09553484,  0.34940146],
       [ 0.21365149, -0.60072994, -1.22101717, -0.52420417],
       [ 0.0543096 , -0.39105914,  0.06218221, -0.24265439]])
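Masks can also be negated with `~`, combined with `&`, and mixed with ordinary slices. The sketch below is an illustration under assumptions: it uses a seeded `np.random.default_rng` generator in place of `randn` so the run is repeatable, and the names `not_bob`, `either`, and `bob_two_cols` are hypothetical.

```python
import numpy as np

names = np.array(['Bob', 'Joe', 'Will', 'Bob', 'Will', 'Joe', 'Joe'])
rng = np.random.default_rng(0)             # seeded generator (assumption)
data = rng.standard_normal((7, 4))

not_bob = ~(names == 'Bob')                # negate a mask with ~
print(data[not_bob].shape)                 # (5, 4): the five non-Bob rows

# np.isin builds the same mask as (names == 'Bob') | (names == 'Will')
either = np.isin(names, ['Bob', 'Will'])
print(either)

# A mask on the rows can be combined with a slice on the columns
bob_two_cols = data[names == 'Bob', :2]
print(bob_two_cols.shape)                  # (2, 2)
```

Each comparison needs its own parentheses, and the Python keywords `and` and `or` do not work element-wise on arrays; use `&` and `|` instead.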

Fancy Indexing