<h2>NumPy</h2>

<p>One of the reasons NumPy is so important for numerical computations in Python is
that it is designed for efficiency on large arrays of data. There are a number of
reasons for this:</p>
<ul>
<li>NumPy internally stores data in a contiguous block of memory, independent of
other built-in Python objects. NumPy’s library of algorithms written in the C language
can operate on this memory without any type checking or other overhead.
NumPy arrays also use much less memory than built-in Python sequences.</li>
<li>NumPy operations perform complex computations on entire arrays without the
need for Python for loops, which can be slow for large sequences. NumPy is
faster than regular Python code because its C-based algorithms avoid overhead
present with regular interpreted Python code.</li>
</ul>

In [21]:
import numpy as np

In [22]:
# Performance difference between NumPy and plain Python
# Consider a NumPy array of one million elements and a Python list of one million elements
np_arr = np.arange(1_000_000)
py_list = list(range(1_000_000))
# Multiply each element by 2
%timeit np_arr * 2
%timeit [x * 2 for x in py_list]

2.6 ms ± 638 μs per loop (mean ± std. dev. of 7 runs, 1,000 loops each)
76.9 ms ± 6.97 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)
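
<p>The memory claim in the list above can be checked with a rough sketch like the one below. The names <code>array_bytes</code> and <code>list_bytes</code> are illustrative, and the list figure is only an estimate, because CPython shares some small integer objects between references.</p>

```python
import sys
import numpy as np

n = 1_000_000
np_arr = np.arange(n, dtype=np.int64)
py_list = list(range(n))

# Data buffer of the array: n elements * 8 bytes per int64
array_bytes = np_arr.nbytes

# A list holds one pointer per element, and each element is a separate int object
list_bytes = sys.getsizeof(py_list) + sum(sys.getsizeof(x) for x in py_list)

print('ndarray data buffer:', array_bytes, 'bytes')
print('list plus int objects (estimate):', list_bytes, 'bytes')
```

<p>The exact list size depends on the Python build, so only the comparison between the two numbers is meaningful here.</p>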


<h3>ndarray: A Multidimensional Array Object</h3>
<p>The N-dimensional array object, or ndarray, is a fast, flexible container for large datasets in Python. Arrays enable you to
perform mathematical operations on whole blocks of data using similar syntax to the
equivalent operations between scalar elements. An ndarray is a multidimensional container for homogeneous data; that is, all
of the elements must be the same type.</p>

In [23]:
# Every array has a shape, a tuple of integers indicating the size of each dimension. 
print(np_arr.shape)

# a dtype is an object describing how the bytes in the fixed-size block of memory should be interpreted.
print(np_arr.dtype)

(1000000,)
int64
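
<p>As a small illustration of the homogeneity rule, the sketch below mixes integers and floats in one array and then prints a few descriptive attributes. The variable names <code>mixed</code> and <code>grid</code> are made up for this example.</p>

```python
import numpy as np

# Mixing ints and floats: NumPy picks one dtype that can hold every element
mixed = np.array([1, 2.5, 3])
print(mixed, mixed.dtype)

grid = np.arange(6, dtype=np.int64).reshape(2, 3)
print('shape   :', grid.shape)     # size of each dimension
print('ndim    :', grid.ndim)      # number of dimensions
print('size    :', grid.size)      # total number of elements
print('itemsize:', grid.itemsize)  # bytes per element
```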


<h3>Creating ndarrays</h3>

In [24]:
# Consider a list
data1 = [1, 3, 5, 7]
# We can convert this list to a NumPy array by calling the array function.
arr1 = np.array(data1)
print('List to array:')
print(arr1)
print('Shape of array:', arr1.shape)
print()

# Nested sequences are converted to multidimensional arrays
data2 = [[1, 2, 3], [4, 5, 6]]
arr2 = np.array(data2)
print('Multidimensional array :')
print(arr2)
# ndim is the number of dimensions of the array
print('ndim of multidimensional array:', arr2.ndim)
# Shape of multi-dimensional array
print('Shape of multidimensional array:', arr2.shape)


List to array:
[1 3 5 7]
Shape of array: (4,)

Multidimensional array :
[[1 2 3]
 [4 5 6]]
ndim of multidimensional array: 2
Shape of multidimensional array: (2, 3)


<p>Unless explicitly specified,
numpy.array tries to infer a good data type for the array that it creates. The data
type is stored in a special dtype metadata object.</p>

In [25]:
# dtype of the array is inferred from the data type of the elements in the sequences
print('dtype of multidimensional array:', arr2.dtype)

# array with float elements
arr3 = np.array([[1.0, 3.4], [6.9, 8.1]])
# dtype of the array is inferred from the data type of the elements in the sequences
print('Array with float elements:')
print(arr3)
print('dtype of array with float elements:', arr3.dtype)

dtype of multidimensional array: int64
Array with float elements:
[[1.  3.4]
 [6.9 8.1]]
dtype of array with float elements: float64


<p>In addition to numpy.array, there are a number of other functions for creating
new arrays. As examples, numpy.zeros and numpy.ones create arrays of 0s or 1s,
respectively, with a given length or shape. numpy.empty creates an array without
initializing its values to any particular value.</p>

In [26]:
# Create an array of zeros
zero_arr = np.zeros(5)
print('Array of zeros:', zero_arr)
print()

# Array of zeros with a specific shape
zero_arr_2d = np.zeros((3, 4))  # 3 rows and 4 columns
print('2D Array of zeros:')
print(zero_arr_2d)
print()

# Create an array of ones
ones_arr = np.ones(5)
print('Array of ones:', ones_arr)
print()

# Array of ones with a specific shape
ones_arr_2d = np.ones((2, 3))
print('2D Array of ones:')
print(ones_arr_2d)

# Create an empty array (values are uninitialized)
empty_arr = np.empty(5)
print('Empty array:', empty_arr)
print()

# Empty array with a specific shape (three-dimensional: 2 x 3 x 2)
empty_arr_3d = np.empty((2, 3, 2))
print('3D Empty array:')
print(empty_arr_3d)



Array of zeros: [0. 0. 0. 0. 0.]

2D Array of zeros:
[[0. 0. 0. 0.]
 [0. 0. 0. 0.]
 [0. 0. 0. 0.]]

Array of ones: [1. 1. 1. 1. 1.]

2D Array of ones:
[[1. 1. 1.]
 [1. 1. 1.]]
Empty array: [1. 1. 1. 1. 1.]

3D Empty array:
[[[0. 0.]
  [0. 0.]
  [0. 0.]]

 [[0. 0.]
  [0. 0.]
  [0. 0.]]]


<p>numpy.empty will not always return an array of all
zeros. This function returns uninitialized memory and thus may
contain nonzero “garbage” values. You should use this function
only if you intend to populate the new array with data.</p>
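
<p>A minimal sketch of the intended usage pattern: allocate with <code>numpy.empty</code>, then write every element before reading any of them. The array name <code>squares</code> is only an example.</p>

```python
import numpy as np

# Allocate without initializing; the contents are arbitrary at this point
squares = np.empty(5, dtype=np.int64)

# Populate every element before reading anything back
for i in range(squares.size):
    squares[i] = i * i

print(squares)
```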

<h3>Data Types for ndarrays</h3>
<p>The data type or dtype is a special object containing the information (or metadata,
data about data) the ndarray needs to interpret a chunk of memory as a particular
type of data.</p>

In [27]:
# Creating an array of dtype float64
float_arr = np.array([1, 2, 3], dtype=np.float64)
print('Array with dtype float64:', float_arr)
print('dtype of array with dtype float64:', float_arr.dtype)
print()

# Creating an array of dtype int32
int_arr = np.array([1, 2, 3], dtype=np.int32)
print('Array with dtype int32:', int_arr)
print('dtype of array with dtype int32:', int_arr.dtype)

Array with dtype float64: [1. 2. 3.]
dtype of array with dtype float64: float64

Array with dtype int32: [1 2 3]
dtype of array with dtype int32: int32


<p>Data types are a source of NumPy’s flexibility for interacting with data coming from
other systems. In most cases they provide a mapping directly onto an underlying
disk or memory representation, which makes it possible to read and write binary
streams of data to disk and to connect to code written in a low-level language like
C or FORTRAN. The numerical data types are named the same way: a type name,
like float or int, followed by a number indicating the number of bits per element.
A standard double-precision floating-point value (what’s used under the hood in
Python’s float object) takes up 8 bytes or 64 bits. Thus, this type is known in NumPy
as float64.</p>
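
<p>The naming scheme can be checked against the <code>itemsize</code> attribute of a dtype, which reports bytes per element. The loop below is just an illustrative sketch over a handful of common types.</p>

```python
import numpy as np

for name in ['int8', 'int32', 'int64', 'float32', 'float64']:
    dt = np.dtype(name)
    # itemsize is in bytes, so multiply by 8 to get the bit count in the name
    print(f'{dt.name:8s} {dt.itemsize} bytes = {dt.itemsize * 8} bits')

# An array built from Python floats gets the float64 dtype
py_float_dtype = np.array([1.5]).dtype
print(py_float_dtype)
```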

In [28]:
# You can explicitly convert or cast an array from one data type to another using ndarray’s astype method
float_arr2 = int_arr.astype(np.float64)
print('Converted array from int32 to float64:', float_arr2)
print('dtype of converted array:', float_arr2.dtype)
print()

# Casting float to int will truncate the decimal part
int_arr2 = float_arr.astype(np.int32)
print('Converted array from float64 to int32:', int_arr2)
print('dtype of converted array:', int_arr2.dtype)
print()

# Converting strings to numbers
str_arr = np.array(['1.0', '2.5', '3.6'], dtype=np.str_)
print('String array:', str_arr)
num_arr = str_arr.astype(np.float64)
print('Converted string array to float64:', num_arr)
print('dtype of converted array:', num_arr.dtype)
print()

# Using another array's dtype
another_arr = np.array([4, 5, 6])
converted_arr = another_arr.astype(float_arr.dtype)
print('Converted array using another array\'s dtype:', converted_arr)
print('dtype of converted array:', converted_arr.dtype)
print()

Converted array from int32 to float64: [1. 2. 3.]
dtype of converted array: float64

Converted array from float64 to int32: [1 2 3]
dtype of converted array: int32

String array: ['1.0' '2.5' '3.6']
Converted string array to float64: [1.  2.5 3.6]
dtype of converted array: float64

Converted array using another array's dtype: [4. 5. 6.]
dtype of converted array: float64



<p>Calling astype always creates a new array (a copy of the data), even
if the new data type is the same as the old data type.</p>
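
<p>The sketch below uses values with fractional parts to make the truncation visible, and checks with <code>np.shares_memory</code> that <code>astype</code> returned a copy. The sample values are arbitrary.</p>

```python
import numpy as np

float_arr = np.array([3.7, -1.2, 0.5, 10.9])

# Casting to an integer type drops the fractional part (truncation toward zero)
int_arr = float_arr.astype(np.int32)
print(int_arr)

# Same dtype in, same dtype out, yet the result is still a separate copy
same_dtype = float_arr.astype(np.float64)
same_dtype[0] = 0.0
print(float_arr[0])
print(np.shares_memory(float_arr, same_dtype))
```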

<h3>Arithmetic with NumPy Arrays</h3>
<p>Arrays are important because they enable you to express batch operations on data
without writing any for loops. NumPy users call this vectorization.</p>

In [29]:
# Any arithmetic operations between equal-size arrays apply the operation element-wise
arr = np.array([[1, 2, 3], [4, 5, 6]])
print('Array for arithmetic operations: \n', arr)
print()
# array multiplication
print('Array multiplication: \n', arr * arr)

Array for arithmetic operations: 
 [[1 2 3]
 [4 5 6]]

Array multiplication: 
 [[ 1  4  9]
 [16 25 36]]
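
<p>To make the idea of vectorization concrete, the following sketch computes the same element-wise product twice, once with explicit Python loops and once with a single array expression. The names <code>looped</code> and <code>vectorized</code> are illustrative.</p>

```python
import numpy as np

arr = np.array([[1, 2, 3], [4, 5, 6]])

# Explicit loops: visit each element and square it
looped = np.empty_like(arr)
for i in range(arr.shape[0]):
    for j in range(arr.shape[1]):
        looped[i, j] = arr[i, j] * arr[i, j]

# Vectorized: one expression, the loop runs inside NumPy
vectorized = arr * arr

print(np.array_equal(looped, vectorized))
```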


<p>If you perform an arithmetic operation (like +, -, *, /) between a NumPy array and a scalar (a single number), NumPy will apply that operation to every element of the array.</p>

In [30]:
# Scalar division
print('Scalar division: \n', arr / 2)
print()

# Scalar multiplication
print('Scalar multiplication: \n', arr * 2)
print()

# Comparison of array elements yields Boolean arrays
print('Array elements greater than 3: \n', arr > 3)
print()


Scalar division: 
 [[0.5 1.  1.5]
 [2.  2.5 3. ]]

Scalar multiplication: 
 [[ 2  4  6]
 [ 8 10 12]]

Array elements greater than 3: 
 [[False False False]
 [ True  True  True]]



<h3>Indexing and Slicing</h3>
<p>Indexing or slicing can be used to select
a subset of your data or individual elements.</p>

In [31]:
# A one-dimensional array acts like a list
one_d_arr = np.array([1, 2, 3, 4, 5])
print('One dimensional array:', one_d_arr)
print('First element:', one_d_arr[0])
print('Last element:', one_d_arr[-1])
print('Slicing first three elements:', one_d_arr[:3])
print('Slicing last two elements:', one_d_arr[-2:])
print()

# Assigning scalar value to a slice
one_d_arr[:3] = 10
print('After assigning 10 to first three elements:', one_d_arr)
print()

One dimensional array: [1 2 3 4 5]
First element: 1
Last element: 5
Slicing first three elements: [1 2 3]
Slicing last two elements: [4 5]

After assigning 10 to first three elements: [10 10 10  4  5]



In [32]:
# A slice is a view: it shares its data with the original array
slice_arr = one_d_arr[1:4]
print('Slice of one dimensional array:', slice_arr)
print('Original array after slicing:', one_d_arr)
print()

# Assign slice to a different array
slice_arr = one_d_arr[1:4].copy()  # Use copy to avoid modifying the original array
print('Slice of one dimensional array (copy):', slice_arr)
# Change value of slice_arr
slice_arr[0] = 20
print('Modified slice of one dimensional array:', slice_arr)
print('Original array after modifying slice:', one_d_arr)
print()

# Bare slicing on multi-dimensional arrays
multi_d_arr = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
print('Multi-dimensional array:\n', multi_d_arr)
multi_d_arr[:] = 7
print('Original multi-dimensional array after bare slicing: \n', multi_d_arr)
print()

Slice of one dimensional array: [10 10  4]
Original array after slicing: [10 10 10  4  5]

Slice of one dimensional array (copy): [10 10  4]
Modified slice of one dimensional array: [20 10  4]
Original array after modifying slice: [10 10 10  4  5]

Multi-dimensional array:
 [[1 2 3]
 [4 5 6]
 [7 8 9]]
Original multi-dimensional array after bare slicing: 
 [[7 7 7]
 [7 7 7]
 [7 7 7]]



<p>Slicing a Python list always creates a new list object (a copy of the references, not the actual objects). In contrast, slicing a NumPy array returns a view, so changes made through the slice are reflected in the source array. Because NumPy has been
designed to work with very large arrays, you could imagine performance
and memory problems if NumPy insisted on always copying data.</p>
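
<p>The difference can be seen side by side in a short sketch. The same slice is taken from a list and from an array, and the first element of each slice is overwritten.</p>

```python
import numpy as np

# Python list: the slice is a new list
py_list = [1, 2, 3, 4, 5]
list_slice = py_list[1:4]
list_slice[0] = 99
print(py_list)   # the original list is unchanged

# NumPy array: the slice is a view on the same data
arr = np.array([1, 2, 3, 4, 5])
arr_slice = arr[1:4]
arr_slice[0] = 99
print(arr)       # the original array now contains 99
```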

In [33]:
# Create a 2D array
two_d_arr = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
# Slicing 2D array using a scalar
print('2D Array for slicing:')
print(two_d_arr)
print()
# Slicing first row
first_row = two_d_arr[0]
print('First row of 2D array:', first_row)

2D Array for slicing:
[[1 2 3]
 [4 5 6]
 [7 8 9]]

First row of 2D array: [1 2 3]


<p>Thus, individual elements can be accessed recursively. But that is a bit too much
work, so you can pass a comma-separated list of indices to select individual elements.</p>

In [34]:
# Accessing an individual element of a two-dimensional array
print('Indexing the element at row 2, column 0:')
element = two_d_arr[2][0]  # equivalent to two_d_arr[2, 0]
print('Selected element:', element)


Indexing the element at row 2, column 0:
Selected element: 7
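
<p>The recursive form and the comma-separated form select the same element, as this short sketch shows. The names <code>recursive</code> and <code>direct</code> are illustrative.</p>

```python
import numpy as np

two_d_arr = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])

# Recursive access: first select row 2, then element 0 of that row
recursive = two_d_arr[2][0]

# Comma-separated indices: one indexing operation, same element
direct = two_d_arr[2, 0]

print(recursive, direct)
```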


<p>For indexing on a two-dimensional array it is
helpful to think of axis 0 as the “rows” of the array and axis 1 as the “columns.”</p>

<p>In multidimensional arrays, if you omit later indices, the returned object will be a
lower dimensional ndarray consisting of all the data along the higher dimensions.</p>

In [35]:
# Indexing multi-dimensional arrays
arr3d = np.array([[[1, 2, 3], [4, 5, 6]], [[7, 8, 9], [10, 11, 12]]])
print('3D Array for indexing:')
print(arr3d)
print()
# Indexing 
print('indexing :')
print(arr3d[0])
print()

# Both scalar values and arrays can be assigned to arr3d[0]
old_value = arr3d[0].copy()
arr3d[0] = 100
print('3D Array after assigning scalar value 100 to first element:')
print(arr3d)
print()
# Assign back the old value
arr3d[0] = old_value
print('3D Array after assigning back the old value:')
print(arr3d)
print()

# Similarly, arr3d[1, 0] gives you all of the values whose indices start with (1, 0), forming a one-dimensional array
print('Indexing first row of second element:')
print(arr3d[1, 0])
print()

# We can also index in two steps: first select the second 2 x 3 block
x = arr3d[1]
# Then index its first row
print('Indexing second element of 3D array:')
print(x[0])  # This returns the same output as arr3d[1, 0] above
print()

3D Array for indexing:
[[[ 1  2  3]
  [ 4  5  6]]

 [[ 7  8  9]
  [10 11 12]]]

indexing :
[[1 2 3]
 [4 5 6]]

3D Array after assigning scalar value 100 to first element:
[[[100 100 100]
  [100 100 100]]

 [[  7   8   9]
  [ 10  11  12]]]

3D Array after assigning back the old value:
[[[ 1  2  3]
  [ 4  5  6]]

 [[ 7  8  9]
  [10 11 12]]]

Indexing first row of second element:
[7 8 9]

Indexing second element of 3D array:
[7 8 9]



<p>Note that in all of these cases where subsections of the array have been selected, the
returned arrays are views.
<br>
When you slice or index a NumPy array, by default NumPy does not copy the data.
Instead, it creates a view of the same underlying data in memory.
That means:
<ul>
<li>The new array object is just a "window" into the same data buffer.</li>

<li>Changes you make in the view will also affect the original array.</li>
</ul>

This multidimensional indexing syntax for NumPy arrays will not
work with regular Python objects, such as lists of lists.
</p>
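
<p>Both points can be checked with a small sketch. <code>np.shares_memory</code> reports whether two arrays use the same buffer, and the comma-separated syntax raises a <code>TypeError</code> on a list of lists. The flag <code>list_indexing_failed</code> exists only for this illustration.</p>

```python
import numpy as np

arr3d = np.array([[[1, 2, 3], [4, 5, 6]], [[7, 8, 9], [10, 11, 12]]])

# Indexing with fewer indices than dimensions returns a view
row_view = arr3d[1, 0]
print(np.shares_memory(arr3d, row_view))

# Writing through the view changes the original array
row_view[0] = -1
print(arr3d[1, 0, 0])

# The same comma-separated syntax on a list of lists raises TypeError
nested = [[1, 2, 3], [4, 5, 6]]
try:
    nested[1, 0]
    list_indexing_failed = False
except TypeError as exc:
    list_indexing_failed = True
    print('TypeError:', exc)
```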

<h3>Indexing with slices</h3>

In [36]:
# ndarrays can be sliced similar to slicing of lists
a = np.array([1, 2, 3, 4, 5])
print('Original array:', a)
print()

# Slicing the elements at index 2 and 3
sliced_a = a[2:4]
print('Sliced array (elements at index 2 and 3):', sliced_a)
print()

# Consider a two-dimensional array
two_d_arr = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
print('2D Array for slicing:')
print(two_d_arr)
print()
# Slicing the last two rows and first two columns
sliced_2d_arr = two_d_arr[1:, :2]
print('Sliced two-dimensional array :\n', sliced_2d_arr)
print()

# Assigning to a slice expression assigns to the whole selection
two_d_arr[:2, 1:] = 100
print('After assigning 100 to the first two rows and last two columns:')
print(two_d_arr)



Original array: [1 2 3 4 5]

Sliced array (elements at index 2 and 3): [3 4]

2D Array for slicing:
[[1 2 3]
 [4 5 6]
 [7 8 9]]

Sliced two-dimensional array :
 [[4 5]
 [7 8]]

After assigning 100 to the first two rows and last two columns:
[[  1 100 100]
 [  4 100 100]
 [  7   8   9]]


<h3>Boolean Indexing</h3>

In [37]:
# Consider array of strings with duplicates
str_arr = np.array(['apple', 'banana', 'apple', 'orange', 'banana', 'cherry', 'kiwi'])

# Consider another array that is two dimensional
data = np.array([[4, 7], [0, 2], [-5, 6], [0, 0], [1, 2], [-12, -4], [3, 4]])

# Suppose each name corresponds to a row in the data array and we want to select all the rows with the corresponding name "apple".
# Comparing with the string "apple" yields a Boolean array
print('Boolean array corresponding to string "apple" :',str_arr == 'apple')
# The boolean array can be used to index
print('Array indexed using boolean array :\n',data[str_arr == 'apple'])

# Selecting column 1 of the rows picked by the Boolean array
print('Array indexed using boolean array :\n',data[str_arr == 'apple' , 1])


Boolean array corresponding to string "apple" : [ True False  True False False False False]
Array indexed using boolean array :
 [[ 4  7]
 [-5  6]]
Array indexed using boolean array :
 [7 6]


<p><b>Note :</b> The Boolean array must be of the same length as the array axis it’s indexing. You can
even mix and match Boolean arrays with slices or integers.</p>
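
<p>A brief sketch of both statements in the note, using a small made-up array. The exact error text may differ between NumPy versions, and the flag <code>mismatch_raised</code> exists only for this illustration.</p>

```python
import numpy as np

data = np.array([[4, 7], [0, 2], [-5, 6], [0, 0]])
mask = np.array([True, False, True, False])  # one entry per row of data

# Boolean mask on axis 0 combined with a slice on axis 1
print(data[mask, :1])

# Boolean mask combined with an integer column index
print(data[mask, 1])

# A mask of the wrong length is rejected
short_mask = np.array([True, False])
try:
    data[short_mask]
    mismatch_raised = False
except IndexError as exc:
    mismatch_raised = True
    print('IndexError:', exc)
```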

In [38]:
# To select everything but "banana" you can either use != or negate the condition using ~:
print('Boolean array where banana is not required :', str_arr != 'banana')
print('Boolean array where banana is not required using "~" operator :', ~(str_arr == 'banana'))
print()

# Indexing with a negated condition (every row except "apple")
print('Array indexed using boolean array from negation :\n',data[str_arr != 'apple' ])
print()

# To select rows matching either of two names, combine Boolean conditions with operators like & (and) and | (or)
mask = (str_arr=='orange') | (str_arr=='kiwi')
print('Indexed using boolean array from arithmetic operator : \n', data[mask])

Boolean array where banana is not required : [ True False  True  True False  True  True]
Boolean array where banana is not required using "~" operator : [ True False  True  True False  True  True]

Array indexed using boolean array from negation :
 [[  0   2]
 [  0   0]
 [  1   2]
 [-12  -4]
 [  3   4]]

Indexed using boolean array from arithmetic operator : 
 [[0 0]
 [3 4]]


<p><b>Note :</b> Selecting data from an array by Boolean indexing and assigning the result to a new
variable always creates a copy of the data, even if the returned array is unchanged.</p>

<p>The Python keywords "and" and "or" do not work with Boolean arrays.
Use & (and) and | (or) instead.</p>
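
<p>The sketch below shows the working operator form next to the failing keyword form. The array <code>values</code> and the flag <code>keyword_failed</code> are illustrative names.</p>

```python
import numpy as np

values = np.array([1, 2, 3, 4, 5])

# Element-wise combination works with & and |; note the parentheses
in_range = (values > 1) & (values < 5)
print(values[in_range])

# The keyword form asks for a single truth value of a whole array, which is ambiguous
try:
    (values > 1) and (values < 5)
    keyword_failed = False
except ValueError as exc:
    keyword_failed = True
    print('ValueError:', exc)
```

<p>The parentheses matter because <code>&</code> and <code>|</code> bind more tightly than comparison operators in Python.</p>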

In [39]:
# Select all values in data that are less than 0 and assign value 0
print('Original array :\n', data)
print('Array after assigning 0 to all values less than 0 :')
data[data < 0] = 0
print(data)

Original array :
 [[  4   7]
 [  0   2]
 [ -5   6]
 [  0   0]
 [  1   2]
 [-12  -4]
 [  3   4]]
Array after assigning 0 to all values less than 0 :
[[4 7]
 [0 2]
 [0 6]
 [0 0]
 [1 2]
 [0 0]
 [3 4]]


<h3>Fancy Indexing</h3>
<p>Fancy indexing means indexing with integer arrays or lists of integers, which lets you pick
specific elements or rows by their indices, in any order. Fancy indexing, unlike slicing, always copies the data into a new
array when assigning the result to a new variable. The same holds when a Boolean mask is used as the index.</p>

<p>Why does fancy indexing return a copy instead of a view?</p>
<ul>
<li>The elements you pick may not be contiguous in memory.</li>
<li>NumPy’s data buffer is stored as a contiguous block, but fancy indexing can grab elements from scattered positions, in any order and with repeats.</li>
<li>A view can only describe regularly spaced elements of that buffer, so NumPy has to create a new array and copy the data.</li>
</ul>
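
<p>A small sketch of the copy behavior described above, comparing a slice with the equivalent fancy index. <code>np.shares_memory</code> is used to tell a view from a copy, and the variable names are illustrative.</p>

```python
import numpy as np

arr = np.arange(10)

sliced = arr[2:5]        # basic slice: a view
fancy = arr[[2, 3, 4]]   # fancy indexing with a list: a copy

print(np.shares_memory(arr, sliced))
print(np.shares_memory(arr, fancy))

# Modifying the fancy-indexed result leaves the original untouched
fancy[0] = -1
print(arr[2])
```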

In [45]:
# Fancy Indexing
arr = np.zeros((8, 4))
for i in range(8):
    arr[i] = i
print('Original array :\n', arr)
print()
print('Array subset by passing a list: \n', arr[[4,3,0,6]])
print()


# Negative indexing
print('Negative indexing of arr :\n',arr[[-3,-1,-4]])
print()





Original array :
 [[0. 0. 0. 0.]
 [1. 1. 1. 1.]
 [2. 2. 2. 2.]
 [3. 3. 3. 3.]
 [4. 4. 4. 4.]
 [5. 5. 5. 5.]
 [6. 6. 6. 6.]
 [7. 7. 7. 7.]]

Array subset by passing a list: 
 [[4. 4. 4. 4.]
 [3. 3. 3. 3.]
 [0. 0. 0. 0.]
 [6. 6. 6. 6.]]

Negative indexing of arr :
 [[5. 5. 5. 5.]
 [7. 7. 7. 7.]
 [4. 4. 4. 4.]]



<p>Passing multiple index arrays does something slightly different; it selects a one-dimensional
array of elements corresponding to each tuple of indices.</p>

In [None]:
# Passing multiple index arrays
arr2 = np.arange(32).reshape((8,4))
print('Original array for multiple index arrays :\n', arr2)
print()
print('Array subset by passing multiple index arrays: \n', arr2[[1,5,7,2], [0,3,1,2]])
print()
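
<p>To see which elements the two index lists select, the sketch below pairs them up explicitly: the positions visited are (1, 0), (5, 3), (7, 1) and (2, 2). The names <code>rows</code>, <code>cols</code>, <code>picked</code> and <code>paired</code> are introduced only for this illustration.</p>

```python
import numpy as np

arr2 = np.arange(32).reshape((8, 4))
rows = [1, 5, 7, 2]
cols = [0, 3, 1, 2]

# Fancy indexing with two index lists pairs them up element by element
picked = arr2[rows, cols]

# The same selection written out one (row, column) pair at a time
paired = np.array([arr2[r, c] for r, c in zip(rows, cols)])

print(picked)
print(np.array_equal(picked, paired))
```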

<h3>Transposing Arrays and Swapping Axes</h3>
<p>Transposing is a special form of reshaping that similarly returns a view on the
underlying data without copying anything.</p>

In [47]:
# Arrays have the transpose method and the special T attribute
arr = np.arange(15).reshape((3, 5))
print('Original array for transpose :\n', arr)
print()
print('Transposed array using T attribute :\n', arr.T)

Original array for transpose :
 [[ 0  1  2  3  4]
 [ 5  6  7  8  9]
 [10 11 12 13 14]]

Transposed array using T attribute :
 [[ 0  5 10]
 [ 1  6 11]
 [ 2  7 12]
 [ 3  8 13]
 [ 4  9 14]]
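
<p>That the transpose is a view rather than a copy can be checked with a short sketch: writing through <code>arr.T</code> changes the original array. The name <code>transposed</code> is illustrative.</p>

```python
import numpy as np

arr = np.arange(15).reshape((3, 5))
transposed = arr.T

print(np.shares_memory(arr, transposed))

# Writing through the transposed view changes the original array
transposed[4, 0] = -1   # row 4, column 0 of arr.T is row 0, column 4 of arr
print(arr[0, 4])
```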


In [52]:
# We can compute dot products with the dot function or the @ operator
print('Dot product using numpy\'s dot function :\n', np.dot(arr.T, arr))
print()
print('Dot product using numpy\'s @ operator :\n', arr.T @ arr)
print()

Dot product using numpy's dot function :
 [[125 140 155 170 185]
 [140 158 176 194 212]
 [155 176 197 218 239]
 [170 194 218 242 266]
 [185 212 239 266 293]]

Dot product using numpy's @ operator :
 [[125 140 155 170 185]
 [140 158 176 194 212]
 [155 176 197 218 239]
 [170 194 218 242 266]
 [185 212 239 266 293]]



<p>Simple transposing with .T is a special case of swapping axes. ndarray has the method
swapaxes, which takes a pair of axis numbers and switches the indicated axes to
rearrange the data.</p>

In [55]:
# Array
print('Original array for swapping axes :\n', arr)
print()
print('Swapping axes using the swapaxes function :\n', arr.swapaxes(0,1))

Original array for swapping axes :
 [[ 0  1  2  3  4]
 [ 5  6  7  8  9]
 [10 11 12 13 14]]

Swapping axes using the swapaxes function :
 [[ 0  5 10]
 [ 1  6 11]
 [ 2  7 12]
 [ 3  8 13]
 [ 4  9 14]]
