Credits: https://github.com/bansalkanav/Machine_Learning_and_Deep_Learning

# **High Speed Numerical Array Computation using NumPy Module**

NumPy (**Num**erical **Py**thon) is an open source Python library that’s widely used in science and engineering. The NumPy library contains multidimensional array data structures, such as the homogeneous, N-dimensional ndarray, and a large library of functions that operate efficiently on these data structures. 

## **Installing NumPy**
```python
! pip install numpy
```

In [1]:
# # Install a pip package in the current Jupyter kernel
# ! pip install numpy

## **How to import NumPy & Creating NumPy Array**
```python
import numpy as np

a = np.array([1, 2, 3, 4, 5, 6])
```

In [2]:
# Creating a simple array in numpy
import numpy as np

arr = np.array([1, 2, 3, 4])

print(type(arr))

print("Numpy Array: ", arr)

<class 'numpy.ndarray'>
Numpy Array:  [1 2 3 4]


## **Why NumPy when we already have lists?**
Python lists are excellent, general-purpose containers. They can be “heterogeneous”, meaning that they can contain elements of a variety of types, and they are quite fast when used to perform individual operations on a handful of elements.

Depending on the characteristics of the data and the types of operations that need to be performed, other containers may be more appropriate; by exploiting these characteristics, we can improve speed, reduce memory consumption, and offer a high-level syntax for performing a variety of common processing tasks. NumPy shines when there are large quantities of “homogeneous” (same-type) data to be processed on the CPU.

Other reasons:
- It is the foundational numerical processing library in Python, built around array-oriented computing.
- It extends Python with a multidimensional array type.
- Its operations run in compiled code, which makes them fast and memory-efficient.
- It is widely used for scientific computation.

## **NumPy vs Lists in action**

In [4]:
%%time

lst = list(range(1000000))

for i in range(1000000):
    lst[i] *= lst[i]

CPU times: user 101 ms, sys: 10.7 ms, total: 112 ms
Wall time: 111 ms


In [5]:
%%time

arr = np.arange(1000000)

arr = arr * arr

CPU times: user 4.22 ms, sys: 4.48 ms, total: 8.7 ms
Wall time: 5.2 ms
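
The `%%time` magic used above only works inside IPython/Jupyter. As a rough, portable sketch, the same comparison can be run with the standard-library `timeit` module. The function names `square_with_list` and `square_with_numpy` are made up for this illustration, the array is smaller than in the cells above, and the timings will differ from machine to machine.

```python
import timeit

import numpy as np

N = 100_000  # smaller than the cells above so the sketch runs quickly


def square_with_list(n):
    # Square every element with an explicit Python loop
    lst = list(range(n))
    for i in range(n):
        lst[i] *= lst[i]
    return lst


def square_with_numpy(n):
    # Square every element with a single vectorized operation
    arr = np.arange(n, dtype=np.int64)
    return arr * arr


list_time = timeit.timeit(lambda: square_with_list(N), number=5)
numpy_time = timeit.timeit(lambda: square_with_numpy(N), number=5)

print(f"List loop: {list_time:.4f} s")
print(f"NumPy:     {numpy_time:.4f} s")
```

Both functions produce the same values; only the time taken should differ.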


## **What is an "array"?**

In computer programming, an array is a structure for storing and retrieving data. We often talk about an array as if it were a grid in space, with each cell storing one element of the data. For instance, if each element of the data were a number, we might visualize a "one-dimensional" array like a list:<br />
<img width="400" height="400" src="images/one_d_array.png"><br />


A two-dimensional array would be like a table:<br />
<img width="400" height="400" src="images/two_d_array.png"><br />


A three-dimensional array would be like a set of tables, perhaps stacked as though they were printed on separate pages. In NumPy, this idea is generalized to an arbitrary number of dimensions, and so the fundamental array class is called ndarray: it represents an "N-dimensional array".

## **NumPy Array Restrictions**
Most NumPy arrays have some restrictions. For instance:
- All elements of the array must be of the same type of data.
- Once created, the total size of the array can't change.
- The shape must be "rectangular", not "jagged"; e.g., each row of a two-dimensional array must have the same number of columns.

When these conditions are met, NumPy exploits these characteristics to make the array faster, more memory efficient, and more convenient to use than less restrictive data structures.
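
A minimal sketch of the "same type" restriction in practice (the variable names `mixed` and `ints` are only for illustration): NumPy converts mixed input to one common type, and values assigned later are cast to fit the array's existing type.

```python
import numpy as np

# Mixed int/float input is upcast to one common type
mixed = np.array([1, 2.5, 3])
print(mixed.dtype)   # float64

# Assigning a float into an integer array truncates it to fit the dtype
ints = np.array([1, 2, 3])
ints[0] = 9.7
print(ints)          # [9 2 3]
```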

## **NumPy Array Axis**
In NumPy, a dimension of an array is sometimes referred to as an "axis". This terminology may be useful to disambiguate between the dimensionality of an array and the dimensionality of the data represented by the array. 

For instance, a two-dimensional array **arr** with shape (3, 4) could represent three points, each lying within a four-dimensional space, but **arr** itself has only two "axes".
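
As a small illustration of that distinction (the array name `points` and its values are made up for this sketch):

```python
import numpy as np

# Three points (rows), each with four coordinates (columns)
points = np.array([[1, 2, 3, 4],
                   [5, 6, 7, 8],
                   [9, 10, 11, 12]])

print(points.ndim)    # 2 -> the array has two axes
print(points.shape)   # (3, 4) -> 3 points along axis 0, 4 coordinates along axis 1
```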

## **NumPy Array's Attributes/Properties**

The important attributes of an ndarray object are:

**ndarray.ndim**  
the number of axes (dimensions) of the array.

**ndarray.shape**  
the dimensions of the array. This is a tuple of integers indicating the size of the array in each dimension. For a matrix with n rows and m columns, shape will be (n,m). The length of the shape tuple is therefore the number of axes, ndim.

**ndarray.size**  
the total number of elements of the array. This is equal to the product of the elements of shape.

**ndarray.dtype**  
an object describing the type of the elements in the array. One can create or specify dtypes using standard Python types. Additionally, NumPy provides types of its own; numpy.int32, numpy.int16, and numpy.float64 are some examples.

**ndarray.itemsize**  
the size in bytes of each element of the array. For example, an array of elements of type float64 has itemsize 8 (=64/8), while one of type float32 has itemsize 4 (=32/8). It is equivalent to ndarray.dtype.itemsize.

**ndarray.data**  
the buffer containing the actual elements of the array. Normally, we won’t need to use this attribute because we will access the elements in an array using indexing facilities.

In [27]:
import numpy as np

arr = np.array([1, 2, 3, 4])

print("Array: \n", arr)

# Print shape
print("Shape: ", arr.shape)

# Print Size
print("Size: ", arr.size)

# Print datatype
print("Data Type: ", arr.dtype)

# Print item size in byte of each element
print("Item Size: ", arr.itemsize)

# Print the dimensionality of the numpy array
print("Dimensionality: ", arr.ndim)

# Print the databuffer of the numpy array
print("Data: ", arr.data)

Array: 
 [1 2 3 4]
Shape:  (4,)
Size:  4
Data Type:  int64
Item Size:  8
Dimensionality:  1
Data:  <memory at 0x10808b4c0>


In [29]:
arr = np.array([[1, 2, 3], [4, 5, 6]])

print("Array: \n", arr)

# Print shape
print("Shape: ", arr.shape)

# Print Size
print("Size: ", arr.size)

# Print datatype
print("Data Type: ", arr.dtype)

# Print item size in byte of each element
print("Item Size: ", arr.itemsize)

# Print the dimensionality of the numpy array
print("Dimensionality: ", arr.ndim)

Array: 
 [[1 2 3]
 [4 5 6]]
Shape:  (2, 3)
Size:  6
Data Type:  int64
Item Size:  8
Dimensionality:  2


In [30]:
# Let's put all the attributes in a function

def array_attributes(arr):
    # Print shape
    print("Shape: ", arr.shape)

    # Print Size
    print("Size: ", arr.size)

    # Print datatype
    print("Data Type: ", arr.dtype)
    
    # Print item size in byte of each element
    print("Item Size: ", arr.itemsize)
    
    # Print the dimensionality of the numpy array
    print("Dimensionality: ", arr.ndim)


array_attributes(arr)

Shape:  (2, 3)
Size:  6
Data Type:  int64
Item Size:  8
Dimensionality:  2


## **Datatypes in Numpy**

Every NumPy array has a data type (dtype) that defines the kind of values it holds. Unlike Python's built-in `int`, which can grow to any size, NumPy uses fixed-size data types for performance.

```python
# Specifying a Data Type
arr = np.array([1, 2, 3], dtype=np.int16)  
print(arr.dtype)

# Output: int16
```
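
One consequence of fixed-size types is that integer arithmetic can wrap around silently when a result does not fit. A small sketch, using `int8` (range -128 to 127) purely for illustration:

```python
import numpy as np

small = np.array([127], dtype=np.int8)   # 127 is the largest int8 value
one = np.array([1], dtype=np.int8)

wrapped = small + one                     # result stays int8 and wraps around
print(wrapped, wrapped.dtype)             # [-128] int8

# Converting to a wider type first avoids the overflow
widened = small.astype(np.int16) + one
print(widened, widened.dtype)             # [128] int16
```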

Below is a list of the main data type categories in NumPy:
| **Category**     | **Description**                         | **Example Data Types** |
|------------------|---------------------------------------|----------------------|
| **Integer**      | Whole numbers                        | `int8, int16, int32, int64` |
| **Unsigned Int** | Non-negative whole numbers          | `uint8, uint16, uint32, uint64` |
| **Floating-Point** | Decimal numbers                     | `float16, float32, float64, float128` (`float128` is platform-dependent) |
| **Complex Numbers** | Numbers with real + imaginary parts | `complex64, complex128` |
| **Boolean**      | True/False                           | `bool` |
| **String**       | Fixed-size text                      | `S` (Byte String) or `U` (Unicode String) |
| **Object**       | Python objects (slowest)             | `object` |
| **Datetime**     | Date & time values                   | `datetime64` |
| **Timedelta**    | Differences between datetime64 values | `timedelta64` |

---


In [96]:
import numpy as np

arr = np.array([2, 1, 4, 5], dtype='float16')

print("Array: \n", arr)

# Print datatype
print("Data Type: ", arr.dtype)

Array: 
 [2. 1. 4. 5.]
Data Type:  float16


In [97]:
import numpy as np

arr = np.array([2, 1, 4, 5], dtype='object')

print("Array: \n", arr)

# Print datatype
print("Data Type: ", arr.dtype)

Array: 
 [2 1 4 5]
Data Type:  object


In [102]:
import numpy as np

# S10 means max 10 Bytes
arr = np.array(["Hi", "Hello"], dtype='S10')

print("Array: \n", arr)

# Print datatype
print("Data Type: ", arr.dtype)

Array: 
 [b'Hi' b'Hello']
Data Type:  |S10


In [107]:
import numpy as np

# U5 means max 5 characters
arr = np.array(["Hi", "Hello", "Hello World"], dtype='U5')

print("Array: \n", arr)

# Print datatype
print("Data Type: ", arr.dtype)
# Observe how "Hello World" is trimmed to 5 chars

Array: 
 ['Hi' 'Hello' 'Hello']
Data Type:  <U5


## **Functions for creating NumPy Array**

### **Creating NumPy Array with arange()**

In [31]:
arr = np.arange(10)

print("Array: \n", arr)

# Print array attributes
array_attributes(arr)

Array: 
 [0 1 2 3 4 5 6 7 8 9]
Shape:  (10,)
Size:  10
Data Type:  int64
Item Size:  8
Dimensionality:  1


In [32]:
arr = np.arange(1, 10)

print("Array: \n", arr)

# Print array attributes
array_attributes(arr)

Array: 
 [1 2 3 4 5 6 7 8 9]
Shape:  (9,)
Size:  9
Data Type:  int64
Item Size:  8
Dimensionality:  1


In [33]:
arr = np.arange(1, 10, 2)

print("Array: \n", arr)

# Print array attributes
array_attributes(arr)

Array: 
 [1 3 5 7 9]
Shape:  (5,)
Size:  5
Data Type:  int64
Item Size:  8
Dimensionality:  1


### **Creating NumPy Array with ones()**

In [34]:
arr = np.ones((3, 3))

print("Array: \n", arr)

# Print array attributes
array_attributes(arr)

Array: 
 [[1. 1. 1.]
 [1. 1. 1.]
 [1. 1. 1.]]
Shape:  (3, 3)
Size:  9
Data Type:  float64
Item Size:  8
Dimensionality:  2


### **Creating NumPy array with zeros()**

In [35]:
arr = np.zeros((3, 3))

print("Array: \n", arr)

# Print array attributes
array_attributes(arr)

Array: 
 [[0. 0. 0.]
 [0. 0. 0.]
 [0. 0. 0.]]
Shape:  (3, 3)
Size:  9
Data Type:  float64
Item Size:  8
Dimensionality:  2


### **Creating NumPy array with eye()**

In [36]:
arr = np.eye(3)

print("Array: \n", arr)

# Print array attributes
array_attributes(arr)

Array: 
 [[1. 0. 0.]
 [0. 1. 0.]
 [0. 0. 1.]]
Shape:  (3, 3)
Size:  9
Data Type:  float64
Item Size:  8
Dimensionality:  2


In [37]:
arr = np.eye(3, 2)

print("Array: \n", arr)

# Print array attributes
array_attributes(arr)

Array: 
 [[1. 0.]
 [0. 1.]
 [0. 0.]]
Shape:  (3, 2)
Size:  6
Data Type:  float64
Item Size:  8
Dimensionality:  2


### **Creating NumPy array with linspace()**

You can also use np.linspace() to create an array with values that are spaced linearly in a specified interval.

In [38]:
arr = np.linspace(0, 10, num=5)

print("Array: \n", arr)

# Print array attributes
array_attributes(arr)

Array: 
 [ 0.   2.5  5.   7.5 10. ]
Shape:  (5,)
Size:  5
Data Type:  float64
Item Size:  8
Dimensionality:  1


## **Numpy Random Numbers**

1. **np.random.rand** - generates an array of random numbers uniformly distributed over [0, 1).
2. **np.random.randn** - generates an array of random numbers drawn from the standard normal distribution (mean = 0, stdev = 1).
3. **np.random.randint** - generates random integers uniformly distributed from a lower bound (0 by default) up to, but not including, the given upper bound.
4. **np.random.uniform** - generates random floats uniformly distributed between the given lower bound (inclusive) and upper bound (exclusive).
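
NumPy's documentation recommends the newer `Generator` interface, created with `np.random.default_rng()`, for new code. The sketch below shows rough equivalents of the four functions above; the seed value 42 and the variable names are arbitrary choices for this illustration.

```python
import numpy as np

rng = np.random.default_rng(seed=42)  # the seed value is arbitrary

uniform_01 = rng.random(5)                      # floats in [0, 1), like rand
normal = rng.standard_normal((5, 4))            # mean 0, stdev 1, like randn
integers = rng.integers(10, 40, size=(5, 10))   # ints in [10, 40), like randint
uniform_ab = rng.uniform(10, 40, size=(5, 3))   # floats in [10, 40)

print(uniform_01.shape, normal.shape, integers.shape, uniform_ab.shape)
```

Passing a seed makes the generated numbers reproducible from run to run, which is handy for tutorials and tests.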

In [20]:
# Randomly generate an array from uniform distribution
import numpy as np

arr = np.random.rand(5)

print("Numpy Array: \n", arr)

Numpy Array: 
 [0.82836615 0.06301418 0.35701122 0.57106337 0.74289751]


In [21]:
# Randomly generate an array with 10 rows and 2 columns
arr = np.random.rand(10, 2)

print("Numpy Array generated randomly from uniform distribution: \n", arr)

Numpy Array generated randomly from uniform distribution: 
 [[0.14780062 0.17471453]
 [0.11369732 0.75176273]
 [0.24432784 0.88829167]
 [0.19609728 0.79319541]
 [0.64327064 0.52992433]
 [0.70977242 0.98405821]
 [0.47159476 0.54128465]
 [0.59540315 0.33811901]
 [0.05437257 0.91444331]
 [0.7751666  0.40268502]]


In [22]:
# Randomly generate an array from normal distribution
import numpy as np

arr = np.random.randn(5)

print("Numpy Array: \n", arr)

Numpy Array: 
 [ 0.49839533 -1.11310584  0.19360978  0.80252796 -0.59723223]


In [23]:
# Randomly generate an array with 5 rows and 4 columns
arr = np.random.randn(5, 4)

print("Numpy Array generated randomly from normal distribution: \n", arr)

Numpy Array generated randomly from normal distribution: 
 [[ 0.3276004   0.65298936 -0.34492366 -0.40719086]
 [-0.84149979  1.24736407 -1.65994873 -0.74336453]
 [ 0.51983285  0.42959454  0.2538706   2.13430918]
 [ 0.08714018 -0.57434726 -0.66789889  0.04031373]
 [-0.39606974  0.03331962  0.27393002 -1.24712954]]


In [24]:
# Generate one random integer between 0 to 9

value = np.random.randint(10)

print(value)

0


In [25]:
# Randomly generate a 5*4 array containing values in the range of 0 to 9
arr = np.random.randint(10, size = (5, 4))

print("Numpy Array: \n", arr)

Numpy Array: 
 [[8 5 5 5]
 [3 1 3 7]
 [0 0 1 1]
 [9 5 3 6]
 [5 3 9 0]]


In [26]:
# Randomly generate a 5*10 array containing values in the range of 10 to 39
arr = np.random.randint(10, 40, size = (5, 10))

print("Numpy Array: \n", arr)

Numpy Array: 
 [[19 34 38 10 18 27 27 37 36 29]
 [26 19 19 15 21 25 30 34 38 30]
 [23 10 25 34 18 39 24 26 25 22]
 [36 10 23 39 16 28 35 10 22 18]
 [37 13 36 34 30 33 32 37 38 11]]


In [27]:
# Generate one random decimal value between 0 and 10
# (a single argument would be taken as the lower bound, with the upper bound left at 1.0)
value = np.random.uniform(0, 10)

print(value)

8.208142773056082


In [28]:
# Randomly generate a 5*4 array containing values in the range of 0 to 10
arr = np.random.uniform(0, 10, size = (5, 4))

print("Numpy Array: \n", arr)

Numpy Array: 
 [[1.08645919 6.99780372 9.3472326  1.78444655]
 [9.44252538 2.34501493 5.61920347 7.43719881]
 [9.52631199 5.14569691 5.4363398  1.02344593]
 [7.80693877 5.89343685 4.63857927 8.68612154]
 [1.90823285 2.31590231 5.2456755  9.03484739]]


In [29]:
# Randomly generate a 5*3 array containing values in the range of 10 to 40
arr = np.random.uniform(10, 40, size = (5, 3))

print("Numpy Array: \n", arr)

Numpy Array: 
 [[22.05211375 19.63663591 25.85448062]
 [13.64502298 18.7925905  12.78952453]
 [26.19022358 19.38307581 20.59128571]
 [32.65337646 33.9344369  23.85014336]
 [30.83100912 14.24930461 15.1472414 ]]


## **NumPy Array - Indexing, Slicing and Updating**

Since the data in a NumPy array is **stored sequentially**, it can be accessed with **indexing** and **slicing** operations. NumPy offers more indexing facilities than regular Python sequences: in addition to indexing by integers and slices, **arrays can be indexed by arrays of integers and arrays of booleans**.

Also remember that **NumPy arrays are mutable**, i.e. the data stored in a NumPy array can be updated.

### **Data Accessing using `Indexing` in Numpy Array**

It is familiar practice in mathematics to refer to elements of a matrix by the row index first and the column index second. This happens to be true for two-dimensional arrays, but a better mental model is to think of the column index as coming last and the row index as second to last. This generalizes to arrays with any number of dimensions.
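
A short sketch of that mental model with a three-dimensional array (the name `cube` and its shape are chosen only for illustration):

```python
import numpy as np

# 2 "pages", each with 3 rows and 4 columns
cube = np.arange(24).reshape(2, 3, 4)

# Index order is (page, row, column): the column index comes last
print(cube[1, 2, 3])    # 23, the last element
print(cube[0, 1, 2])    # 6 -> page 0, row 1, column 2
print(cube[0].shape)    # (3, 4) -> one full page
```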

In [30]:
# Randomly generating 1 dimensional array
arr = np.random.randint(100, size = (5, ))

print("Numpy Array: \n", arr)

print("Shape: ", arr.shape)

Numpy Array: 
 [ 1 98 59 28 12]
Shape:  (5,)


In [31]:
# Accessing the element at index 2 (the third element)
print("Value at 2nd index: ", arr[2])

Value at 2nd index:  59


In [32]:
# Accessing index 5, which is out of bounds for an array of size 5
print("Value at 5th index: ", arr[5])

IndexError: index 5 is out of bounds for axis 0 with size 5

In [33]:
# Randomly generating 2 dimensional array
arr = np.random.randint(100, size = (5, 4))

print("Numpy Array: \n", arr)

Numpy Array: 
 [[43 32  9 42]
 [39 51 78 75]
 [49 59 87 69]
 [11 78 46 44]
 [40 43 20 42]]


In [34]:
# Accessing the row at index 2
print("Value at 2nd index: ", arr[2])

Value at 2nd index:  [49 59 87 69]


In [35]:
# Accessing value at 2, 1 index
print("Accessing Value using Way-1: ", arr[2][1])

print("Accessing Value using Way-2: ", arr[2, 1]) 
# Way-2 syntax can be helpful to access multiple values

Accessing Value using Way-1:  59
Accessing Value using Way-2:  59


In [36]:
# Accessing multiple values: the elements at (2, 1) and (1, 1)
print(arr[[2, 1], [1, 1]])

[59 51]


### **Data Accessing using `Slicing` in Numpy Array**

As with Python lists, slice notation can be used to index NumPy arrays.

One major difference is that slice indexing of a list copies the elements into a new list, but slicing an array returns a view: an object that refers to the data in the original array. The original array can be mutated using the view.

#### **Copy VS View**

**Copy**  
When a new array is created by duplicating the data buffer as well as the metadata, it is called a copy. Changes made to the copy are not reflected in the original array. Making a copy is **slower and uses more memory**, but it is sometimes necessary. A copy can be forced by using **ndarray.copy()**.

**View**  
A view is a new array object that shares the same data as the original array. Any modifications to the view will affect the original array, and vice versa. A view can be forced through the **ndarray.view()** method.

Think of a view in NumPy as a window into the same data. It doesn't create a new copy; it just lets you look at the original array in a different way. This makes views **faster and more memory-efficient** than copies.
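
The difference can be sketched side by side. The variable names here are illustrative, and `np.shares_memory()` is used only as a convenient way to check whether two arrays share memory.

```python
import numpy as np

original = np.array([1, 2, 3, 4, 5])

view = original[1:4]          # slicing returns a view
copy = original[1:4].copy()   # .copy() duplicates the data

view[0] = 100                 # also changes original[1]
copy[1] = -1                  # leaves original untouched

print(original)                            # [  1 100   3   4   5]
print(np.shares_memory(original, view))    # True
print(np.shares_memory(original, copy))    # False
```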

In [43]:
# Randomly generating 1 dimensional array
import numpy as np

arr = np.random.randint(100, size = (10, ))

print("Numpy Array: \n", arr)

print("Shape: ", arr.shape)

Numpy Array: 
 [13 17 35 32 24 23  6 94 61 78]
Shape:  (10,)


In [44]:
arr_slice = arr[1:4]

print("Array Slice Before:", arr_slice)

# Updating the arr_slice element at index 1
arr_slice[1] = 99999

print("Array Slice After:", arr_slice)
print()
print("Original Array:", arr)

Array Slice Before: [17 35 32]
Array Slice After: [   17 99999    32]

Original Array: [   13    17 99999    32    24    23     6    94    61    78]


#### **How to tell if the array is a view or a copy?**

The **base** attribute of an ndarray makes it easy to tell whether an array is a view or a copy. For a view, base returns the original array; for an array that owns its data (such as a copy), it returns None.

In [46]:
print(arr.base) # arr owns its data, so base is None

print(arr_slice.base) # base attribute of a view returns the original array

None
[   13    17 99999    32    24    23     6    94    61    78]


#### **Few more slicing examples**

In [38]:
# print(arr[1:4])

# print(arr[0:-4])

# print(arr[::2])

In [39]:
arr = np.array([[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12]])

print("Numpy Array: \n", arr)

print("Shape: ", arr.shape)

Numpy Array: 
 [[ 1  2  3  4]
 [ 5  6  7  8]
 [ 9 10 11 12]]
Shape:  (3, 4)


In [40]:
# # Remember Way-2 for accessing the data in Numpy array
# print(arr[:2, 1:3])

In [41]:
# print(arr[1:2, :])

### **Data Accessing by Indexing with Boolean Arrays**

In [42]:
# Randomly generating 1 dimensional array
import numpy as np

arr = np.random.randint(100, size = (10, ))

print("Numpy Array: \n", arr)

print("Shape: ", arr.shape)

Numpy Array: 
 [ 1 80 68 22 54 68 61 57 97 25]
Shape:  (10,)


In [43]:
idx = [True, False, False, False, False, True, False, False, True, True]

print(arr[idx])

[ 1 68 97 25]


### **Boolean Masking in NumPy**

Boolean Masking is a technique in NumPy that allows you to filter elements of an array based on a condition. Instead of using loops, you can use boolean arrays (masks) to select specific values.

In [91]:
import numpy as np

arr = np.array([10, 20, 30, 40, 50])
mask = arr > 25  # Create a boolean mask

print(mask)

[False False  True  True  True]


In [92]:
filtered_arr = arr[mask]  # Apply mask

print(filtered_arr)

[30 40 50]


### **Boolean Masking with Multiple Conditions**

- & is used instead of and
- | is used instead of or
- ~ is used for negation

In [93]:
arr = np.array([5, 10, 15, 20, 25, 30])

mask = (arr > 10) & (arr < 25)  # Select numbers between 10 and 25

print(arr[mask])

[15 20]
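
The `|` and `~` operators follow the same pattern. A brief sketch on the same values (the mask names are made up for this example):

```python
import numpy as np

arr = np.array([5, 10, 15, 20, 25, 30])

outside = (arr < 10) | (arr > 25)   # below 10 OR above 25
not_large = ~(arr > 10)             # NOT greater than 10

print(arr[outside])     # [ 5 30]
print(arr[not_large])   # [ 5 10]
```

The parentheses around each comparison are required, because `&`, `|`, and `~` bind more tightly than comparison operators.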


In [95]:
# Real-world use case: removing missing (NaN) values
data = np.array([45, np.nan, 78, 56, np.nan, 89])

# Remove NaN values
clean_data = data[~np.isnan(data)]

print(clean_data)

[45. 78. 56. 89.]


### **Updating Data in Numpy Array**

In [44]:
# Updating value in the array

# Randomly generating 2 dimensional array
arr = np.random.randint(100, size = (5, 4))

print("Original Array: \n", arr)

arr[1, 1] = 99

print("Updated Array: \n", arr)

Original Array: 
 [[94  7 65 88]
 [23 72 58 57]
 [33  9 40 88]
 [48 20 33 52]
 [ 2 90  5 63]]
Updated Array: 
 [[94  7 65 88]
 [23 99 58 57]
 [33  9 40 88]
 [48 20 33 52]
 [ 2 90  5 63]]


## **Changing the NumPy Array Shape**

### **Numpy Flatten and Ravel**

In [45]:
# Randomly generate a 5*10 array containing values in the range of 10 to 39
arr = np.random.randint(10, 40, size = (5, 10))

print("Numpy Array: \n", arr)

print("Shape: ", arr.shape)

Numpy Array: 
 [[16 36 17 28 17 13 19 20 33 38]
 [37 19 15 25 31 16 21 31 15 16]
 [12 27 14 14 37 34 22 27 36 17]
 [16 32 19 19 19 33 14 25 39 19]
 [15 29 35 34 28 23 33 10 12 29]]
Shape:  (5, 10)


In [46]:
flatten_arr = arr.flatten()

print("Flatten Array: \n", flatten_arr)
print("Shape: ", flatten_arr.shape)

Flatten Array: 
 [16 36 17 28 17 13 19 20 33 38 37 19 15 25 31 16 21 31 15 16 12 27 14 14
 37 34 22 27 36 17 16 32 19 19 19 33 14 25 39 19 15 29 35 34 28 23 33 10
 12 29]
Shape:  (50,)


In [47]:
ravel_arr = arr.ravel()

print("Ravel Array: \n", ravel_arr)
print("Shape: ", ravel_arr.shape)

Ravel Array: 
 [16 36 17 28 17 13 19 20 33 38 37 19 15 25 31 16 21 31 15 16 12 27 14 14
 37 34 22 27 36 17 16 32 19 19 19 33 14 25 39 19 15 29 35 34 28 23 33 10
 12 29]
Shape:  (50,)


#### It looks like both perform a flatten operation on the array data. But is there any difference between the two functions?

In [48]:
# Updating value in the array
print("Original Array: \n", arr)

arr[1, 1] = 99

print("Updated Array: \n", arr)

Original Array: 
 [[16 36 17 28 17 13 19 20 33 38]
 [37 19 15 25 31 16 21 31 15 16]
 [12 27 14 14 37 34 22 27 36 17]
 [16 32 19 19 19 33 14 25 39 19]
 [15 29 35 34 28 23 33 10 12 29]]
Updated Array: 
 [[16 36 17 28 17 13 19 20 33 38]
 [37 99 15 25 31 16 21 31 15 16]
 [12 27 14 14 37 34 22 27 36 17]
 [16 32 19 19 19 33 14 25 39 19]
 [15 29 35 34 28 23 33 10 12 29]]


In [49]:
print("Flatten Array: \n", flatten_arr)

print("Ravel Array: \n", ravel_arr)

Flatten Array: 
 [16 36 17 28 17 13 19 20 33 38 37 19 15 25 31 16 21 31 15 16 12 27 14 14
 37 34 22 27 36 17 16 32 19 19 19 33 14 25 39 19 15 29 35 34 28 23 33 10
 12 29]
Ravel Array: 
 [16 36 17 28 17 13 19 20 33 38 37 99 15 25 31 16 21 31 15 16 12 27 14 14
 37 34 22 27 36 17 16 32 19 19 19 33 14 25 39 19 15 29 35 34 28 23 33 10
 12 29]


#### Observation
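flatten() always returns a copy, so the later update to `arr` does not show up in `flatten_arr`. ravel() returns a view of the original array whenever possible (as it does here), so the update is visible in `ravel_arr`. Because it avoids copying the data, ravel() is usually faster and uses less memory than flatten().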
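
Whether ravel() can return a view depends on how the array is laid out in memory. The following sketch, which uses a small made-up array `x`, checks this with `np.shares_memory()`:

```python
import numpy as np

x = np.arange(6).reshape(2, 3)

print(np.shares_memory(x, x.ravel()))     # True: contiguous data, so a view
print(np.shares_memory(x, x.T.ravel()))   # False: the transpose is not C-contiguous, so ravel copies
print(np.shares_memory(x, x.flatten()))   # False: flatten always copies
```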

### **Numpy Reshape**

In [50]:
# Randomly generate a 5*10 array containing values in the range of 10 to 39
arr = np.random.randint(10, 40, size = (5, 10))

print("Numpy Array: \n", arr)

print("Shape: ", arr.shape)

Numpy Array: 
 [[27 28 15 34 17 23 19 21 12 22]
 [22 29 26 31 18 26 20 22 37 29]
 [30 19 32 17 10 31 16 28 16 22]
 [22 27 11 22 39 39 30 11 31 31]
 [35 18 38 24 24 10 19 10 15 21]]
Shape:  (5, 10)


In [51]:
arr_reshaped = arr.reshape(10, 5)

print(arr_reshaped)

[[27 28 15 34 17]
 [23 19 21 12 22]
 [22 29 26 31 18]
 [26 20 22 37 29]
 [30 19 32 17 10]
 [31 16 28 16 22]
 [22 27 11 22 39]
 [39 30 11 31 31]
 [35 18 38 24 24]
 [10 19 10 15 21]]


In [52]:
arr_reshaped = arr.reshape(25, 2)

print(arr_reshaped)

[[27 28]
 [15 34]
 [17 23]
 [19 21]
 [12 22]
 [22 29]
 [26 31]
 [18 26]
 [20 22]
 [37 29]
 [30 19]
 [32 17]
 [10 31]
 [16 28]
 [16 22]
 [22 27]
 [11 22]
 [39 39]
 [30 11]
 [31 31]
 [35 18]
 [38 24]
 [24 10]
 [19 10]
 [15 21]]


In [53]:
# arr_reshaped = arr.reshape(3, 3)

# print(arr_reshaped)

In [54]:
# arr_reshaped = arr.reshape(5, 50)

# print(arr_reshaped)
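
reshape() only succeeds when the new shape holds exactly the same number of elements as the original array. A minimal sketch, assuming a 5*10 array built with `np.arange` instead of random values, which also shows that one dimension may be given as -1 so that NumPy infers it:

```python
import numpy as np

arr = np.arange(50).reshape(5, 10)

# -1 asks NumPy to work out that dimension: 50 elements / 5 columns = 10 rows
auto = arr.reshape(-1, 5)
print(auto.shape)    # (10, 5)

# A shape with a different total number of elements is rejected
try:
    arr.reshape(3, 3)
except ValueError as err:
    print("reshape failed:", err)
```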

### **Transpose of NumPy Array**

In [72]:
# Transpose of an array

x = np.array([[1, 2, 3], [4, 5, 6]])

print("Original Array: \n", x)

print()

print("Transpose: \n", x.T)

Original Array: 
 [[1 2 3]
 [4 5 6]]

Transpose: 
 [[1 4]
 [2 5]
 [3 6]]


## **Iterating over NumPy Array**

### **Iterating over a 1 Dimensional Array**

In [16]:
# Randomly generate an array containing values in the range of 10 to 39
import numpy as np

arr1d = np.random.randint(10, 40, size = (10, ))

print("Numpy Array: \n", arr1d)

print("Shape: ", arr1d.shape)

Numpy Array: 
 [16 21 15 19 23 11 36 34 32 36]
Shape:  (10,)


In [17]:
# Looping over items in the array
for item in arr1d:
    print(item, end=" ")

16 21 15 19 23 11 36 34 32 36 

### **Iterating over a 2 Dimensional Array**

In [18]:
# Randomly generate a 5*10 array containing values in the range of 10 to 39
arr2d = np.random.randint(10, 40, size = (5, 10))

print("Numpy Array: \n", arr2d)

print("Shape: ", arr2d.shape)

Numpy Array: 
 [[35 12 28 34 13 36 17 14 27 33]
 [31 20 38 29 23 13 22 35 35 28]
 [16 12 16 13 20 19 20 12 19 27]
 [30 35 26 16 31 23 31 18 27 39]
 [35 28 17 12 20 26 27 21 16 36]]
Shape:  (5, 10)


In [19]:
# Looping over rows in the 2d array
for row in arr2d:
    print(row)

[35 12 28 34 13 36 17 14 27 33]
[31 20 38 29 23 13 22 35 35 28]
[16 12 16 13 20 19 20 12 19 27]
[30 35 26 16 31 23 31 18 27 39]
[35 28 17 12 20 26 27 21 16 36]


In [20]:
# Looping over each item in the 2d array
for item in arr2d.ravel():
    print(item, end=" ")

35 12 28 34 13 36 17 14 27 33 31 20 38 29 23 13 22 35 35 28 16 12 16 13 20 19 20 12 19 27 30 35 26 16 31 23 31 18 27 39 35 28 17 12 20 26 27 21 16 36 

### **Iterating using `np.nditer()`**

In [21]:
for item in np.nditer(arr1d):
    print(item, end=' ')

16 21 15 19 23 11 36 34 32 36 

In [22]:
for item in np.nditer(arr2d):
    print(item, end=' ')

35 12 28 34 13 36 17 14 27 33 31 20 38 29 23 13 22 35 35 28 16 12 16 13 20 19 20 12 19 27 30 35 26 16 31 23 31 18 27 39 35 28 17 12 20 26 27 21 16 36 

In [23]:
print("Original Array: ", arr1d)

for item in np.nditer(arr1d):
    if item > 20:
        item[...] = (item * 0)
        
print("Updated Array", arr1d)

Original Array:  [16 21 15 19 23 11 36 34 32 36]


ValueError: assignment destination is read-only

In [24]:
print("Original Array: ", arr1d)

for item in np.nditer(arr1d, op_flags=['readwrite']):
    if item > 20:
        item[...] = (item * 0)
        
print("Updated Array", arr1d)

Original Array:  [16 21 15 19 23 11 36 34 32 36]
Updated Array [16  0 15 19  0 11  0  0  0  0]


**Exercise: Write a program to randomly generate an array of shape 5*4 containing positive integers. Perform an update by replacing all odd numbers with -1. (Using a Loop)**

In [60]:
# Code

**Exercise: Given an array [1, -10, 2, 3, 0, 6], print the array in this order [0, 6, -10, 2, 1, 3]**

In [None]:
# Code

## **Operations on NumPy Array**

### **Basic Array Operations**

This section covers addition, subtraction, multiplication, division, and more.

Arithmetic operators on arrays apply **elementwise**. A new array is created and filled with the result.

In [48]:
data = np.array([1, 2])
ones = np.ones(2, dtype=int)

In [49]:
result = data + ones

print(result)

[2 3]


In [50]:
result = data - ones

print(result)

[0 1]


In [53]:
result = data * data

print(result)

[1 4]


In [54]:
result = data / data

print(result)

[1. 1.]


### **NumPy Mathematical Operations**

Reference: https://numpy.org/doc/stable/reference/routines.math.html

All of these operations, such as sqrt(), exp(), sin(), cos(), add(), and subtract(), are applied **elementwise**.

In [67]:
print("Square Root: ", np.sqrt(4))

print("Exponent: ", np.exp(1))

print("Trigonometric Sin: ", np.sin(0))

print("Trigonometric Cos: ", np.cos(0))

print("... and many more")

Square Root:  2.0
Exponent:  2.718281828459045
Trigonometric Sin:  0.0
Trigonometric Cos:  1.0
... and many more


In [68]:
arr = np.array([1, 2, 3, 4])

print("Square Root: ", np.sqrt(arr))

print("Exponent: ", np.exp(arr))

print("Trigonometric Sin: ", np.sin(arr))

print("Trigonometric Cos: ", np.cos(arr))

Square Root:  [1.         1.41421356 1.73205081 2.        ]
Exponent:  [ 2.71828183  7.3890561  20.08553692 54.59815003]
Trigonometric Sin:  [ 0.84147098  0.90929743  0.14112001 -0.7568025 ]
Trigonometric Cos:  [ 0.54030231 -0.41614684 -0.9899925  -0.65364362]


In [69]:
# ELEMENT WISE OPERATIONS

x = np.array([[1,2], [3,4]])

y = np.array([[5,6], [7,8]])

print("Elementwise Addition: \n", np.add(x, y))

print("Elementwise Subtraction: \n", np.subtract(x, y))

print("Elementwise Multiplication: \n", np.multiply(x, y))

print("Elementwise Division: \n", np.divide(x, y))

Elementwise Addition: 
 [[ 6  8]
 [10 12]]
Elementwise Subtraction: 
 [[-4 -4]
 [-4 -4]]
Elementwise Multiplication: 
 [[ 5 12]
 [21 32]]
Elementwise Division: 
 [[0.2        0.33333333]
 [0.42857143 0.5       ]]


### **NumPy Matrix Multiplication**

In [2]:
# Matrix Multiplication
import numpy as np

x = np.array([[1,2], [3,4]])

y = np.array([[5,6], [7,8]])

print("Matrix Multiplication (Way-1): \n", np.matmul(x, y))

print("Matrix Multiplication (Way-2): \n", np.dot(x, y))

print("Matrix Multiplication (Way-3): \n", x @ y)

Matrix Multiplication (Way-1): 
 [[19 22]
 [43 50]]
Matrix Multiplication (Way-2): 
 [[19 22]
 [43 50]]
Matrix Multiplication (Way-3): 
 [[19 22]
 [43 50]]


### **NumPy Statistics**

In [73]:
x = np.array([[1,2], [3,4]])

print("Array: \n", x)

print("Sum: ", np.sum(x))

print("Columnwise Sum: ", np.sum(x, axis=0)) # Column Wise

print("Rowwise Sum: ", np.sum(x, axis=1)) # Row wise

Array: 
 [[1 2]
 [3 4]]
Sum:  10
Columnwise Sum:  [4 6]
Rowwise Sum:  [3 7]


In [74]:
x = np.array([[1, 2, 3], [4, 5, 6]])

print("Array: \n", x)

print("Minimum: ", np.min(x))

print("Columnwise Minimum: ", np.min(x, axis=0)) # Column Wise

print("Rowwise Minimum: ", np.min(x, axis=1)) # Row wise

Array: 
 [[1 2 3]
 [4 5 6]]
Minimum:  1
Columnwise Minimum:  [1 2 3]
Rowwise Minimum:  [1 4]


In [75]:
x = np.array([160, 180, 146, 162, 184, 180])

print("Array: \n", x)

print("Minimum: ", np.min(x))

print("Maximum: ", np.max(x))

print("Mean: ", np.mean(x))

print("Median: ", np.median(x))

print("Variance: ", np.var(x))

print("Std Dev: ", np.std(x))

Array: 
 [160 180 146 162 184 180]
Minimum:  146
Maximum:  184
Mean:  168.66666666666666
Median:  171.0
Variance:  187.55555555555557
Std Dev:  13.695092389449425
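
By default, np.var() and np.std() divide by N (the population formulas). When the data is a sample, the `ddof=1` argument switches to dividing by N - 1. A small sketch on the same values:

```python
import numpy as np

x = np.array([160, 180, 146, 162, 184, 180])

population_var = np.var(x)           # divides by N
sample_var = np.var(x, ddof=1)       # divides by N - 1
sample_std = np.std(x, ddof=1)

print("Population variance:", population_var)
print("Sample variance:    ", sample_var)
print("Sample std dev:     ", sample_std)
```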


In [76]:
x = np.array([[1, 2, 3], [4, 5, 6]])

print("Array: \n", x)

print("Minimum: ", np.min(x))

print("Maximum: ", np.max(x))

print("Mean: ", np.mean(x))

print("Median: ", np.median(x))

print("Variance: ", np.var(x))

print("Std Dev: ", np.std(x))

Array: 
 [[1 2 3]
 [4 5 6]]
Minimum:  1
Maximum:  6
Mean:  3.5
Median:  3.5
Variance:  2.9166666666666665
Std Dev:  1.707825127659933


In [77]:
heights = np.array([160, 180, 146, 162, 184, 180])

weights = np.array([50, 78, 45, 51, 80, 60])

np.corrcoef(heights, weights)

array([[1.        , 0.88546942],
       [0.88546942, 1.        ]])

In [71]:
# Diagonal elements

x = np.array([[1, 2, 3], [4, 5, 6]])

print("Original Array: \n", x)

print("Diagonal: ", np.diag(x))

Original Array: 
 [[1 2 3]
 [4 5 6]]
Diagonal:  [1 5]


**Explore the following functions:  
np.insert()  
np.append()  
np.delete()**
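
As a starting point for exploring them, here is a minimal sketch on a one-dimensional array. Each of these functions returns a new array and leaves the original unchanged.

```python
import numpy as np

arr = np.array([1, 2, 3])

inserted = np.insert(arr, 1, 99)    # put 99 before index 1
appended = np.append(arr, [7, 8])   # add values at the end
deleted = np.delete(arr, 0)         # drop the element at index 0

print(inserted)   # [ 1 99  2  3]
print(appended)   # [1 2 3 7 8]
print(deleted)    # [2 3]
print(arr)        # [1 2 3] -> the original is unchanged
```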

### **Broadcasting**

There are times when you might want to carry out an operation between an array and a single number (also called an operation between a vector and a scalar) or between arrays of two different sizes. For example, your array (we’ll call it "dist_in_miles") might contain distances in miles that you want to convert to kilometers. You can perform this operation with:

```python
dist_in_miles = np.array([1.0, 2.0])
convert_to_km = dist_in_miles * 1.6
```
<img width="500" height="200" src="images/broadcasting.png"><br />

In [61]:
# Arithmetic and comparison operators can be applied directly to a NumPy array.
# The operation is performed on each element of the array.

x = np.array([[1, 2, 3], [4, 5, 6]])

print("Original Array: \n", x)

Original Array: 
 [[1 2 3]
 [4 5 6]]


In [62]:
print(x + 5)

[[ 6  7  8]
 [ 9 10 11]]


In [63]:
print(x % 2)

[[1 0 1]
 [0 1 0]]


In [64]:
print(x >= 3)

[[False False  True]
 [ True  True  True]]


In [65]:
print(x % 2 == 0)

[[False  True False]
 [ True False  True]]


**Exercise: Write a program to generate a random array of shape (5, 4) containing positive integers. Then replace all odd numbers with -1 (without using a loop).**

In [66]:
# Code

**Exercise: Write a program to filter the values of an array that satisfy either of the following conditions:  
The value is divisible by 5,  
(or) the value is odd and a multiple of 7.**

In [None]:
# Code

#### **NumPy Broadcasting Between Arrays of Different Shape**

* Compare the shapes dimension by dimension, starting from the last one (right to left)
    - Compatible - if the two sizes are equal or if one of them is 1
    - Incompatible - otherwise
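The rule above can be sketched as a small function. `can_broadcast` is a hypothetical helper written for illustration, not a NumPy function. `np.broadcast_shapes` is NumPy's own check and is assumed to be available (it was added in NumPy 1.20).

```python
import numpy as np

def can_broadcast(shape_a, shape_b):
    """Hypothetical helper: apply the right-to-left rule to two shapes."""
    # zip stops at the shorter shape; missing leading dimensions count as 1
    for dim_a, dim_b in zip(reversed(shape_a), reversed(shape_b)):
        if dim_a != dim_b and dim_a != 1 and dim_b != 1:
            return False
    return True

print(can_broadcast((2, 3), (3,)))    # True:  3 matches 3
print(can_broadcast((4, 1), (1, 5)))  # True:  each pair contains a 1
print(can_broadcast((2, 3), (2,)))    # False: 3 vs 2, neither is 1

# NumPy's own check returns the resulting shape, or raises ValueError
print(np.broadcast_shapes((4, 1), (1, 5)))  # (4, 5)
```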

In [1]:
import numpy as np

In [2]:
arr_1 = np.array([[1, 2, 3], [4, 5, 6]])

arr_2 = np.array([1, 2, 3])

print(arr_1 + arr_2)

[[2 4 6]
 [5 7 9]]


In [3]:
# # Run this program

# arr_1 = np.array([[1, 2, 3], [4, 5, 6]])

# arr_2 = np.array([[1], [2]])

# print(arr_1 + arr_2)

In [4]:
# # Run this program

# arr_1 = np.array([[1, 2, 3], [4, 5, 6]])

# arr_2 = np.array([1])

# print(arr_1 + arr_2)

In [5]:
# # Run this program

# arr_1 = np.array([1, 2, 3, 4, 5])

# arr_2 = np.array([1, 2, 3, 4])

# print(arr_1 + arr_2)

In [6]:
# # Run this program

# arr_1 = np.array([[1], [2], [3], [4], [5]])

# arr_2 = np.array([1, 2, 3, 4])

# print(arr_1 + arr_2)

## **Miscellaneous Topics**

### **Reverse an array**

In [83]:
arr = np.array([1, 2, 3, 4, 5, 6, 7, 8])

reversed_arr = np.flip(arr)

print('Reversed Array:\n', reversed_arr)


Reversed Array:
 [8 7 6 5 4 3 2 1]


In [87]:
arr_2d = np.array([[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12]])

reversed_arr = np.flip(arr_2d)

print('Reversed Array:\n', reversed_arr)

Reversed Array:
 [[12 11 10  9]
 [ 8  7  6  5]
 [ 4  3  2  1]]


In [89]:
# Reverse only the order of the rows (axis=0)
arr_2d = np.array([[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12]])

reversed_arr_rows = np.flip(arr_2d, axis=0)

print('Reversed Array:\n', reversed_arr_rows)

Reversed Array:
 [[ 9 10 11 12]
 [ 5  6  7  8]
 [ 1  2  3  4]]


### **Getting Unique Items and Count**

In [66]:
a_2d = np.array([[1, 2, 3, 4], [5, 6, 7, 8], [1, 2, 3, 4], [9, 10, 11, 12]])

print(a_2d)

[[ 1  2  3  4]
 [ 5  6  7  8]
 [ 1  2  3  4]
 [ 9 10 11 12]]


In [79]:
unique_values = np.unique(a_2d)

print("Unique Values:\n", unique_values)

Unique Values:
 [ 1  2  3  4  5  6  7  8  9 10 11 12]


In [80]:
unique_values, indices_list = np.unique(a_2d, return_index=True)

print("Unique Values:\n", unique_values)
print()
print("Indices:", indices_list)

Unique Values:
 [ 1  2  3  4  5  6  7  8  9 10 11 12]

Indices: [ 0  1  2  3  4  5  6  7 12 13 14 15]


**Note:** If the `axis` argument isn’t passed, the 2D array is flattened before the unique values are found, so the returned indices refer to positions in the flattened array.

In [81]:
unique_values, occurrence_count = np.unique(a_2d, return_counts=True)

print("Unique Values:\n", unique_values)
print()
print("Count:", occurrence_count)

Unique Values:
 [ 1  2  3  4  5  6  7  8  9 10 11 12]

Count: [2 2 2 2 1 1 1 1 1 1 1 1]


In [76]:
unique_rows, indices_list, occurrence_count = np.unique(a_2d, 
                                                   axis=0, 
                                                   return_counts=True, 
                                                   return_index=True)

print("Unique Values:\n", unique_rows)
print()
print("Indices:", indices_list)
print()
print("Count:", occurrence_count)

Unique Values:
 [[ 1  2  3  4]
 [ 5  6  7  8]
 [ 9 10 11 12]]

Indices: [0 1 3]

Count: [2 1 1]


### **Sorting**

In [80]:
import numpy as np

arr = np.random.randint(50, 100, size = (5, 10))

print(arr)

[[99 77 86 65 76 96 63 54 81 72]
 [69 75 90 61 85 53 95 51 94 51]
 [84 72 70 79 71 97 85 90 61 58]
 [93 59 92 70 90 98 85 95 78 75]
 [92 76 86 79 93 75 52 99 58 67]]


In [81]:
np.sort(arr)

array([[54, 63, 65, 72, 76, 77, 81, 86, 96, 99],
       [51, 51, 53, 61, 69, 75, 85, 90, 94, 95],
       [58, 61, 70, 71, 72, 79, 84, 85, 90, 97],
       [59, 70, 75, 78, 85, 90, 92, 93, 95, 98],
       [52, 58, 67, 75, 76, 79, 86, 92, 93, 99]])

In [82]:
# Column Wise Sorting

np.sort(arr, axis = 0)

array([[69, 59, 70, 61, 71, 53, 52, 51, 58, 51],
       [84, 72, 86, 65, 76, 75, 63, 54, 61, 58],
       [92, 75, 86, 70, 85, 96, 85, 90, 78, 67],
       [93, 76, 90, 79, 90, 97, 85, 95, 81, 72],
       [99, 77, 92, 79, 93, 98, 95, 99, 94, 75]])

In [83]:
# Row Wise Sorting

np.sort(arr, axis = 1)

array([[54, 63, 65, 72, 76, 77, 81, 86, 96, 99],
       [51, 51, 53, 61, 69, 75, 85, 90, 94, 95],
       [58, 61, 70, 71, 72, 79, 84, 85, 90, 97],
       [59, 70, 75, 78, 85, 90, 92, 93, 95, 98],
       [52, 58, 67, 75, 76, 79, 86, 92, 93, 99]])
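Two related points are worth a small sketch: `np.sort` returns a sorted copy, while the `ndarray.sort` method sorts in place. A descending order can be obtained by reversing the sorted result with a slice. The array values below are made up for illustration.

```python
import numpy as np

arr = np.array([[3, 1, 2], [9, 7, 8]])

sorted_copy = np.sort(arr, axis=1)  # new array; arr is unchanged
descending = sorted_copy[:, ::-1]   # reverse each sorted row

print(arr)         # still in the original order
print(descending)  # rows become [3 2 1] and [9 8 7]

arr.sort(axis=1)   # ndarray.sort sorts in place and returns None
print(arr)         # rows become [1 2 3] and [7 8 9]
```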

### **Stacking**

In [84]:
arr_1 = np.arange(5,15).reshape(2,5)

print(arr_1)

[[ 5  6  7  8  9]
 [10 11 12 13 14]]


In [85]:
arr_2 = np.arange(25,35).reshape(2,5)
print(arr_2)

[[25 26 27 28 29]
 [30 31 32 33 34]]


In [86]:
np.vstack([arr_1, arr_2])

array([[ 5,  6,  7,  8,  9],
       [10, 11, 12, 13, 14],
       [25, 26, 27, 28, 29],
       [30, 31, 32, 33, 34]])

In [87]:
np.hstack([arr_1, arr_2])

array([[ 5,  6,  7,  8,  9, 25, 26, 27, 28, 29],
       [10, 11, 12, 13, 14, 30, 31, 32, 33, 34]])

### **Concatenate**

In [89]:
np.concatenate([arr_1, arr_2], axis = 0) # concatenating - vertical stacking

array([[ 5,  6,  7,  8,  9],
       [10, 11, 12, 13, 14],
       [25, 26, 27, 28, 29],
       [30, 31, 32, 33, 34]])

In [90]:
np.concatenate([arr_1, arr_2], axis = 1) # concatenating - horizontal stacking

array([[ 5,  6,  7,  8,  9, 25, 26, 27, 28, 29],
       [10, 11, 12, 13, 14, 30, 31, 32, 33, 34]])
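The outputs above match those of `np.vstack` and `np.hstack`. A quick way to confirm this for 2D inputs is to compare the results with `np.array_equal` (the variable names `same_vertical` and `same_horizontal` are illustrative):

```python
import numpy as np

arr_1 = np.arange(5, 15).reshape(2, 5)
arr_2 = np.arange(25, 35).reshape(2, 5)

# For 2D inputs, vstack matches concatenate along axis 0 ...
same_vertical = np.array_equal(np.vstack([arr_1, arr_2]),
                               np.concatenate([arr_1, arr_2], axis=0))

# ... and hstack matches concatenate along axis 1
same_horizontal = np.array_equal(np.hstack([arr_1, arr_2]),
                                 np.concatenate([arr_1, arr_2], axis=1))

print(same_vertical, same_horizontal)  # True True
```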

### **Append**

In [93]:
np.append(arr_1, arr_2, axis = 0) # Appending - vertical stacking

array([[ 5,  6,  7,  8,  9],
       [10, 11, 12, 13, 14],
       [25, 26, 27, 28, 29],
       [30, 31, 32, 33, 34]])

In [94]:
np.append(arr_1, arr_2, axis = 1) # Appending - horizontal stacking

array([[ 5,  6,  7,  8,  9, 25, 26, 27, 28, 29],
       [10, 11, 12, 13, 14, 30, 31, 32, 33, 34]])
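One behavior to watch for: when `axis` is omitted, `np.append` flattens both inputs before joining them, so the result is 1D. A minimal sketch using the same two arrays:

```python
import numpy as np

arr_1 = np.arange(5, 15).reshape(2, 5)
arr_2 = np.arange(25, 35).reshape(2, 5)

flat = np.append(arr_1, arr_2)  # no axis: both inputs are flattened first

print(flat.shape)  # (20,)
print(flat[:5])    # [5 6 7 8 9]
```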

### **Where - Process Array Elements Conditionally**

Understanding `np.where()` syntax:
```python
numpy.where(
    condition,  # Where True, yield x, otherwise y
    [x, y]      # Optional: values to choose from
)
```

In [95]:
arr = np.arange(50, 100).reshape(5, 10)

print(arr)

[[50 51 52 53 54 55 56 57 58 59]
 [60 61 62 63 64 65 66 67 68 69]
 [70 71 72 73 74 75 76 77 78 79]
 [80 81 82 83 84 85 86 87 88 89]
 [90 91 92 93 94 95 96 97 98 99]]


In [98]:
np.where(arr > 64, 0, 1)

array([[1, 1, 1, 1, 1, 1, 1, 1, 1, 1],
       [1, 1, 1, 1, 1, 0, 0, 0, 0, 0],
       [0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
       [0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
       [0, 0, 0, 0, 0, 0, 0, 0, 0, 0]])

In [97]:
np.where(arr > 64, arr/10, arr)

array([[50. , 51. , 52. , 53. , 54. , 55. , 56. , 57. , 58. , 59. ],
       [60. , 61. , 62. , 63. , 64. ,  6.5,  6.6,  6.7,  6.8,  6.9],
       [ 7. ,  7.1,  7.2,  7.3,  7.4,  7.5,  7.6,  7.7,  7.8,  7.9],
       [ 8. ,  8.1,  8.2,  8.3,  8.4,  8.5,  8.6,  8.7,  8.8,  8.9],
       [ 9. ,  9.1,  9.2,  9.3,  9.4,  9.5,  9.6,  9.7,  9.8,  9.9]])
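The calls above pass both `x` and `y`. When only the condition is given, `np.where` instead returns a tuple of index arrays, one per dimension, marking where the condition is True. A small sketch on a made-up 2 x 2 array:

```python
import numpy as np

arr = np.array([[50, 70], [90, 60]])

# With only a condition, np.where returns a tuple of index arrays,
# one array per dimension
rows, cols = np.where(arr > 64)

print(rows)             # [0 1]
print(cols)             # [1 0]
print(arr[rows, cols])  # [70 90]
```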

### **argsort**

In [99]:
import numpy as np

arr = np.array([10, -5, 6, -1])

print("Indexes: ", arr.argsort()) # Returns indexes

print("Sorted Array: ", arr[arr.argsort()])

Indexes:  [1 3 2 0]
Sorted Array:  [-5 -1  6 10]
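A common use of `argsort` is to reorder one array by the values of another. The sketch below reuses the heights and weights from the correlation example. It passes `kind="stable"` so that the two equal heights (180) keep their original relative order, which is an illustrative choice rather than a requirement.

```python
import numpy as np

heights = np.array([160, 180, 146, 162, 184, 180])
weights = np.array([50, 78, 45, 51, 80, 60])

# Indexes that would sort heights in ascending order
order = np.argsort(heights, kind="stable")

print(order)           # [2 0 3 1 5 4]
print(heights[order])  # [146 160 162 180 180 184]
print(weights[order])  # [45 50 51 78 60 80]
```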


## **Reading CSV into NumPy Array**

In [90]:
import numpy as np
 
# using loadtxt()
arr = np.loadtxt("data/nyc_weather.csv", delimiter=",", dtype='str')
print(arr)

[['EST' 'Temperature' 'DewPoint' 'Humidity' 'Sea Level PressureIn'
  'VisibilityMiles' 'WindSpeedMPH' 'PrecipitationIn' 'CloudCover'
  'Events' 'WindDirDegrees']
 ['1/1/2016' '38' '23' '52' '30.03' '10' '8' '0' '5' '' '281']
 ['1/2/2016' '36' '18' '46' '30.02' '10' '7' '0' '3' '' '275']
 ['1/3/2016' '40' '21' '47' '29.86' '10' '8' '0' '1' '' '277']
 ['1/4/2016' '25' '9' '44' '30.05' '10' '9' '0' '3' '' '345']
 ['1/5/2016' '20' '-3' '41' '30.57' '10' '5' '0' '0' '' '333']
 ['1/6/2016' '33' '4' '35' '30.5' '10' '4' '0' '0' '' '259']
 ['1/7/2016' '39' '11' '33' '30.28' '10' '2' '0' '3' '' '293']
 ['1/8/2016' '39' '29' '64' '30.2' '10' '4' '0' '8' '' '79']
 ['1/9/2016' '44' '38' '77' '30.16' '9' '8' 'T' '8' 'Rain' '76']
 ['1/10/2016' '50' '46' '71' '29.59' '4' '' '1.8' '7' 'Rain' '109']
 ['1/11/2016' '33' '8' '37' '29.92' '10' '' '0' '1' '' '289']
 ['1/12/2016' '35' '15' '53' '29.85' '10' '6' 'T' '4' '' '235']
 ['1/13/2016' '26' '4' '42' '29.94' '10' '10' '0' '0' '' '284']
 ['1/14/2016' '3