Notebook prepared by Muhammad Sohail Abbas

# PAI Lab 04

## Outline
* File Reading
* Numpy

## File Reading

The key function for working with files in Python is the open() function.

The open() function takes two parameters: filename and mode.

In addition, you can specify whether the file should be handled in binary or text mode:

"t" - Text - Default value. Text mode

"b" - Binary - Binary mode (e.g. images)


In [23]:
f = open("Strings.txt")

In [28]:
f = open("Strings.txt", "rt")

Because "r" for read, and "t" for text are the default values, you do not need to specify them.

In [43]:
text = []
# "with" closes the file automatically when the block ends
with open("Strings.txt", "rt", encoding="utf-8") as f:
    # loop through each line, stripping surrounding whitespace
    for line in f:
        text.append(line.strip())
    
print(text[:5])

['A', '', '', '', 'A-  prefix (also an- before a vowel sound) not, without (amoral). [greek]']


For file writing and deletion, see the [documentation](https://www.w3schools.com/python/python_file_write.asp).

# Numpy

Numpy is the core library for scientific computing in Python. It
provides a high-performance multidimensional array object, and tools for
working with these arrays.
To use Numpy, we first need to import the `numpy` package. By
convention, we import it using the alias `np`. Then, when we want to use
modules or functions in this library, we preface them with `np.`

In [1]:
pip install numpy # conda install numpy

/usr/bin/python3: No module named pip
Note: you may need to restart the kernel to use updated packages.


In [2]:
import numpy as np

## Numpy Arrays

A numpy array is a grid of values, all of the same type, and is indexed
by a tuple of nonnegative integers. The number of dimensions is the rank
of the array; the shape of an array is a tuple of integers giving the
size of the array along each dimension.
We can create a `numpy` array by passing a Python list to `np.array()`.

In [47]:
a = np.array([1, 2, 3])  # Create a rank 1 array
a

array([1, 2, 3])

This creates the array shown below:

![](http://jalammar.github.io/images/numpy/create-numpy-array-1.png)

In [49]:
print(type(a), a.shape, a[0], a[1], a[2])
a[0] = 5                 # Change an element of the array
print(a)                  

<class 'numpy.ndarray'> (3,) 1 2 3
[5 2 3]


To create a `numpy` array with more dimensions, we can pass nested
lists, like this:

![](http://jalammar.github.io/images/numpy/numpy-array-create-2d.png)

![](http://jalammar.github.io/images/numpy/numpy-3d-array.png)

In [51]:
b = np.array([[1,2],[3,4]])   # Create a rank 2 array
print(b)

[[1 2]
 [3 4]]


In [52]:
print(b.shape)

(2, 2)


There are often cases when we want numpy to initialize the values of the
array for us. Numpy provides functions like `ones()`, `zeros()`, and
`random.random()` for these cases. We just pass them the number of
elements we want them to generate:

![](http://jalammar.github.io/images/numpy/create-numpy-array-ones-zeros-random.png)
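The one-dimensional case described above can be sketched as follows. The variable names are illustrative only, and the random values will differ on every run.

```python
import numpy as np

ones_1d = np.ones(3)            # three elements, all 1.0
zeros_1d = np.zeros(3)          # three elements, all 0.0
rand_1d = np.random.random(3)   # three samples drawn uniformly from [0, 1)

print(ones_1d)
print(zeros_1d)
print(rand_1d.shape)
```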

We can also use these methods to produce multi-dimensional arrays, as
long as we pass them a tuple describing the dimensions of the matrix we
want to create:

![](http://jalammar.github.io/images/numpy/numpy-matrix-ones-zeros-random.png)

![](http://jalammar.github.io/images/numpy/numpy-3d-array-creation.png)

Sometimes, we need an array of a specific shape with “placeholder”
values that we plan to fill in with the result of a computation. The
`zeros` or `ones` functions are handy for this:

In [55]:
a = np.zeros((2,2))  # Create an array of all zeros
print(a)

[[0. 0.]
 [0. 0.]]


In [57]:
b = np.ones((1,2,3))   # Create an array of all ones
print(b)

[[[1. 1. 1.]
  [1. 1. 1.]]]


In [59]:
c = np.full((2,2,4), 7) # Create a constant array
print(c)

[[[7 7 7 7]
  [7 7 7 7]]

 [[7 7 7 7]
  [7 7 7 7]]]


In [61]:
d = np.eye(3)        # Create a 3x3 identity matrix
print(d)

[[1. 0. 0.]
 [0. 1. 0.]
 [0. 0. 1.]]


In [67]:
e = np.random.random((2,2)) # Create an array filled with random values
print(e)

[[0.72756775 0.30853502]
 [0.16648583 0.5358984 ]]


Numpy also has two useful functions for creating sequences of numbers:
`arange` and `linspace`.

The `arange` function accepts three arguments, which define the start
value, stop value of a half-open interval, and step size. (The default
step size, if not explicitly specified, is 1; the default start value,
if not explicitly specified, is 0.)

The `linspace` function is similar, but we can specify the number of
values instead of the step size, and it will create a sequence of evenly
spaced values.

In [68]:
f = np.arange(10,50,5)   # Create an array of values starting at 10 in increments of 5
print(f)

[10 15 20 25 30 35 40 45]


Note this ends on 45, not 50 (does not include the top end of the
interval).

In [72]:
g = np.linspace(0., 1., num=5)
print(g)

[0.   0.25 0.5  0.75 1.  ]


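Sometimes, we may want to construct an array from existing arrays by
“stacking” the existing arrays, either vertically or horizontally. We
can use `vstack()` and `hstack()`, respectively. (`row_stack` is an older
alias of `vstack()`; `column_stack` is not the same as `hstack()` for 1-D
inputs, because it places each 1-D array in its own column of a 2-D
result.)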

In [73]:
a = np.array([1, 2, 3])
b = np.array([4, 5, 6])
np.vstack((a,b))

array([[1, 2, 3],
       [4, 5, 6]])

In [74]:
a = np.array([1, 2, 3])
b = np.array([4, 5, 6])
np.hstack((a,b))

array([1, 2, 3, 4, 5, 6])
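For 1-D inputs, `hstack()` and `column_stack()` give different results. A short sketch comparing the two on the same arrays (the names `h` and `c` are illustrative):

```python
import numpy as np

a = np.array([1, 2, 3])
b = np.array([4, 5, 6])

h = np.hstack((a, b))         # 1-D inputs are joined end to end
c = np.column_stack((a, b))   # each 1-D input becomes one column of a 2-D array

print(h.shape)   # (6,)
print(c.shape)   # (3, 2)
print(c)
```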

### Indexing and Slicing

We can index and slice numpy arrays in all the ways we can slice Python
lists:

![](http://jalammar.github.io/images/numpy/numpy-array-slice.png)

And you can index and slice numpy arrays in multiple dimensions. If
slicing an array with more than one dimension, you should specify a
slice for each dimension:

![](http://jalammar.github.io/images/numpy/numpy-matrix-indexing.png)
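A minimal sketch of the indexing and slicing forms pictured above, first on a rank 1 array and then on a rank 2 array. The array contents and variable names are chosen here for illustration.

```python
import numpy as np

data = np.array([1, 2, 3])
first = data[0]        # single element: 1
tail = data[1:]        # from index 1 to the end: [2 3]
last_two = data[-2:]   # negative indices count from the end: [2 3]

m = np.array([[1, 2], [3, 4], [5, 6]])
elem = m[0, 1]         # row 0, column 1: 2
rows = m[1:3]          # rows 1 and 2, all columns
col = m[0:2, 0]        # column 0 of rows 0 and 1: [1 3]

print(first, tail, last_two)
print(elem, rows, col)
```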

Slicing returns a view of the original data, not a copy:

In [75]:
# Create the following rank 2 array with shape (3, 4)
# [[ 1  2  3  4]
#  [ 5  6  7  8]
#  [ 9 10 11 12]]
a = np.array([[1,2,3,4], [5,6,7,8], [9,10,11,12]])

# Use slicing to pull out the subarray consisting of the first 2 rows
# and columns 1 and 2; b is the following array of shape (2, 2):
# [[2 3]
#  [6 7]]
b = a[:2, 1:3]
print(b)

[[2 3]
 [6 7]]


Because `b` is a view into `a`, updating a value in `b` also updates `a`:

In [76]:
print(a[0, 1])
b[0, 0] = 77    # b[0, 0] is the same piece of data as a[0, 1]
print(a[0, 1]) 

2
77
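When an independent array is needed instead of a view, calling `.copy()` on the slice is one option. A minimal sketch (the names `view` and `independent` are illustrative):

```python
import numpy as np

a = np.array([[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12]])

view = a[:2, 1:3]                 # shares memory with a
independent = a[:2, 1:3].copy()   # owns its own data

independent[0, 0] = 99            # does not touch a
print(a[0, 1])                    # 2

view[0, 0] = 77                   # writes through to a
print(a[0, 1])                    # 77

print(np.shares_memory(a, view), np.shares_memory(a, independent))
```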


In [77]:
row_r1 = a[1, :]    # Rank 1 view of the second row of a
row_r2 = a[1:3, :]  # Rank 2 view of the second and third rows of a
print(row_r1, row_r1.shape)
print(row_r2, row_r2.shape)

[5 6 7 8] (4,)
[[ 5  6  7  8]
 [ 9 10 11 12]] (2, 4)


Boolean array indexing: Boolean array indexing lets you pick out
arbitrary elements of an array. Frequently this type of indexing is used
to select the elements of an array that satisfy some condition. Here is
an example:

In [79]:
import numpy as np

a = np.array([[1,2], [3, 4], [5, 6]])

bool_idx = (a > 2)  # Find the elements of a that are bigger than 2;
                    # this returns a numpy array of Booleans of the same
                    # shape as a, where each slot of bool_idx tells
                    # whether that element of a is > 2.

print(bool_idx)

[[False False]
 [ True  True]
 [ True  True]]
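The Boolean array can then be used as an index to pull out the matching elements. A short sketch continuing the example above (the name `selected` is illustrative):

```python
import numpy as np

a = np.array([[1, 2], [3, 4], [5, 6]])
bool_idx = (a > 2)

# Indexing with a Boolean mask returns a rank 1 array holding
# the elements of a where the mask is True.
selected = a[bool_idx]
print(selected)    # [3 4 5 6]

# The mask can also be written inline in a single expression.
print(a[a > 2])    # [3 4 5 6]
```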


### Math Functions

What makes working with `numpy` so powerful and convenient is that it
comes with many *vectorized* math functions for computation over
elements of an array. These functions are highly optimized and are
*very* fast - much, much faster than using an explicit `for` loop.

For example, let’s create a large array of random values and then sum it
both ways. We’ll use a `%%time` *cell magic* to time them.

In [81]:
a = np.random.random(100000000)
len(a)

100000000

Look at the “Wall Time” in the output - note how much faster the
vectorized version of the operation is! This type of fast computation is
a major enabler of machine learning, which requires a *lot* of
computation.

Whenever possible, we will try to use these vectorized operations.
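The loop-versus-vectorized comparison described above can be sketched as follows. It uses `time.perf_counter()` rather than the `%%time` magic so that it also runs outside a notebook, and the array size of one million is an assumption chosen so the explicit loop finishes quickly. Timings depend on the machine, so none are shown.

```python
import time
import numpy as np

values = np.random.random(1_000_000)   # smaller than the array above, for speed

# Sum with an explicit Python for loop
start = time.perf_counter()
loop_total = 0.0
for v in values:
    loop_total += v
loop_seconds = time.perf_counter() - start

# Sum with the vectorized numpy function
start = time.perf_counter()
vector_total = np.sum(values)
vector_seconds = time.perf_counter() - start

print(f"for loop: {loop_seconds:.4f} s")
print(f"np.sum:   {vector_seconds:.4f} s")
print("totals agree:", np.isclose(loop_total, vector_total))
```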

Some mathematical functions are available both as operator overloads and
as functions in the numpy module.

For example, you can perform an elementwise sum on two arrays using
either the + operator or the `add()` function.

![](http://jalammar.github.io/images/numpy/numpy-arrays-adding-1.png)

![](http://jalammar.github.io/images/numpy/numpy-matrix-arithmetic.png)

In [82]:
x = np.array([[1,2],[3,4]], dtype=np.float64)
y = np.array([[5,6],[7,8]], dtype=np.float64)

# Elementwise sum; both produce the same array
print(x + y)
print(np.add(x, y))

[[ 6.  8.]
 [10. 12.]]
[[ 6.  8.]
 [10. 12.]]


And this works for other operations as well, not only addition:

![](http://jalammar.github.io/images/numpy/numpy-array-subtract-multiply-divide.png)

In [84]:
# Elementwise difference; both produce the same array
print(x - y)
print(np.subtract(x, y))

[[-4. -4.]
 [-4. -4.]]
[[-4. -4.]
 [-4. -4.]]


In [85]:
# Elementwise square root; produces the array
# [[ 1.          1.41421356]
#  [ 1.73205081  2.        ]]
print(np.sqrt(x))

[[1.         1.41421356]
 [1.73205081 2.        ]]
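Multiplication and division follow the same pattern. A short sketch using the same `x` and `y` as above (the names `prod` and `quot` are illustrative). Note that `*` is elementwise multiplication, not matrix multiplication.

```python
import numpy as np

x = np.array([[1, 2], [3, 4]], dtype=np.float64)
y = np.array([[5, 6], [7, 8]], dtype=np.float64)

# Elementwise product; the operator and the function give the same array
prod = x * y
print(prod)
print(np.multiply(x, y))

# Elementwise division; the operator and the function give the same array
quot = x / y
print(quot)
print(np.divide(x, y))
```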


We use the `dot()` function to compute inner
products of vectors, to multiply a vector by a matrix, and to multiply
matrices. `dot()` is available both as a function in the numpy module
and as an instance method of array objects:

![](http://jalammar.github.io/images/numpy/numpy-matrix-dot-product-1.png)

In [93]:
x = np.array([[1,2],[3,4]])
y = np.array([[5,6],[7,8]])

v = np.array([9,10])
w = np.array([11, 12])

# Inner product of vectors; both produce 219
print(v.dot(w))
print(np.dot(v, w))

219
219


In [95]:
print(np.dot(x, y))

[[19 22]
 [43 50]]
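The matrix-by-vector case mentioned above, together with the `@` operator that Python provides for matrix multiplication, can be sketched with the same arrays (the names `mv` and `mm` are illustrative):

```python
import numpy as np

x = np.array([[1, 2], [3, 4]])
y = np.array([[5, 6], [7, 8]])
v = np.array([9, 10])

# Matrix / vector product; produces the rank 1 array [29 67]
mv = x.dot(v)
print(mv)

# The @ operator performs the same matrix multiplication as dot() here
mm = x @ y
print(mm)
```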


Besides the functions that overload operators, Numpy also provides
many useful functions for performing computations on arrays, such as
`min()`, `max()`, `sum()`, and others:

![](http://jalammar.github.io/images/numpy/numpy-matrix-aggregation-1.png)

In [96]:
x = np.array([[1, 2], [3, 4], [5, 6]])

print(np.max(x)) 
print(np.min(x))  
print(np.sum(x)) 

6
1
21


Not only can we aggregate all the values in a matrix using these
functions, but we can also aggregate across the rows or columns by using
the `axis` parameter:

![](http://jalammar.github.io/images/numpy/numpy-matrix-aggregation-4.png)

In [98]:
x = np.array([[1, 2], [5, 3], [4, 6]])

print(np.max(x, axis=0))  # Compute max of each column; prints "[5 6]"
print(np.max(x, axis=1))  # Compute max of each row; prints "[2 5 6]"

[5 6]
[2 5 6]
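The `axis` parameter works the same way for the other aggregation functions. A brief sketch with `sum()` and `min()` on the same matrix (variable names are illustrative):

```python
import numpy as np

x = np.array([[1, 2], [5, 3], [4, 6]])

col_sums = np.sum(x, axis=0)   # one sum per column
row_sums = np.sum(x, axis=1)   # one sum per row
col_mins = np.min(x, axis=0)   # smallest value in each column

print(col_sums)   # [10 11]
print(row_sums)   # [ 3  8 10]
print(col_mins)   # [1 2]
```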


You can find the full list of mathematical functions provided by numpy
in the
[documentation](http://docs.scipy.org/doc/numpy/reference/routines.math.html).

Apart from computing mathematical functions using arrays, we frequently
need to reshape or otherwise manipulate data in arrays. The simplest
example of this type of operation is transposing a matrix; to transpose
a matrix, simply use the T attribute of an array object.

![](http://jalammar.github.io/images/numpy/numpy-transpose.png)

In [99]:
x = np.array([[1, 2], [3, 4], [5, 6]])

print(x)
print("transpose\n", x.T)

[[1 2]
 [3 4]
 [5 6]]
transpose
 [[1 3 5]
 [2 4 6]]
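Reshaping is another common manipulation. A minimal sketch using `reshape()`, where `-1` asks numpy to infer that dimension from the array size (the array and names are chosen here for illustration):

```python
import numpy as np

x = np.arange(6)             # [0 1 2 3 4 5]

m = x.reshape(2, 3)          # 2 rows, 3 columns
print(m)

m2 = x.reshape(3, -1)        # 3 rows; the column count (2) is inferred
print(m2.shape)

flat = m.reshape(-1)         # back to a rank 1 array
print(flat)
```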


# Exercises

### T1
* Generate a 3x3 matrix with random values between 0 and 1.
* Use a conditional operation to replace values less than 0.5 with 0.

### T2
* Create two 2D arrays of compatible dimensions.
* Compute the dot product (matrix multiplication) of the two arrays.
* Calculate the sum of the diagonal elements in the resulting matrix.

### T3
* Create a 1D array with values ranging from -1 to 1 using a numpy function.
* Count the number of positive values in the array that are greater than 0.5.

### T4 
* Start with a 5x5 matrix (you can create one or use an existing matrix).
* Normalize each row by dividing all elements in that row by the sum of that row's elements.

In [53]:
# YOUR CODE HERE

#TASK01

import numpy as np

matrix = np.random.rand(3,3)

matrix[matrix < 0.5] = 0

print(matrix)

[[0.92531953 0.77092324 0.        ]
 [0.         0.98357738 0.89679608]
 [0.58745642 0.58954862 0.        ]]


In [54]:
#TASK02

array1 = np.array([[1, 2, 3],
                   [3, 4, 5],
                   [3, 4, 5]])

array2 = np.array([[5, 6, 6],
                   [7, 8, 9],
                   [3, 4, 5]])
result_matrix = np.dot(array1, array2)
diagonal_sum = np.trace(result_matrix)

print("Resulting Matrix:")
print(result_matrix)
print("Sum of Diagonal Elements:", diagonal_sum)

Resulting Matrix:
[[28 34 39]
 [58 70 79]
 [58 70 79]]
Sum of Diagonal Elements: 177


In [55]:

#TASK03

array = np.linspace(-1, 1, num=10) 

# (array > 0.5) is a Boolean mask; summing it counts the True values
count = np.sum(array > 0.5)

print("Array:", array)
print("Number of positive values greater than 0.5:", count)


Array: [-1.         -0.77777778 -0.55555556 -0.33333333 -0.11111111  0.11111111
  0.33333333  0.55555556  0.77777778  1.        ]
Number of positive values greater than 0.5: 3


In [56]:
#TASK04


matrix = np.array([[10, 20, 30, 40, 50],
                  [5, 15, 25, 35, 45],
                  [2, 4, 6, 8, 10],
                  [1, 2, 3, 4, 5],
                  [3, 6, 9, 12, 15]])

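# Divide each row by its own sum. keepdims=True keeps the row sums as a
# (5, 1) column so they broadcast across each row; no loop is needed.
normalized_matrix = matrix / matrix.sum(axis=1, keepdims=True)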

print(normalized_matrix)


[[0.06666667 0.13333333 0.2        0.26666667 0.33333333]
 [0.04       0.12       0.2        0.28       0.36      ]
 [0.06666667 0.13333333 0.2        0.26666667 0.33333333]
 [0.06666667 0.13333333 0.2        0.26666667 0.33333333]
 [0.06666667 0.13333333 0.2        0.26666667 0.33333333]]


## Read numbers.txt into a numpy array and perform the following numpy operations

* Use numpy to read the data from "numbers.txt" into a numpy array.
* Find the sum of the even numbers in the array.
* Calculate the square of each element in the array.
* Find the indices of all elements greater than 20.
* Create a new array that contains only unique elements from the original array.
* Calculate the cumulative sum of the array.
* Replace all negative elements in the array with their absolute values.


In [52]:

# Specify the file path
file_path = 'Numbers.txt'  # Replace with the actual file path

# Initialize an empty list to store the data
data = []

# Read the file line by line
with open(file_path, 'r') as file:
    for line in file:
        # Split the line into numbers, removing extra spaces
        numbers = line.strip().split()
        # Convert the numbers to float and append to the data list
        data.append([float(num) for num in numbers])

# Convert the data list to a NumPy array
data_array = np.array(data)


# Print the resulting 2D array
print("Original Array:")
print(data_array)
print("Shape of the array:", data_array.shape)

# Task 1: Find the sum of even numbers in the array
even_sum = np.sum(data_array[data_array % 2 == 0])
print("\nSum of even numbers:", even_sum)

# Task 2: Calculate the square of each element in the array
squared_array = np.square(data_array)
print("\nSquared Array:")
print(squared_array)

# Task 3: Find the indices of all elements greater than 20
indices_gt_20 = np.where(data_array > 20)
print("\nIndices of elements greater than 20:", indices_gt_20)

# Task 4: Create a new array with unique elements
unique_array = np.unique(data_array)
print("\nUnique Array:")
print(unique_array)

# Task 5: Calculate the cumulative sum of the array
cumulative_sum = np.cumsum(data_array)
print("\nCumulative Sum Array:")
print(cumulative_sum)

# Task 6: Replace negative elements with their absolute values
data_array_abs = np.abs(data_array)
print("\nArray with Absolute Values:")
print(data_array_abs)



Original Array:
[[7719. 7277. 1781. ... 3273. 2993. 3210.]]
Shape of the array: (1, 400010)

Sum of even numbers: 941355826.0

Squared Array:
[[59582961. 52954729.  3171961. ... 10712529.  8958049. 10304100.]]

Indices of elements greater than 20: (array([0, 0, 0, ..., 0, 0, 0]), array([     0,      1,      2, ..., 400007, 400008, 400009]))

Unique Array:
[0.000e+00 1.000e+00 2.000e+00 ... 9.998e+03 9.999e+03 1.000e+04]

Cumulative Sum Array:
[7.71900000e+03 1.49960000e+04 1.67770000e+04 ... 1.87942062e+09
 1.87942361e+09 1.87942682e+09]

Array with Absolute Values:
[[7719. 7277. 1781. ... 3273. 2993. 3210.]]
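Since the task asks for numpy to do the reading, `np.loadtxt()` is one alternative to the manual loop above. The sketch below is only an illustration under stated assumptions: it writes a small hypothetical sample file (`sample_numbers.txt`, with made-up values) to a temporary directory instead of using the lab's file, and it assumes whitespace-separated numbers.

```python
import os
import tempfile
import numpy as np

# Hypothetical stand-in for the lab's file: whitespace-separated numbers on one line.
with tempfile.TemporaryDirectory() as tmp_dir:
    sample_path = os.path.join(tmp_dir, "sample_numbers.txt")
    with open(sample_path, "w") as f:
        f.write("4 -7 22 4 31 10\n")

    # A single line of numbers is returned as a rank 1 float array
    values = np.loadtxt(sample_path)

even_sum = values[values % 2 == 0].sum()      # sum of the even numbers
indices_gt_20 = np.flatnonzero(values > 20)   # flat indices of elements > 20
unique_values = np.unique(values)             # sorted unique elements
abs_values = np.abs(values)                   # negatives replaced by absolute values

print(values.shape)
print(even_sum)
print(indices_gt_20)
print(unique_values)
print(abs_values)
```

Because the sample here is a rank 1 array, `np.flatnonzero()` returns a single index array rather than the tuple that `np.where()` produced for the rank 2 array above.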


In [None]:
# YOUR CODE HERE
