# Exploring NumPy

### Basic NumPy Operations

1.	Create a NumPy array 'arr' with values from 1 to 10.

In [1]:
import numpy as np

arr = np.arange(1, 11)
print(arr)

[ 1  2  3  4  5  6  7  8  9 10]


2.	Create another NumPy array 'arr2' with values from 11 to 20.

In [2]:
arr2 = np.arange(11, 21)
print(arr2)

[11 12 13 14 15 16 17 18 19 20]


3.	Add, subtract, multiply, and divide 'arr' and 'arr2'. Print the results.

In [3]:
addition_result = arr + arr2
subtraction_result = arr - arr2
multiplication_result = arr * arr2
division_result = arr / arr2

print("Addition Result:", addition_result)
print("Subtraction Result:", subtraction_result)
print("Multiplication Result:", multiplication_result)
print("Division Result:", division_result)

Addition Result: [12 14 16 18 20 22 24 26 28 30]
Subtraction Result: [-10 -10 -10 -10 -10 -10 -10 -10 -10 -10]
Multiplication Result: [ 11  24  39  56  75  96 119 144 171 200]
Division Result: [0.09090909 0.16666667 0.23076923 0.28571429 0.33333333 0.375
 0.41176471 0.44444444 0.47368421 0.5       ]
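The division result above is floating point even though both inputs hold integers. A minimal sketch of how the result dtype depends on the operator, rebuilding 'arr' and 'arr2' so the snippet runs on its own (the names `true_div` and `floor_div` are introduced here for illustration):

```python
import numpy as np

arr = np.arange(1, 11)
arr2 = np.arange(11, 21)

# Addition of two integer arrays keeps an integer dtype
sum_result = arr + arr2
print(sum_result.dtype)

# True division (/) always produces floating-point values
true_div = arr / arr2
print(true_div.dtype)

# Floor division (//) keeps the integer dtype and discards the remainder
floor_div = arr2 // arr
print(floor_div)
```

Use `//` when an integer quotient is wanted, and `/` when the fractional part matters.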


### Array Manipulation

1.	Reshape 'arr' into a 2x5 matrix.

In [6]:
arr_reshaped = np.reshape(arr, (2, 5))
print(arr_reshaped)

[[ 1  2  3  4  5]
 [ 6  7  8  9 10]]


2.	Transpose the matrix obtained in the previous step.

In [7]:
arr_transposed = np.transpose(arr_reshaped)
print(arr_transposed)

[[ 1  6]
 [ 2  7]
 [ 3  8]
 [ 4  9]
 [ 5 10]]


3.	Flatten the transposed matrix into a 1D array.

In [9]:
arr_flattened = arr_transposed.flatten()
print(arr_flattened)

[ 1  6  2  7  3  8  4  9  5 10]
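Two details about flattening may be worth knowing: `flatten()` always returns a copy, while `ravel()` returns a view when the memory layout allows it, and the `order` argument controls the traversal order. The sketch below uses a freshly built 2x5 array named `matrix` (an illustrative name) rather than the notebook's variables:

```python
import numpy as np

matrix = np.arange(1, 11).reshape(2, 5)

# Column-major flattening gives the same order as transposing first
column_major = matrix.flatten(order="F")
print(column_major)

flat_copy = matrix.flatten()  # always a new, independent array
flat_view = matrix.ravel()    # a view here, because matrix is contiguous

flat_copy[0] = 100  # does not change matrix
flat_view[1] = 200  # writes through to matrix

print(matrix[0, 0], matrix[0, 1])
print(np.shares_memory(matrix, flat_copy), np.shares_memory(matrix, flat_view))
```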


4.	Stack 'arr' and 'arr2' vertically. Print the result.

In [10]:
stacked_array = np.vstack((arr, arr2))
print(stacked_array)

[[ 1  2  3  4  5  6  7  8  9 10]
 [11 12 13 14 15 16 17 18 19 20]]


### Statistical Operations

1.	Calculate the mean, median, and standard deviation of 'arr'.

In [14]:
mean = np.mean(arr)
median = np.median(arr)
standardDeviation = np.std(arr)

print("Mean is:", mean)
print("Median is:", median)
print("Standard Deviation is:", standardDeviation)

Mean is: 5.5
Median is: 5.5
Standard Deviation is: 2.8722813232690143
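The value above is the population standard deviation, because `np.std` divides by N by default (`ddof=0`). A short sketch of the sample version, which divides by N - 1; the names `population_std` and `sample_std` are introduced here for illustration:

```python
import numpy as np

arr = np.arange(1, 11)

# Default: population standard deviation (divide by N)
population_std = np.std(arr)

# ddof=1: sample standard deviation (divide by N - 1)
sample_std = np.std(arr, ddof=1)

print("Population std:", population_std)
print("Sample std:", sample_std)
```

This matters when comparing against pandas, whose `Series.std()` uses `ddof=1` by default.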


2.	Find the maximum and minimum values in 'arr'.

In [16]:
print("Max value:",np.max(arr))
print("Min value:",np.min(arr))

Max value: 10
Min value: 1


3.	Normalize 'arr' (subtract the mean and divide by the standard deviation).

In [18]:
arr_normalized = (arr - mean) / standardDeviation
print(arr_normalized)

[-1.5666989  -1.21854359 -0.87038828 -0.52223297 -0.17407766  0.17407766
  0.52223297  0.87038828  1.21854359  1.5666989 ]
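One way to sanity-check the normalization is to confirm that the result has a mean of about 0 and a standard deviation of about 1. A minimal sketch, recomputing everything locally (the name `z_scores` is illustrative):

```python
import numpy as np

arr = np.arange(1, 11)

# Standardize: subtract the mean, divide by the population standard deviation
z_scores = (arr - arr.mean()) / arr.std()

# Floating-point rounding means the checks use a tolerance, not equality
mean_is_zero = np.isclose(z_scores.mean(), 0.0)
std_is_one = np.isclose(z_scores.std(), 1.0)
print(mean_is_zero, std_is_one)
```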


### Boolean Indexing

1.	Create a boolean array 'bool_arr' for elements in 'arr' greater than 5.

In [19]:
bool_arr = arr > 5
print(bool_arr)

[False False False False False  True  True  True  True  True]


2.	Use 'bool_arr' to extract the elements from 'arr' that are greater than 5.

In [20]:
filtered_arr = arr[bool_arr]
print(filtered_arr)

[ 6  7  8  9 10]
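Boolean masks can also be combined. The sketch below assumes the unshuffled 'arr' from 1 to 10 and uses the element-wise operators `&`, `|`, and `~`; each comparison needs its own parentheses because these operators bind more tightly than `>` or `==`:

```python
import numpy as np

arr = np.arange(1, 11)

# AND: greater than 5 and even
even_and_large = arr[(arr > 5) & (arr % 2 == 0)]

# OR: less than 3 or greater than 8
outer_values = arr[(arr < 3) | (arr > 8)]

# NOT: invert a mask
not_large = arr[~(arr > 5)]

print(even_and_large)
print(outer_values)
print(not_large)
```

Python's `and`, `or`, and `not` keywords do not work element-wise on arrays and raise a `ValueError` for arrays with more than one element.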


### Random module

1.	Generate a 3x3 matrix with random values between 0 and 1.

In [21]:
random_matrix = np.random.rand(3, 3)
print(random_matrix)

[[0.58983075 0.74440831 0.26119493]
 [0.07329969 0.04246981 0.79974822]
 [0.62921304 0.12983326 0.26564706]]


2.	Create an array of 10 random integers between 1 and 100.

In [23]:
arr3 = np.random.randint(1, 101, size=10)
print(arr3)

[10 10 78 11 24 50 93 26 34 54]
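The random outputs above will differ on every run. If reproducible results are wanted, one option is NumPy's `Generator` interface with a fixed seed. This is a sketch of an alternative to the `np.random.rand` and `np.random.randint` calls used here, and the seed value 42 is arbitrary:

```python
import numpy as np

# A seeded generator produces the same sequence on every run
rng = np.random.default_rng(seed=42)

random_matrix = rng.random((3, 3))            # floats in [0, 1)
random_ints = rng.integers(1, 101, size=10)   # integers from 1 to 100 inclusive

# A second generator with the same seed repeats the first draw
rng_again = np.random.default_rng(seed=42)
same_draw = np.array_equal(random_matrix, rng_again.random((3, 3)))
print(same_draw)
```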


3.	Shuffle the elements of 'arr' randomly.

In [24]:
np.random.shuffle(arr)
print(arr)

[ 8 10  3  5  6  9  7  4  2  1]
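Note that `np.random.shuffle` reorders 'arr' in place, so every later cell sees the shuffled order. If the original order should be kept, `np.random.permutation` returns a shuffled copy instead. A minimal sketch (the names `shuffled_copy`, `original_intact`, and `return_value` are illustrative):

```python
import numpy as np

arr = np.arange(1, 11)

# permutation returns a new shuffled array and leaves arr untouched
shuffled_copy = np.random.permutation(arr)
original_intact = np.array_equal(arr, np.arange(1, 11))
print(original_intact)

# shuffle modifies arr itself and returns None
return_value = np.random.shuffle(arr)
print(return_value)
```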


### Universal Functions (ufunc)

1.	Apply the square root function to all elements in 'arr'.

In [25]:
arr_square_root = np.sqrt(arr)
print(arr_square_root)

[2.82842712 3.16227766 1.73205081 2.23606798 2.44948974 3.
 2.64575131 2.         1.41421356 1.        ]


2.	Use the exponential function to calculate $e^x$ for each element in 'arr'.

In [26]:
arr_exponential = np.exp(arr)
print(arr_exponential)

[2.98095799e+03 2.20264658e+04 2.00855369e+01 1.48413159e+02
 4.03428793e+02 8.10308393e+03 1.09663316e+03 5.45981500e+01
 7.38905610e+00 2.71828183e+00]


### Linear Algebra Operations

1.	Create a 3x3 matrix 'mat_a' with random values.

In [27]:
mat_a = np.random.rand(3, 3)
print(mat_a)

[[0.01590376 0.063594   0.83532112]
 [0.37358043 0.26998991 0.28341007]
 [0.41803633 0.14738558 0.48161369]]


2.	Create a 3x1 matrix 'vec_b' with random values.

In [28]:
vec_b = np.random.rand(3, 1)
print(vec_b)

[[0.89380936]
 [0.08256461]
 [0.51395215]]


3.	Multiply 'mat_a' and 'vec_b' using the dot product.

In [29]:
result = np.dot(mat_a, vec_b)
print(result)

[[0.44878063]
 [0.50186051]
 [0.63334001]]
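For 2D arrays, the `@` operator gives the same matrix product as `np.dot`, while `*` multiplies element by element. The sketch below uses small fixed values instead of random ones so the results can be checked by hand; these particular numbers are made up for illustration:

```python
import numpy as np

mat_a = np.array([[1.0, 2.0, 3.0],
                  [4.0, 5.0, 6.0],
                  [7.0, 8.0, 9.0]])
vec_b = np.array([[1.0], [0.0], [-1.0]])

# Matrix product: (3x3) times (3x1) gives (3x1)
via_dot = np.dot(mat_a, vec_b)
via_operator = mat_a @ vec_b
print(via_operator)

# Element-wise product: vec_b is broadcast across the columns, giving (3x3)
elementwise = mat_a * vec_b
print(elementwise.shape)
```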


### Broadcasting

1.	Create a 2D array 'matrix' with values from 1 to 9.

In [31]:
# Build the matrix directly so the earlier 'arr2' is not overwritten
matrix = np.arange(1, 10).reshape(3, 3)
print(matrix)

[[1 2 3]
 [4 5 6]
 [7 8 9]]


2.	Subtract the mean of each row from each element in that row.

In [37]:
# Calculate the mean of each row
row_means = np.mean(matrix, axis=1)

# Subtract the mean of each row from each element in that row
matrix_centered = matrix - row_means[:, np.newaxis]
print(matrix_centered)

[[-1.  0.  1.]
 [-1.  0.  1.]
 [-1.  0.  1.]]
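An alternative to `row_means[:, np.newaxis]` is to pass `keepdims=True`, which leaves the reduced axis in place with length 1 so the result broadcasts against each row directly. The sketch below also shows column centering for contrast (the names `centered_rows` and `centered_cols` are illustrative):

```python
import numpy as np

matrix = np.arange(1, 10).reshape(3, 3)

# keepdims=True gives shape (3, 1), which broadcasts across each row
row_means = matrix.mean(axis=1, keepdims=True)
centered_rows = matrix - row_means
print(centered_rows)

# Column means have shape (3,), which already lines up with the last axis
col_means = matrix.mean(axis=0)
centered_cols = matrix - col_means
print(centered_cols)
```

Without `keepdims` or `np.newaxis`, the shape `(3,)` row means would be subtracted along the columns instead, which silently gives the wrong answer for a square matrix.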


# Introduction to Pandas

### Basic DataFrame Operations

I.	Download a dataset of your choice (CSV, Excel, or any other format) and load it into a Pandas DataFrame.

In [9]:
import pandas as pd

df = pd.read_csv('Cleaned_data.csv')
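When a CSV was written with its row index included, `read_csv` loads that index as an extra column named `Unnamed: 0`. Passing `index_col=0` uses it as the index instead. The sketch below demonstrates this on a tiny in-memory CSV whose contents are made up for illustration; it does not read 'Cleaned_data.csv':

```python
import io

import pandas as pd

# Hypothetical CSV text with an unnamed leading index column
csv_text = ",name,year\n0,Car A,2007\n1,Car B,2014\n"

# Default: the unnamed first column becomes a regular column
with_extra = pd.read_csv(io.StringIO(csv_text))
print(list(with_extra.columns))

# index_col=0: the first column becomes the DataFrame index
with_index = pd.read_csv(io.StringIO(csv_text), index_col=0)
print(list(with_index.columns))
```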

II.	Display the first 5 rows of the dataset. Check for missing values and handle them appropriately. Get a summary of the dataset using describe().

In [10]:
print("First 5 rows\n",df.head(5))
print("\n")

print("checking missing values\n",df.isnull().sum())

df = df.dropna()

print("summary",df.describe())

First 5 rows
    Unnamed: 0                    name   company  year   Price  kms_driven  \
0           0     Hyundai Santro Xing   Hyundai  2007   80000       45000   
1           1     Mahindra Jeep CL550  Mahindra  2006  425000          40   
2           2       Hyundai Grand i10   Hyundai  2014  325000       28000   
3           3  Ford EcoSport Titanium      Ford  2014  575000       36000   
4           4               Ford Figo      Ford  2012  175000       41000   

  fuel_type  
0    Petrol  
1    Diesel  
2    Petrol  
3    Diesel  
4    Diesel  


checking missing values Unnamed: 0    0
name          0
company       0
year          0
Price         0
kms_driven    0
fuel_type     0
dtype: int64
summary        Unnamed: 0         year         Price     kms_driven
count  815.000000   815.000000  8.150000e+02     815.000000
mean   407.000000  2012.442945  4.017933e+05   46277.096933
std    235.414528     4.005079  3.815888e+05   34318.459638
min      0.000000  1995.000000  3.000