19.	Perform the following operations in Python on the given data set (Toyota.csv):
a.	Sort the observations by Price
b.	Create subsets by selecting columns, and by selecting rows and columns
c.	Create a subset of cars with Price greater than 15000 and Age less than 8
d.	Create a subset of cars that run on Petrol
e.	Apply decimal normalization to the Price column

In [1]:
import pandas as pd

In [2]:
df = pd.read_csv("Toyota.csv")
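
The outputs further down show an extra "Unnamed: 0" column and a "??" placeholder in KM. As an optional alternative to the plain read above, the sketch below reads a small inline stand-in for Toyota.csv (the rows are illustrative, not taken from the real file) with index_col=0 and na_values. The names sample_csv and sample_df are hypothetical; the outputs in this notebook were produced without these options.

```python
import io
import pandas as pd

# Tiny inline stand-in for Toyota.csv (illustrative values, not the real data)
sample_csv = io.StringIO(
    ",Price,Age,KM,FuelType\n"
    "0,13500,23,46986,Diesel\n"
    "1,10845,72,??,Petrol\n"
    "2,7500,,20544,Petrol\n"
)

# index_col=0 reuses the unnamed first column as the row index;
# na_values turns the "??" placeholder into NaN so KM parses as numeric
sample_df = pd.read_csv(sample_csv, index_col=0, na_values=["??"])
print(sample_df.dtypes)
```

With the placeholder mapped to NaN, numeric comparisons on KM work without a separate cleaning step.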

In [17]:
# Standardize 'Doors' column for consistency before any filtering
df['Doors'] = df['Doors'].replace({'three': 3, 'Four': 4, 'five': 5})
df['Doors'] = pd.to_numeric(df['Doors'], errors='coerce')
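
The replacement above matches the spellings exactly, so a value such as 'four' would not match 'Four' and would be coerced to NaN. A minimal sketch of a case-insensitive variant follows, assuming the column mixes digit strings and number words; the sample values and the names doors, word_to_digit and cleaned are illustrative only.

```python
import pandas as pd

# Illustrative values only; the real column may contain other spellings
doors = pd.Series(["3", "three", "Four", "four", "5", "five"])

word_to_digit = {"three": 3, "four": 4, "five": 5}

# Lower-case first so 'Four' and 'four' both match, then swap words for numbers;
# values that are not in the mapping are passed through unchanged
cleaned = doors.str.lower().map(lambda v: word_to_digit.get(v, v))

# Convert the remaining digit strings; anything unparseable becomes NaN
cleaned = pd.to_numeric(cleaned, errors="coerce")
print(cleaned.tolist())
```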

# a. Sort observations by Price (ascending)

In [20]:
sorted_df = df.sort_values(by='Price')
print("\nSorted by Price:\n", sorted_df.head())


Sorted by Price:
       Unnamed: 0  Price   Age      KM FuelType  HP  MetColor  Automatic    CC  \
191          191   4350  44.0  158320   Diesel  69       0.0          0  1800   
1048        1048   4400  74.0  203254   Diesel  72       1.0          0  2000   
393          393   4450  56.0  129155   Diesel  69       0.0          0  1800   
192          192   4750  44.0  131273   Diesel  69       1.0          0  1800   
402          402   5150  56.0  113997   Diesel  72       1.0          0  2000   

      Doors  Weight  Price_Normalized  
191     5.0    1110            0.0435  
1048    3.0    1135            0.0440  
393     5.0    1110            0.0445  
192     5.0    1110            0.0475  
402     5.0    1135            0.0515  
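
sort_values sorts in ascending order by default, which is why the cheapest cars appear first. A short sketch of a descending sort with a tie-breaker, on a small made-up frame (cars and by_price_desc are hypothetical names):

```python
import pandas as pd

# Made-up rows for illustration
cars = pd.DataFrame({"Price": [13500, 4350, 13500, 31275], "Age": [23, 44, 30, 4]})

# Highest price first; ties on Price are broken by Age (youngest first)
by_price_desc = cars.sort_values(by=["Price", "Age"], ascending=[False, True])

# reset_index(drop=True) renumbers the rows 0..n-1 in the new order
by_price_desc = by_price_desc.reset_index(drop=True)
print(by_price_desc)
```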


# b. Create Subset by:
#    - Selecting columns
#    - Selecting rows and columns

In [23]:
subset_columns = df[['Price', 'Age', 'FuelType']]  # selecting specific columns
print("\nSubset of Columns:\n", subset_columns.head())


Subset of Columns:
    Price   Age FuelType
0  13500  23.0   Diesel
1  13750  23.0   Diesel
2  13950  24.0   Diesel
3  14950  26.0   Diesel
4  13750  30.0   Diesel


In [25]:
subset_rows_cols = df.loc[0:4, ['Price', 'Age', 'FuelType']]  # rows 0 to 4 & selected cols
print("\nSubset of Rows and Columns:\n", subset_rows_cols)


Subset of Rows and Columns:
    Price   Age FuelType
0  13500  23.0   Diesel
1  13750  23.0   Diesel
2  13950  24.0   Diesel
3  14950  26.0   Diesel
4  13750  30.0   Diesel
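
Note that df.loc[0:4, ...] slices by label and includes row 4, so it returns five rows. The positional equivalent, iloc, excludes the end of the slice. A small sketch on a made-up frame (the names cars, by_label and by_position are illustrative):

```python
import pandas as pd

# Made-up rows for illustration
cars = pd.DataFrame(
    {
        "Price": [13500, 13750, 13950, 14950, 13750, 12950],
        "Age": [23, 23, 24, 26, 30, 32],
        "FuelType": ["Diesel"] * 6,
    }
)

# Label-based slice: the end label 4 is included, giving 5 rows
by_label = cars.loc[0:4, ["Price", "Age"]]

# Position-based slice: the end position 4 is excluded, giving 4 rows
by_position = cars.iloc[0:4, [0, 1]]

print(len(by_label), len(by_position))
```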


# c. Subset of cars with Price > 15000 and Age < 8

In [28]:
subset_expensive_young = df[(df['Price'] > 15000) & (df['Age'] < 8)]
print("\nCars with Price > 15000 and Age < 8:\n", subset_expensive_young)


Cars with Price > 15000 and Age < 8:
      Unnamed: 0  Price  Age     KM FuelType   HP  MetColor  Automatic    CC  \
110         110  31000  4.0   4000   Diesel  116       1.0          0  2000   
111         111  31275  4.0   1500   Diesel  116       1.0          0  2000   
114         114  22950  7.0  10000   Diesel  116       1.0          0  2000   
177         177  19950  7.0   6250   Petrol  110       1.0          0  1600   
179         179  22500  6.0   3000   Petrol  110       0.0          0  1600   
180         180  18500  7.0   2000   Petrol  110       0.0          0  1600   
181         181  18700  7.0    450   Petrol   97       1.0          0  1400   
182         182  21125  2.0    225   Petrol   97       1.0          0  1400   
184         184  17795  1.0      1   Petrol   98       1.0          0  1400   
185         185  18245  1.0      1   Petrol  110       1.0          0  1600   

     Doors  Weight  Price_Normalized  
110    5.0    1480           0.31000  
111    5.0   
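
The same filter can also be written with DataFrame.query, which some readers find easier to scan than a bracketed boolean mask. A minimal sketch on made-up rows (cars and young_expensive are hypothetical names):

```python
import pandas as pd

# Made-up rows for illustration; one Age is missing on purpose
cars = pd.DataFrame(
    {"Price": [31000, 13500, 22950, 16000], "Age": [4.0, 23.0, 7.0, None]}
)

# query() expresses the same condition as the boolean mask in a single string
young_expensive = cars.query("Price > 15000 and Age < 8")

# Rows with a missing Age fail the comparison and are left out
print(young_expensive)
```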

# d. Subset of cars that consume Petrol

In [31]:
subset_petrol = df[df['FuelType'].str.lower() == 'petrol']
print("\nCars consuming Petrol:\n", subset_petrol)


Cars consuming Petrol:
       Unnamed: 0  Price   Age     KM FuelType   HP  MetColor  Automatic    CC  \
8              8  21500  27.0  19700   Petrol  192       0.0          0  1800   
10            10  20950  25.0  31461   Petrol  192       0.0          0  1800   
11            11  19950  22.0  43610   Petrol  192       0.0          0  1800   
12            12  19600  25.0  32189   Petrol  192       0.0          0  1800   
13            13  21500  31.0  23000   Petrol  192       1.0          0  1800   
...          ...    ...   ...    ...      ...  ...       ...        ...   ...   
1430        1430   8450  80.0  23000   Petrol   86       0.0          0  1300   
1431        1431   7500   NaN  20544   Petrol   86       1.0          0  1300   
1432        1432  10845  72.0     ??   Petrol   86       0.0          0  1300   
1433        1433   8500   NaN  17016   Petrol   86       0.0          0  1300   
1435        1435   6950  76.0      1   Petrol  110       0.0          0  1600   
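
Lower-casing FuelType before comparing guards against inconsistent capitalization. If stray whitespace is also a concern (an assumption, not something visible in the output above), stripping first is a cheap extra safeguard. A sketch on illustrative values:

```python
import pandas as pd

# Illustrative values, including inconsistent case, trailing space and a missing entry
fuel = pd.Series(["Petrol", "petrol ", "Diesel", None, "CNG"])

# Strip surrounding whitespace, then lower-case, before comparing
normalized = fuel.str.strip().str.lower()

# Missing values compare as False, so they are excluded from the subset
is_petrol = normalized.eq("petrol")
print(fuel[is_petrol].tolist())
```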

  

# e. Apply decimal normalization to the Price column

In [34]:
# Decimal scaling: divide by 10**j, where j is the number of digits in the largest Price,
# so that every scaled value is below 1
max_price = df['Price'].max()
divisor = 10 ** len(str(int(max_price)))
df['Price_Normalized'] = df['Price'] / divisor
print("\nPrice with Decimal Normalization:\n", df[['Price', 'Price_Normalized']].head())


Price with Decimal Normalization:
    Price  Price_Normalized
0  13500            0.1350
1  13750            0.1375
2  13950            0.1395
3  14950            0.1495
4  13750            0.1375
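
Decimal scaling is usually defined as v' = v / 10**j, where j is the smallest integer such that max(|v'|) < 1. The digit-count approach above matches this for positive integer prices. The sketch below is one possible general version that also handles negative values; decimal_scale is a hypothetical helper name, and the sample prices are illustrative.

```python
import numpy as np
import pandas as pd


def decimal_scale(values: pd.Series) -> pd.Series:
    """Divide by 10**j, with j the smallest integer such that max(|v|) / 10**j < 1."""
    max_abs = values.abs().max()
    if max_abs == 0:
        # Nothing to scale when every value is zero
        return values.astype(float)
    # floor(log10(max_abs)) + 1 is the number of integer digits of max_abs
    j = int(np.floor(np.log10(max_abs))) + 1
    return values / 10 ** j


prices = pd.Series([13500, 13750, 32500])
scaled = decimal_scale(prices)
print(scaled.tolist())
```

Using the absolute maximum keeps the scaled values inside (-1, 1) even when the column contains negative numbers.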
