## 1. Simple Calculations

Scenario: You have a dataset with product prices and quantities sold. 

Goal: Calculate the total revenue generated for each product by multiplying price by quantity. 


In [1]:
import pandas as pd
import numpy as np

In [2]:
df = pd.read_csv('Quantities sold.csv')
df

|   | Product prices | Quantities sold |
|---|---|---|
| 0 | 10 | 3 |
| 1 | 20 | 6 |
| 2 | 35 | 8 |
| 3 | 30 | 9 |
| 4 | 40 | 5 |
| 5 | 45 | 4 |
| 6 | 42 | 2 |
| 7 | 100 | 1 |
| 8 | 80 | 7 |


In [3]:
# Revenue per product = price * quantity, computed row by row
df['Revenue'] = df['Product prices'] * df['Quantities sold']
df.head()

|   | Product prices | Quantities sold | Revenue |
|---|---|---|---|
| 0 | 10 | 3 | 30 |
| 1 | 20 | 6 | 120 |
| 2 | 35 | 8 | 280 |
| 3 | 30 | 9 | 270 |
| 4 | 40 | 5 | 200 |
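The file `Quantities sold.csv` is not reproduced here, so the following is a minimal sketch that rebuilds the nine rows shown in the first output in memory. It assumes the file holds exactly those values. It computes the per-product revenue and also sums the column to get an overall total. The name `total_revenue` is introduced only for illustration.

```python
import pandas as pd

# Rows copied from the displayed output; the CSV is assumed to contain the same values.
df = pd.DataFrame({
    'Product prices': [10, 20, 35, 30, 40, 45, 42, 100, 80],
    'Quantities sold': [3, 6, 8, 9, 5, 4, 2, 1, 7],
})

# Element-wise multiplication: each row's price times that row's quantity.
df['Revenue'] = df['Product prices'] * df['Quantities sold']

# Summing the per-product revenue column gives the overall revenue.
total_revenue = df['Revenue'].sum()

print(df)
print("Total revenue:", total_revenue)
```

Multiplying two columns works without an explicit loop because pandas aligns them on the index and applies the operation to every row.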



## 2. Extracting Information from Strings 

Scenario: You have a list of email addresses. 

Goal: Extract the username (part before the "@" symbol) from each email address. 

In [8]:
import pandas as pd

# Create a sample DataFrame with email addresses
df = pd.DataFrame({'Email': ['pratibha@gmail','nsti_ald@gmail.com']})

# Extract the usernames
df['username'] = df['Email'].str.split("@").str[0]

print(df)


                Email  username
0      pratibha@gmail  pratibha
1  nsti_ald@gmail.com  nsti_ald
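One caveat with `str.split("@").str[0]` is that a value with no "@" comes back unchanged, so an invalid address looks like a username. The sketch below compares it with `str.extract`, which yields a missing value when the pattern does not match. The third address and the column names `username_split` and `username_extract` are hypothetical, added only to show the difference.

```python
import pandas as pd

# Hypothetical sample; the last entry deliberately has no "@".
emails = pd.DataFrame(
    {'Email': ['pratibha@gmail', 'nsti_ald@gmail.com', 'not-an-email']}
)

# split("@") keeps the whole string when there is no "@" to split on.
emails['username_split'] = emails['Email'].str.split('@').str[0]

# extract() with one capture group returns a missing value
# when the pattern (one or more non-"@" characters followed by "@") does not match.
emails['username_extract'] = emails['Email'].str.extract(r'^([^@]+)@', expand=False)

print(emails)
```

Which behavior is preferable depends on whether malformed addresses should be kept or flagged.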


## 3. Simple Conditional Statements

Scenario: You have a list of product prices and a discount percentage. 

Goal: Write code to calculate the final price for each product after applying the discount. 


In [10]:
def calculate_discounted_price(total_cost):
    """Return total_cost after applying the discount tier it qualifies for."""
    if total_cost >= 500:
        print("You have earned a 15 percent discount.")
        total_cost *= 0.85
    elif total_cost > 100:
        # Note: a total of exactly 100 does not qualify here and falls into the 5 percent tier.
        print("You have earned a 10 percent discount.")
        total_cost *= 0.9
    elif total_cost >= 50:
        print("You have earned a 5 percent discount.")
        total_cost *= 0.95
    else:
        print("Your total is too low to qualify for a discount.")

    return total_cost

# Example usage:
purchase_amount = float(input("Enter the total purchase amount: "))
final_price = calculate_discounted_price(purchase_amount)
print(f"Final price after applying the discount: ${final_price:.2f}")


Enter the total purchase amount:  8


Your total is too low to qualify for a discount.
Final price after applying the discount: $8.00
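The scenario mentions a list of product prices and a single discount percentage, while the cell above applies tiers to one total. As a complementary sketch, the function below applies one percentage to every price in a list. The name `apply_discount`, the sample prices, and the 15 percent rate are all hypothetical.

```python
def apply_discount(prices, discount_percent):
    """Return each price reduced by discount_percent (a value from 0 to 100)."""
    if not 0 <= discount_percent <= 100:
        raise ValueError("discount_percent must be between 0 and 100")
    factor = 1 - discount_percent / 100
    # Round to 2 decimal places so the results read as currency amounts.
    return [round(price * factor, 2) for price in prices]


prices = [10, 20, 35, 100]  # hypothetical product prices
final_prices = apply_discount(prices, 15)
print(final_prices)
```

Validating the percentage up front keeps a value such as 150 from silently producing negative prices.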


## 4. Sorting Data

Scenario: You have a DataFrame containing product information with prices and ratings. 

Goal: Sort the DataFrame by: 

Price (ascending order - cheapest to most expensive). 

Rating (descending order - highest to lowest rating). 


In [11]:
# Product information with prices and ratings.
# Ratings are stored as integers so they sort numerically rather than as text
# (as strings, '10' would sort before '9').
data = [
    ['HH', 84982, 7],
    ['AA', 97981, 4],
    ['NN', 7699, 8],
    ['CC', 7637, 9],
    ['FF', 9448, 5],
    ['II', 5608, 6],
    ['UU', 17199, 8],
]

# Create a pandas DataFrame from the list of rows
df = pd.DataFrame(data, columns=['Brand', 'Price', 'Rating(out of 10)'])
df

|   | Brand | Price | Rating(out of 10) |
|---|---|---|---|
| 0 | HH | 84982 | 7 |
| 1 | AA | 97981 | 4 |
| 2 | NN | 7699 | 8 |
| 3 | CC | 7637 | 9 |
| 4 | FF | 9448 | 5 |
| 5 | II | 5608 | 6 |
| 6 | UU | 17199 | 8 |


In [12]:
# Price (ascending order - cheapest to most expensive).
df.sort_values(by=['Price'], ascending=True)

|   | Brand | Price | Rating(out of 10) |
|---|---|---|---|
| 5 | II | 5608 | 6 |
| 3 | CC | 7637 | 9 |
| 2 | NN | 7699 | 8 |
| 4 | FF | 9448 | 5 |
| 6 | UU | 17199 | 8 |
| 0 | HH | 84982 | 7 |
| 1 | AA | 97981 | 4 |


In [13]:
# Rating (descending order - highest to lowest rating).
df.sort_values(by=['Rating(out of 10)'], ascending=False)

|   | Brand | Price | Rating(out of 10) |
|---|---|---|---|
| 3 | CC | 7637 | 9 |
| 2 | NN | 7699 | 8 |
| 6 | UU | 17199 | 8 |
| 0 | HH | 84982 | 7 |
| 5 | II | 5608 | 6 |
| 4 | FF | 9448 | 5 |
| 1 | AA | 97981 | 4 |
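The two sorts above are independent. If the criteria are meant to be combined, `sort_values` accepts a list of columns with a matching list of directions. The sketch below assumes rating is the primary key and price breaks ties; the names `products` and `ranked` are introduced only for illustration.

```python
import pandas as pd

# Same rows as the example table, with ratings stored as integers.
products = pd.DataFrame(
    [['HH', 84982, 7], ['AA', 97981, 4], ['NN', 7699, 8], ['CC', 7637, 9],
     ['FF', 9448, 5], ['II', 5608, 6], ['UU', 17199, 8]],
    columns=['Brand', 'Price', 'Rating(out of 10)'],
)

# Highest rating first; among equal ratings, cheapest first.
ranked = products.sort_values(
    by=['Rating(out of 10)', 'Price'],
    ascending=[False, True],
)

print(ranked)
```

Note that `sort_values` returns a new DataFrame and leaves the original unchanged unless the result is assigned back.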


## 5. Data Cleaning and Transformation

Scenario: You have a dataset on customer purchases with the following issues: 

Missing values in the "price" column. 

Inconsistent data types (prices might be strings with a currency symbol). 

Inconsistent product names (typos or capitalization). 

Goal: Clean and transform the data using pandas functions.

In [108]:
df = pd.read_csv('customer purchase.csv')
df

|   | Product name | Price |
|---|---|---|
| 0 | apple | 879879.0 |
| 1 | mi | 565.0 |
| 2 | samsung | NaN |
| 3 | nokia | 3453.0 |
| 4 | moto | 908709.0 |
| 5 | vivo | 5463.0 |
| 6 | oppo | 87678.0 |


In [109]:
df.isnull().sum()

Product name    0
Price           1
dtype: int64

In [110]:
df.dtypes

Product name     object
Price           float64
dtype: object

In [113]:
# mean must be called with parentheses; missing values (NaN) are skipped by default.
print(round(df['Price'].mean(), 2))

314291.17

In [114]:
df['Price'] = df['Price'].fillna(df['Price'].mean())

In [115]:
print(df)

  Product name          Price
0        apple  879879.000000
1           mi     565.000000
2      samsung  314291.166667
3        nokia    3453.000000
4         moto  908709.000000
5         vivo    5463.000000
6         oppo   87678.000000


In [116]:
df['Product name'] = df['Product name'].str.capitalize()
df['Product name'] 

0      Apple
1         Mi
2    Samsung
3      Nokia
4       Moto
5       Vivo
6       Oppo
Name: Product name, dtype: object
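In the file loaded above, `Price` is already `float64`, so the scenario's second issue (prices stored as strings with a currency symbol) does not arise. The following is a minimal sketch of how such values might be handled. The DataFrame `raw` and its contents are hypothetical, and the pattern assumes prices use "." as the decimal separator.

```python
import pandas as pd

# Hypothetical raw data showing the issues described in the scenario.
raw = pd.DataFrame({
    'Product name': [' apple', 'SAMSUNG', 'nokia '],
    'Price': ['$1,200', None, 'Rs 3453.50'],
})

# Remove everything except digits and the decimal point, then convert to numbers.
# errors='coerce' turns anything that still cannot be parsed into NaN.
cleaned_price = raw['Price'].str.replace(r'[^\d.]', '', regex=True)
raw['Price'] = pd.to_numeric(cleaned_price, errors='coerce')

# Trim stray whitespace and normalize capitalization.
raw['Product name'] = raw['Product name'].str.strip().str.capitalize()

print(raw)
print(raw.dtypes)
```

After this step the missing price can be filled with `fillna`, as in the cells above.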