Python Weekly Assignment

1. Analyzing Student Performance

A school administrator wants to analyze students' scores from a file that contains
records of students and their exam results in the format name,score. Unfortunately,
sometimes the file might be missing, corrupted, or contain invalid data. Write a
program that reads the file, calculates the average score, and lists students who scored
above average. Ensure proper handling of missing files and malformed data.

In [1]:
import pandas as pd
df=pd.read_csv('student.csv')
df

Unnamed: 0,name,score
0,Ananya,85
1,Sharma,
2,Priya,92
3,Vikram,58
4,Ekakash,88
5,Amitabh,eighty
6,Sakshi,95
7,Arjun,80
8,Kavita,77
9,Sameer,62


In [2]:
def scores(file):
    # Returns (average, above-average rows), or (None, []) when no usable data is found,
    # so callers can always unpack two values.
    try:
        df=pd.read_csv(file)
        # Missing or non-numeric scores become NaN and are dropped below.
        df['score']=pd.to_numeric(df['score'],errors='coerce')
    except FileNotFoundError:
        print("File not found.")
        return None,[]
    except pd.errors.EmptyDataError:
        print("File is empty or corrupted.")
        return None,[]
    except KeyError:
        print("Column 'score' is missing.")
        return None,[]
    except Exception as e:
        print(f"Error: {e}")
        return None,[]

    df=df.dropna(subset=['score'])
    df=df[df['score']>=0]
    if df.empty:
        print("No data.")
        return None,[]

    avg_score=df['score'].mean()
    above_avg=df[df['score']>avg_score]

    print(f"Average Score:{avg_score:.2f}")
    print("Students above average:")
    above_avg_students=above_avg[['name','score']].values
    for student in above_avg_students:
        print(f"{student[0]}:{student[1]}")
    return avg_score,above_avg_students

In [3]:
filename="student.csv"
avg,above_avg=scores(filename)

Average Score:79.27
Students above average:
Ananya:85.0
Priya:92.0
Ekakash :88.0
Sakshi:95.0
Arjun :80.0
Meera:101.0
Pooja:89.0
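
The cleaning steps in this exercise (skip rows with a missing or non-numeric score, then compare each score against the mean) can also be sketched with only the standard library. This is a minimal illustration, assuming a header row of name,score; the function name above_average and the in-memory sample are hypothetical and are not part of the original notebook.

```python
import csv
import io

def above_average(lines):
    """Return (average, [(name, score), ...]) from an iterable of CSV lines."""
    valid = []
    for row in csv.DictReader(lines):
        try:
            score = float(row["score"])
        except (KeyError, TypeError, ValueError):
            continue  # skip rows with a missing or non-numeric score
        if score >= 0:
            valid.append(((row.get("name") or "").strip(), score))
    if not valid:
        return None, []  # nothing usable in the input
    average = sum(score for _, score in valid) / len(valid)
    return average, [(name, score) for name, score in valid if score > average]

# Hypothetical sample with one empty score and one non-numeric score.
sample = io.StringIO(
    "name,score\nAnanya,85\nSharma,\nPriya,92\nAmitabh,eighty\nVikram,58\n"
)
average, top = above_average(sample)
print(f"Average: {average:.2f}")
print(top)
```

Only the three numeric rows contribute to the average here; the rows for Sharma and Amitabh are skipped rather than treated as zero.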


----------------------------------------------------------------------------------------------------
2. Product Availability in a Store

You work for an online store, and you need to help the operations team clean up their
product list. They have a list of product IDs that contains duplicates due to system
errors. Write a function that takes this list, removes duplicates, sorts the product IDs,
and returns the cleaned list. Make sure your function can handle an empty product list
input.

In [4]:
def product(product_ids):
    return sorted(set(product_ids))

In [5]:
product_ids=[102,501,102,305,305,401,101,451,501]  # named product_ids to avoid shadowing the built-in input()
final_ids=product(product_ids)
print(final_ids)

[101, 102, 305, 401, 451, 501]


In [6]:
product_ids=[]  # empty list
final_ids=product(product_ids)
print(final_ids)

[]


In [7]:
product_ids=[400,500,300,100,200,600]  # already unique but unsorted list
final_ids=product(product_ids)
print(final_ids)

[100, 200, 300, 400, 500, 600]


3. Organizing Sales Data

A small business owner has sales data in the form of tuples, each containing the
customer's name and the amount they spent (e.g., ('Alice', 200)). Write a program
that stores this data in a dictionary, where the customer's name is the key and the
amount spent is the value. If a customer appears more than once, update their total
spending. Print the customer data sorted by their names.

In [8]:
def organize_data(sales_data):
    customer_data={}
    for cust,amt in sales_data:
        if cust in customer_data:
            customer_data[cust]+=amt
        else:
            customer_data[cust]=amt
    sorted_cust=sorted(customer_data.items())
    for cust,spend in sorted_cust:
        print(f"{cust}:{spend}")
    return customer_data

In [9]:
sales_data=[('Sindhu',100),('Priya', 300),('Krish',150),('Sindhu',100),('Radhika',500),('Yashna',250)]
customer_data=organize_data(sales_data)

Krish:150
Priya:300
Radhika:500
Sindhu:200
Yashna:250
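
The accumulate-then-sort step can also be written with collections.defaultdict, which removes the explicit membership check. This is a minimal sketch rather than a replacement for the function above; the name total_by_customer is hypothetical.

```python
from collections import defaultdict

def total_by_customer(sales):
    """Sum the amounts per customer and return a dict ordered by customer name."""
    totals = defaultdict(int)  # unseen customers start at 0
    for name, amount in sales:
        totals[name] += amount
    return dict(sorted(totals.items()))

result = total_by_customer([("Sindhu", 100), ("Priya", 300), ("Sindhu", 100)])
print(result)  # {'Priya': 300, 'Sindhu': 200}
```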


4. Saving User Preferences

A mobile app allows users to customize settings like theme (dark/light mode),
language, and notification preferences. Write a program that saves a user's preferences
using the pickle module and retrieves them when needed. Handle cases where the
preferences file is missing or corrupted.

In [10]:
import pickle
def save(pref,filename='preferences.pkl'):
    with open(filename,'wb') as file:
        pickle.dump(pref,file)
def load(filename='preferences.pkl'):
    try:
        with open(filename,'rb') as file:
            pref=pickle.load(file)
    except Exception as e:
        print(f"Error in loading the preferences: {e}")
        pref={}
    return pref

In [11]:
preferences={'theme':'dark','language':'English','notifications':'enabled'}
save(preferences)
user_preferences=load()
if user_preferences:
    print("User preferences:",user_preferences)

User preferences: {'theme': 'dark', 'language': 'English', 'notifications': 'enabled'}
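
The run above only exercises the success path. The sketch below shows how the missing-file and corrupted-file cases might be checked, using a temporary directory and an empty file as a stand-in for a truncated pickle. The names load_preferences and DEFAULTS, and the choice to fall back to default settings, are assumptions made for illustration.

```python
import os
import pickle
import tempfile

# Hypothetical fallback settings used when nothing can be loaded.
DEFAULTS = {"theme": "light", "language": "English", "notifications": "enabled"}

def load_preferences(path):
    """Return the saved preferences, or a copy of DEFAULTS if they cannot be read."""
    try:
        with open(path, "rb") as fh:
            return pickle.load(fh)
    except FileNotFoundError:
        return dict(DEFAULTS)  # no preferences saved yet
    except (pickle.UnpicklingError, EOFError, AttributeError, ImportError, IndexError):
        return dict(DEFAULTS)  # file exists but is truncated or not a valid pickle

with tempfile.TemporaryDirectory() as tmp:
    # Case 1: the file does not exist.
    missing = load_preferences(os.path.join(tmp, "missing.pkl"))

    # Case 2: the file exists but is empty, so pickle.load raises EOFError.
    truncated_path = os.path.join(tmp, "truncated.pkl")
    open(truncated_path, "wb").close()
    truncated = load_preferences(truncated_path)

print(missing)
print(truncated)
```

Only load pickle files that the application wrote itself, since unpickling untrusted data can execute arbitrary code.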


5. Analyzing Employee Salaries

A company’s HR department maintains employee records in a CSV file, which
includes details like employee name, department, and salary. You’ve been tasked with
analyzing this data to calculate the total and average salary per department. Write a
program that reads the CSV using pandas, computes the required data, and saves the
results to a new CSV. Handle situations where the file is missing or contains invalid
data.

In [12]:
import pandas as pd
data=pd.read_csv('employees.csv')
df=data.rename(columns={'JOB_ID':'DEPARTMENT'})
df=df[['FIRST_NAME','LAST_NAME','DEPARTMENT','SALARY']]
df

Unnamed: 0,FIRST_NAME,LAST_NAME,DEPARTMENT,SALARY
0,Donald,OConnell,SH_CLERK,2600
1,Douglas,Grant,SH_CLERK,2600
2,Jennifer,Whalen,AD_ASST,4400
3,Michael,Hartstein,MK_MAN,13000
4,Pat,Fay,MK_REP,6000
5,Susan,Mavris,HR_REP,6500
6,Hermann,Baer,PR_REP,10000
7,Shelley,Higgins,AC_MGR,12008
8,William,Gietz,AC_ACCOUNT,8300
9,Steven,King,AD_PRES,24000


In [13]:
import pandas as pd

file='employees.csv'
final='final.csv'

def employee(file):
    try:
        df=pd.read_csv(file)
        df=df.rename(columns={'JOB_ID':'DEPARTMENT'})  # rename the frame just read, not the global `data`
    except FileNotFoundError:
        print(f"{file} not found.")
        return None
    except Exception as e:
        print(f"Error {e} while reading CSV.")
        return None

    df['SALARY'] = pd.to_numeric(df['SALARY'],errors='coerce')
    data_clean = df.dropna(subset=['SALARY'])

    dept_stats = data_clean.groupby('DEPARTMENT')['SALARY'].agg(['sum', 'mean']).reset_index()
    res = pd.merge(data_clean[['FIRST_NAME','LAST_NAME','DEPARTMENT']],dept_stats,on='DEPARTMENT')
    res = res.rename(columns={'sum':'TOTAL_SALARY', 'mean':'AVG_SALARY'})
    res[['FIRST_NAME', 'LAST_NAME', 'DEPARTMENT', 'TOTAL_SALARY', 'AVG_SALARY']].to_csv(final, index=False)
    return res

In [14]:
file='employees.csv'
results=employee(file)
print(results)

     FIRST_NAME    LAST_NAME  DEPARTMENT  TOTAL_SALARY  AVG_SALARY
0        Donald     OConnell    SH_CLERK          5200      2600.0
1       Douglas        Grant    SH_CLERK          5200      2600.0
2      Jennifer       Whalen     AD_ASST          4400      4400.0
3       Michael    Hartstein      MK_MAN         13000     13000.0
4           Pat          Fay      MK_REP          6000      6000.0
5         Susan       Mavris      HR_REP          6500      6500.0
6       Hermann         Baer      PR_REP         10000     10000.0
7       Shelley      Higgins      AC_MGR         12008     12008.0
8       William        Gietz  AC_ACCOUNT          8300      8300.0
9        Steven         King     AD_PRES         24000     24000.0
10        Neena      Kochhar       AD_VP         34000     17000.0
11          Lex      De Haan       AD_VP         34000     17000.0
12    Alexander       Hunold     IT_PROG         28800      5760.0
13        Bruce        Ernst     IT_PROG         28800      57
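
The result above repeats the department totals on every employee row. If one row per department is preferred, the aggregation alone is enough. The sketch below assumes the same SALARY and DEPARTMENT column names and uses a small hypothetical in-memory sample in place of employees.csv.

```python
import io
import pandas as pd

# Hypothetical in-memory sample standing in for employees.csv.
sample = io.StringIO(
    "FIRST_NAME,DEPARTMENT,SALARY\n"
    "Donald,SH_CLERK,2600\n"
    "Douglas,SH_CLERK,2600\n"
    "Neena,AD_VP,17000\n"
    "Lex,AD_VP,17000\n"
    "Pat,MK_REP,not available\n"
)

df = pd.read_csv(sample)
# Invalid salaries become NaN and are dropped before aggregating.
df["SALARY"] = pd.to_numeric(df["SALARY"], errors="coerce")
summary = (
    df.dropna(subset=["SALARY"])
      .groupby("DEPARTMENT")["SALARY"]
      .agg(TOTAL_SALARY="sum", AVG_SALARY="mean")
      .reset_index()
)
print(summary)
```

Because the only MK_REP row has an invalid salary, that department does not appear in the summary. Calling summary.to_csv with index=False would write the per-department table to a file.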

6. Validating User Signups

Your company’s website allows users to sign up with their email addresses. Write a
Python program that checks if the provided email addresses are valid using regular
expressions. Make sure the emails follow the proper format (e.g.,
username@domain.com). Your program should filter out invalid emails from a given
list of signups.

In [15]:
import re
def isValid_mail(email):
    pattern=r'^[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}$'
    return re.match(pattern,email) is not None

def filter_mails(email_list):
    valid=[]
    invalid=[]
    for i in email_list:
        if isValid_mail(i):
            valid.append(i)
        else:
            invalid.append(i)
    return valid,invalid

In [16]:
emails_to_check=[
    "sindhu@gmail.com",
    "priya@gmail.com",
    "valid.email@ex.com",
    "invalid-email@com",
    "dhoni@domain.co.in",
    "user.virat@domain.com",
    "sharma@domain.c",
    "krishna@domain"
]
valid,invalid=filter_mails(emails_to_check)
print("Valid Emails:")
for k in valid:
    print(k)
print("\nInvalid Emails:")
for k in invalid:
    print(k)

Valid Emails:
sindhu@gmail.com
priya@gmail.com
valid.email@ex.com
dhoni@domain.co.in
user.virat@domain.com

Invalid Emails:
invalid-email@com
sharma@domain.c
krishna@domain


7. Currency Conversion Calculator

You’re building a currency conversion tool for a travel website. The tool should take
two user inputs: the amount to convert and the conversion rate. Implement a program
that handles cases where the user enters invalid data, such as non-numeric input or a
conversion rate of zero, and provides appropriate error messages.

In [17]:
"""try:
    del input
except NameError:
    pass""" 

In [18]:
def currency_convertor():
    try:
        amount = float(input("Enter the amount to convert: "))
        conversion_rate = float(input("Enter the conversion rate: "))
        if amount < 0:
            raise ValueError("Amount cannot be negative.")
        if conversion_rate == 0:
            raise ValueError("Conversion rate cannot be zero.")
        converted_amount = amount * conversion_rate
        print(f"Converted amount: {converted_amount:.2f}")
    except ValueError as e:
        print(f"Error: {e}")

In [19]:
currency_convertor()

Enter the amount to convert: 100
Enter the conversion rate: 1.4
Converted amount: 140.00


In [20]:
currency_convertor()

Enter the amount to convert: -50
Enter the conversion rate: 1.1
Error: Amount cannot be negative.


In [21]:
currency_convertor()

Enter the amount to convert: abc
Error: could not convert string to float: 'abc'
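
The runs above do not exercise the zero-rate case. One way to check every branch without typing at a prompt is to move the validation into a function that takes the raw strings as arguments. This is a sketch under that assumption; the name convert and the wording of the messages are illustrative.

```python
def convert(amount_text, rate_text):
    """Return (converted_amount, None) on success or (None, message) on failure."""
    try:
        amount = float(amount_text)
        rate = float(rate_text)
    except (TypeError, ValueError):
        return None, "Amount and rate must be numbers."
    if amount < 0:
        return None, "Amount cannot be negative."
    if rate == 0:
        return None, "Conversion rate cannot be zero."
    return amount * rate, None

# Each pair is (amount, rate) exactly as a user might type it.
for args in [("100", "1.4"), ("-50", "1.1"), ("100", "0"), ("abc", "1.1")]:
    print(args, "->", convert(*args))
```

An interactive wrapper can then pass the two input() strings to convert and print whichever element of the result is not None.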


8. Movie Ratings Aggregation

A movie streaming service collects user ratings for movies. Each movie can be rated
on a scale of 1 to 10. Write a program that takes a list of movie ratings and uses list
comprehension to filter out ratings below 5 (bad ratings) and return a new list of good
ratings squared. Handle cases where no ratings are provided.

In [22]:
def ratings(movies):
    return [(name,rating**2) for name,rating in movies if rating>=5]

In [23]:
movie_ratings=[('Bahubali',10),('Welcome',4),('Hello',5),('RRR',9),('Eega',8),('Titanic',8),('ArjunReddy',5),('Happy',2),('Ready',3)]
good_ratings_squared=ratings(movie_ratings)
if good_ratings_squared:
    print("Good ratings squared:",good_ratings_squared)
else:
    print("No ratings provided.")

Good ratings squared: [('Bahubali', 100), ('Hello', 25), ('RRR', 81), ('Eega', 64), ('Titanic', 64), ('ArjunReddy', 25)]
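
The check above prints "No ratings provided." both when the input list is empty and when every rating is below 5. A sketch that separates the two cases follows; the function names good_ratings_squared and report, and the second message, are assumptions for illustration.

```python
def good_ratings_squared(movies):
    """Square ratings of 5 or more; `movies` may be None or empty."""
    return [(name, rating ** 2) for name, rating in (movies or []) if rating >= 5]

def report(movies):
    """Describe the outcome, separating 'no input' from 'no good ratings'."""
    if not movies:
        return "No ratings provided."
    good = good_ratings_squared(movies)
    if not good:
        return "No ratings of 5 or above."
    return f"Good ratings squared: {good}"

print(report([]))
print(report([("Happy", 2), ("Ready", 3)]))
print(report([("RRR", 9), ("Happy", 2)]))
```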


9. Extracting Contact Information

A company stores client data in text files, and some of the records contain phone
numbers in inconsistent formats, such as (123) 456-7890 or 123-456-7890. Write a
program that reads a text file, uses regular expressions to extract all phone numbers in
either format, and prints the list of valid phone numbers.

In [24]:
import re

phone_pattern =r'(\+?\d{1,3}[- ]?)?(\(\d{3}\)[- ]?\d{3}[- ]?\d{4}|\d{3}[- ]?\d{3}[- ]?\d{4})'

with open('input.txt') as file:
    phone_num=[''.join(match) for k in file for match in re.findall(phone_pattern,k)]
for num in phone_num:
    print(num)

456-789-1234
123-4567890
+44-123-456-7890
123 456 7890
+91 9876543210
(123) 456-7890
987-654-3210
+1 234 567 8901
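
The pattern above also accepts country codes and space-separated numbers. If only the two formats named in the task should match, a stricter pattern can be used. The sketch below runs on a hypothetical in-memory string rather than input.txt, and the names STRICT_PHONE and text are illustrative.

```python
import re

# Matches only "(123) 456-7890" or "123-456-7890".
STRICT_PHONE = re.compile(r"\(\d{3}\) \d{3}-\d{4}(?!\d)|\b\d{3}-\d{3}-\d{4}\b")

# Hypothetical sample text standing in for the contents of input.txt.
text = "Call (123) 456-7890 or 123-456-7890; ignore 12345 and 123-4567."

numbers = STRICT_PHONE.findall(text)
print(numbers)  # ['(123) 456-7890', '123-456-7890']
```

Because the pattern has no capturing groups, findall returns the full matched strings, so no join step is needed.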


10. Removing Duplicate User Data

A loyalty program has a list of customer records, each stored as a tuple with the
customer’s name and email address (e.g., ('John Doe', 'john@example.com')).
Due to an import error, some customers are listed multiple times. Write a Python
program that removes duplicate entries using a set and prints the unique list of
customers.

In [25]:
def remove_duplicate(cust):
    # A set keeps one copy of each tuple; the order of the result is arbitrary.
    return list(set(cust))

In [26]:
cust = [
    ('Ram Nithin','ramn@gmail.com'),
    ('Priya Chandru','priyachandru@yahoo.com'),
    ('July','july@yahoo.com'), 
    ('Ram Nithin','ramn@gmail.com'),
    ('Kousalya','kousalya@gmail.com'),
    ('Ram Nithin','ramn@gmail.com') 
]
unique_customers=remove_duplicate(cust)
print("Unique customers:")
for i in unique_customers:
    print(i)

Unique customers:
('Kousalya', 'kousalya@gmail.com')
('Priya Chandru', 'priyachandru@yahoo.com')
('July', 'july@yahoo.com')
('Ram Nithin', 'ramn@gmail.com')
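
As the output shows, converting to a set does not keep the original order of the records. If first-seen order matters, a set can still do the duplicate check while a list collects the results. This is a small sketch; the name unique_in_order is hypothetical.

```python
def unique_in_order(records):
    """Drop repeated records while keeping the first occurrence of each."""
    seen = set()   # records already emitted
    unique = []
    for record in records:
        if record not in seen:
            seen.add(record)
            unique.append(record)
    return unique

customers = [
    ("Ram Nithin", "ramn@gmail.com"),
    ("July", "july@yahoo.com"),
    ("Ram Nithin", "ramn@gmail.com"),
]
print(unique_in_order(customers))
```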


11. Product Inventory Analysis

Your company manages product inventory through a CSV file that contains product
ID, name, and quantity available. Write a program using pandas to filter products
with low stock (less than 10 units). Handle potential issues like a missing or
malformed CSV file, or missing columns in the data.

In [27]:
import pandas as pd
file=pd.read_csv('product.csv')
file

Unnamed: 0,product_id,name,quantity
0,101,Apple iPhone 13,5
1,102,Samsung Galaxy S21,8
2,103,Google Pixel 6,
3,104,OnePlus 9 Pro,12
4,105,Xiaomi Mi 11,-5
5,106,Nokia XR20,abc
6,107,Oppo Find X3,7
7,108,Vivo X60 Pro,15
8,109,Huawei P40 Pro,
9,110,Sony Xperia 5 II,9


In [28]:
import pandas as pd
def products(file):
    try:
        df = pd.read_csv(file)
        # Missing or non-numeric quantities become NaN and fail both comparisons below.
        df['quantity'] = pd.to_numeric(df['quantity'], errors='coerce')
        low_stock = df[(df['quantity']<10) & (df['quantity']>=0)]
        return low_stock[['product_id', 'name', 'quantity']]
    except FileNotFoundError:
        print("File not found")
    except pd.errors.EmptyDataError:
        print("The CSV file is empty")
    except pd.errors.ParserError:
        print("The CSV file is malformed")
    except KeyError:
        print("Columns are missing in data.")

In [29]:
file = 'product.csv'
result=products(file)
if result is not None:
    print(result)

    product_id                   name  quantity
0          101        Apple iPhone 13       5.0
1          102     Samsung Galaxy S21       8.0
6          107           Oppo Find X3       7.0
9          110       Sony Xperia 5 II       9.0
11         112            Dell XPS 13       3.0
16         117  Microsoft Surface Pro       1.0
17         118         Razer Blade 15       0.0
18         119       MSI GS66 Stealth       9.0
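
A KeyError reports only the first missing column it hits. Checking the header up front makes it possible to name every missing column at once. The sketch below assumes the three column names shown in the data above and reads from a hypothetical in-memory sample; the names REQUIRED and low_stock_items are illustrative.

```python
import io
import pandas as pd

REQUIRED = {"product_id", "name", "quantity"}

def low_stock_items(source, threshold=10):
    """Return rows with 0 <= quantity < threshold; raise ValueError if columns are missing."""
    df = pd.read_csv(source)
    missing = REQUIRED - set(df.columns)
    if missing:
        raise ValueError(f"Missing columns: {sorted(missing)}")
    # Invalid quantities become NaN, which fails both comparisons below.
    df = df.assign(quantity=pd.to_numeric(df["quantity"], errors="coerce"))
    return df[(df["quantity"] >= 0) & (df["quantity"] < threshold)]

# Hypothetical sample: one valid low-stock row plus empty, text, large, and negative values.
sample = "product_id,name,quantity\n101,A,5\n102,B,\n103,C,abc\n104,D,12\n105,E,-5\n"
low = low_stock_items(io.StringIO(sample))
print(low)
```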


12. Statistical Analysis for a Sports Team

A sports analyst wants to analyze the performance statistics of players on a team.
Each player’s performance over the season is recorded as an array of scores. Write a
program that generates a large array of player scores using numpy, and calculates the
mean, median, variance, and standard deviation of the players’ performance.

In [30]:
import numpy as np

scores=np.random.randint(0,500,size=1000)

mean_score=np.mean(scores)
median_score=np.median(scores)
variance_score=np.var(scores)
std_dev_score=np.std(scores)

print(f"Mean Score: {mean_score:.2f}")
print(f"Median Score: {median_score:.2f}")
print(f"Variance: {variance_score:.2f}")
print(f"Standard Deviation: {std_dev_score:.2f}")


Mean Score: 256.30
Median Score: 256.50
Variance: 20399.76
Standard Deviation: 142.83
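
Two details can affect the numbers above: the scores change on every run because no seed is set, and np.var and np.std default to the population formulas (ddof=0). The sketch below uses NumPy's Generator interface with a fixed seed and compares the two variance conventions; the seed value 42 is an arbitrary choice.

```python
import numpy as np

rng = np.random.default_rng(seed=42)       # fixed seed so reruns give the same scores
scores = rng.integers(0, 500, size=1000)   # integers in the half-open range [0, 500)

population_var = np.var(scores)            # ddof=0: divide by n
sample_var = np.var(scores, ddof=1)        # ddof=1: divide by n - 1

print(f"Population variance: {population_var:.2f}")
print(f"Sample variance:     {sample_var:.2f}")
```

With 1000 values the two results differ by a factor of 1000/999, so the choice matters little here but becomes noticeable for small arrays.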


---------------------------------------------------------------------------------------------------
13. Managing Task Lists

A task management system allows users to create and store to-do lists. Write a Python
program that stores a user's list of tasks using pickle, allowing them to save and
retrieve their tasks later. Ensure proper exception handling if the data file becomes
corrupted or is missing.

In [31]:
import pickle

def save_tasks(tasks, filename='tasks.pkl'):
    with open(filename,'wb') as file:
        pickle.dump(tasks, file)
def load_tasks(filename='tasks.pkl'):
    try:
        with open(filename,'rb') as file:
            return pickle.load(file)
    except Exception as e:
        print(f"Error loading tasks:{e}")
        return []

In [32]:
tasks = ['Study','Take Nap','Buy groceries','Homework','Complete project', 'Exercise']
save_tasks(tasks)
user_tasks = load_tasks()
print("Loaded tasks:", user_tasks)

Loaded tasks: ['Study', 'Take Nap', 'Buy groceries', 'Homework', 'Complete project', 'Exercise']


---------------------------------------------------------------------------------------------------
14. Social Media Post Analysis

A social media platform needs to analyze hashtags used in posts. Write a Python
program that extracts all unique hashtags from a given post using regular expressions.
Ensure that the hashtags only contain letters and numbers (e.g., #Python3) and print
them in a sorted list.

In [33]:
import re
def extract_hashtags(post):
    hashtags=re.findall(r'#([a-zA-Z0-9]+)',post)
    unique=sorted(set(hashtags),key=lambda x:x.lower())    
    return unique

In [34]:
post="I am working at CHUBB,This is python training-Guvi! #training #coding #Python #Guvi"
hashtags=extract_hashtags(post)
print("Unique hashtags:",hashtags)

Unique hashtags: ['coding', 'Guvi', 'Python', 'training']
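
The pattern above would turn a tag such as #data_science into data, because matching simply stops at the underscore. If tags containing an underscore should be rejected entirely, a negative lookahead can require that the tag ends at a non-word character. This is a sketch under that assumption; the name strict_hashtags and the sample post are hypothetical.

```python
import re

def strict_hashtags(post):
    """Return unique letter-and-digit hashtags, sorted case-insensitively."""
    # (?!\w) rejects tags that continue with a letter, digit, or underscore.
    tags = re.findall(r"#([A-Za-z0-9]+)(?!\w)", post)
    # The second sort key makes the order deterministic for tags differing only in case.
    return sorted(set(tags), key=lambda tag: (tag.lower(), tag))

post = "Learning #Python3 and #python3 with #data_science at #Guvi #Guvi"
print(strict_hashtags(post))  # ['Guvi', 'Python3', 'python3']
```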
