# Assignment 2: Building a Smart Data Aggregator

##### This notebook implements Python functions for handling user data, transaction data, sets, and dictionaries, as required by the assignment.

# Part 1: User Data Processing with Lists

We will write functions to:

1. Keep only users older than 30 who are from the USA or Canada.
2. Extract their names into a new list.
3. Sort the list by age and return the top 10 oldest users.
4. Check for duplicate names in the list.







In [9]:
#Sample Data
users = [
    (1, "Anna", 28, "USA"),
    (2, "Brian", 34, "Canada"),
    (3, "Chris", 22, "UK"),
    (4, "Daniel", 40, "USA"),
    (5, "Ella", 29, "Germany"),
    (6, "Ella", 33, "Canada"),
    (7, "Gloria", 35, "USA"),
    (8, "Hannah", 31, "USA"),
    (9, "Brian", 34, "USA"),    
    (10, "Chris", 22, "Canada"),
    (11, "Iris", 36, "Canada"),
    (12, "Chris", 50, "Canada") 
]

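# 1. Filter users older than 30 from the USA or Canada and extract their names
def filter_users(users):
    names = []
    # Each user is a tuple of (id, name, age, country)
    for _user_id, name, age, country in users:
        if age > 30 and country in ('USA', 'Canada'):
            names.append(name)
    return names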
# 2. Sort by age and return the top 10 oldest users
def top_10_old(users):
    sorted_users = sorted(users, key=lambda x: x[2], reverse=True)

    return sorted_users[:10]

# 3. Check for duplicate names
def find_duplicate(users):
    seen = set()
    duplicates = []
    for user in users:
        name = user[1]
        if name in seen:
            # Record each repeated name only once
            if name not in duplicates:
                duplicates.append(name)
        else:
            seen.add(name)
    return duplicates

# Testing the functions
print("Filtered users:", filter_users(users))
print("Top 10 oldest users:", top_10_old(users))
print("Duplicate names:", find_duplicate(users))

Filtered users: ['Brian', 'Daniel', 'Ella', 'Gloria', 'Hannah', 'Brian', 'Iris', 'Chris']
Top 10 oldest users: [(12, 'Chris', 50, 'Canada'), (4, 'Daniel', 40, 'USA'), (11, 'Iris', 36, 'Canada'), (7, 'Gloria', 35, 'USA'), (2, 'Brian', 34, 'Canada'), (9, 'Brian', 34, 'USA'), (6, 'Ella', 33, 'Canada'), (8, 'Hannah', 31, 'USA'), (5, 'Ella', 29, 'Germany'), (1, 'Anna', 28, 'USA')]
Duplicate names: ['Ella', 'Brian', 'Chris']
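The task list can also be read as one pipeline: filter first, then sort the remaining users by age, then keep the names. The sketch below follows that reading and uses `collections.Counter` for the duplicate check. The function names `oldest_eligible_names` and `duplicate_names` and the shortened `sample_users` list are hypothetical and only illustrate the idea; they are not part of the assignment.

```python
from collections import Counter

# Shortened, hypothetical sample in the same (id, name, age, country) layout.
sample_users = [
    (1, "Anna", 28, "USA"),
    (2, "Brian", 34, "Canada"),
    (3, "Chris", 22, "UK"),
    (4, "Daniel", 40, "USA"),
    (6, "Ella", 33, "Canada"),
    (9, "Brian", 34, "USA"),
]

def oldest_eligible_names(users, limit=10):
    """Filter, sort by age (oldest first), then keep only the names."""
    eligible = [u for u in users if u[2] > 30 and u[3] in ("USA", "Canada")]
    eligible.sort(key=lambda u: u[2], reverse=True)
    return [u[1] for u in eligible[:limit]]

def duplicate_names(users):
    """Return names that occur more than once, in first-seen order."""
    counts = Counter(u[1] for u in users)
    return [name for name, n in counts.items() if n > 1]

print(oldest_eligible_names(sample_users))  # ['Daniel', 'Brian', 'Brian', 'Ella']
print(duplicate_names(sample_users))        # ['Brian']
```

Unlike `top_10_old`, which ranks every user, this variant ranks only the users that pass the filter, so which one fits depends on how the assignment intends step 3.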


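# Part 2: Immutable Data Management with Tuples

Each transaction is stored as a tuple of `(transaction_id, user_id, amount, timestamp)`. We will:

1. Count the number of unique users.
2. Find the transaction with the highest amount.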
3. Separate the transaction_ids and user_ids into two lists.

In [10]:
#Sample Data
transactions = [
    (101, 1, 250.0, "2023-10-01 10:00"),
    (102, 2, 175.5, "2023-10-01 11:30"),
    (103, 1, 300.0, "2023-10-02 09:45"),
    (104, 3, 450.0, "2023-10-02 10:15"),
    (105, 4, 120.0, "2023-10-03 14:30"),
    (106, 2, 520.0, "2023-10-03 16:45"),
    (107, 5, 100.0, "2023-10-04 08:20"),
]

# 1. Count unique users
def count_unique_users(transactions):
    user_ids = {t[1] for t in transactions}
    return len(user_ids)

# 2. Find the transaction with the highest amount
def highest_transaction(transactions):
    return max(transactions, key=lambda x: x[2])

# 3. Separate transaction_ids and user_ids

def extract_transaction_and_user_ids(transactions):
    transaction_ids = [t[0] for t in transactions]
    user_ids = [t[1] for t in transactions]
    return transaction_ids, user_ids

# Testing the functions
print("Unique users: ", count_unique_users(transactions))
print("Highest transaction:", highest_transaction(transactions))
print("Transaction IDs and User IDs:", extract_transaction_and_user_ids(transactions))


Unique users:  5
Highest transaction: (106, 2, 520.0, '2023-10-03 16:45')
Transaction IDs and User IDs: ([101, 102, 103, 104, 105, 106, 107], [1, 2, 1, 3, 4, 2, 5])
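Because every transaction tuple has the same fixed layout, the columns can also be split in a single pass with `zip(*rows)`, and a `namedtuple` can give the positions readable names. The sketch below is one possible alternative using the first three sample rows; the `Transaction` type and its field names are assumptions made for illustration.

```python
from collections import namedtuple

# Hypothetical record type; field names are assumed from the sample layout.
Transaction = namedtuple(
    "Transaction", ["transaction_id", "user_id", "amount", "timestamp"]
)

sample = [
    Transaction(101, 1, 250.0, "2023-10-01 10:00"),
    Transaction(102, 2, 175.5, "2023-10-01 11:30"),
    Transaction(103, 1, 300.0, "2023-10-02 09:45"),
]

# zip(*rows) transposes rows into columns; each column comes back as a tuple.
transaction_ids, user_ids, amounts, timestamps = zip(*sample)

largest = max(sample, key=lambda t: t.amount)
unique_user_count = len(set(user_ids))

print(transaction_ids)    # (101, 102, 103)
print(user_ids)           # (1, 2, 1)
print(largest.transaction_id, largest.amount)
print(unique_user_count)  # 2

# Tuples are immutable: assigning to a field raises AttributeError.
try:
    sample[0].amount = 0.0
except AttributeError as exc:
    print("cannot modify:", exc)
```

Note that `zip` returns tuples rather than lists, so wrap each column in `list(...)` if the assignment requires lists.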


# Part 3: Unique Data Handling with Sets
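This part manages sets of unique user IDs for visitors to different pages. The sample data below defines only `page_a` and `page_c`, so `page_c` stands in for Page B in tasks 1 and 4. We will: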

1. Find users who visited both Page A and Page B.
2. Find users who visited either Page A or Page C but not both.
3. Update the set for Page A with new user IDs.
4. Remove a list of user IDs from Page B.

In [11]:
#Sample Data
page_a = {1, 2, 3, 4, 5, 6, 7, 8, 9, 10}
page_c = {5, 6, 11, 12, 13, 14, 15, 16, 17, 18}

# 1. Users who visited both pages (page_c stands in for Page B)
def find_common_users(page_a, page_c):
    return page_a.intersection(page_c)

# 2. Users who visited either Page A or Page C but not both
def find_unique_users(page_a, page_c):
    unique_users = (page_a - page_c) | (page_c - page_a)
    return unique_users
# 3. Update Page A with new user IDs
def update_page_a(page_a, new_users):
    page_a.update(new_users)  # update() mutates in place and returns None
    return page_a

# 4. Remove users from Page C (standing in for Page B)
def remove_from_page_c(page_c, remove_users):
    page_c.difference_update(remove_users)
    return page_c

# Testing the functions
new_users = {11, 12}
users_to_remove = {11}
print("Users who visited both Page A and Page C:", find_common_users(page_a, page_c))
print("Users who visited either Page A or Page C but not both:", find_unique_users(page_a, page_c))
print("Updated Page A:", update_page_a(page_a, new_users))
print("Updated Page C:", remove_from_page_c(page_c, users_to_remove))


Users who visited both Page A and Page C: {5, 6}
Users who visited either Page A or Page C but not both: {1, 2, 3, 4, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18}
Updated Page A: {1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12}
Updated Page C: {5, 6, 12, 13, 14, 15, 16, 17, 18}
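The same four operations can also be written with set operators, which return new sets and leave the inputs untouched. The sketch below is a possible alternative rather than the required solution; the variable names `both`, `either_not_both`, `page_a_extended`, and `page_c_trimmed` are hypothetical.

```python
page_a = {1, 2, 3, 4, 5, 6, 7, 8, 9, 10}
page_c = {5, 6, 11, 12, 13, 14, 15, 16, 17, 18}

both = page_a & page_c             # same result as page_a.intersection(page_c)
either_not_both = page_a ^ page_c  # same result as page_a.symmetric_difference(page_c)

# Non-mutating counterparts of update() and difference_update()
page_a_extended = page_a | {11, 12}
page_c_trimmed = page_c - {11}

# The symmetric difference matches the two-step expression used above.
assert either_not_both == (page_a - page_c) | (page_c - page_a)

print(sorted(both))             # [5, 6]
print(sorted(page_a_extended))  # [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12]
print(sorted(page_c_trimmed))   # [5, 6, 12, 13, 14, 15, 16, 17, 18]
```

The in-place methods are the better fit when the stored sets themselves must change; the operators are the safer choice when the original sets are still needed afterwards.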


# Part 4: Data Aggregation with Dictionaries
In this part, we will handle feedback data stored in dictionaries. The tasks are:

1. Filter users with a rating of 4 or higher.
2. Sort the dictionary by rating and return the top 5 users.
3. Combine feedback from multiple dictionaries.
4. Use dictionary comprehension to filter users with a rating greater than 3.
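One detail is worth keeping in mind for task 3: each feedback value is itself a dictionary, so `dict.copy()` duplicates only the outer mapping and the inner dictionaries stay shared. The sketch below shows the difference on a hypothetical one-entry dictionary; the variable names are made up for illustration.

```python
import copy

# Hypothetical one-entry feedback dictionary in the same shape as the sample data.
original = {2: {"rating": 4, "comments": "Very good experience."}}

# dict.copy() copies the outer mapping only; the inner dicts are shared.
shallow = original.copy()
shallow[2]["rating"] = 1
rating_after_shallow_edit = original[2]["rating"]  # the original changed too

original[2]["rating"] = 4  # restore before the second experiment

# copy.deepcopy() duplicates the nested dicts as well.
deep = copy.deepcopy(original)
deep[2]["rating"] = 1
rating_after_deep_edit = original[2]["rating"]  # the original is untouched

print(rating_after_shallow_edit)  # 1
print(rating_after_deep_edit)     # 4
```

If the combined result is built from a shallow copy, later calls that read the original feedback may see the merged ratings and comments instead of the original ones.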

In [12]:
#Sample Data
feedback = {
    1: {'rating': 5, 'comments': 'Excellent service!'},
    2: {'rating': 4, 'comments': 'Very good experience.'},
    3: {'rating': 2, 'comments': 'Not satisfied.'},
    4: {'rating': 5, 'comments': 'Loved it!'},
    5: {'rating': 3, 'comments': 'It was okay.'},
    6: {'rating': 4, 'comments': 'Really enjoyed my time.'},
    7: {'rating': 1, 'comments': 'Very disappointing.'},
    8: {'rating': 5, 'comments': 'Would recommend to everyone!'},
    9: {'rating': 4, 'comments': 'Great atmosphere.'},
    10: {'rating': 3, 'comments': 'Average, nothing special.'},
    11: {'rating': 4, 'comments': 'Had a great time!'},
    12: {'rating': 2, 'comments': 'Won’t be back.'},
    13: {'rating': 5, 'comments': 'Absolutely fantastic!'},
    14: {'rating': 4, 'comments': 'Very pleasant experience.'},
    15: {'rating': 3, 'comments': 'Decent, but could be better.'},
}

# Filter users with a rating of 4 or higher
def filter_high_ratings(feedback):
    high_ratings = {}  
    for user_id, details in feedback.items():
        if details['rating'] >= 4:
            high_ratings[user_id] = details['rating']
    return high_ratings 

# Sort feedback by rating (highest first) and return the top 5 entries
def sort_by_rating(feedback):
    sorted_feedback = sorted(feedback.items(), key=lambda x: x[1]['rating'], reverse=True)
    return dict(sorted_feedback[:5])
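
# Combine feedback from multiple dictionaries
import copy

def combine_feedback(dict1, dict2):
    # Deep-copy so the nested rating/comment dicts of dict1 are not mutated
    combined = copy.deepcopy(dict1)
    for user_id, info in dict2.items():
        if user_id in combined:
            # Keep the higher rating and join the comments
            combined[user_id]['rating'] = max(combined[user_id]['rating'], info['rating'])
            combined[user_id]['comments'] += " / " + info['comments']
        else:
            # Copy the entry so dict2 is not aliased in the result
            combined[user_id] = dict(info)
    return combined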

    
# Dictionary comprehension for rating > 3
def ratings_above_3(feedback):
    return {user_id: details['rating']
            for user_id, details in feedback.items()
            if details['rating'] > 3}

# Testing the functions
print("Users with rating 4 or higher:", filter_high_ratings(feedback))
print("Top 5 feedback:", sort_by_rating(feedback))

feedback2 = {
    2: {'rating': 3, 'comments': 'Average service.'},
    3: {'rating': 4, 'comments': 'Really enjoyed my time.'},
}
print("Combined feedback:", combine_feedback(feedback, feedback2))
print("Users with rating above 3:", ratings_above_3(feedback))


Users with rating 4 or higher: {1: 5, 2: 4, 4: 5, 6: 4, 8: 5, 9: 4, 11: 4, 13: 5, 14: 4}
Top 5 feedback: {1: {'rating': 5, 'comments': 'Excellent service!'}, 4: {'rating': 5, 'comments': 'Loved it!'}, 8: {'rating': 5, 'comments': 'Would recommend to everyone!'}, 13: {'rating': 5, 'comments': 'Absolutely fantastic!'}, 2: {'rating': 4, 'comments': 'Very good experience.'}}
Combined feedback: {1: {'rating': 5, 'comments': 'Excellent service!'}, 2: {'rating': 4, 'comments': 'Very good experience. / Average service.'}, 3: {'rating': 4, 'comments': 'Not satisfied. / Really enjoyed my time.'}, 4: {'rating': 5, 'comments': 'Loved it!'}, 5: {'rating': 3, 'comments': 'It was okay.'}, 6: {'rating': 4, 'comments': 'Really enjoyed my time.'}, 7: {'rating': 1, 'comments': 'Very disappointing.'}, 8: {'rating': 5, 'comments': 'Would recommend to everyone!'}, 9: {'rating': 4, 'comments': 'Great atmosphere.'}, 10: {'rating': 3, 'comments': 'Average, nothing special.'}, 11: {'rating': 4, 'comments': 'Had