<center><h1>Smart Data Aggregator</h1></center>

***


<center><h2>PART:1 User Data Processing with lists.</h2></center>
The input is user information in the form of a list of tuples. Each tuple
represents one user in the format: (user_id, user_name, age, country).

---


#### Function to return the names of users aged 30 or older who live in Canada or the USA.


In [1]:
def FilterUserByAgeAndCountries(users):
    lst = []
    for each in users:
        if each[2] >= 30 and each[3] in ('Canada','USA'):
            lst.append(each[1])
    return lst
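
The positional indexes used above (`each[2]` for age, `each[3]` for country) can be replaced with named fields for readability. The following is a minimal sketch, assuming a hypothetical `User` namedtuple and a hypothetical `filter_users` helper with `min_age` and `countries` parameters; none of these names are part of the original notebook.

```python
from collections import namedtuple

# Hypothetical record type mirroring the (user_id, user_name, age, country) layout.
User = namedtuple("User", ["user_id", "user_name", "age", "country"])

def filter_users(users, min_age=30, countries=("Canada", "USA")):
    """Return the names of users at least `min_age` old who live in `countries`."""
    return [
        u.user_name
        for u in map(User._make, users)  # wrap each plain tuple in a User
        if u.age >= min_age and u.country in countries
    ]

sample = [(1, "Alice", 25, "USA"), (3, "Charlie", 40, "USA"), (10, "Ivan", 30, "Mexico")]
result = filter_users(sample)
```

With the defaults shown, this applies the same age and country conditions as the function above; the thresholds are exposed as parameters only for illustration.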

#### Function to return the 10 oldest users as (name, age) pairs.

In [2]:
def top10AgedUsers(users):
    sorted_users = sorted(users, key=lambda x:x[2], reverse=True)
    sorted_users = [each[1:3] for each in sorted_users]
    return sorted_users[:10]
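
When only the top few entries of a large list are needed, the standard library's `heapq.nlargest` is one possible alternative to sorting the whole list. This is a sketch under that assumption; the helper name `top_n_oldest` and its `n` parameter are hypothetical.

```python
import heapq

def top_n_oldest(users, n=10):
    """Return (name, age) pairs for the n oldest users, oldest first."""
    # nlargest picks the n records with the largest age (index 2).
    oldest = heapq.nlargest(n, users, key=lambda u: u[2])
    return [(name, age) for _, name, age, _ in oldest]

sample = [(1, "Alice", 25, "USA"), (2, "Bob", 40, "UK"), (3, "Cara", 33, "Canada")]
top_two = top_n_oldest(sample, n=2)
```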
    

#### Function to return every name shared by more than one user.


In [3]:
from collections import defaultdict
def DuplicateNamedUsers(users):
    d = defaultdict(int)
    duplicates = []
    for each in users: 
        d[each[1]] += 1
    for i,j in d.items():
        if j > 1: duplicates += [i]
    return duplicates
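
The same counting step can also be written with `collections.Counter`. A minimal sketch, assuming the same 4-field tuple layout; the name `duplicate_names` is hypothetical.

```python
from collections import Counter

def duplicate_names(users):
    """Return names that occur more than once, in order of first appearance."""
    # Count how many times each user_name (second field) occurs.
    counts = Counter(name for _, name, _, _ in users)
    return [name for name, count in counts.items() if count > 1]

sample = [
    (1, "Alice", 25, "USA"),
    (2, "Bob", 25, "Canada"),
    (3, "Charlie", 40, "USA"),
    (8, "Alice", 33, "Canada"),
    (11, "Charlie", 34, "India"),
]
dupes = duplicate_names(sample)
```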



### Testing Part 1 on sample data.

In [4]:
users = [
    (1, "Alice", 25, "USA"),
    (2, "Bob", 25, "Canada"),
    (3, "Charlie", 40, "USA"),
    (4, "David", 28, "UK"),
    (5, "Eve", 32, "Canada"),
    (6, "Frank", 29, "USA"),
    (7, "Grace", 45, "USA"),
    (8, "Alice", 33, "Canada"),
    (9, "Hank", 55, "USA"),
    (10, "Ivan", 30, "Mexico"),
    (11, "Charlie", 34, "India")
]



In [5]:
FilterUserByAgeAndCountries(users)


['Charlie', 'Eve', 'Grace', 'Alice', 'Hank']

In [6]:
top10AgedUsers(users)

[('Hank', 55),
 ('Grace', 45),
 ('Charlie', 40),
 ('Charlie', 34),
 ('Alice', 33),
 ('Eve', 32),
 ('Ivan', 30),
 ('Frank', 29),
 ('David', 28),
 ('Alice', 25)]

In [7]:
DuplicateNamedUsers(users)


['Alice', 'Charlie']

---

<center><h2>PART:2 Immutable Data Management with Tuples.</h2></center>
This part manages transaction data as immutable tuples, each in the format (transaction_id, user_id, amount, timestamp).

---


#### Function to return the number of unique user IDs involved in the transactions

In [8]:
def UniqueUserID(transactions):
    ids = [each[1] for each in transactions]
    return len(set(ids))
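
If the IDs themselves are needed rather than their count, a set comprehension returns them directly. This is a small sketch; the name `unique_user_ids` is hypothetical.

```python
def unique_user_ids(transactions):
    """Return the set of distinct user IDs (second field of each transaction)."""
    return {t[1] for t in transactions}

sample = (
    (1001, 1, 250.0, "2024-09-01"),
    (1002, 2, 180.0, "2024-09-02"),
    (1003, 1, 300.0, "2024-09-03"),
)
ids = unique_user_ids(sample)
```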


#### Function to return the transaction with the maximum amount.


In [9]:
def MaxAmountTransaction(transactions):
    return max(transactions, key=lambda x:x[2])
    

#### Function to return separate lists of transaction IDs and user IDs
Challenge: if the tuples are not all the same size, indexing may raise an IndexError.



In [10]:
def SeparateIds(transactions):
    transaction_ids = [transaction[0] for transaction in transactions]
    user_ids = [transaction[1] for transaction in transactions]
    return transaction_ids, user_ids
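
One way to handle the challenge noted above is to check each tuple's length before indexing. The sketch below assumes that malformed records should simply be skipped, which is a design choice the original does not specify; the name `separate_ids_safe` is hypothetical.

```python
def separate_ids_safe(transactions):
    """Split transactions into ID lists, skipping tuples too short to index."""
    transaction_ids, user_ids = [], []
    for t in transactions:
        if len(t) < 2:
            continue  # assumption: ignore records missing transaction_id or user_id
        transaction_ids.append(t[0])
        user_ids.append(t[1])
    return transaction_ids, user_ids

# The middle record is deliberately malformed.
sample = ((1001, 1, 250.0), (1002,), (1003, 2))
t_ids, u_ids = separate_ids_safe(sample)
```

Raising a descriptive error instead of skipping would be equally reasonable when malformed data should not pass silently.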
    

### Testing Part 2 on sample data.

In [11]:
transactions = (
    (1001, 1, 250.0, "2024-09-01"),
    (1002, 2, 180.0, "2024-09-02"),
    (1003, 1, 300.0, "2024-09-03"),
    (1004, 3, 400.0, "2024-09-04"),
    (1005, 2, 500.0, "2024-09-05")
)

UniqueUserID(transactions)

In [12]:
MaxAmountTransaction(transactions)

(1005, 2, 500.0, '2024-09-05')

In [13]:
SeparateIds(transactions)

([1001, 1002, 1003, 1004, 1005], [1, 2, 1, 3, 2])

---

<center> <h2>PART:3 Unique Data Handling with Sets</h2> </center>
The Smart Data Aggregator also manages sets of unique user IDs who visited certain
pages. You have three sets, each representing user IDs of visitors to pages A, B, and C.

---

#### Function to find users who visited both Page A and Page B

In [14]:
def VisitedBoth(page_a, page_b):
    return page_a & page_b

#### Function to find users who visited either Page A or Page C, but not both

In [15]:
def VisitedOne(page_a, page_c):
    return page_a ^ page_c
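
The `^` operator computes the symmetric difference: the elements in exactly one of the two sets. The short sketch below, using made-up sample sets, checks that it matches the union minus the intersection and the equivalent `symmetric_difference` method.

```python
page_a = {1, 2, 3, 4}
page_c = {3, 4, 5}

# Users who appear in exactly one of the two sets.
only_one = page_a ^ page_c

# Equivalent formulations of the same result.
assert only_one == (page_a | page_c) - (page_a & page_c)
assert only_one == page_a.symmetric_difference(page_c)
```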

#### Function to update the set for Page A with new user IDs.


In [16]:
def UpdatePageA(page_a, new_user_ids):
    page_a.update(new_user_ids)

#### Function to remove a list of user IDs from the set for Page B.


In [17]:
def RemoveFromPage(page_b, user_ids_to_remove):
    page_b.difference_update(user_ids_to_remove) 
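
Both `update` and `difference_update` modify the set in place and return `None`, so the caller's set changes. If the original set must stay untouched, non-mutating variants can be sketched as follows; the names `with_added` and `with_removed` are hypothetical.

```python
def with_added(page, new_user_ids):
    """Return a new set containing the page's IDs plus the new ones."""
    return page | set(new_user_ids)

def with_removed(page, user_ids_to_remove):
    """Return a new set with the given IDs removed."""
    return page - set(user_ids_to_remove)

page = {1, 2, 3}
bigger = with_added(page, [100, 102])
smaller = with_removed(page, [2, 99])  # IDs not present are ignored
```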


### Testing Part 3 on sample data.


In [18]:
a = {1, 2, 3,  4, 5, 6, 7}
b = {4, 5, 6, 7, 8}
c  = {5,6, 7, 8, 9, 10, 11}


In [19]:
VisitedBoth(a, b)


{4, 5, 6, 7}

In [20]:
VisitedOne(a, c)

{1, 2, 3, 4, 8, 9, 10, 11}

In [21]:
UpdatePageA(a,{100,102})
a

{1, 2, 3, 4, 5, 6, 7, 100, 102}

In [22]:
RemoveFromPage(a, {100,102})
a

{1, 2, 3, 4, 5, 6, 7}

---

<center> <h2>PART:4   Data Aggregation with Dictionaries</h2> </center>
The aggregator collects user feedback stored in a dictionary. The dictionary uses the
user_id as keys, and the values are nested dictionaries with feedback details.

---

#### Function to filter users who rated 4 or higher

In [32]:
def Rated4orAbove(feedback):
    sol = {}
    for i,j in feedback.items():
        # Default to 0 so entries without a rating are skipped instead of raising TypeError.
        if j.get('rating', 0) >= 4:
            sol[i] = j['rating']
    return sol

#### Function to sort feedback by rating and return the top 5 users

In [33]:
def Top5RatedUsers(feedback):
    sorted_feedback = sorted(feedback.items(), key=lambda x:x[1]['rating'])
    sorted_feedback = sorted_feedback[::-1]
    return sorted_feedback[:5]
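
Sorting in ascending order and then reversing is not identical to passing `reverse=True` when ratings tie. Python's sort is stable, so `reverse=True` keeps tied entries in their original order, while reversing afterwards flips them. A small sketch with made-up data:

```python
feedback = {
    1: {'rating': 5},
    3: {'rating': 4},
    5: {'rating': 5},
}

def by_rating(item):
    return item[1]['rating']

# Ascending sort followed by a reversal: tied users come out in reverse insertion order.
reversed_ids = [uid for uid, _ in sorted(feedback.items(), key=by_rating)[::-1]]

# reverse=True: tied users keep their insertion order.
descending_ids = [uid for uid, _ in sorted(feedback.items(), key=by_rating, reverse=True)]
```

Here `reversed_ids` is `[5, 1, 3]` and `descending_ids` is `[1, 5, 3]`, so the choice matters only for the order of equal ratings.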

#### Function to combine feedback from multiple dictionaries

In [34]:
def CombineFeedback(*feedback_dicts):
    combined= {}

    for each in feedback_dicts:
        for ids, details in each.items():
            if ids in combined:
                combined[ids]['rating'] = max(combined[ids]['rating'], details['rating'])
                combined[ids]['comments'] += " ," + details['comments']
            else:
                # Store a copy so later merges do not modify the caller's dictionaries.
                combined[ids] = dict(details)

    return combined
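
When merging nested dictionaries, storing a reference to an input's inner dictionary means that later in-place updates also change that input. The sketch below, using made-up data, contrasts a shared reference with a shallow copy made by `dict(...)`.

```python
# Case 1: the merged dictionary shares the inner dict with the input.
original = {5: {'rating': 5, 'comments': 'Outstanding'}}
aliased = {5: original[5]}
aliased[5]['comments'] += " ,Very Good"   # also changes original[5]

# Case 2: the merged dictionary holds its own copy of the inner dict.
fresh = {5: {'rating': 5, 'comments': 'Outstanding'}}
copied = {5: dict(fresh[5])}
copied[5]['comments'] += " ,Very Good"    # fresh[5] is left unchanged
```

A shallow copy is enough here because the inner values are plain numbers and strings; deeper nesting would call for `copy.deepcopy`.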
    

#### Dictionary comprehension to create a dictionary of user_id and rating for users with rating > 3

In [35]:
def RatingAbove3(feedback):
    return {user_id: details['rating'] for user_id, details in feedback.items() if details['rating'] > 3}

### Testing for Part 4.

In [36]:
feedback = {
    1: {'rating': 5, 'comments': 'Excellent'},
    2: {'rating': 3, 'comments': 'Good'},
    3: {'rating': 4, 'comments': 'Very Good'},
    4: {'rating': 2, 'comments': 'Average'},
    5: {'rating': 5, 'comments': 'Outstanding'}
}

feedback2 = {
    7: {'rating': 5, 'comments': 'Excellent'},
    6: {'rating': 3, 'comments': 'Good'},
    5: {'rating': 4, 'comments': 'Very Good'},

}

In [37]:
Rated4orAbove(feedback)

{1: 5, 3: 4, 5: 5}

In [38]:
Top5RatedUsers(feedback)

[(5, {'rating': 5, 'comments': 'Outstanding'}),
 (1, {'rating': 5, 'comments': 'Excellent'}),
 (3, {'rating': 4, 'comments': 'Very Good'}),
 (2, {'rating': 3, 'comments': 'Good'}),
 (4, {'rating': 2, 'comments': 'Average'})]

In [39]:
CombineFeedback(feedback,feedback2)

{1: {'rating': 5, 'comments': 'Excellent'},
 2: {'rating': 3, 'comments': 'Good'},
 3: {'rating': 4, 'comments': 'Very Good'},
 4: {'rating': 2, 'comments': 'Average'},
 5: {'rating': 5, 'comments': 'Outstanding ,Very Good'},
 7: {'rating': 5, 'comments': 'Excellent'},
 6: {'rating': 3, 'comments': 'Good'}}

In [40]:
RatingAbove3(feedback)

{1: 5, 3: 4, 5: 5}

----