
# Easy

You are given a list of sales transactions where each transaction is a dictionary containing a transaction ID and a sale amount. Write a function to detect and remove outliers based on the Interquartile Range (IQR) method. Return the cleaned data without outliers.
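
For reference, the IQR rule as it is commonly stated (Tukey's fences, with the conventional multiplier of 1.5) keeps a value $x$ when it satisfies:

```latex
\mathrm{IQR} = Q_3 - Q_1, \qquad
Q_1 - 1.5\,\mathrm{IQR} \;\le\; x \;\le\; Q_3 + 1.5\,\mathrm{IQR}
```

Here $Q_1$ and $Q_3$ are the 25th and 75th percentiles of the column. For the sample amounts below, assuming linear interpolation between data points (the default for `np.percentile`), $Q_1 = 162.5$ and $Q_3 = 287.5$, so $\mathrm{IQR} = 125$ and the bounds are $-25$ and $475$. Only the 4000 transaction falls outside them.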

In [14]:
sales_data = [
    {"transaction_id": 1, "amount": 100},
    {"transaction_id": 2, "amount": 150},
    {"transaction_id": 3, "amount": 200},
    {"transaction_id": 4, "amount": 250},
    {"transaction_id": 5, "amount": 300},
    {"transaction_id": 6, "amount": 4000}  # Outlier
]

In [29]:
import numpy as np
import pandas as pd

In [60]:
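def outlier_remover(records, column):
    """Return the records whose `column` value lies within 1.5 * IQR of the quartiles."""
    df = pd.DataFrame(records)  # Convert list of dictionaries to a DataFrame

    q1 = np.percentile(df[column], 25)  # First quartile
    q3 = np.percentile(df[column], 75)  # Third quartile
    iqr = q3 - q1
    lower_bound = q1 - 1.5 * iqr
    upper_bound = q3 + 1.5 * iqr

    # Keep only the rows inside [lower_bound, upper_bound]
    filtered_df = df[(df[column] >= lower_bound) & (df[column] <= upper_bound)]

    return filtered_df.to_dict(orient='records')  # Convert back to list of dictionaries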

In [61]:
outlier_remover(sales_data, 'amount')

[{'transaction_id': 1, 'amount': 100},
 {'transaction_id': 2, 'amount': 150},
 {'transaction_id': 3, 'amount': 200},
 {'transaction_id': 4, 'amount': 250},
 {'transaction_id': 5, 'amount': 300}]
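
The same filter can be sketched without pandas or NumPy. This is an alternative illustration, not the notebook's method: `remove_outliers_stdlib` and its `k` parameter are hypothetical names. It assumes `statistics.quantiles` with `method="inclusive"`, which interpolates linearly between data points and so should agree with the default behaviour of `np.percentile` on this data.

```python
from statistics import quantiles


def remove_outliers_stdlib(records, key, k=1.5):
    """Drop records whose `key` value lies outside [Q1 - k*IQR, Q3 + k*IQR]."""
    values = [record[key] for record in records]
    # n=4 returns the three quartile cut points; the middle one (the median) is unused.
    q1, _, q3 = quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    lower_bound = q1 - k * iqr
    upper_bound = q3 + k * iqr
    return [record for record in records if lower_bound <= record[key] <= upper_bound]


# Copy of the sample data so the sketch runs on its own.
sales_data = [
    {"transaction_id": 1, "amount": 100},
    {"transaction_id": 2, "amount": 150},
    {"transaction_id": 3, "amount": 200},
    {"transaction_id": 4, "amount": 250},
    {"transaction_id": 5, "amount": 300},
    {"transaction_id": 6, "amount": 4000},
]

cleaned = remove_outliers_stdlib(sales_data, "amount")
print(cleaned)
```

`statistics.quantiles` needs Python 3.8 or later, and on some versions it raises `StatisticsError` when given fewer than two values, so very short inputs may need a guard.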

# Medium

You are given multiple lists of dictionaries, each representing a dataset with records containing an 'id' and 'value'. Write a function to merge these datasets and sort the combined dataset by 'value'.

In [49]:
dataset1 = [
    {"id": 1, "value": 10},
    {"id": 2, "value": 20}
]
dataset2 = [
    {"id": 3, "value": 15},
    {"id": 4, "value": 25}
]
dataset3 = [
    {"id": 5, "value": 5},
    {"id": 6, "value": 30}
]

In [57]:
def merge_and_sort(datasets):
    combined_data = []
    for dataset in datasets:
        combined_data.extend(dataset)

    df = pd.DataFrame(combined_data)  # Convert list of dictionaries to a DataFrame

    sorted_df = df.sort_values(by='value')  # Sort by the 'value' column

    return sorted_df.to_dict(orient='records')  # Convert back to list of dictionaries

In [58]:
merge_and_sort([dataset1, dataset2, dataset3])

[{'id': 5, 'value': 5},
 {'id': 1, 'value': 10},
 {'id': 3, 'value': 15},
 {'id': 2, 'value': 20},
 {'id': 4, 'value': 25},
 {'id': 6, 'value': 30}]
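
For inputs this small, the merge and sort can also be sketched with the standard library alone. This is an illustrative alternative rather than the notebook's approach; `merge_and_sort_stdlib` and its `key` parameter are hypothetical names.

```python
from itertools import chain
from operator import itemgetter


def merge_and_sort_stdlib(datasets, key="value"):
    """Flatten an iterable of record lists and sort the records by `key`."""
    # chain.from_iterable flattens lazily; sorted() is stable, so records
    # with equal keys keep the order in which they appeared in the input.
    return sorted(chain.from_iterable(datasets), key=itemgetter(key))


# Copies of the sample datasets so the sketch runs on its own.
dataset1 = [{"id": 1, "value": 10}, {"id": 2, "value": 20}]
dataset2 = [{"id": 3, "value": 15}, {"id": 4, "value": 25}]
dataset3 = [{"id": 5, "value": 5}, {"id": 6, "value": 30}]

merged = merge_and_sort_stdlib([dataset1, dataset2, dataset3])
print([record["id"] for record in merged])  # [5, 1, 3, 2, 4, 6]
```

Passing the datasets as a single list keeps the function open to any number of inputs.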

# Hard

You are given a list of stock prices, where each price is a dictionary containing a 'date' and 'price'. Write a function to compute the rolling average of stock prices for a given window size and return a list of dictionaries with the 'date' and corresponding rolling average.

In [62]:
stock_prices = [
    {"date": "2024-07-01", "price": 100},
    {"date": "2024-07-02", "price": 110},
    {"date": "2024-07-03", "price": 120},
    {"date": "2024-07-04", "price": 130},
    {"date": "2024-07-05", "price": 140}
]
window_size = 3

In [63]:
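def rolling_average(prices, window_size):
    df = pd.DataFrame(prices)  # Convert list of dictionaries to a DataFrame
    # The first window_size - 1 rows have no full window, so their average is NaN
    df['rolling_average'] = df['price'].rolling(window=window_size).mean()
    return df.to_dict(orient='records')  # Each record keeps 'date' and 'price' too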

In [64]:
rolling_average(stock_prices, window_size)

[{'date': '2024-07-01', 'price': 100, 'rolling_average': nan},
 {'date': '2024-07-02', 'price': 110, 'rolling_average': nan},
 {'date': '2024-07-03', 'price': 120, 'rolling_average': 110.0},
 {'date': '2024-07-04', 'price': 130, 'rolling_average': 120.0},
 {'date': '2024-07-05', 'price': 140, 'rolling_average': 130.0}]
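
The windowing logic can be made explicit with a plain-Python sketch that keeps a running sum over a `collections.deque`. This is an illustrative alternative, not the notebook's method: `rolling_average_stdlib` is a hypothetical name, the records are assumed to be in date order already, and `None` stands in for the `nan` that pandas reports for incomplete windows. Unlike the pandas version, it returns only the 'date' and the rolling average.

```python
from collections import deque


def rolling_average_stdlib(prices, window_size):
    """Return [{'date': ..., 'rolling_average': ...}] using a sliding window."""
    if window_size < 1:
        raise ValueError("window_size must be at least 1")

    window = deque()   # Prices currently inside the window
    running_sum = 0    # Sum of the prices in `window`
    result = []
    for record in prices:
        window.append(record["price"])
        running_sum += record["price"]
        if len(window) > window_size:
            # Slide the window: drop the oldest price from both deque and sum.
            running_sum -= window.popleft()
        # Report an average only once the window is full.
        average = running_sum / window_size if len(window) == window_size else None
        result.append({"date": record["date"], "rolling_average": average})
    return result


# Copy of the sample data so the sketch runs on its own.
stock_prices = [
    {"date": "2024-07-01", "price": 100},
    {"date": "2024-07-02", "price": 110},
    {"date": "2024-07-03", "price": 120},
    {"date": "2024-07-04", "price": 130},
    {"date": "2024-07-05", "price": 140},
]

averages = rolling_average_stdlib(stock_prices, 3)
print(averages)
```

Each step does a constant amount of work regardless of the window size, because the sum is updated incrementally instead of being recomputed for every window.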