<div align="center">
    <img src="project_icon.jpg" alt="Debugging code" width="50%">
</div>


In today’s fast-paced, data-driven world, companies rely heavily on accurate analyses of sales data to make informed decisions. However, the quality of the insights derived from such analyses depends significantly on the integrity of the underlying code.

You have been given some starting code with two functions: one that sums the `quantity_ordered` column and another that averages the `price_each` column. The company plans to use your revised code to improve the accuracy of sales analytics.

Your task is to identify potential errors in the functions and the underlying data that might result in logic and runtime errors, such as missing values, incorrect data types, or incompatible values (e.g., negatives). Enhance the custom functions provided by implementing exceptions to catch data quality issues and edge cases.

The data for the analyses is available in "sales_data_sample.csv". It has 25 columns, but only two are analyzed: `quantity_ordered` and `price_each`. A sample of the data is shown below.


In [110]:
# Import libraries
import pandas as pd

# Load data
sales_df = pd.read_csv("data/sales_data_sample.csv")
sales_df.head()

Unnamed: 0,order_number,quantity_ordered,price_each,order_line_number,sales,order_date,status,qtr_id,month_id,year_id,product_line,msrp,product_code,customer_name,phone,address_line1,address_line2,city,state,postal_code,country,territory,contact_last_name,contact_first_name,deal_size
0,10107,30,95.7,2,2871.0,2/24/2003 0:00,Shipped,1,2,2003,Motorcycles,95,S10_1678,Land of Toys Inc.,2125557818,897 Long Airport Avenue,,NYC,NY,10022.0,USA,,Yu,Kwai,Small
1,10121,34,81.35,5,2765.9,5/7/2003 0:00,Shipped,2,5,2003,Motorcycles,95,S10_1678,Reims Collectables,26.47.1555,59 rue de l'Abbaye,,Reims,,51100.0,France,EMEA,Henriot,Paul,Small
2,10134,-41,94.74,2,3884.34,7/1/2003 0:00,Shipped,3,7,2003,Motorcycles,95,S10_1678,Lyon Souveniers,+33 1 46 62 7555,27 rue du Colonel Pierre Avia,,Paris,,75508.0,France,EMEA,Da Cunha,Daniel,Medium
3,10145,45,,6,3746.7,8/25/2003 0:00,Shipped,3,8,2003,Motorcycles,95,S10_1678,Toys4GrownUps.com,6265557265,78934 Hillside Dr.,,Pasadena,CA,90003.0,USA,,Young,Julie,Medium
4,10159,49,100.0,14,5205.27,10/10/2003 0:00,Shipped,4,10,2003,Motorcycles,95,S10_1678,Corporate Gift Ideas Co.,6505551386,7734 Strong St.,,San Francisco,CA,,USA,,Brown,Julie,Medium
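
The five rows above already show two of the issues the task describes: a negative `quantity_ordered` (row 2) and a missing `price_each` (row 3). Below is a minimal sketch of how such rows might be counted. It uses a small hand-built frame that copies only the two analyzed columns from the sample output; the name `sample_df` is illustrative and not part of the notebook.

```python
import numpy as np
import pandas as pd

# Hand-copied from the first five rows shown above (hypothetical helper frame).
sample_df = pd.DataFrame(
    {
        "quantity_ordered": [30, 34, -41, 45, 49],
        "price_each": [95.7, 81.35, 94.74, np.nan, 100.0],
    }
)

# Count missing values per column.
missing_counts = sample_df.isna().sum()

# Count rows where the ordered quantity is negative.
negative_quantity_count = int((sample_df["quantity_ordered"] < 0).sum())

print(missing_counts)
print("Negative quantities:", negative_quantity_count)
```

Running the same two checks on the full `sales_df` would show how widespread these issues are before any function is called.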


In [111]:
# Identify errors and add exceptions to the `get_quantity_ordered_sum()` function
def get_quantity_ordered_sum(sales_quantity_ordered):
    """Calculates the total sum on the 'quantity_ordered' column.

    Args:
        sales_quantity_ordered (pd.core.series.Series): The pandas Series for the 'quantity_ordered' column.

    Returns:
        total_quantity_ordered (int): The total sum of the 'quantity_ordered' column.
    """

    try:
        total_quantity_ordered = 0
        for quantity in sales_quantity_ordered:
            total_quantity_ordered += quantity
        return total_quantity_ordered
    except TypeError as error:
        raise error
    except Exception as error:
        raise error
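
The `try`/`except` blocks above re-raise whatever they catch, so they do not yet flag data quality issues on their own. One possible way to make the checks explicit is sketched below. The function name `get_quantity_ordered_sum_checked` and the choice of which conditions raise `TypeError` versus `ValueError` are assumptions for illustration, not requirements from the task.

```python
import pandas as pd
from pandas.api.types import is_numeric_dtype


def get_quantity_ordered_sum_checked(sales_quantity_ordered):
    """Sum 'quantity_ordered', raising on the issues named in the task."""
    # Wrong container type, e.g. a list or a DataFrame.
    if not isinstance(sales_quantity_ordered, pd.Series):
        raise TypeError("Expected a pandas Series for 'quantity_ordered'.")
    # Non-numeric data, e.g. strings read from a badly formatted file.
    if not is_numeric_dtype(sales_quantity_ordered):
        raise TypeError("'quantity_ordered' must have a numeric dtype.")
    # Missing values would otherwise be silently skipped by sum().
    if sales_quantity_ordered.isna().any():
        raise ValueError("'quantity_ordered' contains missing values.")
    # Negative quantities are treated here as incompatible values.
    if (sales_quantity_ordered < 0).any():
        raise ValueError("'quantity_ordered' contains negative values.")
    return int(sales_quantity_ordered.sum())


# Usage on a small, clean Series.
print(get_quantity_ordered_sum_checked(pd.Series([30, 34, 45])))  # 109
```

Raising early keeps the caller in control: it can filter the bad rows and call again, rather than receiving a total that quietly includes them.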

In [112]:
# Identify errors and add exceptions to the `get_price_each_average()` function
def get_price_each_average(sales_price_each, num_places=2):
    """Calculates the average on the 'price_each' column
       using pandas built in methods and rounds to the desired number of places.

    Args:
        sales_price_each (pd.core.series.Series): The pandas Series for the 'price_each' column.
        num_places (int): The number of decimal places to round.

    Returns:
        average_price_each (float): The average of the 'price_each' column.
    """
    
    try:
        total_of_price_each = sales_price_each.sum()
        len_of_price_each = len(sales_price_each)
        average_price_each = round(
            total_of_price_each / len_of_price_each, num_places
        )
        return average_price_each
    except TypeError as error:
        raise error
    except Exception as error:
        raise error
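
One detail worth checking in the function above: `Series.sum()` skips missing values by default, while `len()` counts every row, so any NaN in the input lowers the computed average. A sketch of a stricter variant follows. The name `get_price_each_average_checked` and the specific exceptions raised are assumptions for illustration only.

```python
import pandas as pd
from pandas.api.types import is_numeric_dtype


def get_price_each_average_checked(sales_price_each, num_places=2):
    """Average 'price_each', raising on the issues named in the task."""
    if not isinstance(sales_price_each, pd.Series):
        raise TypeError("Expected a pandas Series for 'price_each'.")
    if not is_numeric_dtype(sales_price_each):
        raise TypeError("'price_each' must have a numeric dtype.")
    # An empty Series has no meaningful average.
    if sales_price_each.empty:
        raise ValueError("'price_each' is empty.")
    # Reject NaN so the numerator and denominator cover the same rows.
    if sales_price_each.isna().any():
        raise ValueError("'price_each' contains missing values.")
    if (sales_price_each < 0).any():
        raise ValueError("'price_each' contains negative values.")
    return round(float(sales_price_each.mean()), num_places)


# Usage on a small, clean Series.
print(get_price_each_average_checked(pd.Series([10.0, 20.0, 30.0])))  # 20.0
```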

# Debug Zone

In [113]:
#  Add as many cells as you require 
total_quantity_ordered = get_quantity_ordered_sum(sales_df[sales_df['quantity_ordered']>=0]['quantity_ordered'])
print(total_quantity_ordered)

98985


In [114]:
cleaned_price_each = sales_df[sales_df['price_each'] != ' ']['price_each'].astype(float)
average_price_each = get_price_each_average(cleaned_price_each)
print(average_price_each)

83.66
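
The filter in the cell above compares `price_each` against the string `' '`. In the sample output the missing `price_each` appears as an empty field, which `pd.read_csv` loads as NaN by default, so that comparison may not remove it. A hedged alternative is to coerce the column to numeric and then drop both NaN and negative values. The sketch below runs on a small made-up Series (`raw_price_each` is hypothetical) rather than on `sales_df`.

```python
import pandas as pd

# Made-up values covering the cases of interest: a float, a blank string,
# a missing value, a numeric string, and a negative price.
raw_price_each = pd.Series([95.7, " ", None, "81.35", -5.0], dtype="object")

# Anything that cannot be parsed as a number becomes NaN.
numeric_price_each = pd.to_numeric(raw_price_each, errors="coerce")

# Keep only rows that are present and non-negative.
is_valid = numeric_price_each.notna() & (numeric_price_each >= 0)
cleaned_price_each_sketch = numeric_price_each[is_valid]

print(cleaned_price_each_sketch.tolist())
```

Applied to `sales_df['price_each']`, the same pattern would give the averaging function an input whose length matches the number of values actually summed.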


In [115]:
sales_df['quantity_ordered']
# sales_df[sales_df['price_each'].notna()]['price_each']

0       30
1       34
2      -41
3       45
4       49
        ..
2818    20
2819    29
2820    43
2821    34
2822    47
Name: quantity_ordered, Length: 2823, dtype: int64