<a href="https://colab.research.google.com/github/prof-rossetti/intro-to-python/blob/master/exercises/monthly-sales-reporting/Monthly_Sales_Reporting_Exercise.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Learning Objectives

  + Practice processing CSV files in Python.
  + Practice using the Pandas package, so you can add that skill to your resume.
  + Practice automating a real-life business process.



# Business Prompt

Assume you own and operate a successful small business, selling artisan clothing products through an online platform like Amazon, Etsy, or eBay.

![hoodie for sale on etsy](https://user-images.githubusercontent.com/1328807/51781151-cb7a5300-20e2-11e9-863f-3b82aaa5f5a9.png)

Due to certain investment interests in your business, you are obligated to hold monthly board meetings with your investors and advisors to discuss the company's strategic direction, set priorities, and allocate resources. To aid the board's decision-making processes, they expect you to provide them with a monthly summary report of business insights, including the aggregation of total sales and the identification of top-selling products.

Luckily, at the end of each calendar month, the online platform your company uses to sell its products makes a CSV file available for download from its admin interface. This file represents all individual sales orders for that month.

But usually you only have a few hours from when the data becomes available to when you are required to turn it around and share the resulting insights with the board. Over the past few months, this report compilation process has been somewhat stressful and vulnerable to manual error. And the board's confidence in your operational leadership depends not only on your ability to meet sales goals, but also on your ability to provide them with accurate and timely information.

So your objective is to create a tool that automates the process of transforming monthly sales data into a report of business insights.

# Instructions

Write Python code to process any of the provided monthly sales CSV files to produce the desired information outputs.

Start by developing against the "sales-201803.csv" file, and check your answers using that file. Then, once you are done, you should be able to simply update the selected filename to point to a different monthly sales file, and after re-running the notebook, it should automatically process that month's sales.


> NOTE: You can assume any of these monthly CSV files will have a file name resembling "sales-YYYYMM.csv" (where "YYYY" represents the four digit year and "MM" represents the zero-padded month). 

> NOTE: You can assume each of these CSV files will have the same header row (`date`, `product`, `unit price`, `units sold`, `sales price`).


## Basic Requirements

  1. Display a human-friendly representation of the selected month and year, based on which file was selected (e.g. "Selected Month: March 2018")
  2. Display the total sales for that month, formatted as USD with a dollar sign and two decimal places (e.g. "Total Monthly Sales: \$12,000.71")
  3. All price-related information should be formatted as USD, with a dollar sign and two decimal places. 


```
SELECTED MONTH: MARCH 2018
TOTAL SALES: $12,000.71
```
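
Because the file name encodes the month, the human-friendly label in the first requirement can be derived from it. The following is a minimal sketch of one possible approach, using only the standard library. The helper name `month_label` is hypothetical and not part of the exercise, and the sketch assumes the "sales-YYYYMM.csv" naming pattern described in the notes above.

```python
from datetime import datetime

def month_label(csv_filename):
    """Convert a name like 'sales-201803.csv' into a label like 'March 2018'."""
    # Strip the "sales-" prefix and ".csv" suffix to isolate the "YYYYMM" part
    yyyymm = csv_filename.replace("sales-", "").replace(".csv", "")
    # Parse the six digits as a four-digit year and a zero-padded month
    parsed = datetime.strptime(yyyymm, "%Y%m")
    # %B is the full month name (English under the default locale)
    return parsed.strftime("%B %Y")

print("Selected Month:", month_label("sales-201803.csv"))
```

Parsing the name with `datetime` rather than a hand-written lookup table of month names means a malformed file name raises a `ValueError` instead of silently producing a wrong label.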

## Further Exploration Challenge


Also write Python code to determine which products are the top-sellers, and print the total sales for each product:

```
TOP SELLING PRODUCTS:
  1. Button-Down Shirt: $6,960.35
  2. Super Soft Hoodie: $1,875.00
  3. etc.
```
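
One way to approach both the monthly total and the per-product ranking is sketched below. It runs against a few made-up rows held in memory (not the real sales figures), assumes only the header row described in the notes above, and uses illustrative variable names such as `sales_df` and `product_totals`. Treat it as a starting point rather than the expected solution.

```python
from io import StringIO
import pandas as pd

# Made-up rows for illustration only; the real files share this header row
sample_csv = StringIO(
    "date,product,unit price,units sold,sales price\n"
    "2018-03-01,Sticker Pack,4.50,2,9.00\n"
    "2018-03-01,Baseball Cap,22.33,1,22.33\n"
    "2018-03-02,Sticker Pack,4.50,4,18.00\n"
)
sales_df = pd.read_csv(sample_csv)

# Summing the "sales price" column (a Series) gives the monthly total
total_sales = sales_df["sales price"].sum()

# Group rows by product, sum each group's sales, and sort largest first
product_totals = (
    sales_df.groupby("product")["sales price"]
    .sum()
    .sort_values(ascending=False)
)

print(f"TOTAL SALES: ${total_sales:,.2f}")
print("TOP SELLING PRODUCTS:")
for rank, (product, total) in enumerate(product_totals.items(), start=1):
    print(f"  {rank}. {product}: ${total:,.2f}")
```

In the notebook itself, `pd.read_csv(csv_filename)` would replace the in-memory sample, and the `to_usd` function from the Setup section could replace the inline format specifiers.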

To test the accuracy of your program's calculations, compare its "March 2018" results against the table of expected values below:

Product | Sum of sales price
--- | ---
Button-Down Shirt | \$6,960.35
Super Soft Hoodie | \$1,875.00
Khaki Pants | \$1,602.00
Vintage Logo Tee | \$941.05
Brown Boots | \$250.00
Sticker Pack | \$216.00
Baseball Cap | \$156.31
**Total Monthly Sales** | **\$12,000.71**




# References

You may find the following reference material helpful in completing this exercise:

  + [`pandas.read_csv()`](https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.read_csv.html)
  + [`pandas.DataFrame`](https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.html):
    + [`pandas.DataFrame.groupby()`](https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.groupby.html)
  + [`pandas.Series`](https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.Series.html)


> FYI - The basic requirements can be achieved using the Pandas `DataFrame` and `Series` datatypes. However, the Further Exploration Challenge requires using the `GroupBy` datatype for pivot-table-style aggregation.





# Setup

In [None]:
#
# SETUP
#

# DOWNLOAD FILES

import os

def filepath_for(url):
    return url.split("/")[-1] 

MONTHS = ["201710", "201711", "201712", "201801", "201802", "201803", "201804"]

for month in MONTHS:
    data_url = f"https://raw.githubusercontent.com/prof-rossetti/data-analytics-in-python/main/data/unit-2/monthly-sales/sales-{month}.csv"
    csv_filepath = filepath_for(data_url)
    print("CSV FILEPATH:", csv_filepath)
    if not os.path.isfile(csv_filepath):
        print("DOWNLOADING", data_url)
        # FYI: this wget command is a terminal command, NOT python
        # ... in colab, we can execute terminal commands by prefixing them with an exclamation point
        # ... students are not responsible for knowing terminal commands like this
        !wget -q $data_url 


# HERE IS A PRICE FORMATTING FUNCTION FOR YOU TO USE LATER

def to_usd(my_price):
    """
    Converts a numeric value to usd-formatted string, for printing and display purposes.
    
    Param: my_price (int or float) like 4000.444444
    
    Example: to_usd(4000.444444)
    
    Returns: $4,000.44
    """
    return f"${my_price:,.2f}" #> $4,000.44


CSV FILEPATH: sales-201710.csv
CSV FILEPATH: sales-201711.csv
CSV FILEPATH: sales-201712.csv
CSV FILEPATH: sales-201801.csv
CSV FILEPATH: sales-201802.csv
CSV FILEPATH: sales-201803.csv
CSV FILEPATH: sales-201804.csv


# Implementation


In [None]:
#
# FILE SELECTION
#
# ... start by leaving this cell as-is, and write the desired Python code in the cells below
# ... then come back later and update this filename as desired, and re-run the notebook to process the selected month

csv_filename = "sales-201803.csv" # OR ANY ONE OF THE OTHER VALID FILES


In [None]:

#
# BASIC REQUIREMENTS
#
# todo: process the monthly sales CSV file!








In [None]:

#
# FURTHER EXPLORATION CHALLENGE
#
# todo: further process the monthly sales CSV file!





