Welcome to the review of vectors and matrices. Vectors and matrices provide a foundation for many mathematical operations, and understanding them is essential for exploring and interpreting real-world social data. By the end of this review, I hope you'll be familiar with basic data manipulation and have a feel for how powerful vectors and matrices can be.

### Learning goals ###

Refresh your knowledge of vectors and matrices:

* Vector (column) addition, subtraction, multiplication
* Data merging and deleting
* Data filtering and grouping

# Background Story #

At the end of the year, a trading company has hired you to help calculate its trading results. The company sells to a large number of buyers, each of whom has bought a number of its products. The products fall into three categories: Toys, Decorations and Clothing. For each product, the company recorded its price (Unit_Price, in dollars) and the quantity sold (Sales_Quantity). It also included the price and the quantity sold last year (Unit_Price_Last_Year and Sales_Quantity_Last_Year). The recorded data is stored in the table "data_company_A_external.csv".

Use your knowledge of vector manipulation to help the company answer the following questions.

Note: The data in this file is not real. Actual situations will be more complicated than this.

In [11]:
import base64
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

In [12]:
# Read your data
external_df = pd.read_csv('data_company_A_external.csv')


# Define one variable per column (each is a pandas Series)
unit_price_A_df = external_df['Unit_Price']
sales_quantity_A_df = external_df['Sales_Quantity']
unit_price_last_year_A_df = external_df['Unit_Price_Last_Year']
sales_quantity_last_year_A_df = external_df['Sales_Quantity_Last_Year']

# Bronze medal: Vector (column) addition, multiplication #

a) As a key indicator of this year's trading, the company wants to know its total sales value for the year. The sales value of one product is its unit price ($) times its sales quantity (number of units). The total sales value is the sum of the sales values of all products, which is exactly the dot product of the unit price and sales quantity vectors.

In other words, the equations of sales are:

Sales (for one product) = Unit Price * Sales Quantity

Total Sales = Sum of sales for all products

If V(Unit Price) is the vector of all unit prices and V(Sales Quantity) is the vector of all sales quantities, then total sales (a single number, not a vector) is:

Total Sales = V(Unit Price) · V(Sales Quantity)
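To make this identity concrete, here is a minimal sketch on made-up numbers. The three prices and quantities below are invented for illustration and are not taken from the company's file. It checks that multiplying the two columns element-wise and summing gives the same result as pandas' `Series.dot`.

```python
import pandas as pd

# Hypothetical toy data: three products, values invented for illustration
toy_df = pd.DataFrame({
    "Unit_Price": [2.0, 5.0, 10.0],
    "Sales_Quantity": [10, 4, 1],
})

# Element-wise product gives the sales value of each product
sales_per_product = toy_df["Unit_Price"] * toy_df["Sales_Quantity"]

# Summing the per-product sales ...
total_by_sum = sales_per_product.sum()

# ... matches the dot product of the two columns
total_by_dot = toy_df["Unit_Price"].dot(toy_df["Sales_Quantity"])

print(sales_per_product.tolist())  # [20.0, 20.0, 10.0]
print(total_by_sum, total_by_dot)  # 50.0 50.0
```

Both routes work on whole columns at once, so no explicit loop over products is needed.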

Can you help them calculate their total sales for this year? (Try to operate on the entire column at once rather than on each product separately.)

In [13]:
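# Complete this function to compute the total sales
def compute_total_sales(df):
    # Element-wise product gives the sales value of each product
    sales_per_product = df['Unit_Price'] * df['Sales_Quantity']

    # Summing over all products gives the total sales
    return sales_per_product.sum()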

print("The total sales this year is: {:.2f}".format(compute_total_sales(external_df)))

The total sales this year is: 20348107.40


In [14]:
# Run the following code to show one possible answer
print(base64.b64decode("ZGVmIGNvbXB1dGVfdG90YWxfc2FsZXMoZGYpOgogICAgdG90YWxfc2FsZXMgPSBkZlsnVW5pdF9QcmljZSddLmRvdChkZlsnU2FsZXNfUXVhbnRpdHknXSkKICAgIHJldHVybiB0b3RhbF9zYWxlcwoKVGhlIHRvdGFsIHNhbGVzIHRoaXMgeWVhciBpczogMjAzNDgxMDcuNDA=").decode())

def compute_total_sales(df):
    total_sales = df['Unit_Price'].dot(df['Sales_Quantity'])
    return total_sales

The total sales this year is: 20348107.40


b) The data also contains information from last year, so we can calculate the growth and growth rate of total sales compared to last year.

Growth and growth rate describe how much a particular variable, here total sales, has changed over a period of time. Growth of total sales is calculated by subtracting the total sales of the previous year from the total sales of the current year. It answers the question, "By how much did our sales figures change compared to last year?"

Mathematically, the formula to calculate sales growth is:

Growth = Total Sales This Year - Total Sales Last Year

The growth rate is typically expressed as a percentage of the initial value (here, last year's total sales):

Growth Rate = (Growth / Total Sales Last Year) * 100
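As a quick sanity check of the two formulas, here is a small sketch using invented totals. The helper name `growth_and_rate` and the values 1100 and 1000 are hypothetical, chosen only so the arithmetic is easy to follow.

```python
def growth_and_rate(total_this_year, total_last_year):
    """Return (growth, growth rate in percent) for two yearly totals."""
    growth = total_this_year - total_last_year
    growth_rate = growth / total_last_year * 100
    return growth, growth_rate

# Hypothetical totals, invented for illustration
growth, growth_rate = growth_and_rate(total_this_year=1100.0, total_last_year=1000.0)

print("Growth: {:.2f}".format(growth))              # Growth: 100.00
print("Growth Rate: {:.2f} %".format(growth_rate))  # Growth Rate: 10.00 %
```

A negative growth value would mean sales fell, and the growth rate is undefined if last year's total is zero.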

Can you help them calculate the growth and growth rate?

In [15]:
def compute_growth(df):

    # Sales value of each product, this year and last year
    sales_this_year = df['Unit_Price'] * df['Sales_Quantity']
    sales_last_year = df['Unit_Price_Last_Year'] * df['Sales_Quantity_Last_Year']

    # Summing the per-product differences gives the change in total sales
    growth = (sales_this_year - sales_last_year).sum()

    return growth


def compute_growth_rate(df):

    # Total sales last year (sum of per-product sales)
    total_sales_last_year = (df['Unit_Price_Last_Year'] * df['Sales_Quantity_Last_Year']).sum()

    # Reuse compute_growth for the change in total sales
    growth = compute_growth(df)

    # Growth rate as a percentage of last year's total sales
    growth_rate = (growth / total_sales_last_year) * 100

    return growth_rate

# Compute growth and growth rate once (both are scalars, not dataframes)
growth = compute_growth(external_df)
growth_rate = compute_growth_rate(external_df)

print("Growth: {:.2f}".format(growth))
print("Growth Rate: {:.2f} %".format(growth_rate))

Growth: 137616.03
Growth Rate: 0.68 %
