# Analysing Retail Sales

In this project we take on the role of an analyst for a chain of retail stores. We are tasked with creating a study that explores the chain's performance over the past few years across all categories.

## Preparing the Data

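Not all the data is available in a single dataset. Most of it is in the `orders` worksheet, but some is in the `people` and `returns` sheets. We used `VLOOKUP()` to merge the tables, with the following formulas:

- `=VLOOKUP([@region],people,2,FALSE)`
- `=IFERROR(VLOOKUP([@[order_id]],returns,2,FALSE),FALSE)`

The first formula returns the second column of the `people` table for each order's region. The second does the same against `returns` using `order_id`, and falls back to `FALSE` when the order is not found there.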

![image.png](attachment:image.png)
![image-2.png](attachment:image-2.png)

![image-3.png](attachment:image-3.png)
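For readers who think in code, the two lookups can be sketched in plain Python. This is only an illustration of the exact-match and fallback behaviour: the order IDs are made up, and treating the second column of `returns` as a returned flag is an assumption, since the workbook's columns are not reproduced here.

```python
# Hypothetical sample data; the order IDs are invented for illustration.
people = {"East": "Chuck Magee", "Central": "Kelly Williams"}  # region -> person
returns = {"ORD-002": True}  # order_id -> returned flag (assumed column meaning)

orders = [
    {"order_id": "ORD-001", "region": "East"},
    {"order_id": "ORD-002", "region": "Central"},
]


def merge_orders(orders, people, returns):
    """Attach the person and returned flag to each order row."""
    merged = []
    for order in orders:
        row = dict(order)
        # Like VLOOKUP(..., FALSE): exact match only, raises KeyError if missing.
        row["person"] = people[order["region"]]
        # Like IFERROR(VLOOKUP(...), FALSE): default to False when not found.
        row["returned"] = returns.get(order["order_id"], False)
        merged.append(row)
    return merged


merged = merge_orders(orders, people, returns)
for row in merged:
    print(row)
```

The `dict.get` default plays the role of `IFERROR`: orders absent from `returns` are treated as not returned rather than producing an error.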

## Order Dates: Basic Statistics

Let's now familiarise ourselves with the dates and statistics provided in the dataset at a high level. Note that the profit margin is calculated as follows:

$$\text{overall profit margin} = \frac{\text{total profits}}{\text{total sales}}$$

![image.png](attachment:image.png)
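As a small illustration of the formula above, the sketch below computes the overall profit margin as a ratio of totals and contrasts it with the average of per-row margins, which is generally a different number. The row values are hypothetical.

```python
def overall_profit_margin(rows):
    """Total profit divided by total sales, as in the formula above."""
    total_sales = sum(r["sales"] for r in rows)
    total_profit = sum(r["profit"] for r in rows)
    return total_profit / total_sales


# Hypothetical rows, not taken from the dataset.
rows = [
    {"sales": 100.0, "profit": 10.0},
    {"sales": 300.0, "profit": 90.0},
]

margin = overall_profit_margin(rows)
# Averaging each row's own margin weights small and large orders equally.
mean_of_row_margins = sum(r["profit"] / r["sales"] for r in rows) / len(rows)

print(f"overall margin: {margin:.2%}")
print(f"mean of row margins: {mean_of_row_margins:.2%}")
```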

## Profit Composition by Category and Segment

Focusing on the last full year of recorded sales, let's explore the relative makeup of sales by category and segment.

![image.png](attachment:image.png)

Observations:

- There are 3 segments, with the 'Consumer' segment contributing the most to profit.
- For 'Furniture', the 'Chairs' subcategory is the most significant driver of sales. For 'Office Supplies' it is 'Binders', and for 'Technology' it is 'Phones'.

## Sales by Category Over Time

Let's now look at how these categories performed over time. To do this we'll create another PivotTable, displaying the sum of sales by category and year, summarised as a percentage difference from the previous year:

![image.png](attachment:image.png)

2016 had the biggest growth in sales, with technology showing the highest growth in that year. Looking at the sum of sales as a percentage of the column total:

![image.png](attachment:image.png)

During the year of highest growth (2016), the technology category made up 37.16% of that year's sales.
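The two PivotTable views used in this section reduce to simple arithmetic. The sketch below shows both calculations on hypothetical yearly totals (the figures are not from the dataset, and the function names are illustrative).

```python
# Hypothetical yearly sales totals by category.
sales = {
    2015: {"Furniture": 100.0, "Technology": 200.0},
    2016: {"Furniture": 110.0, "Technology": 290.0},
}


def pct_change_from_previous(sales, year, category):
    """Percentage difference from the previous year, as a fraction."""
    previous = sales[year - 1][category]
    return (sales[year][category] - previous) / previous


def share_of_column_total(sales, year, category):
    """A category's share of that year's total sales, as a fraction."""
    return sales[year][category] / sum(sales[year].values())


print(f"{pct_change_from_previous(sales, 2016, 'Technology'):.2%}")
print(f"{share_of_column_total(sales, 2016, 'Technology'):.2%}")
```

The first view answers "how fast did this category grow?", while the second answers "how much of the year's sales did it account for?". A category can lead on one measure without leading on the other.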

## Forecasting Future Sales

Let's now create a line chart of total sales by quarter between 2014 and 2017:

![image-3.png](attachment:image-3.png)

In general, quarters 1 and 2 produce the lowest sales for Superstore, and quarters 3 and 4 the highest. The best-selling quarter was Q4 2017 and the worst-selling was Q1 2015. Sales do not grow linearly over time; instead they show seasonality, which may help inform upcoming marketing strategies.

The forecast predicts increasing sales for Superstore. However, a linear forecast may be problematic because the data does not follow a linear pattern.
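To make the concern about linear forecasting concrete, here is a minimal sketch that fits a least-squares line to quarterly totals and extends it forward. The workbook's exact forecasting method is not stated, so an ordinary least-squares trend is an assumption, and the quarterly figures are invented to mimic a seasonal pattern.

```python
def fit_line(ys):
    """Least-squares slope and intercept for ys against x = 0, 1, 2, ..."""
    n = len(ys)
    xs = range(n)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def forecast(ys, steps_ahead):
    """Extend the fitted line by the given number of future periods."""
    slope, intercept = fit_line(ys)
    n = len(ys)
    return [intercept + slope * (n + k) for k in range(steps_ahead)]


# Hypothetical seasonal series: two years of Q1..Q4 totals.
quarterly = [10.0, 12.0, 20.0, 30.0, 12.0, 14.0, 22.0, 32.0]
slope, intercept = fit_line(quarterly)
next_q1 = forecast(quarterly, 1)[0]

print(f"trend per quarter: {slope:.2f}")
print(f"linear forecast for next Q1: {next_q1:.2f}")
print(f"previous Q1 values: {quarterly[0]}, {quarterly[4]}")
```

In this made-up series the line captures the upward trend, but its prediction for the next first quarter sits well above the earlier first-quarter values, because a straight line cannot represent the seasonal dip.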

## Analysing Average Order Value

Average order value (AOV) is an important metric in retail: it indicates the financial performance of a typical transaction. It is calculated as follows:

$$\text{AOV}=\frac{\text{total sales}}{\text{number of orders}}$$

Creating a PivotTable with `order_id` along rows and `Sum of sales` as values, we can then use Descriptive Statistics from the Analysis ToolPak to find the Summary Statistics for Sum of Sales:

![image.png](attachment:image.png)
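The PivotTable step above amounts to grouping line items by `order_id` before averaging. A rough Python equivalent, using hypothetical line items, looks like this:

```python
from collections import defaultdict

# Hypothetical (order_id, sales) line items; one order can span several rows.
line_items = [("A", 10.0), ("A", 30.0), ("B", 20.0), ("C", 60.0)]


def order_totals(line_items):
    """Sum of sales per order_id, like the PivotTable described above."""
    totals = defaultdict(float)
    for order_id, sales in line_items:
        totals[order_id] += sales
    return dict(totals)


def average_order_value(line_items):
    """Total sales divided by the number of distinct orders."""
    totals = order_totals(line_items)
    return sum(totals.values()) / len(totals)


aov = average_order_value(line_items)
print(order_totals(line_items))
print(f"AOV: {aov:.2f}")
```

Grouping first matters: dividing total sales by the number of rows instead of the number of distinct orders would understate the AOV whenever an order contains more than one line item.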

It can also be helpful to derive a distribution of the sales. We'll do so using a histogram:

![image-2.png](attachment:image-2.png)

We observe that the sales follow a right-skewed distribution, with most values between \\$0.56 and \\$391. 

Let's say that `Chuck Magee` in the `East` region and `Kelly Williams` in the `Central` region are having a competition about whose region, if either, had the higher average order value. Let's find out using PivotTables and the independent samples t-test. We create a PivotTable with `person` in columns (filtering for the relevant people), `order_id` in rows and `sum of sales` in values. We then perform a t-test, which yields the following:

![image-2.png](attachment:image-2.png)

There is no statistically significant difference between the two regions (the p-value is above 0.05). However, we may want to approach this result with some caution, as the data is not normally distributed. The difference in mean sales (\\$484 vs \\$427) may still be meaningful in this business context, even though it is not statistically significant.
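For reference, the test statistic behind a two-sample t-test can be sketched with the standard library alone. The source does not say which variant was run in Excel, so the unequal-variance (Welch) form below is an assumption, and the sample values are hypothetical. The standard library has no t-distribution, so the p-value is left out; `scipy.stats.ttest_ind(a, b, equal_var=False)` would provide it if SciPy is available.

```python
from math import sqrt
from statistics import mean, variance


def welch_t(a, b):
    """Welch's t statistic and approximate degrees of freedom."""
    # Squared standard error of each sample mean (sample variance / n).
    va = variance(a) / len(a)
    vb = variance(b) / len(b)
    t = (mean(a) - mean(b)) / sqrt(va + vb)
    # Welch-Satterthwaite approximation for the degrees of freedom.
    df = (va + vb) ** 2 / (va ** 2 / (len(a) - 1) + vb ** 2 / (len(b) - 1))
    return t, df


# Hypothetical per-order sales for two regions.
east = [1.0, 2.0, 3.0, 4.0, 5.0]
central = [2.0, 3.0, 4.0, 5.0, 6.0]

t_stat, dof = welch_t(east, central)
print(f"t = {t_stat:.3f}, df = {dof:.1f}")
```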

## Sensitivity Analysis and Scenario Planning

As a data analyst, we may need to build basic financial models to guide decision-making and forecast results. Here are some simple examples.

**Example:**

Salespeople at Superstore receive an annual bonus, which is calculated as 2% of their region's total sales times its annual turnover. Using a two-way Data Table, let's determine a salesperson's bonus at each combination of 80%, 90% and 100% of turnover and \\$100,000, \\$150,000 and \\$200,000 in sales.

Here is the basic model to calculate a base rate:
    
![image.png](attachment:image.png)    

We plug the different turnovers and annual sales values into rows and columns, then highlight the table to create a two-way data table with `B2` as row input and `B3` as column input. This yields the following:

![image.png](attachment:image.png)
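The same grid can be reproduced outside Excel by evaluating the bonus formula at every combination of inputs. The sketch below follows the rule as stated in the example (2% of sales times turnover); the function and variable names are illustrative.

```python
BONUS_RATE = 0.02  # 2%, as stated in the example


def bonus(sales, turnover):
    """Bonus = 2% of regional sales multiplied by annual turnover."""
    return BONUS_RATE * sales * turnover


turnovers = [0.80, 0.90, 1.00]
sales_levels = [100_000, 150_000, 200_000]

# One entry per (sales, turnover) pair, mirroring the two-way data table.
table = {(s, t): bonus(s, t) for s in sales_levels for t in turnovers}

for s in sales_levels:
    print(s, [round(table[(s, t)], 2) for t in turnovers])
```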

Let's consider another example which requires us to use Goal Seek:

**Example:**

Superstore is planning to sell engraved rubber chickens for \\$25 each. The profit margin for each product is estimated at 15\%. Using Goal Seek, how many units does Superstore need to sell to reach \\$2,000 in profits?

To solve this, we create a basic model as follows:

![image-2.png](attachment:image-2.png)

We then run Goal Seek with the following options:

- Set cell: `B26`
- To value: `2000`
- By changing cell: `B23`

This yields a value of approximately 533 units (533.33 before rounding, so 534 whole units are needed to actually reach \\$2,000):

![image.png](attachment:image.png)
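Goal Seek searches for the input that makes a formula hit a target. The sketch below imitates that idea with a simple bisection search. This is a stand-in technique, not Excel's own algorithm, and the profit model (units times price times margin) is an assumption based on the example's wording.

```python
import math


def profit(units, price=25.0, margin=0.15):
    """Assumed model: profit = units sold * unit price * profit margin."""
    return units * price * margin


def goal_seek(f, target, lo, hi, tol=1e-9):
    """Bisection search; assumes f is increasing and f(lo) <= target <= f(hi)."""
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if f(mid) < target:
            lo = mid  # target lies in the upper half
        else:
            hi = mid  # target lies in the lower half
    return (lo + hi) / 2


units = goal_seek(profit, 2000.0, 0.0, 10_000.0)
whole_units = math.ceil(units)  # units are sold whole, so round up

print(f"break-even input: {units:.2f} units")
print(f"whole units needed: {whole_units}")
```

Because this model is linear, the answer can also be checked directly: 2000 / (25 × 0.15) ≈ 533.33. A search like this becomes useful when the formula cannot be inverted by hand.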

## Summary

In this project we:

- Used `VLOOKUP()` to merge all the data into one dataset
- Calculated some basic statistics about sales and profit margin using standard Excel functions
- Used `PivotTables` to explore the relative makeup of sales by category and over time
- Charted sales by quarter, and used this to linearly forecast future sales
- Derived a distribution of sales, and found that sales have a right-skew
- Performed a two-sample t-test to determine whether there is a statistically significant difference between two regions' sales
- Used two-way data tables to perform sensitivity analysis of sales
- Used `Goal Seek` for scenario planning related to profits