# Making Business Decisions to Maximise Sales of a Digital Media Store

We will be using a database from a digital media store similar to the iTunes Library. It contains several tables, which store information about the store's customers, invoices, artists, albums, and tracks.

The original, unmodified file can be found on [GitHub](https://github.com/lerocha/chinook-database).

The aim of this project is to use SQL to analyse the store's data and identify business decisions that could increase sales. We will be using [SQLite](https://en.wikipedia.org/wiki/SQLite), a relational database management system (RDBMS).

## Guide

- [Connecting to Database](#connect_db)
- [Data Exploration](#data_exp)
- [Conclusion](#conclusion)

<a id='connect_db'></a>

### Connecting to Database

Let's begin by connecting our Jupyter Notebook to the database file.

In [1]:
%%capture
%load_ext sql
%sql sqlite:///chinook.db

#### Obtaining a List of all Tables & Views in the Database

In [2]:
%%sql
SELECT
    name,
    type
FROM sqlite_master
WHERE type IN ('table', 'view');

 * sqlite:///chinook.db
Done.


| name | type |
| --- | --- |
| album | table |
| artist | table |
| customer | table |
| employee | table |
| genre | table |
| invoice | table |
| invoice_line | table |
| media_type | table |
| playlist | table |
| playlist_track | table |
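
The same catalogue query can also be run without the `%%sql` magic, through Python's built-in `sqlite3` module. The sketch below is only an illustration: it uses an in-memory database with two stand-in tables and one hypothetical view (`album_with_artist`, not part of the store's database) instead of `chinook.db`, so that it runs anywhere.

```python
import sqlite3

# In-memory stand-in for chinook.db; this small schema is illustrative only.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE artist (artist_id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE album (album_id INTEGER PRIMARY KEY, title TEXT, artist_id INTEGER);
    CREATE VIEW album_with_artist AS
        SELECT al.title, ar.name
        FROM album al
        JOIN artist ar ON al.artist_id = ar.artist_id;
    """
)

# sqlite_master holds one row per schema object (tables, views, indexes, triggers).
catalogue = conn.execute(
    "SELECT name, type FROM sqlite_master "
    "WHERE type IN ('table', 'view') ORDER BY name"
).fetchall()

for name, obj_type in catalogue:
    print(f"{name}: {obj_type}")
```

To run it against the real file, replacing `":memory:"` with the path to `chinook.db` and dropping the `executescript` call should be enough.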


To illustrate how these tables are related to one another, I have attached a picture below:

<img src="chinook-schema.svg">

From this schema, we can analyse our data in several fun ways:

1. Assessing the performance of each employee by looking at the revenue each of them generated
2. Finding out which artists or genres appeal to customers the most (by looking at sales performance)
3. Assessing the sales performance of each country

<a id='data_exp'></a>

### Data Exploration

#### Evaluating Sales Performance of Each Employee

To understand how to boost our sales performance, we need to examine how each employee is performing. First, though, we need to identify which employees are in charge of sales:

In [3]:
%%sql

SELECT DISTINCT title
FROM employee;

 * sqlite:///chinook.db
Done.


| title |
| --- |
| General Manager |
| Sales Manager |
| Sales Support Agent |
| IT Manager |
| IT Staff |


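The Sales Support Agents are the employees assigned to customers (through `customer.support_rep_id`), so they are the ones whose sales we can evaluate. Let's look at the revenue attributed to each of them. Note that the `age` column is the hire year minus the birth year, so it is each agent's approximate age when hired rather than their current age: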

In [7]:
%%sql

WITH all_sales AS
        (
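         SELECT e.first_name || ' ' || e.last_name employee_name,
                e.employee_id,
                e.hire_date,
                -- SQLite reads each date string as its leading year here,
                -- so this is the approximate age at hire, in whole years
                e.hire_date - e.birthdate age,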
                e.title,
                i.total
         FROM invoice i
         LEFT JOIN customer c ON i.customer_id = c.customer_id
         LEFT JOIN employee e ON c.support_rep_id = e.employee_id
        )
    
SELECT
    al.employee_name,
    al.age,
    al.hire_date,
    ROUND(SUM(al.total), 2) total
FROM all_sales al
GROUP BY employee_id
ORDER BY total DESC;

 * sqlite:///chinook.db
Done.


| employee_name | age | hire_date | total |
| --- | --- | --- | --- |
| Jane Peacock | 44 | 2017-04-01 00:00:00 | 1731.51 |
| Margaret Park | 70 | 2017-05-03 00:00:00 | 1584.0 |
| Steve Johnson | 52 | 2017-10-17 00:00:00 | 1393.92 |


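Steve generated less revenue than Jane or Margaret, but he was also hired about five to six months after them (17 October 2017, versus 1 April and 3 May), so he has had less time to make sales. Jane and Margaret were hired about a month apart, and Jane's total is roughly 9% higher than Margaret's. Raw totals alone therefore do not tell us who is performing best.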
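
One way to make the comparison fairer is to divide each agent's revenue by the time they have been employed. The sketch below does this with the three rows from the output above, loaded into a hypothetical in-memory table (`rep_sales`). The cut-off date `AS_OF` is a placeholder assumption, not a value taken from the data; in practice it would probably be the date of the latest invoice, and the ranking can change depending on which date is used.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical helper table holding the query output shown above.
conn.execute("CREATE TABLE rep_sales (employee_name TEXT, hire_date TEXT, total REAL)")
conn.executemany(
    "INSERT INTO rep_sales VALUES (?, ?, ?)",
    [
        ("Jane Peacock", "2017-04-01 00:00:00", 1731.51),
        ("Margaret Park", "2017-05-03 00:00:00", 1584.00),
        ("Steve Johnson", "2017-10-17 00:00:00", 1393.92),
    ],
)

AS_OF = "2020-12-30"  # ASSUMPTION: placeholder cut-off date, not taken from the data

# julianday() converts a date string to a day count, so the difference is a tenure in days.
rows = conn.execute(
    """
    SELECT employee_name,
           julianday(?) - julianday(hire_date) AS days_employed,
           total / (julianday(?) - julianday(hire_date)) AS revenue_per_day
    FROM rep_sales
    ORDER BY revenue_per_day DESC
    """,
    (AS_OF, AS_OF),
).fetchall()

for name, days, per_day in rows:
    print(f"{name}: {days:.0f} days employed, {per_day:.3f} revenue per day")
```

Because Steve has the shortest tenure, his revenue per day sits closer to Jane's than his raw total does, whatever cut-off date is chosen after his hire date.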

#### Finding the Artists Whose Songs Generate The Most Sales

Knowing the top-selling artists would help the company understand which artists (and possibly which genres) are in demand.

In [6]:
%%sql

WITH artist_sales AS
        (
         SELECT 
                ar.name,
                i.total
         FROM invoice i
         LEFT JOIN invoice_line il ON i.invoice_id = il.invoice_id
         LEFT JOIN track t ON il.track_id = t.track_id
         LEFT JOIN album al ON t.album_id = al.album_id
         LEFT JOIN artist ar ON al.artist_id = ar.artist_id
        )
    
SELECT ar.name,
       ROUND(SUM(ar.total), 2) total_revenue
FROM artist_sales ar
GROUP BY name
ORDER BY total_revenue DESC
LIMIT 15;

 * sqlite:///chinook.db
Done.


| name | total_revenue |
| --- | --- |
| Jimi Hendrix | 2623.5 |
| Queen | 2269.08 |
| Red Hot Chili Peppers | 1484.01 |
| Pearl Jam | 1387.98 |
| Nirvana | 1366.2 |
| Guns N' Roses | 1261.26 |
| The Rolling Stones | 1107.81 |
| Eric Clapton | 1090.98 |
| Foo Fighters | 1040.49 |
| AC/DC | 1029.6 |
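
One thing to watch for in the query above: `i.total` is the value of a whole invoice, and joining `invoice` to `invoice_line` repeats it once per line, so summing it per artist also counts tracks by other artists on the same invoice. The toy example below illustrates the effect and an alternative that sums per-line revenue instead. It is a sketch under assumptions: the data is made up, `artist_name` is stored directly on the line for brevity, and a `unit_price` column on `invoice_line` is assumed.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE invoice (invoice_id INTEGER PRIMARY KEY, total REAL);
    CREATE TABLE invoice_line (
        invoice_line_id INTEGER PRIMARY KEY,
        invoice_id INTEGER,
        artist_name TEXT,   -- simplification: normally reached via track and album
        unit_price REAL,    -- assumed column name
        quantity INTEGER
    );
    -- One invoice with two tracks by Artist A and one track by Artist B.
    INSERT INTO invoice VALUES (1, 2.97);
    INSERT INTO invoice_line VALUES
        (1, 1, 'Artist A', 0.99, 1),
        (2, 1, 'Artist A', 0.99, 1),
        (3, 1, 'Artist B', 0.99, 1);
    """
)

results = conn.execute(
    """
    SELECT il.artist_name,
           ROUND(SUM(i.total), 2) AS summed_invoice_total,
           ROUND(SUM(il.unit_price * il.quantity), 2) AS summed_line_revenue
    FROM invoice i
    JOIN invoice_line il ON i.invoice_id = il.invoice_id
    GROUP BY il.artist_name
    ORDER BY il.artist_name
    """
).fetchall()

for artist, invoice_sum, line_sum in results:
    print(f"{artist}: invoice totals = {invoice_sum}, line revenue = {line_sum}")
```

In this toy data the invoice is worth 2.97 in total, yet the summed invoice totals come to 5.94 for Artist A and 2.97 for Artist B, while the per-line sums (1.98 and 0.99) add back up to the invoice value.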


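Quite a number of these bands play Rock, so Rock is likely to be one of the more popular genres. The revenue figures themselves should be read as a rough ranking only, because the query adds a full invoice total for every invoice line and so overstates each artist's revenue. To check whether our theory about Rock is correct, we can analyse the share of tracks sold by genre: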

In [13]:
%%sql

WITH total_tracks_sold AS
    (SELECT
         SUM(quantity) total_tracks
     FROM invoice_line
    ),
     tracks_by_genre AS
    (SELECT
         il.quantity,
         g.name
     FROM invoice_line il
     LEFT JOIN track t ON il.track_id = t.track_id
     LEFT JOIN genre g ON t.genre_id = g.genre_id
    )
    
SELECT
    tg.name,
    ROUND(CAST(SUM(quantity) AS Float)*100/(SELECT * FROM total_tracks_sold), 2) proportion
FROM tracks_by_genre tg
GROUP BY name
ORDER BY proportion DESC;

 * sqlite:///chinook.db
Done.


| name | proportion |
| --- | --- |
| Rock | 55.39 |
| Metal | 13.01 |
| Alternative & Punk | 10.34 |
| Latin | 3.51 |
| R&B/Soul | 3.34 |
| Blues | 2.61 |
| Jazz | 2.54 |
| Alternative | 2.46 |
| Easy Listening | 1.56 |
| Pop | 1.32 |


As expected, Rock is the most popular genre (55.39% of tracks sold), followed by Metal (13.01%) and Alternative & Punk (10.34%). Together these three genres account for about 79% of tracks sold, so the company should continue to cater to this audience by adding more music in these genres, by other bands, to its library.
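
The `CAST(... AS Float)` in the query above matters: `quantity` is an integer column, and SQLite truncates the result when it divides one integer by another. The sketch below shows the difference on a made-up three-row table (`sales` is a hypothetical stand-in for the joined invoice lines).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    -- Hypothetical stand-in for invoice lines already joined to their genre.
    CREATE TABLE sales (genre TEXT, quantity INTEGER);
    INSERT INTO sales VALUES ('Rock', 1), ('Rock', 1), ('Jazz', 1);
    """
)

rows = conn.execute(
    """
    SELECT genre,
           -- integer / integer: the fractional part is discarded
           SUM(quantity) * 100 / (SELECT SUM(quantity) FROM sales) AS integer_share,
           -- casting one operand to a float keeps the decimals
           ROUND(CAST(SUM(quantity) AS FLOAT) * 100
                 / (SELECT SUM(quantity) FROM sales), 2) AS float_share
    FROM sales
    GROUP BY genre
    ORDER BY float_share DESC
    """
).fetchall()

for genre, integer_share, float_share in rows:
    print(f"{genre}: integer division = {integer_share}, float division = {float_share}")
```

With two of three tracks in Rock, integer division reports 66 where the cast version reports 66.67, which is why the percentages in the output above carry two decimal places.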

#### Assessing the Sales Performance of Each Country

For each country, we compute:

* the total number of customers
* the total revenue made by sales
* the average revenue made per customer
* the average revenue made per invoice

In [15]:
%%sql

WITH invoices_w_country AS
    (SELECT
         i.invoice_id,
         i.total,
         c.country,
         c.customer_id
     FROM invoice i
     LEFT JOIN customer c ON i.customer_id = c.customer_id
    )
    
SELECT
    ic.country,
    COUNT(DISTINCT(ic.customer_id)) num_customers,
    ROUND(SUM(ic.total),2) total_revenue,
    ROUND(SUM(ic.total) / COUNT(DISTINCT(ic.customer_id)),2) avg_per_customer,
    ROUND(SUM(ic.total) / COUNT(DISTINCT(ic.invoice_id)),2) avg_per_invoice
FROM invoices_w_country ic
GROUP BY country
ORDER BY total_revenue DESC;

 * sqlite:///chinook.db
Done.


| country | num_customers | total_revenue | avg_per_customer | avg_per_invoice |
| --- | --- | --- | --- | --- |
| USA | 13 | 1040.49 | 80.04 | 7.94 |
| Canada | 8 | 535.59 | 66.95 | 7.05 |
| Brazil | 5 | 427.68 | 85.54 | 7.01 |
| France | 5 | 389.07 | 77.81 | 7.78 |
| Germany | 4 | 334.62 | 83.66 | 8.16 |
| Czech Republic | 2 | 273.24 | 136.62 | 9.11 |
| United Kingdom | 3 | 245.52 | 81.84 | 8.77 |
| Portugal | 2 | 185.13 | 92.57 | 6.38 |
| India | 2 | 183.15 | 91.58 | 8.72 |
| Ireland | 1 | 114.84 | 114.84 | 8.83 |


Countries where the average amount spent per customer is high (such as the Czech Republic and Ireland) look like candidates for more aggressive advertising and marketing.

However, we have to be cautious, because the sample is extremely small. The Czech Republic and Ireland have only 2 customers and 1 customer respectively, so their averages rest on a handful of people. We should therefore not make hasty generalisations about the customers from any one country.
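
One possible way to keep very small samples from dominating the comparison is to pool countries with few customers into a single "Other" group before averaging. The sketch below illustrates the idea on made-up data; the threshold `MIN_CUSTOMERS` and the sample rows are assumptions chosen for illustration, and only the `customer` and `invoice` columns used in the query above are reproduced.

```python
import sqlite3

MIN_CUSTOMERS = 2  # ASSUMPTION: illustrative threshold, to be tuned on real data

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE customer (customer_id INTEGER PRIMARY KEY, country TEXT);
    CREATE TABLE invoice (invoice_id INTEGER PRIMARY KEY, customer_id INTEGER, total REAL);
    -- Made-up rows: two customers in one country, one customer in each of two others.
    INSERT INTO customer VALUES (1, 'USA'), (2, 'USA'), (3, 'Ireland'), (4, 'Chile');
    INSERT INTO invoice VALUES
        (1, 1, 10.0), (2, 1, 5.0), (3, 2, 5.0), (4, 3, 8.0), (5, 4, 2.0);
    """
)

rows = conn.execute(
    """
    WITH customers_per_country AS (
        SELECT country, COUNT(*) AS num_customers
        FROM customer
        GROUP BY country
    ),
    labelled AS (
        -- Relabel countries below the threshold as 'Other'.
        SELECT CASE WHEN cc.num_customers < ? THEN 'Other' ELSE c.country END AS country_group,
               c.customer_id,
               i.total
        FROM invoice i
        JOIN customer c ON i.customer_id = c.customer_id
        JOIN customers_per_country cc ON c.country = cc.country
    )
    SELECT country_group,
           COUNT(DISTINCT customer_id) AS num_customers,
           ROUND(SUM(total), 2) AS total_revenue,
           ROUND(SUM(total) / COUNT(DISTINCT customer_id), 2) AS avg_per_customer
    FROM labelled
    GROUP BY country_group
    ORDER BY total_revenue DESC
    """,
    (MIN_CUSTOMERS,),
).fetchall()

for group, num_customers, revenue, avg in rows:
    print(f"{group}: {num_customers} customers, revenue {revenue}, avg per customer {avg}")
```

Pooling trades detail for stability: the "Other" average is less sensitive to a single big spender, but it no longer says anything about an individual country.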

<a id='conclusion'></a>

### Conclusion

Given how little data we have, it would be difficult to make firm business decisions. Still, even this small dataset suggests two things. First, the company should continue selling music in popular genres such as Rock and Metal. Second, it should consider marketing more aggressively to attract new customers, particularly in countries where customers seem to spend a lot on average.