# Plotting Multiple Data Series

Complete the following exercises to solidify your knowledge of plotting multiple data series with pandas, matplotlib, and seaborn. Part of the challenge of plotting multiple data series is transforming the data into the form the visualization needs. For some of the exercises in this lab, you will first need to reshape the data into the most appropriate form and then create the plot.
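As a minimal sketch of the kind of reshaping meant here, the example below aggregates a small made-up table and converts it between "wide" and "long" form. The toy values and the column names `Measure` and `Total` are assumptions for illustration only; they do not come from the lab's data set.

```python
import pandas as pd

# Hypothetical toy data in "wide" form: one column per sales measure.
wide = pd.DataFrame({
    "ItemType": ["BEER", "WINE", "BEER", "WINE"],
    "RetailSales": [10.0, 20.0, 30.0, 40.0],
    "WarehouseSales": [1.0, 2.0, 3.0, 4.0],
})

# Aggregate to one row per category; this wide shape suits DataFrame.plot,
# which draws one series per column.
totals = wide.groupby("ItemType")[["RetailSales", "WarehouseSales"]].sum()

# Melt to "long" form (one row per category/measure pair), which suits
# seaborn functions that take a hue column.
long = totals.reset_index().melt(
    id_vars="ItemType", var_name="Measure", value_name="Total"
)
print(long)
```

Roughly speaking, pandas plotting tends to want the wide table, while seaborn tends to want the long one, so it helps to be comfortable moving between the two.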

In [None]:
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
import warnings

warnings.filterwarnings('ignore')
%matplotlib inline

In [None]:
data = pd.read_csv('../data/liquor_store_sales.csv')
data.head()

## 1. Create a bar chart with bars for total Retail Sales, Retail Transfers, and Warehouse Sales by Item Type.

In [None]:
data.groupby(['ItemType'])[["RetailSales", "RetailTransfers", "WarehouseSales"]].sum().plot(kind="bar")

# Adding a title to the bar chart
plt.title('Total Sales Types by Item Types')

# Labeling the y-axis
plt.ylabel('Total Sum')

# Displaying the bar chart
plt.show()

## 2. Create a horizontal bar chart showing sales mix for the top 10 suppliers with the most total sales. 

In [None]:
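# Total sales per row across the three sales channels
data['TotalSales'] = (
    data["RetailSales"]
  + data["RetailTransfers"]
  + data["WarehouseSales"]
)

# Sum each sales type per supplier and keep the 10 suppliers with the highest total sales
sales_cols = ["RetailSales", "RetailTransfers", "WarehouseSales"]
sales_suppliers = data.groupby('Supplier')[sales_cols + ['TotalSales']].sum()
top_10 = sales_suppliers.sort_values('TotalSales', ascending=False).head(10)

# Stacked horizontal bars show each supplier's mix of sales types;
# sorting ascending puts the largest supplier at the top of the chart
fig, ax = plt.subplots(figsize=(12, 6))
top_10.sort_values('TotalSales')[sales_cols].plot(kind='barh', stacked=True, ax=ax)

ax.set_title('Sales Mix of Top 10 Suppliers By Total Sales')
ax.set_xlabel('Total Sales')
ax.set_ylabel('Supplier')

plt.tight_layout()
plt.show()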


## 3. Create a multi-line chart that shows average Retail Sales, Retail Transfers, and Warehouse Sales per month over time.

In [None]:
sales_grouped_month = data.groupby('Month')[["RetailSales", "RetailTransfers", "WarehouseSales"]].mean()
sales_grouped_month

fig, ax = plt.subplots(figsize=(12, 6))

# Draw multiple lines: the pandas plot shortcut uses matplotlib internally
sales_grouped_month.plot(ax=ax)

# Title and axis labels
ax.set_title('Average Sales Numbers per Month Over Years')
ax.set_xlabel('Month')
ax.set_ylabel('Average Sales')

# X-axis: month as an integer, optionally as text (Jan, Feb, ...)
ax.set_xticks(range(1, 13))
# If you have a MonthText column, you can use it for the tick labels instead:
# ax.set_xticklabels(data['MonthText'].unique(), rotation=45, ha='right')

plt.xticks(rotation=45, ha='right')

plt.tight_layout()
plt.show()
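If the data has no text column for months, one possible way to get month names for the tick labels is the standard library's `calendar` module. This is only a sketch; the variable name `month_labels` is made up for illustration.

```python
import calendar

# Map month numbers 1-12 to abbreviated month names.
# calendar.month_abbr is locale-dependent; index 0 is an empty string,
# so the range starts at 1.
month_labels = [calendar.month_abbr[m] for m in range(1, 13)]
print(month_labels)
```

With ticks set via `ax.set_xticks(range(1, 13))`, the list could then be passed to `ax.set_xticklabels(month_labels)`.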

## 4. Plot the same information as above but as a bar chart.

In [None]:
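sales_grouped_month = data.groupby('Month')[["RetailSales", "RetailTransfers", "WarehouseSales"]].mean()

# Pass figsize to plot() directly; a separate plt.figure() call would leave an empty figure behind
sales_grouped_month.plot(kind='bar', figsize=(12, 6))
plt.title('Average Sales Numbers per Month')
plt.ylabel('Average Sales')
plt.show()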

## 5. Create a multi-line chart that shows Retail Sales summed by Item Type over time (Year & Month).

*Hint: There should be a line representing each Item Type.*

In [None]:
# 1. Parse the Month/Year strings into datetimes
data['Month/Year_dt'] = pd.to_datetime(data['Month/Year'], format='%m/%Y')

# 2. Group and sum
sales_pivot = (
    data
    .groupby(['Month/Year_dt', 'ItemType'])['RetailSales']
    .sum()
    .unstack(fill_value=0)           # each ItemType becomes its own column
    .sort_index()                    # make sure the date index is sorted
)

# 3. Plot
fig, ax = plt.subplots(figsize=(12, 6))

sales_pivot.plot(ax=ax, marker='o')  # lines with markers

# 4. Label the axes
ax.set_title('Sum of Retail Sales by Item Type Over Time')
ax.set_xlabel('Month/Year')
ax.set_ylabel('Total Retail Sales')

# 5. Format the x-axis labels as Month/Year
ax.set_xticks(sales_pivot.index)  # a date tick at every data point
ax.set_xticklabels(sales_pivot.index.strftime('%m/%Y'), rotation=45, ha='right')

plt.tight_layout()
plt.show()

## 6. Plot the same information as above but as a bar chart.

In [None]:
fig, ax = plt.subplots(figsize=(14, 6))

# Grouped bars: each ItemType side by side per Month/Year
sales_pivot.plot(kind='bar', ax=ax)

ax.set_title('Total Retail Sales by Item Type per Month/Year (Grouped)')
ax.set_xlabel('Month/Year')
ax.set_ylabel('Total Retail Sales')

# X ticks with the MM/YYYY labels
ax.set_xticks(range(len(sales_pivot.index)))
ax.set_xticklabels(sales_pivot.index.strftime('%m/%Y'), rotation=45, ha='right')

plt.tight_layout()
plt.show()

## 7. Create a scatter plot showing the relationship between Retail Sales (x-axis) and Retail Transfers (y-axis) with the plot points color-coded according to their Item Type.

*Hint: Seaborn's lmplot is the easiest way to generate the scatter plot.*

In [None]:
sns.scatterplot(
    data=data,
    x="RetailSales",      # x-axis: retail sales
    y="RetailTransfers",  # y-axis: retail transfers
    hue="ItemType",       # color points by item type
    alpha=0.7             # transparency, so overlapping points stay visible
)

# Axis labels and title
plt.title('Scatterplot of Retail Sales and Retail Transfers by Item Types')
plt.xlabel('Retail Sales')
plt.ylabel('Retail Transfers')

# Display the plot
plt.tight_layout()
plt.show()
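The hint above mentions `lmplot`, while the cell uses `scatterplot`. For comparison, here is a minimal sketch of the `lmplot` route on a small made-up DataFrame; the toy values are assumptions for illustration only, and `fit_reg=False` turns off the regression line so that only the colored points are drawn.

```python
import pandas as pd
import seaborn as sns

# Hypothetical toy data with the same column names as the lab's data set.
toy = pd.DataFrame({
    "RetailSales": [1.0, 2.0, 3.0, 4.0, 5.0, 6.0],
    "RetailTransfers": [0.5, 1.5, 2.0, 3.5, 4.0, 6.5],
    "ItemType": ["BEER", "BEER", "BEER", "WINE", "WINE", "WINE"],
})

# lmplot is figure-level: it creates its own figure and returns a FacetGrid.
grid = sns.lmplot(
    data=toy,
    x="RetailSales",
    y="RetailTransfers",
    hue="ItemType",   # one color per item type
    fit_reg=False,    # scatter only, no regression fit
)
grid.set_axis_labels("Retail Sales", "Retail Transfers")
```

Because `lmplot` returns a `FacetGrid` rather than drawing on the current axes, labels are set through the grid object instead of `plt.xlabel` and `plt.ylabel`.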

## 8. Create a scatter matrix using all the numeric fields in the data set with the plot points color-coded by Item Type.

*Hint: Seaborn's pairplot may be your best option here.*

In [2]:
plot_df = data[["ItemType", "RetailSales", "RetailTransfers", "WarehouseSales"]]

sns.pairplot(
    data=plot_df,
    hue="ItemType",            
    diag_kind="hist",          
    plot_kws={"alpha": 0.6},   
)
plt.suptitle("Pairplot of Sales Numbers By Item Type", y=1.02)
plt.show()
