# The Blue Paint Incident

You have received an anonymous tip that something 'irregular' is happening with your purchases of blue paint. Your initial meeting with the purchasing manager reveals the following:
* Your company has five vendors that deliver **blue paint** (Material ID: BLUEPAINT).
* Each vendor has a long-running contract with a standard delivery volume of **100** liters per shipment.
* The acceptable **delivery tolerance** for the goods receipt of a blue paint shipment is +/- 5 percent.
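
The contract terms above translate into a simple acceptance band. The following is a minimal sketch of that arithmetic; the names `CONTRACTED_VOLUME`, `TOLERANCE_PERCENT`, and `within_tolerance` are made up for illustration and are not part of the SAP data.

```python
# Contract terms taken from the meeting notes above.
CONTRACTED_VOLUME = 100   # liters per shipment
TOLERANCE_PERCENT = 5     # +/- 5 percent

# Lower and upper bound of the accepted volume, in liters.
lower_bound = CONTRACTED_VOLUME * (100 - TOLERANCE_PERCENT) / 100
upper_bound = CONTRACTED_VOLUME * (100 + TOLERANCE_PERCENT) / 100


def within_tolerance(volume):
    """Return True if a received volume lies inside the accepted band."""
    return lower_bound <= volume <= upper_bound


print(lower_bound, upper_bound)
```

Under these terms, any shipment between 95 and 105 liters passes the goods receipt check.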

You have requested a copy of the data from the SAP system. You focus on the `goods receipt` events of your purchasing processes. You have received two tables:
* `MKPF` contains the header information of the material documents.
* `MSEG` contains the line item information of the material documents.

Analyze the data to understand the irregularity. 

## Setup

Some initialization to make life easier. **Make sure to run the following cell before proceeding.**

In [1]:
from IPython.core.interactiveshell import InteractiveShell
InteractiveShell.ast_node_interactivity = 'all'
import matplotlib.pyplot as plt
%matplotlib inline

[Pandas](https://pandas.pydata.org) is the most important workhorse in data analytics.
[Seaborn](https://seaborn.pydata.org) is a visualization library.

In [2]:
import pandas as pd
import seaborn as sns
pd.set_option('display.float_format', lambda x: '%.0f' % x)

You have received two tables from the SAP system that contain all the information of the `goods receipt` step:
1. The header information is stored in the table `MKPF`.
2. The line items are stored in the table `MSEG`.

In [3]:
mkpf_table = pd.read_csv('https://raw.githubusercontent.com/mschermann/forensic_accounting/master/MKPF.csv')
mseg_table = pd.read_csv('https://raw.githubusercontent.com/mschermann/forensic_accounting/master/MSEG.csv')

Both tables come with a large number of columns.

In [None]:
mkpf_table.columns
mseg_table.columns

You can find the definition of all the columns in the SAP system using the transaction code `SE16`.

## Understanding the Data

### The MKPF table

For our purposes, we use the following columns from `MKPF`:
* `MBLNR` - Contains the material document number.
* `USNAM` - Contains the inventory employee who posted the material document.

**Your task:** Reduce the `MKPF` table to these two columns. Store the result in a variable called `mkpf`.

**Your task:** Show the first five rows of the MKPF table.
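
As a hint, column selection and previewing can be sketched on a toy table. The table `toy_headers` and its values are invented for illustration; only the column names `MBLNR` and `USNAM` come from the description above.

```python
import pandas as pd

# Made-up header table; 'EXTRA' stands in for the many unused columns.
toy_headers = pd.DataFrame({
    'MBLNR': [5000000001, 5000000002, 5000000003],
    'USNAM': ['USER_A', 'USER_B', 'USER_A'],
    'EXTRA': ['x', 'y', 'z'],
})

# Indexing with a list of column names keeps only those columns.
toy_reduced = toy_headers[['MBLNR', 'USNAM']]

# head() returns the first five rows (here the table has only three).
toy_reduced.head()
```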

### The MSEG table

For our purposes, we use the following columns from `MSEG`:
* `MBLNR` - Contains the material document number.
* `BWART` - Contains the movement type of the line item.
* `MATNR` - Contains the material id.
* `LIFNR` - Contains the vendor id.
* `MENGE` - Contains the volume of the shipment. 

**Your task:** Reduce the `MSEG` table to the columns of interest. Store the result in a variable called `mseg`.

**Your task:** Show the unique movement types in the `MSEG` table.
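
One possible approach is the `unique()` method of a pandas column, sketched here on invented values (`toy_items` is not the real `MSEG` table).

```python
import pandas as pd

# Made-up movement types for illustration.
toy_items = pd.DataFrame({'BWART': [101, 101, 261, 122, 101]})

# unique() returns each distinct value once, in order of first appearance.
movement_types = toy_items['BWART'].unique()
movement_types
```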

The following list shows important movement types:
* 101 - Goods receipt for a purchase order
* 102 - Goods receipt for a purchase order - reversal
* 122 - Return delivery to vendor
* 161 - Return delivery to vendor for a purchase order
* 261 - Consumption for production order from warehouse

**Your task:** Filter the MSEG table on the movement type of interest. Store the result in a variable called `mseg`.

Additionally, we are only interested in the goods movements of blue paint.

**Your task:** Filter the `MSEG` table on the material id of blue paint. Store the result in a variable called `mseg`.
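
Both filters can be written as boolean masks. The sketch below uses invented rows and assumes that movement type 101 is the one of interest and that `BWART` was read as an integer; check both assumptions against the real data.

```python
import pandas as pd

# Made-up line items; values are illustrative only.
toy_items = pd.DataFrame({
    'MBLNR': [1, 2, 3, 4],
    'BWART': [101, 261, 101, 101],
    'MATNR': ['BLUEPAINT', 'BLUEPAINT', 'REDPAINT', 'BLUEPAINT'],
})

# Step 1: keep goods receipts only (assumed movement type 101).
toy_items = toy_items[toy_items['BWART'] == 101]

# Step 2: keep blue paint only.
toy_items = toy_items[toy_items['MATNR'] == 'BLUEPAINT']
toy_items
```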

**Your task:** Convert the column that contains the shipment volume to the data type `int`.
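
A type conversion of this kind can be sketched with `astype`. The values below are invented; note that `astype(int)` truncates any fractional part, so it is worth checking that the real volumes are whole numbers.

```python
import pandas as pd

# Made-up volumes stored as floats.
toy_items = pd.DataFrame({'MENGE': [100.0, 98.0, 103.0]})

# astype(int) converts the column to an integer dtype.
toy_items['MENGE'] = toy_items['MENGE'].astype(int)
toy_items.dtypes
```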

**Make sure to run the following cell before proceeding.**

In [None]:
mseg.reset_index(inplace=True, drop=True)

**Your task:** Show a sample of the filtered `MSEG` table.

## Analysis of blue paint shipments

**Your task**: Calculate the mean, the minimum value, and the maximum value of the received shipments.
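
Several summary statistics can be computed in one call with `agg`. This is a sketch on invented volumes, not on the real shipments.

```python
import pandas as pd

# Made-up shipment volumes in liters.
toy_items = pd.DataFrame({'MENGE': [100, 98, 103, 99]})

# agg() with a list of function names returns one value per function.
summary = toy_items['MENGE'].agg(['mean', 'min', 'max'])
summary
```

Comparing the minimum and maximum against the contracted volume and the tolerance is a quick first plausibility check.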

**Your task**: Plot the volume of the shipments in relationship to the index (`mseg.index`).

**Your task**: Confirm that five vendors deliver blue paint.
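
Counting distinct values is one way to confirm this. The vendor IDs below are invented placeholders.

```python
import pandas as pd

# Made-up vendor IDs for illustration.
toy_items = pd.DataFrame({'LIFNR': ['V1', 'V2', 'V1', 'V3']})

# nunique() counts the distinct values in the column.
vendor_count = toy_items['LIFNR'].nunique()

# value_counts() additionally shows the number of shipments per vendor.
shipments_per_vendor = toy_items['LIFNR'].value_counts()
vendor_count
```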

We normalize the number of shipments per vendor. **Make sure to run the following cell before proceeding.**

In [18]:
mseg['order'] = mseg.groupby('LIFNR').cumcount()
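
To see what `cumcount` does, consider this small sketch with invented vendor IDs: every shipment receives a running number within its vendor, starting at 0, so the vendors can be compared on a common x-axis.

```python
import pandas as pd

# Made-up shipments from two vendors, in posting order.
toy_items = pd.DataFrame({'LIFNR': ['V1', 'V2', 'V1', 'V1', 'V2']})

# cumcount() numbers the rows of each group 0, 1, 2, ... in their original order.
toy_items['order'] = toy_items.groupby('LIFNR').cumcount()
toy_items
```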

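**Your task**: Develop a chart that shows the shipment volumes across time. To do this, replace the placeholders `<VENDOR>`, `<VOLUME OF SHIPMENT>`, and `<CONTRACTED VOLUME>` in the code below. Note that `axhline` expects a number, so replace `'<CONTRACTED VOLUME>'` including its quotes.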

In [None]:
g = sns.FacetGrid(mseg, col='<VENDOR>', col_wrap=3);
g.map(plt.plot, 'order', '<VOLUME OF SHIPMENT>');

for a in g.axes:
    a.axhline('<CONTRACTED VOLUME>', alpha=0.5, color='grey');

## Identify the person of interest in the inventory

**Your task**: Left-join the `MSEG` and `MKPF` tables on the material document number. Store the result in a variable called `inventory`.
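
A left join keeps every row of the left table and attaches matching columns from the right table. The sketch below uses two invented tables that share the key `MBLNR`; the user names and vendor IDs are placeholders.

```python
import pandas as pd

# Made-up line items (left table).
toy_items = pd.DataFrame({
    'MBLNR': [1, 2, 3],
    'LIFNR': ['V1', 'V2', 'V1'],
    'MENGE': [100, 98, 103],
})

# Made-up headers (right table); document 4 has no matching line item.
toy_headers = pd.DataFrame({
    'MBLNR': [1, 2, 3, 4],
    'USNAM': ['USER_A', 'USER_B', 'USER_A', 'USER_C'],
})

# how='left' keeps all line items and adds the posting user where available.
toy_inventory = toy_items.merge(toy_headers, on='MBLNR', how='left')
toy_inventory
```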

**Your task**: Group the `inventory` by the vendor ID and show the unique IDs of the inventory employees who handled shipments from each vendor.
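
Grouping followed by `unique()` lists the distinct values per group. This is a sketch on an invented joined table; the IDs are placeholders and say nothing about the real data.

```python
import pandas as pd

# Made-up joined table of vendors and posting users.
toy_inventory = pd.DataFrame({
    'LIFNR': ['V1', 'V1', 'V2', 'V2'],
    'USNAM': ['USER_A', 'USER_B', 'USER_A', 'USER_A'],
})

# For each vendor, collect the distinct users who posted its shipments.
handlers = toy_inventory.groupby('LIFNR')['USNAM'].unique()
handlers
```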

**Who is the person of interest?**