# Applied Process Mining Module

This notebook is part of an Applied Process Mining module. The collection of notebooks is a *living document* and subject to change. 

# Hands-On 1 - 'Event Logs and Process Visualization' (R / bupaR)

## Setup

<img src="http://bupar.net/images/logo_text.PNG" alt="bupaR" style="width: 200px;"/>

This notebook uses the `tidyverse`, `bupaR`, and `processanimateR` packages.

In [None]:
## Run the commented-out commands below in a separate R session
# install.packages("tidyverse")
# install.packages("bupaR")
# install.packages("processanimateR")

In [None]:
# for larger and readable plots
options(jupyter.plot_scale=1.25)

In [None]:
# the initial execution of these may give you warnings that you can safely ignore
library(tidyverse)
library(bupaR)
library(processanimateR)

## Dataset

The proposed real-life dataset to investigate is the *BPI Challenge 2014* dataset. It was captured from the ITIL process of Rabobank Group ICT and was the subject of the yearly BPI Challenge in 2014. Here is more information on the dataset, along with download links to the data files:

* [Overview](https://www.win.tue.nl/bpi/doku.php?id=2014:challenge)
* [Dataset](http://dx.doi.org/10.4121/uuid:c3e5d162-0cfd-4bb0-bd82-af5268819c35)
* [Quick Reference](https://www.win.tue.nl/bpi/lib/exe/fetch.php?media=2014:quick_reference_bpi_challenge_2014.pdf)

The BPI Challenge 2014 website linked above also hosts several reports that describe and analyze the dataset in detail. We suggest exploring the dataset yourself first, before reading the reports.

## Data Loading

To simplify the data loading task, here are the initial steps:

In [None]:
# some warnings are expected here
interaction_data <- read_csv2("https://data.4tu.nl/ndownloader/files/24031670")
incident_data <- read_csv2("https://data.4tu.nl/ndownloader/files/24031637")
activity_log_incidents <- read_csv2("https://data.4tu.nl/ndownloader/files/24060575")
change_data <- read_csv2("https://data.4tu.nl/ndownloader/files/24073421")

In [None]:
interaction_data %>% head()

In [None]:
incident_data %>% head()

In [None]:
activity_log_incidents %>% head()

In [None]:
change_data %>% head()

## Assignment

In this hands-on session, you are going to explore a real-life dataset and apply what was presented in the lecture about event logs and basic process mining visualizations.
The objective is to explore the dataset as an event log, keeping the process mining visualizations from the lecture in mind.

* Analyse basic properties of the process (business process or other process) that has generated it.
    * What are possible case notions, and which column or columns could serve as case identifiers?
    * What are the activities? Are all activities on the same abstraction level?
    * Can activities or actions be derived from other (non-activity) data?
* Discover a map of the process (or a sub-process) behind it.
    * Are there multiple processes that can be discovered?
    * What is the effect of taking a subset of the data (by incident type, …)?
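
As a starting point for the case-notion question, the following is a minimal sketch on a toy table, not on the BPI Challenge data. The column names `incident_id`, `interaction_id`, and `activity` are hypothetical and only illustrate how two candidate case identifiers group the same events differently.

```r
library(dplyr)

# Toy stand-in for an activity table; all column names are hypothetical
toy <- tibble::tibble(
  incident_id    = c("IM1", "IM1", "IM1", "IM2", "IM2", "IM3"),
  interaction_id = c("SD1", "SD1", "SD1", "SD1", "SD1", "SD2"),
  activity       = c("Open", "Assignment", "Closed", "Open", "Closed", "Open")
)

# Number of distinct values per column: a quick hint at which columns
# could act as case identifiers and which as activities
toy %>% summarise(across(everything(), n_distinct))

# Events per case under each candidate case notion
events_per_incident    <- toy %>% count(incident_id, name = "n_events")
events_per_interaction <- toy %>% count(interaction_id, name = "n_events")

events_per_incident
events_per_interaction
```

The same two steps can be applied to the loaded tables once you have inspected their actual column names with `head()`.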

### Event Log

Have a look at the excellent `bupaR` documentation: http://bupar.net/creating_eventlogs.html
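
A minimal sketch of building an event log with `simple_eventlog()` from `bupaR` could look as follows. It uses a toy table rather than the loaded data: the column names and the timestamp format (`%d-%m-%Y %H:%M:%S`) are assumptions for illustration and need to be checked against the real files.

```r
library(dplyr)
library(bupaR)

# Toy stand-in for an activity table; all column names are hypothetical
toy_activities <- tibble::tibble(
  incident_id = c("IM1", "IM1", "IM1", "IM2", "IM2"),
  activity    = c("Open", "Assignment", "Closed", "Open", "Closed"),
  datestamp   = c("01-10-2013 08:00:00", "01-10-2013 09:30:00",
                  "02-10-2013 10:00:00", "03-10-2013 11:00:00",
                  "03-10-2013 12:15:00"),
  group       = c("TEAM1", "TEAM2", "TEAM2", "TEAM1", "TEAM1")
)

toy_log <- toy_activities %>%
  # bupaR expects a date-time column, so parse the text first
  mutate(datestamp = as.POSIXct(datestamp,
                                format = "%d-%m-%Y %H:%M:%S",
                                tz = "UTC")) %>%
  simple_eventlog(
    case_id     = "incident_id",
    activity_id = "activity",
    timestamp   = "datestamp",
    resource_id = "group"
  )

# Basic sanity checks on the resulting log
n_cases(toy_log)
n_activities(toy_log)
```

`simple_eventlog()` fits data with one timestamp per activity; if the data records separate start and complete events, the `eventlog()` constructor described in the documentation is the better fit.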

### Dotted Chart
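
A dotted chart plots every event as a dot, with time on the x-axis and one row per case. The sketch below assumes the `processmapR` package is installed (it is called with an explicit `processmapR::` prefix) and again uses a toy log with hypothetical column names.

```r
library(dplyr)
library(bupaR)

# Toy event log; column names are hypothetical
toy_log <- tibble::tibble(
  incident_id = c("IM1", "IM1", "IM1", "IM2", "IM2"),
  activity    = c("Open", "Assignment", "Closed", "Open", "Closed"),
  datestamp   = as.POSIXct(c("2013-10-01 08:00:00", "2013-10-01 09:30:00",
                             "2013-10-02 10:00:00", "2013-10-03 11:00:00",
                             "2013-10-03 12:15:00"), tz = "UTC"),
  group       = c("TEAM1", "TEAM2", "TEAM2", "TEAM1", "TEAM1")
) %>%
  simple_eventlog(
    case_id     = "incident_id",
    activity_id = "activity",
    timestamp   = "datestamp",
    resource_id = "group"
  )

# x = "absolute" uses calendar time; x = "relative" aligns all cases at time zero
chart <- processmapR::dotted_chart(toy_log, x = "absolute")
chart
```

Switching between `x = "absolute"` and `x = "relative"` is a quick way to compare arrival patterns with case durations.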

### Process Maps
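
A process map shows activities as nodes and directly-follows relations as edges. The following sketch assumes `processmapR` is installed and uses a toy log with hypothetical column names. It also lists the trace variants, which helps when interpreting the map.

```r
library(dplyr)
library(bupaR)

# Toy event log with three cases and two variants; column names are hypothetical
toy_log <- tibble::tibble(
  incident_id = c("IM1", "IM1", "IM1", "IM2", "IM2", "IM3", "IM3", "IM3"),
  activity    = c("Open", "Assignment", "Closed", "Open", "Closed",
                  "Open", "Assignment", "Closed"),
  datestamp   = as.POSIXct(c("2013-10-01 08:00:00", "2013-10-01 09:30:00",
                             "2013-10-02 10:00:00", "2013-10-03 11:00:00",
                             "2013-10-03 12:15:00", "2013-10-04 08:00:00",
                             "2013-10-04 09:00:00", "2013-10-04 17:00:00"),
                           tz = "UTC"),
  group       = "TEAM1"
) %>%
  simple_eventlog(
    case_id     = "incident_id",
    activity_id = "activity",
    timestamp   = "datestamp",
    resource_id = "group"
  )

# Distinct activity sequences (variants) and how often each occurs
variants <- traces(toy_log)
variants

# Process map annotated with absolute frequencies (the default)
processmapR::process_map(toy_log)

# Same map annotated with relative frequencies
processmapR::process_map(toy_log, type = processmapR::frequency("relative"))
```

On the real data, drawing the map for a subset of cases and comparing it with the full map is one way to approach the subset question from the assignment.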