## Introduction 

For the analysis of the coffee and beverage logging data, we need several libraries:

- `FSharp.Stats` for regression, statistics, network generation, and signal smoothing
- `Plotly.NET` for all kinds of visualizations
- `FSharp.Data` for data import
- `Cytoscape.NET` for network visualization

The `.Interactive` packages of Plotly.NET and Cytoscape.NET are needed to render the charts within a notebook instead of opening a browser.

## NuGet
NuGet is the package manager for .NET. The `#r` directives below can be copied into `.fsx` scripts or polyglot notebooks to get access to the libraries.

In [1]:
// nuget references
#r "nuget: FSharp.Stats, 0.5.1-preview.1"
#r "nuget: Plotly.NET.Interactive, 4.2.1"
#r "nuget: FSharp.Data, 6.3.0"
#r "nuget: Cytoscape.NET.Interactive, 0.2.0"

// the specified library namespaces and submodules are opened 
open FSharp.Stats
open Plotly.NET
open Plotly.NET.StyleParam
open Plotly.NET.LayoutObjects
open FSharp.Data
open Cytoscape.NET
open System

Loading extensions from `C:\Users\bvenn\.nuget\packages\cytoscape.net.interactive\0.2.0\interactive-extensions\dotnet\Cytoscape.NET.Interactive.dll`

Loading extensions from `C:\Users\bvenn\.nuget\packages\plotly.net.interactive\4.2.1\interactive-extensions\dotnet\Plotly.NET.Interactive.dll`

Some FSharp.Stats functionality requires [LAPACK](https://www.netlib.org/lapack/) routines. After the initial package download, you can find them at `C:\Users\USERNAME\.nuget\packages\fsharp.stats\0.5.1-preview.1\netlib_LAPACK`. The prepared use cases do not need them, but if you want to load them anyway, uncomment the next two lines and replace `USERNAME` with your user name:

In [6]:
//FSharp.Stats.ServiceLocator.setEnvironmentPathVariable (@"C:\Users\USERNAME\.nuget\packages\fsharp.stats\0.5.1-preview.1\netlib_LAPACK")
//FSharp.Stats.Algebra.LinearAlgebra.Service()
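
The path above contains a `USERNAME` placeholder. As a minimal sketch, assuming NuGet uses its default global packages folder under the user profile (a custom `NUGET_PACKAGES` setting would change the location), the folder can be derived instead of typed by hand. The name `lapackPath` is illustrative and not part of FSharp.Stats:

```fsharp
open System
open System.IO

// Assumption: NuGet's default global packages folder (<user profile>/.nuget/packages) is in use.
let lapackPath =
    Path.Combine(
        Environment.GetFolderPath(Environment.SpecialFolder.UserProfile),
        ".nuget", "packages", "fsharp.stats", "0.5.1-preview.1", "netlib_LAPACK")

// Report whether the package has actually been downloaded to that location.
if Directory.Exists lapackPath then
    printfn "LAPACK folder found: %s" lapackPath
else
    printfn "LAPACK folder not found at: %s" lapackPath
```

If the folder exists, `lapackPath` could be passed to `ServiceLocator.setEnvironmentPathVariable` in place of the hard-coded string.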

## Types
To represent a coffee log, we need two custom types. Products within the CSBar system are categorized as `Coffee`, `Beer`, `Beverage`, or `Other`. The following `Category` type models this and additionally provides the static member `FromString`, which converts an input string to the corresponding case. Any unrecognized string maps to `Other`.

In [2]:
type Category =
    | Beer
    | Beverage
    | Coffee
    | Other
    with 
        /// takes a string that describes the category and returns the corresponding category type
        static member FromString (s: string) =
            match s with
            | "Beer" -> Beer
            | "Beverage" -> Beverage
            | "Coffee" -> Coffee
            | _ -> Other

Now we can construct an `Order` record type that represents a single line of coffeedata.txt. We could of course model the departments and products as types as well, but for now this level of typing is sufficient.

In [18]:
type Order = {
    DateTime    : System.DateTime
    Name        : string
    Gender      : char
    Product     : string
    Price       : float
    Department  : string
    Category    : Category
    Amount      : int
    } with
        /// takes the fields of a data row as input and creates an Order record
        static member Create time (name: string) gender product price department category amount = {
            DateTime  = time
            Name      = name
            Gender    = gender
            Product   = product
            Price     = price
            Department= department
            Category  = category
            Amount    = amount
            }

## Data import
FSharp.Data is used to import coffeedata.txt. While iterating over the individual rows, instances of the `Order` type are created.

_Note: You could also use Deedle or just `System.IO.File.ReadAllLines` to import the data._

In [10]:
// array of Order records that all of the analysis will be based on
let data = 
    let tmp =
        FSharp.Data.CsvFile
            //.Load(@"..\data\coffeedata.txt")
            .Load(@"C:\Users\bvenn\source\repos\brewing-discoveries-workshop\data\coffeedata.txt")
            .Cache()

    tmp.Rows
    |> Seq.map (fun row -> 
        Order.Create
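            // note: with a null format provider, '/' and ':' in the format string match the current culture's date and time separators
            (System.DateTime.ParseExact((row.GetColumn "DateTime"),"dd/MM/yyyy HH:mm:ss",null))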
            (row.GetColumn "Name")
            (row.GetColumn "Gender" |> char)
            (row.GetColumn "Product")
            (row.GetColumn "Price" |> float) 
            (row.GetColumn "Department")
            ((row.GetColumn "Category") |> Category.FromString)
            (row.GetColumn "Amount" |> int)
        )
    |> Array.ofSeq
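
The note above mentions reading the file line by line as an alternative. A minimal sketch of that approach follows, under several assumptions that are not confirmed by the source: the file is comma-separated, has a header row, contains no quoted fields with embedded commas, uses literal `/` and `:` separators in the timestamp, and lists its columns in the order used by `Order.Create`. The sample row and the names `sampleLines`, `parseRow`, and `parsedRows` are hypothetical:

```fsharp
open System
open System.Globalization

// Hypothetical sample in the assumed layout; the real file's column order may differ.
let sampleLines = [|
    "DateTime,Name,Gender,Product,Price,Department,Category,Amount"
    "03/02/2020 09:15:00,Alex,m,Espresso,0.5,Genesis,Coffee,1"
|]

/// Splits one comma-separated row and converts the typed fields.
/// Assumes no quoted fields that contain commas.
let parseRow (line: string) =
    let f = line.Split ','
    // InvariantCulture makes '/' and ':' match literally, independent of the machine's locale.
    let time = DateTime.ParseExact(f.[0], "dd/MM/yyyy HH:mm:ss", CultureInfo.InvariantCulture)
    let price = Double.Parse(f.[4], CultureInfo.InvariantCulture)
    time, f.[1], char f.[2], f.[3], price, f.[5], f.[6], int f.[7]

// Skip the header row, then parse every remaining line.
let parsedRows = sampleLines |> Array.skip 1 |> Array.map parseRow
```

With the real file, `System.IO.File.ReadAllLines` applied to the file path would take the place of `sampleLines`, and the resulting tuple fields could be handed to `Order.Create`. A CSV library remains the safer choice as soon as fields may contain quotes or separators.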


For visualization purposes, it is handy to color the users based on their department. The following function assigns a hex color code to each department:

In [12]:
/// takes the department name of an Order and returns its color in hex format (gray for unknown departments)
let getDepartmentColor (department: string) = 
    match department with 
    | "Breakroom Bandits" -> "#2b3ae9"
    | "Genesis"           -> "#f7da41"
    | "We Tried"          -> "#008b66"
    | "No Lucks Given"    -> "#987200"
    | "Toon Squad"        -> "#ff7f0e"
    | "Rumor Spreaders"   -> "#20b2aa"
    | "Risky Biscuits"    -> "#a230ed"
    | "Recruitables"      -> "#d21102"
    | "Employees of the Moment" -> "#19d3f3"
    | "Chargers"          -> "#dea57b"
    | "Kickstarters"      -> "#dea57b" // note: same color as "Chargers"
    | _                   -> "#8b8b8b"

/// Map that assigns a color to each person, based on the department
let persons = 
    data 
    |> Array.map (fun x -> x.Name,getDepartmentColor x.Department) 
    |> Array.distinct
    |> Map.ofArray
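
A name-to-color map like this is typically queried when styling chart markers or network nodes. The following is a minimal sketch of a safe lookup. It uses a small hypothetical map (`samplePersons`) in place of `persons`, and the helper `colorOf` is illustrative rather than part of the workshop code:

```fsharp
// Hypothetical stand-in for the `persons` map built above (name -> hex color).
let samplePersons =
    [| "Alex", "#f7da41"; "Sam", "#008b66" |]
    |> Map.ofArray

/// Looks up a person's color and falls back to gray for unknown names.
let colorOf (name: string) (colors: Map<string, string>) =
    colors
    |> Map.tryFind name
    |> Option.defaultValue "#8b8b8b"

let knownColor   = colorOf "Alex" samplePersons
let unknownColor = colorOf "Nobody" samplePersons
```

`Map.tryFind` is used here because indexing a map with a missing key throws a `KeyNotFoundException`, whereas the fallback keeps a plot renderable even if a name is absent.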

We successfully modeled and imported the data set. Now we can dive into the data and use data science to reveal interesting properties or surprising discoveries.

For a first glance, let's check which time frame the data covers:

In [15]:
let firstLog = Array.minBy (fun x -> x.DateTime) data
let lastLog  = Array.maxBy (fun x -> x.DateTime) data

firstLog,lastLog
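
The first and last log entries can also be turned into a duration. As a minimal sketch, assuming a handful of hypothetical timestamps (`sampleTimes`) in place of the `DateTime` fields of the real orders:

```fsharp
open System

// Hypothetical timestamps standing in for the DateTime fields of the orders.
let sampleTimes = [|
    DateTime(2020, 3, 1, 8, 30, 0)
    DateTime(2020, 1, 15, 14, 0, 0)
    DateTime(2020, 6, 30, 17, 45, 0)
|]

let firstTime = Array.min sampleTimes
let lastTime  = Array.max sampleTimes

// Subtracting two DateTime values yields a TimeSpan.
let coveredSpan : TimeSpan = lastTime - firstTime

printfn "From %s to %s (%d full days)"
    (firstTime.ToString "yyyy-MM-dd")
    (lastTime.ToString "yyyy-MM-dd")
    coveredSpan.Days
```

Applied to the notebook's values, the same subtraction would read `lastLog.DateTime - firstLog.DateTime`.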

How many persons are involved, and how many products are available?

In [17]:
let personsCount =
    data
    |> Array.distinctBy (fun x -> x.Name)
    |> Array.length

let productCount =
    data
    |> Array.distinctBy (fun x -> x.Product)
    |> Array.length

$"Persons: {personsCount}\nProducts: {productCount}"

Persons: 155
Products: 212

How many coffees, beers and beverages were ordered?

In [20]:
let coffeelogs   = data |> Array.filter (fun x -> x.Category = Coffee)   |> Array.length
let beerlogs     = data |> Array.filter (fun x -> x.Category = Beer)     |> Array.length
let beveragelogs = data |> Array.filter (fun x -> x.Category = Beverage) |> Array.length

$"Coffee: {coffeelogs}\nBeer: {beerlogs}\nBeverage: {beveragelogs}"

Coffee: 21484
Beer: 4647
Beverage: 10481
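
The three filters above count log entries, not ordered units, and they leave out the `Other` category. As a minimal sketch, assuming hypothetical `(category, amount)` pairs (`sampleLogs`) in place of the real `Order` records, both the number of entries and the number of units per category can be computed for all categories at once:

```fsharp
// Hypothetical (category, amount) pairs standing in for the Category and Amount fields.
let sampleLogs = [| "Coffee", 1; "Beer", 2; "Coffee", 1; "Beverage", 1; "Coffee", 3 |]

// Array.countBy returns one (key, count) pair per distinct key: the number of log entries.
let entriesPerCategory =
    sampleLogs
    |> Array.countBy fst
    |> Map.ofArray

// Grouping and summing the amounts gives the number of ordered units instead.
let unitsPerCategory =
    sampleLogs
    |> Array.groupBy fst
    |> Array.map (fun (category, logs) -> category, logs |> Array.sumBy snd)
    |> Map.ofArray
```

On the real data, the key function would be `fun x -> x.Category` and the summed value `x.Amount`. Whether entries or units are the more meaningful measure depends on how often `Amount` differs from 1 in the log.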

What properties do you want to check next?