# Perf Avore: A Rule Based Performance Analysis and Monitoring Tool in FSharp

For my 2021 F# Advent Submission (5 years of submissions!!!!), I developed a performance monitoring and analysis tool called "_Perf-Avore_". It applies user specified __Rules__ to Trace Events read from either an .ETL trace or a real time session. Each rule consists of __Conditions__ to match against those events and __Actions__ that are invoked once the conditions are met. A condition can check whether a trace event property is an anomaly, i.e. a deviant point according to an anomaly detection algorithm, or whether it is simply above a threshold value specified in the rule. Similarly, different actions lead to different outputs, such as printing out the callstack, charting the data point or simply alerting the user that a condition is met.

The __purpose__ of Perf Avore is to give users an easy and configurable way to detect and diagnose performance issues by specifying the details pertinent to those issues in the rule itself. A use case, for example, is detecting spikes in memory allocations that can put unwanted pressure on the Garbage Collector and inevitably slow down the process. A rule that tracks whether ``AllocationAmount`` on the ``GC/AllocationTick`` event goes above a specified amount, and then prints out the callstack, can shed light on the impetus behind the increased pressure.

## High Level Overview

![High Level Idea](Images/HighlevelIdea.png)

1. Users provide rules.
   1. Rules consist of conditions and actions.
   2. Conditions Include: 
      1. The Name of the Trace Event and the property they'd like to track. 
      2. The condition or case for which they'd like to act on.
2. Trace Events are proffered to the rules engine to apply the rules to.
3. Based on either a given trace or by real time monitoring, conditions are checked for and actions are invoked based on a stream of trace events.
4. Examples of Rules:
   1. ``GC/AllocationTick.AllocationAmount > 200000 : Print Alert``
   2. ``ThreadPoolWorkerThreadAdjustment/Stats.Throughput < 4 : Print CallStack``
   3. ``GC/HeapStats.GenerationSize0 isAnomaly DetectIIDSpike : Print Chart``

The code is available [here](https://github.com/MokoSan/FSharpAdvent_2021/tree/main/src/PerfAvore/PerfAvore.Console). To directly jump into the details without much ado, scroll down to the __Plan__ section.  

## Experience Developing in FSharp 

F#, once again, didn't fail to deliver an incredible development experience! 
Despite not developing in F# for an extended period of time (much to my regret - I kicked myself about this during last year's [submission](https://bit.ly/3hhhRjq) as well), I was able to let the muscle memory from my previous projects kick in and reached a productive state surprisingly quickly. I'd like to underscore that this is a testament to the user-friendly nature of the language itself (and not necessarily to my somewhat sophomoric acumen).

Granted, I didn't make use of all the bells and whistles the language has to offer, but what I did make use of was damn easy to get stuff done with.
The aspects of the language that made it easy to develop a Domain Specific Language, a parser for it and the dynamic application of the actions are Pattern Matching and Immutable Functional Data Structures such as Records and Discriminated Unions. These express the domain succinctly and lucidly, not only for the developer but also for the reader.

An image that typifies the incredibly accessible nature of F# is the following one filched from a presentation by [Don Syme](https://twitter.com/dsymetweets) and [Kathleen Dollard](https://twitter.com/KathleenDollard) during this year's .NET Conf in November:

![Why FSharp](Images/WhyFSharp.jpg)

## Inspiration For the Project

Perf Avore was heavily inspired by [maoni0's](https://twitter.com/maoni0) [realmon](https://github.com/Maoni0/realmon), a monitoring tool that tells you when GCs happen in a process and some characteristics about these GCs. My contributions and associated interactions for realmon were instrumental in coming up with the idea and its implementation.

Additionally, as a Perf Engineer, I find that there are times where I need to arduously load traces in Perf View, resolve symbols and wait until all the windows open up to do basic things such as look up a single call stack for a single event or look up the payload value of a single event. By devising a simpler solution, I wish to reduce my perf investigation time as I build on this project.

Now that a basic overview and other auxiliary topics have been covered, without much more ceremony, I'll be diving into how I built Perf Avore. 

## Plan

The plan to get rule applications working is threefold:

1. __Parse Rules__: Convert the user inputted string based rules to a domain defined Rule.
2. __Process Trace Events__: Retrieve trace events from either a trace or a real time process.
3. __Apply Rules__: If the conditions of a rule are met, invoke the action associated with the rule.

![Birds Eye View](Images/BirdsEyeView.png)

However, before we go further with the implementation, it is of paramount importance to define the domain.

## The Domain

A Rule is defined as having a __Condition__ and an __Action__. 

``GC/AllocationTick.AllocationAmount > 200000 : Print Alert``

Here, the user requests that for the said process, an alert will be printed if the ``AllocationAmount`` of the ``GC/AllocationTick`` event is greater than 200,000 bytes. The action if the condition is met is that of alerting the user by outputting a message. 

A rule, more generally, is of the following format: 
``EventName.PropertyName ConditionalOperator ConditionalOperand : ActionOperator ActionOperand``

where:

| Part | Description | 
| ----------- | ----------- |
| Event Name | The event name from the trace / real time analysis for which we want to look up the property | 
| Property Name | A property of type double (this may change in the future) on the event for which we want to construct a rule |
| Conditional Operator | An operator that, along with the Conditional Operand, dictates the situation in which we'll invoke an action |
| Conditional Operand | The value, or the name of the anomaly detection algorithm, that along with the Conditional Operator dictates the situation in which we'll invoke an action |
| Action Operator | The operator that, along with the Action Operand, is invoked if a condition is met |
| Action Operand | The operand to which the Action Operator is applied if a condition is met |

The __Condition__ is modeled as the following combination of records and discriminated unions:

```fsharp
// src/PerfAvore/PerfAvore.Console/RulesEngine/Domain.fs

type Condition = 
    { Conditioner      : Conditioner
      ConditionType    : ConditionType
      ConditionalValue : ConditionalValue }
and Conditioner = 
    { ConditionerEvent    : ConditionerEvent
      ConditionerProperty : ConditionerProperty }
and ConditionType = 
    | LessThan
    | LessThanEqualTo
    | GreaterThan
    | GreaterThanEqualTo
    | Equal
    | NotEqual
    | IsAnomaly
and ConditionalValue =
    | Value of double
    | AnomalyDetectionType of AnomalyDetectionType 
and ConditionerEvent    = string
and ConditionerProperty = string
and AnomalyDetectionType =
    | DetectIIDSpike
```

To accommodate Anomaly Detection algorithms, we add ``IsAnomaly`` as a ``ConditionType`` which, rather than relying on a hardcoded threshold for the Conditional Value, delegates the decision to invoke an action to an Anomaly Detection algorithm. The one implemented for this submission is an Independent and Identically Distributed (IID) Spike anomaly detection algorithm; more details are given below.

For the sake of completeness, the conditions we define are the following:

| Condition Operation | Description | 
| ----------- | ----------- |
| IsAnomaly | The condition to match on an anomaly detection algorithm. | 
| > >= < <= != = | Self explanatory conditional matching based on the value of the event property specified by the rule |
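
To make the threshold operators concrete, here is a minimal sketch of how they could be evaluated against an observed property value. The function name ``tryEvaluateThreshold`` is hypothetical and is not taken from the Perf Avore source; ``IsAnomaly`` is deliberately left unevaluated since it needs an anomaly detection algorithm rather than a fixed threshold.

```fsharp
// Hypothetical sketch; the ConditionType union is repeated so the snippet stands alone.
type ConditionType =
    | LessThan
    | LessThanEqualTo
    | GreaterThan
    | GreaterThanEqualTo
    | Equal
    | NotEqual
    | IsAnomaly

// Returns Some result for the threshold comparisons and None for IsAnomaly,
// which cannot be decided from a single threshold.
let tryEvaluateThreshold (conditionType : ConditionType) (threshold : double) (observed : double) : bool option =
    match conditionType with
    | LessThan           -> Some (observed <  threshold)
    | LessThanEqualTo    -> Some (observed <= threshold)
    | GreaterThan        -> Some (observed >  threshold)
    | GreaterThanEqualTo -> Some (observed >= threshold)
    | Equal              -> Some (observed =  threshold)
    | NotEqual           -> Some (observed <> threshold)
    | IsAnomaly          -> None

// GC/AllocationTick.AllocationAmount > 200000, checked against an observed value of 250000.
let isMet = tryEvaluateThreshold GreaterThan 200000.0 250000.0
```

Returning an option keeps the match exhaustive without pretending that an anomaly check is a plain comparison.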

An Action is modeled as a record of an __ActionOperator__ and an __ActionOperand__:

```fsharp
// src/PerfAvore/PerfAvore.Console/RulesEngine/Domain.fs

type Action = 
    { ActionOperator: ActionOperator; ActionOperand: ActionOperand }
and ActionOperator = 
    | Print
and ActionOperand =
    | Alert
    | CallStack
    | Chart
```

The following are the currently implemented action operands:

| Name of Action Operands | Description | 
| ----------- | ----------- |
| Alert | Alerting Mechanism that'll print out pertinent details about the rule invoked and why it was invoked. |
| Call Stack | If a call stack is available, it will be printed out on the console. |
| Chart | A chart of data points preceding and including the one that triggered the condition of the rule is generated and rendered as an html file | 

As of now, ``Print`` is the only operator; it simply outputs the operand to the Console.
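
Since both the operator and the operand are discriminated unions, dispatching on an action can be a single exhaustive pattern match. The sketch below only returns a description of what each combination would do; ``describeAction`` is a hypothetical helper and not part of the Perf Avore source.

```fsharp
// Hypothetical sketch; the Action types are repeated so the snippet stands alone.
type ActionOperator =
    | Print
type ActionOperand =
    | Alert
    | CallStack
    | Chart
type Action =
    { ActionOperator : ActionOperator; ActionOperand : ActionOperand }

// Matching on the (operator, operand) pair lets the compiler warn about
// any combination that is added to the domain later but not handled here.
let describeAction (action : Action) : string =
    match action.ActionOperator, action.ActionOperand with
    | Print, Alert     -> "Print an alert with the rule and why it was invoked."
    | Print, CallStack -> "Print the call stack of the triggering event, if one is available."
    | Print, Chart     -> "Render a chart of the recent data points as an html file."

let description = describeAction { ActionOperator = Print; ActionOperand = Chart }
```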

The Rule is a combination of a Condition and an Action, along with an identifier and the original rule passed in by the user, and is therefore modeled as:

```fsharp
// src/PerfAvore/PerfAvore.Console/RulesEngine/Domain.fs

open System // Needed for Guid.

type Rule = 
    { Id           : Guid
      Condition    : Condition
      Action       : Action 
      InputRule    : string }
```
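
As an illustration of how these types fit together, the sketch below builds the ``GC/AllocationTick.AllocationAmount > 200000 : Print Alert`` rule by hand and then derives an anomaly based variant from it. The domain types are repeated in dependency order so the snippet stands alone; the value names ``allocationRule`` and ``anomalyRule`` are made up for this example.

```fsharp
open System

// Domain types repeated from the post so the sketch is self-contained.
type ConditionerEvent     = string
type ConditionerProperty  = string
type AnomalyDetectionType = | DetectIIDSpike
type ConditionalValue =
    | Value of double
    | AnomalyDetectionType of AnomalyDetectionType
type ConditionType =
    | LessThan | LessThanEqualTo | GreaterThan | GreaterThanEqualTo
    | Equal | NotEqual | IsAnomaly
type Conditioner = { ConditionerEvent : ConditionerEvent; ConditionerProperty : ConditionerProperty }
type Condition   = { Conditioner : Conditioner; ConditionType : ConditionType; ConditionalValue : ConditionalValue }
type ActionOperator = | Print
type ActionOperand  = | Alert | CallStack | Chart
type Action = { ActionOperator : ActionOperator; ActionOperand : ActionOperand }
type Rule   = { Id : Guid; Condition : Condition; Action : Action; InputRule : string }

// The threshold rule from the example above, written out as a domain value.
let allocationRule : Rule =
    { Id        = Guid.NewGuid()
      Condition =
          { Conditioner      = { ConditionerEvent = "GC/AllocationTick"; ConditionerProperty = "AllocationAmount" }
            ConditionType    = GreaterThan
            ConditionalValue = Value 200000.0 }
      Action    = { ActionOperator = Print; ActionOperand = Alert }
      InputRule = "GC/AllocationTick.AllocationAmount > 200000 : Print Alert" }

// The same event and property, but matched by the anomaly detection algorithm instead.
let anomalyRule : Rule =
    { allocationRule with
        Id        = Guid.NewGuid()
        Condition =
            { allocationRule.Condition with
                ConditionType    = IsAnomaly
                ConditionalValue = ConditionalValue.AnomalyDetectionType DetectIIDSpike }
        InputRule = "GC/AllocationTick.AllocationAmount isAnomaly DetectIIDSpike : Print Alert" }
```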

Now that we have defined the domain, we can comfortably dive into the rule parsing logic, which makes extensive use of pattern matching. The rules are first deserialized from a list in a specified JSON file that could look like the following:

```
[ 
    "GC/AllocationTick.AllocationAmount > 108000: Print Alert",
    "GC/AllocationTick.AllocationAmount isAnomaly DetectIIDSpike : Print CallStack"
]
```
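
Reading such a file back into a list of rule strings could look like the sketch below. It assumes ``System.Text.Json`` is used for deserialization, which is an assumption on my part for this example rather than something stated above, and ``loadRules`` is a hypothetical helper name. The JSON is inlined here so the snippet runs without a file on disk.

```fsharp
open System.Text.Json

// Inlined copy of the rules file shown above.
let rulesJson = """
[
    "GC/AllocationTick.AllocationAmount > 108000: Print Alert",
    "GC/AllocationTick.AllocationAmount isAnomaly DetectIIDSpike : Print CallStack"
]
"""

// Deserialize the JSON array into an F# list of raw rule strings.
let loadRules (json : string) : string list =
    JsonSerializer.Deserialize<string[]>(json)
    |> List.ofArray

let rules = loadRules rulesJson
```

Each string in ``rules`` is then ready to be handed to the rule parser described in the next step.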

## Step 1: Parse Rule

![Step 1](Images/Step1_ParseRule.png)

This first step's goal is to convert the user inputted rule from a string into a Rule defined in our domain. The parsing logic is broken into two main functions that parse the Condition and the Action separately. The ``parseCondition`` function is defined as follows and constructs the condition from the aforementioned constituents:

```fsharp
// src/PerfAvore/PerfAvore.Console/RulesEngine/Parser.fs

open System // Needed for StringSplitOptions and Double.

let parseCondition (conditionAsString : string) : Condition = 

    let splitCondition : string[] = conditionAsString.Split(" ", StringSplitOptions.RemoveEmptyEntries)
    
    // Precondition check
    if splitCondition.Length <> 3
    then invalidArg (nameof conditionAsString) ("Incorrect format of the condition. Format is: Event.Property Condition ConditionalValue. For example: GCEnd.SuspensionTimeMSec >= 298")
    
    // Condition Event and Property
    let parseConditioner : Conditioner = 
        let splitConditioner : string[] = splitCondition.[0].Split(".", StringSplitOptions.RemoveEmptyEntries)
        let parseConditionEvent : ConditionerEvent = splitConditioner.[0]
        let parseConditionProperty : ConditionerProperty = splitConditioner.[1]

        { ConditionerEvent = parseConditionEvent; ConditionerProperty = parseConditionProperty }

    // Condition Type
    let parseConditionType : ConditionType =
        match splitCondition.[1].ToLower() with
        | ">"  | "greaterthan"                                 -> ConditionType.GreaterThan 
        | "<"  | "lessthan"                                    -> ConditionType.LessThan
        | ">=" | "greaterthanequalto" | "greaterthanorequalto" -> ConditionType.GreaterThanEqualTo
        | "<=" | "lessthanequalto"    | "lessthanorequalto"    -> ConditionType.LessThanEqualTo
        | "="  | "equal"              | "equals"               -> ConditionType.Equal
        | "!=" | "notequal"                                    -> ConditionType.NotEqual
        | "isanomaly"                                          -> ConditionType.IsAnomaly
        | _                                                    -> invalidArg (nameof splitCondition) ($"{splitCondition.[1]} is an unrecognized condition type.")

    // Condition Value: a number is a threshold, anything else must name an anomaly detection type.
    let parseConditionValue : ConditionalValue =
        let conditionalValueAsString = splitCondition.[2].ToLower()
        let checkDouble, doubleValue = Double.TryParse conditionalValueAsString 
        match checkDouble, doubleValue with
        | true, v -> ConditionalValue.Value(v)
        | false, _ -> 
            match conditionalValueAsString with
            | "detectiidspike" -> ConditionalValue.AnomalyDetectionType(AnomalyDetectionType.DetectIIDSpike)
            | _                -> invalidArg (nameof splitCondition) ($"{conditionalValueAsString} is an unrecognized anomaly detection type.")
        
    { Conditioner = parseConditioner; ConditionType = parseConditionType; ConditionalValue = parseConditionValue }
```
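
One detail worth noting is that the ``Event.Property`` part is indexed directly after splitting on ``.``, so an input without a property name would fail with an index error rather than a descriptive message. A possible alternative, shown purely as a sketch with the hypothetical name ``tryParseConditioner``, is to pattern match on the shape of the split array and return a ``Result``:

```fsharp
open System

// Hypothetical variant that validates the "EventName.PropertyName" shape
// instead of indexing into the split array.
let tryParseConditioner (conditioner : string) : Result<string * string, string> =
    match conditioner.Split(".", StringSplitOptions.RemoveEmptyEntries) with
    | [| eventName; propertyName |] -> Ok (eventName, propertyName)
    | _ -> Error $"'{conditioner}' is not of the form EventName.PropertyName."

let valid   = tryParseConditioner "GC/AllocationTick.AllocationAmount"
let invalid = tryParseConditioner "GCEnd"
```

Here ``valid`` carries the event and property names as a tuple, while ``invalid`` carries an error message that the caller can surface to the user.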

Similarly, the action parsing logic is implemented via the ``parseAction`` function:

```fsharp
// src/PerfAvore/PerfAvore.Console/RulesEngine/Parser.fs

let parseAction (actionAsAString : string) : Action = 
    let splitAction : string[] = actionAsAString.Split(" ", StringSplitOptions.RemoveEmptyEntries)

    // ActionOperator
    let parseActionOperator : ActionOperator = 
        match splitAction.[0].ToLower() with
        | "print" -> ActionOperator.Print
        | _       -> invalidArg (nameof splitAction) ($"{splitAction.[0]} is an unrecognized Action Operator.")

    // ActionOperand 
    let parseActionOperand : ActionOperand = 
        match splitAction.[1].ToLower() with
        | "alert"     -> ActionOperand.Alert
        | "callstack" -> ActionOperand.CallStack
        | "chart"     -> ActionOperand.Chart
        | _           -> invalidArg (nameof splitAction) ($"{splitAction.[1]} is an unrecognized Action Operand.")

    { ActionOperator = parseActionOperator; ActionOperand = parseActionOperand }
```

Finally, these two parsing functions are combined to parse a particular rule:

```fsharp
// src/PerfAvore/PerfAvore.Console/RulesEngine/Parser.fs

let parseRule (ruleAsString : string) : Rule = 
    let splitRuleAsAString : string[] = ruleAsString.Split(":")
    let condition : Condition = parseCondition splitRuleAsAString.[0]
    let action : Action = parseAction splitRuleAsAString.[1]
    { Condition = condition; Action = action; InputRule = ruleAsString; Id = Guid.NewGuid() }
```
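
The split on ``:`` is what separates the condition text from the action text, and it assumes exactly one ``:`` in the rule. The following sketch isolates that step with a hypothetical helper, ``trySplitRule``, that reports a malformed rule instead of indexing past the end of the array; it is an illustration and not the code used in the project.

```fsharp
// Hypothetical helper: split a raw rule into its condition and action text.
let trySplitRule (ruleAsString : string) : Result<string * string, string> =
    match ruleAsString.Split(':') with
    | [| condition; action |] -> Ok (condition.Trim(), action.Trim())
    | _ -> Error $"'{ruleAsString}' is not of the form Condition : Action."

let wellFormed = trySplitRule "GC/AllocationTick.AllocationAmount > 108000: Print Alert"
let malformed  = trySplitRule "GC/AllocationTick.AllocationAmount > 108000"
```

The two halves of ``wellFormed`` are exactly the strings that the condition and action parsers expect.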

Now that we have the functionality of parsing a rule, we want to move on to Step 2 i.e. Processing Trace Events. 

## Step 2: Process Trace Events

![Step 2: Process Trace Events](Images/Step2_ProcessTraceEvents.png)

Since both reading Trace Events from an .ETL file and real time event processing had to be accommodated, we split the logic using a command line parameter, ``TracePath``; the absence of this parameter indicates that we want to kick off the real time processing logic.

We make use of ``Argu``, an F# specific command line argument parsing library that makes it convenient to pattern match on the types of the command line args, such as the following:


```fsharp
// src/PerfAvore/PerfAvore.Console/CommandLine.fs

#r "nuget:Argu" // Added specifically for this notebook.

open Argu

type Arguments = 
    | [<Mandatory>] ProcessName of string
    | TracePath of Path : string
    | RulesPath of Path : string

    interface IArgParserTemplate with
        member s.Usage =
            match s with
            | TracePath   _ -> "Specify a Path to the Trace."
            | ProcessName _ -> "Specify a Process Name."
            | RulesPath   _ -> "Specify a Path to a Json File With the Rules."
```

The usage of the trace path is incorporated like the following:

```fsharp
// src/PerfAvore/PerfAvore.Console/Program.fs

let argv              = [| "--tracepath"; "Path.etl"; "--processname"; "Test.exe"|]
let parser            = ArgumentParser.Create<Arguments>()
let parsedCommandline = parser.Parse(inputs = argv)

let containsTracePath : bool = parsedCommandline.Contains TracePath
containsTracePath
```

To interface with the Trace Events, we use the ``Microsoft.Diagnostics.Tracing.TraceEvent`` library, which contains the ``TraceLog`` API that helps us read events both from the .ETL file and during real time processing. The stream of events is obtained by one of two functions, chosen based on whether ``tracepath`` is specified as a command line argument.
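
The decision between the two paths can be sketched independently of the ``TraceEvent`` library. The snippet below is only an illustration of that dispatch: the ``EventSourceMode`` type and both function names are hypothetical, and it does not open a trace or start a session.

```fsharp
// Hypothetical model of the two ways events can be obtained.
type EventSourceMode =
    | TraceFile of tracePath : string * processName : string
    | RealTime  of processName : string

// The absence of a trace path means real time processing.
let chooseEventSourceMode (tracePath : string option) (processName : string) : EventSourceMode =
    match tracePath with
    | Some path -> TraceFile (path, processName)
    | None      -> RealTime processName

let describeMode (mode : EventSourceMode) : string =
    match mode with
    | TraceFile (path, processName) -> $"Read events for {processName} from the trace at {path}."
    | RealTime processName          -> $"Monitor {processName} in real time."

let fromTrace = chooseEventSourceMode (Some "Path.etl") "Test.exe"
let realTime  = chooseEventSourceMode None "Test.exe"
```

Modeling the choice as a discriminated union keeps the rest of the pipeline agnostic of where the events come from.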

## Step 3: Apply Rules

- Anomaly detection algorithm in detail
- Call stack resolution and printing
- Alerting code
- Charting

## Conclusion

## References

1. [Taking Stock of Anomalies with F# And ML.NET](https://www.codesuji.com/2019/05/24/F-and-MLNet-Anomaly/)
2. [A CPU Sampling Profiler in Less Than 200 Lines](https://lowleveldesign.org/2020/10/13/a-cpu-sampling-profiler-in-less-than-200-lines/)
3. [Tutorial: Detect anomalies in time series with ML.NET](https://docs.microsoft.com/en-us/dotnet/machine-learning/tutorials/phone-calls-anomaly-detection)
4. [Plug-in martingales for testing exchangeability on-line: arXiv:1204.3251](https://arxiv.org/pdf/1204.3251.pdf)
5. [Atle Rudshaug's Submission of a Console App](https://atlemann.github.io/fsharp/2021/12/11/fs-crypto.html)