# Part I - Garbage Collector Essentials

Excellent articles about GC log anatomy:
 - [Monday with JVM logs - G1GC Stop-the-world phases](https://krzysztofslusarski.github.io/2021/08/10/monday-phases.html)

Garbage collector and JVM memory allocation series:
 - [Stages and levels of Java garbage collection](https://developers.redhat.com/articles/2021/08/20/stages-and-levels-java-garbage-collection#)

# Part II - Initial steps
*(Import all the required [Kotlin for data science](https://kotlinlang.org/docs/data-science-overview.html) libraries)*

In [1]:
%use kandy(0.4.4)
%use dataframe(0.11.0)
import org.jetbrains.kotlinx.dataframe.io.readCSV

//import all GC events from the CSV file (generated by the gc-log-wizard application)
val dfRaw = DataFrame.readCSV("../gclogs/pauses.csv")




In [2]:
//readCSV already parses startedAt as kotlinx.datetime.Instant (see the schema in Part III),
//so no conversion is needed; the later toJavaInstant() call relies on the Instant type
//sort the events by timestamp, ascending
val df = dfRaw.sortBy { startedAt }


In [3]:
//initial theme for charts
val blankTheme = theme {
    global.line {
        blank = true
    }
    blankAxes()
}

# Part III - Events data structure

*(Now we can use the Kotlin DataFrame DSL to examine the data.)*


In [4]:
//list all columns
df.schema()


type: String
event_type: String
startedAt: kotlinx.datetime.Instant
duration: Double
cause: String
cpu_user: Double
cpu_kernel: Double
cpu_wallClock: Double
heap_occupancy_before_coll: Int
heap_occupancy_after_coll: Int
heap_size_before_coll: Int
heap_size_after_coll: Int
perm_or_meta_occupancy_before_coll: Int
perm_or_meta_occupancy_after_coll: Int
perm_or_meta_size_before_coll: Int
perm_or_meta_size_after_coll: Int


### The most useful columns

#### *1. type - (GC Event type)* 
The GC event type. The possible values are shown in [this chart](https://github.com/microsoft/gctoolkit/blob/main/images/GCToolkit_Events.png) as the G1GcEvent family.

#### *2. event_type*
Groups events into stop-the-world pauses and concurrent phases. It lets us calculate the total pause duration and compare it with the duration of the concurrent phases.

#### *3. startedAt* 
Timestamp at which the event started

#### *4. duration* 
Phase duration **(in seconds!)**

#### *5. cause - GC causes* 

#### *Summary* 
**Based on these columns we can perform many useful aggregations and visualisations!**


# Part IV - Analysis of GC performance

## All phases - examination

In [5]:
//get aggregated info about each gc phase duration
val all = df.count()
val phasesSummary=df.select {type and duration}.groupBy{type}.aggregate {
    count() into "count"
    sum(duration) into "sum [s]"
    max(duration) into "max [s]"
    min(duration) into "min [s]"
    mean(duration) into "avg [s]"
    ((count().toDouble()*100/all))  into "percentage"
 }
phasesSummary
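
To make it clear what each aggregated column contains, here is a minimal sketch of the same per-type summary written with plain Kotlin collections instead of the DataFrame DSL. The `GcEvent` and `PhaseSummary` classes, the `summarize` function and the sample durations are hypothetical and serve only as an illustration; they are not taken from the real log.

```kotlin
// Hypothetical, simplified event: only the two columns used by the summary
data class GcEvent(val type: String, val duration: Double)

// One row of the summary, mirroring the columns built in the cell above
data class PhaseSummary(
    val type: String,
    val count: Int,
    val sum: Double,
    val max: Double,
    val min: Double,
    val avg: Double,
    val percentage: Double,
)

// Group events by type and compute count, sum, max, min, average
// and the share of each type in the total number of events
fun summarize(events: List<GcEvent>): List<PhaseSummary> =
    events.groupBy { it.type }.map { (type, group) ->
        PhaseSummary(
            type = type,
            count = group.size,
            sum = group.sumOf { it.duration },
            max = group.maxOf { it.duration },
            min = group.minOf { it.duration },
            avg = group.map { it.duration }.average(),
            percentage = group.size * 100.0 / events.size,
        )
    }

fun main() {
    // Made-up durations in seconds
    val sample = listOf(
        GcEvent("Young", 0.25),
        GcEvent("Young", 0.75),
        GcEvent("Mixed", 0.5),
        GcEvent("Young", 0.5),
    )
    summarize(sample).forEach(::println)
}
```

Note that "percentage" is the share of events by count, not by duration, which is why the bar chart of "sum [s]" further down can tell a different story than the pie chart.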

In [6]:
val (w, h) = 700 to 350

phasesSummary.plot {
    pie {
        slice(percentage)
        fillColor(type)
        size=30.0
    }
    layout {
        size = w to h
        theme(blankTheme)
        title="GC phases - percentage"
    }
 }

In [7]:
phasesSummary.plot {
    bars {
        x(type)
        y("sum [s]")

        //color each bar by its total duration (the column holds Double values)
        fillColor("sum [s]"<Double>()) {
            scale = continuous(range = Color.YELLOW..Color.RED)
        }
        borderLine.width = 0.0
    }
    layout.title = "Phases duration sum [s]"
    layout.size = 950 to 500
}





## Pauses - examination


<img style="float: left; margin-right:20px;" src="idea.png">  **Tip:** The phases of most interest to us are those that stop the application threads.<br> These are called *stop-the-world* phases. <br> We can find them using the filter *event_type=="PauseEvent"*!

In [37]:
//get aggregated info about the duration of each stop-the-world pause type
val all = df.filter{event_type=="PauseEvent"}.count()
val pausesSummary=df.filter{event_type=="PauseEvent"}.select {type and duration}.groupBy{type}.aggregate {
    count() into "count"
    sum(duration) into "sum [s]"
    max(duration) into "max [s]"
    min(duration) into "min [s]"
    mean(duration) into "avg [s]"
    ((count().toDouble()*100/all))  into "percentage"
 }
pausesSummary

In [39]:
val (w, h) = 700 to 350

pausesSummary.plot {
    pie {
        slice(count)
        fillColor(type)
        size=30.0
    }
    layout {
        size = w to h
        theme(blankTheme)
        title="Stop the world pauses - percentage"
    }
 }

In [27]:
pausesSummary.plot {
    bars {
        x(type)
        y("sum [s]")

        //color each bar by its total duration (the column holds Double values)
        fillColor("sum [s]"<Double>()) {
            scale = continuous(range = Color.YELLOW..Color.RED)
        }
        borderLine.width = 0.0
    }
    layout.title = "Stop the world pauses duration sum [s]"
    layout.size = 950 to 500
}

In [10]:
//get the longest stop-the-world pause - select only indicated columns
df.filter{event_type=="PauseEvent"}.select{type and event_type and startedAt and duration and cause}.maxBy{duration}


## GC event causes - examination

<img style="float: left; margin-right:20px;" src="idea.png">  **Tip:** _(screenshot from the article [Monday with JVM logs - G1GC Stop-the-world phases](https://krzysztofslusarski.github.io/2021/08/10/monday-phases.html))_

![](gc_causes.png)

In [50]:
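//count events per cause; events without a cause are skipped so that the percentages add up to 100
val withCause=df.filter{cause!=null}
val gcCausesAll=withCause.count()
val gcCauses=withCause.groupBy{cause}.aggregate{
   count() into "count"
   ((count().toDouble()*100/gcCausesAll))  into "percentage"
    }.sortBy{it["count"].desc()}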

gcCauses


In [53]:
val (w, h) = 700 to 350

gcCauses.plot {
    pie {
        slice(percentage)
        fillColor(cause)
        size=30.0
    }
    layout {
        size = w to h
        theme(blankTheme)
        title="GC causes - percentage"
    }
 }

## App Healthcheck - examination

<img style="float: left; margin-right:20px;" src="idea.png">  **Tip:** TODO

In [13]:
import kotlinx.datetime.toJavaInstant
import java.time.Duration

//the observed window (first to last logged event) is used as an approximation of the application uptime
val first = df.first().startedAt.toJavaInstant()
val last = df.last().startedAt.toJavaInstant()
val uptimeInSeconds = Duration.between(first,last).toSeconds()
//total time spent in stop-the-world pauses
val pausedDurationInSeconds:Double=df.filter {event_type=="PauseEvent"}.select{duration}.sum().get("duration") as Double

//percentage of the observed window during which the application threads were stopped
val percentage=(pausedDurationInSeconds * 100)/uptimeInSeconds

println(first)
println(last)
println(pausedDurationInSeconds)
println(percentage)


2023-07-18T16:38:34.550Z
2023-07-19T15:50:04.875Z
83.78576800000022
0.10035425559947327
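
The arithmetic behind the last printed value can be reproduced with the standard library alone. The sketch below reuses the two timestamps and the total pause time printed above; the helper names `uptimeSeconds` and `pausePercentage` are hypothetical and introduced only for this illustration.

```kotlin
import java.time.Duration
import java.time.Instant

// Whole seconds between the first and the last logged event,
// used here as an approximation of the application uptime
fun uptimeSeconds(first: Instant, last: Instant): Long =
    Duration.between(first, last).seconds

// Share of the observed window spent in stop-the-world pauses, in percent
fun pausePercentage(pausedSeconds: Double, uptimeSeconds: Long): Double =
    pausedSeconds * 100 / uptimeSeconds

fun main() {
    val first = Instant.parse("2023-07-18T16:38:34.550Z")
    val last = Instant.parse("2023-07-19T15:50:04.875Z")
    val uptime = uptimeSeconds(first, last)
    println(uptime) // 83490, i.e. 23 h 11 min 30 s
    println(pausePercentage(83.78576800000022, uptime)) // about 0.1 (percent)
}
```

In other words, during roughly 23 hours of observation the application threads were stopped for about 84 seconds in total, which is about 0.1% of the time.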


In [9]:
//Summarize stop-the-world and concurrent phase time

df.groupBy{event_type}.sum{ duration }.sortByDesc{duration}

In [12]:
//count the causes of Young collections
df.filter { type=="Young" }.select{cause}.groupBy{cause}.count()


# Part V - Analysis of SafePoints

<img style="float: left; margin-right:20px;" src="idea.png">  **Tip:** TODO

# Part VI - Analysis of Heap occupancy

<img style="float: left; margin-right:20px;" src="idea.png">  **Tip:** TODO

In [14]:


//heap occupancy after each Young collection, plotted over time
df.filter{type=="Young"}.sortBy{startedAt}.plot {
    points {
        x(startedAt)
        y(heap_occupancy_after_coll)
        size = 1.0
    }
    layout {
        size = 2000 to 950
    }
}