# Percent of developers not running tests locally

Counts how many developers don't run tests on their local machine. This is a deliberately simple example to demonstrate the dataframe library. To plot the results, you can use [Kotlin/kandy](https://github.com/Kotlin/kandy).

In [1]:
%useLatestDescriptors
%use coroutines(v=1.7.1)
%use gradle-enterprise-api-kotlin(version=0.15.1)
%use dataframe(v=0.10.0)

import com.gabrielfeo.gradle.enterprise.api.*
import kotlinx.coroutines.*
import kotlinx.coroutines.flow.*
import java.time.*

### Parameters

Increase `maxBuilds` to get meaningful results. While testing, keep it low so the fetch finishes faster.

In [2]:
val start = LocalDate.now().minusMonths(1)
val maxBuilds = 5

### Fetch builds

Fetch builds from the API. Usernames are hashed to protect the developers' privacy, since this is a public example.

In [3]:
import java.util.LinkedList
import java.security.MessageDigest

val md5 = MessageDigest.getInstance("MD5")

val buildsTable = runBlocking {
    val startMilli = start.atStartOfDay().toInstant(ZoneOffset.UTC).toEpochMilli()
    gradleEnterpriseApi.getGradleAttributesFlow(since = startMilli)
        .filter { "CI" !in it.tags }
        .take(maxBuilds)
        .toList(LinkedList())
        .toDataFrame {
            "id" from { it.id }
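            // Store the hash as a hex string: a raw ByteArray compares by reference,
            // so groupBy("username") would put every build in its own group.
            // A missing username is hashed as an empty string.
            "username" from {
                md5.digest(it.environment.username.orEmpty().toByteArray())
                    .joinToString("") { byte -> "%02x".format(byte) }
            }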
            "tasks" from { it.requestedTasks.joinToString(" ") }
        }
}

// Last statement so that Jupyter will render it
buildsTable
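The username hashing in this cell can be tried in isolation. The sketch below is only an illustration: `md5Hex` is a hypothetical helper, not part of the notebook or of any library used here. It shows why a hex string is a convenient form for a column that will later be grouped: strings compare by value, while `ByteArray` instances compare by reference.

```kotlin
import java.security.MessageDigest

// Hypothetical helper (not part of the notebook): MD5 of a string as lowercase hex.
fun md5Hex(input: String): String {
    val bytes = MessageDigest.getInstance("MD5").digest(input.toByteArray(Charsets.UTF_8))
    // Each byte becomes two hex digits.
    return bytes.joinToString("") { byte -> "%02x".format(byte) }
}

fun main() {
    println(md5Hex("hello")) // 5d41402abc4b2a76b9719d911017c592

    // Equal inputs give equal strings, so they land in the same group.
    println(md5Hex("alice") == md5Hex("alice")) // true

    // Raw digests are distinct array instances, so equals() is false.
    val a = MessageDigest.getInstance("MD5").digest("alice".toByteArray())
    val b = MessageDigest.getInstance("MD5").digest("alice".toByteArray())
    println(a.equals(b)) // false
    println(a.contentEquals(b)) // true
}
```

MD5 is used here only for obfuscation, as in the notebook. It should not be treated as a strong anonymization measure.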

### Process percentage

In [4]:
val usersTable = buildsTable.groupBy("username").aggregate {
    val didRun = any {
        "tasks"<String>().contains("test", ignoreCase = true)
    }
    didRun into "ranLocalTests"
}
usersTable
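The grouping logic in this cell can also be expressed with plain Kotlin collections, which may help when checking it without the dataframe library. This is a minimal sketch under stated assumptions: `Build` and `ranLocalTestsByUser` are hypothetical names standing in for a row of `buildsTable` and for the `groupBy`/`aggregate` step.

```kotlin
// Hypothetical stand-in for one row of buildsTable.
data class Build(val username: String, val tasks: String)

// Maps each username to whether any of that user's builds requested a task containing "test".
fun ranLocalTestsByUser(builds: List<Build>): Map<String, Boolean> =
    builds.groupBy { it.username }
        .mapValues { (_, userBuilds) ->
            userBuilds.any { it.tasks.contains("test", ignoreCase = true) }
        }

fun main() {
    val builds = listOf(
        Build("a", "assemble"),
        Build("a", "testDebugUnitTest"),
        Build("b", "clean assemble"),
    )
    println(ranLocalTestsByUser(builds)) // {a=true, b=false}
}
```

Note that a substring match is a rough heuristic: a task name such as `publishLatest` also contains "test" and would count as a test run.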

In [5]:
val total = usersTable.count()
// Count the developers who did NOT run tests, to match the message below
val notRanCount = usersTable.count { "ranLocalTests"<Boolean>() == false }
val percent = "%.2f".format(notRanCount / total.toDouble() * 100)
HTML("<h4>$percent% of developers don't run tests on their local machine</h4>")
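The percentage formatting has two edge cases worth a sketch: an empty table makes the division produce `NaN`, and `"%.2f".format(...)` follows the default locale, which may print a decimal comma. The helper below, `percentOf`, is a hypothetical name introduced for illustration, and returning `"n/a"` for an empty total is an assumption about the desired behavior.

```kotlin
import java.util.Locale

// Hypothetical helper: `part` as a percentage of `total` with two decimals,
// or "n/a" when there is nothing to divide by.
fun percentOf(part: Int, total: Int): String =
    if (total == 0) "n/a"
    else String.format(Locale.ROOT, "%.2f", part * 100.0 / total)

fun main() {
    println(percentOf(1, 4)) // 25.00
    println(percentOf(0, 0)) // n/a
}
```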