holgerbrandl/kdfutils

kdfutils

Misc utilities for kotlin-dataframe

An opinionated set of utilities that we loved in krangl but that are still, initially, or by design missing from kotlin-dataframe.

Example datasets

KDF (kotlin-dataframe) ships no example data for building reproducible examples, so kdfutils provides a few datasets:

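import org.jetbrains.kotlinx.dataframe.datasets.*

irisData.head()             // first rows of the iris dataset
sleepData.count()           // number of rows in the sleep dataset
flightsData.columnNames()   // column names of the flights dataset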

Bidirectional conversion between krangl and kotlin-dataframe

irisData.toKotlinDF().toKranglDF()

Note: kdfutils does not have a runtime dependency on krangl. It is up to the user to add krangl as a dependency if bidirectional conversion is needed.

Extended unfold support

KDF supports an API to unfold object columns. However, as it still lacks some convenience, kdfutils supports unfolding the properties of an object column with more control over the process: optionally keep the original column, cherry-pick which attributes to unfold, and add a prefix to the new column names.

data class City(val name: String, val code: Int)
data class Person(val name: String, val address: City)

val persons: List<Person> = listOf(
    Person("Max", City("Dresden", 12309)),
    Person("Anna", City("Berlin", 10115))
)

// let the compiler infer the (generic) data-frame type
val personsDF = persons.asDataFrame()

// unfold all properties of the object column
personsDF.unfold<City>("address")

// or selectively via property reference
personsDF.unfold<City>("address", properties = listOf(City::name), keep = true, addPrefix = true)
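
To make the three options more concrete, here is a minimal sketch of what unfolding means for a single row, written against plain Kotlin maps rather than a data frame. The helper `unfoldRow` is hypothetical and not part of kdfutils, and the `column_property` prefix format is an assumption made for illustration. It reuses the `City` class from above.

```kotlin
import kotlin.reflect.KProperty1

data class City(val name: String, val code: Int)

// Hypothetical helper (not the kdfutils API): unfolds one object-valued entry of a row.
fun <T> unfoldRow(
    row: Map<String, Any?>,
    column: String,
    properties: List<KProperty1<T, *>>,
    keep: Boolean = false,
    addPrefix: Boolean = false,
): Map<String, Any?> {
    @Suppress("UNCHECKED_CAST")
    val obj = row[column] as T

    // One new entry per selected property; the "column_" prefix format is an assumption.
    val unfolded: Map<String, Any?> = properties.associate { prop ->
        val name = if (addPrefix) "${column}_${prop.name}" else prop.name
        name to prop.get(obj)
    }

    // Optionally drop the original object column before adding the new entries.
    val base = if (keep) row else row - column
    return base + unfolded
}

fun main() {
    val row = mapOf("name" to "Max", "address" to City("Dresden", 12309))
    val result = unfoldRow(row, "address", listOf(City::name), keep = true, addPrefix = true)
    println(result.keys)
}
```

In this sketch, an unprefixed `name` taken from `City` would overwrite the person's own `name` entry, which is one reason a prefix option is useful.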

Naming Conventions & Conversions

There are many ways to name columns. To ease the transition between naming styles (camel case, snake case, ...) and to create names that comply with compiler conventions, this library provides some renaming utilities.

Typically, this works best by first renaming columns to camel case.
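
As an illustration of the kind of conversion meant here, the following sketch turns snake-case or dot-separated column names into camel case using only the Kotlin standard library. The extension `toCamelCase` is a hypothetical name chosen for this example; it is not the kdfutils API, whose actual function names may differ.

```kotlin
// Hypothetical helper (not the kdfutils API): converts names such as
// "sleep_total" or "Sepal.Length" to camel case.
fun String.toCamelCase(): String {
    // Split on any run of characters that is not a letter or digit.
    val parts = trim().split(Regex("[^A-Za-z0-9]+")).filter { it.isNotEmpty() }
    if (parts.isEmpty()) return this

    // The first part starts lowercase, every following part starts uppercase.
    val head = parts.first().replaceFirstChar { it.lowercaseChar() }
    val tail = parts.drop(1).joinToString("") { part ->
        part.replaceFirstChar { it.uppercaseChar() }
    }
    return head + tail
}

fun main() {
    println("sleep_total".toCamelCase())  // sleepTotal
    println("Sepal.Length".toCamelCase()) // sepalLength
}
```

The resulting names are valid Kotlin identifiers, so they can be used for generated accessors without backtick escaping.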

Gradle

To get started, simply add it as a dependency:

dependencies {
    implementation "com.github.holgerbrandl:kdfutils:1.4.3"
}

Note that kdfutils no longer depends on krangl.

Builds are hosted on Maven Central, supported by the great folks at Sonatype.