# Preserving Data–Statistic Bijection in Lets-Plot-Kotlin

Some statistical geometries in Lets-Plot-Kotlin (such as `geomSina()`) compute their own statistical data but still produce exactly one output point per input data point.
Previously, this one-to-one correspondence was lost in the aesthetic mapping: if you mapped an aesthetic (e.g., `color`) to a column from the original dataset, every point could receive an aggregated value instead of its own.

Now, Lets-Plot-Kotlin preserves the **bijection between data and statistics** for such geometries. This means you can safely map aesthetics to variables from the original dataset, and they will be correctly aligned with the statistical output.
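
To make the idea concrete, here is a minimal plain-Kotlin sketch of a statistic that keeps a one-to-one link to its input rows. It is an illustration only, not the Lets-Plot implementation: `Car`, `QQPoint`, `qqStat`, and `sourceIndex` are hypothetical names, and the plotting position `(rank + 0.5) / n` is an assumed stand-in for a real theoretical quantile.

```kotlin
// Hypothetical illustration of a data-statistic bijection; not Lets-Plot internals.
data class Car(val model: String, val hwy: Double, val displ: Double)

// One stat row per input row; `sourceIndex` points back to the original record.
data class QQPoint(val sourceIndex: Int, val sample: Double, val probability: Double)

fun qqStat(cars: List<Car>): List<QQPoint> {
    val n = cars.size
    return cars.withIndex()
        .sortedBy { it.value.hwy }
        .mapIndexed { rank, (index, car) ->
            // Assumed plotting position standing in for the theoretical quantile.
            QQPoint(index, car.hwy, (rank + 0.5) / n)
        }
}

fun main() {
    // Three rows modeled on the mpg sample used in this notebook.
    val cars = listOf(
        Car("a4", 29.0, 1.8),
        Car("a4", 31.0, 2.0),
        Car("a4", 26.0, 2.8)
    )
    for (p in qqStat(cars)) {
        // The source index lets each stat row recover its own `displ`,
        // so no aggregation over the original column is needed.
        val displ = cars[p.sourceIndex].displ
        println("sample=${p.sample} probability=${p.probability} displ=$displ")
    }
}
```

Because every stat row carries the index of the row it came from, any original column can be looked up per point, which is what allows an aesthetic such as `color` to be mapped to it.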

In [1]:
%useLatestDescriptors
%use dataframe
%use lets-plot

In [2]:
LetsPlot.getInfo()

Lets-Plot Kotlin API v.0.0.0-SNAPSHOT. Frontend: Notebook with dynamically loaded JS. Lets-Plot JS v.4.8.0.

In [3]:
val url = "https://raw.githubusercontent.com/JetBrains/lets-plot-docs/refs/heads/master/data/mpg.csv"
val df = DataFrame.readCSV(url)
val data = df.toMap()
println("${df.rowsCount()} x ${df.columnsCount()}")
df.head()

234 x 12


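| untitled | manufacturer | model | displ | year | cyl | trans | drv | cty | hwy | fl | class |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | audi | a4 | 1.8 | 1999 | 4 | auto(l5) | f | 18 | 29 | p | compact |
| 2 | audi | a4 | 1.8 | 1999 | 4 | manual(m5) | f | 21 | 29 | p | compact |
| 3 | audi | a4 | 2.0 | 2008 | 4 | manual(m6) | f | 20 | 31 | p | compact |
| 4 | audi | a4 | 2.0 | 2008 | 4 | auto(av) | f | 21 | 30 | p | compact |
| 5 | audi | a4 | 2.8 | 1999 | 6 | auto(l5) | f | 16 | 26 | p | compact |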


## Map Columns to Aesthetics

### Sina Stat

In [4]:
letsPlot(data) { x = "drv"; y = "hwy" } +
    geomViolin() +
    geomSina(seed = 42) {
        color = "displ"
        size = "cyl"
    } +
    scaleSize(range = 2.0 to 4.0)

### Q-Q Stat

In [5]:
letsPlot(data) +
    geomQQ {
        sample = "hwy"
        color = "displ"
        size = "cyl"
    } +
    scaleSize(range = 3.0 to 6.0)

## Show Column Values in Tooltips

For the sina and Q-Q statistics shown above, tooltips can display not only the mapped values but also any column from the original dataframe.

In [6]:
letsPlot(data) { sample = "hwy" } +
    geomQQLine(color = "teal") +
    geomQQ(
        size = 3.0,
        shape = 21,
        color = "black",
        fill = "gold",
        alpha = 0.5,
        tooltips = layerTooltips()
            .title("@manufacturer @model")
            .line("theoretical|@..theoretical..")
            .line("highway mileage (sample)|@..sample..")
            .line("city mileage|@cty")
            .line("engine displacement in liters|@displ")
            .line("year of manufacturing|@year")
            .line("number of cylinders|@cyl")
            .line("type of transmission|@trans")
            .line("drive type|@drv")
            .line("fuel type|@fl")
            .line("vehicle class|@class")
            .format("year", "d")
            .minWidth(300)
            .anchor("bottom_right")
    ) +
    ggsize(1000, 600)
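
The same approach should carry over to the sina statistic. The cell below is a sketch under two assumptions: that it runs in this notebook after the cells above (so `data` is defined), and that `geomSina()` accepts a `tooltips` argument in the same way `geomQQ()` does. The choice of tooltip lines is illustrative.

```kotlin
// Sketch: original dataframe columns in sina tooltips.
// Assumes `data` from the cells above and `%use lets-plot` in a Kotlin notebook.
letsPlot(data) { x = "drv"; y = "hwy" } +
    geomViolin() +
    geomSina(
        seed = 42,
        tooltips = layerTooltips()
            .title("@manufacturer @model")          // columns that are not mapped to any aesthetic
            .line("highway mileage|@hwy")
            .line("engine displacement in liters|@displ")
            .line("vehicle class|@class")
            .line("year of manufacturing|@year")
            .format("year", "d")                    // show the year without a decimal part
    ) {
        color = "displ"
    }
```

Each sina point corresponds to one row of the dataset, so the title and lines are expected to show that row's own values.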