## ggplot

- a library written for R
- gg stands for Grammar of Graphics: the idea is to use a somewhat different semantics for producing plots. A plot is formally split into layers and other components, which are then combined using a set of defined operations.
- the rough structure looks like this:
```
data_to_plot + axis_mapping + geometric_objects + other_objects
```
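To make the "combine components with defined operations" idea concrete, here is a minimal toy sketch in plain Python. It is not how `plotnine` or `ggplot2` is implemented; the `Plot` and `Component` classes are hypothetical names invented for this illustration. It only shows how overloading `+` lets a plot specification be assembled piece by piece.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Component:
    """One building block of a plot, e.g. an axis mapping or a geometry (toy model)."""
    kind: str
    spec: str


@dataclass(frozen=True)
class Plot:
    """A plot specification: a data source plus an ordered tuple of components (toy model)."""
    data_name: str
    components: tuple = ()

    def __add__(self, other):
        # "plot + component" returns a new plot with the component appended
        if not isinstance(other, Component):
            return NotImplemented
        return Plot(self.data_name, self.components + (other,))

    def describe(self):
        parts = [f"data={self.data_name}"]
        parts += [f"{c.kind}({c.spec})" for c in self.components]
        return " + ".join(parts)


plot = Plot("economics") + Component("aes", "x=date, y=pop") + Component("geom", "line")
print(plot.describe())
# prints: data=economics + aes(x=date, y=pop) + geom(line)
```

Each `+` returns a new object, so a partially built plot can be stored in a variable and extended later.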

In Python, the `plotnine` package replicates the behavior of the `ggplot2` library from `R`:
```
pip3 install plotnine
```

https://github.com/rstudio/cheatsheets/blob/master/data-visualization-2.1.pdf

## introductory examples

In [None]:
from plotnine.data import economics
from plotnine import *

data = economics
data

In [None]:
ggplot(data) + aes(x = "date", y = "pop") + geom_line() + scale_y_log10()

In [None]:
data_lastyear = data.loc[data["date"] > '2014-01-01']
(
    ggplot(data_lastyear)
    + aes(x = "date", y = "pop")
    + geom_point(shape = '.', size = 20, color = "red")
    + xlab("Date")
    + ylab("Population (million)")
    + labs(title = "US population growth")
    + theme_matplotlib() + theme(axis_text_x=element_text(angle=45))
    + scale_x_datetime()
    # "pop" is stored in thousands; dividing by 1000 labels the axis in millions
    + scale_y_continuous(labels = lambda l: ["%d" % (int(i)/1000) for i in l])
    # + theme_dark()
    # + theme_xkcd()
    # + theme_minimal()
    # + theme_linedraw()
)
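The `labels` argument above receives the list of tick positions and must return one string per tick. As a minimal sketch, the same conversion can be written as a named function, which is easier to test on its own. The name `format_millions` is hypothetical, and the sketch assumes the `pop` column is given in thousands of people.

```python
def format_millions(breaks):
    """Turn tick positions given in thousands into whole-million labels."""
    # "%d" truncates the fractional part, exactly like the lambda in the plot above
    return ["%d" % (int(b) / 1000) for b in breaks]


print(format_millions([318000.0, 319500.0, 321000.0]))
# prints: ['318', '319', '321']
```

It could then be passed as `scale_y_continuous(labels = format_millions)`.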

## line plot

In [None]:
import numpy as np

(
    ggplot(economics)
    + aes(x = "date", y = "pop")
    # a thick black line under a thinner pink one gives an outlined line
    + geom_line(color = "black", size = 5)
    + geom_line(color = "pink", size = 2)
    + labs(title = "US population growth", x = "Date", y = "Population (million)")
    + theme_linedraw() + theme(axis_text_x=element_text(angle=45))
    + scale_y_continuous(
        # ten evenly spaced breaks across the axis limits
        breaks = lambda b: list(np.linspace(b[0], b[1], 10)),
        # "pop" is in thousands; label it in millions
        labels = lambda l: ["%.2f" % (int(i)/1000) for i in l],
    )
)

## scatterplot

In [None]:
import numpy as np
data = {
    'a': np.arange(50),
    'c': np.random.randint(0, 50, 50),
    'd': np.random.randn(50)
}

data['b'] = data['a'] + 10 * np.random.randn(50)
data['d'] = np.abs(data['d']) * 5

p = (
    ggplot()
    + aes(x = data['a'], y = data['b'])
    + geom_point(size = data['d'], mapping = aes(color = data['c']), stroke = 0)
)
p = p + labs(color = "Color", x = "x", y = "y", title = "Scatterplot")
p += theme_xkcd()
p

In [None]:
p.save("scatterplot.png", dpi=600)

## bar plot

In [None]:
from plotnine.data import meat

data2012 = meat.loc[meat["date"] >= '2012-01-01']
# drop the datetime column first: only the numeric columns can be summed
summed = data2012.drop(columns="date").sum()
vals = summed.to_numpy()
colors = vals / vals.max()

(
    ggplot()
    + aes(summed.index, summed.to_numpy(), fill = colors)
    + geom_col(show_legend = False)
    + labs(y = "per capita consumption (kg/year)")
    + coord_flip()
    + scale_fill_gradient(low = 'black', high = 'red')
    + theme_xkcd()
)

## pie chart
probably not available in `plotnine`
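
One possible workaround, since `plotnine` draws its figures with matplotlib, is to use matplotlib's own `pie` for this one chart type. This is a sketch of a different tool, not a `plotnine` feature, and the labels and values below are illustrative placeholders rather than real data.

```python
import matplotlib.pyplot as plt

# illustrative placeholder values, not taken from any dataset
labels = ["beef", "pork", "turkey"]
shares = [40, 35, 25]

fig, ax = plt.subplots()
ax.pie(shares, labels=labels, autopct="%1.0f%%")  # one wedge per value, with percentage labels
ax.set_title("Pie chart drawn with matplotlib")
```

The figure can be written to disk with `fig.savefig("piechart.png")`, similar to `p.save(...)` for plotnine plots.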

## boxplot

In [None]:
from plotnine.data import mpg

mpg.head()

In [None]:
(
    ggplot(mpg) + aes(x = 'manufacturer', y = 'cty') + geom_boxplot() + coord_flip() + theme_matplotlib()
)

## other

In [None]:
(
    ggplot(mpg)
    + facet_grid("year ~ class")
    + aes(x="displ", y="hwy")
    + labs(
        x="Engine Size",
        y="Miles per Gallon",
        title="Miles per Gallon for Each Year and Vehicle Class",
    )
    + geom_point()
)