# Plotnine

In this tutorial, you’ll learn how to use ggplot in Python to create data visualizations using a grammar of graphics. A grammar of graphics is a high-level tool that lets you create data plots in an efficient and consistent way. It abstracts most low-level details, letting you focus on creating meaningful and beautiful visualizations for your data.

Version used in this notebook:

```text
plotnine==0.14.5
```

There are several Python packages that provide a grammar of graphics. This tutorial focuses on plotnine since it’s one of the most mature ones. plotnine is based on ggplot2 from the R programming language, so if you have a background in R, then you can consider plotnine as the equivalent of ggplot2 in Python.

Source: https://realpython.com/ggplot-python/

![xkcd](https://imgs.xkcd.com/comics/data_quality.png)

In [None]:
!pip freeze | grep -e plotnine

In [None]:
# If necessary
#!pip install plotnine==0.14.5

In [None]:
from plotnine import ggplot, ggtitle, aes, labs
from plotnine import geom_col, geom_line, geom_boxplot, geom_bar, geom_point, geom_smooth, geom_histogram
from plotnine import scale_x_timedelta, scale_shape_manual, scale_x_datetime
from plotnine import stat_bin, stat_smooth
from plotnine import facet_wrap, facet_grid
from plotnine import coord_flip
from plotnine import scale_color_hue, theme_dark

from mizani.breaks import date_breaks           # required for date_breaks

import pandas as pd
import numpy as np

In [None]:
%matplotlib inline

Load some datasets

In [None]:
from plotnine.data import economics  # a pandas DataFrame
economics.head()

In [None]:
from plotnine.data import mpg
mpg.head()

In [None]:
from plotnine.data import mtcars
mtcars.head()

In [None]:
from plotnine.data import huron
huron.head()

Basics

1. You import the economics dataset.
2. You import the ggplot() class as well as some useful functions from plotnine, aes() and geom_line().
3. You create a plot object using ggplot(), passing the economics DataFrame to the constructor.
4. You add aes() to set the variable to use for each axis, in this case date and pop.
5. You add geom_line() to specify that the chart should be drawn as a line graph.

In [None]:
(
    ggplot(economics)              # What data to use
    + aes(x="date", y="pop")       # What variable to use
    + geom_line()                  # Geometric object to use for drawing
)

In [None]:
(
    ggplot(economics)
    + aes(x="date", y="pop")
    + scale_x_datetime(
        name="Year",
        breaks=date_breaks(width='10 years'),
        date_labels="%Y"
    )
    + labs(title="Population Evolution", y="Population")
    + geom_line()
)

In [None]:
(
    ggplot(mpg)
    + aes(x="class")
    + geom_bar()
)

In [None]:
(
  ggplot(huron)
  + aes(x="factor(decade)", y="level")
  + geom_boxplot()
)

In [None]:
(
    ggplot(mpg)
    + aes(x="class", y="hwy")
    + geom_point()
)

In [None]:
(
    ggplot(mpg)
    + aes(x="cyl", y="hwy", color="class")
    + labs(
        x="Engine Cylinders",
        y="Miles per Gallon",
        color="Vehicle Class",
        title="Miles per Gallon for Engine Cylinders and Vehicle Classes",
    )
    + geom_point()
)

Aesthetics

In [None]:
(
    ggplot(mpg)
    + aes(x="class")
    + geom_bar()
    + coord_flip()
)

Scales

In [None]:
df2 = pd.DataFrame({
    'letter': ['Alpha', 'Beta', 'Delta', 'Gamma'] * 2,
    'pos': [1, 2, 3, 4] * 2,
    'num_of_letters': [5.0, 4.0, 5.0, 5.0] * 2  # Converted to float
})

df2.loc[4:, 'num_of_letters'] += 0.8

(
    ggplot(df2)
      + geom_col(aes(x='letter',y='pos', fill='letter'))
      + geom_line(aes(x='letter', y='num_of_letters', color='letter'), size=1)
      + scale_color_hue(l=0.45)                                                      # some contrast to make the lines stick out
      + ggtitle('Greek Letter Analysis')
)

Geometrics

In [None]:
# Using LaTeX math text as point shapes
mixed_shapes = (
    r'$\mathrm{A}$',
    r'$\mathrm{B}$',
    r'$\mathrm{C}$',
    r'$\mathrm{D}$',
)

(
    ggplot(mtcars, aes('wt', 'mpg', shape='factor(gear)', colour='factor(gear)'))
    + geom_point(size=6)
    + scale_shape_manual(values=mixed_shapes)
)

Smoothers

In [None]:
(
    ggplot(mpg, aes(x='displ', y='hwy', color='factor(drv)'))
    + geom_point()
    + geom_smooth(method='lm')
    + labs(x='displacement', y='highway miles per gallon')
)

In [None]:
(
    ggplot(mtcars, aes('wt', 'mpg', color='factor(gear)'))
    + geom_point()
    + stat_smooth(method='lm')
    + facet_wrap('~gear')
)

Statistics

In [None]:
(
  ggplot(huron)
    + aes(x="level")
    + stat_bin(bins=10)
    + geom_bar()
)

In [None]:
(
    ggplot(mpg, aes(x = 'displ'))
    + geom_histogram()
)

In [None]:
(
    ggplot(mpg, aes(x = 'displ'))
    + geom_histogram(binwidth=0.5)  # displ spans less than 10 units, so a width of 10 would give a single bar
)

Facets

In [None]:
#facet_grid(
#    facets="row~col",       # Formula for splitting: "var1~var2", "var1~." or ".~var2"
#    scales="fixed",         # "fixed", "free", "free_x", "free_y"
#    space="fixed",          # "fixed", "free", "free_x", "free_y"
#    shrink=True,            # Fit scales to the data in each panel
#    labeller="label_value"  # Function used to label the facets
#)

(
    ggplot(mpg)
    + facet_grid("year~class")
    + aes(x="displ", y="hwy")
    + labs(
        x="Engine Size",
        y="Miles per Gallon",
        title="Miles per Gallon for Each Year and Vehicle Class",
    )
    + geom_point()
)

Themes

In [None]:
(
    ggplot(mpg)
    + facet_grid("year~class")
    + aes(x="displ", y="hwy")
    + labs(
        x="Engine Size",
        y="Miles per Gallon",
        title="Miles per Gallon for Each Year and Vehicle Class",
    )
    + geom_point()
    + theme_dark()
)