This repository contains the Python code I wrote while taking DataCamp's Seaborn courses.
Introduction to Data Visualization with Seaborn
1. Introduction to Seaborn
2. Visualizing two Quantitative Variables
3. Visualizing a Categorical and a Quantitative Variable
4. Customizing Seaborn Plots
Intermediate Data Visualization with Seaborn
- Introduction to Seaborn
- Customizing Seaborn Plots
- Additional Plot Types
- Create Plots on Data Aware Grids
Plots with Lists
Creating a Scatter Plot
Creating a Count Plot
Plots with Pandas DataFrames
Creating a Scatter Plot
The "hue" parameter colors the points by a third variable, and "hue_order" sets the order in which that variable's values are assigned colors and listed in the legend.
The first variable is Total Bill (x axis), the second variable is Tip (y axis) and the third variable is Smoker (Y/N), shown in the graph by colors: orange or blue.
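A minimal sketch of the scatter plot described above. The rows are made-up values for illustration, and the column names total_bill, tip and smoker are assumptions about how the data is labeled.

```python
import pandas as pd
import seaborn as sns

# Made-up rows that mimic the three variables described above (not real data).
tips = pd.DataFrame({
    "total_bill": [16.99, 10.34, 21.01, 23.68, 24.59, 25.29],
    "tip": [1.01, 1.66, 3.50, 3.31, 3.61, 4.71],
    "smoker": ["No", "No", "Yes", "Yes", "No", "Yes"],
})

# hue colors the points by smoker status; hue_order fixes the order in
# which the two categories get their colors and appear in the legend.
ax = sns.scatterplot(x="total_bill", y="tip", data=tips,
                     hue="smoker", hue_order=["Yes", "No"])
```

When running this as a script, import matplotlib.pyplot as plt and call plt.show() to display the figure.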
Creating a Scatter Plot
Creating a Count Plot
We can choose the colors shown in the plot by creating a dictionary that maps each subgroup to a color and passing it to the "palette" parameter.
In this example, the dictionary maps the value "Rural" to the color green and the value "Urban" to the color blue.
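A short sketch of that palette dictionary in use. The DataFrame and its column names (school, location) are hypothetical stand-ins for the course data.

```python
import pandas as pd
import seaborn as sns

# Hypothetical survey rows; only the "Rural"/"Urban" values come from the text.
survey = pd.DataFrame({
    "school": ["GP", "GP", "MS", "MS", "GP", "MS"],
    "location": ["Rural", "Urban", "Urban", "Rural", "Urban", "Urban"],
})

# Map each subgroup value to the color it should be drawn in.
palette_colors = {"Rural": "green", "Urban": "blue"}

# One bar per school and location, colored according to the dictionary.
ax = sns.countplot(x="school", data=survey,
                   hue="location", palette=palette_colors)
```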
Relational Plots with relplot()
The Seaborn Relational Plot (relplot) allows us to visualise how variables within a dataset relate to each other.
Creating a Scatter Plot
In this example, setting the "col" parameter creates one subplot per study-time category.
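A minimal sketch of subplots split by study time. The values and the column names absences, grade and study_time are assumptions made for illustration.

```python
import pandas as pd
import seaborn as sns

# Made-up student rows (not the course dataset).
students = pd.DataFrame({
    "absences": [0, 2, 4, 6, 8, 10, 1, 3],
    "grade": [15, 14, 12, 11, 9, 8, 16, 13],
    "study_time": ["<2 hours", "2 to 5 hours", "<2 hours", "2 to 5 hours",
                   "<2 hours", "2 to 5 hours", "2 to 5 hours", "<2 hours"],
})

# col="study_time" draws one scatter plot per study-time category, side by side.
g = sns.relplot(x="absences", y="grade", data=students,
                kind="scatter", col="study_time")
```

relplot() returns a FacetGrid, so the individual subplots are available through g.axes.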
Changing Styles in Scatter Plots
Setting both "hue" and "style" gives each subgroup its own point color and its own marker shape.
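As an illustration, the sketch below passes the same column to both parameters. The data is made up, and the column names are assumptions.

```python
import pandas as pd
import seaborn as sns

# Made-up rows for illustration only.
tips = pd.DataFrame({
    "total_bill": [16.99, 10.34, 21.01, 23.68, 24.59, 25.29],
    "tip": [1.01, 1.66, 3.50, 3.31, 3.61, 4.71],
    "smoker": ["No", "No", "Yes", "Yes", "No", "Yes"],
})

# hue varies the color and style varies the marker shape per subgroup.
g = sns.relplot(x="total_bill", y="tip", data=tips, kind="scatter",
                hue="smoker", style="smoker")
```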
The "alpha" parameter sets the transparency of the points. This helps reveal overlapping points when a scatter plot contains many of them.
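A small sketch of the effect, using randomly generated points rather than real data. The value 0.4 is an arbitrary choice.

```python
import numpy as np
import pandas as pd
import seaborn as sns

# 500 synthetic, correlated points so that many of them overlap.
rng = np.random.default_rng(0)
x = rng.normal(size=500)
points = pd.DataFrame({"x": x, "y": x + rng.normal(scale=0.5, size=500)})

# alpha ranges from 0 (fully transparent) to 1 (fully opaque).
g = sns.relplot(x="x", y="y", data=points, kind="scatter", alpha=0.4)
```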
Changing Styles in Line Plots
Setting both "hue" and "style" gives each subgroup's line its own color and its own line style.
Setting "markers" to True adds a different marker style to each subgroup's line.
Count Plots and Bar Plots
- catplot() creates categorical plots, which compare groups. Count plots and bar plots are two examples. Use kind = "count" for a count plot, which shows the number of observations in each category.
- Use kind = "bar" for a bar plot, which displays the mean of a quantitative variable per category. A 95% confidence interval is drawn by default and can be changed via "ci" (replaced by "errorbar" in seaborn 0.12 and later).
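Both kinds can be sketched on a tiny made-up DataFrame. The column names study_time and grade are assumptions, and the confidence interval is left at its default because the parameter name depends on the seaborn version.

```python
import pandas as pd
import seaborn as sns

# Made-up student rows for illustration.
students = pd.DataFrame({
    "study_time": ["<2 hours", "<2 hours",
                   "2 to 5 hours", "2 to 5 hours", "2 to 5 hours",
                   ">5 hours", ">5 hours"],
    "grade": [8, 10, 11, 13, 12, 16, 14],
})

# Count plot: bar height is the number of rows in each category.
g_count = sns.catplot(x="study_time", data=students, kind="count")

# Bar plot: bar height is the mean grade in each category.
g_bar = sns.catplot(x="study_time", y="grade", data=students, kind="bar")
```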
Box Plots
A box plot shows the distribution of quantitative data. The colored box spans the 25th to the 75th percentile, and the line in the middle of the box marks the median. The whiskers give a sense of the spread of the distribution, and the floating points mark the outliers.
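To connect the picture to the numbers, the sketch below draws a box plot and computes the same quartiles with pandas. The data and the column names group and score are made up.

```python
import pandas as pd
import seaborn as sns

# Made-up scores; the value 30 in group A lies far from the rest.
scores = pd.DataFrame({
    "group": ["A"] * 7 + ["B"] * 7,
    "score": [10, 12, 13, 14, 15, 16, 30,
              8, 9, 10, 11, 12, 13, 14],
})

# One box per group: box edges at the quartiles, a line at the median.
g = sns.catplot(x="group", y="score", data=scores, kind="box")

# The quartiles that define each box, computed directly.
quartiles = scores.groupby("group")["score"].describe()[["25%", "50%", "75%"]]
```

With the default whisker length of 1.5 times the interquartile range, the value 30 in group A falls beyond the upper whisker and is drawn as a floating point.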
Point Plots
Point plots show the mean of a quantitative variable for the observations in each category, plotted as a single point. The vertical lines show the confidence interval. One of the axes holds a categorical variable.
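A minimal sketch of a point plot, with the group means computed alongside for comparison. The data and column names are made up.

```python
import pandas as pd
import seaborn as sns

# Made-up scores for two categories.
scores = pd.DataFrame({
    "group": ["A", "A", "A", "B", "B", "B"],
    "score": [10, 12, 14, 11, 15, 19],
})

# One point per category at the mean, with a vertical confidence interval.
g = sns.catplot(x="group", y="score", data=scores, kind="point")

# The means that the points represent.
means = scores.groupby("group")["score"].mean()
```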
- Style: you can customize your graph with one of the preset styles: "white", "dark", "whitegrid", "darkgrid" and "ticks".
- Palettes: you can change the colors with a diverging palette, a sequential palette or a custom palette of your own.
- Scales: you can change the scale of your plot with the function set_context(). The options, from smallest to largest, are "paper", "notebook", "talk" and "poster".
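The three customizations above can be sketched as follows. The specific choices ("whitegrid", "talk" and the two hex colors) are arbitrary examples.

```python
import seaborn as sns

# Style: background and grid of every plot drawn afterwards.
sns.set_style("whitegrid")

# Scale: "talk" enlarges fonts and lines relative to "notebook".
sns.set_context("talk")

# Palette: a named diverging palette...
sns.set_palette("RdBu")

# ...or a custom list of colors, which replaces the previous palette.
custom_palette = ["#39A7D0", "#36ADA4"]
sns.set_palette(custom_palette)
```

These settings are global, so they apply to all plots created after the calls.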