**Python for Data Visualization**

Presenter: Larissa do Carmo-Inácio (RCDM)

In [None]:
# Import libraries: Let's set up our toolbox

import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns


In [None]:
# How do we load our data? Remember to import the libraries before calling their functions

puppy_df = pd.read_csv('puppy_data_master.csv') # A relative path works here because the file I want Python to read is in the same directory as my script
puppy_df.head()

In [None]:
# If I want summary statistics for the continuous (numeric) variables of my dataset, I use the method .describe()

puppy_df.describe()
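
By default, .describe() only summarizes the numeric columns. The sketch below uses a small made-up DataFrame (a hypothetical stand-in for the puppy data, not the real file) to show what comes back, and how `include='all'` also summarizes text columns such as `specie`.

```python
import pandas as pd

# Hypothetical stand-in for the puppy data (the values are made up)
demo_df = pd.DataFrame({
    'specie': ['dog', 'cat', 'dog', 'cat'],
    'age_intake': [1.0, 2.0, 3.0, 6.0],
})

# Default: count, mean, std, min, quartiles, and max for numeric columns only
numeric_summary = demo_df.describe()

# include='all' adds count, unique, top, and freq for the text columns
full_summary = demo_df.describe(include='all')

print(numeric_summary)
print(full_summary)
```

The same two calls should work on `puppy_df`, assuming it has at least one numeric column.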

In [None]:
# Histogram: An excellent starting point

histogram = plt.hist(puppy_df['age_intake'])

In [None]:
# Scatter plot
scatter = plt.scatter(x = puppy_df['age_intake'], y = puppy_df['age_outcome']) # Since my data is

In [None]:
# Box Plot

box = puppy_df.boxplot(by = 'specie', column = "age_intake") #Did you notice any difference in the syntax here?

Now, it is time to personalize our graphs!

In [None]:
# Change the figure size

box = puppy_df.boxplot(by = 'specie', column = ["age_intake"], figsize = (20,10))

In [None]:
# Change the titles and the axis labels, and rotate the x axis labels

box = puppy_df.boxplot(by = 'specie', column = ["age_intake"], figsize = (20,10))
plt.xlabel('Species')
plt.ylabel('Age')
plt.xticks(rotation=45)
plt.title('Box Plot of Animals\' Ages (Intakes) grouped by Species')
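
One detail worth knowing: when you pass `by`, pandas adds its own figure-level title ("Boxplot grouped by ...") on top of the one set with plt.title(). A minimal sketch of one way to remove it, using a small made-up DataFrame (a hypothetical stand-in for the puppy data):

```python
import pandas as pd
import matplotlib.pyplot as plt

# Hypothetical stand-in for the puppy data (the values are made up)
demo_df = pd.DataFrame({
    'specie': ['dog', 'cat', 'dog', 'cat', 'dog', 'cat'],
    'age_intake': [1.0, 2.0, 3.0, 6.0, 4.0, 5.0],
})

demo_df.boxplot(by = 'specie', column = 'age_intake', figsize = (8, 4))

# Clear the automatic figure-level title that pandas adds when `by` is used
plt.suptitle('')

# Then set the axes title and labels as usual
plt.title('Animals\' Ages (Intakes) grouped by Species')
plt.xlabel('Species')
plt.ylabel('Age')
```

Whether to keep the automatic title is a matter of taste. Clearing it simply avoids having two titles stacked on the same figure.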

In [None]:
# Change the bin width

histogram = plt.hist(puppy_df['age_intake'], bins = 5) # The fewer the bins, the wider the bars
plt.xlabel('Age')
plt.ylabel('Frequency')
plt.title('Animals\' Ages (Intakes)')
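
To see why fewer bins means wider bars, it helps to look at what plt.hist() returns: the counts per bin, the bin edges, and the drawn bars. The sketch below uses a short made-up list of ages (not the puppy data) and computes the bin width from the edges.

```python
import matplotlib.pyplot as plt

# Made-up ages for illustration only
ages = [0, 1, 2, 3, 4, 5, 6, 7, 8, 10]

# plt.hist returns (counts per bin, bin edges, bar objects)
counts, edges, _ = plt.hist(ages, bins = 5)

# With equal-width bins, the width is (max - min) / number of bins
bin_width = edges[1] - edges[0]

print(counts)     # how many values fell into each bin
print(edges)      # 6 edges for 5 bins
print(bin_width)

plt.close()       # we only want the numbers here, not the picture
```

Try changing `bins = 5` to `bins = 10` and compare the printed width.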

In [None]:
# Change the colors - my favorite part!

histogram = plt.hist(puppy_df['age_intake'], bins = 5, color = 'purple') # You can use the names of the colors
plt.xlabel('Age')
plt.ylabel('Frequency')
plt.title('Animals\' Ages (Intakes)')
# Keep it simple. Don't expect Python to recognize 'Avocado Green no. 3'

In [None]:
# Change the colors - my favorite part!

histogram = plt.hist(puppy_df['age_intake'], bins = 5, color = '#884EA0') # You can also use the hexadecimal code of the colors
plt.xlabel('Age')
plt.ylabel('Frequency')
plt.title('Animals\' Ages (Intakes)')
# Don't forget to pass the hex code as a string (inside quotes)
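
If you are not sure whether matplotlib will accept a color, you can ask it before plotting. This is a small sketch using two helpers from `matplotlib.colors`: `is_color_like` checks whether a value is a valid color, and `to_hex` converts a color name into its hexadecimal code.

```python
from matplotlib import colors as mcolors

# Check a color name, a hex code, and a name matplotlib does not know
for candidate in ['purple', '#884EA0', 'Avocado Green no. 3']:
    print(candidate, '->', mcolors.is_color_like(candidate))

# Convert a named color into its hexadecimal code
purple_hex = mcolors.to_hex('purple')
print(purple_hex)
```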

In [None]:
# Using Seaborn to change the aesthetic 

plt.figure(figsize=(20,10)) # First, create a new figure and set its size BEFORE plotting with seaborn
box = sns.boxplot(hue = 'specie', y = 'age_intake', data = puppy_df, palette='deep') # By default, seaborn picks its colors from the current theme. If you want a specific set of colors for the different categories, pass the palette argument a predefined palette, like 'deep', 'pastel', or 'flare'
plt.xlabel('Species')
plt.ylabel('Age')
plt.title('Animals\' ages by species')
plt.savefig('boxplot.png')  # Save as PNG (add dpi=300 for a higher resolution)


In [None]:
# How to save your plots as image files

plt.figure(figsize=(20,10)) 
box = sns.boxplot(hue = 'specie', y = 'age_intake', data = puppy_df, palette='deep')
plt.ylabel('Age')
plt.title('Animals\' ages by species')
plt.savefig('boxplot.png')  # Save as PNG (add dpi=300 for a higher resolution)
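
The file extension tells savefig() which format to write, and the `dpi` argument controls the resolution of raster formats such as PNG. The sketch below saves a placeholder bar chart (made-up values) as both PNG and PDF. The file names and the temporary folder are only assumptions for the demo, so nothing is written next to your notebook.

```python
import os
import tempfile
import matplotlib.pyplot as plt

# Placeholder chart with made-up values, just to have something to save
fig, ax = plt.subplots(figsize=(4, 3))
ax.bar(['A', 'B'], [1, 2])

# Hypothetical output location: a temporary folder
out_dir = tempfile.mkdtemp()
png_path = os.path.join(out_dir, 'demo_plot.png')
pdf_path = os.path.join(out_dir, 'demo_plot.pdf')

# dpi sets the PNG resolution; bbox_inches='tight' trims extra white margins
fig.savefig(png_path, dpi = 300, bbox_inches = 'tight')

# PDF is a vector format, so it stays sharp at any zoom level
fig.savefig(pdf_path, bbox_inches = 'tight')

plt.close(fig)
print(png_path)
print(pdf_path)
```

Remember to call savefig() before closing the figure, otherwise there is nothing left to save.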


*Now, would you like to try it yourselves?*

**Exercise: Create a Bar Chart to Visualize Daily Temperatures**


*Goal:* You will create a bar chart to visualize the daily temperatures for a week. By the end of this exercise, you'll understand how to use matplotlib to plot data in a simple and clear bar chart format, and how to customize the chart with titles and labels.

Steps:

1. Import the necessary libraries.
2. Prepare the data.
3. Create the bar chart.
4. Add titles and labels.
5. Display the chart.

In [None]:
#1 Import the libraries
import matplotlib.pyplot as plt

#2 Prepare the data
days = ['Monday', 'Tuesday', 'Wednesday', 'Thursday', 'Friday', 'Saturday', 'Sunday']
temperatures = [17, 23, 24, 17, 16, 20, 16]

#3 Create the bar chart
plt.figure(figsize=(10, 5))
plt.bar(days, temperatures, color='skyblue')

#4 Add titles and labels
plt.title('Daily Temperatures')
plt.xlabel('Day')
plt.ylabel('Temperature (°C)')

#5 Display the chart
plt.show()
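
If you finish early, here is one possible extension of the exercise (a sketch, not the only solution): add a dashed line for the weekly mean temperature using plt.axhline(). It uses the same `days` and `temperatures` as above. The line style and the legend text are just my choices.

```python
import matplotlib.pyplot as plt

days = ['Monday', 'Tuesday', 'Wednesday', 'Thursday', 'Friday', 'Saturday', 'Sunday']
temperatures = [17, 23, 24, 17, 16, 20, 16]

# Average temperature of the week
mean_temp = sum(temperatures) / len(temperatures)

plt.figure(figsize=(10, 5))
plt.bar(days, temperatures, color='skyblue')

# Horizontal dashed line at the mean, with a label for the legend
plt.axhline(mean_temp, color='gray', linestyle='--', label=f'Weekly mean ({mean_temp:.1f} °C)')

plt.title('Daily Temperatures')
plt.xlabel('Day')
plt.ylabel('Temperature (°C)')
plt.legend()
```

In a notebook, the figure appears when the cell finishes running. In a script, add plt.show() at the end.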
