# Data Visualization with Python
Large volumes of data are generated every day, and spotting trends and patterns in that data can be difficult while it is still in its raw format. Data visualization addresses this by providing an organized, pictorial representation of the data that is easier to understand, observe, and analyze. In this tutorial, we will discuss how to visualize data using Python.

Python provides various libraries for visualizing data, each with its own features and supported graph types. In this tutorial, we will discuss two such libraries.

- Matplotlib
- Seaborn

We will discuss these libraries one by one and plot some of the most commonly used graphs.

## Dataset Used

In [7]:
!pip install pandas matplotlib

import pandas as pd
import matplotlib.pyplot as plt

# reading the dataset
tips = pd.read_csv("tips.csv")

# printing the top 10 rows
display(tips.head(10))
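The cells that follow read the `total_bill`, `tip`, `smoker`, `day`, and `time` columns, so a quick check right after loading can catch a mismatched file early. The sketch below is only illustrative: the helper name `missing_columns` and the two-row sample frame are made up for this example and are not part of `tips.csv`.

```python
import pandas as pd

# Columns that the later cells in this tutorial read from `tips`
REQUIRED_COLUMNS = ["total_bill", "tip", "smoker", "day", "time"]


def missing_columns(df, required=REQUIRED_COLUMNS):
    """Return the required column names that are absent from df (hypothetical helper)."""
    return [col for col in required if col not in df.columns]


# Made-up stand-in for tips.csv, deliberately lacking the 'time' column
sample = pd.DataFrame({
    "total_bill": [16.99, 10.34],
    "tip": [1.01, 1.66],
    "smoker": ["No", "Yes"],
    "day": ["Sun", "Sat"],
})

print(missing_columns(sample))  # ['time']
```

With the real data, `missing_columns(tips)` should return an empty list if the file has all the columns used in this tutorial.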

### Basic Filtering with pandas

### **Example 1**: Filter the data for smokers vs. non-smokers

In [None]:
smoker_tips = tips[tips['smoker'] == 'Yes']
non_smoker_tips = tips[tips['smoker'] == 'No']

print("\nNumber of tips for smokers:", len(smoker_tips))
print("Number of tips for non-smokers:", len(non_smoker_tips))
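To make the filtering step concrete, here is a minimal sketch on a made-up five-row frame (the values are invented, not taken from `tips.csv`). It shows that a comparison such as `tips['smoker'] == 'Yes'` produces a boolean Series, and that `value_counts()` gives the same group sizes as filtering and calling `len()`.

```python
import pandas as pd

# Invented sample rows that mimic the 'smoker' and 'tip' columns
sample = pd.DataFrame({
    "smoker": ["Yes", "No", "No", "Yes", "No"],
    "tip": [3.0, 2.0, 1.5, 4.0, 2.5],
})

# The comparison yields one True/False value per row
mask = sample["smoker"] == "Yes"

# Indexing with the mask keeps only the rows where it is True
smokers = sample[mask]
# ~mask inverts the mask and selects every remaining row
non_smokers = sample[~mask]

print(len(smokers), len(non_smokers))  # 2 3

# value_counts() reports the same group sizes in one call
counts = sample["smoker"].value_counts()
print(counts["Yes"], counts["No"])  # 2 3
```

Note that `~mask` matches the `== 'No'` filter only when the column holds exactly those two values, which is assumed here.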

### **Example 2**: Group by day and smoker status to calculate average tip

In [None]:
# Example 2: Group by day and smoker status to calculate average tip
avg_tip_by_day = tips.groupby(['day', 'smoker'])['tip'].mean().unstack()
print("\nAverage tip by day and smoker status:")
print(avg_tip_by_day)
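The `groupby(...).mean().unstack()` chain does two things in sequence, which the following sketch separates. The five rows are invented for illustration, so the numbers say nothing about the real dataset.

```python
import pandas as pd

# Invented rows with the same column names as the tips data
sample = pd.DataFrame({
    "day": ["Sat", "Sat", "Sun", "Sun", "Sun"],
    "smoker": ["Yes", "No", "Yes", "No", "No"],
    "tip": [2.0, 3.0, 4.0, 1.0, 3.0],
})

# Step 1: mean tip per (day, smoker) pair, indexed by a two-level MultiIndex
grouped = sample.groupby(["day", "smoker"])["tip"].mean()

# Step 2: unstack() moves the inner index level ('smoker') into columns,
# giving one row per day and one column per smoker status
wide = grouped.unstack()

print(wide.loc["Sun", "No"])   # 2.0
print(wide.loc["Sat", "Yes"])  # 2.0
print(list(wide.columns))      # ['No', 'Yes']
```

The wide layout is what lets a bar plot draw one group of bars per day with one bar per smoker status.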


In [None]:
# ---------------------------
# Visualizing the Results
# ---------------------------

# Visualization 1: Bar Plot of Average Tip by Day for Smokers vs. Non-Smokers
# Pass figsize to DataFrame.plot; a separate plt.figure() call would leave an empty extra figure
avg_tip_by_day.plot(kind='bar', figsize=(8, 6))
plt.xlabel('Day of the Week')
plt.ylabel('Average Tip')
plt.title('Cole M+ Average Tip by Day and Smoker Status')
plt.legend(title='Smoker')
plt.grid(axis='y', linestyle='--', alpha=0.7)
plt.show()

### Simple scatter plot

In [None]:
# Scatter plot with day against tip
plt.scatter(tips['day'], tips['tip'])
 
# Adding Title to the Plot
plt.title("Cole M+ Scatter Plot")
 
# Setting the X and Y labels
plt.xlabel('Day')
plt.ylabel('Tip')
 
plt.show()
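When string labels such as day names are passed to `plt.scatter`, Matplotlib places the categories on the axis in the order they first appear in the data, which may not be weekday order. One possible way to control this, sketched below on invented rows, is to sort the frame by an ordered categorical before plotting. The `day_order` list is an assumption about which labels the file contains; adjust it to match your data.

```python
import pandas as pd

# Invented rows; only the column names mirror the tips data
sample = pd.DataFrame({
    "day": ["Sun", "Thur", "Sat", "Fri", "Sun"],
    "tip": [3.0, 2.0, 4.0, 1.5, 2.5],
})

# Assumed weekday order (hypothetical; check the labels in your own file)
day_order = ["Thur", "Fri", "Sat", "Sun"]

# An ordered categorical sorts by the given order rather than alphabetically
sample["day"] = pd.Categorical(sample["day"], categories=day_order, ordered=True)
ordered = sample.sort_values("day")

print(ordered["day"].astype(str).tolist())
# ['Thur', 'Fri', 'Sat', 'Sun', 'Sun']
```

Passing `ordered['day'].astype(str)` and `ordered['tip']` to `plt.scatter` would then lay out the x-axis in that order.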

### **Example 3**: Filter the data for a specific day (e.g., "Sun")

In [None]:
# Example 3: Filter the data for a specific day (e.g., "Sun")
sunday_tips = tips[tips['day'] == 'Sun']
print("\nNumber of tips on Sunday:", len(sunday_tips))

# Visualization 2: Scatter Plot for Total Bill vs. Tip on Sunday
plt.figure(figsize=(8,6))
plt.scatter(sunday_tips['total_bill'], sunday_tips['tip'], color='purple', alpha=0.7)
plt.xlabel('Total Bill')
plt.ylabel('Tip')
plt.title('Cole M+ Total Bill vs Tip on Sunday')
plt.grid(True, linestyle='--', alpha=0.7)
plt.show()



### **Example 4**: If the 'time' column exists, filter for Lunch time and visualize

In [None]:
# Example 4: Filter for Lunch time if the 'time' column exists
if 'time' in tips.columns:
    lunch_tips = tips[tips['time'] == 'Lunch']
    print("\nNumber of tips during Lunch time:", len(lunch_tips))
    
    # Visualization 3: Histogram of Tips during Lunch
    plt.figure(figsize=(8,6))
    plt.hist(lunch_tips['tip'], bins=15, color='green', edgecolor='black', alpha=0.7)
    plt.xlabel('Tip Amount')
    plt.ylabel('Frequency')
    plt.title('Distribution of Tips during Lunch Time')
    plt.grid(True, linestyle='--', alpha=0.7)
    plt.show()
else:
    print("\nThe 'time' column is not present in the dataset.")