In Matplotlib, a box-and-whisker chart (or box plot) is created with the plt.boxplot() function. A box plot summarizes the distribution of a dataset by showing its median, interquartile range (IQR), overall spread, and potential outliers. Box plots are especially helpful for comparing distributions across multiple categories.

Key Elements of a Box Plot
A Matplotlib box plot is built around a five-number summary, with outliers drawn separately:

Minimum (lower whisker): The smallest data point that lies no more than 1.5 times the IQR below the first quartile.
First Quartile (Q1): The 25th percentile of the data.
Median (Q2): The 50th percentile of the data, drawn as a line inside the box.
Third Quartile (Q3): The 75th percentile of the data. The box spans Q1 to Q3, so its length is the IQR (Q3 - Q1).
Maximum (upper whisker): The largest data point that lies no more than 1.5 times the IQR above the third quartile.
Outliers: Data points beyond the whiskers (more than 1.5 times the IQR below Q1 or above Q3) are treated as outliers and drawn as individual points.
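
To make the elements above concrete, here is a minimal sketch that computes them by hand with NumPy. It assumes NumPy's default linear interpolation for percentiles, and the final value (150) is a hypothetical point added only so that the example contains an outlier.

```python
import numpy as np

# Sample values; the final 150 is a hypothetical point added to produce an outlier
values = np.array([75, 85, 90, 92, 95, 97, 100, 101, 103, 105,
                   107, 108, 109, 110, 115, 150])

# Quartiles, using NumPy's default linear interpolation between data points
q1, median, q3 = np.percentile(values, [25, 50, 75])
iqr = q3 - q1

# Fences sit 1.5 * IQR beyond the quartiles
lower_fence = q1 - 1.5 * iqr
upper_fence = q3 + 1.5 * iqr

# Whiskers end at the most extreme data points that are still inside the fences
inside = values[(values >= lower_fence) & (values <= upper_fence)]
whisker_low, whisker_high = inside.min(), inside.max()

# Anything outside the fences would be drawn as an individual outlier point
outliers = values[(values < lower_fence) | (values > upper_fence)]

print(f"Q1={q1}, median={median}, Q3={q3}, IQR={iqr}")
print(f"whiskers: {whisker_low} to {whisker_high}")
print(f"outliers: {outliers}")
```

Note that the whiskers stop at actual data points, not at the fences themselves, which is why the whisker ends and the fence values usually differ.
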
Creating a Basic Box Plot in Matplotlib
Here’s how to create a simple box plot with Matplotlib using sample data.

import matplotlib.pyplot as plt

# Sample data for demonstration
data = [75, 85, 90, 92, 95, 97, 100, 101, 103, 105, 107, 108, 109, 110, 115]

# Creating the box plot
plt.boxplot(data)

# Adding labels
plt.title("Box Plot of Sample Data")
plt.ylabel("Values")

# Display the plot
plt.show()
Example: Comparing Multiple Datasets
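You can compare multiple datasets by passing a list of lists, where each inner list is one dataset. The labels argument names each box on the x-axis. Matplotlib 3.9 renamed this argument to tick_labels, so on newer versions labels triggers a deprecation warning.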

import matplotlib.pyplot as plt

# Sample datasets
data = [
    [75, 80, 85, 90, 95, 100, 105],
    [70, 78, 82, 88, 92, 96, 99, 104],
    [65, 68, 72, 80, 85, 88, 94, 99, 105]
]

# Create box plot for multiple datasets
plt.boxplot(data, labels=["Dataset 1", "Dataset 2", "Dataset 3"])

# Adding labels and title
plt.title("Comparison of Multiple Datasets")
plt.xlabel("Dataset")
plt.ylabel("Values")

# Display the plot
plt.show()
Customizing the Box Plot
Matplotlib allows for extensive customization of box plots, making them more visually informative and tailored to specific needs.

Changing Colors of the Boxes
# Create a box plot and customize colors
box = plt.boxplot(data, labels=["Dataset 1", "Dataset 2", "Dataset 3"], patch_artist=True)

# Set different colors for each box
colors = ['lightblue', 'lightgreen', 'lightcoral']
for patch, color in zip(box['boxes'], colors):
    patch.set_facecolor(color)

plt.title("Customized Box Plot with Colors")
plt.xlabel("Dataset")
plt.ylabel("Values")
plt.show()
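
The dictionary returned by plt.boxplot() holds more than the boxes. As a rough sketch, the example below also restyles the medians, whiskers, and caps through the "medians", "whiskers", and "caps" keys of that dictionary. The colors and line styles are arbitrary choices for illustration, and the tick labels are set with plt.xticks(), which assumes the default box positions 1, 2, 3.

```python
import matplotlib.pyplot as plt

# Same sample datasets as in the comparison example
data = [
    [75, 80, 85, 90, 95, 100, 105],
    [70, 78, 82, 88, 92, 96, 99, 104],
    [65, 68, 72, 80, 85, 88, 94, 99, 105],
]

box = plt.boxplot(data, patch_artist=True)

# Boxes: one patch per dataset (because patch_artist=True)
for patch in box["boxes"]:
    patch.set_facecolor("lightblue")

# Medians: one line per dataset
for line in box["medians"]:
    line.set_color("black")
    line.set_linewidth(2)

# Whiskers and caps: two lines per dataset (lower and upper)
for line in box["whiskers"]:
    line.set_linestyle("--")
for line in box["caps"]:
    line.set_color("gray")

# Label the boxes; boxplot places them at x = 1, 2, 3 by default
plt.xticks([1, 2, 3], ["Dataset 1", "Dataset 2", "Dataset 3"])
plt.title("Box Plot with Styled Medians and Whiskers")
plt.ylabel("Values")
plt.show()
```
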
Displaying Horizontal Box Plots
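To draw the boxes horizontally, pass vert=False. The values then run along the x-axis, so the axis labels swap as well. Matplotlib 3.10 and later also accept orientation="horizontal", which is intended to replace vert.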

plt.boxplot(data, labels=["Dataset 1", "Dataset 2", "Dataset 3"], vert=False, patch_artist=True)

plt.title("Horizontal Box Plot")
plt.xlabel("Values")
plt.ylabel("Dataset")
plt.show()
Box Plot Parameters
Some useful parameters in plt.boxplot() include:

vert: Whether the boxes are drawn vertically (True, the default) or horizontally (False).
patch_artist: If True, the boxes are drawn as filled patches whose face color can be set; if False (the default), they are drawn as plain lines.
notch: If True, draws a notch around the median line that marks a confidence interval for the median.
widths: The width of the boxes, given as a single number or as a sequence with one value per box.
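
As a small illustration of the widths parameter, the sketch below gives each box its own width. The width values are arbitrary, and the tick labels are set with plt.xticks(), which assumes the default box positions 1, 2, 3.

```python
import matplotlib.pyplot as plt

# Same sample datasets as in the comparison example
data = [
    [75, 80, 85, 90, 95, 100, 105],
    [70, 78, 82, 88, 92, 96, 99, 104],
    [65, 68, 72, 80, 85, 88, 94, 99, 105],
]

# One width per dataset (illustrative values)
widths = [0.3, 0.5, 0.7]
box = plt.boxplot(data, widths=widths)

# Label the boxes; boxplot places them at x = 1, 2, 3 by default
plt.xticks([1, 2, 3], ["Dataset 1", "Dataset 2", "Dataset 3"])
plt.title("Box Plot with Per-Box Widths")
plt.ylabel("Values")
plt.show()
```

Passing a single number, such as widths=0.3, applies the same width to every box.
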
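Additional Customization: Notched Box Plots
A notched box plot narrows the box around the median. The notch marks an approximate confidence interval for the median, so when the notches of two boxes do not overlap, their medians are likely to differ. With small samples, such as the datasets used here, the notch can extend past the quartiles and make the box look folded back on itself.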

plt.boxplot(data, labels=["Dataset 1", "Dataset 2", "Dataset 3"], patch_artist=True, notch=True)

plt.title("Notched Box Plot")
plt.xlabel("Dataset")
plt.ylabel("Values")
plt.show()
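
To see where the notch limits come from, the sketch below computes them by hand for Dataset 1 and compares them with the values from matplotlib.cbook.boxplot_stats(). It assumes Matplotlib's default (non-bootstrapped) notch, which to my understanding is median ± 1.57 × IQR / √n. Treat the formula as an approximation of the confidence interval, not an exact one.

```python
import math

import numpy as np
from matplotlib import cbook

dataset_1 = [75, 80, 85, 90, 95, 100, 105]

# Quartiles and IQR, using NumPy's default linear interpolation
q1, median, q3 = np.percentile(dataset_1, [25, 50, 75])
iqr = q3 - q1

# Assumed default notch formula: median +/- 1.57 * IQR / sqrt(n)
half_width = 1.57 * iqr / math.sqrt(len(dataset_1))
notch_low = median - half_width
notch_high = median + half_width

# Statistics Matplotlib computes for the same data; "cilo" and "cihi" are the notch limits
stats = cbook.boxplot_stats(dataset_1)[0]

print(f"by hand:    {notch_low:.2f} to {notch_high:.2f}")
print(f"Matplotlib: {stats['cilo']:.2f} to {stats['cihi']:.2f}")
print(f"box (Q1 to Q3): {q1} to {q3}")
```

For this dataset the notch is wider than the box itself, which is why the notched plot of Dataset 1 looks folded at the ends.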