Skip to content

Latest commit

 

History

History
232 lines (203 loc) · 5.76 KB

cheatsheet.md

File metadata and controls

232 lines (203 loc) · 5.76 KB

Here is a refined and comprehensive Matplotlib cheat sheet focusing only on Matplotlib, NumPy, and Pandas, while paying close attention to the specific examples and comments from your notebooks:


Matplotlib Visualization Cheat Sheet

What is Matplotlib?

Matplotlib is a comprehensive library for creating static, animated, and interactive visualizations in Python. It integrates seamlessly with NumPy and Pandas for powerful data visualization.

Relevance in Machine Learning

  • Exploratory Data Analysis (EDA): Understand data distributions and relationships.
  • Feature Engineering: Visualize feature transformations.
  • Model Interpretation: Plot feature importance and performance metrics.

1. Essential Setup

import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

2. Line Plots

Basic Line Plot

x = [1, 2, 3]
y = [2, 4, 6]
plt.plot(x, y, label='Linear Growth')
plt.title('Basic Line Plot')
plt.xlabel('X-axis')
plt.ylabel('Y-axis')
plt.legend()
plt.show()

Customized Line Plot

x = np.arange(0, 10, 1)
y = x ** 2
plt.plot(x, y, label='y = x^2', color='orange', linestyle='--', marker='o')
plt.title('Customized Line Plot')
plt.xlabel('X values')
plt.ylabel('y values')
plt.legend()
plt.savefig('line_plot.png', dpi=300)
plt.show()

Key Comments from Notebooks:

  • Always label your axes and title your graph.
  • Use markers to highlight data points.

3. Bar Charts

Basic Bar Chart

categories = ['A', 'B', 'C']
values = [5, 7, 3]
plt.bar(categories, values, color='blue')
plt.title('Basic Bar Chart')
plt.xlabel('Categories')
plt.ylabel('Values')
plt.tight_layout()
plt.savefig('bar_chart.png', dpi=300)
plt.show()

Grouped Bar Chart

categories = ['A', 'B', 'C']
values1 = [5, 7, 3]
values2 = [6, 8, 4]
width = 0.4
x = np.arange(len(categories))

plt.bar(x - width/2, values1, width=width, label='Group 1')
plt.bar(x + width/2, values2, width=width, label='Group 2')
plt.xticks(x, categories)
plt.title('Grouped Bar Chart')
plt.legend()
plt.tight_layout()
plt.savefig('grouped_bar_chart.png', dpi=300)
plt.show()

Key Comments from Notebooks:

  • Ensure bar widths and labels are adjusted for readability.
  • Save graphs with a high resolution.

4. Scatter Plots

Basic Scatter Plot

x = np.random.random(50) * 100
y = np.random.random(50) * 100
plt.scatter(x, y, color='orange', alpha=0.6, marker='x')
plt.title('Scatter Plot')
plt.xlabel('X-axis')
plt.ylabel('Y-axis')
plt.tight_layout()
plt.savefig('scatter_plot.png', dpi=300)
plt.show()

Key Comments from Notebooks:

  • Use alpha to adjust transparency for overlapping points.
  • Scatter plots are ideal for showing relationships between two variables.

5. Histograms

Customized Histogram

data = np.random.normal(0, 1, 1000)
bins = np.arange(-3, 3.5, 0.5)
plt.hist(data, bins=bins, color='purple', edgecolor='black')
plt.title('Data Distribution')
plt.xlabel('Value')
plt.ylabel('Frequency')
plt.xticks(bins)
plt.tight_layout()
plt.savefig('histogram.png', dpi=300)
plt.show()

Key Comments from Notebooks:

  • Customize bin sizes to highlight key patterns in the data.
  • Edge colors enhance clarity.

6. Pie Charts

Basic Pie Chart

sizes = [15, 30, 45, 10]
labels = ['A', 'B', 'C', 'D']
plt.pie(sizes, labels=labels, autopct='%1.1f%%', startangle=140)
plt.title('Basic Pie Chart')
plt.tight_layout()
plt.savefig('pie_chart.png', dpi=300)
plt.show()

Exploded Pie Chart

sizes = [10, 20, 30, 40]
labels = ['Group 1', 'Group 2', 'Group 3', 'Group 4']
explode = (0, 0.1, 0, 0.1)  # Offset Group 2 and 4
plt.pie(sizes, labels=labels, autopct='%1.1f%%', explode=explode)
plt.title('Exploded Pie Chart')
plt.tight_layout()
plt.savefig('exploded_pie_chart.png', dpi=300)
plt.show()

Key Comments from Notebooks:

  • Use autopct to display percentages directly on the chart.
  • Explode slices to emphasize specific categories.

7. Box Plots

Comparing Distributions

group1 = np.random.normal(0, 1, 100)
group2 = np.random.normal(1, 1.5, 100)
plt.boxplot([group1, group2], labels=['Group 1', 'Group 2'])
plt.title('Box Plot Comparison')
plt.ylabel('Values')
plt.tight_layout()
plt.savefig('box_plot.png', dpi=300)
plt.show()

Key Comments from Notebooks:

  • Box plots are great for visualizing medians, quartiles, and outliers.

8. 3D Visualizations

3D Scatter Plot

from mpl_toolkits.mplot3d import Axes3D

fig = plt.figure()
ax = fig.add_subplot(111, projection='3d')
x = np.random.rand(50)
y = np.random.rand(50)
z = np.random.rand(50)
ax.scatter(x, y, z, c='green', marker='o')
ax.set_title('3D Scatter Plot')
ax.set_xlabel('X-axis')
ax.set_ylabel('Y-axis')
ax.set_zlabel('Z-axis')
plt.savefig('3d_scatter_plot.png', dpi=300)
plt.show()

3D Surface Plot

x = np.linspace(-5, 5, 50)
y = np.linspace(-5, 5, 50)
X, Y = np.meshgrid(x, y)
Z = np.sin(np.sqrt(X**2 + Y**2))

fig = plt.figure()
ax = fig.add_subplot(111, projection='3d')
ax.plot_surface(X, Y, Z, cmap='viridis')
ax.set_title('3D Surface Plot')
plt.savefig('3d_surface_plot.png', dpi=300)
plt.show()

Key Comments from Notebooks:

  • Use meshgrid for structured grids in 3D plots.
  • Adjust cmap for better visual contrast.

9. Additional Techniques

  1. Annotation:
plt.annotate('Peak', xy=(2, 6), xytext=(3, 8),
             arrowprops=dict(facecolor='black', arrowstyle='->'))
  1. Save High-Quality Visuals:
    • Use dpi=300 for professional-quality plots.
  2. Iterate Over Data:
    • Loop through datasets for automated visualizations:
    for col in df.columns[1:]:
        plt.plot(df['Year'], df[col], label=col)