Exploratory Data Analysis (EDA) uses summary statistics and visualization to understand the patterns, relationships, and distributions in a dataset. Here are some commonly used data visualization techniques for EDA:

Histograms: Visualize the distribution of a single variable by dividing the data into bins and displaying the frequency or count of observations in each bin.

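Box plots: Summarize the distribution of a variable using the median, the quartiles, and points flagged as outliers (in Matplotlib's default settings, points more than 1.5 times the interquartile range beyond the box). They are useful for comparing the distribution of multiple variables or groups.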

Scatter plots: Plot points to visualize the relationship between two continuous variables. They can indicate patterns, trends, or correlations.

Line plots: Display the relationship between two variables over time or any continuous sequence. They are particularly useful for time series analysis.

Bar plots: Show the distribution or comparison of categorical variables. They can represent counts or proportions for each category.

Heatmaps: Visualize the correlation or relationship between multiple variables using a grid of colored cells. They help identify patterns and relationships among variables.

Pair plots: Display pairwise relationships between multiple variables in a grid of scatter plots. They are useful for quickly identifying patterns and correlations.

Violin plots: Combine a box plot and kernel density plot to display the distribution of a variable across different categories or groups.

Pie charts: Represent the proportion or percentage of different categories within a variable. They are useful for displaying relative frequencies.

Area plots: Display the cumulative contribution of different variables or categories over a continuous sequence.

# Line Plot

In [None]:
import matplotlib.pyplot as plt

# Create data
x = [1, 2, 3, 4, 5]
y = [2, 4, 6, 8, 10]

# Create a line plot
plt.plot(x, y)
plt.title("Line Plot")
plt.xlabel("X-axis")
plt.ylabel("Y-axis")
plt.show()

# Scatter Plot

In [None]:
import matplotlib.pyplot as plt

# Create data
x = [1, 2, 3, 4, 5]
y = [2, 4, 6, 8, 10]

# Create a scatter plot
plt.scatter(x, y)
plt.title("Scatter Plot")
plt.xlabel("X-axis")
plt.ylabel("Y-axis")
plt.show()
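
# Pair Plot

The overview describes pair plots as a grid of scatter plots showing pairwise relationships. A minimal sketch using only Matplotlib and NumPy is shown here; the three variables `a`, `b`, and `c` are made up for illustration, and placing histograms on the diagonal is one common convention rather than a requirement.

```python
import matplotlib.pyplot as plt
import numpy as np

# Create data: three illustrative variables (names are hypothetical)
rng = np.random.default_rng(0)
a = rng.normal(size=200)
b = 2 * a + rng.normal(size=200)  # constructed to correlate with a
c = rng.normal(size=200)          # generated independently of a and b
columns = {"a": a, "b": b, "c": c}
names = list(columns)
n = len(names)

# Create the grid: histograms on the diagonal, scatter plots elsewhere
fig, axes = plt.subplots(n, n, figsize=(7, 7))
for i, row_name in enumerate(names):
    for j, col_name in enumerate(names):
        ax = axes[i, j]
        if i == j:
            ax.hist(columns[row_name], bins=20)
        else:
            ax.scatter(columns[col_name], columns[row_name], s=5)
        # Label only the outer edge of the grid to reduce clutter
        if i == n - 1:
            ax.set_xlabel(col_name)
        if j == 0:
            ax.set_ylabel(row_name)
fig.suptitle("Pair Plot")
plt.show()
```

Libraries such as seaborn and pandas provide ready-made pair plot helpers; the explicit loop is used here only to keep the example within the imports the other cells already use.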


# Bar Plot

In [None]:
import matplotlib.pyplot as plt

# Create data
categories = ['A', 'B', 'C', 'D']
values = [10, 7, 5, 8]

# Create a bar plot
plt.bar(categories, values)
plt.title("Bar Plot")
plt.xlabel("Categories")
plt.ylabel("Values")
plt.show()
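
# Bar Plot of Proportions

The overview notes that bar plots can represent counts or proportions for each category. The sketch below starts from raw categorical observations (an invented list, used only for illustration), counts them with `collections.Counter`, and plots each category's share of the total.

```python
from collections import Counter

import matplotlib.pyplot as plt

# Create data: raw categorical observations (illustrative values)
observations = ['A', 'B', 'A', 'C', 'A', 'B', 'D', 'A', 'C', 'B']

# Count each category, then convert counts to proportions
counts = Counter(observations)
categories = sorted(counts)
total = sum(counts.values())
proportions = [counts[c] / total for c in categories]

# Create a bar plot of proportions
plt.bar(categories, proportions)
plt.title("Bar Plot of Proportions")
plt.xlabel("Categories")
plt.ylabel("Proportion")
plt.show()
```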


# Histogram

In [None]:
import matplotlib.pyplot as plt
import numpy as np

# Create data: 1000 samples from a standard normal distribution
data = np.random.randn(1000)

# Create a histogram
plt.hist(data, bins=30)
plt.title("Histogram")
plt.xlabel("Values")
plt.ylabel("Frequency")
plt.show()
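
# Histogram Bin Counts

To make the binning step concrete, the counts behind a histogram can be computed directly with `np.histogram` and then drawn as bars. This is a sketch under the same assumptions as the cell above (30 equal-width bins over standard normal samples); the fixed seed is added only so the result is reproducible.

```python
import matplotlib.pyplot as plt
import numpy as np

# Create data: 1000 samples from a standard normal distribution
rng = np.random.default_rng(0)
data = rng.standard_normal(1000)

# Compute the count in each bin and the bin edges
counts, edges = np.histogram(data, bins=30)
print(counts.sum())  # 1000: every observation falls in exactly one bin
print(len(edges))    # 31: 30 bins are delimited by 31 edges

# Draw the counts as bars that span each bin
plt.bar(edges[:-1], counts, width=np.diff(edges), align='edge')
plt.title("Histogram from Bin Counts")
plt.xlabel("Values")
plt.ylabel("Frequency")
plt.show()
```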


# Box Plot

In [None]:
import matplotlib.pyplot as plt
import numpy as np

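# Create data: three samples of 100 points with standard deviations 1, 2, and 3
data = [np.random.normal(0, std, 100) for std in range(1, 4)]

# Create a box plot; boxes are drawn at positions 1, 2, 3 by default.
# Labels are set with xticks because boxplot's `labels` argument is deprecated in Matplotlib 3.9+.
plt.boxplot(data)
plt.xticks([1, 2, 3], ['A', 'B', 'C'])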
plt.title("Box Plot")
plt.xlabel("Categories")
plt.ylabel("Values")
plt.show()
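
# Violin Plot

The overview describes violin plots as a combination of a box plot and a kernel density plot. A minimal sketch with Matplotlib's `plt.violinplot` follows, reusing the same kind of data as the box plot cell; the seed and the choice to show medians are assumptions made for this example.

```python
import matplotlib.pyplot as plt
import numpy as np

# Create data: three groups with standard deviations 1, 2, and 3
rng = np.random.default_rng(0)
data = [rng.normal(0, std, 100) for std in range(1, 4)]

# Create a violin plot; showmedians adds a median marker to each violin
parts = plt.violinplot(data, showmedians=True)
plt.xticks([1, 2, 3], ['A', 'B', 'C'])
plt.title("Violin Plot")
plt.xlabel("Categories")
plt.ylabel("Values")
plt.show()
```

The width of each violin at a given value reflects the estimated density there, so the wider spread of groups B and C should appear as taller, thinner shapes.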


# Pie Chart

In [None]:
import matplotlib.pyplot as plt

# Create data
labels = ['A', 'B', 'C', 'D']
sizes = [15, 30, 45, 10]

# Create a pie chart
plt.pie(sizes, labels=labels, autopct='%1.1f%%', shadow=True)
plt.title("Pie Chart")
plt.show()


# Heatmap

In [None]:
import matplotlib.pyplot as plt
import numpy as np

# Create data: a 5x5 grid of uniform random values in [0, 1)
data = np.random.rand(5, 5)

# Create a heatmap
plt.imshow(data, cmap='hot', interpolation='nearest')
plt.title("Heatmap")
plt.colorbar()
plt.show()
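
# Correlation Heatmap

The overview mentions heatmaps as a way to visualize correlation between multiple variables, while the cell above colors a grid of random values. A sketch of the correlation case is given here, assuming four made-up variables (`v1` to `v4`) and Pearson correlation computed with `np.corrcoef`.

```python
import matplotlib.pyplot as plt
import numpy as np

# Create data: four illustrative variables as columns (names are hypothetical)
rng = np.random.default_rng(0)
x = rng.normal(size=200)
data = np.column_stack([
    x,                             # v1
    x + rng.normal(size=200),      # v2: constructed to correlate positively with v1
    -x + rng.normal(size=200),     # v3: constructed to correlate negatively with v1
    rng.normal(size=200),          # v4: generated independently
])
names = ['v1', 'v2', 'v3', 'v4']

# Compute the correlation matrix (rowvar=False treats columns as variables)
corr = np.corrcoef(data, rowvar=False)

# Create a heatmap with a diverging colormap centered on zero
plt.imshow(corr, cmap='coolwarm', vmin=-1, vmax=1)
plt.xticks(range(len(names)), names)
plt.yticks(range(len(names)), names)
plt.colorbar(label="Pearson correlation")
plt.title("Correlation Heatmap")
plt.show()
```

Fixing `vmin=-1` and `vmax=1` keeps the color scale comparable across datasets, since correlations always lie in that range.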


# Area Plot

In [None]:
import matplotlib.pyplot as plt
import numpy as np

# Create data
x = np.linspace(0, 10, 100)
y1 = np.sin(x)
y2 = np.cos(x)

# Create an area plot by shading the region between the two curves
plt.fill_between(x, y1, y2, color='blue', alpha=0.3)
plt.title("Area Plot")
plt.xlabel("X-axis")
plt.ylabel("Y-axis")
plt.show()
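
# Stacked Area Plot

The overview describes area plots as showing the cumulative contribution of different categories over a continuous sequence. One way to sketch that is `plt.stackplot`, which stacks each series on top of the previous one; the three series below are invented values used only for illustration.

```python
import matplotlib.pyplot as plt
import numpy as np

# Create data: three illustrative series over the same sequence
x = np.arange(1, 6)
y1 = [1, 2, 3, 4, 5]
y2 = [2, 2, 2, 3, 3]
y3 = [1, 1, 2, 2, 4]

# Create a stacked area plot; the top edge shows the running total
polys = plt.stackplot(x, y1, y2, y3, labels=['A', 'B', 'C'])
plt.legend(loc='upper left')
plt.title("Stacked Area Plot")
plt.xlabel("X-axis")
plt.ylabel("Y-axis")
plt.show()
```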
