# Exploratory Data Analysis (EDA) Charts
This notebook provides explanations and Python code for various charts used in EDA.

## 1. Histogram

A histogram represents the distribution of a single continuous variable. It groups data into bins and shows the frequency of data points in each bin.

In [None]:
import matplotlib.pyplot as plt

data = [your_data]  # Replace with your data
plt.hist(data, bins=10)
plt.title('Histogram')
plt.xlabel('Value')
plt.ylabel('Frequency')
plt.show()
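
To make the binning step concrete, here is a minimal pure-Python sketch of equal-width binning, which is the scheme `plt.hist` applies when `bins` is an integer. The helper name `bin_counts` and the sample values are illustrative assumptions, not part of this notebook's data.

```python
def bin_counts(values, bins=10):
    """Return (edges, counts) for equal-width bins over [min, max]."""
    lo, hi = min(values), max(values)
    if lo == hi:
        # Guard for constant data: widen the range so the bin width is non-zero.
        lo, hi = lo - 0.5, hi + 0.5
    width = (hi - lo) / bins
    edges = [lo + i * width for i in range(bins + 1)]
    counts = [0] * bins
    for v in values:
        # Bins are half-open [left, right), except the last, which also includes hi.
        idx = min(int((v - lo) / width), bins - 1)
        counts[idx] += 1
    return edges, counts


sample = [1, 2, 2, 3, 5]  # illustrative values only
edges, counts = bin_counts(sample, bins=2)
print(edges, counts)
```

Every value lands in exactly one bin, so the counts always sum to the number of observations. Changing `bins` changes the bar heights, which is why it is worth trying a few bin counts before reading too much into the shape.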

## 2. Scatter Plot

Scatter plots are used to examine the relationship between two continuous variables. Each point represents an observation.

In [None]:
plt.scatter(data_x, data_y)  # Replace data_x and data_y with your data
plt.title('Scatter Plot')
plt.xlabel('X-axis label')
plt.ylabel('Y-axis label')
plt.show()

## 3. Box Plot

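Box plots summarize the distribution of a continuous variable. The box spans the first to third quartile, a line inside it marks the median, and points beyond the whiskers (by default 1.5 times the interquartile range from the box in matplotlib) are drawn individually as outliers.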

In [None]:
plt.boxplot(data)  # Replace data with your data
plt.title('Box Plot')
plt.ylabel('Value')
plt.show()
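
The quantities a box plot draws can be computed by hand. The sketch below uses the 1.5 × IQR rule with linearly interpolated quartiles; the helper name `box_stats` and the sample values are assumptions made for illustration.

```python
import statistics


def box_stats(values, whis=1.5):
    """Quartiles, whisker ends, and outliers using the whis * IQR rule."""
    data = sorted(values)
    # method='inclusive' interpolates linearly between the sorted data points.
    q1, median, q3 = statistics.quantiles(data, n=4, method='inclusive')
    iqr = q3 - q1
    low_fence, high_fence = q1 - whis * iqr, q3 + whis * iqr
    inside = [v for v in data if low_fence <= v <= high_fence]
    outliers = [v for v in data if v < low_fence or v > high_fence]
    return {
        'q1': q1, 'median': median, 'q3': q3,
        # Whiskers end at the most extreme data points inside the fences.
        'whisker_low': inside[0], 'whisker_high': inside[-1],
        'outliers': outliers,
    }


sample = [1, 2, 3, 4, 5, 6, 7, 8, 100]  # illustrative values only
stats = box_stats(sample)
print(stats)
```

In this sample the value 100 lies above the upper fence (7 + 1.5 × 4 = 13), so it is reported as an outlier and the upper whisker stops at 8.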

## 4. Bar Chart

Bar charts are used for categorical data, showing the frequency or count of items in each category.

In [None]:
categories = ['Category1', 'Category2', 'Category3']  # Replace with your categories
values = [value1, value2, value3]  # Replace with your values

plt.bar(categories, values)
plt.title('Bar Chart')
plt.xlabel('Category')
plt.ylabel('Value')
plt.show()

## 5. Line Graph

Line graphs are suitable for time series data, showing trends over time.

In [None]:
plt.plot(time_data, value_data)  # Replace time_data and value_data with your data
plt.title('Line Graph')
plt.xlabel('Time')
plt.ylabel('Value')
plt.show()

## 6. Heat Map

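Heat maps display a matrix of values as a grid of colored cells, so high and low regions stand out at a glance. With `annot=True`, seaborn also writes each value inside its cell.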

In [None]:
import seaborn as sns

data_matrix = [[value11, value12], [value21, value22]]  # Replace with your matrix data
sns.heatmap(data_matrix, annot=True)
plt.title('Heat Map')
plt.show()

## 7. Pie Chart

Pie charts show the proportional distribution of categories as parts of a whole.

In [None]:
sizes = [size1, size2, size3]  # Replace with your sizes
labels = ['Label1', 'Label2', 'Label3']  # Replace with your labels

plt.pie(sizes, labels=labels, autopct='%1.1f%%')
plt.title('Pie Chart')
plt.show()

## 8. Violin Plot

Violin plots are similar to box plots but also show a kernel density estimate of the distribution, mirrored on both sides of the central axis.

In [None]:
import seaborn as sns

data = [your_data]  # Replace with your data
sns.violinplot(data=data)
plt.title('Violin Plot')
plt.ylabel('Value')
plt.show()

## 9. Area Chart

Area charts are similar to line charts but fill the area under the line, emphasizing the magnitude of values.

In [None]:
plt.fill_between(time_data, value_data)  # Replace time_data and value_data with your data
plt.title('Area Chart')
plt.xlabel('Time')
plt.ylabel('Value')
plt.show()

## 10. Stacked Bar Chart

Stacked bar charts divide each bar into multiple segments representing sub-categories.

In [None]:
import numpy as np
bar_width = 0.35
index = np.arange(len(categories))

plt.bar(index, values1, bar_width, label='Label1')  # Replace values1 and Label1
plt.bar(index, values2, bar_width, bottom=values1, label='Label2')  # Replace values2 and Label2

plt.xlabel('Category')
plt.ylabel('Values')
plt.title('Stacked Bar Chart')
plt.xticks(index, categories)
plt.legend()
plt.show()
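
The cell above stacks two series by passing `bottom=values1`. With three or more series, each `bottom` has to be the running total of every series beneath it. A minimal sketch of that bookkeeping follows; the helper name `stack_bottoms` and the numbers are hypothetical.

```python
def stack_bottoms(series):
    """For each series, return the per-category baseline it should sit on."""
    running = [0] * len(series[0])
    bottoms = []
    for values in series:
        bottoms.append(list(running))  # copy the baseline before adding this layer
        running = [r + v for r, v in zip(running, values)]
    return bottoms


layers = [[1, 2], [3, 4], [5, 6]]  # illustrative: three series, two categories
bottoms = stack_bottoms(layers)
print(bottoms)
```

Each series and its baseline can then be drawn in a loop with `plt.bar(index, values, bar_width, bottom=bottom)`.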

## 11. Dot Plot

Dot plots are similar to bar charts but use dots instead of bars, often providing a clearer view for small datasets.

In [None]:
positions = range(len(values))  # one row per value
plt.plot(values, positions, 'o')  # 'o' draws markers only, so all dots share one color

plt.yticks(positions, ['Label1', 'Label2', 'Label3'])  # Replace labels
plt.title('Dot Plot')
plt.xlabel('Value')
plt.show()

## 12. Bubble Chart

Bubble charts are like scatter plots but include a third dimension, represented by the size of the dots.

In [None]:
bubble_sizes = [size1, size2, size3]  # Replace with your sizes

plt.scatter(data_x, data_y, s=bubble_sizes)  # s sets each marker's area in points squared
plt.title('Bubble Chart')
plt.xlabel('X-axis')
plt.ylabel('Y-axis')
plt.show()
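
Because `s` is an area in points squared, raw values often need rescaling before they make readable bubbles. The sketch below maps values linearly onto an area range; the helper name `scale_sizes` and its default limits are assumptions chosen for illustration.

```python
def scale_sizes(values, min_area=20.0, max_area=200.0):
    """Linearly map values onto marker areas between min_area and max_area."""
    lo, hi = min(values), max(values)
    if lo == hi:
        # All values equal: give every bubble the same mid-range area.
        return [(min_area + max_area) / 2] * len(values)
    span = hi - lo
    return [min_area + (v - lo) / span * (max_area - min_area) for v in values]


raw = [10, 20, 30]  # illustrative values only
bubble_areas = scale_sizes(raw)
print(bubble_areas)
```

Keeping `min_area` above zero means the smallest observation still shows up on the chart.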

## 13. Radar Chart

Radar charts display multivariate data with axes starting from the same point, useful for comparing multiple items.

In [None]:
from math import pi

# Number of variables
categories = ['A', 'B', 'C', 'D']  # Replace with your categories
N = len(categories)

# Values for each category
values = [value1, value2, value3, value4]  # Replace with your values
values += values[:1]  # repeat the first value to close the circular graph

# Angles for each axis
angles = [2 * pi * n / N for n in range(N)]  # evenly spaced around the circle
angles += angles[:1]

ax = plt.subplot(111, polar=True)
plt.xticks(angles[:-1], categories)
ax.plot(angles, values)
ax.fill(angles, values, alpha=0.3)
plt.show()

## 14. Tree Map

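Tree maps display hierarchical data as nested rectangles whose areas are proportional to their values. The `squarify` call below draws a single, flat level of rectangles.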

In [None]:
import squarify  # pip install squarify

sizes = [size1, size2, size3, size4]  # Replace with your sizes
labels = ['Label1', 'Label2', 'Label3', 'Label4']  # Replace with your labels

squarify.plot(sizes=sizes, label=labels, alpha=0.7)
plt.axis('off')
plt.show()

## 15. Parallel Coordinates Plot

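Parallel coordinates plots show multivariate data by giving each variable its own vertical axis and drawing each observation as a line that crosses all of the axes. Lines are colored by a class column, which makes differences between groups easier to spot.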

In [None]:
from pandas.plotting import parallel_coordinates
import pandas as pd

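# Create a DataFrame: one row per observation, plus a column of class labels.
# The 'group' column name is illustrative; use your own label column.
df = pd.DataFrame({
    'group': ['Group1', 'Group2', 'Group3'],  # Replace with your class labels
    'A': [value1, value2, value3],
    'B': [value4, value5, value6],
    'C': [value7, value8, value9]
})

# class_column names the label column used to color the lines; it is not drawn as an axis
parallel_coordinates(df, class_column='group')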
plt.title('Parallel Coordinates Plot')
plt.show()

## 16. Swarm Plot

Swarm plots draw every observation as a point, grouped by category, and nudge the points sideways so that they do not overlap.

In [None]:
# df must be a long-form DataFrame with one categorical and one numeric column
sns.swarmplot(x='category', y='value', data=df)  # Replace 'category' and 'value' with your column names
plt.title('Swarm Plot')
plt.xlabel('Category')
plt.ylabel('Value')
plt.show()

## 17. Density Plot

Density plots show the distribution of a continuous variable with a smooth curve.

In [None]:
sns.kdeplot(data, fill=True)  # Replace data with your data; fill replaces the deprecated shade argument
plt.title('Density Plot')
plt.xlabel('Value')
plt.show()
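
The smooth curve is a kernel density estimate: a small bump is placed on every observation and the bumps are averaged. The sketch below does this with a Gaussian kernel. The function name `gaussian_kde_at`, the sample values, and the fixed bandwidth are assumptions for illustration; seaborn picks a bandwidth automatically unless told otherwise.

```python
import math


def gaussian_kde_at(x, samples, bandwidth):
    """Average of Gaussian kernels centred on each sample, evaluated at x."""
    norm = 1.0 / (len(samples) * bandwidth * math.sqrt(2 * math.pi))
    return norm * sum(math.exp(-0.5 * ((x - s) / bandwidth) ** 2) for s in samples)


samples = [1.0, 2.0, 2.5, 4.0]  # illustrative values only
bandwidth = 0.5                 # hypothetical fixed bandwidth
step = 0.05
grid = [i * step for i in range(-100, 200)]  # x from -5.0 up to 9.95
density = [gaussian_kde_at(x, samples, bandwidth) for x in grid]

# A density should integrate to roughly 1 over a wide enough grid.
area = sum(density) * step
print(round(area, 3))
```

A larger bandwidth gives a smoother, flatter curve, while a smaller one follows individual observations more closely.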

## 18. Boxen Plot

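Boxen plots (also called letter-value plots) are enhanced box plots designed for larger datasets. They draw additional nested boxes for further quantiles, which reveals more of the shape of the tails.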

In [None]:
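# df must be a long-form DataFrame with one categorical and one numeric column
sns.boxenplot(x='category', y='value', data=df)  # Replace 'category' and 'value' with your column names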
plt.title('Boxen Plot')
plt.xlabel('Category')
plt.ylabel('Value')
plt.show()