# Bar Chart

- **Type**: **Comparison**
- **Purpose**: A bar chart visualizes **categorical data** by comparing values across categories. Each bar represents one category, and its height (or length, for horizontal bars) encodes the value associated with that category.

- **How It Works**:
  - The x-axis (or y-axis for horizontal bars) represents the **categories**, while the y-axis (or x-axis) represents the **values** or **counts** for each category.
  - The length of the bars shows the magnitude of the value for each category, allowing for easy comparison.
  
- **Common Use Cases**:
  - Comparing **sales figures** across different products.
  - Showing the **frequency** of different categories, such as **survey responses** or **population by age group**.
  - Summarizing categorical data or discrete variables.
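
As a minimal sketch of the frequency use case above, the snippet below tallies a small, made-up list of survey responses with `pandas.Series.value_counts` and draws one bar per category. The response values and variable names are illustrative assumptions, not data from this notebook.

```python
import matplotlib.pyplot as plt
import pandas as pd

# Hypothetical survey responses (illustrative data only)
responses = pd.Series(["yes", "no", "yes", "maybe", "yes", "no"])

# One count per category; sort_index() gives a stable alphabetical order
counts = responses.value_counts().sort_index()

fig, ax = plt.subplots(figsize=(6, 4))
bars = ax.bar(counts.index, counts.values, color="steelblue")
ax.set_title("Survey Responses")
ax.set_xlabel("Response")
ax.set_ylabel("Count")
plt.show()
```

Counting first and plotting the aggregated result keeps one bar per category, which is what makes the heights directly comparable.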

## Customization Parameters

### **Matplotlib Customization**

- **`color`**: Sets the color of the bars.
- **`width`**: Sets the width of the bars.
- **`edgecolor`**: Defines the color of the bar borders.
- **`align`**: Controls whether bars are centered on their x positions (`'center'`, the default) or aligned by their left edge (`'edge'`).
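- **`plt.barh`**: Draws horizontal bars. `plt.bar` has no documented `orientation` parameter (that keyword belongs to functions such as `plt.hist`); with `barh`, the bar thickness is set by `height` rather than `width`.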
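
The parameters above can be combined in a single call. The sketch below uses made-up category values (an assumption for illustration) and draws the same data twice: vertically with `Axes.bar` and horizontally with `Axes.barh`.

```python
import matplotlib.pyplot as plt

# Illustrative values only
categories = ["A", "B", "C"]
values = [4, 7, 2]

fig, (ax_v, ax_h) = plt.subplots(1, 2, figsize=(10, 4))

# Vertical bars: color, width, edgecolor and align set explicitly
vertical = ax_v.bar(
    categories,
    values,
    color="orange",
    width=0.6,
    edgecolor="black",
    align="center",
)
ax_v.set_title("Vertical (bar)")

# Horizontal bars: barh takes the categories as y and the values as width;
# the bar thickness is controlled by `height`
horizontal = ax_h.barh(
    categories,
    values,
    color="orange",
    height=0.6,
    edgecolor="black",
)
ax_h.set_title("Horizontal (barh)")

plt.tight_layout()
plt.show()
```

Horizontal bars tend to be easier to read when category labels are long, since the labels do not need to be rotated.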

### **Seaborn Customization**

- **`palette`**: Defines the color palette for the bars.
- **`hue`**: Differentiates bars by color according to a categorical variable.
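- **`errorbar`**: Sets the error bars drawn on each bar (default is `("ci", 95)`, a 95% confidence interval). It replaces the `ci` parameter, which is deprecated as of seaborn 0.12.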
- **`orient`**: Controls the orientation of the bars (`'v'` for vertical or `'h'` for horizontal).
- **`dodge`**: If `True`, bars are separated side-by-side for each level of the `hue` variable.



In [2]:
import matplotlib.pyplot as plt
import pandas as pd
from sklearn import datasets
import seaborn as sns

In [None]:
iris = datasets.load_iris()
df = pd.DataFrame(data=iris.data, columns=iris.feature_names)
df["type"] = iris.target  # integer class labels: 0, 1, 2

# Map each integer label to its species name
FLOWER_NAMES = {0: 'setosa', 1: 'versicolor', 2: 'virginica'}

def map_flower_type(type_value: int) -> str:
    return FLOWER_NAMES.get(type_value, 'Unknown')

df['flower'] = df['type'].apply(map_flower_type)

In [None]:
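# Bar chart for sepal width, passing the raw rows.
# Note: this draws one bar per row (150 in total), overlapping at three
# x positions, so the visible height is the maximum per flower type.
plt.figure(figsize=(8, 6))
plt.bar(df.flower, df['sepal width (cm)'], color='orange')
plt.title('Sepal Width by Flower Type (raw rows)')
plt.xlabel('Flower')
plt.ylabel('Sepal Width (cm)')
plt.show()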

In [None]:
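sns.barplot(
    data=df,
    x="flower",
    y="sepal width (cm)",
    hue="flower",          # color each bar by flower type
    estimator="median",    # bar height is the per-flower median
    errorbar=("ci", 95),   # 95% confidence interval
    palette="cool",
    linewidth=2,
    legend=False,          # available in seaborn 0.13 and later
    orient='v',
    dodge=False,
)

plt.title("Median Sepal Width by Flower Type")
plt.xlabel("Flower")
plt.ylabel("Sepal Width (cm)")
plt.show()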

In [None]:
# Calculate mean sepal width for each flower type
mean_sepal_width = df.groupby('flower')['sepal width (cm)'].mean()

plt.figure(figsize=(8, 6))
plt.bar(
    x=mean_sepal_width.index,
    height=mean_sepal_width,
    color="orange",
    width=0.8,
    align='center',
)
plt.title("Mean Sepal Width by Flower Type")
plt.xlabel("Flower")
plt.ylabel("Sepal Width (cm)")
plt.show()