### What Is a Histogram?
A *histogram* is a type of bar plot that shows the frequency distribution of a single numerical variable by grouping the data points into bins (intervals) and plotting how many data points fall into each bin.

- The x-axis represents the bins (ranges of values).

- The y-axis represents the count (or frequency) of data points per bin.

This visualization makes the shape, center, and spread of the data visible at a glance.
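To make the binning step concrete, here is a minimal sketch that counts values per bin with NumPy's `np.histogram`. The `scores` array is made-up data used only for illustration; the bin count and range are arbitrary choices.

```python
import numpy as np

# Hypothetical sample: 12 exam scores (made-up values for illustration).
scores = np.array([52, 55, 61, 64, 67, 68, 70, 71, 73, 78, 85, 93])

# Split the range 50..100 into 5 equal-width bins and count the values in each.
# `edges` holds the 6 bin boundaries; `counts` holds the 5 bar heights.
counts, edges = np.histogram(scores, bins=5, range=(50, 100))

# Each bin includes its left edge; only the last bin also includes its right edge.
for left, right, count in zip(edges[:-1], edges[1:], counts):
    print(f"{left:.0f} to {right:.0f}: {count}")
```

The `edges` values are what a histogram places on the x-axis, and the `counts` values are the bar heights on the y-axis.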

### Why Are Histograms Important?
Histograms reveal aspects of data that summary statistics (like mean or median) might miss:

**Distribution Shape**:

- Whether data is symmetric (like the classic bell-shaped normal distribution).

- If the data clusters around a central value or is spread out.

**Skewness**:

- Whether one tail of the distribution is longer than the other (skewed left or right), meaning the data is not symmetric.

- This skewness affects statistical modeling and sometimes calls for data transformations.
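As a rough illustration of skewness and of one possible transformation, the sketch below generates synthetic right-skewed data and measures it with `Series.skew()`. The `income` series is simulated, and the log transform is only one common option, assumed here for illustration; it requires strictly positive values.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)

# Hypothetical right-skewed data: simulated log-normal "incomes".
income = pd.Series(rng.lognormal(mean=10, sigma=1, size=5000))

# Positive skewness indicates a longer right tail.
skew_before = income.skew()

# In right-skewed data the long tail usually pulls the mean above the median.
mean_value, median_value = income.mean(), income.median()

# A log transform compresses the right tail (valid only for positive values).
skew_after = np.log(income).skew()

print(f"skew before: {skew_before:.2f}, after log: {skew_after:.2f}")
print(f"mean: {mean_value:.0f}, median: {median_value:.0f}")
```

The mean-versus-median comparison is a rule of thumb rather than a guarantee, so it is worth confirming against the histogram itself.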

**Outliers**:

- Histograms highlight rare or extreme values that fall far from the main data cluster.

- Outliers require investigation and sometimes data cleaning or transformation.

This makes histograms vital during exploratory data analysis (EDA) before deeper modeling or inference.
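The following sketch shows why summary statistics alone can mislead. It builds two synthetic samples with nearly the same mean and a similar standard deviation, then compares their bin counts. All data here is simulated, and the bin edges are an arbitrary choice.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(1)

# One group centered at 50.
unimodal = pd.Series(rng.normal(50, 10, 4000))

# A mixture of two groups centered at 40 and 60.
bimodal = pd.Series(
    np.concatenate([rng.normal(40, 3, 2000), rng.normal(60, 3, 2000)])
)

# The summary statistics look alike...
print(f"unimodal: mean={unimodal.mean():.1f}, std={unimodal.std():.1f}")
print(f"bimodal:  mean={bimodal.mean():.1f}, std={bimodal.std():.1f}")

# ...but the bin counts do not. Use the same 5-unit-wide bins for both samples.
bin_edges = np.arange(20, 85, 5)
uni_counts, _ = np.histogram(unimodal, bins=bin_edges)
bi_counts, _ = np.histogram(bimodal, bins=bin_edges)

print("unimodal counts:", uni_counts)
print("bimodal counts: ", bi_counts)
```

The unimodal sample piles up in the bins around 50, while the bimodal sample is nearly empty there and peaks on either side, a difference the mean cannot show.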

### Creating Histograms with pandas

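In pandas, `DataFrame.hist()` draws one histogram per selected column using Matplotlib. An abbreviated signature with the most commonly used parameters:

    DataFrame.hist(column=None, by=None, bins=10, figsize=None, grid=True, ...)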

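- `column`: Name(s) of the column(s) to plot. If omitted, all numeric columns are plotted.

- `by`: Column(s) to group by; one histogram is drawn per group.

- `bins`: Number of bins (intervals), 10 by default. A sequence of bin edges is also accepted.

- `figsize`: Tuple `(width, height)` giving the figure size in inches.

- `grid`: Whether to show axis grid lines (`True` by default).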
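A minimal usage sketch of these parameters follows. The DataFrame and its column names (`height_cm`, `weight_kg`) are hypothetical, simulated data, and the parameter values are arbitrary.

```python
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

rng = np.random.default_rng(42)

# Hypothetical dataset with two numeric columns (simulated values).
df = pd.DataFrame({
    "height_cm": rng.normal(170, 8, 500),
    "weight_kg": rng.normal(70, 12, 500),
})

# One subplot per column, 20 bins each, in a wide figure without grid lines.
axes = df.hist(
    column=["height_cm", "weight_kg"],
    bins=20,
    figsize=(10, 4),
    grid=False,
)

# Avoid overlapping titles and tick labels between the subplots.
plt.tight_layout()
```

In a notebook the figure renders inline; in a script, `plt.show()` or `plt.savefig(...)` would be needed to see it. The call returns an array of Matplotlib axes, which can be used to adjust titles or labels afterwards.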

### Interpreting Histograms
1. **Data Shape**
- Symmetric / Normal Distribution: Histogram forms a bell curve.
- Uniform: Flat-topped, values spread evenly across bins.
- Bimodal: Two distinct peaks → possibly a mixture of two groups.

2. **Skewness**

- Right (Positive) Skew: Longer tail on the right; bulk of data on the left.
- Left (Negative) Skew: Longer tail on the left; bulk of data on the right.

3. **Outliers**
- Isolated bars far from the main cluster.
- Identify and investigate them before deciding whether to clean, transform, or keep the data.
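An isolated bar can be checked numerically as well as visually. The sketch below uses made-up values and applies the common 1.5 × IQR rule of thumb, which is a convention assumed here for illustration and not something prescribed by the text above.

```python
import numpy as np
import pandas as pd

# Hypothetical measurements with one value far from the rest.
values = pd.Series([12, 14, 15, 15, 16, 17, 18, 18, 19, 21, 22, 58])

# Bin counts: a run of empty bins before the last bar shows the isolated value.
counts, edges = np.histogram(values, bins=10)
print("counts per bin:", counts)

# Tukey's rule of thumb: flag points more than 1.5 * IQR beyond the quartiles.
q1, q3 = values.quantile([0.25, 0.75])
iqr = q3 - q1
lower, upper = q1 - 1.5 * iqr, q3 + 1.5 * iqr

outliers = values[(values < lower) | (values > upper)]
print("flagged values:", outliers.tolist())
```

Flagged values are candidates for investigation, not automatic deletions; whether they are errors or genuine extremes depends on the data source.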
