# Mode
The mode is the value that appears most frequently in a dataset. Unlike the mean and the median, the mode can be used with both numerical and categorical data.

- **Most frequent value:** Identifies the most common observation
- **Works with categorical data:** The only measure of central tendency that is defined for nominal (unordered, non-numerical) data
- **Multiple modes possible:** A dataset can have no mode, one mode, or multiple modes
- **Not affected by outliers:** Extreme values don't influence the mode

**When to Use Mode**

Use Mode If:
- Your data is categorical (colors, brands, types)
- You want to identify peaks in a distribution
- You need the most common category or preference
- Your data is nominal, so mathematical operations don't make sense
- You want quick insight into popular choices

Don't Use Mode If:
- You need a mathematical average for calculations
- Data has many unique values with low frequencies
- You're working with continuous numerical data without clear peaks
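
When continuous data has no repeated values, one option is to bin it first and report the most populated bin (the modal class). The following is a minimal sketch using only the standard library; the `modal_bin` helper, the sample heights, and the bin width of 5 are illustrative assumptions, not part of the original material.

```python
import math
from collections import Counter

def modal_bin(values, width):
    """Return ((lower, upper), count) for the most populated bin of the given width."""
    # Map each value to the index k of the half-open bin [k*width, (k+1)*width)
    counts = Counter(math.floor(v / width) for v in values)
    index, count = counts.most_common(1)[0]
    return (index * width, (index + 1) * width), count

# Hypothetical heights in cm: every value is unique, so the raw mode is uninformative
heights = [158.2, 161.5, 163.1, 165.2, 166.0, 167.3, 168.9, 171.4, 176.7, 182.0]
interval, count = modal_bin(heights, width=5)
print(f"Modal bin: {interval}, count: {count}")
```

The result depends on the chosen bin width, so it is worth trying a few widths before drawing conclusions.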



**Types of Mode**
1. **Unimodal:** One value appears most frequently, `[1, 2, 2, 3, 4] → Mode = 2`
2. **Bimodal:** Two values appear with the same highest frequency, `[1, 2, 2, 3, 3, 4] → Modes = 2, 3`
3. **Multimodal:** More than two values appear with the same highest frequency, `[1, 1, 2, 2, 3, 3, 4] → Modes = 1, 2, 3`
4. **No Mode:** All values appear with equal frequency, `[1, 2, 3, 4, 5] → No mode`
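
The four cases above can be sketched with `collections.Counter`. The `classify_modes` helper is a hypothetical name introduced for illustration, and treating "all values tie" as "no mode" is the convention of this section; some libraries (for example `statistics.multimode` and `pandas.Series.mode`) instead return every value in that case.

```python
from collections import Counter

def classify_modes(values):
    """Return (label, modes) for a non-empty sequence, following the four cases above."""
    counts = Counter(values)
    highest = max(counts.values())
    modes = sorted(v for v, c in counts.items() if c == highest)
    # Convention assumed here: if every distinct value ties (and there is
    # more than one distinct value), report "no mode"
    if len(modes) == len(counts) and len(counts) > 1:
        return "no mode", []
    labels = {1: "unimodal", 2: "bimodal"}
    return labels.get(len(modes), "multimodal"), modes

samples = ([1, 2, 2, 3, 4], [1, 2, 2, 3, 3, 4], [1, 1, 2, 2, 3, 3, 4], [1, 2, 3, 4, 5])
for sample in samples:
    print(sample, "->", classify_modes(sample))
```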



## Implementation

### 1. Using SciPy Stats Module

from scipy import stats
import numpy as np

# Numerical data with mode
data = [1, 2, 2, 3, 4, 4, 4, 5]

# Calculate mode - returns ModeResult object
mode_result = stats.mode(data)

print(f"Mode value: {mode_result.mode}")        # 4
print(f"Mode count: {mode_result.count}")       # 3
print(f"Full result: {mode_result}")            # ModeResult(mode=np.int64(4), count=np.int64(3))

# Working with the result
mode_value = mode_result.mode
mode_count = mode_result.count
print(f"The value {mode_value} appears {mode_count} times")

Mode value: 4
Mode count: 3
Full result: ModeResult(mode=np.int64(4), count=np.int64(3))
The value 4 appears 3 times
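
When several values tie for the highest frequency, `stats.mode` still reports a single value; SciPy's documentation describes it as returning only one of the tied values (the smallest). The sketch below illustrates that behavior and one way to recover all tied modes with `np.unique`. The `tied` array is made-up sample data.

```python
import numpy as np
from scipy import stats

tied = np.array([1, 2, 2, 3, 3, 4])  # 2 and 3 both appear twice

result = stats.mode(tied)
# np.atleast_1d copes with both scalar results (recent SciPy)
# and length-1 arrays (older SciPy)
single_mode = int(np.atleast_1d(result.mode)[0])
print(f"stats.mode reports: {single_mode}")

# Recover every tied mode from the value counts
values, counts = np.unique(tied, return_counts=True)
all_modes = values[counts == counts.max()].tolist()
print(f"All modes: {all_modes}")
```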


### 2. Using pandas (for Data Analysis)

import pandas as pd
import numpy as np

# Create a DataFrame with mixed data types
data = {
    'product_category': ['Electronics', 'Clothing', 'Electronics', 'Books', 'Electronics', 'Clothing'],
    'price': [100, 50, 150, 30, 120, 45],
    'customer_rating': [4, 5, 4, 3, 4, 5]
}
df = pd.DataFrame(data)

print("DataFrame:")
print(df)

# Mode for categorical data
category_mode = df['product_category'].mode()
print(f"\nMost common product category: {category_mode.values}")  # ['Electronics']

# Mode for numerical data
rating_mode = df['customer_rating'].mode()
print(f"Most common rating: {rating_mode.values}")  # [4]

# Mode for entire DataFrame (one column per variable; columns with fewer modes are padded with NaN)
df_modes = df.mode()
print("\nModes for each column:")
print(df_modes)

DataFrame:
  product_category  price  customer_rating
0      Electronics    100                4
1         Clothing     50                5
2      Electronics    150                4
3            Books     30                3
4      Electronics    120                4
5         Clothing     45                5

Most common product category: ['Electronics']
Most common rating: [4]

Modes for each column:
  product_category  price  customer_rating
0      Electronics     30              4.0
1              NaN     45              NaN
2              NaN     50              NaN
3              NaN    100              NaN
4              NaN    120              NaN
5              NaN    150              NaN


## Advanced Usage

### 1. Mode with Grouped Data

import pandas as pd

# Sample data with groups
data = {
    'department': ['HR', 'HR', 'HR', 'Engineering', 'Engineering', 'Engineering', 'Sales', 'Sales'],
    'programming_language': ['Python', 'Python', 'Java', 'Python', 'JavaScript', 'Python', 'Java', 'Python']
}

df_tech = pd.DataFrame(data)

print("Original Data:")
print(df_tech)

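# Mode by group. When a group has tied modes, .iloc[0] keeps only the first one in sorted order;
# the 'No mode' fallback applies only when a group contains no non-missing values.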
department_modes = df_tech.groupby('department')['programming_language'].agg(
    lambda x: x.mode().iloc[0] if not x.mode().empty else 'No mode'
)

print("\nMost common programming language by department:")
print(department_modes)

Original Data:
    department programming_language
0           HR               Python
1           HR               Python
2           HR                 Java
3  Engineering               Python
4  Engineering           JavaScript
5  Engineering               Python
6        Sales                 Java
7        Sales               Python

Most common programming language by department:
department
Engineering    Python
HR             Python
Sales            Java
Name: programming_language, dtype: object
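
In the output above, Sales shows `Java` even though `Java` and `Python` each appear once in that department, because only the first mode is kept. A minimal sketch of one alternative that keeps every tied mode per group as a list; the variable name `all_modes_by_department` is introduced here for illustration.

```python
import pandas as pd

df_tech = pd.DataFrame({
    'department': ['HR', 'HR', 'HR', 'Engineering', 'Engineering', 'Engineering', 'Sales', 'Sales'],
    'programming_language': ['Python', 'Python', 'Java', 'Python', 'JavaScript', 'Python', 'Java', 'Python'],
})

# Keep every mode per group as a list instead of only the first one
all_modes_by_department = (
    df_tech.groupby('department')['programming_language']
    .apply(lambda s: s.mode().tolist())
)
print(all_modes_by_department)
```

Lists inside a Series are less convenient for further aggregation, so this form is mainly useful for inspection and reporting.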


### 2. Mode with DateTime Data

import numpy as np
import pandas as pd

# Most common day of the week for sales
dates = pd.date_range('2024-01-01', '2024-03-31', freq='D')
sales_days = np.random.choice(dates, 50)  # Random sales dates

# Convert to day names (strftime('%A') follows the system locale)
day_names = [pd.Timestamp(day).strftime('%A') for day in sales_days]

day_series = pd.Series(day_names)
most_common_day = day_series.mode()

print(f"Sales occurred on {len(set(day_names))} different days of the week")
print(f"Most common sales day(s): {list(most_common_day)}")

Sales occurred on 7 different days of the week
Most common sales day(s): ['Thursday']
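
Because the dates above are sampled at random and `strftime('%A')` follows the system locale, the printed day can differ between runs and machines. A reproducible sketch, assuming a small set of hypothetical fixed sale dates and using `Series.dt.day_name()`, which returns English names unless a locale is passed:

```python
import pandas as pd

# Hypothetical, fixed sale dates so the result is reproducible
sale_dates = pd.to_datetime([
    '2024-01-01', '2024-01-04', '2024-01-11',
    '2024-01-18', '2024-02-02', '2024-02-05',
])

# day_name() defaults to English regardless of the system locale
day_names = pd.Series(sale_dates).dt.day_name()
most_common_days = day_names.mode().tolist()
print(f"Most common sales day(s): {most_common_days}")
```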



## Key Python Libraries Summary

| Library       | Function(s)                          | Best For                                         |
|---------------|--------------------------------------|--------------------------------------------------|
| `statistics`  | `mode()`, `multimode()`              | Basic mode calculations, educational use         |
| SciPy         | `stats.mode()`                       | Numerical arrays, returns count as well          |
| pandas        | `Series.mode()`, `DataFrame.mode()`  | Data analysis, handles multiple modes well       |
| `collections` | `Counter().most_common()`            | Frequency analysis, custom mode logic            |
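
The two standard-library rows in the table are not demonstrated elsewhere in this section, so here is a short sketch of both. The `votes` list is made-up sample data with a deliberate tie.

```python
import statistics
from collections import Counter

# 'red' and 'blue' tie with two votes each
votes = ['red', 'blue', 'red', 'green', 'blue']

# statistics.mode returns the first mode encountered (Python 3.8+)
first_mode = statistics.mode(votes)
print(first_mode)        # red

# statistics.multimode returns every mode, in order of first appearance
all_modes = statistics.multimode(votes)
print(all_modes)         # ['red', 'blue']

# Counter.most_common gives (value, count) pairs sorted by descending count
top_two = Counter(votes).most_common(2)
print(top_two)           # [('red', 2), ('blue', 2)]
```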


## Practical Tips
- **For nominal categorical data:** Use the mode; the mean needs numeric values and the median needs at least an ordering
- **Check for multiple modes:** They can reveal important patterns in your data
- **With continuous data:** Consider binning first to find ranges with highest frequency
- **In machine learning:** Mode is used for imputing missing categorical values
- **For business insights:** Mode reveals the most common customer behavior or preference
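
The imputation tip above can be sketched as follows. The `colors` Series is hypothetical sample data, and taking the first mode as the fill value is one simple choice among several.

```python
import pandas as pd

# Hypothetical column with missing categories
colors = pd.Series(['red', 'blue', None, 'red', None, 'green'], name='color')

# Series.mode() ignores missing values by default; use the first mode as the fill value
fill_value = colors.mode().iloc[0]
imputed = colors.fillna(fill_value)
print(imputed.tolist())
```

Filling with the mode inflates the share of the most common category, so it is worth checking how many values were imputed before relying on downstream frequencies.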

The mode is particularly valuable when you need to understand what's "typical" or "most popular" in your data, especially with categorical information!

