## In a Star Schema:

- The center table (Fact Table) = the main “event log” with numbers (e.g., sales amount, quantity, salary paid).
- The outer tables (Dimension Tables) = details that describe those events (e.g., customer info, product details, dates).

## It’s called a star because:

- The Fact Table sits in the middle.
- The Dimension Tables are connected around it like points of a star ⭐.

## Example: Online Store

- Fact Table (center): Sales → order_id, product_id, customer_id, date_id, quantity, total_amount
- Dimension Tables (around it):
    - Customers → customer_id, name, gender, location
    - Products → product_id, name, category, price
    - Date → date_id, day, month, year

<strong>So: “Customer A bought Product X on Date Y for $Z”.</strong>
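
As a rough sketch of the online store example, the snippet below builds the four tables in an in-memory SQLite database and joins the fact table to its dimensions to recover the sentence above. The table name `date_dim` and all sample values are assumptions made for illustration; they are not part of the original example.

```python
import sqlite3

# In-memory database: nothing is written to disk.
conn = sqlite3.connect(":memory:")

# One fact table (sales) surrounded by three dimension tables.
# `date_dim` is a hypothetical name for the Date dimension.
conn.executescript("""
CREATE TABLE customers (customer_id INTEGER PRIMARY KEY, name TEXT, gender TEXT, location TEXT);
CREATE TABLE products  (product_id  INTEGER PRIMARY KEY, name TEXT, category TEXT, price REAL);
CREATE TABLE date_dim  (date_id     INTEGER PRIMARY KEY, day INTEGER, month INTEGER, year INTEGER);
CREATE TABLE sales (
    order_id     INTEGER PRIMARY KEY,
    product_id   INTEGER REFERENCES products(product_id),
    customer_id  INTEGER REFERENCES customers(customer_id),
    date_id      INTEGER REFERENCES date_dim(date_id),
    quantity     INTEGER,
    total_amount REAL
);
-- Made-up sample rows
INSERT INTO customers VALUES (1, 'Customer A', 'F', 'Location 1');
INSERT INTO products  VALUES (10, 'Product X', 'Electronics', 250.0);
INSERT INTO date_dim  VALUES (20250902, 2, 9, 2025);
INSERT INTO sales     VALUES (1000, 10, 1, 20250902, 2, 500.0);
""")

# Join the fact row to each dimension to describe the event in words.
row = conn.execute("""
SELECT c.name, p.name, d.year, d.month, d.day, s.total_amount
FROM sales AS s
JOIN customers AS c ON c.customer_id = s.customer_id
JOIN products  AS p ON p.product_id  = s.product_id
JOIN date_dim  AS d ON d.date_id     = s.date_id
""").fetchone()

sentence = f"{row[0]} bought {row[1]} on {row[2]}-{row[3]:02d}-{row[4]:02d} for {row[5]}"
print(sentence)
```

The fact row stores only keys and numbers; every readable detail comes from a dimension table through a join.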

## Column names

Age, Attrition, BusinessTravel, DailyRate, Department, DistanceFromHome, Education, EducationField, EmployeeCount, EmployeeNumber, EnvironmentSatisfaction, Gender, HourlyRate, JobInvolvement, JobLevel, JobRole, JobSatisfaction, MaritalStatus, MonthlyIncome, MonthlyRate, NumCompaniesWorked, Over18, OverTime, PercentSalaryHike, PerformanceRating, RelationshipSatisfaction, StandardHours, StockOptionLevel, TotalWorkingYears, TrainingTimesLastYear, WorkLifeBalance, YearsAtCompany, YearsInCurrentRole, YearsSinceLastPromotion, YearsWithCurrManager

## 🏛 Fact vs Dimension in HR Data

### ✅ Facts (numeric, additive measures – things you aggregate)

- DailyRate → (can be averaged/summed)
- HourlyRate → (can be averaged/summed)
- MonthlyIncome → (key salary measure)
- MonthlyRate → (pay metric, can aggregate)
- NumCompaniesWorked → (numeric, but more like history → could be dimension if treated categorically)
- PercentSalaryHike → (numeric but measure-like, often aggregated)
- PerformanceRating → (usually categorical 1–4, but sometimes treated as fact if you analyze avg rating)
- StandardHours → (same value for every employee, so not useful; can ignore)
- TotalWorkingYears → (measure, can avg/sum across employees)
- TrainingTimesLastYear → (measure)
- YearsAtCompany → (measure, avg/min/max used)
- YearsInCurrentRole → (measure, avg/min/max used)
- YearsSinceLastPromotion → (measure, avg/min/max used)
- YearsWithCurrManager → (measure, avg/min/max used)

<strong>👉 These belong in the Fact Table (e.g., “Employee Metrics Fact”), along with a foreign key to the Employee Dimension.</strong>

### ✅ Dimensions (descriptive/contextual attributes – used to filter/group facts)

- Age → descriptive (group employees into age bands).
- Attrition → categorical (Yes/No).
- BusinessTravel → categorical.
- Department → categorical.
- DistanceFromHome → descriptive, often grouped.
- Education → categorical (coded 1–5, mapping to levels such as Bachelor’s or Master’s).
- EducationField → categorical.
- EmployeeCount → not useful (always 1 per row, can ignore).
- EmployeeNumber → employee identifier (Primary Key).
- EnvironmentSatisfaction → categorical (rating 1–4).
- Gender → categorical.
- JobInvolvement → categorical.
- JobLevel → categorical (1–5).
- JobRole → categorical.
- JobSatisfaction → categorical (rating 1–4).
- MaritalStatus → categorical.
- Over18 → categorical (constant in the IBM dataset, so it can be ignored).
- OverTime → categorical (Yes/No).
- RelationshipSatisfaction → categorical.
- StockOptionLevel → categorical.
- WorkLifeBalance → categorical.

<strong> 👉 These belong in Dimension Tables (like Employee, Department, Demographics). </strong>
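
To make the split concrete, here is a minimal sketch that separates flat HR rows into an employee dimension and an employee metrics fact table, linked by EmployeeNumber. Only a handful of the columns are used, and the two sample rows are invented for illustration; they are not taken from the real dataset.

```python
# Hypothetical sample rows using a subset of the HR columns.
rows = [
    {"EmployeeNumber": 1, "Department": "Sales", "Gender": "Female",
     "OverTime": "Yes", "MonthlyIncome": 5000, "YearsAtCompany": 6},
    {"EmployeeNumber": 2, "Department": "Research & Development", "Gender": "Male",
     "OverTime": "No", "MonthlyIncome": 4000, "YearsAtCompany": 10},
]

# Which columns describe the employee, and which are measures.
DIMENSION_COLUMNS = ["Department", "Gender", "OverTime"]
FACT_COLUMNS = ["MonthlyIncome", "YearsAtCompany"]

# Dimension table: one entry per employee, keyed by the primary key.
employee_dim = {
    r["EmployeeNumber"]: {col: r[col] for col in DIMENSION_COLUMNS}
    for r in rows
}

# Fact table: the measures plus a foreign key back to the dimension.
employee_fact = [
    {"EmployeeNumber": r["EmployeeNumber"], **{col: r[col] for col in FACT_COLUMNS}}
    for r in rows
]

print(employee_dim)
print(employee_fact)
```

With a single flat source like this one, the split mainly separates "what you group by" from "what you aggregate"; in a larger model, Department or JobRole could become their own dimension tables.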

## 🔢 Why Facts are Numeric & Additive

### Facts = the “things you measure.”

- They usually answer questions like:
    - How much?
    - How many?
    - What is the total?

### 📊 Example (E-commerce):

- Sales Amount = ₹500
- Quantity Sold = 3

<strong> 👉 These values are numbers because we want to calculate, sum, or average them. </strong>

#### 📈 They’re additive because you can combine them across time or categories:
- Sales from Jan + Sales from Feb = Total Sales (makes sense ✅)
- Quantity sold by Customer A + Customer B = Total Quantity (makes sense ✅)

<strong> That’s why Facts = numeric, additive measures. </strong>
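
The additivity idea can be checked with a few lines of Python. In this sketch the sales rows are made up; the point is only that summing by month or by customer yields subtotals that themselves add up to the same grand total.

```python
from collections import defaultdict

# Hypothetical fact rows: each one is a sale with two numeric measures.
sales = [
    {"month": "Jan", "customer": "A", "amount": 500, "quantity": 3},
    {"month": "Jan", "customer": "B", "amount": 200, "quantity": 1},
    {"month": "Feb", "customer": "A", "amount": 300, "quantity": 2},
]

amount_by_month = defaultdict(int)
quantity_by_customer = defaultdict(int)
for sale in sales:
    amount_by_month[sale["month"]] += sale["amount"]
    quantity_by_customer[sale["customer"]] += sale["quantity"]

# Subtotals along either dimension add up to the overall totals.
total_amount = sum(amount_by_month.values())
total_quantity = sum(quantity_by_customer.values())

print(dict(amount_by_month))        # amounts per month
print(dict(quantity_by_customer))   # quantities per customer
print(total_amount, total_quantity)
```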

## 🏷 Why Dimensions are Descriptive/Contextual

### Dimensions = the “details that explain the facts.”

- They usually answer questions like:
    - Who?
    - What?
    - When?
    - Where?

### 📌 Example (E-commerce):

- Product Name = "iPhone 15"
- Customer Age = 28
- Order Date = "2025-09-02"

<strong>👉 These values are descriptions, categories, or labels, not something you sum up.</strong>

- Adding them makes no sense:
    - "Customer A" + "Customer B" ≠ "Customer C" ❌
    - "Product A" + "Product B" is not a meaningful total ❌

- Instead, we use them to slice and dice facts:
    - Sales by product
    - Sales by customer age group
    - Sales by month


<strong> That’s why Dimensions = descriptive/contextual attributes. </strong>
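
As a small illustration of slicing facts by a dimension attribute, the sketch below groups sales amounts by customer age group. The customers, amounts, and age-band boundaries are all assumptions chosen for the example.

```python
from collections import defaultdict

# Hypothetical customer dimension: descriptive attributes keyed by customer_id.
customers = {
    1: {"name": "Customer A", "age": 28},
    2: {"name": "Customer B", "age": 41},
}

# Hypothetical fact rows: (customer_id, sales amount).
sales = [(1, 500), (2, 200), (1, 300)]


def age_group(age):
    """Map a raw age to a label; the band boundaries are arbitrary."""
    if age < 30:
        return "under 30"
    if age < 45:
        return "30-44"
    return "45+"


# The age itself is never summed; it only decides which bucket a sale lands in.
sales_by_age_group = defaultdict(int)
for customer_id, amount in sales:
    group = age_group(customers[customer_id]["age"])
    sales_by_age_group[group] += amount

print(dict(sales_by_age_group))
```

Age is a number here, yet it behaves as a dimension: it is used to label and group rows, while the amount is the value being aggregated.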

#### ✅ In short:

- Facts = numbers you calculate with.
- Dimensions = labels you filter or group by.