# Types of Data

Data is of two types:
1. Quantitative
2. Qualitative

![image.png](attachment:image.png)

## Quantitative vs Qualitative Data

In statistics, **data** is information collected for analysis.  
It can be broadly classified into **Quantitative** and **Qualitative** data.

---

### 1. Quantitative Data

- **Definition**: Data that can be **measured and expressed numerically**.  

- **Key Features**:
  - Supports arithmetic operations (add, subtract, average).  
  - Usually represented as numbers.  
  - Can be further classified into:
    - **Discrete** → Countable values (e.g., number of students, cars).  
    - **Continuous** → Measurable values (e.g., height, weight, temperature).  

- **Examples:**

    - Age of employees: 25, 30, 40  
    - Number of items sold: 10, 20, 50  
    - Temperature in Celsius: 36.5, 37.2  

- **Uses:**
    - Statistical calculations like mean, median, standard deviation.  
    - Useful for trend analysis, forecasting, and numerical comparisons.  

---

### 2. Qualitative Data

- **Definition**: Data that **describes attributes or categories** and is **non-numeric**.  

- **Key Features**:
  - Arithmetic operations (add, subtract, average) are not meaningful.  
  - Usually represented as words, labels, or categories.  
  - Can be further classified into:
    - **Nominal** → Categories without order (e.g., gender, color, city).  
    - **Ordinal** → Categories with a natural order (e.g., ratings: poor, good, excellent).  

- **Examples:**
    - Customer satisfaction: "Satisfied", "Neutral", "Dissatisfied"  
    - Eye color: Blue, Brown, Green  
    - Product type: Electronics, Clothing, Food  

- **Uses:**
    - Understanding preferences, behaviors, and patterns.  
    - Useful for surveys, polls, and categorical analysis.  

---

### 3. Comparison Table

| Feature                 | Quantitative Data                  | Qualitative Data                   |
|-------------------------|----------------------------------|----------------------------------|
| Nature                  | Numerical                        | Categorical/Descriptive           |
| Arithmetic operations   | Possible                         | Not possible                       |
| Subtypes                | Discrete, Continuous             | Nominal, Ordinal                  |
| Example                 | Age, Salary, Temperature         | Gender, Brand, Rating             |
| Visualization           | Histogram, Boxplot, Line chart   | Pie chart, Bar chart              |
| Use Case                | Trend analysis, Forecasting      | Understanding behavior, preference |
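
The contrast in the table can be sketched in a few lines of Python. This is a minimal illustration using only the standard library; the ages come from the examples above, while the eye-color sample (including the repeated "Brown") is an assumed value added for demonstration.

```python
from collections import Counter
from statistics import mean, median

# Quantitative: employee ages taken from the examples above.
ages = [25, 30, 40]

# Qualitative: eye colors (the repeated "Brown" is an assumed sample value).
eye_colors = ["Blue", "Brown", "Green", "Brown"]

# Arithmetic summaries are meaningful for numbers.
mean_age = mean(ages)
median_age = median(ages)

# For categories, counting occurrences is the meaningful operation.
color_counts = Counter(eye_colors)
most_common_color = color_counts.most_common(1)[0][0]

print(f"Mean age: {mean_age:.1f}, median age: {median_age}")
print(f"Eye color counts: {dict(color_counts)}")
print(f"Most common eye color: {most_common_color}")
```

Averaging `eye_colors` would raise an error, which mirrors the "Not possible" entry in the arithmetic row.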


## *Types of Quantitative Data*

In statistics, **quantitative data** (numerical data) can be further classified into **Discrete** and **Continuous** data.  

---

### 1. Discrete Data

- **Definition**: Discrete data consists of **countable, separate values**.  

- **Key Features**:
  - Can take **only specific values** (typically integers).  
  - Often obtained by **counting**.  
  - Cannot take values between two adjacent points (e.g., there cannot be 2.5 students).  

- **Examples**:
  - Number of students in a class: 25, 30, 40  
  - Number of cars in a parking lot: 5, 12, 20  
  - Number of defective items in a batch: 0, 1, 2, 3  

- **Visualization**: Bar chart, Pie chart  

---

### 2. Continuous Data

- **Definition**: Continuous data can take **any value within a given range**.  

- **Key Features**:
  - Obtained by **measuring**.  
  - Can have **fractional/decimal values**.  
  - Can take infinite possible values between any two points.  

- **Examples**:
  - Height of students: 160.5 cm, 165.2 cm  
  - Weight: 55.3 kg, 60.7 kg  
  - Temperature: 36.5°C, 37.8°C  

- **Visualization**: Histogram, Line chart, Boxplot  

---

### 3. Comparison Table

| Feature              | Discrete Data                     | Continuous Data                    |
|---------------------|-----------------------------------|-----------------------------------|
| Nature              | Countable numbers                 | Measurable numbers                |
| Values              | Specific, separate integers       | Any value in a range (including decimals) |
| Source              | Counting                           | Measuring                         |
| Examples            | Number of students, cars, votes  | Height, weight, temperature       |
| Visualization       | Bar chart, Pie chart              | Histogram, Line chart, Boxplot    |
| Fractional Values   | Not possible                       | Possible                           |

---

### 4. Summary
- **Discrete data** → Countable, separate, integers  
- **Continuous data** → Measurable, infinite possible values, can include fractions  
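
One practical consequence of this distinction is how the values are tabulated before plotting. The sketch below is illustrative only: the defect counts and most of the heights are assumed sample values, and the 5 cm bin width is an arbitrary choice.

```python
from collections import Counter

# Discrete: defects per batch, obtained by counting (assumed sample values).
defects_per_batch = [0, 1, 2, 3, 1, 0, 1]

# Continuous: heights in cm, obtained by measuring
# (first two values from the examples above, the rest assumed).
heights_cm = [160.5, 165.2, 158.9, 171.0, 166.4]

# Discrete values can be tabulated directly: one bar per distinct value.
defect_freq = Counter(defects_per_batch)

# Continuous values rarely repeat exactly, so they are grouped into bins
# first, which is what a histogram does. The bin width is an arbitrary choice.
bin_width = 5

def bin_start(value):
    """Return the lower edge of the bin that contains value."""
    return int(value // bin_width) * bin_width

height_bins = Counter(bin_start(h) for h in heights_cm)

print("Defect frequencies:", dict(sorted(defect_freq.items())))
print("Height bins (lower edge -> count):", dict(sorted(height_bins.items())))
```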


## *Types of Qualitative Data*

In statistics, **qualitative data** (categorical data) can be further classified into **Nominal** and **Ordinal** data.  

---

### 1. Nominal Data

- **Definition**: Nominal data consists of **categories without any order**.  

- **Key Features**:
  - Values are only **labels or names**.  
  - There is no ranking between categories.  
  - Numbers, if used, are only codes and carry no mathematical meaning.  

- **Examples**:
  - Gender: Male, Female, Other  
  - Eye color: Blue, Brown, Green  
  - Product type: Electronics, Clothing, Food  

- **Visualization**: Bar chart, Pie chart  

---

### 2. Ordinal Data

- **Definition**: Ordinal data consists of **categories with a natural order**.  

- **Key Features**:
  - Categories can be **ranked**.  
  - Only the order matters; the gap between categories cannot be measured exactly.  

- **Examples**:
  - Ratings: Poor, Good, Excellent  
  - Customer satisfaction: Dissatisfied, Neutral, Satisfied  
  - Education level: High School, Graduate, Postgraduate, PhD  

- **Visualization**: Bar chart, Pie chart  

---

### 3. Comparison Table

| Feature              | Nominal Data                      | Ordinal Data                       |
|---------------------|-----------------------------------|-----------------------------------|
| Nature              | Labels or names                   | Ranked categories                 |
| Order               | No order                          | Natural order                     |
| Examples            | Gender, color, city               | Ratings, satisfaction level       |
| Central tendency    | Mode                              | Median, Mode                      |
| Visualization       | Bar chart, Pie chart              | Bar chart, Pie chart              |

---

### 4. Summary
- **Nominal data** → Categories without order  
- **Ordinal data** → Categories with a natural order  


# Scales of Measurement

1. Nominal Scale Data
2. Ordinal Scale Data
3. Interval Scale Data
4. Ratio Scale Data

![image.png](attachment:image.png)

## 1. Nominal Scale Data

- **Nominal data** is the simplest measurement scale: it only names categories.
- It is used for **categorical variables** whose values are just labels or names.
- There is **no order or ranking** between categories.

- Examples:
  - Gender: Male, Female, Other
  - Colors: Red, Blue, Green
  - Payment Method: Cash, UPI, Card

- **Example Table**

| Customer ID | Payment Method |
|-------------|----------------|
| 101         | UPI            |
| 102         | Cash           |
| 103         | Card           |
| 104         | UPI            |
| 105         | Cash           |


- **Key Points:**
  - Numbers (if used) are only for labeling, not for mathematical meaning.  
    Example: 1 = UPI, 2 = Cash, 3 = Card.  
  - You cannot calculate mean or median, but you can find **mode** (most frequent category) or count frequencies.
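
The key points above can be checked on the payment-method table. This is a small sketch using the standard library; the second coding scheme (`codes_b`) is a hypothetical relabeling added to show that numeric codes are arbitrary.

```python
from collections import Counter
from statistics import mean, multimode

# Payment methods from the example table above.
payment_methods = ["UPI", "Cash", "Card", "UPI", "Cash"]

# Valid summaries for nominal data: frequencies and mode(s).
frequencies = Counter(payment_methods)
modes = multimode(payment_methods)  # returns every most-frequent category

# Numeric codes are only labels. Two equally valid codings
# (codes_b is a hypothetical relabeling) give different "means",
# so the mean of the codes carries no information.
codes_a = {"UPI": 1, "Cash": 2, "Card": 3}
codes_b = {"UPI": 3, "Cash": 1, "Card": 2}
mean_a = mean(codes_a[m] for m in payment_methods)
mean_b = mean(codes_b[m] for m in payment_methods)

print("Frequencies:", dict(frequencies))
print("Mode(s):", modes)
print("Mean of codes under coding A:", mean_a)
print("Mean of codes under coding B:", mean_b)
```

Here UPI and Cash tie as the most frequent category, which is why `multimode` is used rather than a single mode.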


## 2. Ordinal Scale Data

- **Ordinal data** shows categories **with a clear order or ranking**.  
- The difference between categories is **not measurable exactly**, only the order matters.  
- Examples:  
  - Customer satisfaction: Poor, Average, Good, Excellent  
  - Education Level: High School, Graduate, Postgraduate, PhD  
  - Movie Ratings: 1 Star, 2 Stars, 3 Stars, 4 Stars, 5 Stars  

- **Example Table**

| Customer ID | Satisfaction Level |
|-------------|---------------------|
| 201         | Poor                |
| 202         | Good                |
| 203         | Excellent           |
| 204         | Average             |
| 205         | Good                |

- **Key Points:**
    - Categories have **meaningful order** (Excellent > Good > Average > Poor).  
    - We can identify **highest/lowest** and calculate **median or mode**.  
    - Exact differences between categories (like gap between "Good" and "Excellent") are **not precise** and cannot be measured.
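
As a rough sketch of how the median and the highest/lowest values can be found for ordinal data, the categories can be mapped to their rank positions. The rank numbers below are only positions in the ordering, not measurements, and the variable names are illustrative.

```python
from statistics import median_low

# Ordered categories, lowest to highest.
levels = ["Poor", "Average", "Good", "Excellent"]
rank = {level: position for position, level in enumerate(levels)}

# Satisfaction levels from the example table above.
responses = ["Poor", "Good", "Excellent", "Average", "Good"]

# Ranks encode order only; the gaps between them are not meaningful.
ranks = [rank[r] for r in responses]

# median_low always returns a value that occurs in the data,
# so it maps back to a real category.
median_level = levels[median_low(ranks)]
lowest = min(responses, key=rank.get)
highest = max(responses, key=rank.get)

print("Median satisfaction:", median_level)
print("Lowest:", lowest, "| Highest:", highest)
```

Taking the mean of `ranks` is deliberately avoided, since it would assume the gap between "Poor" and "Average" equals the gap between "Good" and "Excellent".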


## 3. Interval Scale Data

- **Interval data** consists of numeric values whose **order and differences are both meaningful**.
- The gap between any two values can be measured, and equal gaps represent equal differences.
- However, there is **no true zero point**: zero is an arbitrary reference and does not mean absence of the quantity.

- Examples:  
  - Temperature in Celsius or Fahrenheit (e.g., -24°C, -5°C, 10°C, 20°C, 30°C)  
  - Calendar years (e.g., 1990, 2000, 2010)  
  - IQ scores  

- **Example Table**

| City      | Temperature (°C) |
|-----------|------------------|
| Delhi     | 20               |
| Mumbai    | 25               |
| Chennai   | 30               |
| Kolkata   | 15               |
| Bangalore | 18               |

- **Key Points:**
    - We can calculate **mean, median, mode, and standard deviation**.  
    - We can compare **differences** (e.g., Mumbai is 5°C warmer than Delhi).  
    - But **ratios don’t make sense** (30°C is not “twice as hot” as 15°C).  
    - Zero does not indicate “no temperature”, it’s just another point on the scale.
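
To see why ratios are not meaningful on an interval scale, the same temperatures can be expressed in Kelvin, which does have a true zero. The sketch below uses the city temperatures from the table above; the helper name `to_kelvin` is introduced here for illustration.

```python
from statistics import mean

# City temperatures in °C from the example table above.
temps_c = {"Delhi": 20, "Mumbai": 25, "Chennai": 30, "Kolkata": 15, "Bangalore": 18}

def to_kelvin(celsius):
    """Convert Celsius to Kelvin, a scale with a true zero."""
    return celsius + 273.15

# Means and differences are meaningful on an interval scale.
mean_temp = mean(list(temps_c.values()))
diff_c = temps_c["Mumbai"] - temps_c["Delhi"]
diff_k = to_kelvin(temps_c["Mumbai"]) - to_kelvin(temps_c["Delhi"])

# Ratios depend on where zero happens to sit, so they change with the scale.
ratio_c = temps_c["Chennai"] / temps_c["Kolkata"]
ratio_k = to_kelvin(temps_c["Chennai"]) / to_kelvin(temps_c["Kolkata"])

print(f"Mean temperature: {mean_temp:.1f} °C")
print(f"Mumbai - Delhi: {diff_c} °C, or {diff_k:.2f} K")
print(f"Chennai / Kolkata in Celsius: {ratio_c:.2f}")
print(f"Chennai / Kolkata in Kelvin:  {ratio_k:.3f}")
```

The difference stays the same across the two scales, while the apparent "twice as hot" ratio in Celsius shrinks to roughly 1.05 in Kelvin.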


## 4. Ratio Scale Data

- **Ratio data** is the most informative scale.  
- It has:  
  - **Equal intervals** between values  
  - A **true zero point** (zero means complete absence)  
  - Supports **all mathematical operations** (addition, subtraction, multiplication, division)  

- Examples:  
  - Height, Weight, Age  
  - Income, Sales, Price  
  - Distance, Time taken  

- **Example Table**

| Person | Age (Years) | Monthly Income (₹) |
|--------|-------------|---------------------|
| A      | 25          | 30,000              |
| B      | 30          | 45,000              |
| C      | 35          | 60,000              |
| D      | 40          | 80,000              |
| E      | 50          | 1,00,000            |

- **Key Points:**
    - **All statistical measures** (mean, median, mode, variance, standard deviation, correlation, regression) are valid.  
    - Ratios make sense → (Person C earns **twice** as much as Person A).  
    - Zero is absolute → (₹0 means **no income**, Age 0 means **newborn**).  
    - Ratio scale is the **strongest level of measurement**.
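
A short sketch of the operations a ratio scale supports, using the incomes from the table above. Expressing the incomes in thousands is an assumed change of unit, added to show that ratios survive rescaling when zero is absolute.

```python
from statistics import mean

# Monthly incomes in ₹ from the example table above.
incomes = {"A": 30000, "B": 45000, "C": 60000, "D": 80000, "E": 100000}

# Ratios and percentage comparisons are meaningful because zero is absolute.
ratio_c_to_a = incomes["C"] / incomes["A"]
pct_increase_a_to_b = (incomes["B"] - incomes["A"]) / incomes["A"] * 100

# The usual numeric summaries are valid as well.
mean_income = mean(list(incomes.values()))

# Changing the unit (₹ to thousands of ₹) leaves the ratio unchanged.
incomes_thousands = {person: value / 1000 for person, value in incomes.items()}
ratio_after_rescale = incomes_thousands["C"] / incomes_thousands["A"]

print(f"C earns {ratio_c_to_a:.1f}x as much as A")
print(f"B earns {pct_increase_a_to_b:.0f}% more than A")
print(f"Mean income: {mean_income}")
print(f"Ratio after changing units: {ratio_after_rescale:.1f}")
```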


### Difference Between Nominal, Ordinal, Interval, and Ratio Data

| Feature                | Nominal                     | Ordinal                         | Interval                          | Ratio                          |
|-------------------------|-----------------------------|----------------------------------|-----------------------------------|--------------------------------|
| **Meaning**            | Categories (labels)        | Categories with order           | Numeric values with equal intervals | Numeric values with equal intervals & true zero |
| **Order / Ranking**    | ❌ No order                 | ✅ Order exists                  | ✅ Order exists                   | ✅ Order exists                |
| **Measurable Difference** | ❌ No                     | ❌ No                            | ✅ Yes                            | ✅ Yes                         |
| **Equal Intervals**    | ❌ Not applicable           | ❌ Not guaranteed                | ✅ Yes                            | ✅ Yes                         |
| **True Zero Point**    | ❌ No                       | ❌ No                            | ❌ No                             | ✅ Yes                         |
| **Examples**           | Gender, Blood Group        | Satisfaction Level (Low/High)   | Temperature in °C, Years          | Income, Weight, Height, Age    |
| **Allowed Operations** | Count, Mode                | Median, Rank, Non-parametric tests | Mean, Median, Std. Dev., Difference | All operations (Mean, Ratios, % comparisons) |


### Combined Use Case: Customer Data in an E-commerce Company  

This dataset includes **all four types of data scales** in one table.  

| Customer ID | Gender (Nominal) | Satisfaction Level (Ordinal) | Join Year (Interval) | Monthly Income (Ratio) |
|-------------|------------------|------------------------------|-----------------------|-------------------------|
| C001        | Female           | High                         | 2018                  | 30,000                  |
| C002        | Male             | Medium                       | 2020                  | 45,000                  |
| C003        | Female           | Low                          | 2019                  | 25,000                  |
| C004        | Male             | High                         | 2021                  | 60,000                  |
| C005        | Female           | Medium                       | 2017                  | 80,000                  |

**Explanation of Scales in this Table:**

- **Nominal (Gender):** Categories with no numeric meaning (Male, Female).  
- **Ordinal (Satisfaction Level):** Ordered categories (Low < Medium < High).  
- **Interval (Join Year):** Difference between years is meaningful (2020 - 2018 = 2 years), but no true zero.  
- **Ratio (Monthly Income):** Has a true zero and supports all math (₹60,000 is twice ₹30,000).  
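
One way to put this together is to apply a different summary to each column, matched to its scale. The sketch below encodes the table above as a list of dictionaries; the field names (`gender`, `satisfaction`, `join_year`, `income`) are assumptions chosen for readability.

```python
from collections import Counter
from statistics import mean, median_low

# Rows from the e-commerce table above (field names are illustrative).
customers = [
    {"id": "C001", "gender": "Female", "satisfaction": "High", "join_year": 2018, "income": 30000},
    {"id": "C002", "gender": "Male", "satisfaction": "Medium", "join_year": 2020, "income": 45000},
    {"id": "C003", "gender": "Female", "satisfaction": "Low", "join_year": 2019, "income": 25000},
    {"id": "C004", "gender": "Male", "satisfaction": "High", "join_year": 2021, "income": 60000},
    {"id": "C005", "gender": "Female", "satisfaction": "Medium", "join_year": 2017, "income": 80000},
]

# Nominal: count categories and take the most frequent one.
gender_counts = Counter(c["gender"] for c in customers)
most_common_gender = gender_counts.most_common(1)[0][0]

# Ordinal: use rank positions to find the median category.
order = ["Low", "Medium", "High"]
ranks = [order.index(c["satisfaction"]) for c in customers]
median_satisfaction = order[median_low(ranks)]

# Interval: differences between years are meaningful.
years = [c["join_year"] for c in customers]
year_span = max(years) - min(years)

# Ratio: means and ratios are meaningful.
incomes = [c["income"] for c in customers]
mean_income = mean(incomes)
max_to_min_ratio = max(incomes) / min(incomes)

print("Most common gender:", most_common_gender)
print("Median satisfaction:", median_satisfaction)
print("Span of join years:", year_span)
print("Mean income:", mean_income)
print(f"Highest income is {max_to_min_ratio:.1f}x the lowest")
```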
