<p>

### **Q1: What is Data Encoding? How is it Useful in Data Science?**
**Data encoding** is the process of converting categorical data into numerical format so that machine learning models can process it.

**Usefulness in Data Science:**
- **Ensures compatibility** with ML models, most of which require numeric input.
- **Improves model performance** when the encoding matches the feature's type (ordered or unordered).
- **Reduces memory usage**, since compact integer codes take less space than repeated strings.
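
As a rough illustration of the compatibility and memory points above, the sketch below converts a text column to integer codes with pandas. The `colors` column and its values are made up for this example and are not part of any dataset discussed here.

```python
import pandas as pd

# Hypothetical toy column; the values are illustrative only.
colors = pd.Series(["red", "green", "blue", "green", "red"] * 200, name="color")

# Convert to pandas' categorical dtype and read off the integer codes.
as_category = colors.astype("category")
codes = as_category.cat.codes  # one small integer per row

# Show which integer stands for which original label.
print(dict(enumerate(as_category.cat.categories)))

# Compare the memory footprint of the raw strings and the integer codes.
print("string column bytes:", colors.memory_usage(deep=True))
print("integer codes bytes:", codes.memory_usage(deep=True))
```

The exact byte counts depend on the pandas version, but the integer codes should be noticeably smaller than the repeated strings.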

---

### **Q2: What is Nominal Encoding? Provide a Real-World Example.**
**Nominal encoding** (treated here as **Label Encoding**) assigns a unique integer to each category in a feature. The integers are arbitrary identifiers and imply no ranking between categories.

**Example:**  
For a fruit dataset:  
| Fruit  | Encoded Value |
|--------|--------------|
| Apple  | 0            |
| Banana | 1            |
| Mango  | 2            |

**Real-world Use Case:**  
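Nominal encoding is useful when **categorical values have no intrinsic order** (e.g., car brands, types of food, or city names). Because the integers carry no real order, it is best paired with models that are less sensitive to arbitrary numeric values, such as tree-based models.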
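
The fruit table above can be reproduced with a minimal sketch using `pd.factorize`. The sample rows are assumed for illustration; `sort=True` is used so that codes follow alphabetical order, which happens to match the table.

```python
import pandas as pd

# Hypothetical sample rows using the fruit names from the table.
fruits = pd.Series(["Banana", "Apple", "Mango", "Apple"], name="Fruit")

# factorize returns one integer code per row plus the unique labels.
# sort=True orders the labels alphabetically before assigning codes.
codes, uniques = pd.factorize(fruits, sort=True)

encoded = pd.DataFrame({"Fruit": fruits, "Encoded Value": codes})
print(encoded)
print(list(uniques))  # position in this list == encoded value
```

Without `sort=True`, codes are assigned in order of first appearance, so the same fruit could receive a different integer depending on row order.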

---

### **Q3: When is Nominal Encoding Preferred Over One-Hot Encoding? Provide a Practical Example.**
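Nominal encoding is preferred when:
1. There are **many unique categories** (e.g., thousands of city names), so One-Hot Encoding would add one column per category.
2. The categorical feature is actually **ordinal** (e.g., "low", "medium", "high"); integer codes that follow the order then preserve information that One-Hot Encoding discards (strictly, this is Ordinal Encoding).
3. **Memory efficiency** is needed (One-Hot Encoding creates one column per category).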

**Example:**  
In a **restaurant dataset** with 500+ cuisine types, One-Hot Encoding would create 500 or more mostly-zero columns, while integer encoding keeps the feature in a single column.
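
To make the column-count difference concrete, here is a small sketch with synthetic data. The cuisine labels (`cuisine_000` to `cuisine_499`) and the 1,000-row size are assumptions chosen only to mimic a high-cardinality feature.

```python
import pandas as pd

# Hypothetical data: 500 made-up cuisine labels spread over 1,000 rows.
cuisines = [f"cuisine_{i % 500:03d}" for i in range(1000)]
restaurants = pd.DataFrame({"cuisine": cuisines})

# One-hot encoding: one indicator column per distinct cuisine.
one_hot = pd.get_dummies(restaurants["cuisine"])

# Integer encoding: a single column of codes.
integer_codes = restaurants["cuisine"].astype("category").cat.codes

print("one-hot columns:", one_hot.shape[1])
print("integer-encoded columns:", 1)
print("largest integer code:", integer_codes.max())
```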

---

### **Q4: Encoding Categorical Data with 5 Unique Values**
If a categorical feature has **5 unique values**, the best encoding depends on:
- **If the feature is nominal (unordered)** → Use **One-Hot Encoding**.
- **If the feature is ordinal (ordered)** → Use **Ordinal Encoding**.

If using **One-Hot Encoding**, the feature will be transformed into **5 binary columns**, or **4** if one column is dropped (n-1 dummy encoding, which avoids multicollinearity in linear models).
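
A quick way to check the column count for a 5-value feature is `pd.get_dummies`, with and without `drop_first`. The city names below are hypothetical placeholders for any unordered feature.

```python
import pandas as pd

# Hypothetical nominal feature with exactly 5 unique values.
city = pd.Series(
    ["Paris", "Tokyo", "Cairo", "Lima", "Oslo", "Tokyo", "Lima"], name="city"
)

# Full one-hot encoding: one indicator column per category.
full = pd.get_dummies(city, prefix="city")

# n-1 (dummy) encoding: the first category (alphabetically) is dropped
# and becomes the implicit baseline.
dropped = pd.get_dummies(city, prefix="city", drop_first=True)

print("full one-hot columns:", list(full.columns))
print("n-1 encoding columns:", list(dropped.columns))
```

Dropping a column matters mainly for linear models with an intercept; tree-based models usually work fine with the full set of indicators.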

---

### **Q5: Encoding a Dataset with 1000 Rows and 5 Columns**
- **Given:**  
  - 2 categorical columns.
  - 3 numerical columns.
- **Using Nominal Encoding (Label Encoding)**:
  - Each categorical feature is converted into **1 numerical column**.
  - **Total Columns After Encoding** = **5** (same as original).

However, if using **One-Hot Encoding**:
- Suppose **each categorical column has 4 unique values**.
- With n-1 (dummy) encoding, the two categorical columns are replaced by **(4-1) + (4-1) = 6 new columns**.
- **Total Columns After Encoding = 3 + 6 = 9** (or 3 + 8 = 11 if no column is dropped).
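
The column arithmetic above can be verified on a synthetic frame. All column names and values below are made up; the only properties taken from the text are 1000 rows, 3 numerical columns, and 2 categorical columns with 4 unique values each.

```python
import pandas as pd

n_rows = 1000
levels_a = ["a1", "a2", "a3", "a4"]
levels_b = ["b1", "b2", "b3", "b4"]

# Hypothetical dataset: 3 numerical and 2 categorical columns.
df = pd.DataFrame({
    "num_1": range(n_rows),
    "num_2": [i * 0.5 for i in range(n_rows)],
    "num_3": [i % 7 for i in range(n_rows)],
    "cat_a": [levels_a[i % 4] for i in range(n_rows)],
    "cat_b": [levels_b[(i // 4) % 4] for i in range(n_rows)],
})
cat_cols = ["cat_a", "cat_b"]

# Integer (label) encoding: each categorical column stays one column.
label_encoded = df.copy()
for col in cat_cols:
    label_encoded[col] = label_encoded[col].astype("category").cat.codes

# One-hot encoding with and without dropping one level per feature.
dummy = pd.get_dummies(df, columns=cat_cols, drop_first=True)
full = pd.get_dummies(df, columns=cat_cols)

print("label encoding:", label_encoded.shape[1], "columns")
print("one-hot (n-1): ", dummy.shape[1], "columns")
print("one-hot (full):", full.shape[1], "columns")
```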

---

### **Q6: Encoding Categorical Data for an Animal Dataset**
The dataset contains categorical features:  
- **Species (e.g., Lion, Elephant, Tiger)**
- **Habitat (e.g., Forest, Desert, Ocean)**
- **Diet (e.g., Carnivore, Herbivore, Omnivore)**

**Best Encoding Technique:**
- **One-Hot Encoding** → If the number of unique categories is small.
- **Target Encoding** → If a feature has many categories (e.g., hundreds of species) and a target variable is available.

**Justification:**  
- Since species, habitat, and diet **have no natural order**, One-Hot Encoding is ideal.
- Target Encoding replaces each category with the mean of the target for that category, capturing the relationship in a single column; it must be fitted on training data only to avoid target leakage.
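
Both options can be sketched on a toy animal table. The rows and the binary `target` column are hypothetical (no target is specified for this dataset), and the target encoding shown is the simplest mean-per-category variant, computed on all rows purely for illustration.

```python
import pandas as pd

# Hypothetical rows; "target" is an assumed binary label for illustration.
animals = pd.DataFrame({
    "Species": ["Lion", "Elephant", "Tiger", "Lion", "Tiger", "Elephant"],
    "Habitat": ["Desert", "Forest", "Forest", "Forest", "Forest", "Desert"],
    "Diet": ["Carnivore", "Herbivore", "Carnivore",
             "Carnivore", "Carnivore", "Herbivore"],
    "target": [1, 0, 1, 0, 1, 0],
})

# One-hot encoding: one indicator column per category of each feature.
one_hot = pd.get_dummies(animals[["Species", "Habitat", "Diet"]])
print(list(one_hot.columns))

# Naive target encoding: replace each species by its mean target value.
species_means = animals.groupby("Species")["target"].mean()
animals["Species_te"] = animals["Species"].map(species_means)
print(animals[["Species", "target", "Species_te"]])
```

In a real pipeline the per-category means would be computed on the training split (or within cross-validation folds) and then applied to held-out data, otherwise the encoded feature leaks the target.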

---

### **Q7: Encoding for Customer Churn Prediction**
Dataset features:
1. **Gender (Categorical)**
2. **Age (Numerical)**
3. **Contract Type (Categorical)**
4. **Monthly Charges (Numerical)**
5. **Tenure (Numerical)**

**Encoding Steps:**
1. **Gender** (Binary Category: Male/Female) → Use **Label Encoding** (e.g., 0 = Female, 1 = Male, the alphabetical order that scikit-learn's `LabelEncoder` assigns).
2. **Contract Type** (e.g., Month-to-month, Yearly, Two-Year) → Use **Ordinal Encoding** (since it has a natural order by commitment length, from Month-to-month to Two-Year).

</p>

In [5]:
import pandas as pd
from sklearn.preprocessing import LabelEncoder

# Sample dataset
data = pd.DataFrame({
    'Gender': ['Male', 'Female', 'Female', 'Male'],
    'Contract Type': ['Month-to-month', 'Yearly', 'Two-Year', 'Yearly'],
    'Age': [25, 40, 35, 30],
    'Monthly Charges': [50, 80, 60, 90],
    'Tenure': [5, 24, 12, 36]
})

# Encoding Gender (Label Encoding).
# LabelEncoder sorts classes alphabetically: Female -> 0, Male -> 1.
label_encoder = LabelEncoder()
data['Gender'] = label_encoder.fit_transform(data['Gender'])

# Encoding Contract Type (Ordinal Encoding) with an explicit mapping,
# so the integers follow contract length rather than alphabetical order.
contract_mapping = {'Month-to-month': 0, 'Yearly': 1, 'Two-Year': 2}
data['Contract Type'] = data['Contract Type'].map(contract_mapping)
print(data)