# Pandas Advanced Quiz

---

### **1. How can you re-index a pandas DataFrame in Python?**

- **Options:**
  - [ ] by using index  
  - [x] by using `reindex()`  
  - [ ] by using pandas  
  - [ ] none of the above  

**Explanation:**  
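You can re-index a pandas DataFrame with the `reindex()` method, which returns a new DataFrame conformed to the given labels. Assigning to the `index` property is a different operation: it replaces the row labels in place without reordering the rows.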
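
The two approaches named above behave differently, which a short sketch can make concrete. The sample data is borrowed from the later questions, and the variable names `reordered` and `relabeled` are chosen only for illustration.

```python
import pandas as pd

df = pd.DataFrame({'name': ['John', 'Jane', 'Bob'], 'age': [25, 30, 35]})

# reindex() returns a new DataFrame whose rows follow the requested labels.
reordered = df.reindex([2, 0, 1])

# Assigning to .index keeps the row order and only replaces the labels.
relabeled = df.copy()
relabeled.index = ['a', 'b', 'c']

print(reordered)
print(relabeled)
```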

---

### **2. What is the output of the following code?**

```python
import pandas as pd

data = {'name': ['John', 'Jane', 'Bob'], 'age': [25, 30, 35]}
df = pd.DataFrame(data)

df_reindexed = df.reindex([2, 1, 0])

print(df_reindexed)
```

- **Options:**
  - [ ] A DataFrame with rows in the original order  
  - [x] A DataFrame with rows in reverse order  
  - [ ] A DataFrame with rows sorted by age  
  - [ ] A DataFrame with columns in reverse order  

**Explanation:**  
The `reindex()` method conforms the DataFrame to the given row labels, in the order given. Passing the existing labels in reverse order (`[2, 1, 0]`) returns the rows in reverse order, with each row keeping its original label.
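
One detail the question does not cover is what happens when a requested label is missing. As a minimal sketch (the label `3` is picked only because it is absent from the sample data), `reindex()` keeps the label and fills its row with `NaN`:

```python
import pandas as pd

df = pd.DataFrame({'name': ['John', 'Jane', 'Bob'], 'age': [25, 30, 35]})

# Label 2 exists; label 3 does not, so its row is filled with NaN.
partial = df.reindex([2, 3])

print(partial)
```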

---

### **3. What is the output of the following code?**

```python
import pandas as pd

data = {'name': ['John', 'Jane', 'Bob'], 'age': [25, 30, 35]}
df = pd.DataFrame(data)

for index, row in df.iterrows():
    print(row['name'], row['age'])
```

- **Options:**
  - [x] John 25, Jane 30, Bob 35 (each pair on its own line)  
  - [ ] name John, age 25, name Jane, age 30, name Bob, age 35  
  - [ ] ['John', 25], ['Jane', 30], ['Bob', 35]  
  - [ ] None of the above  

**Explanation:**  
The `iterrows()` method yields an `(index, row)` pair for each row of the DataFrame, where `row` is a Series. Here, each iteration prints the 'name' and 'age' values of one row on its own line.

---

### **4. Which of the following is a convenient way to iterate over the rows of a pandas DataFrame as `(index, Series)` pairs?**

- **Options:**
  - [ ] Using a for loop to iterate through the rows by index  
  - [ ] Using the apply() method to apply a function to each row  
  - [x] Using the `iterrows()` method  
  - [ ] Using the `itertuples()` method  

**Explanation:**  
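The `iterrows()` method is a convenient way to loop over rows: it yields each row as an `(index, Series)` pair. It is slow on large DataFrames, though. `itertuples()` is usually faster, and vectorized column operations are faster still when the task allows them.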
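
To illustrate the performance caveat, here is a small sketch that computes the same result with an `iterrows()` loop and with a single vectorized operation. The "age next year" task and the variable names are made up for illustration.

```python
import pandas as pd

df = pd.DataFrame({'name': ['John', 'Jane', 'Bob'], 'age': [25, 30, 35]})

# Row-by-row loop: readable, but pandas builds a Series for every row.
ages_next_year_loop = []
for _, row in df.iterrows():
    ages_next_year_loop.append(row['age'] + 1)

# Vectorized equivalent: one operation on the whole column.
ages_next_year_vec = (df['age'] + 1).tolist()

print(ages_next_year_loop)
print(ages_next_year_vec)
```

On three rows the difference is negligible; the vectorized form tends to matter once a DataFrame has many thousands of rows.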

---

### **5. What is the difference between the `iterrows()` and `itertuples()` methods?**

- **Options:**
  - [ ] The `iterrows()` method is faster but yields a Series, while the `itertuples()` method is slower but yields a named tuple.  
  - [ ] The `iterrows()` method yields a Series, while the `itertuples()` method yields a DataFrame.  
  - [x] The `iterrows()` method yields a Series, while the `itertuples()` method yields a named tuple.  
  - [ ] The `iterrows()` method yields a DataFrame, while the `itertuples()` method yields a Series.  

**Explanation:**  
`iterrows()` yields each row as an `(index, Series)` pair, while `itertuples()` yields each row as a named tuple. `itertuples()` is generally faster and preserves column dtypes, but its fields are read as attributes rather than by column label.
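
The difference in what each method yields can be seen by pulling the first item from each iterator. This is a minimal sketch using the sample data from the other questions:

```python
import pandas as pd

df = pd.DataFrame({'name': ['John', 'Jane', 'Bob'], 'age': [25, 30, 35]})

# iterrows() yields (index, Series) pairs; fields are read by column label.
index, row = next(df.iterrows())
print(type(row).__name__, row['name'])

# itertuples() yields named tuples; fields are read as attributes,
# and the row label is available as .Index.
first = next(df.itertuples())
print(first.Index, first.name, first.age)
```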

---

### **6. What is the output of the following code?**

```python
import pandas as pd

data = {'name': ['John', 'Jane', 'Bob'], 'age': [25, 30, 35]}
df = pd.DataFrame(data)

df['name_upper'] = df['name'].str.upper()

print(df)
```

- **Options:**
  - [x] A DataFrame with an extra column containing uppercase names  
  - [ ] A DataFrame with the 'name' column modified to contain uppercase names  
  - [ ] A DataFrame with an error  
  - [ ] None of the above  

**Explanation:**  
The `str.upper()` method returns an uppercase copy of the 'name' column, and the result is assigned to a new column called `name_upper`. The original 'name' column is left unchanged.

---

### **7. How can you sort a pandas DataFrame by a specific column in ascending order?**

- **Options:**
  - [ ] `df.sort(column_name)`  
  - [x] `df.sort_values(column_name)`  
  - [ ] `df.sort_ascending(column_name)`  
  - [ ] `df.sort_up(column_name)`  

**Explanation:**  
The `sort_values()` method sorts a DataFrame by one or more columns and returns a new, sorted DataFrame. It sorts in ascending order by default; pass `ascending=False` for descending order.
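
A short sketch of the common `sort_values()` variants, using the sample data from question 9 (the result names are illustrative only):

```python
import pandas as pd

df = pd.DataFrame({'name': ['Alice', 'Bob', 'Charlie'], 'age': [30, 25, 40]})

by_age = df.sort_values('age')                        # ascending by default
by_age_desc = df.sort_values('age', ascending=False)  # descending
by_two = df.sort_values(['age', 'name'])              # sort by several columns

print(by_age)
print(by_age_desc)
```

Each call returns a new DataFrame, so `df` itself keeps its original row order.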

---

### **8. Which of the following can be used to clean text data?**

- **Options:**
  - [ ] Removing special characters  
  - [ ] Converting all text to lowercase  
  - [ ] Removing stop words  
  - [x] All of the above  

**Explanation:**  
Common text cleaning steps include removing special characters, converting text to lowercase, and removing stop words.
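
The three steps can be chained on a pandas Series of strings. The sketch below is one possible approach, not a standard recipe: the sample sentences and the tiny `stop_words` set are hypothetical, and real projects usually take a stop-word list from a dedicated NLP library.

```python
import pandas as pd

# Hypothetical sample text and stop-word list, chosen only for illustration.
text = pd.Series(['The quick, brown fox!', 'A lazy dog...'])
stop_words = {'the', 'a'}

cleaned = (
    text.str.lower()                                  # convert to lowercase
        .str.replace(r'[^a-z0-9\s]', '', regex=True)  # remove special characters
        .apply(lambda s: ' '.join(w for w in s.split() if w not in stop_words))
)

print(cleaned.tolist())
```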

---

### **9. What is the output of the following code?**

```python
import pandas as pd

data = {'name': ['Alice', 'Bob', 'Charlie'], 'age': [30, 25, 40]}
df = pd.DataFrame(data)

df_subset = df.loc[1:2, 'name']

print(df_subset)
```

- **Options:**
  - [ ] A DataFrame with rows 1 and 2 and all columns  
  - [ ] A DataFrame with the 'name' column for rows 1 and 2  
  - [x] A Series with the 'name' values for rows 1 and 2  
  - [ ] An error  

**Explanation:**  
The `loc[]` indexer selects rows and columns by label. Label slices include both endpoints, so `1:2` selects rows 1 and 2, and selecting a single column label ('name') returns a Series rather than a DataFrame.
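
Two details of this selection are easy to miss: the return type depends on how the column is specified, and label slices behave differently from position slices. A minimal sketch, with illustrative variable names:

```python
import pandas as pd

df = pd.DataFrame({'name': ['Alice', 'Bob', 'Charlie'], 'age': [30, 25, 40]})

as_series = df.loc[1:2, 'name']    # single column label -> Series; end label 2 is included
as_frame = df.loc[1:2, ['name']]   # list of column labels -> DataFrame
by_position = df.iloc[1:2]         # position slice excludes the end -> one row

print(type(as_series).__name__, type(as_frame).__name__, len(by_position))
```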

---

### **10. What is the output of the following code?**

```python
import pandas as pd

data = {'name': ['Alice', 'Bob', 'Charlie'], 'age': [30, 25, 40]}
df = pd.DataFrame(data)

max_age = df['age'].max()

print(max_age)
```

- **Options:**
  - [x] The maximum age  
  - [ ] The median age  
  - [ ] The mean age  
  - [ ] The mode age  

**Explanation:**  
The `max()` method computes the maximum value of the 'age' column, which is 40 in this case.

---

### **11. What is the output of the following code?**

```python
import pandas as pd

data = {'date': ['2022-01-01', '2022-02-01', '2022-03-01'], 'sales': [100, 200, 300]}
df = pd.DataFrame(data)

df['date'] = pd.to_datetime(df['date'])
df['month'] = df['date'].dt.month

print(df)
```

- **Options:**
  - [x] A DataFrame with an extra column containing the month  
  - [ ] A DataFrame with an error  
  - [ ] None of the above  

**Explanation:**  
The `pd.to_datetime()` function converts the 'date' column to datetime format, and the `dt.month` accessor extracts the month number (1, 2, and 3 here), which is stored in a new column called `month`.
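
The `dt` accessor exposes other date components in the same way. As a brief sketch on the same sample data, the extra columns `year` and `month_name` below are added only for illustration:

```python
import pandas as pd

df = pd.DataFrame({'date': ['2022-01-01', '2022-02-01', '2022-03-01'],
                   'sales': [100, 200, 300]})
df['date'] = pd.to_datetime(df['date'])

df['year'] = df['date'].dt.year                  # calendar year as an integer
df['month_name'] = df['date'].dt.month_name()    # month as an English name

print(df)
```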