# Pandas Basics — Column & Row Selection

This notebook introduces **column selection, row selection, and filtering** in Pandas.

We’ll use the Titanic dataset as an example throughout.

---

In [1]:
import pandas as pd

url = 'https://raw.githubusercontent.com/datasciencedojo/datasets/master/titanic.csv'
df = pd.read_csv(url)
df.head()

| | PassengerId | Survived | Pclass | Name | Sex | Age | SibSp | Parch | Ticket | Fare | Cabin | Embarked |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 1 | 0 | 3 | Braund, Mr. Owen Harris | male | 22.0 | 1 | 0 | A/5 21171 | 7.25 | NaN | S |
| 1 | 2 | 1 | 1 | Cumings, Mrs. John Bradley (Florence Briggs Th... | female | 38.0 | 1 | 0 | PC 17599 | 71.2833 | C85 | C |
| 2 | 3 | 1 | 3 | Heikkinen, Miss. Laina | female | 26.0 | 0 | 0 | STON/O2. 3101282 | 7.925 | NaN | S |
| 3 | 4 | 1 | 1 | Futrelle, Mrs. Jacques Heath (Lily May Peel) | female | 35.0 | 1 | 0 | 113803 | 53.1 | C123 | S |
| 4 | 5 | 0 | 3 | Allen, Mr. William Henry | male | 35.0 | 0 | 0 | 373450 | 8.05 | NaN | S |


## 🔹 Column Selection

1. **Single column** → returns a `Series`
```python
df['col']
```

2. **Multiple columns** → returns a `DataFrame`
```python
df[['col1', 'col2']]
```
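
To see the difference between the two return types, here is a minimal sketch. The three-row `sample` frame is a hypothetical stand-in so the block runs without downloading anything; only the column names are borrowed from the Titanic data.

```python
import pandas as pd

# Hypothetical three-row sample; only the column names come from the dataset above.
sample = pd.DataFrame({
    'Name': ['Braund, Mr. Owen Harris', 'Heikkinen, Miss. Laina', 'Allen, Mr. William Henry'],
    'Age': [22.0, 26.0, 35.0],
})

one_col = sample['Name']             # single brackets -> Series
two_cols = sample[['Name', 'Age']]   # list of names -> DataFrame
one_col_frame = sample[['Name']]     # one-element list -> one-column DataFrame

print(type(one_col).__name__)        # Series
print(type(two_cols).__name__)       # DataFrame
print(one_col_frame.shape)           # (3, 1)
```

Passing a one-element list is a handy way to keep a single column as a `DataFrame` when later code expects two dimensions.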

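⚠️ Avoid `df.col` (dot notation). It fails for column names that contain spaces or are not valid Python identifiers, it returns the wrong object when a name collides with a DataFrame attribute or method (such as `count`), and it cannot be used to create a new column.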
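
The dot-notation pitfalls can be reproduced on a small frame. This is only a sketch: the column names `count`, `ticket no`, and `new_col` are made up for illustration and are not part of the Titanic data.

```python
import warnings
import pandas as pd

# Hypothetical frame: 'count' collides with a DataFrame method, 'ticket no' contains a space.
sample = pd.DataFrame({'count': [1, 2, 3], 'ticket no': ['A1', 'B2', 'C3']})

print(callable(sample.count))        # True: this is the DataFrame.count method, not the column
print(sample['count'].tolist())      # [1, 2, 3]
print(sample['ticket no'].tolist())  # ['A1', 'B2', 'C3']  (sample.ticket no is a syntax error)

# Attribute assignment sets a plain Python attribute; it does not add a column.
with warnings.catch_warnings():
    warnings.simplefilter('ignore')  # pandas warns about exactly this mistake
    sample.new_col = [0, 0, 0]

print('new_col' in sample.columns)   # False
```

Bracket notation behaves the same way for every column name, which is why it is used throughout this notebook.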


In [2]:
df['Name'].head()  # single column

| | Name |
|---|---|
| 0 | Braund, Mr. Owen Harris |
| 1 | Cumings, Mrs. John Bradley (Florence Briggs Th... |
| 2 | Heikkinen, Miss. Laina |
| 3 | Futrelle, Mrs. Jacques Heath (Lily May Peel) |
| 4 | Allen, Mr. William Henry |


In [3]:
df[['Name', 'Age']].head()  # multiple columns

| | Name | Age |
|---|---|---|
| 0 | Braund, Mr. Owen Harris | 22.0 |
| 1 | Cumings, Mrs. John Bradley (Florence Briggs Th... | 38.0 |
| 2 | Heikkinen, Miss. Laina | 26.0 |
| 3 | Futrelle, Mrs. Jacques Heath (Lily May Peel) | 35.0 |
| 4 | Allen, Mr. William Henry | 35.0 |


## 🔹 Row Selection

### 1. `.loc[]` → Label-based (index name)
```python
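df.loc[2]                       # row with index label 2, returned as a Series
df.loc[2, 'Name']               # single value: row label 2, column 'Name'
df.loc[2:5, ['Name', 'Age']]    # row labels 2 through 5 (end label is included)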
```

### 2. `.iloc[]` → Position-based (row/col numbers)
```python
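df.iloc[0]           # first row, returned as a Series
df.iloc[0, 1]        # value in the first row, second column
df.iloc[0:3, 0:2]    # first 3 rows, first 2 columns (end position is excluded)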
```
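
With the default `0, 1, 2, ...` index, labels and positions coincide, so the two accessors are easy to confuse. The sketch below uses a hypothetical frame with a non-default index (`10, 20, 30, 40`) to separate them, and also shows that `.loc` slices include the end label while `.iloc` slices exclude the end position.

```python
import pandas as pd

# Hypothetical sample with a non-default index so labels and positions differ.
sample = pd.DataFrame(
    {'Name': ['A', 'B', 'C', 'D'], 'Age': [22.0, 38.0, 26.0, 35.0]},
    index=[10, 20, 30, 40],
)

by_label = sample.loc[10:30]       # labels 10, 20, 30: the end label is included
by_position = sample.iloc[0:2]     # positions 0 and 1: the end position is excluded

print(by_label.index.tolist())     # [10, 20, 30]
print(by_position.index.tolist())  # [10, 20]
print(sample.loc[20, 'Name'])      # B  (row labelled 20)
print(sample.iloc[1, 0])           # B  (second row, first column)
```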


In [9]:
print(df.loc[0, ['Name', 'Age']])  # label based

Name    Braund, Mr. Owen Harris
Age                        22.0
Name: 0, dtype: object


In [12]:
print(df.iloc[0:5, 0:5])  # position based

   PassengerId  Survived  Pclass  \
0            1         0       3   
1            2         1       1   
2            3         1       3   
3            4         1       1   
4            5         0       3   

                                                Name     Sex  
0                            Braund, Mr. Owen Harris    male  
1  Cumings, Mrs. John Bradley (Florence Briggs Th...  female  
2                             Heikkinen, Miss. Laina  female  
3       Futrelle, Mrs. Jacques Heath (Lily May Peel)  female  
4                           Allen, Mr. William Henry    male  


## 🔹 Filtering Rows (Conditional Selection)

1. **Single condition**
```python
df[df['Age'] > 30]
```

2. **Multiple conditions**
```python
df[(df['Age'] > 30) & (df['Sex'] == 'male')]
```

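👉 Use `&` (AND), `|` (OR), `~` (NOT), and wrap each condition in **parentheses**: these operators bind more tightly than comparisons such as `>` and `==`. Python's `and`, `or`, and `not` do not work element-wise on a Series and raise a `ValueError`.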
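
The earlier cells only show `&`, so here is a short sketch of `|` and `~`, plus what happens when Python's `and` is used by mistake. The four-row `sample` frame is hypothetical and only reuses column names from the dataset.

```python
import pandas as pd

# Hypothetical four-row sample reusing the Titanic column names.
sample = pd.DataFrame({
    'Sex': ['male', 'female', 'female', 'male'],
    'Age': [22.0, 38.0, 26.0, 35.0],
    'Pclass': [3, 1, 1, 3],
})

older_or_first = sample[(sample['Age'] > 30) | (sample['Pclass'] == 1)]  # OR
not_male = sample[~(sample['Sex'] == 'male')]                            # NOT

print(older_or_first.index.tolist())  # [1, 2, 3]
print(not_male.index.tolist())        # [1, 2]

# Python's `and` asks for the truth value of a whole Series, which is ambiguous.
error_name = None
try:
    sample[(sample['Age'] > 30) and (sample['Pclass'] == 1)]
except ValueError as exc:
    error_name = type(exc).__name__

print(error_name)                     # ValueError
```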


In [13]:
adults = df[df['Age'] >= 18]
print(adults.shape)

female_first_class = df[(df['Sex'] == 'female') & (df['Pclass'] == 1)]
female_first_class[['Name', 'Age']].head()

(601, 12)


| | Name | Age |
|---|---|---|
| 1 | Cumings, Mrs. John Bradley (Florence Briggs Th... | 38.0 |
| 3 | Futrelle, Mrs. Jacques Heath (Lily May Peel) | 35.0 |
| 11 | Bonnell, Miss. Elizabeth | 58.0 |
| 31 | Spencer, Mrs. William Augustus (Marie Eugenie) | NaN |
| 52 | Harper, Mrs. Henry Sleeper (Myna Haxtun) | 49.0 |


## ✅ Best Practice

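⚠️ Avoid chained indexing. It performs two separate lookups, and an assignment made through it may modify a temporary copy instead of `df`:
```python
df['col'][0]   # ❌ two lookups; unreliable when assigning
```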

✔️ Use `.loc[]` or `.iloc[]` instead:
```python
df.loc[0, 'col']   # ✅ safe
```
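
The risk shows up mainly when assigning. The sketch below assumes default pandas options and a hypothetical three-row frame. The chained assignment writes to a temporary copy and leaves the original untouched, while the single `.loc[]` call updates it.

```python
import warnings
import pandas as pd

# Hypothetical three-row sample reusing two Titanic column names.
sample = pd.DataFrame({'Age': [22.0, 38.0, 26.0], 'Fare': [7.25, 71.2833, 7.925]})

with warnings.catch_warnings():
    warnings.simplefilter('ignore')              # silence the chained-assignment warning for this demo
    sample[sample['Age'] > 30]['Fare'] = 0.0     # ❌ writes to a temporary copy

after_chained = sample['Fare'].tolist()
print(after_chained)                             # [7.25, 71.2833, 7.925]  (unchanged)

sample.loc[sample['Age'] > 30, 'Fare'] = 0.0     # ✅ one indexing step on the original
after_loc = sample['Fare'].tolist()
print(after_loc)                                 # [7.25, 0.0, 7.925]
```

Outside of a demo, leave the warning visible: it is pandas pointing at exactly this kind of silent no-op.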


## 📝 Exercises

1. Select the `Sex` and `Age` columns for the first 10 rows.
2. Find all passengers with `Fare > 100`.
3. Use `.iloc[]` to select the first 5 rows and first 3 columns.
4. Find all male passengers under the age of 18.
