# Pandas

````python
import pandas as pd

# `df` is assumed to be an existing DataFrame.

# Column labels
df.columns

# All values as a NumPy array
df.values

# Row index
df.index

# Number of missing values per column
df.isna().sum()

# Dictionary of the number of distinct values per column
# (nunique() ignores NaN by default)
def count_unique_values(df):
    unique_counts = {}
    for column in df.columns:
        unique_counts[column] = df[column].nunique()
    return unique_counts

count_unique_values(df)
````
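
The same counts can also be obtained with built-in methods only. A minimal sketch, assuming a small made-up DataFrame (the column names and values are illustrative, not from a real dataset):

```python
import pandas as pd

# Hypothetical sample data, used only for illustration
df = pd.DataFrame({
    "A": [1, 2, 2, None],
    "B": ["x", "y", "x", "x"],
})

# Missing values per column, as a plain dict
missing_counts = df.isna().sum().to_dict()

# Distinct values per column; NaN is not counted by default
unique_counts = df.nunique().to_dict()

print(missing_counts)
print(unique_counts)
```

`df.nunique().to_dict()` should give the same result as `count_unique_values(df)` without an explicit loop.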

## Using `data.iterrows()`

- `data.iterrows()` iterates over the rows of a DataFrame as `(index, Series)` pairs.
- For each row, the iterator yields the row's index label and the row's data as a pandas Series.
- It is useful when you need to process each row individually.

### How it works:

Calling `data.iterrows()` returns an iterator that yields one `(index, Series)` pair per row. The index is the row's label in the DataFrame (not necessarily its position), and the Series holds the row's values, keyed by column name.
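
To make the shape of each pair concrete, here is a small sketch that pulls a single pair from the iterator. The DataFrame and its string index labels are assumptions chosen for illustration:

```python
import pandas as pd

# Hypothetical DataFrame with string labels as the index
data = pd.DataFrame(
    {"A": [1, 2], "B": ["a", "b"]},
    index=["first", "second"],
)

rows = data.iterrows()       # a lazy iterator over the rows
label, row = next(rows)      # pull the first (index, Series) pair

print(label)                 # the index label of the row
print(type(row).__name__)    # the row is a pandas Series
print(list(row.index))       # the Series is keyed by column name
print(row.name)              # the Series name is the row's index label
```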

### Example Usage:

```python
# Sample DataFrame
data = pd.DataFrame({
    'A': [1, 2, 3],
    'B': ['a', 'b', 'c']
})

# Iterating over rows
for i, row in data.iterrows():
    print(f"Index: {i}")
    print(f"Row:\n{row}\n")
```

```
Index: 0
Row:
A    1
B    a
Name: 0, dtype: object

Index: 1
Row:
A    2
B    b
Name: 1, dtype: object

Index: 2
Row:
A    3
B    c
Name: 2, dtype: object
```

```python
# Sample DataFrame with city names
data = pd.DataFrame({
    'city1': ['New York', 'Los Angeles', 'Chicago'],
    'city2': ['Boston', 'San Francisco', 'Houston']
})

# Iterating over rows using iterrows()
for i, row in data.iterrows():
    city1 = row['city1']
    print(f"Row {i} - City1: {city1}")
```

```
Row 0 - City1: New York
Row 1 - City1: Los Angeles
Row 2 - City1: Chicago
```

### Performance:

- `iterrows()` is slow on large DataFrames because it works row by row and builds a new Series for every row. When an operation can be applied to whole columns at once, prefer a **vectorized solution**. `apply()` with `axis=1` is often more readable but still works row by row, and `itertuples()` is generally faster than `iterrows()` when a loop is unavoidable.
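
As a rough illustration of the difference, the sketch below computes the same result row by row and in vectorized form. The `price` and `quantity` columns are hypothetical:

```python
import pandas as pd

# Hypothetical order data, used only for illustration
data = pd.DataFrame({"price": [10.0, 20.0, 30.0], "quantity": [1, 2, 3]})

# Row-wise version: builds one Series per row
totals_loop = []
for _, row in data.iterrows():
    totals_loop.append(row["price"] * row["quantity"])

# Vectorized version: one operation over whole columns
totals_vectorized = data["price"] * data["quantity"]

# Both approaches give the same values
print(totals_loop == totals_vectorized.tolist())
```

On a three-row DataFrame the difference is negligible. The gap usually becomes noticeable as the number of rows grows.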

### Data Types:

- `iterrows()` returns each row as a **Series**, and a Series has a single dtype. If the columns of your DataFrame have different types, the row's values are upcast to a common dtype (`object` in the example above, where integers and strings are mixed), so dtypes are not preserved across columns. When you need to keep each column's type, `itertuples()` is the better choice.
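
The upcasting is easiest to see with one integer column and one float column. A minimal sketch, with hypothetical column names `units` and `ratio`:

```python
import pandas as pd

# Hypothetical data: one integer column and one float column
data = pd.DataFrame({"units": [1, 2], "ratio": [0.5, 1.5]})
print(data.dtypes)             # units is an integer column, ratio is a float column

# iterrows(): the row Series has one dtype, so the integer is upcast to float
_, row = next(data.iterrows())
print(row.dtype)
print(type(row["units"]))      # a float type, no longer an integer

# itertuples(): each field keeps a type that matches its column
first = next(data.itertuples(index=False))
print(type(first.units))       # an integer type
print(type(first.ratio))       # a float type
```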

### Best Use:

- `iterrows()` is best suited to operations that **cannot be easily vectorized**, or to small DataFrames where the readability of an explicit loop matters more than performance. In those cases it offers a straightforward way to run custom logic on each row.
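
One case where an explicit loop can be reasonable is a calculation in which each row depends on the result of the previous one. The sketch below is a made-up example (a running balance that is not allowed to drop below zero), not a recommended pattern for large data:

```python
import pandas as pd

# Hypothetical ledger: the balance may never drop below zero
data = pd.DataFrame({"change": [5, -8, 4, -1]})

balance = 0
balances = []
for _, row in data.iterrows():
    # Each result depends on the previous one, so a plain cumulative sum does not apply
    balance = max(0, balance + row["change"])
    balances.append(balance)

# Assign the results after the loop instead of modifying `data` while iterating
data["balance"] = balances
print(data)
```

Collecting the results in a list and assigning them once after the loop avoids writing into the DataFrame while it is being iterated.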

