| Function                    | Meaning                       |
| --------------------------- | ----------------------------- |
| `df['col'].sum()`           | sum of the values             |
| `df['col'].mean()`          | average value                 |
| `df['col'].median()`        | median value                  |
| `df['col'].min()` / `max()` | minimum / maximum             |
| `df['col'].std()`           | standard deviation (spread)   |
| `df['col'].count()`         | non-null count                |
| `df['col'].nunique()`       | number of distinct values     |
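
As a quick check of the aggregations above, here is a minimal sketch on a small made-up Series. The values and variable names are hypothetical, chosen so the results can be verified by hand.

```python
import pandas as pd

# Hypothetical sample data with one missing value
s = pd.Series([10, 20, 20, None, 50], name='col')

total = s.sum()                      # 100.0 (missing values are skipped by default)
average = s.mean()                   # 25.0  (100 / 4 non-null values)
middle = s.median()                  # 20.0
lowest, highest = s.min(), s.max()   # 10.0, 50.0
spread = s.std()                     # sample standard deviation (ddof=1 by default)
non_null = s.count()                 # 4
distinct = s.nunique()               # 3 (missing values are not counted by default)
```

Note that `count()` and `nunique()` both ignore the missing entry, which is why they report 4 and 3 rather than 5 and 4.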

| Function             | What does it do?                                                       |
| -------------------- | ---------------------------------------------------------------------- |
| `idxmax()`           | Index **label** of the maximum value (like `argmax` in NumPy)          |
| `idxmin()`           | Index **label** of the minimum value                                   |
| `df['col'].argmax()` | Similar to `idxmax()`, but returns the integer position, not the label |
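
The difference between a label and a position only shows up when the index is not the default 0, 1, 2, ... A small sketch, assuming a hypothetical Series indexed by name:

```python
import pandas as pd

# Hypothetical Series whose index holds names instead of integers
scores = pd.Series([88, 95, 85], index=['Alice', 'David', 'Charlie'])

label_of_max = scores.idxmax()      # 'David'   (index label of the largest value)
position_of_max = scores.argmax()   # 1         (integer position of the largest value)
label_of_min = scores.idxmin()      # 'Charlie' (index label of the smallest value)
```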


| Function         | Meaning                            |
| ---------------- | -------------------------------------- |
| `value_counts()` | count of each distinct value in the column |
| `unique()`       | array of the distinct values           |
| `isin([..])`     | check whether each value belongs to a set |


| Function                 | Meaning              |
| ------------------------ | ------------------------- |
| `isnull()` / `notnull()` | detect missing / non-missing values |
| `dropna()`               | drop rows that contain missing values |
| `fillna(value)`          | replace missing values with `value` |
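
The notebook's sample data has no gaps, so here is a minimal sketch of the missing-value helpers on a separate, hypothetical frame (the name `data` and the fill values are assumptions for illustration only):

```python
import numpy as np
import pandas as pd

# Hypothetical frame with one missing name and one missing score
data = pd.DataFrame({'Name': ['Alice', 'Bob', None],
                     'Score': [88.0, np.nan, 85.0]})

missing_mask = data.isnull()               # boolean frame, True where a value is missing
missing_per_column = data.isnull().sum()   # number of missing values in each column
complete_rows = data.dropna()              # keeps only rows with no missing values
filled = data.fillna({'Name': 'Unknown', 'Score': 0})  # per-column replacement values
```

All three calls return new objects; `data` itself is left unchanged.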


| Function              | Meaning       |
| --------------------- | ------------------- |
| `rename(columns={})`  | rename columns      |
| `drop('col', axis=1)` | remove a column     |
| `astype()`            | convert the data type |
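
A short sketch of these three reshaping helpers, using a cut-down copy of the sample data. The new column name `Marks` is hypothetical:

```python
import pandas as pd

data = pd.DataFrame({'Name': ['Alice', 'Bob'],
                     'Age': [25, 30],
                     'Score': [88, 92]})

renamed = data.rename(columns={'Score': 'Marks'})   # 'Marks' is a made-up name
without_age = data.drop('Age', axis=1)              # same as data.drop(columns='Age')
as_float = data.astype({'Score': 'float64'})        # convert only the Score column
```

Each call returns a new DataFrame, so the result has to be assigned to a variable if it is needed later.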


In [1]:
import pandas as pd

In [3]:
df=pd.DataFrame({'Name':['Alice', 'Bob', 'Charlie', 'Bob', 'Alice', 'David'],
                 'Age':[25, 30, 35, 30, 25, 40],
                 'Score':[88, 92, 85, 92, 88, 95]})

In [4]:
df.head()

|     | Name    | Age | Score |
| --- | ------- | --- | ----- |
| 0   | Alice   | 25  | 88    |
| 1   | Bob     | 30  | 92    |
| 2   | Charlie | 35  | 85    |
| 3   | Bob     | 30  | 92    |
| 4   | Alice   | 25  | 88    |


# How many unique names are there in the Name column?

In [7]:
df['Name'].nunique()  # Number of distinct names in the Name column

4

# What is the most frequent name and how many times does it occur?

In [18]:
# value_counts() counts each name; idxmax() returns the label with the highest count
frequent_name=df['Name'].value_counts().idxmax()
# Alice and Bob are tied with 2 occurrences each. idxmax() returns the first label holding
# the maximum, which here is Alice because Alice is listed before Bob in the value_counts() result
print(frequent_name)
df['Name'].value_counts().max() # How many times the most frequent name occurs

Alice


2
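
Because `idxmax()` reports only one label when counts are tied, a possible way to list every name that shares the top count is to filter the `value_counts()` result. This is a sketch, not the only approach; the variable names are illustrative.

```python
import pandas as pd

df = pd.DataFrame({'Name': ['Alice', 'Bob', 'Charlie', 'Bob', 'Alice', 'David']})

counts = df['Name'].value_counts()
# Keep every name whose count equals the maximum count, sorted for a stable order
most_frequent = sorted(counts[counts == counts.max()].index)
print(most_frequent)
```

For this data the list contains both Alice and Bob, each with 2 occurrences.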

# What is the average Score for each Name? 

In [26]:
df.groupby('Name').agg({'Score':'mean'})

| Name    | Score |
| ------- | ----- |
| Alice   | 88.0  |
| Bob     | 92.0  |
| Charlie | 85.0  |
| David   | 95.0  |


# Which name has the highest average Score?

In [29]:
df.groupby('Name').agg({'Score':'mean'}).idxmax()

Score    David
dtype: object

# Show all rows where Name is either 'Alice' or 'Bob'

In [31]:
df[df['Name'].isin(['Alice','Bob'])]

|     | Name  | Age | Score |
| --- | ----- | --- | ----- |
| 0   | Alice | 25  | 88    |
| 1   | Bob   | 30  | 92    |
| 3   | Bob   | 30  | 92    |
| 4   | Alice | 25  | 88    |


# Find the Name(s) who have the maximum Score

In [37]:
df[df['Score']==df['Score'].max()]  # Rows whose Score equals the overall maximum Score

|     | Name  | Age | Score |
| --- | ----- | --- | ----- |
| 5   | David | 40  | 95    |


In [34]:
df.groupby('Name').agg({'Score':'sum'}).idxmax() # Returns the name with the highest *total* Score (Bob: 92 + 92 = 184), not the highest single Score

Score    Bob
dtype: object

# Get all the distinct scores from the Score column

In [35]:
df['Score'].unique()

array([88, 92, 85, 95], dtype=int64)