


**Example DataFrame**

In [1]:
import pandas as pd

data = {
    'Name': ['Alice', 'Bob', 'Charlie', 'David', 'Eve', 'Frank', 'Grace'],
    'Age': [25, 30, 22, 35, 28, 40, 27],
    'Score': [85, 92, 78, 95, 88, 70, 90],
    'City': ['New York', 'London', 'Paris', 'Tokyo', 'Sydney', 'Berlin', 'Rome']
}

df = pd.DataFrame(data)
print("Original DataFrame:\n", df)

Original DataFrame:
       Name  Age  Score      City
0    Alice   25     85  New York
1      Bob   30     92    London
2  Charlie   22     78     Paris
3    David   35     95     Tokyo
4      Eve   28     88    Sydney
5    Frank   40     70    Berlin
6    Grace   27     90      Rome


**1. `nlargest(n, columns)`**

In [2]:
# Get the top 3 people with the highest scores
top_3_scores = df.nlargest(3, 'Score')
print("\nTop 3 Scores:\n", top_3_scores)

# Get the top 2 oldest people
top_2_oldest = df.nlargest(2, 'Age')
print("\nTop 2 Oldest:\n", top_2_oldest)


Top 3 Scores:
     Name  Age  Score    City
3  David   35     95   Tokyo
1    Bob   30     92  London
6  Grace   27     90    Rome

Top 2 Oldest:
     Name  Age  Score    City
5  Frank   40     70  Berlin
3  David   35     95   Tokyo
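
`nlargest` also takes a `keep` argument that controls what happens when several rows tie at the cutoff. The sketch below uses a small hypothetical `scores` DataFrame (not part of the example data above) that contains a tie. It shows the default `keep='first'` next to `keep='all'`.

```python
import pandas as pd

# Hypothetical data with a tie: Alice and Charlie both scored 90
scores = pd.DataFrame({
    'Name': ['Alice', 'Bob', 'Charlie', 'David'],
    'Score': [90, 95, 90, 80],
})

# Default keep='first': of the tied rows, the one that appears first is kept
top2_first = scores.nlargest(2, 'Score')

# keep='all': every row tied at the cutoff is kept, so more than n rows can come back
top2_all = scores.nlargest(2, 'Score', keep='all')

print("keep='first':\n", top2_first)
print("keep='all':\n", top2_all)
```

With `keep='all'` the result here has three rows even though `n` is 2, which is worth remembering when the output feeds a fixed-size report.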


**2. `nsmallest(n, columns)`**

In [3]:
# Get the 2 youngest people
youngest_2 = df.nsmallest(2, 'Age')
print("\n2 Youngest:\n", youngest_2)

# Get the 3 people with the lowest scores
bottom_3_scores = df.nsmallest(3, 'Score')
print("\nBottom 3 Scores:\n", bottom_3_scores)


2 Youngest:
       Name  Age  Score      City
2  Charlie   22     78     Paris
0    Alice   25     85  New York

Bottom 3 Scores:
       Name  Age  Score      City
5    Frank   40     70    Berlin
2  Charlie   22     78     Paris
0    Alice   25     85  New York
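
The `columns` argument can also be a list, in which case later columns break ties in earlier ones. The sketch below assumes a hypothetical `people` DataFrame in which two rows share the lowest score. It is illustrative data, not the example DataFrame above.

```python
import pandas as pd

# Hypothetical data: Frank and Hana share the lowest score
people = pd.DataFrame({
    'Name': ['Frank', 'Hana', 'Ivan'],
    'Age': [40, 22, 25],
    'Score': [70, 70, 85],
})

# Single column: the tie is resolved by row order, so Frank is returned
lowest_by_score = people.nsmallest(1, 'Score')

# Two columns: among the tied scores, the smaller Age wins, so Hana is returned
lowest_then_youngest = people.nsmallest(1, ['Score', 'Age'])

print(lowest_by_score)
print(lowest_then_youngest)
```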


**3. `sort_values(by, ascending=True)`**

In [4]:
# Sort by 'Score' in descending order and get the top 3
sorted_by_score_top3 = df.sort_values(by='Score', ascending=False).head(3)
print("\nSorted by Score (Top 3):\n", sorted_by_score_top3)

# Sort by 'Age' in ascending order
sorted_by_age = df.sort_values(by='Age')
print("\nSorted by Age:\n", sorted_by_age)

# Sort by 'City' (alphabetical order)
sorted_by_city = df.sort_values(by='City')
print("\nSorted by City:\n", sorted_by_city)


Sorted by Score (Top 3):
     Name  Age  Score    City
3  David   35     95   Tokyo
1    Bob   30     92  London
6  Grace   27     90    Rome

Sorted by Age:
       Name  Age  Score      City
2  Charlie   22     78     Paris
0    Alice   25     85  New York
6    Grace   27     90      Rome
4      Eve   28     88    Sydney
1      Bob   30     92    London
3    David   35     95     Tokyo
5    Frank   40     70    Berlin

Sorted by City:
       Name  Age  Score      City
5    Frank   40     70    Berlin
1      Bob   30     92    London
0    Alice   25     85  New York
2  Charlie   22     78     Paris
6    Grace   27     90      Rome
4      Eve   28     88    Sydney
3    David   35     95     Tokyo
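
`sort_values` accepts a list for `by` and a matching list for `ascending`, so each sort key can have its own direction. The sketch below uses a hypothetical `visits` DataFrame with repeated cities so that the second key has something to do.

```python
import pandas as pd

# Hypothetical data with repeated cities so the second sort key matters
visits = pd.DataFrame({
    'Name': ['Alice', 'Bob', 'Charlie', 'David'],
    'City': ['Paris', 'London', 'Paris', 'London'],
    'Score': [85, 92, 78, 95],
})

# City in A-Z order, and within each city the highest score first
by_city_then_score = visits.sort_values(by=['City', 'Score'], ascending=[True, False])

# Sorting keeps the original index labels; reset_index(drop=True) renumbers from 0
renumbered = by_city_then_score.reset_index(drop=True)

print(by_city_then_score)
print(renumbered)
```

As in the outputs above, the sorted frame keeps its original index labels. Renumbering is a separate, optional step.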


**4. `head(n)`**

In [5]:
# Get the first 2 rows
first_2_rows = df.head(2)
print("\nFirst 2 Rows:\n", first_2_rows)


First 2 Rows:
     Name  Age  Score      City
0  Alice   25     85  New York
1    Bob   30     92    London


**5. `tail(n)`**

In [6]:
# Get the last 2 rows
last_2_rows = df.tail(2)
print("\nLast 2 Rows:\n", last_2_rows)


Last 2 Rows:
     Name  Age  Score    City
5  Frank   40     70  Berlin
6  Grace   27     90    Rome


**6. `loc[]` (Filtering and Selecting)**

In [7]:
# Get people with scores greater than 90
high_scorers = df.loc[df['Score'] > 90]
print("\nHigh Scorers:\n", high_scorers)

# Get people who are older than 30 and live in Tokyo
specific_people = df.loc[(df['Age'] > 30) & (df['City'] == 'Tokyo')]
print("\nSpecific People:\n", specific_people)


High Scorers:
     Name  Age  Score    City
1    Bob   30     92  London
3  David   35     95   Tokyo

Specific People:
     Name  Age  Score   City
3  David   35     95  Tokyo
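
The examples above use `loc` only to filter rows. `loc` also takes a second argument that selects columns, and it can look up a single value by row label and column name. The sketch below rebuilds the example DataFrame so that it runs on its own. The variable names `high_scorer_names` and `grace_city` are illustrative.

```python
import pandas as pd

# Same data as the example DataFrame at the top of the notebook
df = pd.DataFrame({
    'Name': ['Alice', 'Bob', 'Charlie', 'David', 'Eve', 'Frank', 'Grace'],
    'Age': [25, 30, 22, 35, 28, 40, 27],
    'Score': [85, 92, 78, 95, 88, 70, 90],
    'City': ['New York', 'London', 'Paris', 'Tokyo', 'Sydney', 'Berlin', 'Rome'],
})

# Row filter plus column selection: only Name and Score for scores above 90
high_scorer_names = df.loc[df['Score'] > 90, ['Name', 'Score']]

# Single value by row label (6) and column name ('City')
grace_city = df.loc[6, 'City']

print(high_scorer_names)
print(grace_city)
```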


**7. `query(expr)`**

In [8]:
# Get people with scores greater than 90 using query
high_scorers_query = df.query('Score > 90')
print("\nHigh Scorers (Query):\n", high_scorers_query)

# Get people who are older than 30 and live in Tokyo using query
specific_people_query = df.query('Age > 30 and City == "Tokyo"')
print("\nSpecific People (Query):\n", specific_people_query)


High Scorers (Query):
     Name  Age  Score    City
1    Bob   30     92  London
3  David   35     95   Tokyo

Specific People (Query):
     Name  Age  Score   City
3  David   35     95  Tokyo
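
Inside a `query` string, a Python variable can be referenced by prefixing its name with `@`, which avoids building the expression by string concatenation. The sketch below rebuilds the example DataFrame so that it runs on its own. The names `min_score` and `target_city` are illustrative.

```python
import pandas as pd

# Same data as the example DataFrame at the top of the notebook
df = pd.DataFrame({
    'Name': ['Alice', 'Bob', 'Charlie', 'David', 'Eve', 'Frank', 'Grace'],
    'Age': [25, 30, 22, 35, 28, 40, 27],
    'Score': [85, 92, 78, 95, 88, 70, 90],
    'City': ['New York', 'London', 'Paris', 'Tokyo', 'Sydney', 'Berlin', 'Rome'],
})

min_score = 90
target_city = 'Tokyo'

# @name pulls the value of a local Python variable into the expression
above_min = df.query('Score > @min_score')
in_city = df.query('Age > 30 and City == @target_city')

# The query result matches the equivalent boolean-mask filter with loc
same_as_loc = above_min.equals(df.loc[df['Score'] > min_score])

print(above_min)
print(in_city)
print(same_as_loc)
```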


**Key Takeaways**

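* `nlargest` and `nsmallest` are concise ways to get the top or bottom `n` rows by a column. Both also accept a list of columns, with later columns breaking ties.
* `sort_values` is more general: it sorts the whole DataFrame, accepts multiple columns, and sets the direction through `ascending`.
* `head` and `tail` return the first or last `n` rows by position, which makes them handy for a quick look at a DataFrame or for trimming a sorted result.
* `loc` with a boolean mask and `query` with an expression string both filter rows by condition. In the examples above they return identical results.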
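
To make the first two takeaways concrete, the sketch below checks that `nlargest(3, 'Score')` and `sort_values(by='Score', ascending=False).head(3)` return the same rows for the example data. The data has no tied scores, and the two approaches are not guaranteed to agree when ties are present.

```python
import pandas as pd

# Same data as the example DataFrame at the top of the notebook
df = pd.DataFrame({
    'Name': ['Alice', 'Bob', 'Charlie', 'David', 'Eve', 'Frank', 'Grace'],
    'Age': [25, 30, 22, 35, 28, 40, 27],
    'Score': [85, 92, 78, 95, 88, 70, 90],
    'City': ['New York', 'London', 'Paris', 'Tokyo', 'Sydney', 'Berlin', 'Rome'],
})

# Two ways to get the three highest scores
via_nlargest = df.nlargest(3, 'Score')
via_sort = df.sort_values(by='Score', ascending=False).head(3)

# Same rows, same order, same index labels (no tied scores in this data)
print(via_nlargest.equals(via_sort))
```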