# Indexing and slicing in Pandas:

Pandas supports both label-based indexing (loc) and position-based indexing (iloc). Either one can select individual elements or slices. Positional slices behave like Python list slices and exclude the stop position, while label slices include both endpoints. Pandas also supports boolean indexing, which filters data with conditional expressions.

# Example Usage:

In [28]:
import pandas as pd

# Sample data
data = {
    'Name': ['Alice', 'Bob', 'Charlie', 'David', 'Emma'],
    'Age': [25, 30, 35, 28, 32],
    'City': ['London', 'New York', 'Paris', 'Paris', 'Sydney'],
    'Salary': [60000, 75000, 80000, 70000, 65000]
}

In [29]:
# Creating a DataFrame
df = pd.DataFrame(data)


**DataFrame Creation:** We create a Pandas DataFrame named 'df' from the provided dictionary 'data'. This DataFrame contains information about employees, including their names, ages, cities, and salaries.

In [30]:
# Displaying the DataFrame
df

| | Name | Age | City | Salary |
|---|---|---|---|---|
| 0 | Alice | 25 | London | 60000 |
| 1 | Bob | 30 | New York | 75000 |
| 2 | Charlie | 35 | Paris | 80000 |
| 3 | David | 28 | Paris | 70000 |
| 4 | Emma | 32 | Sydney | 65000 |


**Displaying the DataFrame:** Evaluating df as the last expression in a cell displays its contents. This helps us understand the structure of the data and verify that the DataFrame was created correctly.

In [31]:
# Indexing and Slicing
print("Accessing data using index labels:")
df.loc[1:3, 'Name':'City']  # Using index labels for rows and columns

Accessing data using index labels:


| | Name | Age | City |
|---|---|---|---|
| 1 | Bob | 30 | New York |
| 2 | Charlie | 35 | Paris |
| 3 | David | 28 | Paris |


**Indexing and Slicing:** We use the loc accessor to select rows and columns by their labels. For example, df.loc[1:3, 'Name':'City'] selects the rows labeled 1 through 3 and the columns 'Name' through 'City', inclusive at both ends.
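Because loc slices by label rather than by position, the last label in the slice is part of the result. This is easier to see when the row labels are not integers. The following is a minimal sketch that re-indexes the same sample data by name; the variables by_name and subset are illustrative and do not appear in the original notebook.

```python
import pandas as pd

data = {
    'Name': ['Alice', 'Bob', 'Charlie', 'David', 'Emma'],
    'Age': [25, 30, 35, 28, 32],
    'City': ['London', 'New York', 'Paris', 'Paris', 'Sydney'],
    'Salary': [60000, 75000, 80000, 70000, 65000]
}
df = pd.DataFrame(data)

# Use the 'Name' column as the row index so the row labels are strings
# (by_name is a hypothetical name used only for this illustration)
by_name = df.set_index('Name')

# Label slices include both endpoints:
# rows 'Bob' through 'David', columns 'Age' through 'City'
subset = by_name.loc['Bob':'David', 'Age':'City']
print(subset)
```

The result contains three rows (Bob, Charlie, David) and two columns (Age, City), whereas a Python list slice with the same start and stop would have dropped the last item.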

In [32]:
print("Accessing data using integer positions:")
df.iloc[1:3, 0:3]  # Using integer positions for rows and columns


Accessing data using integer positions:


| | Name | Age | City |
|---|---|---|---|
| 1 | Bob | 30 | New York |
| 2 | Charlie | 35 | Paris |


We use the iloc accessor to select rows and columns by their integer positions. For example, df.iloc[1:3, 0:3] selects the rows at positions 1 and 2 and the columns at positions 0 to 2. As with Python list slicing, the stop position is excluded.
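Since iloc follows Python's positional conventions, negative positions and single integers work as well. The sketch below illustrates this on the same sample data; the variable names first_slice, last_two, and row_zero are illustrative only.

```python
import pandas as pd

data = {
    'Name': ['Alice', 'Bob', 'Charlie', 'David', 'Emma'],
    'Age': [25, 30, 35, 28, 32],
    'City': ['London', 'New York', 'Paris', 'Paris', 'Sydney'],
    'Salary': [60000, 75000, 80000, 70000, 65000]
}
df = pd.DataFrame(data)

# Rows at positions 1 and 2 (position 3 is excluded), columns at positions 0 to 2
first_slice = df.iloc[1:3, 0:3]

# Negative positions count from the end: the last two rows, all columns
last_two = df.iloc[-2:, :]

# A single integer selects one row and returns it as a Series
row_zero = df.iloc[0]

print(first_slice)
print(last_two)
print(row_zero)
```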

In [33]:
print("Boolean indexing:")
df.loc[1:3, [True, True, False, True]]  # Boolean mask on the columns: keep Name, Age, Salary; drop City

Boolean indexing:


| | Name | Age | Salary |
|---|---|---|---|
| 1 | Bob | 30 | 75000 |
| 2 | Charlie | 35 | 80000 |
| 3 | David | 28 | 70000 |


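A list of booleans can also act as a mask: each True keeps the corresponding row or column, and each False drops it. The mask must have the same length as the axis it is applied to. Here, [True, True, False, True] keeps the 'Name', 'Age', and 'Salary' columns and drops 'City'.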
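In practice, a boolean mask is usually computed from the data rather than typed by hand. The following sketch builds the same column mask by comparing the column names against 'City'; the names col_mask and result are illustrative and not part of the original notebook.

```python
import pandas as pd

data = {
    'Name': ['Alice', 'Bob', 'Charlie', 'David', 'Emma'],
    'Age': [25, 30, 35, 28, 32],
    'City': ['London', 'New York', 'Paris', 'Paris', 'Sydney'],
    'Salary': [60000, 75000, 80000, 70000, 65000]
}
df = pd.DataFrame(data)

# Comparing the column index to a value yields one boolean per column
col_mask = df.columns != 'City'
print(col_mask)

# Rows labeled 1 through 3, and every column where the mask is True
result = df.loc[1:3, col_mask]
print(result)
```

Computing the mask this way keeps it aligned with the columns even if the column order changes.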

In [34]:
print("Filtering:")
df[df['Salary'] > 70000]  # Filtering data based on a condition

Filtering:


| | Name | Age | City | Salary |
|---|---|---|---|---|
| 1 | Bob | 30 | New York | 75000 |
| 2 | Charlie | 35 | Paris | 80000 |


Here we filter the DataFrame based on a condition. The expression df['Salary'] > 70000 produces a boolean Series with one True or False value per row. Passing that Series to df[...] returns a DataFrame containing only the rows where the value is True, in this case the employees whose salary is greater than 70000.
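Conditions can also be combined. The sketch below shows a few common patterns on the same sample data, with illustrative variable names. Note that pandas uses the element-wise operators & and | instead of Python's and/or, and that each condition needs its own parentheses.

```python
import pandas as pd

data = {
    'Name': ['Alice', 'Bob', 'Charlie', 'David', 'Emma'],
    'Age': [25, 30, 35, 28, 32],
    'City': ['London', 'New York', 'Paris', 'Paris', 'Sydney'],
    'Salary': [60000, 75000, 80000, 70000, 65000]
}
df = pd.DataFrame(data)

# Both conditions must hold (element-wise AND)
paris_high = df[(df['Salary'] > 65000) & (df['City'] == 'Paris')]

# Either condition may hold (element-wise OR)
young_or_top = df[(df['Age'] < 28) | (df['Salary'] > 75000)]

# A row condition combined with a column selection, using loc
names_only = df.loc[df['Salary'] > 70000, ['Name', 'Salary']]

print(paris_high)
print(young_or_top)
print(names_only)
```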

# Assignment

In this lab exercise, you will explore a dataset containing information about employees in a company using Pandas. You will perform various data manipulation and analysis tasks to gain insights into the employee data.

**Dataset:**
You are provided with a dictionary containing employee information:

In [35]:
# Sample data
data = {
    'Name': ['Alice', 'Bob', 'Charlie', 'David', 'Emma'],
    'Age': [25, 30, 35, 28, 32],
    'City': ['London', 'New York', 'Paris', 'Paris', 'Sydney'],
    'Salary': [60000, 75000, 80000, 70000, 65000]
}

## Tasks:

**DataFrame Creation:**
- Create a Pandas DataFrame named 'employees' from the provided dictionary 'data'.

**Indexing and Slicing:**
- Access the last 2 rows and the 'Name' and 'Salary' columns using both label-based and position-based indexing.
- Extract the data for employees with indices 0 to 2 and all columns using integer-based indexing.

**Boolean Indexing:**
- Filter the data to include only employees aged 30 or above.

In [36]:
import pandas as pd

In [37]:
employees = pd.DataFrame(data)

In [38]:
employees

| | Name | Age | City | Salary |
|---|---|---|---|---|
| 0 | Alice | 25 | London | 60000 |
| 1 | Bob | 30 | New York | 75000 |
| 2 | Charlie | 35 | Paris | 80000 |
| 3 | David | 28 | Paris | 70000 |
| 4 | Emma | 32 | Sydney | 65000 |


In [39]:
print("Last 2 rows and Name and Salary columns using label-based:")
employees.loc[3:4, ['Name', 'Salary']]

Last 2 rows and Name and Salary columns using label-based:


| | Name | Salary |
|---|---|---|
| 3 | David | 70000 |
| 4 | Emma | 65000 |


In [40]:
print("Last 2 rows and Name and Salary columns using integer-based: ")
employees.iloc[3:5, [0, 3]]

Last 2 rows and Name and Salary columns using integer-based: 


| | Name | Salary |
|---|---|---|
| 3 | David | 70000 |
| 4 | Emma | 65000 |
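The two solutions above hard-code the row labels (3:4) and the column positions ([0, 3]), which only works for a DataFrame of exactly this shape. As a sketch of a more general approach, the rows can be taken relative to the end and the column positions looked up by name with Index.get_indexer. The variable names col_positions, by_position, and by_label are illustrative.

```python
import pandas as pd

data = {
    'Name': ['Alice', 'Bob', 'Charlie', 'David', 'Emma'],
    'Age': [25, 30, 35, 28, 32],
    'City': ['London', 'New York', 'Paris', 'Paris', 'Sydney'],
    'Salary': [60000, 75000, 80000, 70000, 65000]
}
employees = pd.DataFrame(data)

# Look up the integer positions of the wanted columns by name
col_positions = employees.columns.get_indexer(['Name', 'Salary'])

# Position-based: last two rows via a negative slice
by_position = employees.iloc[-2:, col_positions]

# Label-based: take the last two row labels from the index
by_label = employees.loc[employees.index[-2:], ['Name', 'Salary']]

print(by_position)
print(by_label)
```

Both expressions return the same two rows (David and Emma) regardless of how many rows the DataFrame contains.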


In [41]:
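print("Employees from indices 0 to 2 and all columns using integer-based:")
employees.iloc[0:3, :]  # The stop position is excluded, so 0:3 selects rows 0, 1, and 2

Employees from indices 0 to 2 and all columns using integer-based:


| | Name | Age | City | Salary |
|---|---|---|---|---|
| 0 | Alice | 25 | London | 60000 |
| 1 | Bob | 30 | New York | 75000 |
| 2 | Charlie | 35 | Paris | 80000 |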


In [42]:
print("Filter by age older than or equal to 30: ")
employees[employees['Age'] >= 30]

Filter by age older than or equal to 30: 


| | Name | Age | City | Salary |
|---|---|---|---|---|
| 1 | Bob | 30 | New York | 75000 |
| 2 | Charlie | 35 | Paris | 80000 |
| 4 | Emma | 32 | Sydney | 65000 |
