# **Guided Lab 343.3.11 - Slicing Pandas DataFrame Data**

## **Lab Introduction:**
Pandas is a powerful Python library for data analysis and manipulation. One of its key features is the ability to slice data from DataFrames, allowing you to extract specific subsets of data for further analysis. This lab will guide you through the essential techniques for slicing Pandas DataFrames using the iloc and loc indexers.

**Why is slicing important?**

- Data Exploration: Slicing lets you quickly examine specific parts of your data, such as individual rows, columns, or ranges of values.
- Data Cleaning: You can use slicing to remove unwanted data or select only the data you need for your analysis.
- Data Transformation: Slicing can be used to create new DataFrames with specific columns or rows, enabling you to reshape your data for different purposes.
- Data Analysis: By isolating specific subsets of data, slicing allows you to perform targeted analyses on smaller portions of your dataset.

## **Learning Objective:**
By the end of this lab, learners will be able to:
- Slice data from a Pandas DataFrame.
- Use the `iloc` indexer to slice data by row and column positions.
- Use the `loc` indexer to slice data by row and column labels.
- Select specific rows and columns from a DataFrame using both `iloc` and `loc`.
- Identify the first non-empty row in a Pandas Series or column using `first_valid_index()`.
- Effectively manipulate and extract desired data subsets from Pandas DataFrames for analysis and further processing.


## **Instructions:**

## **Method #1: Slicing Dataframe using DataFrame.iloc[]**

**Example 1.1: Slicing by rows**

In the example below, we will slice:
- Only the first row of the DataFrame.
- The first four rows (positions 0 to 3) of the DataFrame.


In [None]:
# importing pandas library
import pandas as pd

# Initialize the nested list that holds the data set
employee_list = [['James', 36, 75, 5428000],
                 ['Villers', 38, 74, 3428000],
                 ['VKole', 31, 70, 8428000],
                 ['Smith', 34, 80, 4428000],
                 ['Gayle', 40, 100, 4528000],
                 ['Rooter', 33, 72, 7028000],
                 ['Peterson', 42, 85, 2528000],
                 ['John', 41, 85, 1528000]]

# creating a pandas dataframe
df = pd.DataFrame(employee_list, columns=['Name', 'Age', 'Weight', 'Salary'])

print(' ------data frame before slicing-----')
print(df)

print(' ------ Select First Row by Position-----')
# Select the first row by position (a slice returns a one-row DataFrame)
print(df.iloc[:1])

print(' ------ Select First 4 Rows by Position-----')
# Use the iloc indexer to slice the first 4 rows (positions 0 to 3) of the
# original DataFrame (df); the end position 4 is excluded.
# The result is assigned to a new DataFrame named df1.
df1 = df.iloc[0:4]

print(' ------data frame after slicing----')
print(df1)
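
As a brief supplementary sketch (the name `df_demo` and the other variable names are illustrative and not part of the lab), the following shows that `iloc` returns a DataFrame for a slice but a Series for a single integer, and that negative positions count from the end:

```python
import pandas as pd

# Small illustrative DataFrame (a subset of the lab's employee data)
df_demo = pd.DataFrame(
    [['James', 36], ['Villers', 38], ['VKole', 31], ['Smith', 34]],
    columns=['Name', 'Age'],
)

# A slice always returns a DataFrame, even when it covers a single row
one_row_frame = df_demo.iloc[:1]

# A single integer returns that row as a Series
one_row_series = df_demo.iloc[0]

# Negative positions count from the end: this selects the last two rows
last_two = df_demo.iloc[-2:]

print(type(one_row_frame).__name__)   # DataFrame
print(type(one_row_series).__name__)  # Series
print(last_two)
```

Choosing `df.iloc[:1]` over `df.iloc[0]` keeps the result two-dimensional, which is usually what you want when printing or chaining further DataFrame operations.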


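**Example 1.2 - Slicing by column position**

In the example below, we will slice the first two columns (positions 0 and 1) from the DataFrame. Note that `iloc` works with integer positions, not with column or index labels.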




In [None]:
# Initializing the nested list with Data set
employee_list = [['James', 36, 75, 5428000],
               ['Villers', 38, 74, 3428000],
               ['VKole', 31, 70, 8428000],
               ['Smith', 34, 80, 4428000],
               ['Gayle', 40, 100, 4528000],
               ['Rooter', 33, 72, 7028000],
               ['Peterson', 42, 85, 2528000],
               ['John', 41, 85, 1528000]]

# creating a pandas dataframe
df = pd.DataFrame(employee_list, columns=['Name', 'Age', 'Weight', 'Salary'])

# DataFrame before slicing
print(df)
print('====Slicing columns in dataframe======')

# Select all rows and the columns at positions 0 and 1 ('Name' and 'Age')
emp_df = df.iloc[:, 0:2]
print(emp_df)
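
Column positions do not have to form a contiguous range. The sketch below (using a shortened, illustrative copy of the data named `df_demo`) selects non-adjacent columns with a list of positions and every second column with a step:

```python
import pandas as pd

# Shortened illustrative copy of the lab's employee data
df_demo = pd.DataFrame(
    [['James', 36, 75, 5428000], ['Villers', 38, 74, 3428000]],
    columns=['Name', 'Age', 'Weight', 'Salary'],
)

# A list of positions selects non-adjacent columns: 0 ('Name') and 3 ('Salary')
name_salary = df_demo.iloc[:, [0, 3]]

# A step in the slice selects every second column: positions 0 and 2
every_second = df_demo.iloc[:, ::2]

print(name_salary)
print(every_second)
```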


**Example 1.3**

Select the values at row positions 0 to 2 (exclusive) and column positions 0 to 2 (exclusive), that is, the first two rows and the first two columns of `df1` from Example 1.1.






In [None]:
print(df1.iloc[0:2, 0:2])
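
One detail worth knowing when combining row and column slices: an `iloc` slice that runs past the end of the data is truncated silently, while a single out-of-range integer raises `IndexError`. A minimal sketch, using an illustrative DataFrame named `df_demo`:

```python
import pandas as pd

df_demo = pd.DataFrame({'Name': ['James', 'Villers', 'VKole'],
                        'Age': [36, 38, 31]})

# A slice that runs past the end is truncated rather than raising an error
oversized = df_demo.iloc[0:100, 0:100]
print(oversized.shape)  # (3, 2)

# A single out-of-range position raises IndexError
try:
    df_demo.iloc[100]
    raised = False
except IndexError:
    raised = True

print(raised)  # True
```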

## **Method #2 - Slicing Dataframe using DataFrame.loc[]**

First, create a demo DataFrame to use in the examples that follow.



In [None]:
import pandas as pd

#create DataFrame with six columns
df = pd.DataFrame({'team': ['A', 'B', 'C', 'D', 'E', 'F', 'G', 'H'],
                   'points': [18, 22, 19, 14, 14, 11, 20, 28],
                   'assists': [5, 7, 7, 9, 12, 9, 9, 4],
                   'rebounds': [11, 8, 10, 6, 6, 5, 9, 12],
                   'steals': [4, 3, 3, 2, 5, 4, 3, 8],
                   'blocks': [1, 0, 0, 3, 2, 2, 1, 5]})

#view DataFrame
print(df)


**Example 2.1 - Slice by Specific Column Names**

We can use the following syntax to create a new DataFrame that only contains the columns `team` and `rebounds`:



In [None]:
#slice columns team and rebounds
df_new = df.loc[:, ['team', 'rebounds']]

#view new DataFrame
print(df_new)


**Example 2.2 - Slice by Column Names in Range**

We can use the following example to create a new DataFrame that only contains the columns from `team` through `rebounds`. Unlike `iloc`, a `loc` slice includes both endpoints, so `rebounds` is part of the result:




In [None]:

#slice columns between team and rebounds
df_new = df.loc[:, 'team':'rebounds']

#view new DataFrame
print(df_new)
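
To make the difference in endpoint handling concrete, the sketch below compares a label slice with a positional slice on a shortened, illustrative copy of the demo data (`df_demo` is not part of the lab):

```python
import pandas as pd

df_demo = pd.DataFrame({'team': ['A', 'B'],
                        'points': [18, 22],
                        'assists': [5, 7],
                        'rebounds': [11, 8],
                        'steals': [4, 3]})

# loc: both endpoints are included, so 'rebounds' is in the result
by_label = df_demo.loc[:, 'team':'rebounds']

# iloc: the end position is excluded, so position 3 ('rebounds') is left out
by_position = df_demo.iloc[:, 0:3]

print(list(by_label.columns))     # ['team', 'points', 'assists', 'rebounds']
print(list(by_position.columns))  # ['team', 'points', 'assists']
```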


**Example 2.3 - Select rows by index label and the 'team' column**

Select the rows with index labels 3 through 6 (both endpoints included) and the 'team' column:




In [None]:
print(df.loc[3:6, ['team']])
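
With the default integer index, labels and positions happen to coincide, which can hide the fact that `loc` works with labels. The sketch below, which assumes we re-index an illustrative copy of the data by team name, shows `loc` slicing rows by label:

```python
import pandas as pd

df_demo = pd.DataFrame({'team': ['A', 'B', 'C', 'D', 'E', 'F', 'G', 'H'],
                        'points': [18, 22, 19, 14, 14, 11, 20, 28]})

# Use the team names as row labels instead of 0, 1, 2, ...
by_team = df_demo.set_index('team')

# loc slices by label: rows 'C' through 'E', both endpoints included
subset = by_team.loc['C':'E', ['points']]
print(subset)

# The equivalent positional slice needs an end position one past the last row
same_rows = by_team.iloc[2:5, [0]]
print(subset.equals(same_rows))  # True
```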


## **Example 3: Identify the first non-empty row in a Pandas Series or column**

To identify the first non-empty row in a Pandas Series or column, you can use the `first_valid_index()` method. This method returns the index label of the first non-null (non-empty) value in the Series. Here's how you can use it:

In [None]:
import pandas as pd

# Example Pandas Series
data = pd.Series([None, None, 5, 10, None, 20])

# Find the index label of the first non-empty row
first_non_empty_index = data.first_valid_index()

print("Index of the first non-empty row:", first_non_empty_index)
print("Value of the first non-empty row:", data[first_non_empty_index])
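
The same method works on a DataFrame column, and it returns `None` when the column contains no valid values at all, so it is worth checking the result before using it. A minimal sketch with an illustrative DataFrame (the column names `score` and `notes` are made up for this example):

```python
import pandas as pd

df_demo = pd.DataFrame(
    {'score': [None, 7.5, 9.0], 'notes': [None, None, None]},
    index=['a', 'b', 'c'],
)

# Returns the index label (not the position) of the first non-null value
first_score = df_demo['score'].first_valid_index()

# Returns None because every value in this column is missing
first_note = df_demo['notes'].first_valid_index()

print(first_score)  # b
print(first_note)   # None

# Guard against None before looking up the value
if first_score is not None:
    print(df_demo.loc[first_score, 'score'])  # 7.5
```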


## **Submission Instructions**
- Submit your completed lab using the Start Assignment button on the assignment page in Canvas.
- Your submission can be either of the following:
  - If you are using a notebook, all tasks should be written and submitted in a single notebook file, for example: (**your_name_labname.ipynb**).
  - If you are using a Python script file, all tasks should be written and submitted in a single Python script file, for example: (**your_name_labname.py**).
- Add appropriate comments and any additional instructions if required.

