## Different ways to iterate over rows in a pandas DataFrame

https://www.geeksforgeeks.org/different-ways-to-iterate-over-rows-in-pandas-dataframe/

#### Stack Overflow discussion on the performance of iterating over DataFrame rows

https://stackoverflow.com/questions/16476924/how-to-iterate-over-rows-in-a-dataframe-in-pandas

In [5]:
import pandas as pd

# Define a dictionary containing students data
data = {
    "Name": ["Ankit", "Amit", "Aishwarya", "Priyanka"],
    "Age": [21, 19, 20, 18],
    "Stream": ["Math", "Commerce", "Arts", "Biology"],
    "Percentage": [88, 92, 95, 70]
}

# Convert the dictionary into DataFrame
df = pd.DataFrame(data)

print("Given Dataframe :\n", df)

Given Dataframe :
         Name  Age    Stream  Percentage
0      Ankit   21      Math          88
1       Amit   19  Commerce          92
2  Aishwarya   20      Arts          95
3   Priyanka   18   Biology          70


#### Method #1 : Using the index attribute of the DataFrame

In [9]:
help(pd.DataFrame.index)

Help on AxisProperty:

    The index (row labels) of the DataFrame.



In [15]:
print(type(df.index))
print(df.index)
for i in df.index:
    print(i)

<class 'pandas.core.indexes.range.RangeIndex'>
RangeIndex(start=0, stop=4, step=1)
0
1
2
3


In [7]:
print("Iterating over rows using index attribute :\n")

for index in df.index:
    print(df["Name"][index], df["Stream"][index])

Iterating over rows using index attribute :

Ankit Math
Amit Commerce
Aishwarya Arts
Priyanka Biology
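
The loop above uses chained indexing (`df["Name"][index]`), which selects a column first and then a label. As a hedged alternative, the sketch below does the same lookup in one step with `DataFrame.at`, the label-based scalar accessor in pandas. The `students` frame and the `pairs` list are illustrative names, not part of the original notebook.

```python
import pandas as pd

# Illustrative data mirroring two columns of the notebook's DataFrame
students = pd.DataFrame({
    "Name": ["Ankit", "Amit", "Aishwarya", "Priyanka"],
    "Stream": ["Math", "Commerce", "Arts", "Biology"],
})

pairs = []
for index in students.index:
    # .at takes a (row label, column label) pair and returns a single value
    pairs.append((students.at[index, "Name"], students.at[index, "Stream"]))

for name, stream in pairs:
    print(name, stream)
```

Iterating over `students.index` rather than `range(len(students))` keeps the loop correct if the index is not the default 0..n-1 range.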


#### Method #2 : Using the loc[] indexer of the DataFrame

In [None]:
# Refer to help doc for more detailed examples.
# It also supports MultiIndex lookups with tuple labels.
help(pd.DataFrame.loc)

In [21]:
# Single label for row and column.
# loc is label-based, so iterate over the index labels rather than range(len(df)).
for index in df.index:
    print(df.loc[index, "Name"], df.loc[index, "Age"])

Ankit 21
Amit 19
Aishwarya 20
Priyanka 18


In [22]:
# Single label. Note this returns the row as a Series.
print(df.loc[1])

Name              Amit
Age                 19
Stream        Commerce
Percentage          92
Name: 1, dtype: object


In [24]:
# List of labels. Note using ``[[]]`` returns a DataFrame
partial_df = df.loc[[1, 2]]
print(partial_df)

        Name  Age    Stream  Percentage
1       Amit   19  Commerce          92
2  Aishwarya   20      Arts          95


In [30]:
# Slice with labels for both rows and columns. Unlike ordinary Python
#  slices, both the start and the stop of a loc slice are included.

print(df.loc[1:3, "Name":"Stream"])

        Name  Age    Stream
1       Amit   19  Commerce
2  Aishwarya   20      Arts
3   Priyanka   18   Biology
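
To make the inclusive-endpoint behaviour concrete, here is a small sketch that compares a label slice through `loc` with the same numbers used as a positional slice through `iloc` (covered under Method #3). The `frame`, `by_label` and `by_position` names are made up for this example.

```python
import pandas as pd

# Illustrative frame with the default 0..3 index
frame = pd.DataFrame({
    "Name": ["Ankit", "Amit", "Aishwarya", "Priyanka"],
    "Age": [21, 19, 20, 18],
})

by_label = frame.loc[1:3]      # label slice: the stop label 3 is included
by_position = frame.iloc[1:3]  # positional slice: the stop position 3 is excluded

print(len(by_label), len(by_position))
print(by_label["Name"].tolist())
print(by_position["Name"].tolist())
```

With the default index the two slices look alike, so the difference in row count is easy to miss.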


In [32]:
# Boolean list with the same length as the row axis
print(df.loc[[False, False, False, True]])

       Name  Age   Stream  Percentage
3  Priyanka   18  Biology          70


In [33]:
# Conditional that returns a boolean Series
print(df.loc[df.Age > 18])

        Name  Age    Stream  Percentage
0      Ankit   21      Math          88
1       Amit   19  Commerce          92
2  Aishwarya   20      Arts          95


In [35]:
# Conditional that returns a boolean Series with column labels specified
print(df.loc[df.Age > 18, "Name"])

0        Ankit
1         Amit
2    Aishwarya
Name: Name, dtype: object


#### Method #3 : Using the iloc[] indexer of the DataFrame

In [None]:
# Refer to help doc for more detailed examples.
help(pd.DataFrame.iloc)

In [40]:
for row_index in range(len(df)):
    row_string = ""
    # Loop over column positions (len(df.columns)), not over the row index
    for column_index in range(len(df.columns)):
        row_string += str(df.iloc[row_index, column_index]) + " "
    print(row_string)

Ankit 21 Math 88 
Amit 19 Commerce 92 
Aishwarya 20 Arts 95 
Priyanka 18 Biology 70 
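
Because `iloc` works on positions, the same nested loop keeps working when the index holds something other than 0..n-1. The sketch below assumes a hypothetical `scores` frame indexed by name and uses `DataFrame.shape` to get the row and column counts.

```python
import pandas as pd

# Hypothetical frame whose index holds names instead of 0..n-1
scores = pd.DataFrame(
    {"Age": [21, 19], "Percentage": [88, 92]},
    index=["Ankit", "Amit"],
)

n_rows, n_cols = scores.shape
lines = []
for row_pos in range(n_rows):
    # iloc addresses cells by (row position, column position)
    values = [str(scores.iloc[row_pos, col_pos]) for col_pos in range(n_cols)]
    lines.append(" ".join(values))

print("\n".join(lines))
```

With this index, a label lookup such as `scores.loc[0, "Age"]` would raise a `KeyError`, since `0` is not one of the row labels.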


#### Method #4 : Using the iterrows() method of the DataFrame

In [None]:
# Refer to help doc for more detailed examples.

# iterrows(self) -> 'Iterable[Tuple[Label, Series]]'
#     Iterate over DataFrame rows as (index, Series) pairs.
help(pd.DataFrame.iterrows)

In [49]:
for index, row in df.iterrows():
    print(row["Name"], row["Age"])

Ankit 21
Amit 19
Aishwarya 20
Priyanka 18
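
One thing to keep in mind: since `iterrows()` hands back each row as a Series, the values in a row share one dtype, so column dtypes are not necessarily preserved. A minimal sketch, assuming a made-up `mixed` frame with one integer and one float column:

```python
import pandas as pd

# Made-up frame: one integer column, one float column
mixed = pd.DataFrame({"count": [1, 2], "ratio": [0.5, 1.5]})

# Take only the first (label, row) pair from the iterator
first_label, first_row = next(mixed.iterrows())

print(mixed["count"].dtype)   # dtype of the column in the frame
print(first_row.dtype)        # single dtype shared by the row Series
print(type(first_row["count"]))
```

If the original dtypes matter, `itertuples()` (Method #5) is usually the safer choice.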


#### Method #5 : Using the itertuples() method of the DataFrame

In [None]:
# Refer to help doc for more detailed examples.

# itertuples(self, index: 'bool' = True, name: 'Optional[str]' = 'Pandas')
#     Iterate over DataFrame rows as namedtuples.

#     Parameters
#     ----------
#     index : bool, default True
#         If True, return the index as the first element of the tuple.
#     name : str or None, default "Pandas"
#         The name of the returned namedtuples or None to return regular
#         tuples.
#
#     Returns
#     -------
#     iterator
#         An object to iterate over namedtuples for each row in the
#         DataFrame with the first field possibly being the index and
#         following fields being the column values.
help(pd.DataFrame.itertuples)

In [53]:
for student in df.itertuples(index=False, name="Student"):
    print(student.Name, student.Age)

Ankit 21
Amit 19
Aishwarya 20
Priyanka 18
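
The `index` and `name` parameters described in the help text change the shape of each tuple. The sketch below, using an illustrative `roster` frame, shows the defaults (index included as the `Index` field) next to `name=None`, which yields plain tuples.

```python
import pandas as pd

# Illustrative two-row frame
roster = pd.DataFrame({"Name": ["Ankit", "Amit"], "Age": [21, 19]})

# Defaults: index=True, name="Pandas"; the index is the first field, "Index"
rows = list(roster.itertuples())
first = rows[0]
print(first._fields)
print(first.Index, first.Name, first.Age)

# name=None returns regular tuples; index=False drops the index element
plain = list(roster.itertuples(index=False, name=None))
print(plain)
```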


#### Method #6 : Using the apply() method of the DataFrame

In [None]:
# Refer to help doc for more detailed examples.

help(pd.DataFrame.apply)

In [58]:
# axis=0 passes each column to the function; col[1] is the value at row label 1
print(df.apply(lambda col: col[1], axis=0))

Name              Amit
Age                 19
Stream        Commerce
Percentage          92
dtype: object


In [60]:
# axis=1 passes each row to the function
print(df.apply(lambda row: row["Name"] + " " + str(row["Percentage"]), axis=1))

0        Ankit 88
1         Amit 92
2    Aishwarya 95
3     Priyanka 70
dtype: object
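
For a simple string combination like this one, whole-column operations can often replace the per-row `apply`, which is the general direction the Stack Overflow thread linked at the top points to. A hedged sketch with a made-up `marks` frame; whether it is faster depends on the data and the operation:

```python
import pandas as pd

# Made-up frame with the two columns used in the apply() example
marks = pd.DataFrame({"Name": ["Ankit", "Amit"], "Percentage": [88, 92]})

# Convert the numeric column to strings, then concatenate column-wise
labels = marks["Name"] + " " + marks["Percentage"].astype(str)

print(labels.tolist())
```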
