**<h1><center>Python Pandas - Iteration</center></h1>**

Pandas data structures behave differently when iterated. Iterating over a Series behaves like iterating over an array and produces its values. Iterating over a DataFrame behaves like iterating over a dictionary and produces its keys, which are the column labels.

Basic iteration (for i in object) produces −

1. Series − values

2. DataFrame − column labels
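
As a quick check of the two rules above, the following sketch collects what a plain loop would see for each structure. The variable names and sample data are made up for illustration.

```python
import pandas as pd

# Illustrative data; the names s and df are assumptions for this sketch
s = pd.Series([10, 20, 30], index=["a", "b", "c"])
df = pd.DataFrame({"x": [1, 2], "y": [3, 4]})

# list() consumes the same iterator that a for loop would use
series_items = list(s)   # the values of the Series, not its index
frame_items = list(df)   # the column labels of the DataFrame

print(series_items)
print(frame_items)
```

Note that the Series index ("a", "b", "c") does not appear when iterating; only the values do.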


<u>**Iteration over series**</u>

Iterating over a Series produces its values.

In [None]:
import pandas as pd

list_values = [1,2,3,4,5]

series_values = pd.Series(list_values)

print(series_values)

for value in series_values:
    print("printing values", value)

0    1
1    2
2    3
3    4
4    5
dtype: int64
printing values 1
printing values 2
printing values 3
printing values 4
printing values 5


<u>**Iterating a DataFrame**</u>

Iterating over a DataFrame produces its column names, as the following example shows.

In [None]:
import pandas as pd
import numpy as np

N=20
df = pd.DataFrame({
   'A': pd.date_range(start='2016-01-01',periods=N,freq='D'),
   'x': np.linspace(0,stop=N-1,num=N),
   'y': np.random.rand(N),
   'D': np.random.normal(100, 10, size=(N)).tolist()
   })

for col in df:
    print(col)

A
x
y
D
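
Because the loop yields column labels, each label can be used to look up the matching column. The sketch below assumes a smaller DataFrame than the one above and builds a mapping from label to dtype.

```python
import numpy as np
import pandas as pd

# A reduced version of the example frame (3 rows, 2 columns)
df = pd.DataFrame({
    "A": pd.date_range(start="2016-01-01", periods=3, freq="D"),
    "x": np.linspace(0, 2, num=3),
})

# Iterating df yields labels; df[col] retrieves the column as a Series
column_dtypes = {col: str(df[col].dtype) for col in df}
print(column_dtypes)
```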


To iterate over the columns or rows of a DataFrame, we can use the following methods −

- items() (named iteritems() before pandas 2.0) − iterate over the columns as (column label, Series) pairs

- iterrows() − iterate over the rows as (index,series) pairs

- itertuples() − iterate over the rows as namedtuples
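
To see what each method yields side by side, here is a minimal sketch on a small frame with fixed values. It uses items() so that it also runs on current pandas versions; the variable names are illustrative.

```python
import pandas as pd

df = pd.DataFrame({"col1": [1, 2], "col2": [3, 4]})

# items(): one (column label, Series) pair per column
column_labels = [label for label, column in df.items()]

# iterrows(): one (index label, Series) pair per row
row_labels = [index for index, row in df.iterrows()]

# itertuples(): one namedtuple per row, with the index as first field
row_tuples = [tuple(row) for row in df.itertuples()]

print(column_labels)
print(row_labels)
print(row_tuples)
```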
    
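<u>**items() (formerly iteritems())**</u>

items() produces (key, value) pairs in which the key is a column label and the value is that column as a Series. The older name iteritems() was deprecated in pandas 1.5 and removed in pandas 2.0, so the first example below runs only on older versions; the second shows the same loop with items().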

In [None]:
import pandas as pd
import numpy as np

df = pd.DataFrame(np.random.randn(4,3),columns=['col1','col2','col3'])

print(df,"\n")

# iteritems() works only on pandas < 2.0; on newer versions use items()
for key,value in df.iteritems():
    print("key as column", key)
    print("value as series object\n", value, "\n")
    print([i for i in value])

       col1      col2      col3
0 -0.080511  0.292706 -0.366647
1 -1.072506  1.247320 -1.635414
2  0.092438 -0.369984  2.307514
3 -0.673693 -0.881343  1.701580 

key as column col1
value as series object
 0   -0.080511
1   -1.072506
2    0.092438
3   -0.673693
Name: col1, dtype: float64 

[-0.08051139420045698, -1.072505650592466, 0.09243780939818382, -0.6736934665097092]
key as column col2
value as series object
 0    0.292706
1    1.247320
2   -0.369984
3   -0.881343
Name: col2, dtype: float64 

[0.2927059253556793, 1.247319690784595, -0.3699839317457915, -0.8813434778556513]
key as column col3
value as series object
 0   -0.366647
1   -1.635414
2    2.307514
3    1.701580
Name: col3, dtype: float64 

[-0.36664684680684234, -1.6354139099701293, 2.3075143905804225, 1.7015797645550859]


In [None]:
import pandas as pd
import numpy as np

df = pd.DataFrame(np.random.randn(4,3),columns=['col1','col2','col3'])

print(df,"\n")

for key,value in df.items():
    print("key as column", key)
    print("value as series object\n", value, "\n")
    print([i for i in value])

       col1      col2      col3
0 -0.105143  0.196275  0.153485
1  0.557227 -0.057168 -0.423523
2  0.186203  1.340924  0.844358
3 -0.061546  0.025585  1.012121 

key as column col1
value as series object
 0   -0.105143
1    0.557227
2    0.186203
3   -0.061546
Name: col1, dtype: float64 

[-0.10514323678851252, 0.557227062231512, 0.18620269226738875, -0.06154619220663479]
key as column col2
value as series object
 0    0.196275
1   -0.057168
2    1.340924
3    0.025585
Name: col2, dtype: float64 

[0.19627518369362232, -0.05716782059311099, 1.3409240308813513, 0.025584847208424282]
key as column col3
value as series object
 0    0.153485
1   -0.423523
2    0.844358
3    1.012121
Name: col3, dtype: float64 

[0.15348507414468263, -0.4235226138798879, 0.8443580046779426, 1.01212088200572]
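
One possible use of the (label, Series) pairs is to build a per-column summary. The sketch below assumes fixed sample values so the result is reproducible, and stores each column's mean under its label.

```python
import pandas as pd

df = pd.DataFrame({"col1": [1.0, 2.0, 3.0], "col2": [10.0, 20.0, 30.0]})

column_means = {}
for label, column in df.items():
    # column is a Series, so Series methods such as mean() are available
    column_means[label] = column.mean()

print(column_means)
```

For a simple aggregate like this, df.mean() gives the same numbers without an explicit loop; the loop form is mainly useful when each column needs custom handling.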


<u>**iterrows()**</u>

iterrows() yields each row as an (index, Series) pair, where the Series holds the row's values labelled by column name.

In [3]:
import pandas as pd
import numpy as np

df = pd.DataFrame(np.random.randn(4,3),
                  columns = ['col1','col2','col3'])

print(df)

for row_index,row in df.iterrows():
    print("row index",row_index)
    print("row values\n",row)
    print(row['col1'])  # a specific column value can be read by label
    print(type(row))

       col1      col2      col3
0 -0.744555 -0.020344 -2.009734
1 -0.823350  0.979995 -0.951101
2  0.081841 -1.153763 -1.031191
3 -0.581035  1.310333  0.111782
row index 0
row values
 col1   -0.744555
col2   -0.020344
col3   -2.009734
Name: 0, dtype: float64
-0.7445554566611337
<class 'pandas.core.series.Series'>
row index 1
row values
 col1   -0.823350
col2    0.979995
col3   -0.951101
Name: 1, dtype: float64
-0.8233497307733929
<class 'pandas.core.series.Series'>
row index 2
row values
 col1    0.081841
col2   -1.153763
col3   -1.031191
Name: 2, dtype: float64
0.0818412622193199
<class 'pandas.core.series.Series'>
row index 3
row values
 col1   -0.581035
col2    1.310333
col3    0.111782
Name: 3, dtype: float64
-0.5810349166513812
<class 'pandas.core.series.Series'>
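
One consequence of receiving each row as a Series is that a row has a single dtype, so column dtypes are not necessarily preserved. The sketch below assumes a frame with one integer and one float column (the column names are illustrative) and inspects the first row.

```python
import pandas as pd

df = pd.DataFrame({"count": [1, 2], "price": [1.5, 2.5]})
print(df.dtypes)  # "count" is an integer column, "price" is a float column

# Take only the first (index, Series) pair from the iterator
first_index, first_row = next(df.iterrows())

# The row Series must hold both values, so the integer is upcast to float
print(first_row.dtype)
print(first_row["count"])
```

When the original dtypes matter, itertuples() is usually the safer choice because it does not pack each row into a Series.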


<u>**itertuples()**</u>

itertuples() yields a namedtuple for each row of the DataFrame. The first element of the tuple is the row’s index value, and the remaining elements are the row’s values.

In [6]:
import pandas as pd
import numpy as np

df = pd.DataFrame(np.random.randn(4,3),
                  columns = ['col1','col2','col3'])

for row in df.itertuples():
    print(row, type(row))
    print("index value is ", row[0])
    # row._fields holds the field names; skip the first one ("Index")
    for name, value in zip(row._fields[1:], row[1:]):
        print("{} value is {}".format(name, value))
    print("\n")

Pandas(Index=0, col1=-0.32579076991670974, col2=0.30886983003191043, col3=-0.740947124876969) <class 'pandas.core.frame.Pandas'>
index value is  0
col1 value is -0.32579076991670974
col2 value is 0.30886983003191043
col3 value is -0.740947124876969


Pandas(Index=1, col1=-0.9133208493075972, col2=0.9063908445381984, col3=-1.6038073000251714) <class 'pandas.core.frame.Pandas'>
index value is  1
col1 value is -0.9133208493075972
col2 value is 0.9063908445381984
col3 value is -1.6038073000251714


Pandas(Index=2, col1=-0.6358277015857411, col2=0.8497430036431264, col3=-0.6116526661575516) <class 'pandas.core.frame.Pandas'>
index value is  2
col1 value is -0.6358277015857411
col2 value is 0.8497430036431264
col3 value is -0.6116526661575516


Pandas(Index=3, col1=-0.32544136845331956, col2=-0.8798237140633848, col3=1.1123385878269256) <class 'pandas.core.frame.Pandas'>
index value is  3
col1 value is -0.32544136845331956
col2 value is -0.8798237140633848
col3 value is 1.1123385878269256
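
Since each row is a namedtuple, fields can also be read by name, and the `index` and `name` parameters of itertuples() control the shape of what is returned. A minimal sketch with fixed values:

```python
import pandas as pd

df = pd.DataFrame({"col1": [1, 2], "col2": [3, 4]})

# Fields can be read by attribute name instead of by position
col1_values = [row.col1 for row in df.itertuples()]

# index=False drops the index field; name=None yields plain tuples
plain_rows = list(df.itertuples(index=False, name=None))

print(col1_values)
print(plain_rows)
```

Attribute access assumes the column names are valid Python identifiers; otherwise pandas renames the fields positionally.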


