# Iteration

The behavior of basic iteration over pandas objects depends on the type. When iterating over a Series, it is regarded as array-like, and basic iteration produces the values. Other data structures, like DataFrame and Panel, follow the dict-like convention of iterating over the keys of the objects. (Panel was removed in pandas 0.25, so it applies only to older versions.)

In short, basic iteration (for i in object) produces:

- Series − values
- DataFrame − column labels
- Panel − item labels
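
As a minimal sketch of the difference, the snippet below iterates over a small Series and a small DataFrame. The names s and frame and their contents are hypothetical, chosen only for illustration.

```python
import pandas as pd

# Hypothetical sample data, chosen only for illustration.
s = pd.Series([10, 20, 30], index=["a", "b", "c"])
frame = pd.DataFrame({"col1": [1, 2], "col2": [3, 4]})

# Iterating a Series yields its values (not its index labels).
series_items = [value for value in s]

# Iterating a DataFrame yields its column labels (not its rows).
frame_items = [label for label in frame]

print(series_items)
print(frame_items)
```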

## Iterating a DataFrame
Iterating over a DataFrame yields its column names, as the following example shows.

In [2]:
import pandas as pd
import numpy as np
 
N=20
df = pd.DataFrame({
   'A': pd.date_range(start='2016-01-01',periods=N,freq='D'),
   'x': np.linspace(0,stop=N-1,num=N),
   'y': np.random.rand(N),
   'C': np.random.choice(['Low','Medium','High'],N).tolist(),
   'D': np.random.normal(100, 10, size=(N)).tolist()
   })

df.head()

|   | A | x | y | C | D |
|---|---|---|---|---|---|
| 0 | 2016-01-01 | 0.0 | 0.642738 | Low | 93.215327 |
| 1 | 2016-01-02 | 1.0 | 0.174944 | Medium | 101.398828 |
| 2 | 2016-01-03 | 2.0 | 0.479303 | High | 100.732741 |
| 3 | 2016-01-04 | 3.0 | 0.9417 | High | 97.458413 |
| 4 | 2016-01-05 | 4.0 | 0.469969 | Medium | 107.346749 |


In [7]:
for col in df:
    print(col)
    
# DF iterates through the column names

A
x
y
C
D


# Methods to iterate through a DataFrame
To iterate over the columns or rows of a DataFrame, we can use the following methods:

- iteritems() − iterate over the columns as (label, Series) pairs; removed in pandas 2.0, where items() replaces it
- iterrows() − iterate over the rows as (index, Series) pairs
- itertuples() − iterate over the rows as namedtuples

# 1. iteritems() 
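Iterates over the columns as (key, value) pairs, with the column label as the key and the column values as a Series object. iteritems() was deprecated in pandas 1.5 and removed in 2.0; on current versions, call items(), which yields the same pairs.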

In [9]:
df = pd.DataFrame(np.random.randn(4,3),columns=['col1','col2','col3'])
df

|   | col1 | col2 | col3 |
|---|---|---|---|
| 0 | 0.67193 | -0.285315 | -1.131972 |
| 1 | 0.929258 | 0.361816 | -1.079748 |
| 2 | -0.876544 | -1.169859 | 0.083812 |
| 3 | 1.332908 | -0.509854 | -0.90043 |


In [11]:
# items() replaces iteritems(), which was removed in pandas 2.0
for col, values in df.items():
    print(col, values)

# TEMPLATE
#for key, value in df.items():
#    print(key, value)

col1 0    0.671930
1    0.929258
2   -0.876544
3    1.332908
Name: col1, dtype: float64
col2 0   -0.285315
1    0.361816
2   -1.169859
3   -0.509854
Name: col2, dtype: float64
col3 0   -1.131972
1   -1.079748
2    0.083812
3   -0.900430
Name: col3, dtype: float64


Observe that each column is yielded separately as a (label, Series) pair.
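
On pandas 2.0 and later, where iteritems() is no longer available, the same column-wise loop can be written with items(). The sketch below assumes a small integer DataFrame named demo (a hypothetical name) so that the per-column results are easy to check by hand.

```python
import numpy as np
import pandas as pd

# Hypothetical 4x3 frame holding the integers 0..11, filled row by row.
demo = pd.DataFrame(np.arange(12).reshape(4, 3), columns=["col1", "col2", "col3"])

# items() yields one (column label, Series) pair per column.
column_sums = {}
for label, column in demo.items():
    column_sums[label] = int(column.sum())

print(column_sums)
```

For a simple aggregate like this, demo.sum() gives the same totals without an explicit loop; the loop is shown only to illustrate what each iteration receives.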

# 2. iterrows()
iterrows() returns an iterator that yields each index value along with a Series containing the data in that row.

In [12]:
df = pd.DataFrame(np.random.randn(4,3),columns = ['col1','col2','col3'])
df

|   | col1 | col2 | col3 |
|---|---|---|---|
| 0 | 1.251304 | 0.788113 | 0.658501 |
| 1 | -1.405875 | -1.319791 | -0.880908 |
| 2 | 0.470828 | -1.093917 | -0.855566 |
| 3 | -0.306974 | -0.716046 | 0.206188 |


In [13]:
for row_index,values in df.iterrows():
    print(row_index,values)

0 col1    1.251304
col2    0.788113
col3    0.658501
Name: 0, dtype: float64
1 col1   -1.405875
col2   -1.319791
col3   -0.880908
Name: 1, dtype: float64
2 col1    0.470828
col2   -1.093917
col3   -0.855566
Name: 2, dtype: float64
3 col1   -0.306974
col2   -0.716046
col3    0.206188
Name: 3, dtype: float64


**Note** − Because iterrows() returns each row as a Series, it does not preserve the data types across the row: the values are upcast to a single common dtype. In the output above, 0, 1, 2, 3 are the row index values, and col1, col2, col3 are the column labels.
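
The dtype issue is easier to see with columns of different types. The following is a minimal sketch, assuming a hypothetical DataFrame named mixed with one integer and one float column.

```python
import pandas as pd

# Hypothetical frame with an integer column and a float column.
mixed = pd.DataFrame({"qty": [1, 2], "ratio": [0.5, 1.5]})

# iterrows() packs each row into a single Series, so the integer
# value is upcast to float to share one dtype with "ratio".
_, first_row = next(mixed.iterrows())
row_dtype = first_row.dtype
original_dtype = mixed["qty"].dtype

# itertuples() does not build a Series per row, so the integer
# value is not converted to float.
first_tuple = next(mixed.itertuples(index=False))

print(row_dtype, original_dtype)
print(first_tuple)
```

When the original types matter, itertuples() is usually the safer choice for row-wise loops.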


# 3. itertuples()
The itertuples() method returns an iterator that yields a named tuple for each row in the DataFrame. The first element of the tuple is the row's index value, and the remaining elements are the row values.

In [15]:
for row in df.itertuples():
    print(row)

Pandas(Index=0, col1=1.251304499686695, col2=0.7881133100243172, col3=0.6585012373382707)
Pandas(Index=1, col1=-1.405875315951901, col2=-1.3197907550357248, col3=-0.8809080556903238)
Pandas(Index=2, col1=0.4708277692341507, col2=-1.0939168115852627, col3=-0.8555660760366817)
Pandas(Index=3, col1=-0.30697415101770115, col2=-0.7160457805585031, col3=0.20618770478421672)
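
itertuples() also accepts the index and name parameters, which control whether the index is included and what the tuple type is called. The sketch below uses a hypothetical DataFrame named scores to show both, along with access to fields by column name.

```python
import pandas as pd

# Hypothetical two-row frame used only for illustration.
scores = pd.DataFrame({"col1": [1.5, 2.5], "col2": [10, 20]})

# index=False drops the leading Index field;
# name sets the type name of the namedtuple.
rows = list(scores.itertuples(index=False, name="Row"))
first = rows[0]
print(first.col1, first.col2)  # fields are accessible by column name

# name=None yields plain tuples instead of namedtuples.
plain = list(scores.itertuples(index=False, name=None))
print(plain)
```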


**Note** − Do not try to modify any object while iterating over it. Iteration is meant for reading. Depending on the data types, the iterator may return a copy rather than a view of the original object, in which case the changes are not reflected in the original object.

### Iterating and changing values

In [24]:
df

|   | col1 | col2 | col3 |
|---|---|---|---|
| 0 | 1.251304 | 0.788113 | 0.658501 |
| 1 | -1.405875 | -1.319791 | -0.880908 |
| 2 | 0.470828 | -1.093917 | -0.855566 |
| 3 | -0.306974 | -0.716046 | 0.206188 |


In [31]:
for index,row_values in df.iterrows():
    row_values['col1'] = 10
    print(index,row_values)

0 col1    10.000000
col2     0.788113
col3     0.658501
Name: 0, dtype: float64
1 col1    10.000000
col2    -1.319791
col3    -0.880908
Name: 1, dtype: float64
2 col1    10.000000
col2    -1.093917
col3    -0.855566
Name: 2, dtype: float64
3 col1    10.000000
col2    -0.716046
col3     0.206188
Name: 3, dtype: float64
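
The printed rows show the new value, but that alone does not tell us whether the DataFrame itself changed. The sketch below, which assumes a hypothetical mixed-dtype DataFrame named data, checks the original after the loop and then shows direct assignment as one way to make the update stick. With columns of a single dtype the outcome can differ between pandas versions, which is one more reason not to rely on writing through the iterator.

```python
import pandas as pd

# Hypothetical frame with mixed dtypes, so each row handed out
# by iterrows() is built from a copy of the data.
data = pd.DataFrame({"col1": [1, 2, 3], "label": ["a", "b", "c"]})

for _, row in data.iterrows():
    row["col1"] = 10  # changes only the temporary row Series

unchanged = data["col1"].tolist()
print(unchanged)  # the original column still holds its old values

# To actually update the DataFrame, assign to it directly.
data["col1"] = 10
updated = data["col1"].tolist()
print(updated)
```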

