# Iteration in Pandas

### 1. Iterating over a DataFrame

In [1]:
import pandas as pd
import numpy as np

N = 20
df = pd.DataFrame({
    'A': pd.date_range(start='2016-01-01', periods=N, freq='D'),
    'x': np.linspace(0, stop=N-1, num=N),
    'y': np.random.rand(N),
    'C': np.random.choice(['low', 'Medium', 'High'], N).tolist(),
    'D': np.random.normal(100, 10, size=N).tolist()
})

# Iterating over a DataFrame directly yields its column labels
for col in df:
    print(col)

A
x
y
C
D


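## To iterate over the columns or rows of a DataFrame, we can use:
- iteritems() -- iterates over columns as (key, value) pairs; named items() in pandas 2.0 and later
- iterrows() -- iterates over rows as (index, Series) pairs
- itertuples() -- iterates over rows as namedtuples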

### 2. iteritems()
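iteritems() iterates over the columns as (key, value) pairs, with the column label as the key and the column values as a Series object. It was deprecated in pandas 1.5 and removed in pandas 2.0; items() behaves the same way and works in both older and current versions.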

In [9]:
import pandas as pd
import numpy as np

df = pd.DataFrame(np.arange(0, 12).reshape(4, 3), columns=['col1', 'col2', 'col3'])
print(df, '\n')
# items() is the current name for iteritems(), which was removed in pandas 2.0
for key, value in df.items():
    print(key, value, '\n')

   col1  col2  col3
0     0     1     2
1     3     4     5
2     6     7     8
3     9    10    11 

col1 0    0
1    3
2    6
3    9
Name: col1, dtype: int32 

col2 0     1
1     4
2     7
3    10
Name: col2, dtype: int32 

col3 0     2
1     5
2     8
3    11
Name: col3, dtype: int32 
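As a small illustration of working with the (key, value) pairs, the sketch below collects one summary number per column. The name `column_sums` is made up for this example, and in real code `df.sum()` gives the same result without an explicit loop.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(np.arange(0, 12).reshape(4, 3), columns=['col1', 'col2', 'col3'])

# Each iteration yields the column label and the column as a Series
column_sums = {}
for key, value in df.items():
    column_sums[key] = int(value.sum())

print(column_sums)
```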



### 3. iterrows()
iterrows() returns an iterator that yields each index value together with a Series containing the data of that row.

In [10]:
import pandas as pd
import numpy as np

df = pd.DataFrame(np.arange(0, 12).reshape(4, 3), columns=['col1', 'col2', 'col3'])
print(df, '\n')
# Each iteration yields the row's index label and the row as a Series
for row_index, row in df.iterrows():
    print(row_index, row)

   col1  col2  col3
0     0     1     2
1     3     4     5
2     6     7     8
3     9    10    11 

0 col1    0
col2    1
col3    2
Name: 0, dtype: int32
1 col1    3
col2    4
col3    5
Name: 1, dtype: int32
2 col1    6
col2    7
col3    8
Name: 2, dtype: int32
3 col1     9
col2    10
col3    11
Name: 3, dtype: int32


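**Note**: Because iterrows() returns each row as a Series, it does not preserve dtypes across the row; values from columns with different dtypes are converted to a single common dtype. In the output above, 0, 1, 2, 3 are the row index labels and col1, col2, col3 are the column labels.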
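The dtype behaviour is easiest to see with columns of different types. The following is a minimal sketch with an invented two-column frame (`n` holds integers, `x` holds floats); it is not part of the original notebook.

```python
import pandas as pd

# Hypothetical frame: one integer column and one float column
df = pd.DataFrame({'n': [1, 2], 'x': [1.5, 2.5]})
print(df.dtypes)

# iterrows() packs each row into one Series, so the integer is upcast to float
_, first_row = next(df.iterrows())
print(first_row.dtype)
print(type(first_row['n']))

# itertuples() keeps each value in its own field, so no upcasting takes place
first_tuple = next(df.itertuples(index=False))
print(type(first_tuple.n))
```

If the original types matter, itertuples() is usually the safer choice.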

### 4. itertuples()
itertuples() returns an iterator that yields a named tuple for each row of the DataFrame. The first element of the tuple is the row's index value, and the remaining elements are the row values.

In [11]:
import pandas as pd
import numpy as np

df = pd.DataFrame(np.arange(0, 12).reshape(4, 3), columns=['col1', 'col2', 'col3'])
print(df, '\n')
# Each row is a namedtuple; the default type name is "Pandas"
for row in df.itertuples():
    print(row)

   col1  col2  col3
0     0     1     2
1     3     4     5
2     6     7     8
3     9    10    11 

Pandas(Index=0, col1=0, col2=1, col3=2)
Pandas(Index=1, col1=3, col2=4, col3=5)
Pandas(Index=2, col1=6, col2=7, col3=8)
Pandas(Index=3, col1=9, col2=10, col3=11)
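The fields of each named tuple can be read by attribute. The sketch below also uses the `index` and `name` parameters of itertuples() to drop the index field and rename the tuple type; `Row` and `totals` are names chosen only for this example.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(np.arange(0, 12).reshape(4, 3), columns=['col1', 'col2', 'col3'])

totals = []
# index=False omits the index field; name='Row' sets the namedtuple's type name
for row in df.itertuples(index=False, name='Row'):
    totals.append(int(row.col1 + row.col3))

print(totals)
```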


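#### Note: Iteration is meant for reading data. Do not try to modify any object while iterating over it: depending on the dtypes, the iterator returns a copy of the original data rather than a view, so writes to it may have no effect.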
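A minimal sketch of this pitfall, using an invented frame with mixed dtypes so that each row is certain to be a copy. Assigning to the row inside the loop leaves the DataFrame untouched, whereas assigning to the column directly does change it.

```python
import pandas as pd

# Hypothetical data: a string column and an integer column
df = pd.DataFrame({'name': ['a', 'b'], 'score': [1, 2]})

for _, row in df.iterrows():
    row['score'] = row['score'] * 10  # changes only the temporary Series

unchanged = df['score'].tolist()
print(unchanged)

# Assign to the column itself to update the DataFrame
df['score'] = df['score'] * 10
changed = df['score'].tolist()
print(changed)
```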