## Introduction to Pandas

- pandas overview
- pandas DataFrame
- how to create DataFrame
- accessing and modifying data

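Pandas is a Python library for data analysis. Data is often handled in tabular form, such as a relational table in SQL or a spreadsheet in Excel. Pandas provides functions for loading, processing, computing on, and analyzing tabular data, and it works with matplotlib and seaborn for data visualization. The main pandas data structure is the DataFrame, a two-dimensional tabular structure in which each column holds data of a single type. A DataFrame consists of row labels, column labels (which can be thought of as column names), and data items. The following is a typical DataFrame example that describes student data.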

|     | sid | name | phone | AI | datastructure | python |
|----|----|----|-----|-----:|-------:|----:|
| 101 | 11375001 | 李大明 | 0922555123 | 66 | 86 | 69 |
| 102 | 11375010 | 陳傑憲 | 0919123456 | 96 | 88 | 77 |
| 103 | 11375022 | 林玉明 | 0955235743 | 88 | 77 | 66 |
| 104 | 11375055 | 姜昆雨 | 0931239097 | 77 | 87 | 98 |
| 105 | 11375199 | 陳辰威 | 0932098543 | 67 | 78 | 89 |

- row labels: 101, 102, 103, ..., 105
- column labels: sid, name, phone, AI, datastructure, python
- data items: the contents of each row (record)

<img src="dataframe1.png" width="600" height="400"/>

- pandas is suited to processing tabular data (data tables)
- a pandas table consists of row labels, column labels, and data items
- a DataFrame can be created from a dictionary, a list, or a tabular file or source (e.g. .csv, SQL, Excel, ...)

In [1]:
# create table with dictionary
student = {
    'sid': ['11375001', '11375010', '11375022', '11375055', '11375199'],
    'name': [ '李大明', '陳傑憲', '林玉明', '姜昆雨', '陳辰威'],
    'phone': ['0922555123', '0919123456', '0955235743', '0931239097', '0932098543'],
    'AI': [66, 96, 88, 77, 67],
    'datastructure': [86, 88, 77, 87, 78],
    'python': [69, 77, 66, 98, 89]
}
row_labels = [101, 102, 103, 104, 105]

In [2]:
#work with pandas by importing pandas
import pandas as pd

In [3]:
#create DataFrame with DataFrame()
df = pd.DataFrame(student, index=row_labels)

In [4]:
df

Unnamed: 0,sid,name,phone,AI,datastructure,python
101,11375001,李大明,922555123,66,86,69
102,11375010,陳傑憲,919123456,96,88,77
103,11375022,林玉明,955235743,88,77,66
104,11375055,姜昆雨,931239097,77,87,98
105,11375199,陳辰威,932098543,67,78,89


In [5]:
# read one column
df['name'] # same as df.name

101    李大明
102    陳傑憲
103    林玉明
104    姜昆雨
105    陳辰威
Name: name, dtype: object

Each column of a DataFrame is a `pandas.Series` object that holds one-dimensional data together with the row labels.
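
As a small illustration of the point above, the sketch below selects one column and inspects the resulting Series. The two-row frame is a hypothetical cut-down version of the student table, not the notebook's `df`.

```python
import pandas as pd

# Hypothetical two-row frame that mirrors part of the student table.
df = pd.DataFrame(
    {"name": ["李大明", "陳傑憲"], "AI": [66, 96]},
    index=[101, 102],
)

col = df["AI"]                 # selecting one column yields a Series
print(type(col).__name__)      # Series
print(col.index.tolist())      # [101, 102]  (the row labels are carried along)
print(col.name)                # AI          (the column label becomes the Series name)
print(col.loc[102])            # 96
```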

In [6]:
# read one row
df.loc[103]

sid                11375022
name                    林玉明
phone            0955235743
AI                       88
datastructure            77
python                   66
Name: 103, dtype: object

The DataFrame's `.loc[x]` indexer reads the row whose row label is x. The result is a `pandas.Series` whose index holds all the column names (column labels).
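
The following sketch, on a hypothetical two-row frame, shows what a row selected with `.loc` looks like: the column labels become the index of the returned Series, and the row label becomes its name. Passing a list of labels is one way to keep the result a DataFrame.

```python
import pandas as pd

# Hypothetical sample data, not the notebook's df.
df = pd.DataFrame(
    {"sid": ["11375001", "11375010"], "AI": [66, 96], "python": [69, 77]},
    index=[101, 102],
)

row = df.loc[102]              # one row, selected by its row label
print(row.index.tolist())      # ['sid', 'AI', 'python']
print(row.name)                # 102
print(row["AI"])               # 96

rows = df.loc[[102]]           # a list of labels keeps the result a DataFrame
print(type(rows).__name__)     # DataFrame
```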

## Creating a DataFrame
- dictionary
- list
- Numpy Array
- files

**When creating a DataFrame, pay attention to the column labels, row labels, and data items**

In [7]:
#create dataframe with list
l = [[1, 2, 100],
     [2, 4, 200],
     [3, 5, 300]]


In [8]:
df1 = pd.DataFrame(l, columns=['x', 'y', 'z']) # add the column names x, y, z

In [9]:
df1

Unnamed: 0,x,y,z
0,1,2,100
1,2,4,200
2,3,5,300


row labels = [0, 1, 2], column labels = ['x', 'y', 'z']

In [10]:
#create dataframe with numpy arrays
import numpy as np
arr = np.array([[1, 2, 200],
               [2, 4, 400],
               [3, 5, 500]])               

In [11]:
df2 = pd.DataFrame(arr, columns=['x', 'y', 'z'])

In [12]:
df2

Unnamed: 0,x,y,z
0,1,2,200
1,2,4,400
2,3,5,500


In [13]:
arr[0,0] = 999 # the array was not copied when df2 was built (copy=False), so modifying it changes df2 too; newer pandas versions with Copy-on-Write may copy instead

In [14]:
df2

Unnamed: 0,x,y,z
0,999,2,200
1,2,4,400
2,3,5,500


In [15]:
l[0][0] = 99 # a list is copied when the DataFrame is built, so df1 is not affected

In [16]:
df1

Unnamed: 0,x,y,z
0,1,2,100
1,2,4,200
2,3,5,300
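
If the DataFrame should never follow later changes to the source array, one option is to request a copy explicitly. The sketch below uses the `copy=True` argument of `pd.DataFrame`; the name `df_copy` is only illustrative.

```python
import numpy as np
import pandas as pd

arr = np.array([[1, 2, 200],
                [2, 4, 400]])

# copy=True asks pandas to store its own copy of the array data
df_copy = pd.DataFrame(arr, columns=["x", "y", "z"], copy=True)

arr[0, 0] = 999                # change the source array afterwards
print(df_copy.loc[0, "x"])     # 1, the DataFrame kept the original value
```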


### Creating a DataFrame from a .csv file

In [17]:
# save dataframe to a csv file
df.to_csv('data.csv')

data.csv

,sid,name,phone,AI,datastructure,python<br>
101,11375001,李大明,0922555123,66,86,69<br>
102,11375010,陳傑憲,0919123456,96,88,77<br>
103,11375022,林玉明,0955235743,88,77,66<br>
104,11375055,姜昆雨,0931239097,77,87,98<br>
105,11375199,陳辰威,0932098543,67,78,89

In [18]:
pd.read_csv('data.csv')

Unnamed: 0.1,Unnamed: 0,sid,name,phone,AI,datastructure,python
0,101,11375001,李大明,922555123,66,86,69
1,102,11375010,陳傑憲,919123456,96,88,77
2,103,11375022,林玉明,955235743,88,77,66
3,104,11375055,姜昆雨,931239097,77,87,98
4,105,11375199,陳辰威,932098543,67,78,89


In [19]:
pd.read_csv("data.csv", index_col=0) # index_col=0 uses the first column as the row labels

Unnamed: 0,sid,name,phone,AI,datastructure,python
101,11375001,李大明,922555123,66,86,69
102,11375010,陳傑憲,919123456,96,88,77
103,11375022,林玉明,955235743,88,77,66
104,11375055,姜昆雨,931239097,77,87,98
105,11375199,陳辰威,932098543,67,78,89
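
Because `read_csv` infers column types, digit-only text such as a phone number is parsed as an integer and loses its leading zero. A minimal sketch of one way around this is to pass `dtype` for those columns. The in-memory `csv_text` below is a stand-in for `data.csv` so the example runs on its own.

```python
import io
import pandas as pd

# In-memory stand-in for data.csv (two rows only)
csv_text = (
    ",sid,name,phone,AI\n"
    "101,11375001,李大明,0922555123,66\n"
    "102,11375010,陳傑憲,0919123456,96\n"
)

inferred = pd.read_csv(io.StringIO(csv_text), index_col=0)
as_text = pd.read_csv(io.StringIO(csv_text), index_col=0,
                      dtype={"sid": str, "phone": str})

print(inferred.loc[101, "phone"])   # 922555123  (parsed as an integer)
print(as_text.loc[101, "phone"])    # 0922555123 (kept as text)
```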


### DataFrame attributes

In [20]:
df.index

Index([101, 102, 103, 104, 105], dtype='int64')

In [21]:
df.columns

Index(['sid', 'name', 'phone', 'AI', 'datastructure', 'python'], dtype='object')

In [22]:
df.columns[1]

'name'

In [23]:
df.index = np.arange(10, 15)

In [24]:
df #note that row labels of df change to 10, 11, 12, 13, 14

Unnamed: 0,sid,name,phone,AI,datastructure,python
10,11375001,李大明,922555123,66,86,69
11,11375010,陳傑憲,919123456,96,88,77
12,11375022,林玉明,955235743,88,77,66
13,11375055,姜昆雨,931239097,77,87,98
14,11375199,陳辰威,932098543,67,78,89


In [25]:
# save dataframe to numpy array
df.to_numpy()

array([['11375001', '李大明', '0922555123', 66, 86, 69],
       ['11375010', '陳傑憲', '0919123456', 96, 88, 77],
       ['11375022', '林玉明', '0955235743', 88, 77, 66],
       ['11375055', '姜昆雨', '0931239097', 77, 87, 98],
       ['11375199', '陳辰威', '0932098543', 67, 78, 89]], dtype=object)

In [26]:
df.values

array([['11375001', '李大明', '0922555123', 66, 86, 69],
       ['11375010', '陳傑憲', '0919123456', 96, 88, 77],
       ['11375022', '林玉明', '0955235743', 88, 77, 66],
       ['11375055', '姜昆雨', '0931239097', 77, 87, 98],
       ['11375199', '陳辰威', '0932098543', 67, 78, 89]], dtype=object)

In [27]:
df.dtypes # read data type of each column

sid              object
name             object
phone            object
AI                int64
datastructure     int64
python            int64
dtype: object

In [28]:
df3 = df.astype(dtype={'AI': np.int32})

In [29]:
df3.dtypes # the AI column's dtype is converted to int32

sid              object
name             object
phone            object
AI                int32
datastructure     int64
python            int64
dtype: object

### dataframe size

In [30]:
df.ndim

2

In [31]:
df.shape # 5 rows by 6 columns

(5, 6)

In [32]:
df.size #total number of elements

30

In [33]:
df['name'].ndim

1

In [34]:
df

Unnamed: 0,sid,name,phone,AI,datastructure,python
10,11375001,李大明,922555123,66,86,69
11,11375010,陳傑憲,919123456,96,88,77
12,11375022,林玉明,955235743,88,77,66
13,11375055,姜昆雨,931239097,77,87,98
14,11375199,陳辰威,932098543,67,78,89


### Reading and modifying DataFrame data

In [35]:
df['name'] #read column data

10    李大明
11    陳傑憲
12    林玉明
13    姜昆雨
14    陳辰威
Name: name, dtype: object

In [36]:
df.loc[12] #read row data

sid                11375022
name                    林玉明
phone            0955235743
AI                       88
datastructure            77
python                   66
Name: 12, dtype: object

`df.iloc[12]` raises `IndexError: single positional indexer is out-of-bounds`, because `.iloc[]` takes integer positions (0 to 4 here), not row labels.

In [37]:
df.iloc[0]

sid                11375001
name                    李大明
phone            0922555123
AI                       66
datastructure            86
python                   69
Name: 10, dtype: object

- **.loc[]**: access data by row label and column label
- **.iloc[]**: access data by integer row position and column position
- **.at[]**: get a single value by row label and column label
- **.iat[]**: get a single value by integer row position and column position
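
One difference between the label-based and position-based accessors that is easy to miss is slicing: a `.loc[]` slice includes its end label, while an `.iloc[]` slice excludes its stop position. The sketch below uses a hypothetical frame with row labels 10 to 14.

```python
import pandas as pd

# Hypothetical sample data with row labels 10..14
df = pd.DataFrame(
    {"name": ["A", "B", "C", "D", "E"], "AI": [66, 96, 88, 77, 67]},
    index=[10, 11, 12, 13, 14],
)

by_label = df.loc[11:13, "AI"]     # label slice: both 11 and 13 are included
by_position = df.iloc[1:3, 1]      # position slice: stop position 3 is excluded

print(by_label.tolist())           # [96, 88, 77]
print(by_position.tolist())        # [96, 88]
print(df.at[12, "name"], df.iat[2, 0])   # C C
```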

In [38]:
df.loc[11, 'phone']

'0919123456'

In [39]:
df.loc[12, ['name','phone','AI']]

name            林玉明
phone    0955235743
AI               88
Name: 12, dtype: object

In [40]:
df.loc[[12, 14], ['name','phone','AI']]

Unnamed: 0,name,phone,AI
12,林玉明,955235743,88
14,陳辰威,932098543,67


In [41]:
df.iloc[1:3, [1,2,4]]

Unnamed: 0,name,phone,datastructure
11,陳傑憲,919123456,88
12,林玉明,955235743,77


In [42]:
df.iloc[:, [1, 3, 5]]

Unnamed: 0,name,AI,python
10,李大明,66,69
11,陳傑憲,96,77
12,林玉明,88,66
13,姜昆雨,77,98
14,陳辰威,67,89


In [43]:
df.iloc[1:6:2, [1,3]]

Unnamed: 0,name,AI
11,陳傑憲,96
13,姜昆雨,77


In [44]:
df.iloc[slice(1,6,2), 1]

11    陳傑憲
13    姜昆雨
Name: name, dtype: object

In [45]:
df.iloc[np.s_[1:6:2],1]

11    陳傑憲
13    姜昆雨
Name: name, dtype: object

In [46]:
df.iloc[pd.IndexSlice[1:6:2],1]

11    陳傑憲
13    姜昆雨
Name: name, dtype: object

In [47]:
df.at[12, 'name']

'林玉明'

In [48]:
df.iat[2, 1]

'林玉明'

In [49]:
df

Unnamed: 0,sid,name,phone,AI,datastructure,python
10,11375001,李大明,922555123,66,86,69
11,11375010,陳傑憲,919123456,96,88,77
12,11375022,林玉明,955235743,88,77,66
13,11375055,姜昆雨,931239097,77,87,98
14,11375199,陳辰威,932098543,67,78,89


### Adding rows and columns

In [51]:
# add a new column named 'game' with default value 0
df['game']=0

In [52]:
df

Unnamed: 0,sid,name,phone,AI,datastructure,python,game
10,11375001,李大明,922555123,66,86,69,0
11,11375010,陳傑憲,919123456,96,88,77,0
12,11375022,林玉明,955235743,88,77,66,0
13,11375055,姜昆雨,931239097,77,87,98,0
14,11375199,陳辰威,932098543,67,78,89,0


In [53]:
# append a row at the end of df from a list; len(df) (5 here) is used as the new row label
df.loc[len(df)]=['11075005', '何偉宏', '0955456852', 80, 85, 90, 85]
df


Unnamed: 0,sid,name,phone,AI,datastructure,python,game
10,11375001,李大明,922555123,66,86,69,0
11,11375010,陳傑憲,919123456,96,88,77,0
12,11375022,林玉明,955235743,88,77,66,0
13,11375055,姜昆雨,931239097,77,87,98,0
14,11375199,陳辰威,932098543,67,78,89,0
5,11075005,何偉宏,955456852,80,85,90,85


In [54]:
# modify a single DataFrame element
df.loc[5, 'name']='何大發'


In [55]:
# add a row by concatenating two DataFrames with concat()
mary = {'sid': '10975001', 'name': '安瑪莉', 'phone': '0920562000', 'AI': 86, 'datastructure': 77, 'python': 80, 'game': 90}
mary_df = pd.DataFrame(mary, index=[20])
df = pd.concat([df, mary_df], ignore_index=True)
print(df)

        sid name       phone  AI  datastructure  python  game
0  11375001  李大明  0922555123  66             86      69     0
1  11375010  陳傑憲  0919123456  96             88      77     0
2  11375022  林玉明  0955235743  88             77      66     0
3  11375055  姜昆雨  0931239097  77             87      98     0
4  11375199  陳辰威  0932098543  67             78      89     0
5  11075005  何大發  0955456852  80             85      90    85
6  10975001  安瑪莉  0920562000  86             77      80    90


In [56]:
df['total']=0 #insert a new column with default value = 0

In [57]:
df

Unnamed: 0,sid,name,phone,AI,datastructure,python,game,total
0,11375001,李大明,922555123,66,86,69,0,0
1,11375010,陳傑憲,919123456,96,88,77,0,0
2,11375022,林玉明,955235743,88,77,66,0,0
3,11375055,姜昆雨,931239097,77,87,98,0,0
4,11375199,陳辰威,932098543,67,78,89,0,0
5,11075005,何大發,955456852,80,85,90,85,0
6,10975001,安瑪莉,920562000,86,77,80,90,0


In [58]:
df.drop(labels=[5], inplace=True) # delete a row; labels lists the row labels to drop, and inplace=True modifies the original DataFrame

In [59]:
df

Unnamed: 0,sid,name,phone,AI,datastructure,python,game,total
0,11375001,李大明,922555123,66,86,69,0,0
1,11375010,陳傑憲,919123456,96,88,77,0,0
2,11375022,林玉明,955235743,88,77,66,0,0
3,11375055,姜昆雨,931239097,77,87,98,0,0
4,11375199,陳辰威,932098543,67,78,89,0,0
6,10975001,安瑪莉,920562000,86,77,80,90,0


In [60]:
del df['game'] # delete the game column

In [61]:
df

Unnamed: 0,sid,name,phone,AI,datastructure,python,total
0,11375001,李大明,922555123,66,86,69,0
1,11375010,陳傑憲,919123456,96,88,77,0
2,11375022,林玉明,955235743,88,77,66,0
3,11375055,姜昆雨,931239097,77,87,98,0
4,11375199,陳辰威,932098543,67,78,89,0
6,10975001,安瑪莉,920562000,86,77,80,0


In [None]:
# .insert() inserts a column; loc=4 sets the insert position, column sets the column name, and value gives the data (an np.array or a list both work)
df.insert(loc=4, column='django-score', value=np.array([89.0, 81.0, 78.0, 88.0, 74.0, 70.0]))
df

In [65]:
del df['total']

In [66]:
df

Unnamed: 0,sid,name,phone,AI,django-score,datastructure,python
0,11375001,李大明,922555123,66,89.0,86,69
1,11375010,陳傑憲,919123456,96,81.0,88,77
2,11375022,林玉明,955235743,88,78.0,77,66
3,11375055,姜昆雨,931239097,77,88.0,87,98
4,11375199,陳辰威,932098543,67,74.0,78,89
6,10975001,安瑪莉,920562000,86,70.0,77,80


In [67]:
df=df.drop(labels='AI', axis=1) #del column with .drop(), axis=1 for column
df

Unnamed: 0,sid,name,phone,django-score,datastructure,python
0,11375001,李大明,922555123,89.0,86,69
1,11375010,陳傑憲,919123456,81.0,88,77
2,11375022,林玉明,955235743,78.0,77,66
3,11375055,姜昆雨,931239097,88.0,87,98
4,11375199,陳辰威,932098543,74.0,78,89
6,10975001,安瑪莉,920562000,70.0,77,80


In [68]:
# arithmetic operations on columns
df['total']=0.4*df['django-score']+0.3*df['datastructure']+0.3*df['python']
df

Unnamed: 0,sid,name,phone,django-score,datastructure,python,total
0,11375001,李大明,922555123,89.0,86,69,82.1
1,11375010,陳傑憲,919123456,81.0,88,77,81.9
2,11375022,林玉明,955235743,78.0,77,66,74.1
3,11375055,姜昆雨,931239097,88.0,87,98,90.7
4,11375199,陳辰威,932098543,74.0,78,89,79.7
6,10975001,安瑪莉,920562000,70.0,77,80,75.1


In [69]:
score = df.iloc[:, 3:6]
score

Unnamed: 0,django-score,datastructure,python
0,89.0,86,69
1,81.0,88,77
2,78.0,77,66
3,88.0,87,98
4,74.0,78,89
6,70.0,77,80


In [70]:
import numpy as np
np.average(score, axis=1, weights=[0.5, 0.2, 0.3]) # weighted average with np.average(); note these weights differ from the 0.4/0.3/0.3 used for 'total'

array([82.4, 81.2, 74.2, 90.8, 79.3, 74.4])
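
To make the link between the column arithmetic and `np.average()` explicit, the sketch below applies the same 0.4/0.3/0.3 weights both ways on a two-row sample. The `weights` Series is an illustrative helper; since the weights sum to 1.0, the weighted sum and the weighted average agree.

```python
import numpy as np
import pandas as pd

# Two-row sample of the score columns
score = pd.DataFrame({
    "django-score": [89.0, 81.0],
    "datastructure": [86, 88],
    "python": [69, 77],
})
weights = pd.Series({"django-score": 0.4, "datastructure": 0.3, "python": 0.3})

# multiply each column by its weight (aligned on column labels), then sum across each row
total = score.mul(weights, axis="columns").sum(axis=1)

# np.average divides by the sum of the weights, which is 1.0 here
same = np.average(score.to_numpy(), axis=1, weights=weights.to_numpy())

print(total.round(1).tolist())     # [82.1, 81.9]
print(np.allclose(total, same))    # True
```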

In [71]:
# sorting with .sort_values()
df.sort_values(by='datastructure', ascending=False)

Unnamed: 0,sid,name,phone,django-score,datastructure,python,total
1,11375010,陳傑憲,919123456,81.0,88,77,81.9
3,11375055,姜昆雨,931239097,88.0,87,98,90.7
0,11375001,李大明,922555123,89.0,86,69,82.1
4,11375199,陳辰威,932098543,74.0,78,89,79.7
2,11375022,林玉明,955235743,78.0,77,66,74.1
6,10975001,安瑪莉,920562000,70.0,77,80,75.1


In [72]:
df.sort_values(by=['datastructure', 'python'], ascending=[False, False])

Unnamed: 0,sid,name,phone,django-score,datastructure,python,total
1,11375010,陳傑憲,919123456,81.0,88,77,81.9
3,11375055,姜昆雨,931239097,88.0,87,98,90.7
0,11375001,李大明,922555123,89.0,86,69,82.1
4,11375199,陳辰威,932098543,74.0,78,89,79.7
6,10975001,安瑪莉,920562000,70.0,77,80,75.1
2,11375022,林玉明,955235743,78.0,77,66,74.1


### Filtering data

In [73]:
# filter rows with a boolean expression
mask = df['python'] >= 80 # named mask to avoid shadowing the built-in filter()
mask

0    False
1    False
2    False
3     True
4     True
6     True
Name: python, dtype: bool

In [74]:
df[mask]

Unnamed: 0,sid,name,phone,django-score,datastructure,python,total
3,11375055,姜昆雨,931239097,88.0,87,98,90.7
4,11375199,陳辰威,932098543,74.0,78,89,79.7
6,10975001,安瑪莉,920562000,70.0,77,80,75.1


In [75]:
df[(df['django-score']>=80) & (df['python']>80)] # wrap each boolean expression in parentheses

Unnamed: 0,sid,name,phone,django-score,datastructure,python,total
3,11375055,姜昆雨,931239097,88.0,87,98,90.7


### Statistics

In [76]:
df.describe() # show summary statistics for the numeric columns

Unnamed: 0,django-score,datastructure,python,total
count,6.0,6.0,6.0,6.0
mean,80.0,82.166667,79.833333,80.6
std,7.563068,5.344779,12.089941,5.987654
min,70.0,77.0,66.0,74.1
25%,75.0,77.25,71.0,76.25
50%,79.5,82.0,78.5,80.8
75%,86.25,86.75,86.75,82.05
max,89.0,88.0,98.0,90.7


In [77]:
score.mean()

django-score     80.000000
datastructure    82.166667
python           79.833333
dtype: float64

In [78]:
score.std()

django-score      7.563068
datastructure     5.344779
python           12.089941
dtype: float64

In [79]:
score.var()

django-score      57.200000
datastructure     28.566667
python           146.166667
dtype: float64

In [80]:
df['datastructure'].mean()

82.16666666666667
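
One detail behind the numbers above: pandas `.std()` and `.var()` use the sample formulas (`ddof=1`) by default, whereas `np.std()` defaults to the population formula (`ddof=0`). The sketch below checks this on the six `datastructure` scores, assuming they are entered by hand as shown.

```python
import numpy as np
import pandas as pd

# The six datastructure scores, typed in by hand for this sketch
s = pd.Series([86, 88, 77, 87, 78, 77], name="datastructure")

print(round(s.var(), 6))                                  # 28.566667
print(np.isclose(s.std() ** 2, s.var()))                  # True
print(np.isclose(s.std(ddof=0), np.std(s.to_numpy())))    # True
print(s.std() > s.std(ddof=0))                            # True
```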

### Handling missing values

In [81]:
joe = ['10275111', '張喬一', '0911982654', 77, np.nan, 98, np.nan] # np.nan is the supported spelling; the np.NaN alias was removed in NumPy 2.0

In [82]:
df.loc[len(df)] = joe # len(df) is 6 and row label 6 already exists, so this overwrites that row instead of appending

In [83]:
df

Unnamed: 0,sid,name,phone,django-score,datastructure,python,total
0,11375001,李大明,922555123,89.0,86.0,69,82.1
1,11375010,陳傑憲,919123456,81.0,88.0,77,81.9
2,11375022,林玉明,955235743,78.0,77.0,66,74.1
3,11375055,姜昆雨,931239097,88.0,87.0,98,90.7
4,11375199,陳辰威,932098543,74.0,78.0,89,79.7
6,10275111,張喬一,911982654,77.0,,98,
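
The output above shows the row with label 6 being replaced, because `len(df)` happened to match an existing row label. A minimal sketch of the pitfall and one possible workaround (deriving the new label from the largest existing one) follows. The frame and the name `new_label` are hypothetical, and the workaround assumes an integer index.

```python
import pandas as pd

# Hypothetical frame whose labels have a gap, like the notebook's df after drop()
df = pd.DataFrame({"name": ["A", "B", "C"], "score": [1, 2, 3]}, index=[0, 1, 3])

# len(df) is 3 and label 3 already exists, so this replaces row 3
df.loc[len(df)] = ["X", 9]
print(len(df), df.loc[3, "name"])      # 3 X

# one alternative: take the largest existing label and add 1
new_label = df.index.max() + 1
df.loc[new_label] = ["D", 4]
print(df.index.tolist())               # [0, 1, 3, 4]
```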


In [84]:
df['datastructure'].mean()

83.2

In [85]:
# if you instruct .mean() not to skip nan values with skipna=False, then it will consider them and return nan if there’s any missing value among the data.
df['datastructure'].mean(skipna=False) 

nan

In [86]:
df.fillna(value=0)

Unnamed: 0,sid,name,phone,django-score,datastructure,python,total
0,11375001,李大明,922555123,89.0,86.0,69,82.1
1,11375010,陳傑憲,919123456,81.0,88.0,77,81.9
2,11375022,林玉明,955235743,78.0,77.0,66,74.1
3,11375055,姜昆雨,931239097,88.0,87.0,98,90.7
4,11375199,陳辰威,932098543,74.0,78.0,89,79.7
6,10275111,張喬一,911982654,77.0,0.0,98,0.0


In [87]:
df.ffill() # forward fill; replaces fillna(method='ffill'), which is deprecated in recent pandas versions


Unnamed: 0,sid,name,phone,django-score,datastructure,python,total
0,11375001,李大明,922555123,89.0,86.0,69,82.1
1,11375010,陳傑憲,919123456,81.0,88.0,77,81.9
2,11375022,林玉明,955235743,78.0,77.0,66,74.1
3,11375055,姜昆雨,931239097,88.0,87.0,98,90.7
4,11375199,陳辰威,932098543,74.0,78.0,89,79.7
6,10275111,張喬一,911982654,77.0,78.0,98,79.7


In [88]:
df.bfill() # backward fill; replaces fillna(method='bfill'). The NaN values are in the last row, so nothing follows them to fill from


Unnamed: 0,sid,name,phone,django-score,datastructure,python,total
0,11375001,李大明,922555123,89.0,86.0,69,82.1
1,11375010,陳傑憲,919123456,81.0,88.0,77,81.9
2,11375022,林玉明,955235743,78.0,77.0,66,74.1
3,11375055,姜昆雨,931239097,88.0,87.0,98,90.7
4,11375199,陳辰威,932098543,74.0,78.0,89,79.7
6,10275111,張喬一,911982654,77.0,,98,


In [89]:
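df.interpolate() # linear interpolation; a trailing NaN takes the last valid value. Recent pandas versions may warn here because df has object-dtype columns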


Unnamed: 0,sid,name,phone,django-score,datastructure,python,total
0,11375001,李大明,922555123,89.0,86.0,69,82.1
1,11375010,陳傑憲,919123456,81.0,88.0,77,81.9
2,11375022,林玉明,955235743,78.0,77.0,66,74.1
3,11375055,姜昆雨,931239097,88.0,87.0,98,90.7
4,11375199,陳辰威,932098543,74.0,78.0,89,79.7
6,10275111,張喬一,911982654,77.0,78.0,98,79.7
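
In the table above the three strategies are hard to tell apart because the missing values sit in the last row. The sketch below applies them to a small hypothetical Series with a gap in the middle and one at the end, where the differences are visible.

```python
import numpy as np
import pandas as pd

# Hypothetical data: one gap in the middle, one at the end
s = pd.Series([1.0, np.nan, 3.0, np.nan])

print(s.ffill().tolist())         # [1.0, 1.0, 3.0, 3.0]
print(s.bfill().tolist())         # [1.0, 3.0, 3.0, nan]
print(s.interpolate().tolist())   # [1.0, 2.0, 3.0, 3.0]
```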


In [90]:
df

Unnamed: 0,sid,name,phone,django-score,datastructure,python,total
0,11375001,李大明,922555123,89.0,86.0,69,82.1
1,11375010,陳傑憲,919123456,81.0,88.0,77,81.9
2,11375022,林玉明,955235743,78.0,77.0,66,74.1
3,11375055,姜昆雨,931239097,88.0,87.0,98,90.7
4,11375199,陳辰威,932098543,74.0,78.0,89,79.7
6,10275111,張喬一,911982654,77.0,,98,


In [92]:
df.dropna() # drop the rows that contain NaN

Unnamed: 0,sid,name,phone,django-score,datastructure,python,total
0,11375001,李大明,922555123,89.0,86.0,69,82.1
1,11375010,陳傑憲,919123456,81.0,88.0,77,81.9
2,11375022,林玉明,955235743,78.0,77.0,66,74.1
3,11375055,姜昆雨,931239097,88.0,87.0,98,90.7
4,11375199,陳辰威,932098543,74.0,78.0,89,79.7
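
Before dropping rows it can help to count the missing values per column, and `dropna()` can be limited to selected columns with `subset`. A short sketch on hypothetical data:

```python
import numpy as np
import pandas as pd

# Hypothetical sample data with missing values
df = pd.DataFrame({
    "name": ["A", "B", "C"],
    "datastructure": [86.0, np.nan, 77.0],
    "total": [82.1, np.nan, np.nan],
})

print(df.isna().sum().tolist())                              # [0, 1, 2]
print(df.dropna().index.tolist())                            # [0]
print(df.dropna(subset=["datastructure"]).index.tolist())    # [0, 2]
```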


In [93]:
for row_label, row in df.iterrows():
    print(row_label, row, sep='\n', end='\n\n')

0
sid                11375001
name                    李大明
phone            0922555123
django-score           89.0
datastructure          86.0
python                   69
total                  82.1
Name: 0, dtype: object

1
sid                11375010
name                    陳傑憲
phone            0919123456
django-score           81.0
datastructure          88.0
python                   77
total                  81.9
Name: 1, dtype: object

2
sid                11375022
name                    林玉明
phone            0955235743
django-score           78.0
datastructure          77.0
python                   66
total                  74.1
Name: 2, dtype: object

3
sid                11375055
name                    姜昆雨
phone            0931239097
django-score           88.0
datastructure          87.0
python                   98
total                  90.7
Name: 3, dtype: object

4
sid                11375199
name                    陳辰威
phone            0932098543
django-score           74.

In [97]:
for rr in df.loc[:, ['sid', 'name', 'phone']].itertuples():
    print(rr)

Pandas(Index=0, sid='11375001', name='李大明', phone='0922555123')
Pandas(Index=1, sid='11375010', name='陳傑憲', phone='0919123456')
Pandas(Index=2, sid='11375022', name='林玉明', phone='0955235743')
Pandas(Index=3, sid='11375055', name='姜昆雨', phone='0931239097')
Pandas(Index=4, sid='11375199', name='陳辰威', phone='0932098543')
Pandas(Index=6, sid='10275111', name='張喬一', phone='0911982654')
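
The two iteration styles shown above return different objects: `iterrows()` yields `(row label, Series)` pairs, while `itertuples()` yields namedtuples whose fields are read as attributes. The sketch below contrasts them on a hypothetical two-row frame; `index=False` is an optional argument that leaves out the `Index` field.

```python
import pandas as pd

# Hypothetical two-row sample
df = pd.DataFrame(
    {"sid": ["11375001", "11375010"], "name": ["李大明", "陳傑憲"], "python": [69, 77]},
    index=[0, 1],
)

# iterrows() yields (row label, Series) pairs; values are read by column label
for label, row in df.iterrows():
    print(label, row["name"], row["python"])

# itertuples() yields namedtuples; fields are read as attributes
for rec in df.itertuples():
    print(rec.Index, rec.name, rec.python)

# index=False drops the Index field
names = [rec.name for rec in df.itertuples(index=False)]
print(names)   # ['李大明', '陳傑憲']
```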


In [95]:
df

Unnamed: 0,sid,name,phone,django-score,datastructure,python,total
0,11375001,李大明,922555123,89.0,86.0,69,82.1
1,11375010,陳傑憲,919123456,81.0,88.0,77,81.9
2,11375022,林玉明,955235743,78.0,77.0,66,74.1
3,11375055,姜昆雨,931239097,88.0,87.0,98,90.7
4,11375199,陳辰威,932098543,74.0,78.0,89,79.7
6,10275111,張喬一,911982654,77.0,,98,


In [96]:
df.loc[:, ['sid', 'name', 'phone']]

Unnamed: 0,sid,name,phone
0,11375001,李大明,922555123
1,11375010,陳傑憲,919123456
2,11375022,林玉明,955235743
3,11375055,姜昆雨,931239097
4,11375199,陳辰威,932098543
6,10275111,張喬一,911982654
