# Pandas: a data analysis library
Works with data objects organized into rows and columns. <br>
The two main objects are DataFrame and Series.
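
The notes below focus on DataFrame, so here is a minimal sketch of how a Series relates to it. The variable names (`s`, `frame`) and the column name `value` are made up for illustration.

```python
import pandas as pd

# A Series is a one-dimensional labeled array.
s = pd.Series([1, 2, 3], name="value")

# A DataFrame is a two-dimensional table; selecting one column gives a Series.
frame = pd.DataFrame({"value": [1, 2, 3]})
column = frame["value"]

print(s.ndim)                  # 1
print(frame.ndim)              # 2
print(type(column).__name__)   # Series
```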

In [3]:
import numpy as np
import pandas as pd

# DataFrame

In [4]:
df = pd.DataFrame([1, 2, 3])
# Creates a DataFrame with 3 rows x 1 column
df

Unnamed: 0,0
0,1
1,2
2,3


In [5]:
pd.DataFrame(['dog', 'cat', 'bird'])

Unnamed: 0,0
0,dog
1,cat
2,bird


In [6]:
# pandas.DataFrame(data, index, columns, dtype, copy)
df = pd.DataFrame([
    ['dog', '멍멍이', '45kg'], 
    ['cat', '냥냥이', '7kg'], 
    ['bird', '짹짹이', '4g'],
    ['tiger', '큰냥이', '1t']
])
df

Unnamed: 0,0,1,2
0,dog,멍멍이,45kg
1,cat,냥냥이,7kg
2,bird,짹짹이,4g
3,tiger,큰냥이,1t


In [7]:
len(df)

4

In [8]:
df.shape  # 4 rows, 3 columns

(4, 3)

In [9]:
df.ndim  # number of dimensions; a DataFrame is 2-dimensional

2

In [10]:
len(df.shape)  # same value as .ndim

2

In [11]:
df.size  # total number of elements (rows x columns)

12
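
The size attributes above are related to one another. A small sketch that checks those relationships on the same 4 x 3 data (the names `n_rows` and `n_cols` are only for illustration):

```python
import pandas as pd

df = pd.DataFrame([
    ['dog', '멍멍이', '45kg'],
    ['cat', '냥냥이', '7kg'],
    ['bird', '짹짹이', '4g'],
    ['tiger', '큰냥이', '1t'],
])

n_rows, n_cols = df.shape          # shape is a (rows, columns) tuple

print(len(df) == n_rows)           # len() counts rows
print(df.size == n_rows * n_cols)  # size counts every cell
print(df.ndim == len(df.shape))    # ndim is the length of shape
```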

# column, index

In [12]:
df = pd.DataFrame([
    ['dog', '멍멍이', '45kg'], 
    ['cat', '냥냥이', '7kg'], 
    ['bird', '짹짹이', '4g'],
    ['tiger', '큰냥이', '1t']
], index=['row1', 'row2', 'row3', 'row4'], columns=['종류', '이름', '무게'])
df

Unnamed: 0,종류,이름,무게
row1,dog,멍멍이,45kg
row2,cat,냥냥이,7kg
row3,bird,짹짹이,4g
row4,tiger,큰냥이,1t
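
The same kind of table can also be built column by column from a dict of lists, where the dict keys become the column names. This is a sketch with only two rows; the names `by_rows` and `by_cols` are made up for the comparison.

```python
import pandas as pd

# Row-oriented: a list of rows plus explicit column names.
by_rows = pd.DataFrame(
    [['dog', '멍멍이', '45kg'], ['cat', '냥냥이', '7kg']],
    index=['row1', 'row2'],
    columns=['종류', '이름', '무게'],
)

# Column-oriented: each dict key is a column name.
by_cols = pd.DataFrame(
    {'종류': ['dog', 'cat'], '이름': ['멍멍이', '냥냥이'], '무게': ['45kg', '7kg']},
    index=['row1', 'row2'],
)

print(by_rows.columns.tolist() == by_cols.columns.tolist())
print(by_rows.to_dict() == by_cols.to_dict())
```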


In [13]:
df.columns = ['AA', 'BB', 'CC']
df

Unnamed: 0,AA,BB,CC
row1,dog,멍멍이,45kg
row2,cat,냥냥이,7kg
row3,bird,짹짹이,4g
row4,tiger,큰냥이,1t
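
Assigning to `df.columns` replaces every column name at once and changes `df` in place. To rename only some columns, `rename` is an alternative; by default it returns a new DataFrame and leaves the original unchanged. A small sketch (the new name `kind` is arbitrary):

```python
import pandas as pd

df = pd.DataFrame([['dog', '45kg'], ['cat', '7kg']], columns=['AA', 'CC'])

# Rename just one column; the result is a new DataFrame.
renamed = df.rename(columns={'AA': 'kind'})

print(renamed.columns.tolist())  # ['kind', 'CC']
print(df.columns.tolist())       # ['AA', 'CC']
```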


## Saving to Excel

In [14]:
df.to_excel('animal.xlsx')
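
Note that writing `.xlsx` files requires an Excel writer engine such as openpyxl to be installed. The writers also store the index by default, which matters when reading the file back. The sketch below shows this with CSV and an in-memory buffer instead of a real Excel file, so it runs without extra dependencies; the buffer and variable names are only for illustration.

```python
import io
import pandas as pd

df = pd.DataFrame({'AA': ['dog', 'cat']}, index=['row1', 'row2'])

buffer = io.StringIO()
df.to_csv(buffer)          # the index is written as the first, unnamed column

buffer.seek(0)
plain = pd.read_csv(buffer)                 # index comes back as a data column
buffer.seek(0)
restored = pd.read_csv(buffer, index_col=0) # first column used as the index

print(plain.columns.tolist())    # ['Unnamed: 0', 'AA']
print(restored.index.tolist())   # ['row1', 'row2']
```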

In [15]:
df2 = pd.read_csv('./data/country.csv')
df2

Unnamed: 0,Code,Name,Continent,Region,SurfaceArea,IndepYear,Population,LifeExpectancy,GNP,GNPOld,LocalName,GovernmentForm,HeadOfState,Capital,Code2
0,ABW,Aruba,North America,Caribbean,193.0,,103000,78.4,828.0,793.0,Aruba,Nonmetropolitan Territory of The Netherlands,Beatrix,129.0,AW
1,AFG,Afghanistan,Asia,Southern and Central Asia,652090.0,1919.0,22720000,45.9,5976.0,,Afganistan/Afqanestan,Islamic Emirate,Mohammad Omar,1.0,AF
2,AGO,Angola,Africa,Central Africa,1246700.0,1975.0,12878000,38.3,6648.0,7984.0,Angola,Republic,José Eduardo dos Santos,56.0,AO
3,AIA,Anguilla,North America,Caribbean,96.0,,8000,76.1,63.2,,Anguilla,Dependent Territory of the UK,Elisabeth II,62.0,AI
4,ALB,Albania,Europe,Southern Europe,28748.0,1912.0,3401200,71.6,3205.0,2500.0,Shqipëria,Republic,Rexhep Mejdani,34.0,AL
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
234,YEM,Yemen,Asia,Middle East,527968.0,1918.0,18112000,59.8,6041.0,5729.0,Al-Yaman,Republic,Ali Abdallah Salih,1780.0,YE
235,YUG,Yugoslavia,Europe,Southern Europe,102173.0,1918.0,10640000,72.4,17000.0,,Jugoslavija,Federal Republic,Vojislav Koštunica,1792.0,YU
236,ZAF,South Africa,Africa,Southern Africa,1221037.0,1910.0,40377000,51.1,116729.0,129092.0,South Africa,Republic,Thabo Mbeki,716.0,ZA
237,ZMB,Zambia,Africa,Eastern Africa,752618.0,1964.0,9169000,37.2,3377.0,3922.0,Zambia,Republic,Frederick Chiluba,3162.0,ZM


In [16]:
len(df2)

239

In [17]:
df2["Name"]
# df2.Name returns the same column

0             Aruba
1       Afghanistan
2            Angola
3          Anguilla
4           Albania
           ...     
234           Yemen
235      Yugoslavia
236    South Africa
237          Zambia
238        Zimbabwe
Name: Name, Length: 239, dtype: object

In [18]:
df2[["Name", "Population"]]

Unnamed: 0,Name,Population
0,Aruba,103000
1,Afghanistan,22720000
2,Angola,12878000
3,Anguilla,8000
4,Albania,3401200
...,...,...
234,Yemen,18112000
235,Yugoslavia,10640000
236,South Africa,40377000
237,Zambia,9169000
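
The two selections above return different types: a single column name gives a Series, while a list of names gives a DataFrame. A sketch on a tiny stand-in table (`countries` here is a made-up three-row sample, not the real `country.csv`):

```python
import pandas as pd

countries = pd.DataFrame({
    'Code': ['ABW', 'AFG', 'AGO'],
    'Name': ['Aruba', 'Afghanistan', 'Angola'],
    'Population': [103000, 22720000, 12878000],
})

one_column = countries['Name']                   # Series
two_columns = countries[['Name', 'Population']]  # DataFrame
one_column_frame = countries[['Name']]           # still a DataFrame

print(type(one_column).__name__)        # Series
print(type(two_columns).__name__)       # DataFrame
print(type(one_column_frame).__name__)  # DataFrame
```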


In [19]:
# head() returns the first 5 rows
df2[["Name", "Population"]].head()

Unnamed: 0,Name,Population
0,Aruba,103000
1,Afghanistan,22720000
2,Angola,12878000
3,Anguilla,8000
4,Albania,3401200
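
`head()` also accepts a row count, and `tail()` does the same from the end. A short sketch on a made-up ten-row frame:

```python
import pandas as pd

numbers = pd.DataFrame({'n': range(10)})

print(len(numbers.head()))             # 5 (the default)
print(numbers.head(3)['n'].tolist())   # [0, 1, 2]
print(numbers.tail(2)['n'].tolist())   # [8, 9]
```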


In [20]:
df2[0:1]  # rows from position 0 up to, but not including, 1

Unnamed: 0,Code,Name,Continent,Region,SurfaceArea,IndepYear,Population,LifeExpectancy,GNP,GNPOld,LocalName,GovernmentForm,HeadOfState,Capital,Code2
0,ABW,Aruba,North America,Caribbean,193.0,,103000,78.4,828.0,793.0,Aruba,Nonmetropolitan Territory of The Netherlands,Beatrix,129.0,AW


In [21]:
df2[20:30][["Continent", "Code"]]  # rows 20 up to (not including) 30, i.e. 20 through 29

Unnamed: 0,Continent,Code
20,Africa,BFA
21,Asia,BGD
22,Europe,BGR
23,Asia,BHR
24,North America,BHS
25,Europe,BIH
26,Europe,BLR
27,North America,BLZ
28,North America,BMU
29,South America,BOL


In [31]:
df3 = df2[::2]  # every second row (step of 2)
df3

Unnamed: 0,Code,Name,Continent,Region,SurfaceArea,IndepYear,Population,LifeExpectancy,GNP,GNPOld,LocalName,GovernmentForm,HeadOfState,Capital,Code2
0,ABW,Aruba,North America,Caribbean,193.0,,103000,78.4,828.0,793.0,Aruba,Nonmetropolitan Territory of The Netherlands,Beatrix,129.0,AW
2,AGO,Angola,Africa,Central Africa,1246700.0,1975.0,12878000,38.3,6648.0,7984.0,Angola,Republic,José Eduardo dos Santos,56.0,AO
4,ALB,Albania,Europe,Southern Europe,28748.0,1912.0,3401200,71.6,3205.0,2500.0,Shqipëria,Republic,Rexhep Mejdani,34.0,AL
6,ANT,Netherlands Antilles,North America,Caribbean,800.0,,217000,74.7,1941.0,,Nederlandse Antillen,Nonmetropolitan Territory of The Netherlands,Beatrix,33.0,AN
8,ARG,Argentina,South America,South America,2780400.0,1816.0,37032000,75.1,340238.0,323310.0,Argentina,Federal Republic,Fernando de la Rúa,69.0,AR
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
230,VNM,Vietnam,Asia,Southeast Asia,331689.0,1945.0,79832000,69.3,21929.0,22834.0,Viêt Nam,Socialistic Republic,Trân Duc Luong,3770.0,VN
232,WLF,Wallis and Futuna,Oceania,Polynesia,200.0,,15000,,0.0,,Wallis-et-Futuna,Nonmetropolitan Territory of France,Jacques Chirac,3536.0,WF
234,YEM,Yemen,Asia,Middle East,527968.0,1918.0,18112000,59.8,6041.0,5729.0,Al-Yaman,Republic,Ali Abdallah Salih,1780.0,YE
236,ZAF,South Africa,Africa,Southern Africa,1221037.0,1910.0,40377000,51.1,116729.0,129092.0,South Africa,Republic,Thabo Mbeki,716.0,ZA


In [23]:
df2[::-1]  # rows in reverse order

Unnamed: 0,Code,Name,Continent,Region,SurfaceArea,IndepYear,Population,LifeExpectancy,GNP,GNPOld,LocalName,GovernmentForm,HeadOfState,Capital,Code2
238,ZWE,Zimbabwe,Africa,Eastern Africa,390757.0,1980.0,11669000,37.8,5951.0,8670.0,Zimbabwe,Republic,Robert G. Mugabe,4068.0,ZW
237,ZMB,Zambia,Africa,Eastern Africa,752618.0,1964.0,9169000,37.2,3377.0,3922.0,Zambia,Republic,Frederick Chiluba,3162.0,ZM
236,ZAF,South Africa,Africa,Southern Africa,1221037.0,1910.0,40377000,51.1,116729.0,129092.0,South Africa,Republic,Thabo Mbeki,716.0,ZA
235,YUG,Yugoslavia,Europe,Southern Europe,102173.0,1918.0,10640000,72.4,17000.0,,Jugoslavija,Federal Republic,Vojislav Koštunica,1792.0,YU
234,YEM,Yemen,Asia,Middle East,527968.0,1918.0,18112000,59.8,6041.0,5729.0,Al-Yaman,Republic,Ali Abdallah Salih,1780.0,YE
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
4,ALB,Albania,Europe,Southern Europe,28748.0,1912.0,3401200,71.6,3205.0,2500.0,Shqipëria,Republic,Rexhep Mejdani,34.0,AL
3,AIA,Anguilla,North America,Caribbean,96.0,,8000,76.1,63.2,,Anguilla,Dependent Territory of the UK,Elisabeth II,62.0,AI
2,AGO,Angola,Africa,Central Africa,1246700.0,1975.0,12878000,38.3,6648.0,7984.0,Angola,Republic,José Eduardo dos Santos,56.0,AO
1,AFG,Afghanistan,Asia,Southern and Central Asia,652090.0,1919.0,22720000,45.9,5976.0,,Afganistan/Afqanestan,Islamic Emirate,Mohammad Omar,1.0,AF
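
Row slicing with `[start:stop:step]` follows the same rules as Python list slicing. The sketch below repeats the slice forms used above on a made-up ten-row frame so the results are easy to verify.

```python
import pandas as pd

numbers = pd.DataFrame({'n': range(10)})

print(numbers[0:1]['n'].tolist())    # [0]           stop is excluded
print(numbers[2:5]['n'].tolist())    # [2, 3, 4]
print(numbers[::2]['n'].tolist())    # [0, 2, 4, 6, 8]
print(numbers[::-1]['n'].tolist())   # [9, 8, 7, 6, 5, 4, 3, 2, 1, 0]

# The original index labels travel with the rows.
print(numbers[::2].index.tolist())   # [0, 2, 4, 6, 8]
```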


# loc, iloc
    - loc : access by index label
    - iloc : access by integer position
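
A minimal sketch of the difference, using string labels so the two cannot be confused. One detail worth noting: a label slice with `loc` includes its end label, while a position slice with `iloc` excludes its end position.

```python
import pandas as pd

df = pd.DataFrame(
    {'AA': ['dog', 'cat', 'bird', 'tiger']},
    index=['row1', 'row2', 'row3', 'row4'],
)

print(df.loc['row2', 'AA'])   # cat  (row label, column label)
print(df.iloc[1, 0])          # cat  (row position, column position)

# 'row3' is included by loc; position 3 is excluded by iloc.
print(df.loc['row1':'row3'].index.tolist())  # ['row1', 'row2', 'row3']
print(df.iloc[0:3].index.tolist())           # ['row1', 'row2', 'row3']
```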

In [24]:
df2.loc[0]

Code                                                       ABW
Name                                                     Aruba
Continent                                        North America
Region                                               Caribbean
SurfaceArea                                                193
IndepYear                                                  NaN
Population                                              103000
LifeExpectancy                                            78.4
GNP                                                        828
GNPOld                                                     793
LocalName                                                Aruba
GovernmentForm    Nonmetropolitan Territory of The Netherlands
HeadOfState                                            Beatrix
Capital                                                    129
Code2                                                       AW
Name: 0, dtype: object

In [25]:
df2.loc[[134, 32, 25]]  # select rows by index label

Unnamed: 0,Code,Name,Continent,Region,SurfaceArea,IndepYear,Population,LifeExpectancy,GNP,GNPOld,LocalName,GovernmentForm,HeadOfState,Capital,Code2
134,MDV,Maldives,Asia,Southern and Central Asia,298.0,1965.0,286000,62.2,199.0,,Dhivehi Raajje/Maldives,Republic,Maumoon Abdul Gayoom,2463.0,MV
32,BRN,Brunei,Asia,Southeast Asia,5765.0,1984.0,328000,73.6,11705.0,12460.0,Brunei Darussalam,Monarchy (Sultanate),Haji Hassan al-Bolkiah,538.0,BN
25,BIH,Bosnia and Herzegovina,Europe,Southern Europe,51197.0,1992.0,3972000,71.5,2841.0,,Bosna i Hercegovina,Federal Republic,Ante Jelavic,201.0,BA


In [26]:
df2.iloc[0]  # select a row by integer position

Code                                                       ABW
Name                                                     Aruba
Continent                                        North America
Region                                               Caribbean
SurfaceArea                                                193
IndepYear                                                  NaN
Population                                              103000
LifeExpectancy                                            78.4
GNP                                                        828
GNPOld                                                     793
LocalName                                                Aruba
GovernmentForm    Nonmetropolitan Territory of The Netherlands
HeadOfState                                            Beatrix
Capital                                                    129
Code2                                                       AW
Name: 0, dtype: object

In [27]:
df2.iloc[10:20]

Unnamed: 0,Code,Name,Continent,Region,SurfaceArea,IndepYear,Population,LifeExpectancy,GNP,GNPOld,LocalName,GovernmentForm,HeadOfState,Capital,Code2
10,ASM,American Samoa,Oceania,Polynesia,199.0,,68000,75.1,334.0,,Amerika Samoa,US Territory,George W. Bush,54.0,AS
11,ATA,Antarctica,Antarctica,Antarctica,13120000.0,,0,,0.0,,–,Co-administrated,,,AQ
12,ATF,French Southern territories,Antarctica,Antarctica,7780.0,,0,,0.0,,Terres australes françaises,Nonmetropolitan Territory of France,Jacques Chirac,,TF
13,ATG,Antigua and Barbuda,North America,Caribbean,442.0,1981.0,68000,70.5,612.0,584.0,Antigua and Barbuda,Constitutional Monarchy,Elisabeth II,63.0,AG
14,AUS,Australia,Oceania,Australia and New Zealand,7741220.0,1901.0,18886000,79.8,351182.0,392911.0,Australia,"Constitutional Monarchy, Federation",Elisabeth II,135.0,AU
15,AUT,Austria,Europe,Western Europe,83859.0,1918.0,8091800,77.7,211860.0,206025.0,Österreich,Federal Republic,Thomas Klestil,1523.0,AT
16,AZE,Azerbaijan,Asia,Middle East,86600.0,1991.0,7734000,62.9,4127.0,4100.0,Azärbaycan,Federal Republic,Heydär Äliyev,144.0,AZ
17,BDI,Burundi,Africa,Eastern Africa,27834.0,1962.0,6695000,46.2,903.0,982.0,Burundi/Uburundi,Republic,Pierre Buyoya,552.0,BI
18,BEL,Belgium,Europe,Western Europe,30518.0,1830.0,10239000,77.8,249704.0,243948.0,België/Belgique,"Constitutional Monarchy, Federation",Albert II,179.0,BE
19,BEN,Benin,Africa,Western Africa,112622.0,1960.0,6097000,50.2,2357.0,2141.0,Bénin,Republic,Mathieu Kérékou,187.0,BJ


In [28]:
df.loc[['row1', 'row2']]

Unnamed: 0,AA,BB,CC
row1,dog,멍멍이,45kg
row2,cat,냥냥이,7kg


In [29]:
df.iloc[0:3]

Unnamed: 0,AA,BB,CC
row1,dog,멍멍이,45kg
row2,cat,냥냥이,7kg
row3,bird,짹짹이,4g
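
With the default integer index, `loc[0]` and `iloc[0]` return the same row, which hides the difference. They diverge once labels and positions no longer match, as in a frame like `df3 = df2[::2]`. A sketch using a made-up five-row stand-in for the country table:

```python
import pandas as pd

full = pd.DataFrame({'Code': ['ABW', 'AFG', 'AGO', 'AIA', 'ALB']})
every_other = full[::2]          # keeps index labels 0, 2, 4

print(every_other.loc[2, 'Code'])   # AGO  (the row labeled 2)
print(every_other.iloc[2, 0])       # ALB  (the third row)
print(every_other.iloc[1, 0])       # AGO  (the second row)

# Label 1 was dropped by the slice, so loc cannot find it.
try:
    every_other.loc[1]
    label_1_found = True
except KeyError:
    label_1_found = False
print(label_1_found)                # False
```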
