# [Algorithmic Trading with Python (파이썬으로 배우는 알고리즘 트레이딩), wikidocs]
- Summary of the pandas chapter
[(Reference URL)](https://wikidocs.net/4366)

## 1. pandas
* pandas provides two core data structures: Series and DataFrame

## (1.1) Series
### (1) A Series stores the values of a one-dimensional array together with their index

In [1]:
from pandas import Series, DataFrame

In [2]:
kakao = Series([92600, 92400, 92100, 94300, 92300])
print(kakao)

0    92600
1    92400
2    92100
3    94300
4    92300
dtype: int64


In [3]:
print(kakao[0])
print(kakao[1])
print(kakao[2])

92600
92400
92100


### (2) A Series can be given custom index labels (dates, for example)

In [4]:
kakao2 = Series([92600, 92400, 92100, 94300, 92300], index=['2016-02-19',
                                                            '2016-02-18',
                                                            '2016-02-17',
                                                            '2016-02-16',
                                                            '2016-02-15'])
print(kakao2)

2016-02-19    92600
2016-02-18    92400
2016-02-17    92100
2016-02-16    94300
2016-02-15    92300
dtype: int64


In [5]:
print(kakao2['2016-02-19'])
print(kakao2['2016-02-18'])

92600
92400
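Once a Series has string labels, it helps to be explicit about whether a lookup is by label or by position. The following is a minimal sketch (the variable name `prices` is illustrative and not from the original notes): `.loc` looks up by index label, `.iloc` by integer position.

```python
from pandas import Series

# Same closing prices as kakao2, indexed by date strings.
prices = Series([92600, 92400, 92100, 94300, 92300],
                index=['2016-02-19', '2016-02-18', '2016-02-17',
                       '2016-02-16', '2016-02-15'])

by_label = prices.loc['2016-02-19']   # label-based lookup
by_position = prices.iloc[0]          # position-based lookup (first element)
last_two = prices.iloc[-2:]           # positional slice: the last two entries

print(by_label, by_position)
print(last_two)
```

Both lookups return the same value here because `'2016-02-19'` is the first label.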


### (3) Printing the index labels and the values

In [6]:
for date in kakao2.index:
    print(date)
for ending_price in kakao2.values:
    print(ending_price)

2016-02-19
2016-02-18
2016-02-17
2016-02-16
2016-02-15
92600
92400
92100
94300
92300
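When each label is needed next to its value, the two loops can be combined. A small sketch using `Series.items()`, which yields `(label, value)` pairs (the shortened `prices` Series and the `pairs` list are illustrative only):

```python
from pandas import Series

prices = Series([92600, 92400, 92100],
                index=['2016-02-19', '2016-02-18', '2016-02-17'])

pairs = []
# items() walks the Series in order, yielding one (index label, value) pair at a time.
for date, ending_price in prices.items():
    pairs.append((date, ending_price))
    print(date, ending_price)
```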


### (4) Adding Series objects

In [8]:
from pandas import Series

mine = Series([10, 20, 30], index=['naver', 'sk', 'kt'])
friend = Series([10, 30, 20], index=['kt', 'naver', 'sk'])

In [10]:
merge = mine + friend
print(merge)

kt       40
naver    40
sk       40
dtype: int64
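The addition pairs values by index label, not by position, which is why every ticker sums to 40 even though the two Series list them in different orders. The sketch below uses a hypothetical `friend` Series that has no `sk` entry to show what happens when a label is missing on one side, and how `Series.add` with `fill_value` can treat the missing entry as zero.

```python
from pandas import Series

mine = Series([10, 20, 30], index=['naver', 'sk', 'kt'])
friend = Series([10, 30], index=['kt', 'naver'])  # hypothetical: no 'sk' holding

# 'sk' exists only in `mine`, so the plain sum has no partner value and yields NaN.
plain_sum = mine + friend

# fill_value=0 substitutes 0 for the label that is missing on one side.
filled_sum = mine.add(friend, fill_value=0)

print(plain_sum)
print(filled_sum)
```

Because NaN is a floating-point value, both results come back as `float64` rather than `int64`.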


## (1.2) DataFrame

### (1) A two-dimensional data structure made up of multiple columns

In [11]:
from pandas import DataFrame

#### Creating a DataFrame from a dictionary, example 1

In [13]:
raw_data = {
    'col0': [1,2,3,4],
    'col1': [10, 20, 30, 40],
    'col2': [100, 200, 300, 400]
}
data = DataFrame(raw_data)
print(data)

   col0  col1  col2
0     1    10   100
1     2    20   200
2     3    30   300
3     4    40   400


In [14]:
data['col0']

0    1
1    2
2    3
3    4
Name: col0, dtype: int64

In [15]:
data['col2']

0    100
1    200
2    300
3    400
Name: col2, dtype: int64

In [16]:
data['col1']

0    10
1    20
2    30
3    40
Name: col1, dtype: int64

In [18]:
type(data['col0'])

pandas.core.series.Series

In [21]:
for value in data['col0']:
    print(value)

1
2
3
4
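Column selection returns different types depending on what is passed inside the brackets. As a brief sketch (the names `one_col` and `two_cols` are illustrative): a single label gives a Series, while a list of labels gives a DataFrame.

```python
from pandas import DataFrame

raw_data = {
    'col0': [1, 2, 3, 4],
    'col1': [10, 20, 30, 40],
    'col2': [100, 200, 300, 400]
}
data = DataFrame(raw_data)

one_col = data['col0']               # single label -> Series
two_cols = data[['col0', 'col2']]    # list of labels -> DataFrame

print(type(one_col).__name__)
print(type(two_cols).__name__)
print(two_cols)
```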


#### Creating a DataFrame from a dictionary, example 2

In [22]:
from pandas import Series, DataFrame

daeshin = {'open':  [11650, 11100, 11200, 11100, 11000],
           'high':  [12100, 11800, 11200, 11100, 11150],
           'low' :  [11600, 11050, 10900, 10950, 10900],
           'close': [11900, 11600, 11000, 11100, 11050]}

daeshin_day = DataFrame(daeshin)
print(daeshin_day)

    open   high    low  close
0  11650  12100  11600  11900
1  11100  11800  11050  11600
2  11200  11200  10900  11000
3  11100  11100  10950  11100
4  11000  11150  10900  11050


In [23]:
daeshin_day = DataFrame(daeshin, columns=['open', 'high', 'low', 'close'])
daeshin_day

    open   high    low  close
0  11650  12100  11600  11900
1  11100  11800  11050  11600
2  11200  11200  10900  11000
3  11100  11100  10950  11100
4  11000  11150  10900  11050
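The `columns` argument controls which dictionary keys become columns and in what order. A small sketch, assuming the same `daeshin` dictionary (the subset and the name `close_open` are chosen only for illustration):

```python
from pandas import DataFrame

daeshin = {'open':  [11650, 11100, 11200, 11100, 11000],
           'high':  [12100, 11800, 11200, 11100, 11150],
           'low':   [11600, 11050, 10900, 10950, 10900],
           'close': [11900, 11600, 11000, 11100, 11050]}

# Only the listed keys are used, in the order given here.
close_open = DataFrame(daeshin, columns=['close', 'open'])
print(close_open)
```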


#### The index can be set with the `index` argument when the DataFrame is created

In [24]:
date = ['16.02.29', '16.02.26', '16.02.25', '16.02.24', '16.02.23']
daeshin_day = DataFrame(daeshin, columns=['open', 'high', 'low', 'close'], index=date)

In [25]:
daeshin_day

           open   high    low  close
16.02.29  11650  12100  11600  11900
16.02.26  11100  11800  11050  11600
16.02.25  11200  11200  10900  11000
16.02.24  11100  11100  10950  11100
16.02.23  11000  11150  10900  11050
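The index labels here are plain strings, so pandas treats them as text rather than dates. If date-aware behavior is wanted, one option (an addition to these notes, not part of the source material) is to parse the labels with `pandas.to_datetime`, assuming they follow a `yy.mm.dd` pattern:

```python
import pandas as pd

daeshin = {'open':  [11650, 11100, 11200, 11100, 11000],
           'high':  [12100, 11800, 11200, 11100, 11150],
           'low':   [11600, 11050, 10900, 10950, 10900],
           'close': [11900, 11600, 11000, 11100, 11050]}
date = ['16.02.29', '16.02.26', '16.02.25', '16.02.24', '16.02.23']
daeshin_day = pd.DataFrame(daeshin, columns=['open', 'high', 'low', 'close'], index=date)

# Parse 'yy.mm.dd' strings into timestamps. The format string is an assumption
# based on how the labels look.
daeshin_day.index = pd.to_datetime(daeshin_day.index, format='%y.%m.%d')

# With a DatetimeIndex, sort_index() orders the rows chronologically.
oldest_first = daeshin_day.sort_index()
print(oldest_first)
```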


### (2) Selecting DataFrame columns and rows

#### Selecting a column

In [26]:
close = daeshin_day['close']
print(close)

16.02.29    11900
16.02.26    11600
16.02.25    11000
16.02.24    11100
16.02.23    11050
Name: close, dtype: int64
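Because each selected column is a Series, columns can be combined with ordinary arithmetic. As an illustrative sketch (the `daily_range` and `best_day` names are hypothetical additions), the code below computes each day's high-low range and finds the date with the highest close:

```python
from pandas import DataFrame

daeshin = {'open':  [11650, 11100, 11200, 11100, 11000],
           'high':  [12100, 11800, 11200, 11100, 11150],
           'low':   [11600, 11050, 10900, 10950, 10900],
           'close': [11900, 11600, 11000, 11100, 11050]}
date = ['16.02.29', '16.02.26', '16.02.25', '16.02.24', '16.02.23']
daeshin_day = DataFrame(daeshin, columns=['open', 'high', 'low', 'close'], index=date)

# Element-wise subtraction, aligned on the shared date index.
daily_range = daeshin_day['high'] - daeshin_day['low']

# idxmax() returns the index label of the largest value.
best_day = daeshin_day['close'].idxmax()

print(daily_range)
print(best_day)
```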


#### Selecting a row: use `.loc`

In [28]:
day_data = daeshin_day.loc['16.02.24']
print(day_data)
print(type(day_data))

open     11100
high     11100
low      10950
close    11100
Name: 16.02.24, dtype: int64
<class 'pandas.core.series.Series'>
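`.loc` selects by label; its positional counterpart is `.iloc`. The sketch below (variable names are illustrative) compares the two, and also shows a single-cell lookup and a multi-row selection with `.loc`:

```python
from pandas import DataFrame

daeshin = {'open':  [11650, 11100, 11200, 11100, 11000],
           'high':  [12100, 11800, 11200, 11100, 11150],
           'low':   [11600, 11050, 10900, 10950, 10900],
           'close': [11900, 11600, 11000, 11100, 11050]}
date = ['16.02.29', '16.02.26', '16.02.25', '16.02.24', '16.02.23']
daeshin_day = DataFrame(daeshin, columns=['open', 'high', 'low', 'close'], index=date)

row_by_label = daeshin_day.loc['16.02.24']       # row whose label is '16.02.24'
row_by_position = daeshin_day.iloc[3]            # fourth row (positions start at 0)
close_value = daeshin_day.loc['16.02.24', 'close']    # one cell: row label, column label
two_days = daeshin_day.loc[['16.02.29', '16.02.26']]  # list of labels -> DataFrame

print(row_by_label.equals(row_by_position))
print(close_value)
print(two_days)
```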


#### Printing the DataFrame's column and index labels

In [29]:
print(daeshin_day.columns)
print(daeshin_day.index)

Index(['open', 'high', 'low', 'close'], dtype='object')
Index(['16.02.29', '16.02.26', '16.02.25', '16.02.24', '16.02.23'], dtype='object')
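Both attributes are pandas `Index` objects. When plain Python lists are more convenient, `tolist()` converts them, and the `in` operator checks whether a label exists. A short sketch with a trimmed-down, illustrative DataFrame:

```python
from pandas import DataFrame

daeshin_day = DataFrame({'open': [11650, 11100], 'close': [11900, 11600]},
                        index=['16.02.29', '16.02.26'])

column_names = daeshin_day.columns.tolist()   # Index -> list of column labels
row_labels = daeshin_day.index.tolist()       # Index -> list of row labels
has_close = 'close' in daeshin_day.columns    # membership test on column labels

print(column_names)
print(row_labels)
print(has_close)
```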
