# Python/R Data Analysis for Beginners


## Today's mission

- Collecting listed stocks with FDR (FinanceDataReader)
- Analyzing all stocks listed on the Korea Exchange (KRX)
- Collecting and analyzing individual stock data with FDR / scientific notation


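## Collecting listed stocks with FDR (FinanceDataReader)

### What is FinanceDataReader?

* A library for collecting financial data such as Korean and US stock prices, indices, exchange rates, cryptocurrency prices, and stock listings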

* [FinanceData/FinanceDataReader: Financial data reader](https://github.com/FinanceData/FinanceDataReader)
* [FinanceDataReader User Guide | FinanceData](https://financedata.github.io/posts/finance-data-reader-users-guide.html)
* https://pandas-datareader.readthedocs.io/en/latest/readers/index.html

## Installation

In [1]:
# Uncomment the line below to install: delete the leading # from the code.
# !pip install -U finance-datareader

In [2]:
# !pip install plotly

## Importing libraries

In [3]:
# Import pandas for data analysis
import pandas as pd

In [4]:
# Import FinanceDataReader under the alias fdr.
# To check a library's version, use .__version__.
import FinanceDataReader as fdr
fdr.__version__

'0.9.93'

## Fetching all stocks listed on the Korea Exchange

Reference: http://data.krx.co.kr/contents/MDC/MDI/mdiLoader/index.cmd?menuId=MDC0201020101 > [종목정보] (Stock info) > [전종목 기본종목] (Basic info for all stocks)

In [5]:
# Use ? to see the help for an object and ?? to see its source code.
# In Jupyter Notebook, pressing Shift + Tab inside a function's or method's parentheses also shows the help.
fdr.StockListing?

Signature: fdr.StockListing(market: str, start=None, end=None) -> pandas.core.frame.DataFrame
Docstring:
read stock list of stock exchanges
* market: 'KRX', 'KOSPI', 'KOSDAQ', 'KONEX', 'KRX-MARCAP', 
        'KRX-DESC', 'KOSPI-DESC', 'KOSDAQ-DESC', 'KONEX-DESC',
        'KRX-DELISTING', 'KRX-ADMINISTRATIVE', 'KRX-MARCAP',
        'NASDAQ', 'NYSE', 'AMEX', 'SSE', 'SZSE', 'HKEX', 'TSE', 'HOSE',
        'S&P500',
        'ETF/KR',
File:      ~/opt/anaconda3/envs/dl/lib/python3.9/site-packages/FinanceDataReader/data.py
Type:      function

In [6]:
# KRX : all KRX stocks
# KOSPI : KOSPI stocks
# KOSDAQ : KOSDAQ stocks
# KONEX : KONEX stocks
# NASDAQ : NASDAQ stocks
# NYSE : New York Stock Exchange stocks
# SP500 : S&P 500 stocks
df_krx = fdr.StockListing("KRX")
df_krx

Unnamed: 0,Code,ISU_CD,Name,Market,Dept,Close,ChangeCode,Changes,ChagesRatio,Open,High,Low,Volume,Amount,Marcap,Stocks,MarketId
0,005930,KR7005930003,삼성전자,KOSPI,,79400,1,1100,1.40,79500,79800,78700,4797281,380063908100,474000734470000,5969782550,STK
1,000660,KR7000660001,SK하이닉스,KOSPI,,199800,1,5900,3.04,198700,200500,198100,1422029,283249610300,145454872527000,728002365,STK
2,373220,KR7373220003,LG에너지솔루션,KOSPI,,330000,1,500,0.15,333000,334000,327000,40538,13385655000,77220000000000,234000000,STK
3,207940,KR7207940008,삼성바이오로직스,KOSPI,,916000,2,-14000,-1.51,931000,934000,915000,32421,29916114000,65195384000000,71174000,STK
4,005380,KR7005380001,현대차,KOSPI,,258000,1,3000,1.18,258000,261000,255000,328362,84671381000,54029377278000,209416191,STK
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
2823,413300,KR7413300005,티엘엔지니어링,KONEX,일반기업부,1900,1,201,11.83,1900,1900,1900,1,1900,2480478500,1305515,KNX
2824,245450,KR7245450002,씨앤에스링크,KONEX,일반기업부,1398,1,123,9.65,1398,1398,1398,11,15378,2208784080,1579960,KNX
2825,288490,KR7288490006,나라소프트,KONEX,일반기업부,120,1,7,6.19,125,125,120,199,24380,2096589240,17471577,KNX
2826,236030,KR7236030003,씨알푸드,KONEX,일반기업부,555,1,28,5.31,570,580,450,64571,29743280,1128499260,2033332,KNX


In [7]:
fdr.StockListing("NASDAQ")

100%|██████████| 3709/3709 [00:03<00:00, 1076.58it/s]


Unnamed: 0,Symbol,Name,IndustryCode,Industry
0,AAPL,Apple Inc,57106020,전화 및 소형 장치
1,NVDA,NVIDIA Corp,57101010,반도체
2,MSFT,Microsoft Corp,57201020,소프트웨어
3,AMZN,Amazon.com Inc,53402010,백화점
4,META,Meta Platforms Inc,57201030,온라인 서비스
...,...,...,...,...
3704,NYMTI,New York Mortgage Trust Inc 9 125 Senior Notes...,60102040,특수 REITs
3705,RFAIR,RF Acquisition II Right 01st May 2026,55601010,투자 지주 회사
3706,FSHPR,Flag Ship Acquisition Rights,55601010,투자 지주 회사
3707,VLYPN,Valley National Bancorp 8 250 Fixed Rate Reset...,55101010,은행


In [8]:
# Preview the first five rows of the KRX listing
df_krx.head()

Unnamed: 0,Code,ISU_CD,Name,Market,Dept,Close,ChangeCode,Changes,ChagesRatio,Open,High,Low,Volume,Amount,Marcap,Stocks,MarketId
0,005930,KR7005930003,삼성전자,KOSPI,,79400,1,1100,1.40,79500,79800,78700,4797281,380063908100,474000734470000,5969782550,STK
1,000660,KR7000660001,SK하이닉스,KOSPI,,199800,1,5900,3.04,198700,200500,198100,1422029,283249610300,145454872527000,728002365,STK
2,373220,KR7373220003,LG에너지솔루션,KOSPI,,330000,1,500,0.15,333000,334000,327000,40538,13385655000,77220000000000,234000000,STK
3,207940,KR7207940008,삼성바이오로직스,KOSPI,,916000,2,-14000,-1.51,931000,934000,915000,32421,29916114000,65195384000000,71174000,STK
4,005380,KR7005380001,현대차,KOSPI,,258000,1,3000,1.18,258000,261000,255000,328362,84671381000,54029377278000,209416191,STK


In [9]:
df_krx[df_krx['Market']=='KOSDAQ']

Unnamed: 0,Code,ISU_CD,Name,Market,Dept,Close,ChangeCode,Changes,ChagesRatio,Open,High,Low,Volume,Amount,Marcap,Stocks,MarketId
37,028300,KR7028300002,HLB,KOSDAQ,중견기업부,90600,1,100,0.11,90500,91700,89700,302531,27432589400,11854589978400,130845364,KSQ
39,086520,KR7086520004,에코프로,KOSDAQ,우량기업부,85500,3,0,0.00,86600,87000,85100,295339,25377853500,11383328070000,133138340,KSQ
90,348370,KR7348370008,엔켐,KOSDAQ,벤처기업부,202000,2,-3000,-1.46,212500,214500,199800,141285,29076383000,4198958446000,20786923,KSQ
106,000250,KR7000250001,삼천당제약,KOSDAQ,중견기업부,152100,2,-3000,-1.93,157400,158300,151900,186485,28964667200,3567881491200,23457472,KSQ
122,068760,KR7068760008,셀트리온제약,KOSDAQ,우량기업부,70800,2,-1400,-1.94,72400,73400,70600,143960,10335499700,2945043032400,41596653,KSQ
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
2775,475240,KR7475240008,하나32호스팩,KOSDAQ,SPAC(소속부없음),2190,3,0,0.00,2190,2200,2190,1036,2268950,7008000000,3200000,KSQ
2777,323230,KR7323230003,엠에프엠코리아,KOSDAQ,관리종목(소속부없음),161,0,0,0.00,0,0,0,0,0,6967193212,43274492,KSQ
2778,438580,KR7438580003,엔에이치스팩25호,KOSDAQ,SPAC(소속부없음),2295,1,15,0.66,2275,2300,2250,2391,5381845,6930900000,3020000,KSQ
2814,021045,KR7021041009,대호특수강우,KOSDAQ,중견기업부,7600,3,0,0.00,7600,7600,7600,11,83600,3224269600,424246,KSQ
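
The filter above works through a boolean mask: `df_krx['Market']=='KOSDAQ'` produces a True/False Series, and indexing the DataFrame with it keeps only the rows marked True. The sketch below shows the mechanism on a three-row stand-in frame. The name `df_demo` is hypothetical and the rows are copied from the listing output above. `value_counts()` is added to count the rows per market.

```python
import pandas as pd

# Small stand-in for df_krx; rows copied from the listing output above.
df_demo = pd.DataFrame(
    {
        "Code": ["005930", "028300", "086520"],
        "Name": ["삼성전자", "HLB", "에코프로"],
        "Market": ["KOSPI", "KOSDAQ", "KOSDAQ"],
    }
)

# Comparing a column with a value yields a boolean Series (the mask).
is_kosdaq = df_demo["Market"] == "KOSDAQ"

# Indexing with the mask keeps only the rows where the mask is True.
df_kosdaq = df_demo[is_kosdaq]

# value_counts() tallies how many rows fall in each market.
market_counts = df_demo["Market"].value_counts()
print(df_kosdaq)
print(market_counts)
```

Note that the filtered frame keeps the original index labels, which is why the KOSDAQ output above starts at 37 rather than 0.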


In [10]:
# Check the number of rows and columns, in (rows, columns) order.
df_krx.shape

(2828, 17)

In [11]:
# Show summary information for the whole DataFrame.
df_krx.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 2828 entries, 0 to 2827
Data columns (total 17 columns):
 #   Column       Non-Null Count  Dtype  
---  ------       --------------  -----  
 0   Code         2828 non-null   object 
 1   ISU_CD       2828 non-null   object 
 2   Name         2828 non-null   object 
 3   Market       2828 non-null   object 
 4   Dept         2828 non-null   object 
 5   Close        2828 non-null   object 
 6   ChangeCode   2828 non-null   object 
 7   Changes      2828 non-null   int64  
 8   ChagesRatio  2828 non-null   float64
 9   Open         2828 non-null   int64  
 10  High         2828 non-null   int64  
 11  Low          2828 non-null   int64  
 12  Volume       2828 non-null   int64  
 13  Amount       2828 non-null   int64  
 14  Marcap       2828 non-null   int64  
 15  Stocks       2828 non-null   int64  
 16  MarketId     2828 non-null   object 
dtypes: float64(1), int64(8), object(8)
memory usage: 375.7+ KB


In [12]:
# Summarize the descriptive statistics.
df_krx.describe()

Unnamed: 0,Changes,ChagesRatio,Open,High,Low,Volume,Amount,Marcap,Stocks
count,2828.0,2828.0,2828.0,2828.0,2828.0,2828.0,2828.0,2828.0,2828.0
mean,82.800566,0.746344,18818.427511,19076.301627,18571.738331,230214.0,2488983000.0,917084500000.0,42207460.0
std,855.586646,2.784334,48164.69157,48537.738342,47470.92593,1427480.0,13300020000.0,9900145000000.0,133283300.0
min,-14000.0,-25.36,0.0,0.0,0.0,0.0,0.0,796000000.0,200000.0
25%,-5.0,-0.18,2183.75,2203.75,2165.0,3470.25,16959920.0,49770510000.0,10252440.0
50%,11.0,0.36,5430.0,5560.0,5360.0,17802.5,92579960.0,105230300000.0,19912110.0
75%,95.0,1.4325,14425.0,14667.5,14155.0,74836.25,578864200.0,263016800000.0,42036070.0
max,14500.0,29.97,931000.0,934000.0,915000.0,50726650.0,380063900000.0,474000700000000.0,5969783000.0
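
`Close` is missing from this summary because `info()` reports it as `object`, and `describe()` summarizes only numeric columns by default. One way to include it is to convert the column with `pd.to_numeric`, assuming it holds digit strings as the listing suggests. The sketch below uses a stand-in frame. `df_demo` is a hypothetical name and the prices are taken from the listing above.

```python
import pandas as pd

df_demo = pd.DataFrame(
    {
        "Name": ["삼성전자", "SK하이닉스", "LG에너지솔루션"],
        "Close": ["79400", "199800", "330000"],  # stored as strings, not numbers
        "Changes": [1100, 5900, 500],
    }
)

# By default describe() only summarizes numeric columns, so Close is skipped.
before = df_demo.describe().columns.tolist()

# errors="coerce" turns unparsable values into NaN instead of raising.
df_demo["Close"] = pd.to_numeric(df_demo["Close"], errors="coerce")
after = df_demo.describe().columns.tolist()

print(before)
print(after)
```

With `errors="coerce"`, it is worth checking `df_demo["Close"].isna().sum()` afterwards to see whether any value failed to parse.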


In [13]:
# Show up to 60 rows when displaying a DataFrame.
pd.set_option('display.max_rows', 60)
df_krx

Unnamed: 0,Code,ISU_CD,Name,Market,Dept,Close,ChangeCode,Changes,ChagesRatio,Open,High,Low,Volume,Amount,Marcap,Stocks,MarketId
0,005930,KR7005930003,삼성전자,KOSPI,,79400,1,1100,1.40,79500,79800,78700,4797281,380063908100,474000734470000,5969782550,STK
1,000660,KR7000660001,SK하이닉스,KOSPI,,199800,1,5900,3.04,198700,200500,198100,1422029,283249610300,145454872527000,728002365,STK
2,373220,KR7373220003,LG에너지솔루션,KOSPI,,330000,1,500,0.15,333000,334000,327000,40538,13385655000,77220000000000,234000000,STK
3,207940,KR7207940008,삼성바이오로직스,KOSPI,,916000,2,-14000,-1.51,931000,934000,915000,32421,29916114000,65195384000000,71174000,STK
4,005380,KR7005380001,현대차,KOSPI,,258000,1,3000,1.18,258000,261000,255000,328362,84671381000,54029377278000,209416191,STK
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
2823,413300,KR7413300005,티엘엔지니어링,KONEX,일반기업부,1900,1,201,11.83,1900,1900,1900,1,1900,2480478500,1305515,KNX
2824,245450,KR7245450002,씨앤에스링크,KONEX,일반기업부,1398,1,123,9.65,1398,1398,1398,11,15378,2208784080,1579960,KNX
2825,288490,KR7288490006,나라소프트,KONEX,일반기업부,120,1,7,6.19,125,125,120,199,24380,2096589240,17471577,KNX
2826,236030,KR7236030003,씨알푸드,KONEX,일반기업부,555,1,28,5.31,570,580,450,64571,29743280,1128499260,2033332,KNX
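
`pd.set_option` changes the display setting for the rest of the session. When a different row limit is needed for a single output only, `pd.option_context` applies the option inside a `with` block and restores the previous value afterwards. A brief sketch, using a made-up 100-row frame for illustration:

```python
import pandas as pd

df_demo = pd.DataFrame({"n": range(100)})  # made-up data

default_rows = pd.get_option("display.max_rows")

# The option applies only inside the with block.
with pd.option_context("display.max_rows", 10):
    inside_rows = pd.get_option("display.max_rows")
    short_view = repr(df_demo)  # rendered with the temporary limit

# After the block, the earlier setting is back in effect.
restored_rows = pd.get_option("display.max_rows")
print(inside_rows, restored_rows == default_rows)
```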


## Saving to a file and reading it back
<img src="https://pandas.pydata.org/docs/_images/02_io_readwrite.svg">

In [14]:
# Preview with head
df_krx.head()

Unnamed: 0,Code,ISU_CD,Name,Market,Dept,Close,ChangeCode,Changes,ChagesRatio,Open,High,Low,Volume,Amount,Marcap,Stocks,MarketId
0,005930,KR7005930003,삼성전자,KOSPI,,79400,1,1100,1.40,79500,79800,78700,4797281,380063908100,474000734470000,5969782550,STK
1,000660,KR7000660001,SK하이닉스,KOSPI,,199800,1,5900,3.04,198700,200500,198100,1422029,283249610300,145454872527000,728002365,STK
2,373220,KR7373220003,LG에너지솔루션,KOSPI,,330000,1,500,0.15,333000,334000,327000,40538,13385655000,77220000000000,234000000,STK
3,207940,KR7207940008,삼성바이오로직스,KOSPI,,916000,2,-14000,-1.51,931000,934000,915000,32421,29916114000,65195384000000,71174000,STK
4,005380,KR7005380001,현대차,KOSPI,,258000,1,3000,1.18,258000,261000,255000,328362,84671381000,54029377278000,209416191,STK


In [15]:
# to_csv writes the DataFrame to a CSV file, a plain-text format for storing data.
df_krx.to_csv("data/krx.csv", index=False)
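
`to_csv` does not create missing directories, so the call above raises an error if the `data/` folder does not exist yet. The sketch below creates the folder first with `pathlib`. A temporary directory stands in for the project folder, and `df_demo` and `demo.csv` are hypothetical names.

```python
import tempfile
from pathlib import Path

import pandas as pd

df_demo = pd.DataFrame({"Symbol": ["MMM", "AOS"], "Name": ["3M", "A. O. Smith"]})

# A temporary directory stands in for the project folder in this sketch.
base_dir = Path(tempfile.mkdtemp())
data_dir = base_dir / "data"

# parents=True creates intermediate folders; exist_ok=True makes reruns safe.
data_dir.mkdir(parents=True, exist_ok=True)

csv_path = data_dir / "demo.csv"
# index=False leaves the row index out of the file.
df_demo.to_csv(csv_path, index=False)

header_line = csv_path.read_text(encoding="utf-8").splitlines()[0]
print(header_line)
```

In the notebook itself, `Path("data").mkdir(exist_ok=True)` before the `to_csv` call would serve the same purpose.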

In [16]:
# Read the saved CSV file back into a DataFrame to check it.
pd.read_csv("data/krx.csv")

Unnamed: 0,Code,ISU_CD,Name,Market,Dept,Close,ChangeCode,Changes,ChagesRatio,Open,High,Low,Volume,Amount,Marcap,Stocks,MarketId
0,005930,KR7005930003,삼성전자,KOSPI,,79400,1,1100,1.40,79500,79800,78700,4797281,380063908100,474000734470000,5969782550,STK
1,000660,KR7000660001,SK하이닉스,KOSPI,,199800,1,5900,3.04,198700,200500,198100,1422029,283249610300,145454872527000,728002365,STK
2,373220,KR7373220003,LG에너지솔루션,KOSPI,,330000,1,500,0.15,333000,334000,327000,40538,13385655000,77220000000000,234000000,STK
3,207940,KR7207940008,삼성바이오로직스,KOSPI,,916000,2,-14000,-1.51,931000,934000,915000,32421,29916114000,65195384000000,71174000,STK
4,005380,KR7005380001,현대차,KOSPI,,258000,1,3000,1.18,258000,261000,255000,328362,84671381000,54029377278000,209416191,STK
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
2823,413300,KR7413300005,티엘엔지니어링,KONEX,일반기업부,1900,1,201,11.83,1900,1900,1900,1,1900,2480478500,1305515,KNX
2824,245450,KR7245450002,씨앤에스링크,KONEX,일반기업부,1398,1,123,9.65,1398,1398,1398,11,15378,2208784080,1579960,KNX
2825,288490,KR7288490006,나라소프트,KONEX,일반기업부,120,1,7,6.19,125,125,120,199,24380,2096589240,17471577,KNX
2826,236030,KR7236030003,씨알푸드,KONEX,일반기업부,555,1,28,5.31,570,580,450,64571,29743280,1128499260,2033332,KNX
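
One thing to watch when reading a file back: `read_csv` infers each column's dtype, and a column that contains only digits is read as integers, which drops the leading zeros from codes such as `005930`. The output above keeps the zeros, most likely because some codes in the full listing contain letters, but a file whose codes are all digits would not. Passing `dtype` for that column keeps the codes as text either way. The sketch below uses an in-memory CSV, with `io.StringIO` standing in for the file and rows taken from the listing above.

```python
import io

import pandas as pd

csv_text = "Code,Name\n005930,삼성전자\n000660,SK하이닉스\n"

# Default inference: an all-digit column becomes integers, so the zeros are lost.
df_inferred = pd.read_csv(io.StringIO(csv_text))

# Forcing the column to str keeps the codes exactly as written.
df_text = pd.read_csv(io.StringIO(csv_text), dtype={"Code": str})

print(df_inferred["Code"].tolist())
print(df_text["Code"].tolist())
```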


In [17]:
## How to view help
#1 <Shift>+<Tab> : tooltip
#2 function_name? : help
#3 function_name?? : help plus source code
pd.read_csv??

Signature:
pd.read_csv(
    filepath_or_buffer: 'FilePath | ReadCsvBuffer[bytes] | ReadCsvBuffer[str]',
    *,
    sep: 'str | None | lib.NoDefault' = <no_default>,
    delimiter: 'str | None | lib.NoDefault' = None,
    header: "int | Sequence[int] | None | Literal['infer']" = 'infer',
    names: 'Sequence[Hashable] | None | lib.NoDefault' = <no_default>,
    index_col: 'IndexLabel | Literal[False] | None'

#### (Exercise) Read the SP500 stock listing, save it to a CSV file, then read the CSV file back to check it

In [18]:
import pandas as pd
import FinanceDataReader as fdr

In [19]:
df_sp500 = fdr.StockListing("SP500")
df_sp500

Unnamed: 0,Symbol,Name,Sector,Industry
0,MMM,3M,Industrials,Industrial Conglomerates
1,AOS,A. O. Smith,Industrials,Building Products
2,ABT,Abbott Laboratories,Health Care,Health Care Equipment
3,ABBV,AbbVie,Health Care,Biotechnology
4,ACN,Accenture,Information Technology,IT Consulting & Other Services
...,...,...,...,...
498,XYL,Xylem Inc.,Industrials,Industrial Machinery & Supplies & Components
499,YUM,Yum! Brands,Consumer Discretionary,Restaurants
500,ZBRA,Zebra Technologies,Information Technology,Electronic Equipment & Instruments
501,ZBH,Zimmer Biomet,Health Care,Health Care Equipment


In [20]:
df_sp500.to_csv("data/sp500.csv", index=False)

In [21]:
pd.read_csv("data/sp500.csv")

Unnamed: 0,Symbol,Name,Sector,Industry
0,MMM,3M,Industrials,Industrial Conglomerates
1,AOS,A. O. Smith,Industrials,Building Products
2,ABT,Abbott Laboratories,Health Care,Health Care Equipment
3,ABBV,AbbVie,Health Care,Biotechnology
4,ACN,Accenture,Information Technology,IT Consulting & Other Services
...,...,...,...,...
498,XYL,Xylem Inc.,Industrials,Industrial Machinery & Supplies & Components
499,YUM,Yum! Brands,Consumer Discretionary,Restaurants
500,ZBRA,Zebra Technologies,Information Technology,Electronic Equipment & Instruments
501,ZBH,Zimmer Biomet,Health Care,Health Care Equipment
