# `finaStat.ipynb`

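- **Fetch financial information for individual KOSPI stocks from their stock codes**
- Stock code (종목코드) -> corp code (고유번호) -> corporate registration number (법인등록번호) -> financial information

1. Look up the corp code from the stock code with the FSS corp code API
   - [**금융감독원\_고유번호**](https://opendart.fss.or.kr/guide/detail.do?apiGrpCd=DS001&apiId=2019018)
2. Look up the corporate registration number from the corp code with the FSS company overview API
   - [**금융감독원\_공시정보\_기업개황**](https://opendart.fss.or.kr/guide/detail.do?apiGrpCd=DS001&apiId=2019002)
   - Uses the KOSPI200 company list
3. Fetch financial information with the corporate registration number
   - [**금융위원회\_기업 재무정보**](https://www.data.go.kr/tcs/dss/selectApiDataDetailView.do?publicDataPk=15043459)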

---

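- [DART](https://dart.fss.or.kr/main.do): the electronic disclosure system for corporate information, operated by the Financial Supervisory Service (FSS, 금융감독원)
- FSC: Financial Services Commission (금융위원회)
- Consolidated (ConsolidatedMember) and separate (SeparateMember) financial statements
  - Consolidated financial statements include the results of subsidiaries.
  - Separate financial statements do not include the results of subsidiaries.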


# import


In [1]:
import os
import sys
import time
import pickle
import warnings
import json
from glob import glob
from io import BytesIO
from zipfile import ZipFile

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
import koreanize_matplotlib
import requests
from bs4 import BeautifulSoup as bs

import FinanceDataReader as fdr
from tqdm import tqdm
import xmltodict

warnings.filterwarnings('ignore')
pd.options.display.max_columns = None
# pd.options.display.float_format = '{:.4f}'.format
plt.style.use("ggplot")
%config InlineBackend.figure_format = 'retina'

sys.path.append("../import")
import module as m
from gitig_auth import authKey

data_path = m.data_path
fp_fs = f"""{m.fp["finaStat"]}"""

fp_cc = f"{data_path}DART_corpCode.parquet"
fp_fs_cm = f"{data_path}finaStat_cm.parquet"
fp_fs_sm = f"{data_path}finaStat_sm.parquet"

authKey_dart = authKey["dart"]
authKey_fscfs = authKey["fsc_finaStatInfo"]

data_path : ../data/
fp
{'esgRating': '../data/esgRating.parquet',
 'finaStat': '../data/finaStat.parquet',
 'stockPrice': '../data/stockPrice.parquet',
 'stockPrice_year': '../data/stockPrice_year.parquet'}


# `DART_corpCode`

1. Look up the corp code from the stock code with the FSS corp code API
   - [**금융감독원\_고유번호**](https://opendart.fss.or.kr/guide/detail.do?apiGrpCd=DS001&apiId=2019018)
2. Look up the corporate registration number from the corp code with the FSS company overview API
   - [**금융감독원\_공시정보\_기업개황**](https://opendart.fss.or.kr/guide/detail.do?apiGrpCd=DS001&apiId=2019002)
   - Uses the KOSPI200 company list
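
As a rough offline sketch of step 1, the snippet below restores the zero-padded code widths and looks up a corp code by stock code. The two-row DataFrame and the helper name `lookup_corp_code` are hypothetical stand-ins for the real `CORPCODE.xml` table; the code values themselves are taken from outputs shown in this notebook.

```python
import pandas as pd

# Hypothetical miniature of the corp code table; integer codes mimic how
# leading zeros get lost when the XML is parsed
df_demo = pd.DataFrame(
    {
        "corp_code": [126380, 105855],
        "corp_name": ["삼성전자", "엘에스일렉트릭"],
        "stock_code": [5930, 10120],
    }
)

# Restore fixed widths: 8 digits for corp_code, 6 digits for stock_code
df_demo["corp_code"] = df_demo["corp_code"].astype(str).str.zfill(8)
df_demo["stock_code"] = df_demo["stock_code"].astype(str).str.zfill(6)

# Index by stock_code so each lookup is a single label access
corp_by_stock = df_demo.set_index("stock_code")["corp_code"]


def lookup_corp_code(stock_code):
    """Return the corp code for a stock code, or None when it is unknown."""
    return corp_by_stock.get(stock_code)


print(lookup_corp_code("005930"))
```

Returning `None` for an unknown stock code is a design choice of this sketch; it avoids the `IndexError` that a positional `.values[0]` lookup raises when no row matches.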


## FSS corp code (금융감독원\_고유번호)


In [5]:
url = f"https://opendart.fss.or.kr/api/corpCode.xml?crtfc_key={authKey_dart}"
response = requests.get(url)
if response.status_code == 200:
    with ZipFile(BytesIO(response.content)) as f:
        df_cc_raw = f.read("CORPCODE.xml")
        df_cc_raw = pd.read_xml(df_cc_raw)

    display(df_cc_raw)
    # (Optional) keep a copy because this step takes a long time to run
    df_cc = df_cc_raw.copy()
    
else:
    print(response.status_code)

Unnamed: 0,corp_code,corp_name,stock_code,modify_date
0,434003,다코,,20170630
1,434456,일산약품,,20170630
2,430964,굿앤엘에스,,20170630
3,432403,한라판지,,20170630
4,388953,크레디피아제이십오차유동화전문회사,,20170630
...,...,...,...,...
97151,151571,청림실업,,20221114
97152,1143889,에이치엠지하우징,,20221114
97153,1359578,성남대장피에프브이,,20221114
97154,1002944,스마트에프앤디,,20221114


### Preprocessing


In [6]:
# Drop rows without a stock_code
df_cc = df_cc.dropna(subset=["stock_code"])
# Zero-pad the codes to their fixed widths (corp_code: 8, stock_code: 6)
df_cc["corp_code"] = df_cc["corp_code"].astype(int).astype(str).apply(lambda x: x.zfill(8))
df_cc["stock_code"] = df_cc["stock_code"].astype(int).astype(str).apply(lambda x: x.zfill(6))
# Set the column order
df_cc = df_cc[["stock_code", "corp_code", "corp_name"]]
df_cc.head()

Unnamed: 0,stock_code,corp_code,corp_name
2009,36720,260985,한빛네트
2021,40130,264529,엔플렉스
2022,55000,358545,동서정보기술
2784,32600,231567,애드모바일
3889,37600,247939,씨모스


### Preprocessing: keep only the stocks under analysis


In [7]:
df_components = pd.read_csv(f"{data_path}components_list.csv")
df_components["종목코드"] = df_components["종목코드"].astype(str).apply(lambda x: x.zfill(6))
df_components

Unnamed: 0,종목코드,종목명
0,000020,동화약품
1,000030,우리은행
2,000050,경방
3,000060,메리츠화재
4,000070,삼양홀딩스
...,...,...
342,377300,카카오페이
343,381970,케이카
344,383220,F&F
345,383800,LX홀딩스


In [8]:
list_components = set(df_components["종목코드"].to_list())  # set of KOSPI200 stock codes
df_cc.drop(df_cc[~df_cc["stock_code"].isin(list_components)].index, inplace=True)
df_cc

Unnamed: 0,stock_code,corp_code,corp_name
4916,103150,00684547,하이트맥주
10912,003640,00140380,유니온스틸
13183,064420,00399773,케이피케미칼
14065,053000,00375302,우리금융지주
25840,068870,00423609,LG생명과학
...,...,...,...
97025,079980,00362238,휴비스
97029,010120,00105855,엘에스일렉트릭
97035,005930,00126380,삼성전자
97098,096760,00632304,JW홀딩스


## FSS company overview (금융감독원\_공시정보\_기업개황)


### Functions: stock code -> corporate registration number


In [9]:
# Stock code -> corp code
def stockCode_to_corpCode(stock_code, df=df_cc):
    corpCode = df[df["stock_code"] == stock_code]["corp_code"].values[0]
    return corpCode


# Corp code -> corporate registration number (jurir_no)
def corpCode_to_jurirNo(corp_code, authKey=authKey_dart):

    url = "https://opendart.fss.or.kr/api/company.json"
    params = {"crtfc_key": authKey, "corp_code": corp_code}

    response = requests.get(url, params=params)
    # Pause briefly after every request so consecutive calls are spaced out
    time.sleep(0.01)
    if response.status_code == 200:
        return response.json()["jurir_no"]
    else:
        print(response.status_code)


# Stock code -> corp code -> corporate registration number
def stockCode_to_jurirNo(stock_code, df=df_cc, authKey=authKey_dart):

    corpCode = stockCode_to_corpCode(stock_code, df)
    jurirNo = corpCode_to_jurirNo(corpCode, authKey)
    return jurirNo


# Function test: Samsung Electronics (삼성전자), 00126380, 1301110006246
stock_code = "005930"
corpCode = stockCode_to_corpCode(stock_code)
jurirNo = stockCode_to_jurirNo(stock_code)
print(corpCode == "00126380")
print(jurirNo == "1301110006246")

True
True


In [10]:
def temp(stock_code):
    jurirNo = stockCode_to_jurirNo(stock_code)
    return jurirNo


df_cc["jurir_no"] = df_cc["stock_code"].map(temp)
# (Optional) keep a copy because this step takes a long time to run
df_cc_raw2 = df_cc.copy()
df_cc

Unnamed: 0,stock_code,corp_code,corp_name,jurir_no
4916,103150,00684547,하이트맥주,1101113927427
10912,003640,00140380,유니온스틸,1101110041501
13183,064420,00399773,케이피케미칼,2301110082112
14065,053000,00375302,우리금융지주,1101112202797
25840,068870,00423609,LG생명과학,1101112581183
...,...,...,...,...
97025,079980,00362238,휴비스,1101112102070
97029,010120,00105855,엘에스일렉트릭,1101110520076
97035,005930,00126380,삼성전자,1301110006246
97098,096760,00632304,JW홀딩스,1101113710468


## (Optional) Persistence


In [11]:
m.DfPrst(df_cc, fp_cc)

['../data/DART_corpCode.parquet']


# `finaStat`

FSC corporate financial information (금융위원회\_기업 재무정보)

- Request URL: http://apis.data.go.kr/1160100/service/GetFinaStatInfoService/getSummFinaStat
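
The parsing code in this section reads `totalCount` and `items.item` under `response.body`. A minimal offline sketch of that parsing step follows; the payload is hand-made for illustration (not a real API response), and the helper name `parse_fina_stat` is hypothetical.

```python
import pandas as pd

# Hand-made payload for illustration only (not a real API response);
# only the nesting mirrors what the notebook's parsing code reads
sample_payload = {
    "response": {
        "body": {
            "totalCount": 2,
            "items": {
                "item": [
                    {"crno": "1301110006246", "bizYear": "2015", "fnclDcd": "ifrs_ConsolidatedMember"},
                    {"crno": "1301110006246", "bizYear": "2015", "fnclDcd": "ifrs_SeparateMember"},
                ]
            },
        }
    }
}


def parse_fina_stat(payload):
    """Return the item rows as a DataFrame, or an empty one when there are none."""
    body = payload["response"]["body"]
    if body["totalCount"] == 0:
        return pd.DataFrame()
    return pd.json_normalize(body["items"]["item"])


df_demo = parse_fina_stat(sample_payload)
print(df_demo.shape)
```

Returning an empty DataFrame when `totalCount` is 0 is an assumption of this sketch; it keeps the return type uniform for a later `pd.concat`.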


## Load the file


In [2]:
if glob(fp_cc):
    df_cc = m.DataLoad(fp_cc)

# # When loading from CSV, restore the zero-padded code widths
# df_cc["corp_code"] = df_cc["corp_code"].astype(int).astype(str).apply(lambda x: x.zfill(8))
# df_cc["stock_code"] = df_cc["stock_code"].astype(int).astype(str).apply(lambda x: x.zfill(6))
# df_cc["jurir_no"] = df_cc["jurir_no"].astype(str).apply(lambda x: x.zfill(13))

Mem. usage decreased to  0.01 Mb (0.0% reduction)


┌▣ df.shape ---- ---- ---- ----
(347, 4)


┌▣ df.info() ---- ---- ---- ----
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 347 entries, 0 to 346
Data columns (total 4 columns):
 #   Column      Non-Null Count  Dtype 
---  ------      --------------  ----- 
 0   stock_code  347 non-null    object
 1   corp_code   347 non-null    object
 2   corp_name   347 non-null    object
 3   jurir_no    347 non-null    object
dtypes: object(4)
memory usage: 11.0+ KB
None


┌▣ df.head() ---- ---- ---- ----


Unnamed: 0,stock_code,corp_code,corp_name,jurir_no
0,103150,684547,하이트맥주,1101113927427
1,3640,140380,유니온스틸,1101110041501
2,64420,399773,케이피케미칼,2301110082112
3,53000,375302,우리금융지주,1101112202797
4,68870,423609,LG생명과학,1101112581183


┌▣ df.columns.to_list() ---- ---- ---- ----
['stock_code', 'corp_code', 'corp_name', 'jurir_no']


## FSC corporate financial information (금융위원회\_기업 재무정보)


In [5]:
# Function: fetch summary financial statements for one corporate registration number
def Get_FinaStatInfo(
    crno,
    authKey,
    bizYear="",
    numOfRows="",
    pageNo="",
    url="http://apis.data.go.kr/1160100/service/GetFinaStatInfoService/getSummFinaStat",
):

    params = {
        "serviceKey": authKey,
        "numOfRows": numOfRows,
        "pageNo": pageNo,
        "resultType": "json",
        "crno": crno,
        "bizYear": bizYear,
    }

    c = 0

    def func(c):
        try:
            response = requests.get(url, params=params)
            time.sleep(0.01)

            rsc = response.status_code
            if rsc == 200:
                rj = response.json()
                # Skip when totalCount (number of financial data rows) is 0
                totalCount = rj["response"]["body"]["totalCount"]
                if totalCount != 0:
                    data_json = rj["response"]["body"]["items"]["item"]
                    return pd.json_normalize(data_json)

            else:
                print(rsc)

        except Exception:
            c += 1
            print(f"errCount : {c}, crno : {crno}")
            time.sleep(2)
            # Retry and pass the retry's result back to the caller
            return func(c)

    return func(c)


# Test: Samsung Electronics (삼성전자)
jurirNo = "1301110006246"
t = Get_FinaStatInfo(jurirNo, authKey=authKey_fscfs)
t

Unnamed: 0,basDt,crno,bizYear,fnclDcd,fnclDcdNm,enpSaleAmt,enpBzopPft,iclsPalClcAmt,enpCrtmNpf,enpTastAmt,enpTdbtAmt,enpTcptAmt,enpCptlAmt,fnclDebtRto
0,20151231,1301110006246,2015,ifrs_ConsolidatedMember,연결요약재무제표,200653482000000,26413442000000,25960995000000,19060144000000,242179521000000,63119716000000,179059805000000,0,35.2506337198
1,20151231,1301110006246,2015,ifrs_SeparateMember,별도요약재무제표,135205045000000,13398215000000,14352617000000,12238469000000,168969630000000,32541375000000,136428255000000,0,23.8523720764
2,20161231,1301110006246,2016,ifrs_ConsolidatedMember,연결요약재무제표,201866745000000,29240672000000,30713652000000,22726092000000,262174324000000,69211291000000,192963033000000,0,35.8676425862
3,20161231,1301110006246,2016,ifrs_SeparateMember,별도요약재무제표,133947204000000,13647436000000,14725074000000,11579749000000,174802959000000,37256197000000,137546762000000,0,27.0862043266
4,20171231,1301110006246,2017,ifrs_ConsolidatedMember,연결요약재무제표,239575376000000,53645038000000,56195967000000,42186747000000,301752090000000,87260662000000,214491428000000,897514000000,40.6825870915
5,20171231,1301110006246,2017,ifrs_SeparateMember,별도요약재무제표,161915007000000,34857091000000,36533552000000,28800837000000,198241360000000,46671585000000,151569775000000,897514000000,30.7921450698
6,20181231,1301110006246,2018,ifrs_ConsolidatedMember,연결요약재무제표,243771415000000,58886669000000,61159958000000,44344857000000,339357244000000,91604067000000,247753177000000,897514000000,36.9739222355
7,20181231,1301110006246,2018,ifrs_SeparateMember,별도요약재무제표,170381870000000,43699451000000,44398855000000,32815127000000,219021357000000,46033232000000,172988125000000,897514000000,26.6106312211
8,20191231,1301110006246,2019,ifrs_ConsolidatedMember,연결요약재무제표,230400881000000,27768509000000,30432189000000,21738865000000,352564497000000,89684076000000,262880421000000,897514000000,34.1159207136
9,20191231,1301110006246,2019,ifrs_SeparateMember,별도요약재무제표,154772859000000,14115067000000,19032469000000,15353323000000,216180920000000,38310673000000,177870247000000,897514000000,21.5385505143


In [6]:
df_fs_raw = pd.DataFrame()

for jurirNo in tqdm(df_cc["jurir_no"].values[:]):
    tmp = Get_FinaStatInfo(jurirNo, authKey=authKey_fscfs)
    df_fs_raw = pd.concat([df_fs_raw, tmp], axis=0, sort=False)

# (Optional) pickle the result because this step takes a long time to run
with open("df_fs_raw.pickle", "wb") as f:
    pickle.dump(df_fs_raw, f)

# (Optional) keep a copy because this step takes a long time to run
df_fs = df_fs_raw.copy()

  0%|          | 1/347 [00:00<01:03,  5.47it/s]

errCount : 1, crno : 1101110041501


  4%|▍         | 14/347 [00:18<01:23,  3.97it/s]

In [3]:
# (Optional) load the backed-up pickle
with open("df_fs_raw.pickle", "rb") as f:
    df_fs = pickle.load(f)

## Preprocessing: set the period

In [4]:
s = df_fs["bizYear"].astype(int)
df_fs = df_fs[(s >= 2010) & (s <= 2022)]
df_fs["bizYear"].unique()

array(['2015', '2016', '2018', '2017', '2019', '2011', '2012', '2013',
       '2020', '2021', '2014', '2010'], dtype=object)

## Merge


In [5]:
df_fs = pd.merge(df_fs, df_cc, how="left", left_on="crno", right_on="jurir_no")
df_fs

Unnamed: 0,basDt,crno,bizYear,fnclDcd,fnclDcdNm,enpSaleAmt,enpBzopPft,iclsPalClcAmt,enpCrtmNpf,enpTastAmt,enpTdbtAmt,enpTcptAmt,enpCptlAmt,fnclDebtRto,stock_code,corp_code,corp_name,jurir_no
0,20151231,1101112581183,2015,ifrs_ConsolidatedMember,연결요약재무제표,450526355034,25201939449,13894239541,11371548367,706980326807,449358007227,257622319580,84066030000,174.4251072499,068870,00423609,LG생명과학,1101112581183
1,20151231,1101112581183,2015,ifrs_SeparateMember,별도요약재무제표,435446816306,26136971686,15021017434,12443782196,703705604242,445384074854,258321529388,84066030000,172.4146167411,068870,00423609,LG생명과학,1101112581183
2,20150331,1748110000151,2015,ifrs_ConsolidatedMember,연결요약재무제표,207890217005,-871462412,-5660470477,-4315628598,727951594676,430331568399,297620026277,0,144.590931525,008000,00148717,도레이케미칼,1748110000151
3,20150331,1748110000151,2015,ifrs_SeparateMember,별도요약재무제표,175932552323,-2615227674,-6690401088,-5677659364,669744438135,397542364257,272202073878,0,146.0467800973,008000,00148717,도레이케미칼,1748110000151
4,20160331,1748110000151,2016,ifrs_ConsolidatedMember,연결요약재무제표,856457055850,48660898849,33307151671,21854084841,713772955053,397204546704,316568408349,0,125.4719473669,008000,00148717,도레이케미칼,1748110000151
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
2697,20171231,1101110003733,2017,ifrs_SeparateMember,별도요약재무제표,286386535237,19514857445,19981045677,13814407827,689594455044,66157192173,623437262871,8450000000,10.6116839838,001130,00113243,대한제분,1101110003733
2698,20181231,1101110003733,2018,ifrs_ConsolidatedMember,연결요약재무제표,864585835647,32798360071,74914386097,51472655804,919690198948,170684939463,749005259485,8450000000,22.7882164112,001130,00113243,대한제분,1101110003733
2699,20181231,1101110003733,2018,ifrs_SeparateMember,별도요약재무제표,305132945611,19773999199,55139663239,34676568703,707393585361,61768795388,645624789973,8450000000,9.5672899101,001130,00113243,대한제분,1101110003733
2700,20191231,1101110003733,2019,ifrs_ConsolidatedMember,연결요약재무제표,933866093002,23483614049,25514184242,16993288734,1005108274460,245259980306,759848294154,8450000000,32.277493046,001130,00113243,대한제분,1101110003733


## Preprocessing


In [6]:
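# Rename columns (English glosses in the trailing comments)
dict_colReName = {
    "stock_code": "종목코드",  # stock code
    "corp_name": "종목명",  # stock name
    "basDt": "연_월_일",  # base date (YYYYMMDD)
    "crno": "법인등록번호",  # corporate registration number
    "bizYear": "사업연도",  # business year
    "fnclDcd": "재무제표구분코드",  # financial statement type code
    "fnclDcdNm": "재무제표구분코드명",  # financial statement type name
    "enpSaleAmt": "기업매출금액",  # revenue
    "enpBzopPft": "기업영업이익",  # operating profit
    "iclsPalClcAmt": "포괄손익계산금액",  # comprehensive income amount
    "enpCrtmNpf": "기업당기순이익",  # net income for the period
    "enpTastAmt": "기업총자산금액",  # total assets
    "enpTdbtAmt": "기업총부채금액",  # total liabilities
    "enpTcptAmt": "기업총자본금액",  # total equity
    "enpCptlAmt": "기업자본금액",  # capital stock
    "fnclDebtRto": "재무제표부채비율",  # debt ratio (%)
}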
df_fs = df_fs.rename(columns=dict_colReName)

# Reorder columns
list_colOrder = [
    "종목코드",
    "종목명",
    "연_월_일",
    "재무제표구분코드명",
    "기업매출금액",
    "기업영업이익",
    "포괄손익계산금액",
    "기업당기순이익",
    "기업총자산금액",
    "기업총부채금액",
    "기업총자본금액",
    "기업자본금액",
    "재무제표부채비율",
]
df_fs = df_fs[list_colOrder]

# Convert the amount and ratio columns to numeric
list_roof = [
    "기업매출금액",
    "기업영업이익",
    "포괄손익계산금액",
    "기업당기순이익",
    "기업총자산금액",
    "기업총부채금액",
    "기업총자본금액",
    "기업자본금액",
    "재무제표부채비율",
]
for i in list_roof:
    df_fs[f"{i}"] = pd.to_numeric(df_fs[f"{i}"])

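# Add derived variables: 연 (year), 분기 (quarter), 월 (month) and their combinations
# The base date arrives as YYYYMMDD (e.g. 20151231)
col = pd.to_datetime(df_fs["연_월_일"], format="%Y%m%d")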
df_fs["연"] = col.dt.year
df_fs["분기"] = col.dt.quarter
df_fs["월"] = col.dt.month
df_fs["연_분기"] = df_fs["연"].astype("str") + "-" + df_fs["분기"].astype("str")
df_fs["연_월"] = df_fs["연"].astype("str") + "-" + df_fs["월"].astype("str")
df_fs["분기_월"] = df_fs["분기"].astype("str") + "-" + df_fs["월"].astype("str")
df_fs["연_분기_월"] = (
    df_fs["연"].astype("str") + "-" + df_fs["분기"].astype("str") + "-" + df_fs["월"].astype("str")
)


# Sort
df_fs = df_fs.sort_values(by=["종목코드", "재무제표구분코드명", "연_월_일"], ascending=[True, True, True])

# Used below
list_col = df_fs.columns.to_list()

# Check
df_fs.head(2)

Unnamed: 0,종목코드,종목명,연_월_일,재무제표구분코드명,기업매출금액,기업영업이익,포괄손익계산금액,기업당기순이익,기업총자산금액,기업총부채금액,기업총자본금액,기업자본금액,재무제표부채비율,연,분기,월,연_분기,연_월,분기_월,연_분기_월
517,20,동화약품,20151231,별도요약재무제표,223201285434,4812973681,6000622879,5608652157,317187030052,87069287627,230117742425,27931470000,37.836842,2015,4,12,2015-4,2015-12,4-12,2015-4-12
518,20,동화약품,20161231,별도요약재무제표,237470834801,11259333902,35655076190,26254318411,324604536650,71679236748,252925299902,27931470000,28.340082,2016,4,12,2016-4,2016-12,4-12,2016-4-12


### MinMaxScaling
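- Ordinary MinMaxScaling scales each column by the min and max of the whole column.
- That is not appropriate for this analysis, so each column is scaled by the min and max of each individual stock instead.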
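
The implementation of `m.DerivedCol_Groupby_MinMaxScaler` lives in the author's module and is not shown here. The sketch below is one assumed equivalent built on `groupby(...).transform`; the function name `groupby_minmax`, the demo data, and the choice to map constant groups to 0.0 are all assumptions of this example.

```python
import pandas as pd


def groupby_minmax(df, group_cols, value_cols, suffix="_mmscl"):
    """Add min-max scaled copies of value_cols, scaled within each group."""
    out = df.copy()
    grouped = out.groupby(group_cols)[value_cols]
    g_min = grouped.transform("min")
    g_max = grouped.transform("max")
    span = g_max - g_min
    # Divide only where the group has a non-zero range
    scaled = (out[value_cols] - g_min) / span.where(span != 0)
    # A constant group has span 0; map it to 0.0 (assumed convention)
    scaled = scaled.mask(span == 0, 0.0)
    for c in value_cols:
        out[f"{c}{suffix}"] = scaled[c]
    return out


# Hypothetical demo data: two stocks on very different scales
df_demo = pd.DataFrame(
    {
        "code": ["A", "A", "A", "B", "B"],
        "sales": [10.0, 20.0, 30.0, 500.0, 1000.0],
        "capital": [5.0, 5.0, 5.0, 7.0, 9.0],
    }
)
df_scaled = groupby_minmax(df_demo, ["code"], ["sales", "capital"])
print(df_scaled)
```

Each stock's values land in [0, 1] relative to its own history, so a small company's best year and a large company's best year both scale to 1.0.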


In [10]:
l = [
    "기업매출금액",
    "기업영업이익",
    "포괄손익계산금액",
    "기업당기순이익",
    "기업총자산금액",
    "기업총부채금액",
    "기업총자본금액",
    "기업자본금액",
    "재무제표부채비율",
]
df_fs = m.DerivedCol_Groupby_MinMaxScaler(df_fs, ["종목코드", "종목명"],l)
df_fs

Unnamed: 0,종목코드,종목명,연_월_일,재무제표구분코드명,기업매출금액,기업영업이익,포괄손익계산금액,기업당기순이익,기업총자산금액,기업총부채금액,기업총자본금액,기업자본금액,재무제표부채비율,연,분기,월,연_분기,연_월,분기_월,연_분기_월,기업매출금액_mmscl,기업영업이익_mmscl,포괄손익계산금액_mmscl,기업당기순이익_mmscl,기업총자산금액_mmscl,기업총부채금액_mmscl,기업총자본금액_mmscl,기업자본금액_mmscl,재무제표부채비율_mmscl
517,000020,동화약품,20151231,별도요약재무제표,223201285434,4812973681,6000622879,5608652157,317187030052,87069287627,230117742425,27931470000,37.836842,2015,4,12,2015-4,2015-12,4-12,2015-4-12,0.000000,0.000000,0.000000,0.000000,0.000000,0.805409,0.000000,0.0,1.000000
518,000020,동화약품,20161231,별도요약재무제표,237470834801,11259333902,35655076190,26254318411,324604536650,71679236748,252925299902,27931470000,28.340082,2016,4,12,2016-4,2016-12,4-12,2016-4-12,0.169979,0.351484,0.500767,0.498683,0.056788,0.067106,0.175422,0.0,0.384839
519,000020,동화약품,20171231,별도요약재무제표,258881616575,10987308187,65218742497,47009013175,367225133428,70280404999,296944728429,27931470000,23.667841,2017,4,12,2017-4,2017-12,4-12,2017-4-12,0.425025,0.336652,1.000000,1.000000,0.383089,0.000000,0.513993,0.0,0.082191
521,000020,동화약품,20181231,별도요약재무제표,306602589029,11232142004,14545424343,10074474538,370294498762,73197485231,297097013531,27931470000,24.637570,2018,4,12,2018-4,2018-12,4-12,2018-4-12,0.993479,0.350002,0.144294,0.107869,0.406588,0.139940,0.515164,0.0,0.145006
523,000020,동화약품,20191231,별도요약재무제표,307150025786,10056452854,16743189225,9529175971,376466088770,74852728834,301613359936,27931470000,24.817445,2019,4,12,2019-4,2019-12,4-12,2019-4-12,1.000000,0.285898,0.181407,0.094698,0.453837,0.219347,0.549901,0.0,0.156657
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
1124,377300,카카오페이,20211231,연결요약재무제표,0,-27227158574,-25951154634,-33863310446,3432708944352,1637071183230,1795637761122,65941540000,91.169345,2021,4,12,2021-4,2021-12,4-12,2021-4-12,0.000000,0.000000,0.000000,0.000000,1.000000,1.000000,1.000000,0.0,1.000000
480,381970,케이카,20211231,별도요약재무제표,1902394224458,71108716318,62536622867,46775528905,550023979312,274551592028,275472387284,24043266500,99.665740,2021,4,12,2021-4,2021-12,4-12,2021-4-12,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.0,0.000000
207,383220,F&F,20211231,별도요약재무제표,1081845593257,368305263484,369915729010,267596187446,1073133609361,485161176132,587972433229,3830707500,82.514273,2021,4,12,2021-4,2021-12,4-12,2021-4-12,0.000000,1.000000,1.000000,1.000000,0.000000,0.000000,1.000000,0.0,0.000000
206,383220,F&F,20211231,연결요약재무제표,1089172086930,322683779494,322533081639,231925929728,1146494709343,599071742826,547422966517,3830707500,109.434894,2021,4,12,2021-4,2021-12,4-12,2021-4-12,1.000000,0.000000,0.000000,0.000000,1.000000,1.000000,0.000000,0.0,1.000000


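## (Optional) Unify '요약ㅇㅇ' and 'ㅇㅇ요약' labels as 'ㅇㅇ요약'

- Unify '연결요약재무제표' and '요약연결재무제표' as '연결요약재무제표' (consolidated summary financial statements)
- Unify '별도요약재무제표' and '요약별도재무정보' as '별도요약재무제표' (separate summary financial statements)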
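
The same unification can be sketched with a mapping passed to `Series.replace`. The label strings come from this section; the demo Series and the name `label_map` are hypothetical.

```python
import pandas as pd

# Variant label -> canonical label, using the label strings from this section
label_map = {
    "요약연결재무제표": "연결요약재무제표",
    "요약별도재무정보": "별도요약재무제표",
}

# Hypothetical demo column containing both spellings of each label
s_demo = pd.Series(["요약연결재무제표", "연결요약재무제표", "요약별도재무정보", "별도요약재무제표"])
s_unified = s_demo.replace(label_map)
print(s_unified.unique())
```

Keeping the variants in one dictionary means a newly discovered spelling only needs one extra entry.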


In [11]:
df_fs.loc[
    (df_fs["재무제표구분코드명"] == "요약연결재무제표") | (df_fs["재무제표구분코드명"] == "연결요약재무제표"), "재무제표구분코드명"
] = "연결요약재무제표"

df_fs.loc[
    (df_fs["재무제표구분코드명"] == "요약별도재무정보") | (df_fs["재무제표구분코드명"] == "별도요약재무제표"), "재무제표구분코드명"
] = "별도요약재무제표"

df_fs["재무제표구분코드명"].unique()

array(['별도요약재무제표', '연결요약재무제표'], dtype=object)

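## (Optional) Split consolidated and separate financial statements

- One DataFrame with only consolidated financial statements ('연결재무제표')
- One DataFrame with only separate financial statements ('별도재무제표')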


In [12]:
df_fs_cm = df_fs[(df_fs["재무제표구분코드명"] == "요약연결재무제표") | (df_fs["재무제표구분코드명"] == "연결요약재무제표")]
df_fs_cm.head(2)

Unnamed: 0,종목코드,종목명,연_월_일,재무제표구분코드명,기업매출금액,기업영업이익,포괄손익계산금액,기업당기순이익,기업총자산금액,기업총부채금액,기업총자본금액,기업자본금액,재무제표부채비율,연,분기,월,연_분기,연_월,분기_월,연_분기_월,기업매출금액_mmscl,기업영업이익_mmscl,포괄손익계산금액_mmscl,기업당기순이익_mmscl,기업총자산금액_mmscl,기업총부채금액_mmscl,기업총자본금액_mmscl,기업자본금액_mmscl,재무제표부채비율_mmscl
520,20,동화약품,20181231,연결요약재무제표,306602589029,11225780035,14539062374,10068112569,370599242793,73198591231,297400651562,27931470000,24.612788,2018,4,12,2018-4,2018-12,4-12,2018-4-12,0.993479,0.349655,0.144186,0.107715,0.408921,0.139994,0.517499,0.0,0.143401
522,20,동화약품,20191231,연결요약재무제표,307150025786,9917746256,16607867985,9393854731,376028553041,75153958643,300874594398,27931470000,24.978499,2019,4,12,2019-4,2019-12,4-12,2019-4-12,1.0,0.278335,0.179122,0.091429,0.450487,0.233798,0.544219,0.0,0.16709


In [13]:
df_fs_sm = df_fs[(df_fs["재무제표구분코드명"] == "요약별도재무정보") | (df_fs["재무제표구분코드명"] == "별도요약재무제표")]
df_fs_sm.head(2)

Unnamed: 0,종목코드,종목명,연_월_일,재무제표구분코드명,기업매출금액,기업영업이익,포괄손익계산금액,기업당기순이익,기업총자산금액,기업총부채금액,기업총자본금액,기업자본금액,재무제표부채비율,연,분기,월,연_분기,연_월,분기_월,연_분기_월,기업매출금액_mmscl,기업영업이익_mmscl,포괄손익계산금액_mmscl,기업당기순이익_mmscl,기업총자산금액_mmscl,기업총부채금액_mmscl,기업총자본금액_mmscl,기업자본금액_mmscl,재무제표부채비율_mmscl
517,20,동화약품,20151231,별도요약재무제표,223201285434,4812973681,6000622879,5608652157,317187030052,87069287627,230117742425,27931470000,37.836842,2015,4,12,2015-4,2015-12,4-12,2015-4-12,0.0,0.0,0.0,0.0,0.0,0.805409,0.0,0.0,1.0
518,20,동화약품,20161231,별도요약재무제표,237470834801,11259333902,35655076190,26254318411,324604536650,71679236748,252925299902,27931470000,28.340082,2016,4,12,2016-4,2016-12,4-12,2016-4-12,0.169979,0.351484,0.500767,0.498683,0.056788,0.067106,0.175422,0.0,0.384839


## Persistence


In [14]:
m.DfPrst(df_fs, fp_fs)

['../data/finaStat.parquet']


In [15]:
m.DfPrst(df_fs_cm, fp_fs_cm)
m.DfPrst(df_fs_sm, fp_fs_sm)

['../data/finaStat_cm.parquet']
['../data/finaStat_sm.parquet']
