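 #### Analysis goals
 1. Analyze the trend in mean generation by hour of day (bar, line)
   - Identify the hours of the day in which generation is highest
 2. Analyze monthly trends in total generation, total sunshine duration, and mean solar irradiance (bar, line)
   - Use monthly total generation, total sunshine duration, mean solar irradiance, and mean precipitation to pick out the months with the best generation efficiency
 3. Analyze the correlation between solar generation and each weather variable (scatter, heatmap)
   - Correlate generation with the weather variables (precipitation, temperature, humidity, solar irradiance, sunshine duration, total cloud cover, visibility, wind speed, wind direction)
     to identify the ones that affect generation, for use in a later prediction model
 4. Analyze the correlation between solar generation and combinations of the most strongly related weather variables (ANOVA)
   - Compare the correlation coefficient of each single weather variable with that of each combination of strongly related variables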
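
Goals 1 and 3 can be sketched on synthetic data as follows. This is a minimal illustration, not the notebook's own analysis: the `demo` frame and its values are made up, and only the column names (`시간`, `일사량`, `습도`, `발전량`) follow the layout described in this notebook.

```python
import numpy as np
import pandas as pd

# Synthetic hourly data for illustration only (not the KPX/ASOS data).
rng = np.random.default_rng(0)
hours = np.tile(np.arange(24), 30)  # 30 days x 24 hours
# Toy irradiance curve: zero at night, peaking at noon.
irradiance = np.clip(np.sin((hours - 6) / 12 * np.pi), 0, None)
demo = pd.DataFrame({
    '시간': hours,
    '일사량': irradiance,
    '습도': rng.uniform(30, 90, size=hours.size),
    '발전량': irradiance * 10 + rng.normal(0, 0.1, size=hours.size),
})

# Goal 1: mean generation per hour of day, and the hour where it peaks.
hourly_mean = demo.groupby('시간')['발전량'].mean()
peak_hour = int(hourly_mean.idxmax())

# Goal 3: Pearson correlation of generation with every other column.
corr_with_gen = demo.corr()['발전량'].drop('발전량')
```

`hourly_mean` can be passed straight to a bar or line plot, and `demo.corr()` to a heatmap, which matches the chart types listed in the goals.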

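* DataFrame layout
  - Columns
    - 날짜 (date): YYYYMMDD (year, month, day)
    - 시간 (hour): HH (hour of day)
    - 발전량 (generation): Wh (the source CSV column is labelled `태양광 발전량(MWh)`)
    - 강수량 (precipitation): mm
    - 기온 (temperature): °C
    - 습도 (humidity): %
    - 일사량 (solar irradiance): MJ/m²
    - 일조시간 (sunshine duration): hr
    - 전운량 (total cloud cover): tenths
    - 시정 (visibility): units of 10 m
    - 풍속 (wind speed): m/s
    - 풍향 (wind direction): 16-point compass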

In [190]:
import numpy as np
import pandas as pd
import matplotlib
import matplotlib.font_manager  # imported explicitly; used below for FontProperties
import matplotlib.pyplot as plt
import seaborn as sns

# Font families matplotlib currently uses
current_font_list = matplotlib.rcParams['font.family']

# Korean-capable font (Batang); this path exists only on Windows
font_path = 'C:\\Windows\\Fonts\\batang.ttc'
kfont = matplotlib.font_manager.FontProperties(fname=font_path).get_name()

# Put the Korean font first so that Hangul labels render in plots
matplotlib.rcParams['font.family'] = [kfont] + current_font_list

In [191]:
pg_file = '한국전력거래소_지역별 시간별 태양광 발전량_20230228.csv'
pg_data = pd.read_csv(pg_file, encoding='cp949')

# pg_data.head()

# Keep only the Busan rows and the solar generation data (the last column is dropped)
# pg_data['지역'].unique()
filter_r = pg_data['지역'] == '부산시'
pg_data = pg_data.loc[filter_r].iloc[:, :-1]

# Drop the 2023 rows
filter_y = pg_data['거래일자'].str.startswith('2023-')
# print(filter_y.sum())
pg_data = pg_data[~filter_y]
pg_data = pg_data.replace('부산시', '부산')

# Reset the index
pg_data = pg_data.reset_index(drop=True)

print(len(pg_data))
pg_data.head()

52584


Unnamed: 0,거래일자,거래시간,지역,태양광 발전량(MWh)
0,2017-01-01,1,부산,
1,2017-01-01,2,부산,
2,2017-01-01,3,부산,
3,2017-01-01,4,부산,
4,2017-01-01,5,부산,


In [192]:
# Split the date into year, month, and day
pg_data['거래일자'] = pd.to_datetime(pg_data['거래일자'])

# pg_data.dtypes

pg_data['년'] = pg_data['거래일자'].dt.year
pg_data['월'] = pg_data['거래일자'].dt.month
pg_data['일'] = pg_data['거래일자'].dt.day

# Drop the original date column, then rename and reorder the remaining columns
pg_data = pg_data.iloc[:, 1:]
pg_data.columns = ['시간', '지역', '발전량', '년', '월', '일']
pg_data = pg_data[['지역', '년', '월', '일', '시간', '발전량']]

# pg_data['시간'] = pg_data['시간'].astype(np.int64)-1

pg_data.head()

Unnamed: 0,지역,년,월,일,시간,발전량
0,부산,2017,1,1,1,
1,부산,2017,1,1,2,
2,부산,2017,1,1,3,
3,부산,2017,1,1,4,
4,부산,2017,1,1,5,
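
The generation data numbers its hours starting from 1, while the weather data later in this notebook takes its hour from `dt.hour`, which ranges from 0 to 23. If `시간` in the generation data runs from 1 to 24 (an assumption; only hours 1 through 5 are visible in the output above), hour 24 has no counterpart on the same calendar day. One possible way to put both sources on a common time axis is to build a full timestamp, so that hour 24 rolls over to 00:00 of the next day. The `sample` frame and the `timestamp` column below are hypothetical and only illustrate the idea.

```python
import pandas as pd

# Hypothetical rows in the layout of pg_data; values are illustrative.
sample = pd.DataFrame({
    '년': [2017, 2017, 2017],
    '월': [1, 1, 12],
    '일': [1, 1, 31],
    '시간': [1, 24, 24],
})

# Assemble a midnight timestamp from the date parts
# (pd.to_datetime accepts a frame with year/month/day columns).
date = pd.to_datetime(
    sample[['년', '월', '일']].rename(
        columns={'년': 'year', '월': 'month', '일': 'day'}
    )
)

# Adding the hour as a timedelta lets hour 24 become 00:00 of the next day.
sample['timestamp'] = date + pd.to_timedelta(sample['시간'], unit='h')
```

Whether hour 24 should map to the next day's 00:00 depends on how the data provider defines the hour label, so this mapping should be checked against the source documentation before it is used for a join.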


In [193]:
# Load the hourly weather data for each year, 2017 through 2022
w_frames = [
    pd.read_csv(f'OBS_ASOS_TIM_{year}.csv', encoding='cp949')
    for year in range(2017, 2023)
]
w_data = pd.concat(w_frames, axis=0)

# Keep only the columns used below (selected by position)
w_data = w_data.iloc[:, [1, 2, 4, 5, 6, 7, 12, 13, 15, 16]]

# Replace missing values (NaN) with 0.0
w_data = w_data.fillna(0.0)

# Split the timestamp into year, month, day, and hour
w_data['일시'] = pd.to_datetime(w_data['일시'], format='%Y-%m-%d %H:%M', errors='raise')
w_data['년'] = w_data['일시'].dt.year
w_data['월'] = w_data['일시'].dt.month
w_data['일'] = w_data['일시'].dt.day
w_data['시간'] = w_data['일시'].dt.hour

# Rename the columns, then drop the timestamp and reorder
w_data.columns = ['지역', '일시', '강수량', '풍속', '풍향', '습도', '일조시간', '일사량', '전운량', '시정', '년', '월', '일', '시간']
w_data = w_data[['지역', '년', '월', '일', '시간', '강수량', '풍속', '풍향', '습도', '일조시간', '일사량', '전운량', '시정']]

# print(w_data.dtypes)
print(len(w_data))
w_data.head()

52578


Unnamed: 0,지역,년,월,일,시간,강수량,풍속,풍향,습도,일조시간,일사량,전운량,시정
0,부산,2017,1,1,1,0.0,3.6,360.0,67,0.0,0.0,0.0,1438.0
1,부산,2017,1,1,2,0.0,4.0,360.0,67,0.0,0.0,0.0,1572.0
2,부산,2017,1,1,3,0.0,1.5,360.0,69,0.0,0.0,0.0,1407.0
3,부산,2017,1,1,4,0.0,0.4,0.0,67,0.0,0.0,0.0,1392.0
4,부산,2017,1,1,5,0.0,3.3,320.0,68,0.0,0.0,0.0,1335.0


In [194]:
pg_df = pg_data.set_index(['년','월','일','시간','지역'])
w_df = w_data.set_index(['년','월','일','시간','지역'])

df = pg_df.join(w_df)

# df = df.replace('', 0.0)
# df = df.fillna(0.0)

df

년,월,일,시간,지역,발전량,강수량,풍속,풍향,습도,일조시간,일사량,전운량,시정
2017,1,1,1,부산,,0.0,3.6,360.0,67.0,0.0,0.0,0.0,1438.0
2017,1,1,2,부산,,0.0,4.0,360.0,67.0,0.0,0.0,0.0,1572.0
2017,1,1,3,부산,,0.0,1.5,360.0,69.0,0.0,0.0,0.0,1407.0
2017,1,1,4,부산,,0.0,0.4,0.0,67.0,0.0,0.0,0.0,1392.0
2017,1,1,5,부산,,0.0,3.3,320.0,68.0,0.0,0.0,0.0,1335.0
...,...,...,...,...,...,...,...,...,...,...,...,...,...
2022,12,31,20,부산,3.4,0.0,2.9,250.0,51.0,0.0,0.0,0.0,3672.0
2022,12,31,21,부산,2.64,0.0,1.6,200.0,50.0,0.0,0.0,0.0,3800.0
2022,12,31,22,부산,1.05,0.0,2.5,230.0,52.0,0.0,0.0,0.0,3406.0
2022,12,31,23,부산,0.75,0.0,1.8,270.0,53.0,0.0,0.0,0.0,3238.0
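
Because `join` defaults to a left join, generation rows with no matching weather key are kept and their weather columns are filled with NaN. A hedged sketch of how such rows could be counted uses `merge` with `indicator=True`. The frames `pg_small` and `w_small` are hypothetical miniatures of `pg_data` and `w_data`, with made-up values.

```python
import pandas as pd

keys = ['년', '월', '일', '시간', '지역']

# Hypothetical miniature of pg_data: three consecutive hours.
pg_small = pd.DataFrame({
    '년': [2017] * 3, '월': [1] * 3, '일': [1] * 3,
    '시간': [1, 2, 3], '지역': ['부산'] * 3,
    '발전량': [0.0, 0.0, 0.0],
})

# Hypothetical miniature of w_data: hour 2 is absent.
w_small = pd.DataFrame({
    '년': [2017] * 2, '월': [1] * 2, '일': [1] * 2,
    '시간': [1, 3], '지역': ['부산'] * 2,
    '습도': [67, 69],
})

# indicator=True adds a '_merge' column showing where each row came from.
merged = pg_small.merge(w_small, on=keys, how='left', indicator=True)

# Rows present in the generation data but missing from the weather data.
unmatched = merged[merged['_merge'] == 'left_only']
```

In this toy example `unmatched` contains one row, for hour 2. Running the same check on the full frames would show which keys account for the difference in row counts.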
