### Disclaimer
* Results may differ very slightly depending on the CatBoost package version and thread count. I tried to match the settings as closely as possible but could not reproduce exactly identical results.
* As noted in the Discussion, we were asked to state our assumptions. The assumptions in this work are:
    * All data taken from the [EPİAŞ Şeffaflık](https://seffaflik.epias.com.tr/transparency/) platform is used with a minimum lag of 24 hours, i.e. delayed by one day. This can be verified from the code and the score.
    * Weather data is used at the same timestamp. The reason is that I could not find a free historical dataset of daily weather forecasts, and short-term weather forecasts (1 day ahead in this work) are in practice quite accurate.
    * No forward leak (-1, -2, -3, ..., -n) is used.
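
A minimal sketch of what the minimum-lag assumption means in practice, assuming an hourly series. The frame `demo` and its values are hypothetical stand-ins for an EPİAŞ column such as `total`; only the use of `shift(24)` mirrors the approach described above.

```python
import numpy as np
import pandas as pd

# Hypothetical hourly series (72 hours) standing in for an EPİAŞ column.
idx = pd.date_range("2022-01-01", periods=72, freq=pd.Timedelta(hours=1))
demo = pd.DataFrame({"Tarih": idx, "total": np.arange(72, dtype=float)})

# shift(24) moves every value 24 rows later, so the feature at hour t
# holds the value that was observed at hour t - 24.
demo["lag_24_total"] = demo["total"].shift(24)

# The first 24 rows have no history yet and stay NaN.
first_valid = demo["lag_24_total"].first_valid_index()

# Row 30 carries the value from row 6, exactly one day earlier.
row = demo.loc[30]
```

Because the feature at hour t only depends on hour t - 24, nothing from the forecast day itself enters the model through this column.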
    

External Data Used in the Final Solution:
* Official/Religious/National Holidays
* NASA Weather (Temperature, Wind, Humidity)
* Meteostat Weather (similar to NASA)
* Solar irradiance
* EPİAŞ Şeffaflık Real Time Consumption - minimum lag of 24 (Turkey's total energy consumption)
* EPİAŞ Şeffaflık Real Time Generation - minimum lag of 24 (Turkey's total energy generation)

In [1]:
pip install pvlib  # library required for the solar features

Collecting pvlib
  Downloading pvlib-0.9.5-py3-none-any.whl (29.4 MB)
Installing collected packages: pvlib
Successfully installed pvlib-0.9.5
Note: you may need to restart the kernel to use updated packages.


In [2]:
pip install meteostat  # Python API library for weather data

Collecting meteostat
  Downloading meteostat-1.6.5-py3-none-any.whl (31 kB)
Installing collected packages: meteostat
Successfully installed meteostat-1.6.5
Note: you may need to restart the kernel to use updated packages.


In [3]:
import warnings
import requests
import pvlib
import plotly.graph_objects as go
import plotly.express as px
import pandas as pd
import numpy as np
import json
import gc
import datetime
from tqdm.notebook import tqdm
from sklearn.preprocessing import SplineTransformer
from sklearn.model_selection import TimeSeriesSplit
from sklearn.metrics import mean_absolute_percentage_error
from pvlib.location import Location
from pvlib import clearsky, solarposition,tracking
from meteostat import Hourly,Point
from datetime import datetime
from catboost import CatBoostRegressor

tqdm.pandas()
warnings.filterwarnings("ignore")

pd.set_option('display.max_columns', 300)
pd.set_option('display.max_rows', 300)

In [4]:
class CFG:
    """
    Purpose of this class:
    - Reduce bugs by turning the feature engineering stage into on/off switches
    - Widen and show the range of features that were tried, even if some of them are not in my final solution.
    """
    
    target = 'Dağıtılan Enerji (MWh)'
    train_path = '/kaggle/input/gdz-elektrik-datathon-2023/train.csv'
    submission_path = '/kaggle/input/gdz-elektrik-datathon-2023/sample_submission.csv'
    med_path = '/kaggle/input/gdz-elektrik-datathon-2023/med.csv'
    holiday_path = '/kaggle/input/gdz-external-data/Calendar.csv'
    read_from_path = True
    
    # Which datasets to use
    production_features = True
    consumption_features = False
    yuk_tahmin_plani_features = False
    kgup_features = False
    nasa_weather_features = True
    meteostat_weather_features = True
    holiday_features = True
    solar_features = True
    seasonality_features = True
    seasonality_spline_features = True
    
    
    # Which feature types to build
    production_lag_features = True
    consumption_lag_features = False
    diff_pct_features = False
    weather_lag_features = True
    solar_lag_features = True
    rolling_features = False
    rolling_shift_features = False
    
    # Feature parameters
    nasa_feature_columns =  ['T2M','T2MDEW','T2MWET','QV2M','RH2M','PRECTOTCORR','PS','WS10M','WD10M','WS50M']
    meteostat_feature_columns = ['dwpt','rhum','prcp','wdir','wspd','pres','coco']
    production_base_columns = ['fueloil','gasOil','blackCoal','lignite','geothermal','naturalGas','river','dammedHydro','lng','biomass','importCoal','asphaltiteCoal','wind','sun','importExport','wasteheat','total']
    consumption_base_columns = ['consumption']
    weather_lag_range = np.arange(0,51,5)
    rolling_range = np.arange(0,48,6)[1:]
    roll_types = ['mean','std','min','max']

### Read Base Data

In [5]:
df = pd.read_csv(CFG.train_path)
ss = pd.read_csv(CFG.submission_path)
med = pd.read_csv(CFG.med_path)
med['kesinti_flag'] = 1
med = med.rename(columns={'Tarih':'Tarih_daily'})
med['Tarih_daily'] = pd.to_datetime(med['Tarih_daily'])
df['data'] = 'train'
ss['data'] = 'sub'

df = pd.concat([df,ss])
df.loc[df.data=='sub',CFG.target] = np.nan

df['Tarih'] = pd.to_datetime(df['Tarih'])
df['Tarih_daily'] = df['Tarih'].dt.normalize()  # midnight of each day, used as the daily merge key
df = df.sort_values('Tarih').reset_index(drop=True)
df.head()

Unnamed: 0,Tarih,Dağıtılan Enerji (MWh),data,Tarih_daily
0,2018-01-01 00:00:00,1593.944216,train,2018-01-01
1,2018-01-01 01:00:00,1513.933887,train,2018-01-01
2,2018-01-01 02:00:00,1402.612637,train,2018-01-01
3,2018-01-01 03:00:00,1278.527266,train,2018-01-01
4,2018-01-01 04:00:00,1220.697701,train,2018-01-01


### Utils

In [6]:
def real_time_consumption(start_date="2017-12-01",
                         end_date="2022-10-01"):
    
    """
    Turkey's total electricity consumption: EPİAŞ Şeffaflık real-time consumption data,
    which I use with a minimum shift(24) to forecast 24 hours ahead.
    Fetching the data can take a long time because the date range is wide.
    
    """
    url=f"https://seffaflik.epias.com.tr/transparency/service/consumption/real-time-consumption?startDate={start_date}&endDate={end_date}"
    response=requests.get(url,verify=False)
    json_data=json.loads(response.text.encode('utf8'))
    consumption=pd.DataFrame(json_data['body']['hourlyConsumptions'])
    consumption['Tarih']=pd.to_datetime(consumption.date.str[:16])
    consumption = consumption[['consumption','Tarih']]
    
    return consumption


def real_time_generation(start_date="2017-12-01",
                         end_date="2022-10-01"):
    
    """
    Turkey's total electricity generation: EPİAŞ Şeffaflık real-time generation data, which I use with a minimum shift(24) to forecast 24 hours ahead.
    Fetching the data can take a long time because the date range is wide. Just in case, it is also available as extra data in the production.csv file of gdz-ext-dataset.
    
    """
    url=f"https://seffaflik.epias.com.tr/transparency/service/production/real-time-generation?startDate={start_date}&endDate={end_date}"
    response=requests.get(url,verify=False)
    json_data=json.loads(response.text.encode('utf8'))
    production = pd.DataFrame(json_data['body']['hourlyGenerations'])
    production['Tarih']=pd.to_datetime(production.date.str[:16])
    production.loc[production.total==0,'total'] = np.nan
    production.drop(['date','naphta','nucklear'],axis=1,inplace=True)
    
    return production

def nasa_external_data_preprocessor(df_temp):
    """
    The data comes from: https://power.larc.nasa.gov/data-access-viewer/
    
    Data was collected for the following locations:
    İzmir:
        latitude = 38.4235 
        longitude = 27.1564
    Manisa:
        latitude = 38.630554
        longitude = 27.422222
        
    Since there is no Python API, I downloaded the data and added it as a dataset;
    you can find it in "csv" format inside the gdz-ext-dataset data.
    I did not use the Manisa data because the values were similar. The İzmir weather data was sufficient.
    
    """
    
    df_temp['Tarih'] = df_temp['YEAR'].astype(str) + "-" \
                    + df_temp['MO'].astype(str) + "-"\
                    + df_temp['DY'].astype(str) + " "\
                    + df_temp['HR'].astype(str) + ":00"
    df_temp = df_temp.drop(['YEAR','MO','DY','HR'],axis=1)
    df_temp['Tarih'] = pd.to_datetime(df_temp['Tarih'])
    return df_temp


def read_meteostat_weather_data(latitude=38.4235,
                                longitude=27.1564,
                                start_date=datetime(2017, 4, 1),
                                end_date=datetime(2023, 12, 31, 23, 59)):
    
    """
    A second source for weather data. It has a Python API: https://github.com/meteostat/meteostat-python
    
    It draws on the National Oceanic and Atmospheric Administration (NOAA) and Germany's national meteorological service (DWD).
    
    I also ran different experiments on this data, but the İzmir location was the most useful for the model, so I did not
    include Manisa.
    """
    
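    location = Point(latitude, longitude)
    # Note: `location` is built but not passed to Hourly; the query uses the hard-coded station ID '72219'.
    weather = Hourly('72219', start_date, end_date)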
    weather = weather.fetch()
    weather = weather.reset_index().rename(columns = {'time':'Tarih'})
    weather = weather.drop(['snow','wpgt','tsun'],axis=1)
    weather['Tarih'] = pd.to_datetime(weather['Tarih'])
    return weather


def solar_features(city = 'İzmir'):
    
    """
    Source: https://github.com/pvlib/pvlib-python
    
    Features describing the angle at which sunlight reaches the given location.
    
    These features can be generated as lead and lag features for any period we want.
    They do not introduce any leak. They contributed strongly to the model by capturing seasonality.

    """

    if city == 'İzmir':
        latitude = 38.4235 
        longitude = 27.1564
    if city == 'Manisa':
        latitude = 38.630554
        longitude = 27.422222
        
    tz = 'Europe/Istanbul'
    start='2017-01-01'
    end='2023-01-01'
    tus = Location(latitude, longitude, tz, 700, city)
    
    times = pd.date_range(start, end, freq='H', tz=tus.tz)
    cs = tus.get_clearsky(times)  # ineichen with climatology table by default
    cs=cs.reset_index().rename(columns={'index':'Tarih'})
    cs['Tarih'] = pd.to_datetime(cs['Tarih'], format = "%d/%m/%Y %H/%M")
    cs['Tarih']=cs['Tarih'].dt.tz_localize(None)
    cs=cs.rename(columns={'ghi':f'ghi_{city}','dni':f'dni_{city}','dhi':f'dhi_{city}'})
    
    return cs

def solar_features_advanced(lat=38.4235,
                            lon=27.1564,
                            start='2017-01-01',
                            end='2023-01-01'):
    """
    Source: https://github.com/pvlib/pvlib-python
    
    Further detailed features describing the angle at which sunlight reaches the given location.
    
    They contributed strongly to the model by capturing seasonality.
    
    """
    tz = 'Europe/Istanbul'
    tus = Location(lat, lon, tz, 700, 'İzmir')
    
    #lat, lon = latitude, longitude
    # `closed=` was replaced by `inclusive=` in pandas 1.4 and removed in 2.0
    times = pd.date_range(start, end, inclusive='left', freq='H',tz=tz)
    solpos = solarposition.get_solarposition(times, lat, lon)
    truetracking_angles = tracking.singleaxis(
        apparent_zenith=solpos['apparent_zenith'],
        apparent_azimuth=solpos['azimuth'],
        axis_tilt=0,
        axis_azimuth=180,
        max_angle=90,
        backtrack=False,  # for true-tracking
        gcr=0.5)  # irrelevant for true-tracking
    truetracking_position = truetracking_angles['tracker_theta'].fillna(0)
    
    solpos=solpos.reset_index().rename(columns={'index':'Tarih'})
    solpos['Tarih'] = pd.to_datetime(solpos['Tarih'], format = "%d/%m/%Y %H/%M")
    solpos['Tarih']=solpos['Tarih'].dt.tz_localize(None)
    solpos=solpos.drop(['zenith','elevation'],axis=1)
    solpos.drop('apparent_elevation',axis=1,inplace=True)
    
    truetracking_position = truetracking_position.reset_index().rename(columns={'index':'Tarih','tracker_theta':'sun_position'})
    truetracking_position['Tarih'] = pd.to_datetime(truetracking_position['Tarih'], format = "%d/%m/%Y %H/%M")
    truetracking_position['Tarih'] = truetracking_position['Tarih'].dt.tz_localize(None)
    truetracking_position['Tarih'] = pd.to_datetime(truetracking_position['Tarih'])
    
    turbidity = pvlib.clearsky.lookup_linke_turbidity(times, lat, lon, interp_turbidity=True)
    turbidity=turbidity.reset_index().rename(columns={'index':'Tarih',0:'turbidity'})
    turbidity['Tarih'] = pd.to_datetime(turbidity['Tarih'], format = "%d/%m/%Y %H/%M")
    turbidity['Tarih']=turbidity['Tarih'].dt.tz_localize(None)
    
    return solpos, truetracking_position, turbidity

def lag_features(df_temp,
                 columns,
                 lags):
    
    """
    Builds lag features: given a dataframe, column names, and lag lengths, adds one shifted column per (column, lag) pair.
    """
    for col in columns:
        for lag in lags:
            df_temp[f'lag_{lag}_{col}'] = df_temp[col].shift(lag)
    return df_temp


def rolling_features(df_temp,
                     columns,
                     rolls,
                     roll_types):
    """
    Builds rolling features: given a dataframe, column names, window sizes, and rolling types, adds the requested rolling statistics.
    """
    
    for col in columns:
        for roll in rolls:
            if 'mean' in roll_types:
                df_temp[f'rolling_mean_{roll}_{col}'] = df_temp[col].rolling(roll,min_periods=1).mean().reset_index(drop=True)
            if 'max' in roll_types:
                df_temp[f'rolling_max_{roll}_{col}'] = df_temp[col].rolling(roll,min_periods=1).max().reset_index(drop=True)
            if 'min' in roll_types:
                df_temp[f'rolling_min_{roll}_{col}'] = df_temp[col].rolling(roll,min_periods=1).min().reset_index(drop=True)
            if 'std' in roll_types:
                df_temp[f'rolling_std_{roll}_{col}'] = df_temp[col].rolling(roll,min_periods=1).std().reset_index(drop=True)
    return df_temp


def rolling_shift_features(df_temp,
                     columns,
                     rolls,
                     roll_types):
    """
    Builds rolling features on the series lagged by 24: given a dataframe, column names, window sizes, and rolling types, adds the requested rolling statistics.
    """
    for col in columns:
        for roll in rolls:
            if 'mean' in roll_types:
                df_temp[f'rolling_shift_24_mean_{roll}_{col}'] = df_temp[col].shift(24).rolling(roll,min_periods=1).mean().reset_index(drop=True)
            if 'max' in roll_types:
                df_temp[f'rolling_shift_24_max_{roll}_{col}'] = df_temp[col].shift(24).rolling(roll,min_periods=1).max().reset_index(drop=True)
            if 'min' in roll_types:
                df_temp[f'rolling_shift_24_min_{roll}_{col}'] = df_temp[col].shift(24).rolling(roll,min_periods=1).min().reset_index(drop=True)
            if 'std' in roll_types:
                df_temp[f'rolling_shift_24_std_{roll}_{col}'] = df_temp[col].shift(24).rolling(roll,min_periods=1).std().reset_index(drop=True)
    return df_temp


def diff_pct_features(df_temp,
                  columns,
                  diff_pct
                 ):
    """
    Builds difference and percentage change features.
    
    """
    for col in columns:
        for value in diff_pct:
            df_temp[f'diff_{col}_{value}'] = df_temp[col].diff(value)
            df_temp[f'pct_change_{col}_{value}'] = df_temp[col].pct_change(value)
    return df_temp

def datetime_features(df_temp):
    """
    Builds datetime features.
    """
    df_temp['month'] = df_temp['Tarih'].dt.month
    df_temp['hour'] = df_temp['Tarih'].dt.hour
    df_temp['year'] = df_temp['Tarih'].dt.year
    df_temp['dayofweek'] = df_temp['Tarih'].dt.dayofweek
    df_temp['quarter'] = df_temp['Tarih'].dt.quarter
    df_temp['dayofmonth'] = df_temp['Tarih'].dt.day
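    # `.dt.weekofyear` was removed in pandas 2.0; isocalendar() gives the same ISO week number
    df_temp['weekofyear'] = df_temp['Tarih'].dt.isocalendar().week.astype(int)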
    return df_temp

def periodic_spline_transformer(period, n_splines=None, degree=3):
    """
    Source: https://scikit-learn.org/stable/auto_examples/applications/plot_cyclical_feature_engineering.html
    """
    
    if n_splines is None:
        n_splines = period
    n_knots = n_splines + 1  # periodic and include_bias is True
    return SplineTransformer(
        degree=degree,
        n_knots=n_knots,
        knots=np.linspace(0, period, n_knots).reshape(n_knots, 1),
        extrapolation="periodic",
        include_bias=True)

def seasonality_features(df_temp):
    df_temp['month_sin'] = np.sin(2*np.pi*df_temp.month/12)
    df_temp['month_cos'] = np.cos(2*np.pi*df_temp.month/12)
    df_temp['day_sin'] = np.sin(2*np.pi*df_temp.hour/24)
    df_temp['day_cos'] = np.cos(2*np.pi*df_temp.hour/24)
    return df_temp

def seasonality_spline_features(hours=np.arange(0,24)):
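    # Evaluate the splines at the actual hour values so each row lines up with the `hour` merge key
    hour_df = pd.DataFrame(np.asarray(hours, dtype=float).reshape(-1, 1),columns=["hour"])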
    splines = periodic_spline_transformer(24, n_splines=12).fit_transform(hour_df)
    splines_df = pd.DataFrame(splines,columns=[f"spline_{i}" for i in range(splines.shape[1])])
    splines_df =pd.concat([pd.Series(hours,name='hour'), splines_df], axis="columns")
    
    return splines_df

def yuk_tahmin_features(path):
    """
    Preprocessing function for the load forecast plan (yük tahmin planı) data, which is determined in advance.
    """
    yuk_tahmin_plani = pd.read_csv(path)
    yuk_tahmin_plani['Tarih'] = pd.to_datetime(yuk_tahmin_plani['Tarih'].astype(str) + " " + yuk_tahmin_plani['Saat'].astype(str))
    yuk_tahmin_plani = yuk_tahmin_plani[['Tarih','yuk_tahmin_plani']]
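    # regex=False so that '.' is treated as a literal dot rather than "any character"
    yuk_tahmin_plani['yuk_tahmin_plani'] = yuk_tahmin_plani['yuk_tahmin_plani'].str.replace(',','',regex=False).str.replace('.','',regex=False).astype(float)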
    
    return yuk_tahmin_plani

def kgup_features(path):
    """
    Not used in the final solution. Builds Finalized Daily Production Plan (Kesinleşmiş Günlük Üretim Planı, KGÜP) features.
    
    """
    kgup = pd.read_csv(path)
    kgup['Tarih'] = pd.to_datetime(kgup['Tarih'].astype(str) + " " + kgup['Saat'].astype(str))
    kgup.drop('Saat',axis=1,inplace=True)
    kgup = kgup.set_index('Tarih')
    kgup = kgup.add_prefix('kgup_')
    for col in kgup.columns:
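        # regex=False so that '.' is treated as a literal dot rather than "any character"
        kgup[col] = kgup[col].str.replace(',','',regex=False).str.replace('.','',regex=False).astype(float)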
    kgup = kgup.reset_index()
    return kgup

def holiday_features(path):
    """
    Builds features derived from holidays.
    
    """
    
    hol = pd.read_csv(path, parse_dates=['CALENDAR_DATE'])
    hol = hol[['CALENDAR_DATE','WEEKEND_FLAG','RAMADAN_FLAG','RELIGIOUS_DAY_FLAG_SK','NATIONAL_DAY_FLAG_SK','PUBLIC_HOLIDAY_FLAG']].rename(columns={'CALENDAR_DATE':'Tarih_daily'})
    hol = hol.sort_values('Tarih_daily').reset_index(drop=True)
    hol.loc[hol.RELIGIOUS_DAY_FLAG_SK != 100, 'RELIGIOUS_DAY_FLAG_SK'] = 1
    hol.loc[hol.RELIGIOUS_DAY_FLAG_SK == 100, 'RELIGIOUS_DAY_FLAG_SK'] = 0
    
    hol.loc[hol.NATIONAL_DAY_FLAG_SK != 200, 'NATIONAL_DAY_FLAG_SK'] = 1
    hol.loc[hol.NATIONAL_DAY_FLAG_SK == 200, 'NATIONAL_DAY_FLAG_SK'] = 0
    
    hol.loc[hol.RAMADAN_FLAG == 'N', 'RAMADAN_FLAG'] = 0
    hol.loc[hol.RAMADAN_FLAG == 'Y', 'RAMADAN_FLAG'] = 1
    
    hol.loc[hol.PUBLIC_HOLIDAY_FLAG == 'N', 'PUBLIC_HOLIDAY_FLAG'] = 0
    hol.loc[hol.PUBLIC_HOLIDAY_FLAG == 'Y', 'PUBLIC_HOLIDAY_FLAG'] = 1
    
    hol.loc[hol.WEEKEND_FLAG == 'N', 'WEEKEND_FLAG'] = 0
    hol.loc[hol.WEEKEND_FLAG == 'Y', 'WEEKEND_FLAG'] = 1
    
    hol['WEEKEND_FLAG'] = hol['WEEKEND_FLAG'].astype(int)
    hol['RAMADAN_FLAG'] = hol['RAMADAN_FLAG'].astype(int)
    hol['PUBLIC_HOLIDAY_FLAG'] = hol['PUBLIC_HOLIDAY_FLAG'].astype(int)
    # Features that signal upcoming official/religious/national holidays in advance
    is_next_days_cols = ['RAMADAN_FLAG','PUBLIC_HOLIDAY_FLAG','RELIGIOUS_DAY_FLAG_SK','NATIONAL_DAY_FLAG_SK']
    
    for i in [3,7,15]:
        for col in is_next_days_cols:
            hol[f"is_{col}_in_next_{i}_days"] = hol[col].rolling(i).sum().shift(-i)

    #hol = hol.set_index('Tarih_daily')
    #how_many_days_left_columns = ['WEEKEND_FLAG','RAMADAN_FLAG','RELIGIOUS_DAY_FLAG_SK','NATIONAL_DAY_FLAG_SK','PUBLIC_HOLIDAY_FLAG']
    #for col in how_many_days_left_columns:
    #    dates = hol[hol[col]==1].index.tolist()
    #    s = pd.Series(hol.index.isin(dates), index=hol.index)[::-1].cumsum()
    #    hol[f'days_left_next_{col}'] = hol.groupby(s).cumcount(ascending=False)# + 1
    #    
    #hol = hol.reset_index()
    
    return hol
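
The look-ahead holiday flags in `holiday_features` combine `rolling(i).sum()` with `shift(-i)`. A minimal sketch of how that pair behaves, using a hypothetical 10-day flag series with a single holiday on day index 5 (the series and its values are illustrative, not taken from the calendar file):

```python
import pandas as pd

# Hypothetical daily flag: one holiday on day index 5.
flags = pd.Series([0, 0, 0, 0, 0, 1, 0, 0, 0, 0], name="PUBLIC_HOLIDAY_FLAG")

# rolling(3).sum() at row t covers rows t-2..t.
# shift(-3) then re-aligns it so that row t holds the sum over rows
# t+1..t+3, i.e. the number of flagged days in the next three days.
next_3 = flags.rolling(3).sum().shift(-3)

# Days 2, 3 and 4 see the holiday coming; day 5 itself does not count,
# and the last three rows have no complete future window.
values = next_3.tolist()
```

The calendar is known in advance, so looking forward here does not use any information that would be unavailable at prediction time.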
    

### Read and Preprocess External Datasets

In [7]:
# holiday data for Turkey. Source: https://www.kaggle.com/datasets/frtgnn/turkish-calendar
if CFG.holiday_features:
    hol = holiday_features(CFG.holiday_path)

# merge NASA weather data
if CFG.nasa_weather_features:
    izmir_temp = nasa_external_data_preprocessor(pd.read_csv('/kaggle/input/gdz-external-data/ext_izmir_temp.csv')) # Temperature
    izmir_hum = nasa_external_data_preprocessor(pd.read_csv('/kaggle/input/gdz-external-data/ext_izmir_humi.csv')) # Humidity
    izmir_wind = nasa_external_data_preprocessor(pd.read_csv('/kaggle/input/gdz-external-data/ext_izmir_wind.csv')) # Wind
    izmir_nasa = izmir_temp.merge(izmir_hum.merge(izmir_wind, how='left', on='Tarih'),how='left',on='Tarih')
    del izmir_temp,izmir_hum,izmir_wind; gc.collect()

# weather data: meteostat
if CFG.meteostat_weather_features:
    izmir_meteostat = read_meteostat_weather_data()

#solar features
if CFG.solar_features:
    cs_izmir = solar_features("İzmir")
    #cs_man = solar_features("Manisa")
    solpos,truetracking_position,turbidity = solar_features_advanced()

#lagged real time consumption data
if CFG.consumption_lag_features:
    consumption = real_time_consumption()
    
#lagged real time generation data
if CFG.production_lag_features:
    if CFG.read_from_path:
        production = pd.read_csv('/kaggle/input/gdz-external-data/production.csv')
        production.drop(['naphta','nucklear'],axis=1,inplace=True)
        production['Tarih'] = pd.to_datetime(production['Tarih'])
        production.loc[production.total==0,'total'] = np.nan
    else:
        production = real_time_generation()
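
The NASA files store the timestamp as separate `YEAR`, `MO`, `DY`, `HR` columns, which `nasa_external_data_preprocessor` joins into a string before parsing. As an alternative sketch (not the method used in this notebook), pandas can also assemble timestamps directly from component columns. The two-row frame below is hypothetical and only borrows the column layout:

```python
import pandas as pd

# Hypothetical two-row extract in the NASA POWER column layout.
raw = pd.DataFrame({
    "YEAR": [2018, 2018],
    "MO": [1, 1],
    "DY": [1, 1],
    "HR": [0, 1],
    "T2M": [4.41, 4.27],
})

# pd.to_datetime accepts a frame whose columns are named
# year / month / day / hour and builds one timestamp per row.
parts = raw[["YEAR", "MO", "DY", "HR"]].rename(
    columns={"YEAR": "year", "MO": "month", "DY": "day", "HR": "hour"}
)
raw["Tarih"] = pd.to_datetime(parts)
raw = raw.drop(["YEAR", "MO", "DY", "HR"], axis=1)
```

This avoids string formatting altogether and yields the same `Tarih` key that the later merges rely on.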

### Feature Engineering

In [8]:
#datetime features
df = datetime_features(df)

In [9]:
if CFG.holiday_features:
    df = df.merge(hol,how='left',on='Tarih_daily')

In [10]:
if CFG.seasonality_features:
    df = seasonality_features(df)
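
A short sketch of why `seasonality_features` encodes the hour with a sine/cosine pair instead of the raw integer: on the unit circle, hour 23 sits as close to hour 0 as hour 1 does. The helper `dist` is a hypothetical name introduced only for this illustration.

```python
import numpy as np

hours = np.arange(24)
hour_sin = np.sin(2 * np.pi * hours / 24)
hour_cos = np.cos(2 * np.pi * hours / 24)

def dist(a, b):
    """Euclidean distance between two hours in (sin, cos) space."""
    return np.hypot(hour_sin[a] - hour_sin[b], hour_cos[a] - hour_cos[b])

d_23_0 = dist(23, 0)   # neighbours across midnight
d_0_1 = dist(0, 1)     # neighbours within the day
d_0_12 = dist(0, 12)   # opposite sides of the day
```

With the raw integer, hours 23 and 0 would be 23 units apart even though they are adjacent in time.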

In [11]:
if CFG.seasonality_spline_features:
    splines_df = seasonality_spline_features()
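    # Note: merge returns a new DataFrame; the result is not assigned here, so the spline columns are not added to df.
    df.merge(splines_df,on='hour',how='left')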
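
`DataFrame.merge` returns a new frame and leaves the caller unchanged. A minimal sketch with two hypothetical frames (`left`, `right`) showing the difference between discarding and keeping the result:

```python
import pandas as pd

left = pd.DataFrame({"hour": [0, 1, 2], "y": [10.0, 11.0, 12.0]})
right = pd.DataFrame({"hour": [0, 1, 2], "spline_0": [0.1, 0.2, 0.3]})

# Result discarded: `left` keeps its original columns.
left.merge(right, on="hour", how="left")
cols_without_assignment = list(left.columns)

# Result kept: the new column is available on `merged`.
merged = left.merge(right, on="hour", how="left")
cols_with_assignment = list(merged.columns)
```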

In [12]:
if CFG.nasa_weather_features:
    if CFG.weather_lag_features:
        izmir_nasa = lag_features(izmir_nasa,
                                  columns = CFG.nasa_feature_columns,
                                  lags = CFG.weather_lag_range
                                 )
    if CFG.diff_pct_features:
        izmir_nasa = diff_pct_features(izmir_nasa,
                                   columns = CFG.nasa_feature_columns,
                                   diff_pct = [1]
                                  )
    if CFG.rolling_features:
        izmir_nasa = rolling_features(izmir_nasa,
                                      columns = CFG.nasa_feature_columns,
                                      rolls = CFG.rolling_range,
                                      roll_types=CFG.roll_types
                                     )
    izmir_nasa =izmir_nasa.set_index('Tarih').add_prefix('izmir_').reset_index()
        
if CFG.meteostat_weather_features:
    #if CFG.weather_lag_features:
    #    izmir_meteostat = lag_features(izmir_meteostat,
    #                              columns = CFG.meteostat_feature_columns,
    #                              lags = CFG.weather_lag_range
    #                             )
    #if CFG.diff_pct_features:
    #    izmir_meteostat = diff_pct_features(izmir_meteostat,
    #                               columns = CFG.meteostat_feature_columns,
    #                               diff_pct = [1,6,12,24]
    #                              ) 
    #if CFG.rolling_features:
    #    izmir_meteostat = rolling_features(izmir_meteostat,
    #                                  columns = CFG.meteostat_feature_columns,
    #                                  rolls = CFG.rolling_range,
    #                                  roll_types=CFG.roll_types
    #                                 )
    izmir_meteostat =izmir_meteostat.set_index('Tarih').add_prefix('izmir_').reset_index()

In [13]:
if CFG.solar_features:
    
    if CFG.solar_lag_features:
    
        cs_izmir = lag_features(cs_izmir,
                                columns = ['ghi_İzmir','dni_İzmir','dhi_İzmir'],
                                lags = np.arange(-10,11)
                               )
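
The lag range `np.arange(-10,11)` contains negative values, which `shift` interprets as leads. A small sketch on a hypothetical irradiance series shows the direction of each sign:

```python
import pandas as pd

# Hypothetical clear-sky irradiance values for five consecutive hours.
ghi = pd.Series([0.0, 50.0, 200.0, 400.0, 300.0], name="ghi_demo")

lead_1 = ghi.shift(-1)  # negative shift: value one hour ahead
lag_1 = ghi.shift(1)    # positive shift: value one hour back
```

As the `solar_features` docstring notes, these values are computed from location and time rather than observed, so their future values are known ahead of the forecast and the lead columns do not leak the target.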

In [14]:
if CFG.production_features:
    
    if CFG.production_lag_features:
        production = lag_features(production,
                                  columns = CFG.production_base_columns,
                                  lags = [24,25])
    if CFG.rolling_shift_features:
        production = rolling_shift_features(production,
                                            columns = CFG.production_base_columns,
                                            rolls = CFG.rolling_range,
                                            roll_types=CFG.roll_types)
    production.drop(CFG.production_base_columns,axis=1,inplace=True)
    
if CFG.consumption_features:
    
    if CFG.consumption_lag_features:
        consumption = lag_features(consumption,
                                  columns = CFG.consumption_base_columns,
                                  lags = [24,25])
    consumption.drop(CFG.consumption_base_columns,axis=1,inplace=True)

### Merge Features to Main DF

In [15]:
if CFG.meteostat_weather_features:
    df = df.merge(izmir_meteostat,on='Tarih',how='left')
    
if CFG.nasa_weather_features:
    df = df.merge(izmir_nasa,on='Tarih',how='left')
    
if CFG.solar_features:
    df = df.merge(cs_izmir,on='Tarih',how='left')
    df = pd.merge(df,turbidity,on='Tarih',how='left')
    df = pd.merge(df,solpos,on='Tarih',how='left')
    df = pd.merge(df,truetracking_position,on='Tarih',how='left')

if CFG.production_features:
    production = production.set_index('Tarih').add_prefix('production_').reset_index()
    df = pd.merge(df,production,on='Tarih',how='left')
    
if CFG.consumption_features:
    df = pd.merge(df,consumption,on='Tarih',how='left')
    
if CFG.yuk_tahmin_plani_features:
    yuk_tahmin_plani = yuk_tahmin_features('/kaggle/input/gdz-external-data/yuk_tahmin_plani.csv')
    df = df.merge(yuk_tahmin_plani,on='Tarih',how='left')
    
if CFG.kgup_features:
    kgup = kgup_features('/kaggle/input/gdz-external-data/ext_kgp.csv')
    df = df.merge(kgup,on='Tarih',how='left')

In [16]:
df = df.drop([col for col in df.columns if "lag_0" in col],axis=1)
df = df.drop([col for col in df.columns if "rolling_0" in col],axis=1)
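
Both lag ranges used above include 0, and a lag of 0 is just a copy of the base column, which is why the `lag_0` columns are dropped. A minimal sketch on a hypothetical frame (the column `T2M` is borrowed from the NASA feature list, the values are made up):

```python
import numpy as np
import pandas as pd

demo = pd.DataFrame({"T2M": np.arange(12, dtype=float)})
for lag in np.arange(0, 11, 5):  # 0, 5, 10
    demo[f"lag_{lag}_T2M"] = demo["T2M"].shift(lag)

# The lag-0 column is an exact duplicate of the base column.
is_duplicate = demo["lag_0_T2M"].equals(demo["T2M"])

# Same substring filter as in the cell above.
demo = demo.drop([c for c in demo.columns if "lag_0" in c], axis=1)
remaining = list(demo.columns)
```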

In [17]:
df.shape

(40896, 252)

In [18]:
df

Unnamed: 0,Tarih,Dağıtılan Enerji (MWh),data,Tarih_daily,month,hour,year,dayofweek,quarter,dayofmonth,weekofyear,WEEKEND_FLAG,RAMADAN_FLAG,RELIGIOUS_DAY_FLAG_SK,NATIONAL_DAY_FLAG_SK,PUBLIC_HOLIDAY_FLAG,is_RAMADAN_FLAG_in_next_3_days,is_PUBLIC_HOLIDAY_FLAG_in_next_3_days,is_RELIGIOUS_DAY_FLAG_SK_in_next_3_days,is_NATIONAL_DAY_FLAG_SK_in_next_3_days,is_RAMADAN_FLAG_in_next_7_days,is_PUBLIC_HOLIDAY_FLAG_in_next_7_days,is_RELIGIOUS_DAY_FLAG_SK_in_next_7_days,is_NATIONAL_DAY_FLAG_SK_in_next_7_days,is_RAMADAN_FLAG_in_next_15_days,is_PUBLIC_HOLIDAY_FLAG_in_next_15_days,is_RELIGIOUS_DAY_FLAG_SK_in_next_15_days,is_NATIONAL_DAY_FLAG_SK_in_next_15_days,month_sin,month_cos,day_sin,day_cos,izmir_temp,izmir_dwpt,izmir_rhum,izmir_prcp,izmir_wdir,izmir_wspd,izmir_pres,izmir_coco,izmir_T2M,izmir_T2MDEW,izmir_T2MWET,izmir_QV2M,izmir_RH2M,izmir_PRECTOTCORR,izmir_PS,izmir_WS10M,izmir_WD10M,izmir_WS50M,izmir_lag_5_T2M,izmir_lag_10_T2M,izmir_lag_15_T2M,izmir_lag_20_T2M,izmir_lag_25_T2M,izmir_lag_30_T2M,izmir_lag_35_T2M,izmir_lag_40_T2M,izmir_lag_45_T2M,izmir_lag_50_T2M,izmir_lag_5_T2MDEW,izmir_lag_10_T2MDEW,izmir_lag_15_T2MDEW,izmir_lag_20_T2MDEW,izmir_lag_25_T2MDEW,izmir_lag_30_T2MDEW,izmir_lag_35_T2MDEW,izmir_lag_40_T2MDEW,izmir_lag_45_T2MDEW,izmir_lag_50_T2MDEW,izmir_lag_5_T2MWET,izmir_lag_10_T2MWET,izmir_lag_15_T2MWET,izmir_lag_20_T2MWET,izmir_lag_25_T2MWET,izmir_lag_30_T2MWET,izmir_lag_35_T2MWET,izmir_lag_40_T2MWET,izmir_lag_45_T2MWET,izmir_lag_50_T2MWET,izmir_lag_5_QV2M,izmir_lag_10_QV2M,izmir_lag_15_QV2M,izmir_lag_20_QV2M,izmir_lag_25_QV2M,izmir_lag_30_QV2M,izmir_lag_35_QV2M,izmir_lag_40_QV2M,izmir_lag_45_QV2M,izmir_lag_50_QV2M,izmir_lag_5_RH2M,izmir_lag_10_RH2M,izmir_lag_15_RH2M,izmir_lag_20_RH2M,izmir_lag_25_RH2M,izmir_lag_30_RH2M,izmir_lag_35_RH2M,izmir_lag_40_RH2M,izmir_lag_45_RH2M,izmir_lag_50_RH2M,izmir_lag_5_PRECTOTCORR,izmir_lag_10_PRECTOTCORR,izmir_lag_15_PRECTOTCORR,izmir_lag_20_PRECTOTCORR,izmir_lag_25_PRECTOTCORR,izmir_lag_30_PRECTOTCORR,izmir_lag_35_PRECTOTCORR,izmir_lag_40_PRECTOTCORR,izmir_lag_45_PRECTOTCORR,izmir_lag_50_PRECTOTCORR,izmir_lag_5_PS,izmir_lag_10_PS,izmir_lag_15_PS,izmir_lag_20_PS,izmir_lag_25_PS,izmir_lag_30_PS,izmir_lag_35_PS,izmir_lag_40_PS,izmir_lag_45_PS,izmir_lag_50_PS,izmir_lag_5_WS10M,izmir_lag_10_WS10M,izmir_lag_15_WS10M,izmir_lag_20_WS10M,izmir_lag_25_WS10M,izmir_lag_30_WS10M,izmir_lag_35_WS10M,izmir_lag_40_WS10M,izmir_lag_45_WS10M,izmir_lag_50_WS10M,izmir_lag_5_WD10M,izmir_lag_10_WD10M,izmir_lag_15_WD10M,izmir_lag_20_WD10M,izmir_lag_25_WD10M,izmir_lag_30_WD10M,izmir_lag_35_WD10M,izmir_lag_40_WD10M,izmir_lag_45_WD10M,izmir_lag_50_WD10M,izmir_lag_5_WS50M,izmir_lag_10_WS50M,izmir_lag_15_WS50M,izmir_lag_20_WS50M,izmir_lag_25_WS50M,izmir_lag_30_WS50M,izmir_lag_35_WS50M,izmir_lag_40_WS50M,izmir_lag_45_WS50M,izmir_lag_50_WS50M,ghi_İzmir,dni_İzmir,dhi_İzmir,lag_-10_ghi_İzmir,lag_-9_ghi_İzmir,lag_-8_ghi_İzmir,lag_-7_ghi_İzmir,lag_-6_ghi_İzmir,lag_-5_ghi_İzmir,lag_-4_ghi_İzmir,lag_-3_ghi_İzmir,lag_-2_ghi_İzmir,lag_-1_ghi_İzmir,lag_1_ghi_İzmir,lag_2_ghi_İzmir,lag_3_ghi_İzmir,lag_4_ghi_İzmir,lag_5_ghi_İzmir,lag_6_ghi_İzmir,lag_7_ghi_İzmir,lag_8_ghi_İzmir,lag_9_ghi_İzmir,lag_10_ghi_İzmir,lag_-10_dni_İzmir,lag_-9_dni_İzmir,lag_-8_dni_İzmir,lag_-7_dni_İzmir,lag_-6_dni_İzmir,lag_-5_dni_İzmir,lag_-4_dni_İzmir,lag_-3_dni_İzmir,lag_-2_dni_İzmir,lag_-1_dni_İzmir,lag_1_dni_İzmir,lag_2_dni_İzmir,lag_3_dni_İzmir,lag_4_dni_İzmir,lag_5_dni_İzmir,lag_6_dni_İzmir,lag_7_dni_İzmir,lag_8_dni_İzmir,lag_9_dni_İzmir,lag_10_dni_İzmir,lag_-10_dhi_İzmir,lag_-9_dhi_İzmir,lag_-8_dhi_İzmir,lag_-7_dhi_İzmir,lag_-6_dhi_İzmir,lag_-5_dhi_İzmir,lag_-4_dhi_İzmir,lag_-3_dhi_İzmir,lag_-2_dhi_İzmir,lag_-1_dhi_İzmir,lag_1_dhi_İzmir,lag_2_dhi_İzmir,lag_3_dhi_İzmir,lag_4_dhi_İzmir,lag_5_dhi_İzmir,lag_6_dhi_İzmir,lag_7_dhi_İzmir,lag_8_dhi_İzmir,lag_9_dhi_İzmir,lag_10_dhi_İzmir,turbidity,apparent_zenith,azimuth,equation_of_time,sun_position,production_lag_24_fueloil,production_lag_25_fueloil,production_lag_24_gasOil,production_lag_25_gasOil,production_lag_24_blackCoal,production_lag_25_blackCoal,production_lag_24_lignite,production_lag_25_lignite,production_lag_24_geothermal,production_lag_25_geothermal,production_lag_24_naturalGas,production_lag_25_naturalGas,production_lag_24_river,production_lag_25_river,production_lag_24_dammedHydro,production_lag_25_dammedHydro,production_lag_24_lng,production_lag_25_lng,production_lag_24_biomass,production_lag_25_biomass,production_lag_24_importCoal,production_lag_25_importCoal,production_lag_24_asphaltiteCoal,production_lag_25_asphaltiteCoal,production_lag_24_wind,production_lag_25_wind,production_lag_24_sun,production_lag_25_sun,production_lag_24_importExport,production_lag_25_importExport,production_lag_24_wasteheat,production_lag_25_wasteheat,production_lag_24_total,production_lag_25_total
0,2018-01-01 00:00:00,1593.944216,train,2018-01-01,1,0,2018,0,1,1,1,0,0,0,0,1,0.0,0.0,0.0,0.0,0.0,2.0,0.0,0.0,0.0,4.0,0.0,0.0,0.500000,0.866025,0.000000,1.000000,0.6,-5.6,63.0,,340.0,18.4,1024.6,,4.41,4.92,4.67,5.37,100.00,0.00,100.44,3.56,10.74,5.33,6.90,11.39,6.47,5.03,8.83,11.17,11.62,11.39,11.76,12.01,5.44,4.34,4.33,4.23,7.46,7.87,7.95,10.55,9.64,9.62,6.17,7.87,5.40,4.63,8.14,9.52,9.79,10.97,10.69,10.82,5.62,5.19,5.19,5.19,6.47,6.65,6.71,8.00,7.51,7.51,90.25,61.94,86.25,94.62,91.06,80.00,78.00,94.44,86.75,85.12,0.00,0.00,0.00,0.00,0.00,0.09,0.18,3.86,0.64,0.13,100.33,100.17,100.28,100.03,99.94,99.70,99.39,99.25,99.34,99.47,4.20,6.46,5.58,4.87,4.14,1.70,4.42,2.25,3.36,4.38,4.27,358.41,19.73,15.26,8.57,307.13,281.73,229.79,194.26,184.91,6.34,7.50,6.33,6.61,6.08,2.21,5.32,3.50,5.35,7.15,0.000000,0.00000,0.000000,154.910805,17.809124,0.000000,0.000000,0.000000,0.000000,0.0,0.0,0.0,0.0,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,77.134415,226.683946,349.702857,423.137287,433.682175,82.407483,0.000000,0.000000,0.000000,0.000000,0.0,0.0,0.0,0.0,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,273.755012,536.496242,657.911820,710.552994,53.376999,11.263168,0.000000,0.000000,0.000000,0.000000,0.0,0.0,0.0,0.0,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,34.046862,66.645255,84.645040,93.711583,3.903226,157.853310,308.637806,-3.267481,0.000000,179.2,176.2,0.0,0.0,198.50,193.50,5166.65,5025.27,781.24,777.64,7955.33,8732.27,1898.92,2065.64,5748.65,7065.49,0.0,0.0,246.28,244.84,6356.82,6303.52,280.41,275.99,1482.38,1477.00,0.00,0.00,0.00,0.00,85.15,84.47,30379.53,32421.83
1,2018-01-01 01:00:00,1513.933887,train,2018-01-01,1,1,2018,0,1,1,1,0,0,0,0,1,0.0,0.0,0.0,0.0,0.0,2.0,0.0,0.0,0.0,4.0,0.0,0.0,0.500000,0.866025,0.258819,0.965926,-1.1,-6.6,66.0,0.0,320.0,16.6,1025.4,,4.27,4.76,4.52,5.31,100.00,0.00,100.45,3.48,13.91,5.15,6.37,11.01,8.11,4.65,7.83,10.88,12.26,11.25,11.65,11.98,5.45,4.52,4.62,3.97,6.85,8.09,7.73,10.01,9.82,9.64,5.91,7.76,6.36,4.31,7.34,9.49,9.99,10.62,10.73,10.80,5.62,5.25,5.31,5.07,6.16,6.77,6.59,7.75,7.63,7.51,93.69,64.25,78.69,95.50,93.44,82.75,73.62,92.00,88.38,85.44,0.00,0.00,0.00,0.00,0.00,0.12,0.07,3.19,1.10,0.17,100.36,100.19,100.29,100.07,99.96,99.77,99.43,99.30,99.32,99.44,3.96,6.45,5.56,4.72,4.79,2.19,3.88,3.86,2.97,4.42,7.94,358.19,16.32,17.24,12.06,327.95,269.08,261.03,196.96,184.66,5.88,7.70,6.28,6.48,6.68,2.85,4.64,4.82,4.63,7.15,0.000000,0.00000,0.000000,295.888368,154.910805,17.809124,0.000000,0.000000,0.000000,0.0,0.0,0.0,0.0,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,77.134415,226.683946,349.702857,610.713698,433.682175,82.407483,0.000000,0.000000,0.000000,0.0,0.0,0.0,0.0,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,273.755012,536.496242,657.911820,77.395225,53.376999,11.263168,0.000000,0.000000,0.000000,0.0,0.0,0.0,0.0,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,34.046862,66.645255,84.645040,3.903226,164.287665,347.449364,-3.287147,0.000000,183.1,179.2,0.0,0.0,198.50,198.50,5159.90,5166.65,785.67,781.24,7465.35,7955.33,1845.04,1898.92,4512.22,5748.65,0.0,0.0,244.89,246.28,6077.71,6356.82,282.62,280.41,1438.05,1482.38,0.00,0.00,0.00,0.00,87.77,85.15,28280.82,30379.53
2,2018-01-01 02:00:00,1402.612637,train,2018-01-01,1,2,2018,0,1,1,1,0,0,0,0,1,0.0,0.0,0.0,0.0,0.0,2.0,0.0,0.0,0.0,4.0,0.0,0.0,0.500000,0.866025,0.500000,0.866025,-1.1,-9.5,53.0,0.0,330.0,22.3,1025.8,,4.08,4.54,4.31,5.25,100.00,0.00,100.44,3.19,16.93,4.65,5.76,9.58,9.69,4.21,6.96,10.46,12.26,10.58,11.56,12.08,5.46,4.99,4.61,3.76,6.02,8.31,7.67,8.87,10.08,9.69,5.61,7.29,7.15,3.98,6.49,9.38,9.97,9.72,10.82,10.89,5.62,5.43,5.31,5.00,5.86,6.84,6.59,7.14,7.75,7.57,97.88,73.06,70.62,97.06,93.69,86.44,73.38,89.00,90.50,85.19,0.00,0.00,0.00,0.00,0.00,0.00,0.07,1.53,1.83,0.12,100.38,100.20,100.26,100.12,99.98,99.82,99.47,99.34,99.30,99.42,3.61,5.13,5.70,4.46,5.11,2.67,2.87,5.29,2.44,4.38,11.12,358.95,10.34,19.13,14.89,342.10,270.94,281.59,200.06,185.83,5.25,7.36,6.47,6.31,6.93,3.66,3.61,6.34,3.71,7.08,0.000000,0.00000,0.000000,395.351953,295.888368,154.910805,17.809124,0.000000,0.000000,0.0,0.0,0.0,0.0,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,77.134415,226.683946,691.604517,610.713698,433.682175,82.407483,0.000000,0.000000,0.0,0.0,0.0,0.0,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,273.755012,536.496242,90.485783,77.395225,53.376999,11.263168,0.000000,0.000000,0.0,0.0,0.0,0.0,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,34.046862,66.645255,3.903226,161.813540,35.397026,-3.306804,0.000000,179.4,183.1,0.0,0.0,194.50,198.50,5173.99,5159.90,791.23,785.67,6907.14,7465.35,1886.98,1845.04,3909.25,4512.22,0.0,0.0,247.69,244.89,5701.79,6077.71,273.78,282.62,1464.77,1438.05,0.00,0.00,0.00,0.00,81.27,87.77,26811.79,28280.82
3,2018-01-01 03:00:00,1278.527266,train,2018-01-01,1,3,2018,0,1,1,1,0,0,0,0,1,0.0,0.0,0.0,0.0,0.0,2.0,0.0,0.0,0.0,4.0,0.0,0.0,0.500000,0.866025,0.707107,0.707107,-1.1,-10.0,51.0,0.0,330.0,25.9,1025.6,,3.89,4.29,4.08,5.13,100.00,0.00,100.40,2.89,17.95,4.11,5.12,7.94,10.77,3.92,6.13,10.02,11.17,10.48,11.45,12.23,5.32,5.15,4.43,3.66,5.19,8.14,8.08,8.27,10.32,9.71,5.22,6.55,7.60,3.79,5.66,9.08,9.62,9.37,10.89,10.98,5.55,5.49,5.25,4.94,5.49,6.77,6.77,6.84,7.87,7.57,100.00,82.44,64.94,98.44,93.69,87.94,81.19,86.00,92.62,84.50,0.00,0.00,0.00,0.00,0.00,0.00,0.09,0.39,2.77,0.13,100.40,100.24,100.21,100.19,100.01,99.88,99.54,99.37,99.27,99.40,3.39,4.52,6.01,4.46,5.02,3.09,1.71,5.95,2.06,4.26,12.92,0.89,4.02,18.91,14.97,352.89,277.35,289.96,198.37,189.83,5.00,6.98,6.86,6.33,6.80,4.52,2.50,7.07,3.18,6.85,0.000000,0.00000,0.000000,439.890789,395.351953,295.888368,154.910805,17.809124,0.000000,0.0,0.0,0.0,0.0,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,77.134415,720.778539,691.604517,610.713698,433.682175,82.407483,0.000000,0.0,0.0,0.0,0.0,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,273.755012,95.762650,90.485783,77.395225,53.376999,11.263168,0.000000,0.0,0.0,0.0,0.0,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,34.046862,3.903226,152.786830,63.183033,-3.326452,0.000000,176.7,179.4,0.0,0.0,196.50,194.50,5222.01,5173.99,791.42,791.23,6836.27,6907.14,1783.04,1886.98,3496.04,3909.25,0.0,0.0,248.08,247.69,5216.62,5701.79,273.78,273.78,1575.78,1464.77,0.00,0.00,0.00,0.00,84.59,81.27,25900.83,26811.79
4,2018-01-01 04:00:00,1220.697701,train,2018-01-01,1,4,2018,0,1,1,1,0,0,0,0,1,0.0,0.0,0.0,0.0,0.0,2.0,0.0,0.0,0.0,4.0,0.0,0.0,0.500000,0.866025,0.866025,0.500000,-1.7,-10.5,51.0,0.0,340.0,18.4,1026.1,,3.75,4.04,3.89,5.07,100.00,0.00,100.37,2.61,18.65,3.65,4.71,7.40,11.30,5.02,5.48,9.53,11.01,10.86,11.37,12.11,5.01,5.30,4.32,3.98,4.54,7.82,7.86,8.06,10.52,9.68,4.87,6.35,7.81,4.50,5.01,8.68,9.44,9.46,10.95,10.89,5.43,5.55,5.19,5.07,5.25,6.59,6.65,6.77,8.00,7.57,100.00,86.44,62.19,93.19,93.75,88.94,80.81,82.69,94.38,85.00,0.00,0.00,0.00,0.00,0.00,0.00,0.08,0.25,3.63,0.35,100.42,100.28,100.18,100.24,100.03,99.91,99.62,99.38,99.26,99.37,3.45,4.33,6.24,5.24,4.93,3.45,1.53,5.45,1.97,3.80,11.36,2.69,0.29,18.87,15.05,2.47,280.57,289.50,204.63,194.41,5.13,6.62,7.17,6.24,6.73,5.33,2.12,6.55,3.21,6.11,0.000000,0.00000,0.000000,424.882451,439.890789,395.351953,295.888368,154.910805,17.809124,0.0,0.0,0.0,0.0,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,711.345802,720.778539,691.604517,610.713698,433.682175,82.407483,0.0,0.0,0.0,0.0,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,94.016985,95.762650,90.485783,77.395225,53.376999,11.263168,0.0,0.0,0.0,0.0,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,3.903226,141.680072,78.531870,-3.346090,0.000000,182.9,176.7,0.0,0.0,193.50,196.50,5183.25,5222.01,783.44,791.42,6728.72,6836.27,1724.65,1783.04,3231.80,3496.04,0.0,0.0,245.11,248.08,5110.53,5216.62,273.78,273.78,1711.83,1575.78,0.00,0.00,0.00,0.00,85.19,84.59,25454.70,25900.83
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
40891,2022-08-31 19:00:00,,sub,2022-08-31,8,19,2022,2,3,31,35,0,0,0,0,0,0.0,1.0,0.0,0.0,0.0,2.0,0.0,0.0,0.0,4.0,0.0,0.0,-0.866025,-0.500000,-0.965926,0.258819,31.1,17.1,43.0,0.0,330.0,13.0,1015.5,2.0,27.32,19.95,23.64,14.71,64.12,0.01,99.44,4.01,299.76,5.70,31.94,29.68,24.28,25.44,29.38,34.12,28.50,23.82,26.58,31.90,19.35,20.50,21.76,21.62,19.08,18.46,19.08,19.86,19.16,17.25,25.65,25.08,23.02,23.53,24.23,26.30,23.79,21.84,22.87,24.58,14.16,15.20,16.42,16.30,13.92,13.37,13.92,14.59,13.98,12.39,47.38,57.94,85.75,79.25,53.94,39.62,56.69,78.44,63.81,41.62,0.02,0.11,0.11,0.11,0.03,0.01,0.01,0.00,0.00,0.00,99.44,99.59,99.55,99.59,99.53,99.53,99.64,99.53,99.59,99.44,3.63,1.20,0.72,1.18,3.40,3.43,1.48,1.88,2.12,4.92,295.52,309.71,288.24,251.44,278.99,274.84,37.29,347.52,328.89,294.57,4.30,1.35,0.96,1.83,5.01,3.80,1.73,2.48,3.33,6.41,33.612484,69.94012,24.319573,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.0,0.0,0.0,0.0,207.787904,406.355610,582.444983,716.205083,795.847626,814.946669,772.017355,670.475631,518.749184,331.143068,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.0,0.0,0.0,0.0,351.503438,530.117191,630.552186,686.610032,714.306492,720.434212,706.394644,668.942137,598.440685,473.332661,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.0,0.0,0.0,0.0,91.476998,134.169364,161.588378,179.078279,188.590380,190.791293,185.802908,173.338583,152.379707,120.047940,5.697541,82.355290,274.876037,-0.287087,82.327853,92.8,92.5,0.0,0.0,521.45,524.44,5121.96,4883.43,988.81,962.49,13325.95,12898.70,1370.44,1495.08,7717.83,6957.64,0.0,0.0,766.05,762.64,8726.72,8721.63,205.33,207.54,3411.53,3739.81,4.91,141.75,451.98,463.68,83.70,75.57,42789.46,41926.90
40892,2022-08-31 20:00:00,,sub,2022-08-31,8,20,2022,2,3,31,35,0,0,0,0,0,0.0,1.0,0.0,0.0,0.0,2.0,0.0,0.0,0.0,4.0,0.0,0.0,-0.866025,-0.500000,-0.866025,0.500000,30.6,16.2,42.0,0.0,300.0,18.4,1014.2,2.0,26.54,20.40,23.47,15.14,69.00,0.00,99.46,3.45,313.53,4.88,31.69,30.52,24.15,25.12,27.83,33.80,30.65,23.42,25.94,30.30,19.06,20.63,21.85,21.75,19.65,18.54,18.87,20.06,19.19,17.58,25.37,25.58,23.00,23.43,23.74,26.17,24.76,21.74,22.57,23.94,13.92,15.32,16.54,16.42,14.40,13.43,13.73,14.77,14.04,12.70,47.25,55.56,86.88,81.56,61.12,40.56,49.56,81.38,66.38,46.56,0.02,0.08,0.10,0.09,0.03,0.02,0.01,0.00,0.00,0.00,99.42,99.58,99.55,99.58,99.56,99.50,99.65,99.52,99.58,99.44,3.96,1.70,0.53,1.15,2.89,3.97,0.76,1.78,1.98,3.97,295.76,299.16,299.95,247.98,277.76,277.70,39.12,350.12,334.04,293.54,4.80,1.93,0.68,1.75,4.65,4.57,1.04,2.34,2.88,6.10,0.000000,0.00000,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.0,0.0,0.0,0.0,33.612484,207.787904,406.355610,582.444983,716.205083,795.847626,814.946669,772.017355,670.475631,518.749184,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.0,0.0,0.0,0.0,69.940120,351.503438,530.117191,630.552186,686.610032,714.306492,720.434212,706.394644,668.942137,598.440685,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.0,0.0,0.0,0.0,24.319573,91.476998,134.169364,161.588378,179.078279,188.590380,190.791293,185.802908,173.338583,152.379707,5.697541,94.060774,284.162028,-0.274025,0.000000,92.7,92.8,0.0,0.0,526.68,521.45,5113.94,5121.96,1005.32,988.81,13237.70,13325.95,1372.40,1370.44,8450.86,7717.83,0.0,0.0,769.63,766.05,8728.81,8726.72,207.54,205.33,3196.37,3411.53,0.00,4.91,432.72,451.98,84.11,83.70,43218.78,42789.46
40893,2022-08-31 21:00:00,,sub,2022-08-31,8,21,2022,2,3,31,35,0,0,0,0,0,0.0,1.0,0.0,0.0,0.0,2.0,0.0,0.0,0.0,4.0,0.0,0.0,-0.866025,-0.500000,-0.707107,0.707107,30.6,16.2,42.0,0.0,290.0,18.4,1014.2,2.0,25.93,20.54,23.23,15.26,72.12,0.00,99.47,2.77,333.72,4.06,31.15,31.08,24.77,24.83,26.91,33.18,32.71,23.11,25.35,28.94,18.85,20.44,22.00,21.82,20.32,18.55,18.11,20.15,19.25,18.14,25.00,25.76,23.39,23.33,23.62,25.87,25.40,21.63,22.30,23.54,13.73,15.14,16.66,16.48,15.01,13.49,13.06,14.89,14.04,13.12,48.06,53.25,84.50,83.25,67.19,42.00,42.00,83.44,68.94,52.19,0.02,0.05,0.13,0.09,0.15,0.02,0.01,0.00,0.00,0.00,99.41,99.56,99.57,99.57,99.60,99.50,99.62,99.54,99.58,99.50,4.18,2.28,0.41,1.16,2.48,4.27,0.26,1.65,1.91,3.14,297.62,295.60,326.00,251.08,278.87,278.95,280.30,1.90,337.94,295.99,5.19,2.62,0.47,1.68,3.92,5.05,0.01,2.20,2.61,5.46,0.000000,0.00000,0.000000,2.630199,0.000000,0.000000,0.000000,0.000000,0.000000,0.0,0.0,0.0,0.0,0.000000,33.612484,207.787904,406.355610,582.444983,716.205083,795.847626,814.946669,772.017355,670.475631,3.704581,0.000000,0.000000,0.000000,0.000000,0.000000,0.0,0.0,0.0,0.0,0.000000,69.940120,351.503438,530.117191,630.552186,686.610032,714.306492,720.434212,706.394644,668.942137,2.428442,0.000000,0.000000,0.000000,0.000000,0.000000,0.0,0.0,0.0,0.0,0.000000,24.319573,91.476998,134.169364,161.588378,179.078279,188.590380,190.791293,185.802908,173.338583,5.697541,105.165667,294.162332,-0.260955,0.000000,91.7,92.7,0.0,0.0,525.94,526.68,5304.31,5113.94,1026.30,1005.32,12926.15,13237.70,1324.70,1372.40,8046.02,8450.86,0.0,0.0,770.05,769.63,8728.75,8728.81,207.54,207.54,2795.35,3196.37,0.00,0.00,458.98,432.72,88.28,84.11,42294.07,43218.78
40894,2022-08-31 22:00:00,,sub,2022-08-31,8,22,2022,2,3,31,35,0,0,0,0,0,0.0,1.0,0.0,0.0,0.0,2.0,0.0,0.0,0.0,4.0,0.0,0.0,-0.866025,-0.500000,-0.500000,0.866025,30.6,16.2,42.0,0.0,300.0,13.0,1014.0,2.0,25.36,20.62,22.99,15.32,75.00,0.00,99.48,2.57,347.19,3.67,30.21,31.58,26.85,24.60,26.28,32.29,33.80,23.98,24.78,28.04,18.86,20.18,21.23,21.77,20.94,18.58,18.08,20.14,19.42,18.65,24.53,25.88,24.05,23.19,23.61,25.44,25.94,22.05,22.10,23.34,13.79,14.95,15.93,16.42,15.62,13.49,13.06,14.83,14.22,13.55,50.75,50.94,71.38,84.12,72.44,44.25,39.38,79.06,72.06,56.75,0.01,0.03,0.20,0.09,0.11,0.02,0.01,0.00,0.00,0.00,99.41,99.52,99.57,99.56,99.61,99.50,99.58,99.59,99.56,99.56,4.31,2.77,0.72,1.07,1.91,4.36,1.46,1.80,1.90,2.67,298.10,295.41,353.18,260.33,277.50,280.00,264.78,22.77,341.27,307.98,5.59,3.21,0.88,1.52,3.00,5.32,1.39,2.04,2.52,4.66,0.000000,0.00000,0.000000,131.044286,2.630199,0.000000,0.000000,0.000000,0.000000,0.0,0.0,0.0,0.0,0.000000,0.000000,33.612484,207.787904,406.355610,582.444983,716.205083,795.847626,814.946669,772.017355,249.209762,3.704581,0.000000,0.000000,0.000000,0.000000,0.0,0.0,0.0,0.0,0.000000,0.000000,69.940120,351.503438,530.117191,630.552186,686.610032,714.306492,720.434212,706.394644,67.909537,2.428442,0.000000,0.000000,0.000000,0.000000,0.0,0.0,0.0,0.0,0.000000,0.000000,24.319573,91.476998,134.169364,161.588378,179.078279,188.590380,190.791293,185.802908,5.697541,115.363464,305.677837,-0.247874,0.000000,91.4,91.7,0.0,0.0,518.57,525.94,5531.18,5304.31,1043.56,1026.30,12707.04,12926.15,1340.81,1324.70,7582.82,8046.02,0.0,0.0,773.00,770.05,8714.82,8728.75,205.34,207.54,2301.91,2795.35,0.00,0.00,499.88,458.98,87.59,88.28,41397.92,42294.07


### Modelling

#### Modelling Strategy
* Validation: expanding-window time-series split validation over the last 3 months. I chose these 3 months because together they cover a wide period that includes religious holidays and other special days, so performance can be compared across several months rather than for August alone.
* 2-step model:
    * Stage 1: train with the features taken from the EPİAŞ transparency platform, each at a minimum lag of 24 hours (lag(24))
    * Stage 2: train without any EPİAŞ data
* The final solution is an ensemble of these two approaches:
    * Each approach scores quite well on the LB on its own, but because the EPİAŞ features can only be used with a lag, they can hurt the model at some points (overfitting). I therefore combined the two approaches so that the effect of the other features is also reflected.
* A single training function can be trained on different feature subsets. Even if the EPİAŞ features are disrupted, a conditional parameter lets the model without EPİAŞ features run inference reliably.
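
To make the expanding-window validation concrete, here is a minimal pure-Python sketch of the index arithmetic behind the `TimeSeriesSplit(n_splits=3, test_size=744)` call used in the training cells. The helper name `expanding_window_splits` is hypothetical, and the row count of 40,152 is an assumption inferred from the fold shapes printed in the training logs (37,920 + 3 × 744).

```python
def expanding_window_splits(n_samples, n_splits, test_size):
    """Yield (train_indices, test_indices) where the training window grows each fold."""
    first_test_start = n_samples - n_splits * test_size
    if first_test_start <= 0:
        raise ValueError("Not enough samples for the requested splits.")
    for test_start in range(first_test_start, n_samples, test_size):
        # Train on everything before the validation block, never after it.
        train_idx = range(0, test_start)
        test_idx = range(test_start, test_start + test_size)
        yield train_idx, test_idx


# Assumed size: 40,152 hourly training rows (inferred from the printed fold shapes).
N_TRAIN_ROWS = 40_152
HOURS_PER_FOLD = 31 * 24  # 744 hours, i.e. one 31-day block

fold_shapes = [
    (len(train_idx), len(test_idx))
    for train_idx, test_idx in expanding_window_splits(N_TRAIN_ROWS, 3, HOURS_PER_FOLD)
]
print(fold_shapes)
```

Each fold validates on the next 744-hour block and then absorbs it into the training window, so no fold is trained on data that comes after its validation period.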

In [19]:
cat_features_hol = ['WEEKEND_FLAG',
                    'RAMADAN_FLAG',
                    'RELIGIOUS_DAY_FLAG_SK',
                    'NATIONAL_DAY_FLAG_SK',
                    'PUBLIC_HOLIDAY_FLAG']

cat_features_date = ['month',
                     'hour',
                     'dayofweek']

cat_features = cat_features_date + cat_features_hol

for col in cat_features:
    df[col] = df[col].astype('category')
    
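# Stage 1 trains on every feature, so nothing is excluded.
stage_one_exclude = []
# Stage 2 drops every EPİAŞ-derived column (lagged production and consumption).
stage_two_exclude = [col for col in df.columns if 'production' in col] + \
                    [col for col in df.columns if 'consumption' in col]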

In [20]:
stage_two_exclude

['production_lag_24_fueloil',
 'production_lag_25_fueloil',
 'production_lag_24_gasOil',
 'production_lag_25_gasOil',
 'production_lag_24_blackCoal',
 'production_lag_25_blackCoal',
 'production_lag_24_lignite',
 'production_lag_25_lignite',
 'production_lag_24_geothermal',
 'production_lag_25_geothermal',
 'production_lag_24_naturalGas',
 'production_lag_25_naturalGas',
 'production_lag_24_river',
 'production_lag_25_river',
 'production_lag_24_dammedHydro',
 'production_lag_25_dammedHydro',
 'production_lag_24_lng',
 'production_lag_25_lng',
 'production_lag_24_biomass',
 'production_lag_25_biomass',
 'production_lag_24_importCoal',
 'production_lag_25_importCoal',
 'production_lag_24_asphaltiteCoal',
 'production_lag_25_asphaltiteCoal',
 'production_lag_24_wind',
 'production_lag_25_wind',
 'production_lag_24_sun',
 'production_lag_25_sun',
 'production_lag_24_importExport',
 'production_lag_25_importExport',
 'production_lag_24_wasteheat',
 'production_lag_25_wasteheat',
 'producti

In [21]:
train_df = df[df.data=='train']
sub_df = df[df.data=='sub']
exclude_cols = ['Tarih','data','year','Tarih_daily',CFG.target]

sub_df = sub_df.drop(exclude_cols,axis=1)
y = train_df[CFG.target]
X = train_df.drop(exclude_cols,axis=1)

stage_one_features = X.drop(stage_one_exclude,axis=1).columns.tolist()
stage_two_features = X.drop(stage_two_exclude,axis=1).columns.tolist()
len(stage_one_features),len(stage_two_features)

(247, 213)

In [22]:
params = {'learning_rate': 0.03,
          'objective':'MAE',
          'depth': 6,
          'early_stopping_rounds':1000,
          'iterations': 10000,
          'use_best_model': True,
          'eval_metric': "MAPE",
          'random_state': 986,
          'allow_writing_files': False,
          'thread_count':24
          }
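
The parameters above fit the model with an `MAE` objective while monitoring `MAPE` on the validation set, which is the metric CatBoost then uses for early stopping and best-model selection. The sketch below, with hypothetical numbers and hypothetical helper names `mae` and `mape`, illustrates how the two metrics differ: the same absolute error counts for more in MAPE when the true load is low.

```python
def mae(y_true, y_pred):
    """Mean absolute error, in the units of the target."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def mape(y_true, y_pred):
    """Mean absolute percentage error, as a fraction (0.05 means 5%)."""
    return sum(abs(t - p) / abs(t) for t, p in zip(y_true, y_pred)) / len(y_true)


# Hypothetical hourly loads: both predictions miss by exactly 100 units.
y_true = [1000.0, 2000.0]
y_pred = [1100.0, 1900.0]

mae_value = mae(y_true, y_pred)
mape_value = mape(y_true, y_pred)
print(f"MAE:  {mae_value:.1f}")
print(f"MAPE: {mape_value:.3f}")
```

Here the low-load hour contributes a 10% error and the high-load hour only 5%, even though both absolute errors are identical.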

In [23]:
def catboost_trainer(X,
                     y,
                     submission_df,
                     cv,
                     model_params,
                     feature_list,
                     cat_features,
                     scorer,
                     target_transform=False):
    """
    Train one CatBoost model per CV fold and predict the submission set with each.
    
    ---------
    :param X: training data
    :param y: target
    :param submission_df: test dataframe to be predicted
    :param cv: scikit-learn cross-validation splitter
    :param model_params: dict of CatBoost model parameters
    :param feature_list: list of features used for training and inference
    :param cat_features: categorical features
    :param scorer: scikit-learn evaluation metric
    :param target_transform: if True, fit on log1p(y) and map predictions back with expm1
    :return: CV score list, fitted models and per-fold submission predictions
    """
    
    score_list = []
    fold = 1
    unseen_preds = []
    importance = []
    val_results = []
    train_results = []
    models = []
    
    for train_index, test_index in cv.split(X):
        X_train,X_val = X.iloc[train_index][feature_list],X.iloc[test_index][feature_list]
        y_train,y_val = y.iloc[train_index],y.iloc[test_index]
        sub_df_subset = submission_df[feature_list]
        print(f"Training data shape: {X_train.shape}, Validation data shape: {X_val.shape}")
        
        if target_transform:
            y_train = np.log1p(y_train)
            y_val = np.log1p(y_val)
        
        
        model = CatBoostRegressor(**model_params,
                                cat_features=cat_features
                               )
        model.fit(X_train,y_train,
                eval_set=[(X_val,y_val)],
                verbose=500)
        models.append(model)
        forecast_pred = model.predict(sub_df_subset)
        if target_transform:
            forecast_pred = np.expm1(forecast_pred)
        unseen_preds.append(forecast_pred)
        
        val_result = model.predict(X_val)
        if target_transform:
            val_result = np.expm1(val_result)
        
        train_result = model.predict(X_train)
        if target_transform:
            train_result = np.expm1(train_result)
            
        train_results.append(train_result)
        if target_transform:    
            y_train = np.expm1(y_train)
            y_val = np.expm1(y_val)
            
        score = scorer(y_val,val_result)
        score_t = scorer(y_train,train_result)
        
        print(f"Score FOLD-{fold}:{score}")
        print(f"Score Train FOLD-{fold}:{score_t}")
        print(f'Predicted Mean:{np.mean(forecast_pred)}')
        score_list.append(score)
        importance.append(model.get_feature_importance())
        fold += 1
        print('*'*50)
    print("Mean MAPE:", np.mean(score_list),"Std MAPE:",np.std(score_list))
    return score_list, models, unseen_preds
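
The optional `target_transform` path in the trainer fits on `log1p` of the target and maps predictions back with `expm1` before scoring, so the reported scores are always on the original scale. Both training calls in this notebook leave it disabled. A minimal stdlib sketch of that round trip, using the first three target values from the dataframe preview above (the helper names are hypothetical):

```python
import math


def to_log_scale(values):
    """Compress the target with log1p, as the trainer does before fitting."""
    return [math.log1p(v) for v in values]


def to_original_scale(values):
    """Invert log1p with expm1, as the trainer does before scoring."""
    return [math.expm1(v) for v in values]


# First three target values from the training rows shown earlier.
hourly_load = [1593.944216, 1513.933887, 1402.612637]

log_load = to_log_scale(hourly_load)
restored = to_original_scale(log_load)

# The round trip recovers the original values up to floating-point error.
round_trip_ok = all(
    math.isclose(original, back, rel_tol=1e-9)
    for original, back in zip(hourly_load, restored)
)
print(round_trip_ok)
```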

In [24]:
score_list_stage_one_cat, models_stage_one_cat, unseen_preds_stage_one_cat = catboost_trainer(X=X,
                                                                                      y=y,
                                                                                      submission_df=sub_df,
                                                                                      cv=TimeSeriesSplit(n_splits=3,test_size=744),
                                                                                      model_params=params,
                                                                                      feature_list=stage_one_features,
                                                                                      cat_features=cat_features,
                                                                                      scorer=mean_absolute_percentage_error,
                                                                                      target_transform=False)

Training data shape: (37920, 247), Validation data shape: (744, 247)
0:	learn: 0.1934423	test: 0.2486065	best: 0.2486065 (0)	total: 159ms	remaining: 26m 31s
500:	learn: 0.0270913	test: 0.0339840	best: 0.0337617 (498)	total: 38.9s	remaining: 12m 18s
1000:	learn: 0.0206703	test: 0.0308368	best: 0.0308366 (993)	total: 1m 18s	remaining: 11m 45s
1500:	learn: 0.0175988	test: 0.0296200	best: 0.0296133 (1494)	total: 1m 57s	remaining: 11m 5s
2000:	learn: 0.0156227	test: 0.0289554	best: 0.0289344 (1975)	total: 2m 36s	remaining: 10m 26s
2500:	learn: 0.0142384	test: 0.0286120	best: 0.0286120 (2500)	total: 3m 15s	remaining: 9m 47s
3000:	learn: 0.0132009	test: 0.0282781	best: 0.0282739 (2988)	total: 3m 55s	remaining: 9m 8s
3500:	learn: 0.0124069	test: 0.0281293	best: 0.0280923 (3414)	total: 4m 33s	remaining: 8m 28s
4000:	learn: 0.0117456	test: 0.0279436	best: 0.0279434 (3999)	total: 5m 12s	remaining: 7m 48s
4500:	learn: 0.0111854	test: 0.0279096	best: 0.0278736 (4280)	total: 5m 51s	remaining: 7m 9s
5000:	learn: 0.0107107	test: 0.0278139	best: 0.0278048 (4983)	total: 

In [25]:
score_list_stage_two_cat, models_stage_two_cat, unseen_preds_stage_two_cat = catboost_trainer(X=X,
                                                                                      y=y,
                                                                                      submission_df=sub_df,
                                                                                      cv=TimeSeriesSplit(n_splits=3,test_size=744),
                                                                                      model_params=params,
                                                                                      feature_list=stage_two_features,
                                                                                      cat_features=cat_features,
                                                                                      scorer=mean_absolute_percentage_error,
                                                                                      target_transform=False)

Training data shape: (37920, 213), Validation data shape: (744, 213)
0:	learn: 0.1937587	test: 0.2489154	best: 0.2489154 (0)	total: 83.9ms	remaining: 13m 58s
500:	learn: 0.0348146	test: 0.0409740	best: 0.0409169 (494)	total: 34.8s	remaining: 10m 59s
1000:	learn: 0.0271601	test: 0.0377325	best: 0.0376395 (989)	total: 1m 9s	remaining: 10m 25s
1500:	learn: 0.0231274	test: 0.0362302	best: 0.0362213 (1497)	total: 1m 44s	remaining: 9m 51s
2000:	learn: 0.0205395	test: 0.0350712	best: 0.0350712 (2000)	total: 2m 20s	remaining: 9m 19s
2500:	learn: 0.0187499	test: 0.0345268	best: 0.0345206 (2496)	total: 2m 54s	remaining: 8m 44s
3000:	learn: 0.0173374	test: 0.0343746	best: 0.0343720 (2815)	total: 3m 29s	remaining: 8m 9s
3500:	learn: 0.0162437	test: 0.0343235	best: 0.0342412 (3363)	total: 4m 4s	remaining: 7m 34s
4000:	learn: 0.0153453	test: 0.0342270	best: 0.0341581 (3953)	total: 4m 39s	remaining: 6m 58s
4500:	learn: 0.0145735	test: 0.0339916	best: 0.0339871 (4494)	total: 5m 14s	remaining: 6m 23s
5

### Feature Importance

#### Stage 1 Model Feature Importance

In [26]:
importance = [model.get_feature_importance() for model in models_stage_one_cat]

f_importance = pd.concat([pd.Series(X[stage_one_features].columns.to_list(),name='Feature'),
                          pd.Series(np.mean(importance,axis=0),name="Importance")],
                         axis=1).sort_values(by='Importance',
                                             ascending=True)

fig = px.bar(f_importance.tail(20),x='Importance',y='Feature')
fig.update_layout(
    title_text="Top 20 Most Important Features - CatBoost Stage 1, Average of Folds"
)
fig.show()

#### Stage 2 Model Feature Importance

In [27]:
importance = [model.get_feature_importance() for model in models_stage_two_cat]

f_importance = pd.concat([pd.Series(X[stage_two_features].columns.to_list(),name='Feature'),
                          pd.Series(np.mean(importance,axis=0),name="Importance")],
                         axis=1).sort_values(by='Importance',
                                             ascending=True)

fig = px.bar(f_importance.tail(20),x='Importance',y='Feature')
fig.update_layout(
    title_text="Top 20 Most Important Features - CatBoost Stage 2, Average of Folds"
)
fig.show()
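
Both cells above average the per-fold importances and plot the 20 largest. The sketch below shows the same averaging and ranking in plain Python. The importance values are hypothetical, the helper name `top_k_mean_importance` is an assumption, and only the column names come from this notebook.

```python
def top_k_mean_importance(feature_names, fold_importances, k):
    """Average importances across folds and return the k largest as (name, mean) pairs."""
    n_folds = len(fold_importances)
    mean_importance = [
        sum(fold[i] for fold in fold_importances) / n_folds
        for i in range(len(feature_names))
    ]
    ranked = sorted(
        zip(feature_names, mean_importance),
        key=lambda pair: pair[1],
        reverse=True,
    )
    return ranked[:k]


feature_names = ["hour", "production_lag_24_total", "dayofweek"]
# Hypothetical importances, one row per fold model.
fold_importances = [
    [30.0, 50.0, 20.0],
    [20.0, 60.0, 20.0],
    [40.0, 40.0, 20.0],
]

top_two = top_k_mean_importance(feature_names, fold_importances, k=2)
print(top_two)
```

The notebook sorts in ascending order and takes `tail(20)` instead, which selects the same features but orders them so that the horizontal bar chart shows the most important one at the top.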

### Make Submission

In [28]:
# Equal-weight ensemble: average the per-fold predictions of both stages (3 + 3 arrays).
ss[CFG.target] = np.mean(np.concatenate([unseen_preds_stage_one_cat,unseen_preds_stage_two_cat]),axis=0)
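
The submission is the element-wise mean of all fold predictions from both stages. Because each stage contributes the same number of folds, this is equivalent to an equal-weight blend of the two stage averages. The sketch below reproduces that averaging in plain Python and adds a fallback to the stage 2 models when the EPİAŞ features are unavailable. The `epias_available` flag, the helper names and all numbers are hypothetical and not part of the original code.

```python
def ensemble_mean(*prediction_groups):
    """Average element-wise over every fold prediction in every group."""
    all_preds = [pred for group in prediction_groups for pred in group]
    n_preds = len(all_preds)
    return [sum(values) / n_preds for values in zip(*all_preds)]


def blend(stage_one_preds, stage_two_preds, epias_available=True):
    """Blend both stages, or fall back to stage 2 alone (hypothetical flag)."""
    if epias_available:
        return ensemble_mean(stage_one_preds, stage_two_preds)
    return ensemble_mean(stage_two_preds)


# Hypothetical predictions: 3 folds per stage, 2 forecast hours each.
stage_one_preds = [[100.0, 200.0], [110.0, 210.0], [120.0, 220.0]]
stage_two_preds = [[130.0, 230.0], [140.0, 240.0], [150.0, 250.0]]

blended = blend(stage_one_preds, stage_two_preds)
fallback = blend(stage_one_preds, stage_two_preds, epias_available=False)
print(blended)
print(fallback)
```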

In [29]:
fig1 = go.Figure()
fig1.add_trace(go.Scatter(x=ss.Tarih,y=ss[CFG.target],mode='lines'))

In [30]:
ss[['Tarih',CFG.target]].to_csv('submission.csv',index=False)