### Preparing the feature matrix and target variable for classifying fuel-level change events:
- drains and refuels
- normal consumption

In [1]:
# imports
import pandas as pd
import numpy as np

Must-run code with constants and helper functions

In [2]:
%run "constants-and-functions.ipynb"

We will work with the first vehicle

In [3]:
df = pd.read_csv(DATA_PROC_PATH + 'vehicle1_rolled.csv', index_col='i')
df.dropna(inplace=True)

# Also drop rows with a zero fuel level (only positive readings are kept)
df = df[df['fuellevel'] > 0]
df['dtime'] = pd.to_datetime(df['dtime'])

df.head()

i,dtime,fuellevel,ingection,speed,height,tachometer,refuel
0,2020-01-10 09:46:56,42.496154,1,28,13.0,2264,1
1,2020-01-10 09:47:56,42.380769,1,0,6.6,666,1
2,2020-01-10 09:52:09,42.188462,1,0,15.1,678,1
3,2020-01-10 09:53:09,42.053846,1,13,14.7,990,1
4,2020-01-10 09:57:40,41.996154,1,0,18.4,661,1


Compute the time and fuel-level deltas between consecutive records

In [4]:
df_X = pd.DataFrame()

DELTA_FEATURES = ['dtime', 'fuellevel']

for feature_name in DELTA_FEATURES:
    val_start = df[feature_name].iloc[:-1].to_numpy()
    val_end = df[feature_name].iloc[1:].to_numpy()

    delta_val = val_end - val_start

    if feature_name == 'dtime':
        df_X['dtime_start'] = val_start
        df_X['dtime_end'] = val_end
        delta_val = timeDeltaToSeconds(pd.DataFrame(delta_val, columns=['deltaDate'])['deltaDate'])

    df_X['delta_' + feature_name] = delta_val
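
The helper `timeDeltaToSeconds` comes from `constants-and-functions.ipynb` and is not shown here. As a rough sketch, assuming it simply converts a timedelta series to seconds, the same deltas could be obtained with pandas built-ins. The toy frame below is illustrative only; its values are copied from the first rows of the table above, and the names `toy` and `toy_X` are hypothetical.

```python
import pandas as pd

# Toy frame mimicking the first three records of the source data
toy = pd.DataFrame({
    'dtime': pd.to_datetime(['2020-01-10 09:46:56',
                             '2020-01-10 09:47:56',
                             '2020-01-10 09:52:09']),
    'fuellevel': [42.496154, 42.380769, 42.188462],
})

# diff() computes value[i] - value[i-1]; the first element is undefined,
# so dropping it leaves exactly one row per interval between records
delta_seconds = toy['dtime'].diff().dt.total_seconds().iloc[1:]
delta_fuellevel = toy['fuellevel'].diff().iloc[1:]

toy_X = pd.DataFrame({
    'delta_seconds': delta_seconds.reset_index(drop=True),
    'delta_fuellevel': delta_fuellevel.reset_index(drop=True),
})
print(toy_X)
```

For N records this yields N - 1 intervals, matching the `iloc[:-1]` / `iloc[1:]` slicing used in the cell above.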

Tachometer readings and fuel level at the start and end of each interval

In [5]:
START_END_FEATURES = ['tachometer', 'fuellevel']

for feature_name in START_END_FEATURES:
    val_start = df[feature_name].iloc[:-1].to_numpy()
    val_end = df[feature_name].iloc[1:].to_numpy()

    df_X[f'{feature_name}_start'] = val_start
    df_X[f'{feature_name}_end'] = val_end
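
Another way to pair each reading with the next one is `Series.shift(-1)`. The sketch below is an alternative formulation on illustrative values, not the notebook's own code; the names `toy` and `pairs` are hypothetical.

```python
import pandas as pd

# Illustrative tachometer readings, one per record
toy = pd.DataFrame({'tachometer': [2264, 666, 678, 990]})

pairs = (
    pd.DataFrame({
        'tachometer_start': toy['tachometer'],
        # shift(-1) moves the next record's value onto the current row
        'tachometer_end': toy['tachometer'].shift(-1),
    })
    .iloc[:-1]                         # the last record has no successor
    .astype({'tachometer_end': int})   # shift() introduced NaN, hence floats
)
print(pairs)
```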

To remove the sign from delta_fuellevel without losing information,
we store the magnitude of the change and its sign in separate columns

In [6]:
SIGN_ABS_FEATURES = ['fuellevel']

for feature_name in SIGN_ABS_FEATURES:
    df_X[f'delta_{feature_name}_abs'] = np.abs(df_X[f'delta_{feature_name}'])
    df_X[f'delta_{feature_name}_sign'] = np.sign(df_X[f'delta_{feature_name}'])
    df_X.drop(columns=[f'delta_{feature_name}'], inplace=True)

df_X.rename({'delta_dtime':'delta_seconds'}, axis='columns', inplace=True)

df_X.head()

Unnamed: 0,dtime_start,dtime_end,delta_seconds,tachometer_start,tachometer_end,fuellevel_start,fuellevel_end,delta_fuellevel_abs,delta_fuellevel_sign
0,2020-01-10 09:46:56,2020-01-10 09:47:56,60.0,2264,666,42.496154,42.380769,0.115385,-1.0
1,2020-01-10 09:47:56,2020-01-10 09:52:09,253.0,666,678,42.380769,42.188462,0.192308,-1.0
2,2020-01-10 09:52:09,2020-01-10 09:53:09,60.0,678,990,42.188462,42.053846,0.134615,-1.0
3,2020-01-10 09:53:09,2020-01-10 09:57:40,271.0,990,661,42.053846,41.996154,0.057692,-1.0
4,2020-01-10 09:57:40,2020-01-10 09:58:40,60.0,661,1232,41.996154,41.953846,0.042308,-1.0
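
The split into magnitude and sign is lossless: multiplying the two columns gives back the signed delta. A minimal check on illustrative values (the numbers below are made up for the example):

```python
import numpy as np
import pandas as pd

# Illustrative deltas: a small decrease, no change, a large increase
delta = pd.Series([-0.115385, 0.0, 35.2])

delta_abs = np.abs(delta)
delta_sign = np.sign(delta)   # -1.0, 0.0 or 1.0

# Magnitude times sign reconstructs the original signed value
restored = delta_abs * delta_sign
print(np.allclose(restored, delta))
```

Note that `np.sign` returns 0 for an unchanged fuel level, so the sign column can take three values rather than two.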


Rate of fuel-level change in L/s

In [7]:
df_X['lps_abs'] = df_X['delta_fuellevel_abs']/df_X['delta_seconds']

df_X.head()

Unnamed: 0,dtime_start,dtime_end,delta_seconds,tachometer_start,tachometer_end,fuellevel_start,fuellevel_end,delta_fuellevel_abs,delta_fuellevel_sign,lps_abs
0,2020-01-10 09:46:56,2020-01-10 09:47:56,60.0,2264,666,42.496154,42.380769,0.115385,-1.0,0.001923
1,2020-01-10 09:47:56,2020-01-10 09:52:09,253.0,666,678,42.380769,42.188462,0.192308,-1.0,0.000760
2,2020-01-10 09:52:09,2020-01-10 09:53:09,60.0,678,990,42.188462,42.053846,0.134615,-1.0,0.002244
3,2020-01-10 09:53:09,2020-01-10 09:57:40,271.0,990,661,42.053846,41.996154,0.057692,-1.0,0.000213
4,2020-01-10 09:57:40,2020-01-10 09:58:40,60.0,661,1232,41.996154,41.953846,0.042308,-1.0,0.000705
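
If two consecutive records ever share a timestamp, `delta_seconds` is 0 and the division produces `inf` (or `NaN` when the fuel delta is also 0). Whether that happens in this dataset is not shown here. One possible guard, sketched on illustrative values with the hypothetical names `toy_X` and `safe_seconds`:

```python
import numpy as np
import pandas as pd

# Illustrative intervals; the last two have zero duration
toy_X = pd.DataFrame({
    'delta_fuellevel_abs': [0.115385, 0.2, 0.0],
    'delta_seconds': [60.0, 0.0, 0.0],
})

# Turn zero durations into NaN so the rate is NaN instead of inf
safe_seconds = toy_X['delta_seconds'].replace(0, np.nan)
toy_X['lps_abs'] = toy_X['delta_fuellevel_abs'] / safe_seconds
print(toy_X)
```

Rows with a `NaN` rate can then be dropped or imputed before training, depending on what the classifier accepts.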


Target variable refuel

In [8]:
df_Y = pd.DataFrame()

val_start = df['refuel'].iloc[:-1].to_numpy()
val_end = df['refuel'].iloc[1:].to_numpy()

df_Y['refuel'] = (val_end + val_start) >= 1

df_Y

Unnamed: 0,refuel
0,True
1,True
2,True
3,True
4,True
...,...
17867,True
17868,True
17869,True
17870,True
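
Assuming `refuel` holds 0/1 flags (only the value 1 is visible in the rows shown above), the expression `(val_end + val_start) >= 1` labels an interval `True` when at least one of its two endpoints is flagged. The sketch below checks this against an explicit logical OR on illustrative flags:

```python
import numpy as np

# Illustrative 0/1 flags, one per record
refuel = np.array([0, 0, 1, 1, 0])

start, end = refuel[:-1], refuel[1:]

by_sum = (start + end) >= 1          # form used in the cell above
by_or = (start == 1) | (end == 1)    # equivalent for 0/1 flags

print(by_sum)
print(np.array_equal(by_sum, by_or))
```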


In [9]:
df_X

Unnamed: 0,dtime_start,dtime_end,delta_seconds,tachometer_start,tachometer_end,fuellevel_start,fuellevel_end,delta_fuellevel_abs,delta_fuellevel_sign,lps_abs
0,2020-01-10 09:46:56,2020-01-10 09:47:56,60.0,2264,666,42.496154,42.380769,0.115385,-1.0,0.001923
1,2020-01-10 09:47:56,2020-01-10 09:52:09,253.0,666,678,42.380769,42.188462,0.192308,-1.0,0.000760
2,2020-01-10 09:52:09,2020-01-10 09:53:09,60.0,678,990,42.188462,42.053846,0.134615,-1.0,0.002244
3,2020-01-10 09:53:09,2020-01-10 09:57:40,271.0,990,661,42.053846,41.996154,0.057692,-1.0,0.000213
4,2020-01-10 09:57:40,2020-01-10 09:58:40,60.0,661,1232,41.996154,41.953846,0.042308,-1.0,0.000705
...,...,...,...,...,...,...,...,...,...,...
17867,2020-06-26 10:59:53,2020-06-26 11:00:53,60.0,1736,2268,60.326923,60.250000,0.076923,-1.0,0.001282
17868,2020-06-26 11:00:53,2020-06-26 11:01:53,60.0,2268,1591,60.250000,60.161538,0.088462,-1.0,0.001474
17869,2020-06-26 11:01:53,2020-06-26 11:02:53,60.0,1591,2414,60.161538,60.111538,0.050000,-1.0,0.000833
17870,2020-06-26 11:02:53,2020-06-26 11:03:53,60.0,2414,2312,60.111538,60.034615,0.076923,-1.0,0.001282
