### Preparing the feature matrix and target variable for classifying fuel-level change events:
- drains and refuelings
- normal consumption

In [68]:
# imports
import pandas as pd
import numpy as np

Must-run code with constants and helper functions

In [69]:
%run "constants-and-functions.ipynb"

We will work with the first vehicle

In [70]:
df = pd.read_csv(DATA_PROC_PATH + 'vehicle1.csv', index_col='i')
df.dropna(inplace=True)

# Also drop rows where the fuel level is zero or negative
df = df[df['fuellevel'] > 0]
df['dtime'] = pd.to_datetime(df['dtime'])

df.head()

i,dtime,fuellevel,ingection,speed,height,tachometer,refuel
3,2020-01-09 10:05:26,49.7,1,0,-22.9,1248,0
4,2020-01-09 10:06:27,49.9,1,0,-22.9,1056,0
5,2020-01-09 10:07:27,50.3,1,0,-22.9,960,0
6,2020-01-09 10:08:27,50.4,1,0,-22.9,864,0
7,2020-01-09 10:09:27,50.1,1,0,-22.9,864,0


Compute the deltas of time and fuel level between consecutive records

In [71]:
df_X = pd.DataFrame()

DELTA_FEATURES = ['dtime', 'fuellevel']

for feature_name in DELTA_FEATURES:
    val_start = df[feature_name].iloc[:-1].to_numpy()
    val_end = df[feature_name].iloc[1:].to_numpy()

    delta_val = val_end - val_start

    if feature_name == 'dtime':
        df_X['dtime_start'] = val_start
        df_X['dtime_end'] = val_end
        delta_val = timeDeltaToSeconds(pd.DataFrame(delta_val, columns=['deltaDate'])['deltaDate'])

    df_X['delta_' + feature_name] = delta_val
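
The loop above relies on `timeDeltaToSeconds`, which is defined in the separate constants-and-functions notebook and is not shown here. As a minimal sketch, assuming that helper simply converts time differences to seconds, the same deltas can be obtained with pandas `diff()`. The `toy` frame and its values are illustrative only; they mirror the first rows of the data.

```python
import pandas as pd

# Toy frame with the same column names as the notebook (illustrative values).
toy = pd.DataFrame({
    'dtime': pd.to_datetime(['2020-01-09 10:05:26',
                             '2020-01-09 10:06:27',
                             '2020-01-09 10:07:27']),
    'fuellevel': [49.7, 49.9, 50.3],
})

# diff() returns row[i] - row[i-1]; the first row has no predecessor,
# so it is dropped to get one row per interval between consecutive records.
deltas = pd.DataFrame({
    'dtime_start': toy['dtime'].iloc[:-1].to_numpy(),
    'dtime_end': toy['dtime'].iloc[1:].to_numpy(),
    # Assumption: timeDeltaToSeconds is equivalent to .dt.total_seconds().
    'delta_dtime': toy['dtime'].diff().dt.total_seconds().iloc[1:].to_numpy(),
    'delta_fuellevel': toy['fuellevel'].diff().iloc[1:].to_numpy(),
})
print(deltas)
```

Three records yield two intervals, which is why `df_X` has one row fewer than `df`.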

Tachometer readings at the start and end of each interval

In [72]:
START_END_FEATURES = ['tachometer']

for feature_name in START_END_FEATURES:
    val_start = df[feature_name].iloc[:-1].to_numpy()
    val_end = df[feature_name].iloc[1:].to_numpy()

    df_X[f'{feature_name}_start'] = val_start
    df_X[f'{feature_name}_end'] = val_end

To remove the sign from delta_fuellevel without losing information,
we create separate columns for the absolute value of the change and for its sign

In [73]:
SIGN_ABS_FEATURES = ['fuellevel']

for feature_name in SIGN_ABS_FEATURES:
    df_X[f'delta_{feature_name}_abs'] = np.abs(df_X[f'delta_{feature_name}'])
    df_X[f'delta_{feature_name}_sign'] = np.sign(df_X[f'delta_{feature_name}'])
    df_X.drop(columns=[f'delta_{feature_name}'], inplace=True)

df_X.rename({'delta_dtime':'delta_seconds'}, axis='columns', inplace=True)

df_X.head()

,dtime_start,dtime_end,delta_seconds,tachometer_start,tachometer_end,delta_fuellevel_abs,delta_fuellevel_sign
0,2020-01-09 10:05:26,2020-01-09 10:06:27,61.0,1248,1056,0.2,1.0
1,2020-01-09 10:06:27,2020-01-09 10:07:27,60.0,1056,960,0.4,1.0
2,2020-01-09 10:07:27,2020-01-09 10:08:27,60.0,960,864,0.1,1.0
3,2020-01-09 10:08:27,2020-01-09 10:09:27,60.0,864,864,0.3,-1.0
4,2020-01-09 10:09:27,2020-01-09 10:10:27,60.0,864,864,0.4,-1.0
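
The split into an absolute value and a sign is lossless: multiplying the two columns restores the original delta. A small sketch on made-up values (the `delta` series is hypothetical) illustrates this, and also shows that `np.sign` returns 0 when the level does not change, so the sign column can take three values rather than two.

```python
import numpy as np
import pandas as pd

# Hypothetical fuel-level deltas, including an interval with no change.
delta = pd.Series([0.2, 0.4, -0.3, 0.0], name='delta_fuellevel')

parts = pd.DataFrame({
    'delta_fuellevel_abs': np.abs(delta),
    'delta_fuellevel_sign': np.sign(delta),  # -1.0, 0.0 or 1.0
})

# abs * sign reconstructs the signed delta, so no information is lost.
restored = parts['delta_fuellevel_abs'] * parts['delta_fuellevel_sign']
print(np.allclose(restored, delta))
```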


Rate of change of the fuel level, in l/s

In [74]:
df_X['lps_abs'] = df_X['delta_fuellevel_abs']/df_X['delta_seconds']

df_X.head()

,dtime_start,dtime_end,delta_seconds,tachometer_start,tachometer_end,delta_fuellevel_abs,delta_fuellevel_sign,lps_abs
0,2020-01-09 10:05:26,2020-01-09 10:06:27,61.0,1248,1056,0.2,1.0,0.003279
1,2020-01-09 10:06:27,2020-01-09 10:07:27,60.0,1056,960,0.4,1.0,0.006667
2,2020-01-09 10:07:27,2020-01-09 10:08:27,60.0,960,864,0.1,1.0,0.001667
3,2020-01-09 10:08:27,2020-01-09 10:09:27,60.0,864,864,0.3,-1.0,0.005
4,2020-01-09 10:09:27,2020-01-09 10:10:27,60.0,864,864,0.4,-1.0,0.006667
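
The division above produces `inf` for any interval with `delta_seconds == 0`, for example when two records share a timestamp. Whether such intervals occur in this dataset is not known from the notebook, so the guard below is only a sketch on a toy frame with illustrative values: zero durations are replaced with `NaN` before dividing.

```python
import numpy as np
import pandas as pd

# Toy intervals; the last one has zero duration on purpose.
toy = pd.DataFrame({
    'delta_fuellevel_abs': [0.2, 0.4, 0.1],
    'delta_seconds': [61.0, 60.0, 0.0],
})

# Replace zero durations with NaN so the rate is undefined instead of inf.
seconds = toy['delta_seconds'].replace(0, np.nan)
toy['lps_abs'] = toy['delta_fuellevel_abs'] / seconds
print(toy)
```

Rows with an undefined rate can then be dropped or imputed before training, depending on how the classifier handles missing values.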


Target variable refuel

In [75]:
df_Y = pd.DataFrame()

val_start = df['refuel'].iloc[:-1].to_numpy()
val_end = df['refuel'].iloc[1:].to_numpy()

df_Y['refuel'] = (val_end + val_start) >= 1

df_Y

,refuel
0,False
1,False
2,False
3,False
4,False
...,...
18364,False
18365,False
18366,False
18367,False
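
Assuming `refuel` holds only 0/1 flags, the expression `(val_end + val_start) >= 1` marks an interval as positive when at least one of its two endpoints is flagged, which is a logical OR. The sketch below checks this on a made-up flag sequence (the `refuel` array is hypothetical).

```python
import numpy as np

# Hypothetical per-record flags: a refuel event spanning two records.
refuel = np.array([0, 0, 1, 1, 0, 0])

val_start = refuel[:-1]
val_end = refuel[1:]

# Notebook formulation: the sum of two 0/1 flags is at least 1.
label_sum = (val_end + val_start) >= 1
# Equivalent explicit formulation: either endpoint is flagged.
label_or = (val_start == 1) | (val_end == 1)

print(label_sum)
print(np.array_equal(label_sum, label_or))
```

One consequence of this labeling is that the intervals immediately before and after a flagged record are also marked positive.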


In [76]:
df_X

,dtime_start,dtime_end,delta_seconds,tachometer_start,tachometer_end,delta_fuellevel_abs,delta_fuellevel_sign,lps_abs
0,2020-01-09 10:05:26,2020-01-09 10:06:27,61.0,1248,1056,0.2,1.0,0.003279
1,2020-01-09 10:06:27,2020-01-09 10:07:27,60.0,1056,960,0.4,1.0,0.006667
2,2020-01-09 10:07:27,2020-01-09 10:08:27,60.0,960,864,0.1,1.0,0.001667
3,2020-01-09 10:08:27,2020-01-09 10:09:27,60.0,864,864,0.3,-1.0,0.005000
4,2020-01-09 10:09:27,2020-01-09 10:10:27,60.0,864,864,0.4,-1.0,0.006667
...,...,...,...,...,...,...,...,...
18364,2020-06-27 00:47:05,2020-06-27 00:48:05,60.0,832,832,0.4,1.0,0.006667
18365,2020-06-27 00:48:05,2020-06-27 00:49:05,60.0,832,832,0.2,1.0,0.003333
18366,2020-06-27 00:49:05,2020-06-27 01:15:14,1569.0,832,1408,0.2,-1.0,0.000127
18367,2020-06-27 01:15:14,2020-06-27 01:16:14,60.0,1408,928,0.2,1.0,0.003333
