## Megafon Project
**Step 1: Data loading and initial preprocessing**

## Task:
Build an algorithm that, for each user-service pair, estimates the probability that the user will subscribe to the service.
## Data:
The source data describes subscribers' responses to offers to subscribe to one of the services. Each user may receive several offers at different times and may accept or decline each of them.
A separate dataset contains a normalized, anonymized set of features describing each subscriber's consumption profile. These features are tied to a specific point in time, because a subscriber's profile can change over time.
The train and test data are split by period: train covers 4 months, and test holds out the following month.

**Input data:**

    * data_train.csv: id, vas_id, buy_time, target
    
    * features.csv.zip: id, <feature_list> 

And the test set:

    * data_test.csv: id, vas_id, buy_time
    
**target** - the target variable: 1 means the subscriber connected the service, 0 means they did not.

**buy_time** - the purchase time as a Unix timestamp; use the datetime.fromtimestamp function from the datetime module to work with this column.

**id** - subscriber identifier

**vas_id** - the service being offered
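As an illustration of the `buy_time` note above, here is a minimal sketch that converts one timestamp taken from the data shown in this notebook. Passing an explicit `tz` keeps the result independent of the machine's local time zone. The UTC+3 offset is an assumption made for illustration; the task description does not state which time zone the timestamps use.

```python
from datetime import datetime, timedelta, timezone

buy_time = 1531688400  # a buy_time value that appears in the data

# Interpret the timestamp in UTC so the result does not depend on local settings.
as_utc = datetime.fromtimestamp(buy_time, tz=timezone.utc)

# Assumption: view the same instant at UTC+3 (not confirmed by the task text).
as_utc_plus_3 = as_utc.astimezone(timezone(timedelta(hours=3)))

print(as_utc.isoformat())         # 2018-07-15T21:00:00+00:00
print(as_utc_plus_3.isoformat())  # 2018-07-16T00:00:00+03:00
```

For a whole column, `pd.to_datetime(df["buy_time"], unit="s")` does the same conversion in a vectorized way.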

## Metric
Scoring uses the F1 score with unweighted averaging over classes, as computed for example by sklearn.metrics.f1_score(…, average='macro').
See: sklearn.metrics.f1_score (scikit-learn 0.22.1 documentation)
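To make the metric concrete, the following sketch computes macro-averaged F1 by hand: F1 is calculated separately for each class and the per-class values are averaged with equal weight. The helper names (`f1_for_class`, `macro_f1`, `to_labels`) and the 0.5 threshold are illustrative assumptions, not part of the task. Since F1 is defined on class labels, predicted probabilities have to be thresholded first.

```python
def f1_for_class(y_true, y_pred, label):
    """F1 for one class, treating `label` as the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == label and p == label for t, p in pairs)
    fp = sum(t != label and p == label for t, p in pairs)
    fn = sum(t == label and p != label for t, p in pairs)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def macro_f1(y_true, y_pred):
    """Unweighted mean of the per-class F1 scores."""
    labels = sorted(set(y_true) | set(y_pred))
    return sum(f1_for_class(y_true, y_pred, c) for c in labels) / len(labels)


def to_labels(probabilities, threshold=0.5):
    """Turn probabilities into 0/1 labels (the threshold is an assumption)."""
    return [int(p >= threshold) for p in probabilities]


y_true = [0, 0, 0, 1]
y_pred = to_labels([0.1, 0.2, 0.7, 0.9])  # -> [0, 0, 1, 1]
score = macro_f1(y_true, y_pred)          # class 0: 0.8, class 1: 2/3
print(round(score, 4))                    # 0.7333
```

Because both classes count equally, a rare positive class influences the score as much as the common negative class, so the choice of threshold matters.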


## Submission format
    1. A working model in pickle format that reads the file data_test.csv from the root folder and writes the file answers_test.csv to the same folder. This file must contain 4 columns: buy_time, id, vas_id and target. Target may be written as the probability of subscribing to the service.
    2. The model code may be submitted as a Jupyter notebook.
    3. A presentation in .pdf format that covers:
        ◦ Information about the model, its parameters, distinctive features and main results.
        ◦ The rationale for choosing the model and a comparison with alternatives.
        ◦ The principle used to build individual offers for the selected subscribers.
Recommended number of slides: 5 to 10.
The file answers_test.csv with the model's results, the presentation, the notebooks and the resume must be attached to the second lesson, "course project".
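The first requirement can be sketched roughly as follows. This is only an outline under stated assumptions: `load_model`, `write_answers` and `ConstantModel` are hypothetical names, the model is assumed to expose a scikit-learn style `predict_proba`, and a real pipeline would also have to join the subscriber features before scoring. The demo runs in a temporary folder with a stand-in model.

```python
import pickle
import tempfile
from pathlib import Path

import numpy as np
import pandas as pd

ANSWER_COLUMNS = ["buy_time", "id", "vas_id", "target"]


def load_model(path):
    """Load a pickled model; the file name is chosen by the author."""
    with open(path, "rb") as fh:
        return pickle.load(fh)


def write_answers(model, folder="."):
    """Read data_test.csv from `folder`, score it, write answers_test.csv there."""
    folder = Path(folder)
    test = pd.read_csv(folder / "data_test.csv")
    inputs = test[["id", "vas_id", "buy_time"]]
    # Column 1 of predict_proba is taken as the probability of target == 1.
    test["target"] = np.asarray(model.predict_proba(inputs))[:, 1]
    answers = test[ANSWER_COLUMNS]
    answers.to_csv(folder / "answers_test.csv", index=False)
    return answers


class ConstantModel:
    """Hypothetical stand-in that predicts the same probability for every row."""

    def __init__(self, p=0.1):
        self.p = p

    def predict_proba(self, X):
        return np.tile([1.0 - self.p, self.p], (len(X), 1))


with tempfile.TemporaryDirectory() as tmp:
    pd.DataFrame(
        {"id": [540968, 1454121], "vas_id": [8.0, 4.0],
         "buy_time": [1537131600, 1531688400]}
    ).to_csv(Path(tmp) / "data_test.csv", index=False)
    write_answers(ConstantModel(p=0.1), tmp)
    written = pd.read_csv(Path(tmp) / "answers_test.csv")
```

Fixing the column order in one constant keeps the output file aligned with the required layout regardless of how the test file orders its columns.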

In [2]:
import warnings
warnings.filterwarnings("ignore")

In [16]:
import pandas as pd
import numpy as np


pd.set_option('display.max_columns', None)

from matplotlib import pyplot as plt
%matplotlib inline

plt.style.use('ggplot')

import sklearn
print('The scikit-learn version is {}.'.format(sklearn.__version__))

The scikit-learn version is 0.24.2.


In [4]:
# import dask.dataframe as dd  (dask was not used: the merge is simpler and faster to do in pandas)

### 1.1 Данные

In [5]:
# drop "Unnamed: 0" right away, since it only duplicates the row index
# the features are anonymized, so we do not know what they represent
features = pd.read_csv('features.csv', sep='\t').drop(columns=['Unnamed: 0'])
features.head()

Unnamed: 0,id,buy_time,0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30,31,32,33,34,35,36,37,38,39,40,41,42,43,44,45,46,47,48,49,50,51,52,53,54,55,56,57,58,59,60,61,62,63,64,65,66,67,68,69,70,71,72,73,74,75,76,77,78,79,80,81,82,83,84,85,86,87,88,89,90,91,92,93,94,95,96,97,98,99,100,101,102,103,104,105,106,107,108,109,110,111,112,113,114,115,116,117,118,119,120,121,122,123,124,125,126,127,128,129,130,131,132,133,134,135,136,137,138,139,140,141,142,143,144,145,146,147,148,149,150,151,152,153,154,155,156,157,158,159,160,161,162,163,164,165,166,167,168,169,170,171,172,173,174,175,176,177,178,179,180,181,182,183,184,185,186,187,188,189,190,191,192,193,194,195,196,197,198,199,200,201,202,203,204,205,206,207,208,209,210,211,212,213,214,215,216,217,218,219,220,221,222,223,224,225,226,227,228,229,230,231,232,233,234,235,236,237,238,239,240,241,242,243,244,245,246,247,248,249,250,251,252
0,2013026,1531688400,18.910029,46.980888,4.969214,-1.386798,3.791754,-14.01179,-16.08618,-65.076097,-6.78366,-30.006538,-2.736081,-4.007526,-2.558912,49.520873,38.19189,-0.000725,-0.016435,-0.107041,-1.17746,-3.178521,-13.940815,-10.744164,-0.094251,-0.001733,-0.009327,-2.082209,0.200138,-0.00909,-0.351862,-0.214366,-0.211608,-0.001884,-2.3e-05,-3e-05,-2.65939,-0.065583,-0.700765,-42.026959,-2841.496068,-1085.821501,-1755.674564,-89.504287,-119.724355,-70.712019,-54.069191,-16.642826,-7.896282,-5.634035,-10.717958,-28.571103,57.869716,52.911014,26.828289,26.668705,-4.958702,38.254749,-1.671324,-0.001656,1.318354,2.117335,-0.265234,0.331838,0.078356,-0.237576,0.254338,-0.028454,-0.044465,3.698872,26.41199,-0.036834,3.869969,-2.783592,-2.60662,-5.390212,-4.022547,0.0,-2.824022,-10.706438,-1.2015,-0.998268,-0.203232,0.0,-0.248755,-0.222852,-0.134088,0.0,-0.030537,-0.125866,-0.096986,-0.679774,-0.626985,-0.691912,-0.506613,-0.185299,-0.598716,-0.000115,-0.250188,-0.348913,-0.828382,-42.275915,-3.950157,-0.253037,-0.318148,-2.29064,-3.447583,-0.040043,-9.408469,4.027863,-11.955314,-1.019293,-0.473446,-2.62084,-1087.017387,-1757.811263,-0.36799,0.396143,-2844.828651,-2298.725139,-0.343415,-0.08972,-0.278878,-0.433135,-0.024048,-89.211948,-119.674411,-208.886358,-0.058077,0.129549,0.274871,-3.618164,-11.681641,-0.573283,0.531557,0.582717,-190.670372,1.856777,3.277409,2.174027,4.064012,0.0,-1.276187,-0.020137,-0.042636,-29.797016,-70.470802,-14.96363,-34.888325,-3.861461,-0.317164,-0.007024,-0.143269,60.582329,-0.212646,-0.019562,-4.4e-05,-0.000379,-2.548856,-0.261309,-0.536315,-0.061481,-0.152157,-0.002595,-4.678214,-0.014542,0.280492,-21.183166,-44.376426,-25.320085,-51.984826,-23.961228,-54.128903,-11.614497,-30.288386,-0.028857,-0.063214,-0.019198,-0.033778,-0.003149,-0.005184,-0.001431,-0.00189,-1.257363,-2.793637,-1.932758,-5.008096,-9.978121,-16.684052,4.645192,13.112964,-0.034569,-0.163184,-109.036398,0.533317,-1.929048,0.376263,-0.228106,-0.251959,-0.000567,0.56626
4,-0.000708,-0.02921,0.895335,-0.001358,0.0,0.039208,0.665644,-0.008999,-11953.712824,-45175.257711,0.377099,-30.716053,-61790.157098,-0.243136,-42051.166127,-9239.707081,-2.10805,-8.3e-05,0.377104,-4e-05,0.379228,-0.012257,-0.107878,959537300.0,-42.014078,-440560400.0,1356414000.0,5.565998,-1.465191,-33.302382,-249.128986,-36.772492,-0.364694,-0.133771,-0.209468,-32.356505,-109.884564,-876.69102,-5.368281,-247.110707,-108.409742,-512.437331,-84.617978,-17.295406,-977.373846,-613.770792,-25.996269,-37.630448,-301.747724,-25.832889,-0.694428,-12.175933,-0.45614,0.0
1,2014722,1539550800,36.690029,152.400888,448.069214,563.833202,463.841754,568.99821,-16.08618,-53.216097,-6.78366,-26.544905,-2.736081,-4.007526,-2.558912,67.300873,55.97189,-0.000725,-0.016435,-0.107041,15.77254,-3.178521,411.379185,-10.744164,-0.094251,-0.001733,-0.009327,131.407791,0.200138,-0.00909,-0.351862,-0.214366,-0.211608,-0.001884,-2.3e-05,-3e-05,-2.65939,-0.065583,-0.700765,-192.026959,-2937.6572,-1181.982633,-1755.674564,447.193953,1258.981645,-119.662019,-54.602524,-65.059494,29.770382,-7.997875,-10.717958,-28.567778,-4.130284,-8.088986,39.828289,13.668705,-3.958702,1.254749,-1.671324,-0.001656,-28.681646,-1.882665,-0.265234,-0.408162,-0.091644,-0.237576,-0.295662,-0.028454,-0.044465,-0.301128,-0.554677,-0.036834,-0.130031,-2.783592,-2.60662,-5.390212,-4.022547,0.0,-2.824022,8.633562,-1.2015,-0.998268,-0.203232,0.0,20.941245,-0.222852,-0.134088,0.0,-0.030537,-0.125866,-0.096986,-0.679774,-0.626985,-0.691912,-0.506613,-0.185299,-0.598716,-0.000115,-0.250188,-0.348913,-0.828382,-42.275915,-3.950157,-0.253037,-0.318148,-2.29064,-3.447583,-0.040043,-9.408469,-0.212137,-11.955314,-1.019293,-2.473446,-5.62084,-1183.178519,-1757.811263,-0.36799,0.246143,-2940.989783,-2293.941935,-0.343415,-0.08972,-0.278878,-0.433135,-0.024048,447.486292,1259.031589,1706.517942,-0.058077,0.009549,0.424871,-5.618164,-14.681641,-3.573283,-0.468443,-0.417283,-190.670372,-1.143223,-2.722591,-0.825973,-1.935988,0.0,-1.276187,-0.020137,-0.042636,-29.797016,57.495858,34.03637,-30.888325,-3.861461,-0.317164,-0.007024,-0.143269,-39.417671,-0.212646,0.980438,-4.4e-05,-0.000379,35.451144,-0.261309,-0.536315,-0.061481,-0.152157,-0.002595,-4.678214,-0.014542,0.300492,-21.183166,-44.376426,-25.320085,-51.984826,-25.961228,-54.662236,-13.614497,-30.821719,-0.028857,-0.063214,-0.019198,-0.033778,-0.003149,-0.005184,-0.001431,-0.00189,19.742637,16.406362,13.067242,13.458569,-30.978121,-65.10072,-16.354808,-35.303704,-0.034569,-0.163184,-109.036398,-0.466683,-1.929048,-0.623737,-0.228106,-0.2
51959,-0.000567,-0.433736,-0.000708,-0.02921,-0.104665,-0.001358,0.0,0.039208,-0.334356,-0.008999,-1035.951824,-45175.257711,-0.622901,-30.716053,-61790.157098,0.756864,-39131.166127,-9239.707081,-2.10805,-8.3e-05,-0.622896,-4e-05,-0.620772,-0.012257,-0.107878,967399700.0,-39.474078,1033869000.0,-120441800.0,5.232666,-0.465191,-33.302382,38.871014,4.227508,-0.364694,-0.133771,-0.209468,2.643495,-109.884564,-573.69102,3.631719,43.889293,-108.409742,-509.437331,-27.617978,-5.295406,-891.373846,-544.770792,-20.996269,48.369552,80.252276,-13.832889,-0.694428,-1.175933,-0.45614,0.0
2,2015199,1545598800,-67.019971,157.050888,-63.180786,178.103202,-68.598246,156.99821,3.51382,25.183903,-0.417924,-1.689642,-2.736081,9.226737,-2.558912,-66.189127,-69.87811,-0.000725,-0.016435,-0.107041,-5.41746,-0.638521,3.839185,8.855836,-0.094251,-0.001733,-0.009327,0.457791,0.200138,-0.00909,0.648138,0.785634,0.788392,-0.001884,-2.3e-05,-3e-05,16.94061,-0.065583,1.839235,507.973041,11357.393596,5126.807163,6230.586136,-89.504287,-119.724355,-25.128689,-24.602526,-0.526164,-7.896282,-7.917057,-10.716588,-28.56709,4.869716,0.911014,88.828289,22.668705,-3.958702,19.254749,-0.671324,-0.001656,0.318354,-3.882665,0.244766,-0.118162,-0.021644,0.202424,0.054338,-0.028454,-0.044465,-0.301128,-0.554677,-0.036834,-0.130031,-2.783592,-2.60662,-5.390212,-4.022547,0.0,-0.284022,46.243562,-1.2015,-0.998268,-0.203232,0.0,-0.248755,-0.222852,-0.134088,0.0,-0.030537,-0.125866,-0.096986,-0.679774,-0.626985,-0.691912,-0.506613,-0.185299,-0.598716,-0.000115,-0.250188,-0.348913,-0.828382,772.984085,-3.950157,-0.253037,-0.318148,-2.29064,-0.907583,-0.040043,-1.768469,-0.212137,-4.315314,-1.019293,-2.473446,13.37916,5125.611277,6228.449437,-0.32799,0.596143,11354.061013,12001.108861,-0.343415,-0.08972,-0.278878,-0.433135,-0.024048,-89.211948,-119.674411,-208.886358,0.721923,-0.060451,0.424871,-5.618164,24.318359,1.426717,0.531557,0.382717,-190.670372,-1.143223,-2.722591,-0.825973,-1.935988,0.0,-1.276187,-0.020137,-0.042636,-10.247016,-43.587468,-17.96363,16.111675,-2.861461,-0.317164,-0.007024,-0.143269,7.582329,-0.212646,-0.019562,-4.4e-05,-0.000379,-2.548856,-0.261309,-0.536315,-0.061481,-0.152157,-0.002595,-3.678214,-0.014542,0.260492,12.816834,-14.376424,16.679915,12.548504,8.038772,-24.662234,6.385503,-9.271719,-0.028857,-0.063214,-0.019198,-0.033778,-0.003149,-0.005184,-0.001431,-0.00189,-1.257363,-2.793637,-1.932758,-5.008096,11.021879,-0.56739,6.645192,9.67963,-0.034569,-0.163184,-102.26413,-0.466683,-1.929048,-0.623737,-0.228106,-0.251959,-0.000567,-0.433736,-0.000708,-0.0292
1,-0.104665,-0.001358,0.0,0.039208,-0.334356,-0.008999,1109.231176,-45175.257711,0.377099,-30.716053,-61790.157098,0.756864,-41331.166127,-9239.707081,-2.10805,-8.3e-05,0.377104,-4e-05,0.379228,-0.012257,-0.107878,973447700.0,-55.744078,1087204000.0,-120441800.0,-10.580668,-0.465191,-19.302382,149.871014,119.227508,-0.364694,-0.133771,-0.209468,123.643495,-109.884564,-873.69102,-2.368281,-247.110707,-108.409742,-512.437331,2232.382022,-17.295406,-977.373846,-613.770792,-12.996269,-37.630448,10829.252276,-25.832889,-0.694428,-12.175933,-0.45614,0.0
3,2021765,1534107600,7.010029,150.200888,-6.930786,216.213202,76.621754,351.84821,-16.08618,-65.076097,-6.78366,-30.006538,-2.736081,-4.007526,-2.558912,37.620873,26.29189,-0.000725,-0.016435,-0.107041,83.55254,-3.178521,-13.940815,-10.744164,-0.094251,-0.001733,-0.009327,-2.082209,0.200138,-0.00909,-0.351862,-0.214366,-0.211608,-0.001884,-2.3e-05,-3e-05,-2.65939,-0.065583,-0.700765,7.973041,-1382.965804,363.802563,-1746.768314,-89.504287,-119.724355,-71.128686,-38.819191,-32.309494,-7.896282,-5.858933,-10.717958,-28.571103,-19.130284,-21.088986,-24.171711,0.668705,-1.958702,-5.745251,-0.671324,-0.001656,-9.681646,1.117335,-0.265234,0.141838,0.028356,-0.237576,0.204338,-0.028454,-0.044465,-0.301128,-0.554677,-0.036834,-0.130031,-2.783592,-2.60662,-5.390212,-4.022547,0.0,-2.824022,-10.706438,-1.2015,-0.998268,-0.203232,0.0,-0.248755,-0.222852,-0.134088,0.0,-0.030537,-0.125866,-0.096986,-0.679774,-0.626985,-0.691912,-0.506613,-0.185299,-0.598716,-0.000115,-0.250188,-0.348913,-0.828382,-42.275915,-3.950157,-0.253037,-0.318148,-2.29064,-3.447583,-0.040043,-9.408469,-0.212137,-11.955314,-1.019293,1.526554,-2.62084,362.606677,-1748.905011,0.20201,-0.153857,-1386.298387,-2298.725139,-0.343415,-0.08972,-0.278878,-0.433135,-0.024048,-89.211948,-119.674411,-208.886358,-0.038077,0.269549,0.254871,1.381836,-10.681641,1.426717,0.531557,0.382717,-190.670372,13.856777,6.277409,13.174027,6.064012,0.0,-1.276187,-0.020137,-0.042636,-23.597016,11.079198,-17.96363,-34.888325,-3.861461,-0.317164,-0.007024,-0.143269,-16.417671,-0.212646,-0.019562,-4.4e-05,-0.000379,-2.548856,-0.261309,-0.536315,-0.061481,-0.152157,-0.002595,-1.678214,-0.014542,-0.229508,-21.183166,-44.376426,-25.320085,-51.984826,-21.961228,-38.878903,-10.614497,-19.105053,-0.028857,-0.063214,-0.019198,-0.033778,-0.003149,-0.005184,-0.001431,-0.00189,-1.257363,-2.793637,-1.932758,-5.008096,-21.978121,-32.35072,-8.354808,-8.753703,-0.034569,-0.163184,-109.036398,0.533317,3.070952,0.376263,-0.228106,-0.251959,-0.000567,0.5
66264,-0.000708,-0.02921,-0.104665,-0.001358,0.0,0.039208,0.665644,-0.008999,-11953.712824,-1797.257711,0.377099,-39.096053,-61813.537098,-0.243136,-42051.166127,-9239.707081,-2.10805,-8.3e-05,0.377104,-4e-05,0.379228,-0.012257,-0.107878,961956500.0,-43.714078,-440560400.0,-120441800.0,-8.045113,7.534809,-33.302382,-216.128986,952.227508,-0.364694,-0.133771,-0.209468,956.643495,-109.884564,-603.69102,6.631719,-60.110707,-34.409742,-512.437331,-92.617978,-17.295406,-973.373846,-613.770792,-23.996269,-37.630448,-205.747724,-24.832889,-0.694428,-11.175933,-0.45614,1.0
4,2027465,1533502800,-90.439971,134.220888,-104.380786,153.643202,-109.798246,132.53821,-16.08618,-65.076097,-6.78366,-30.006538,-2.736081,-4.007526,-2.558912,-66.189127,-77.51811,-0.000725,-0.016435,-0.107041,-5.41746,3.181479,-13.940815,-10.744164,-0.094251,-0.001733,-0.009327,-2.082209,0.200138,-0.00909,-0.351862,-0.214366,-0.211608,-0.001884,-2.3e-05,-3e-05,-2.65939,-0.065583,-0.700765,457.973041,-2942.440404,-1186.765837,-1755.674564,719.681263,5981.273645,-119.662019,-54.602524,-65.059494,95.470388,-7.997875,-10.717958,-28.571103,8.869716,6.911014,-19.171711,-29.331295,-1.958702,15.254749,-1.671324,-0.001656,-32.681646,-4.882665,-0.265234,-0.408162,-0.091644,-0.237576,-0.295662,-0.028454,-0.044465,-0.301128,-0.554677,-0.036834,-0.130031,-2.783592,-2.60662,-5.390212,-4.022547,0.0,-2.824022,-10.706438,-1.2015,-0.998268,-0.203232,0.0,-0.248755,-0.222852,-0.134088,0.0,-0.030537,-0.125866,-0.096986,-0.679774,-0.626985,-0.691912,-0.506613,-0.185299,-0.598716,-0.000115,6.109812,-0.348913,-0.828382,-42.275915,-3.950157,-0.253037,-0.318148,-2.29064,-3.447583,-0.040043,-9.408469,-0.212137,-11.955314,-1.019293,-2.473446,-5.62084,-1187.961723,-1757.811263,-0.12799,-0.153857,-2945.772987,-2298.725139,-0.343415,-0.08972,-0.278878,-0.433135,-0.024048,719.973602,5981.323589,6701.297242,0.301923,0.129549,-0.105129,-5.618164,-14.681641,-3.573283,-0.468443,-0.417283,-190.670372,-1.143223,-2.722591,-0.825973,-1.935988,0.0,-1.276187,-0.020137,-0.042636,-29.797016,-116.020802,-42.96363,-34.888325,-3.861461,-0.317164,-0.007024,-0.143269,-43.417671,-0.212646,-0.019562,-4.4e-05,-0.000379,52.451144,2.738691,-0.536315,-0.061481,-0.152157,-0.002595,-4.678214,-0.014542,0.100492,-21.183166,-44.376426,-25.320085,-51.984826,-25.961228,-54.662236,-13.614497,-30.821719,-0.028857,-0.063214,-0.019198,-0.033778,-0.003149,-0.005184,-0.001431,-0.00189,8.742637,54.089693,9.067242,41.475238,-30.978121,-65.10072,-16.354808,-35.303704,-0.034569,-0.163184,-109.036398,0.533317,2.070952,-0.623737,0.771894
,0.748041,-0.000567,0.566264,-0.000708,-0.02921,0.895335,-0.001358,0.0,0.039208,0.665644,-0.008999,2596.299176,-45175.257711,0.377099,-30.716053,-61790.157098,-0.243136,-42051.166127,-9239.707081,-2.10805,-8.3e-05,0.377104,-4e-05,0.379228,-0.012257,-0.107878,-572669500.0,-58.544078,-440560400.0,-120441800.0,-4.350668,-1.465191,-33.302382,14.871014,897.227508,-0.364694,-0.133771,-0.209468,901.643495,-35.884564,-829.69102,2.631719,-242.110707,-105.409742,-496.437331,-70.617978,105.704594,1643.626154,2007.229208,206.003731,-21.630448,6667.252276,92.167111,-0.694428,49.824067,47.54386,0.0
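The `Unnamed: 0` column typically appears when a file was saved together with its row index. As a small sketch on an in-memory, tab-separated sample (the two rows are shortened copies of the output above, and only two feature columns are kept), dropping the column after reading and reading it as the index via `index_col=0` give the same data columns:

```python
import io

import pandas as pd

# A tiny tab-separated sample with an unnamed leading index column.
tsv = (
    "\tid\tbuy_time\t0\t1\n"
    "0\t2013026\t1531688400\t18.910029\t46.980888\n"
    "1\t2014722\t1539550800\t36.690029\t152.400888\n"
)

# Variant used in the notebook: read everything, then drop the index copy.
dropped = pd.read_csv(io.StringIO(tsv), sep="\t").drop(columns=["Unnamed: 0"])

# Alternative: tell pandas that the first column is the index.
indexed = pd.read_csv(io.StringIO(tsv), sep="\t", index_col=0)

print(list(dropped.columns))  # ['id', 'buy_time', '0', '1']
```

Note that the feature column names are read as strings (`'0'`, `'1'`, ...), not integers, which matters when selecting them later.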


In [27]:
# check whether id contains duplicates: it does
# this is expected: the task is about user-service pairs, and one user can have several feature rows
id_double = features.id.value_counts()
id_double[id_double > 1] 

2456449    2
868170     2
4350738    2
2837963    2
2038831    2
          ..
2588556    2
3307448    2
1707210    2
3533762    2
1030284    2
Name: id, Length: 149789, dtype: int64
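A repeated `id` is not necessarily a duplicated row. The sketch below, on made-up toy data, separates ids that occur more than once from rows that are exact duplicates of the `(id, buy_time)` pair; the column names follow the features table, everything else is illustrative.

```python
import pandas as pd

toy = pd.DataFrame(
    {
        "id": [10, 10, 11, 12, 12],
        "buy_time": [100, 200, 100, 100, 100],
    }
)

# ids that occur more than once (same check as in the cell above)
rows_per_id = toy["id"].value_counts()
repeated_ids = rows_per_id[rows_per_id > 1]

# rows whose (id, buy_time) pair occurs more than once
exact_dupes = toy.duplicated(subset=["id", "buy_time"], keep=False)

print(sorted(repeated_ids.index.tolist()))  # [10, 12]
print(int(exact_dupes.sum()))               # 2
```

Here id 10 has two snapshots at different times, while id 12 has a genuinely duplicated pair; only the second case would need deduplication.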

In [29]:
# look at one id as an example: a single user can have several feature vectors at different points in time
features.loc[features.id == 1707210]

Unnamed: 0,id,buy_time,0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30,31,32,33,34,35,36,37,38,39,40,41,42,43,44,45,46,47,48,49,50,51,52,53,54,55,56,57,58,59,60,61,62,63,64,65,66,67,68,69,70,71,72,73,74,75,76,77,78,79,80,81,82,83,84,85,86,87,88,89,90,91,92,93,94,95,96,97,98,99,100,101,102,103,104,105,106,107,108,109,110,111,112,113,114,115,116,117,118,119,120,121,122,123,124,125,126,127,128,129,130,131,132,133,134,135,136,137,138,139,140,141,142,143,144,145,146,147,148,149,150,151,152,153,154,155,156,157,158,159,160,161,162,163,164,165,166,167,168,169,170,171,172,173,174,175,176,177,178,179,180,181,182,183,184,185,186,187,188,189,190,191,192,193,194,195,196,197,198,199,200,201,202,203,204,205,206,207,208,209,210,211,212,213,214,215,216,217,218,219,220,221,222,223,224,225,226,227,228,229,230,231,232,233,234,235,236,237,238,239,240,241,242,243,244,245,246,247,248,249,250,251,252
2497151,1707210,1539550800,-96.799971,-148.859112,-110.740786,-201.466798,-111.118246,-212.49179,-16.08618,21.363903,-6.78366,56.433462,-2.736081,-4.007526,-2.558912,-66.189127,-77.51811,-0.000725,-0.016435,-0.107041,-0.37746,-3.178521,-13.940815,-10.744164,-0.094251,-0.001733,-0.009327,-2.082209,-0.799862,-0.00909,0.648138,0.785634,-0.211608,-0.001884,-2.3e-05,-3e-05,-2.65939,-0.065583,-0.700765,-192.026959,-2942.440404,-1186.765837,-1755.674564,14.368763,-119.724355,-119.662019,-54.602524,-65.059494,128.703718,-7.997875,-10.717958,-28.571103,4.869716,1.911014,-24.171711,-20.331295,-2.958702,-5.745251,-1.671324,-0.001656,-32.681646,-4.882665,-0.265234,-0.408162,-0.091644,-0.237576,-0.295662,-0.028454,-0.044465,-0.301128,-0.554677,-0.036834,-0.130031,-2.783592,-2.60662,-5.390212,-4.022547,0.0,-2.824022,-10.706438,-1.2015,-0.998268,-0.203232,0.0,-0.248755,-0.222852,-0.134088,0.0,-0.030537,-0.125866,-0.096986,-0.679774,-0.626985,-0.691912,-0.506613,-0.185299,-0.598716,-0.000115,4.789812,-0.348913,-0.828382,-42.275915,-3.950157,-0.253037,-0.318148,-2.29064,-3.447583,-0.040043,-9.408469,-0.212137,-11.955314,-1.019293,-2.473446,-5.62084,-1187.961723,-1757.811263,-0.36799,-0.393857,-2945.772987,-2298.725139,-0.343415,-0.08972,-0.278878,-0.433135,-0.024048,14.661102,-119.674411,-105.013308,-0.058077,-0.060451,-0.575129,-5.618164,-14.681641,-3.573283,-0.468443,-0.417283,-190.670372,-1.143223,-2.722591,-0.825973,-1.935988,0.0,-1.276187,-0.020137,-0.042636,-29.797016,-116.020802,-42.96363,-34.888325,-3.861461,-0.317164,-0.007024,-0.143269,-43.417671,-0.212646,-0.019562,-4.4e-05,-0.000379,48.451144,1.738691,-0.536315,-0.061481,-0.152157,-0.002595,-4.678214,-0.014542,0.010492,-21.183166,-44.376426,-25.320085,-51.984826,-25.961228,-54.662236,-13.614497,-30.821719,-0.028857,-0.063214,-0.019198,-0.033778,-0.003149,-0.005184,-0.001431,-0.00189,72.742637,62.373033,88.067242,66.425238,-30.978121,-65.10072,-16.354808,-35.303704,-0.034569,-0.163184,-109.036398,0.533317,2.070952,-0.6237
37,-0.228106,0.748041,-0.000567,-0.433736,-0.000708,-0.02921,-0.104665,-0.001358,0.0,0.039208,0.665644,0.991001,-3393.574824,-45175.257711,-0.622901,-30.716053,-61790.157098,-0.243136,-42051.166127,-9239.707081,-2.10805,-8.3e-05,-0.622896,-4e-05,-0.620772,-0.012257,-0.107878,-572669500.0,-58.544078,-440560400.0,-120441800.0,8.099332,-1.465191,-33.302382,-265.128986,-37.772492,-0.364694,-0.133771,-0.209468,-33.356505,-109.884564,-317.69102,-5.368281,-235.110707,-108.409742,34.562669,-106.617978,-17.295406,-976.373846,-613.770792,-25.996269,-29.630448,-305.747724,0.167111,-0.694428,-12.175933,-0.45614,1.0
2746322,1707210,1546808400,-351.799971,-230.209112,-245.740786,-162.816798,-251.158246,-173.84179,-101.08618,112.893903,-91.78366,147.963462,-2.736081,-4.007526,-2.558912,-222.019127,-233.34811,-0.000725,-0.016435,-0.107041,-5.41746,-17.348521,106.059185,-95.744164,-0.094251,-0.001733,-0.009327,-2.082209,-0.799862,-0.00909,1.648138,0.785634,0.788392,-0.001884,-2.3e-05,-3e-05,-87.65939,-0.065583,-14.870765,-192.026959,-2942.440404,-1186.765837,-1755.674564,4871.007413,-119.724355,-119.662019,-54.602524,-65.059494,11.93705,-7.997875,-10.717958,-28.571103,-26.130284,-30.088986,-30.171711,-8.331295,-3.958702,-8.745251,-1.671324,-0.001656,-32.681646,-4.882665,-0.265234,-0.408162,-0.091644,-0.237576,-0.295662,-0.028454,-0.044465,-0.301128,-0.554677,-0.036834,-0.130031,-2.783592,-2.60662,-5.390212,-4.022547,0.0,-2.824022,-10.706438,-1.2015,-0.998268,-0.203232,0.0,-0.248755,-0.222852,-0.134088,0.0,-0.030537,-0.125866,-0.096986,-0.679774,-0.626985,-0.691912,-0.506613,-0.185299,-0.598716,-0.000115,-0.250188,-0.348913,-0.828382,-42.275915,-3.950157,-0.253037,-0.318148,-2.29064,-3.447583,-0.040043,-9.408469,-0.212137,-11.955314,-1.019293,-2.473446,-5.62084,-1187.961723,-1757.811263,0.41201,0.366143,-2945.772987,-2298.725139,-0.343415,-0.08972,-0.278878,-0.433135,-0.024048,4871.299752,-119.674411,4751.625342,-0.048077,-0.040451,0.004871,-5.618164,-14.681641,-3.573283,-0.468443,-0.417283,-115.670372,-1.143223,-2.722591,-0.825973,-1.935988,0.0,-1.276187,-0.020137,-0.042636,-29.797016,-116.020802,-42.96363,-34.888325,-3.861461,-0.317164,-0.007024,-0.143269,-43.417671,-0.212646,-0.019562,-4.4e-05,-0.000379,17.451144,0.738691,-0.536315,-0.061481,-0.152157,-0.002595,-4.678214,-0.014542,0.230492,-21.183166,-44.376426,-25.320085,-51.984826,-25.961228,-54.662236,-13.614497,-30.821719,-0.028857,-0.063214,-0.019198,-0.033778,-0.003149,-0.005184,-0.001431,-0.00189,7.742637,2.12303,7.067242,9.90857,-30.978121,-65.10072,-16.354808,-35.303704,-0.034569,-0.163184,-109.036398,0.533317,2.070952,-
0.623737,-0.228106,0.748041,-0.000567,-0.433736,-0.000708,-0.02921,-0.104665,-0.001358,0.0,0.039208,0.665644,0.991001,-50146.056824,-45175.257711,-0.622901,-30.716053,-61790.157098,-0.243136,-42051.166127,-9239.707081,-2.10805,-8.3e-05,-0.622896,-4e-05,-0.620772,-0.012257,-0.107878,-572669500.0,-58.544078,-440560400.0,-120441800.0,-7.314954,-1.465191,-33.302382,-240.128986,-37.772492,-0.364694,-0.133771,-0.209468,-34.356505,-109.884564,4539.30898,-2.368281,1243.889293,-108.409742,3409.562669,-100.617978,-17.295406,-976.373846,-613.770792,-25.996269,-37.630448,-284.747724,99.167111,-0.694428,-12.175933,-0.45614,0.0


In [6]:
data = pd.read_csv('data_train.csv', sep=',').drop(columns=['Unnamed: 0'])

data.head()  

Unnamed: 0,id,vas_id,buy_time,target
0,540968,8.0,1537131600,0.0
1,1454121,4.0,1531688400,0.0
2,2458816,1.0,1534107600,0.0
3,3535012,5.0,1535922000,0.0
4,1693214,1.0,1535922000,0.0


In [25]:
data.loc[data.id == 2076220]

Unnamed: 0,id,vas_id,buy_time,target
377380,2076220,6.0,1542574800,1.0
377378,2076220,4.0,1542574800,1.0
377379,2076220,4.0,1543179600,0.0


In [15]:
data.buy_time.nunique()

26
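A quick sanity check on the `buy_time` values visible in the train outputs above: the gaps between them are whole multiples of one week. This is consistent with timestamps recorded at weekly granularity, although a handful of sampled values cannot prove it for the whole dataset.

```python
WEEK = 7 * 24 * 60 * 60  # 604800 seconds

# buy_time values copied from the data_train outputs shown above
sample = sorted(
    {1537131600, 1531688400, 1534107600, 1535922000, 1542574800, 1543179600}
)

# Gap between consecutive sampled timestamps, expressed in weeks.
gaps_in_weeks = [(b - a) / WEEK for a, b in zip(sample, sample[1:])]
print(gaps_in_weeks)  # [4.0, 3.0, 2.0, 9.0, 1.0]
```

The same check on the full column could be written as `(data.buy_time.drop_duplicates().sort_values().diff().dropna() % WEEK == 0).all()`.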

In [7]:
# merge data and features into one dataframe with merge_asof: match on id and take the row with the nearest buy_time (direction='nearest')
# both dataframes must be sorted by buy_time beforehand
data = data.sort_values(by=['buy_time'])
features = features.sort_values(by=['buy_time'])  # slow
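To show what a nearest-time join per user does, here is a minimal sketch on toy data. The frames `offers` and `profiles` and the column `feature_0` are made up for illustration; only the `pd.merge_asof` arguments mirror the ones used in this notebook.

```python
import pandas as pd

# Offers made to users (left side); must be sorted by the `on` key.
offers = pd.DataFrame(
    {"id": [1, 1, 2], "vas_id": [4.0, 6.0, 1.0], "buy_time": [100, 300, 200]}
).sort_values("buy_time")

# Feature snapshots per user (right side); also sorted by the `on` key.
profiles = pd.DataFrame(
    {"id": [1, 1, 2], "buy_time": [90, 280, 500], "feature_0": [0.1, 0.2, 0.3]}
).sort_values("buy_time")

# For each offer, pick the snapshot of the same id with the closest buy_time.
merged = pd.merge_asof(
    offers, profiles, on="buy_time", by="id", direction="nearest"
)

print(merged["id"].tolist())         # [1, 2, 1]
print(merged["feature_0"].tolist())  # [0.1, 0.3, 0.2]
```

User 2 is matched with a snapshot taken after the offer (500 vs 200), because `direction='nearest'` looks both ways in time. If using information from the future is a concern, `direction='backward'` restricts matches to snapshots at or before the offer, at the cost of leaving some rows without features.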

In [10]:
data_train_merged = pd.merge_asof(data, features, on='buy_time', by='id', direction='nearest')
data_train_merged.head()

Unnamed: 0,id,vas_id,buy_time,target,0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30,31,32,33,34,35,36,37,38,39,40,41,42,43,44,45,46,47,48,49,50,51,52,53,54,55,56,57,58,59,60,61,62,63,64,65,66,67,68,69,70,71,72,73,74,75,76,77,78,79,80,81,82,83,84,85,86,87,88,89,90,91,92,93,94,95,96,97,98,99,100,101,102,103,104,105,106,107,108,109,110,111,112,113,114,115,116,117,118,119,120,121,122,123,124,125,126,127,128,129,130,131,132,133,134,135,136,137,138,139,140,141,142,143,144,145,146,147,148,149,150,151,152,153,154,155,156,157,158,159,160,161,162,163,164,165,166,167,168,169,170,171,172,173,174,175,176,177,178,179,180,181,182,183,184,185,186,187,188,189,190,191,192,193,194,195,196,197,198,199,200,201,202,203,204,205,206,207,208,209,210,211,212,213,214,215,216,217,218,219,220,221,222,223,224,225,226,227,228,229,230,231,232,233,234,235,236,237,238,239,240,241,242,243,244,245,246,247,248,249,250,251,252
0,2582523,2.0,1531083600,0.0,314.560029,9.290888,342.989214,7.523202,337.571754,-13.58179,-16.08618,-65.076097,-6.78366,-30.006538,-2.736081,-4.007526,-2.558912,340.590873,329.26189,-0.000725,-0.016435,-0.107041,-5.41746,1.401479,28.429185,-10.744164,-0.094251,-0.001733,-0.009327,2.497791,0.200138,-0.00909,0.648138,0.785634,-0.211608,-0.001884,-2.3e-05,-3e-05,-2.65939,-0.065583,-0.700765,307.973041,9728.201596,-330.600797,10058.802436,-89.504287,-119.724355,-6.012019,4.514146,-10.526161,160.453718,-4.418641,-10.717958,-28.571103,-2.130284,-4.088986,-30.171711,-25.331295,-1.958702,-6.745251,-1.671324,-0.001656,2.318354,-2.882665,-0.265234,-0.178162,-0.011644,-0.237576,-0.265662,-0.028454,-0.044465,-0.301128,-0.554677,-0.036834,-0.130031,-2.783592,-2.60662,-5.390212,-4.022547,0.0,-2.824022,-10.706438,-1.2015,-0.998268,-0.203232,0.0,-0.248755,-0.222852,-0.134088,0.0,-0.030537,-0.125866,-0.096986,-0.679774,-0.626985,-0.691912,-0.506613,-0.185299,-0.598716,-0.000115,-0.250188,-0.348913,-0.828382,-42.275915,-3.950157,-0.253037,-0.318148,-2.29064,1.132417,-0.040043,-9.408469,-0.212137,-11.955314,-1.019293,-1.473446,-0.62084,-331.796683,10056.665737,0.19201,0.006143,9724.869013,-2298.725139,-0.343415,-0.08972,-0.278878,-0.433135,-0.024048,-89.211948,-119.674411,-208.886358,0.091923,0.039549,0.244871,-1.618164,-3.681641,1.426717,0.531557,0.182717,-190.670372,6.856777,11.277409,5.174027,9.064012,0.0,-1.276187,-0.020137,-0.042636,-8.930349,-64.720802,43.03637,-34.888325,-3.861461,0.352836,-0.007024,-0.143269,0.582329,-0.212646,-0.019562,-4.4e-05,-0.000379,-2.548856,-0.261309,-0.536315,-0.061481,0.597843,-0.002595,-1.678214,-0.014542,0.090492,-21.183166,-44.376426,-25.320085,-51.984826,-16.961228,4.454434,-12.614497,-29.955052,-0.028857,-0.063214,-0.019198,-0.033778,-0.003149,-0.005184,-0.001431,-0.00189,28.742637,81.589693,50.067242,78.958574,-18.978121,-10.567387,-9.354808,-1.637036,-0.034569,-0.163184,88.758132,-0.466683,2.070952,-0.623737,-0.228106,-0.251959,-0.000567,-0.43
3736,-0.000708,-0.02921,-0.104665,-0.001358,0.0,0.039208,-0.334356,-0.008999,1482.274176,-40034.257711,0.377099,159.323947,-61602.817098,-0.243136,-42051.166127,-9239.707081,-2.10805,-8.3e-05,0.377104,-4e-05,0.379228,-0.012257,-0.107878,972497300.0,348.235922,-440560400.0,-120441800.0,5.613617,-0.465191,-32.302382,193.871014,16.227508,-0.364694,-0.133771,-0.209468,19.643495,-109.884564,854.30898,-4.368281,660.889293,-108.409742,309.562669,508.382022,305.704594,6488.626154,-574.770792,-24.996269,121.369552,142.252276,-16.832889,-0.694428,-11.175933,-0.45614,0.0
1,1292549,2.0,1531083600,0.0,93.880029,-217.499112,79.939214,-270.106798,74.521754,-291.21179,-16.08618,-65.076097,-6.78366,-30.006538,-2.736081,-4.007526,-2.558912,124.490873,113.16189,-0.000725,-0.016435,-0.107041,-5.41746,-3.178521,-13.940815,-10.744164,-0.094251,-0.001733,-0.009327,-2.082209,0.200138,-0.00909,-0.351862,-0.214366,-0.211608,-0.001884,-2.3e-05,-3e-05,-2.65939,-0.065583,-0.700765,57.973041,-2942.440404,-1186.765837,-1755.674564,-89.504287,-119.724355,-33.478685,-2.552524,-30.926164,-7.896282,-5.785383,-10.717958,-28.571103,-25.130284,-30.088986,-56.171711,-24.331295,-4.958702,-8.745251,-1.671324,-0.001656,-10.681646,-0.882665,0.654766,-0.028162,-0.011644,0.622424,-0.005662,-0.028454,-0.044465,-0.301128,-0.554677,-0.036834,-0.130031,-2.783592,-2.60662,-5.390212,-4.022547,0.0,-2.824022,-10.706438,-1.2015,-0.998268,-0.203232,0.0,-0.248755,-0.222852,-0.134088,0.0,-0.030537,-0.125866,-0.096986,-0.679774,-0.626985,-0.691912,-0.506613,-0.185299,-0.598716,-0.000115,-0.250188,-0.348913,-0.828382,-42.275915,-3.950157,-0.253037,-0.318148,-2.29064,-3.447583,-0.040043,-9.408469,-0.212137,-11.955314,-1.019293,-0.473446,5.37916,-1187.961723,-1757.811263,-0.36799,-0.393857,-2945.772987,-2298.725139,-0.343415,-0.08972,-0.278878,-0.433135,-0.024048,-89.211948,-119.674411,-208.886358,-0.058077,-0.060451,-0.575129,-0.618164,4.318359,1.426717,0.531557,0.182717,-190.670372,-1.143223,-1.722591,-0.825973,-0.935988,0.0,-1.276187,-0.020137,-0.042636,-16.797017,-51.754142,-20.96363,-13.888325,-3.861461,0.182836,-0.007024,-0.143269,-22.417671,-0.212646,-0.019562,-4.4e-05,-0.000379,-2.548856,-0.261309,-0.536315,-0.061481,-0.152157,-0.002595,-4.678214,-0.014542,-0.089508,-2.183166,7.673574,-2.320085,-20.651492,-6.961228,-2.612236,-0.614497,15.178281,-0.028857,-0.063214,-0.019198,-0.033778,-0.003149,-0.005184,-0.001431,-0.00189,-1.257363,-2.793637,-1.932758,-5.008096,-6.978121,-30.967385,0.645192,-14.170369,-0.034569,-0.163184,-109.036398,0.533317,1.070952,-0.623737,-0.228106,-0.
251959,-0.000567,-0.433736,-0.000708,-0.02921,-0.104665,-0.001358,0.0,0.039208,-0.334356,-0.008999,-11953.712824,-45175.257711,0.377099,-30.716053,-61790.157098,-0.243136,-42051.166127,-9239.707081,-2.10805,-8.3e-05,0.377104,-4e-05,0.379228,-0.012257,-0.107878,964894100.0,132.135922,-440560400.0,-120441800.0,14.399332,-1.465191,-33.302382,-266.128986,-39.772492,-0.364694,-0.133771,-0.209468,-35.356505,-109.884564,-876.69102,-5.368281,-247.110707,-108.409742,-512.437331,-106.617978,-17.295406,-977.373846,-613.770792,-25.996269,-37.630448,-306.747724,-25.832889,-0.694428,-12.175933,-0.45614,0.0
2,4053116,1.0,1531083600,0.0,125.110029,152.190888,111.169214,107.213202,105.751754,86.10821,-16.08618,-56.686097,-6.78366,-30.006538,-2.736081,-4.007526,-2.558912,51.480873,144.39189,-0.000725,-0.016435,-0.107041,-5.41746,-3.178521,-13.940815,-10.744164,-0.094251,-0.001733,-0.009327,-2.082209,0.200138,-0.00909,-0.351862,-0.214366,-0.211608,-0.001884,-2.3e-05,-3e-05,-2.65939,-0.065583,-0.700765,57.973041,-2942.440404,-1186.765837,-1755.674564,-87.628311,2879.306845,-119.662019,-54.602524,-65.059494,279.687058,-7.997875,-10.717958,-28.570149,-3.130284,-8.088986,43.828289,13.668705,-4.958702,-6.745251,-1.671324,-0.001656,-32.681646,-4.882665,-0.265234,-0.408162,-0.091644,-0.237576,-0.295662,-0.028454,-0.044465,-0.301128,-0.554677,-0.036834,-0.130031,-2.783592,-2.60662,-5.390212,-4.022547,0.0,-2.824022,-9.176438,-1.2015,-0.998268,-0.203232,0.0,-0.248755,-0.222852,-0.134088,0.0,-0.030537,-0.125866,-0.096986,-0.679774,-0.626985,-0.691912,-0.506613,-0.185299,-0.598716,-0.000115,-0.250188,-0.348913,103.411618,-42.275915,-3.950157,-0.253037,-0.318148,-2.29064,-3.447583,-0.040043,-9.408469,-0.212137,-11.955314,-1.019293,-2.473446,-5.62084,-1187.961723,-1757.811263,-0.14799,0.106143,-2945.772987,-2298.725139,-0.343415,-0.08972,-0.278878,-0.433135,-0.024048,-87.335971,2879.356789,2792.020842,-0.048077,-0.030451,0.114871,-5.618164,-14.681641,-3.573283,-0.468443,-0.417283,-190.670372,-1.143223,-2.722591,-0.825973,-1.935988,0.0,-1.276187,-0.020137,-0.042636,-29.797016,125.362528,-42.96363,-34.888325,-3.861461,-0.317164,-0.007024,-0.143269,-43.417671,-0.212646,-0.019562,-4.4e-05,-0.000379,40.451144,-0.261309,-0.536315,-0.061481,-0.152157,-0.002595,-4.678214,-0.014542,-0.039508,-21.183166,-44.376426,-25.320085,-51.984826,-25.961228,-54.662236,-13.614497,-30.821719,-0.028857,-0.063214,-0.019198,-0.033778,-0.003149,-0.005184,-0.001431,-0.00189,27.742637,102.073033,36.067242,177.708564,-30.978121,-65.10072,-16.354808,-35.303704,-0.034569,-0.163184,-109.036398,-0.466683,2.070952,0.3762
63,-0.228106,-0.251959,-0.000567,-0.433736,-0.000708,-0.02921,-0.104665,-0.001358,0.0,0.039208,-0.334356,-0.008999,19161.031176,-42621.257711,0.377099,-44.806053,-61824.077098,-0.243136,-42051.166127,-9239.707081,-2.10805,-8.3e-05,0.377104,-4e-05,0.379228,-0.012257,-0.107878,962993300.0,-41.734078,-440560400.0,-120441800.0,0.343777,-1.465191,-33.302382,-224.128986,54.227508,-0.364694,-0.133771,-0.209468,58.643495,-109.884564,861.30898,4.631719,-247.110707,8.590258,1098.562669,80.382022,-17.295406,-956.373846,-613.770792,-18.996269,761.369552,-213.747724,26.167111,-0.694428,39.824067,-0.45614,1.0
3,4158361,2.0,1531083600,0.0,-7.829971,-266.839112,-20.500786,-304.196798,-25.918246,-325.30179,-16.08618,-65.076097,-6.78366,-30.006538,-2.736081,-4.007526,-2.558912,22.780873,11.45189,-0.000725,-0.016435,-0.107041,-5.41746,-3.178521,-12.670815,-10.744164,-0.094251,-0.001733,-0.009327,-2.082209,0.200138,-0.00909,0.648138,0.785634,0.788392,-0.001884,-2.3e-05,-3e-05,-2.65939,-0.065583,-0.700765,-192.026959,-30.827104,-1186.709196,1155.882036,-89.504287,-119.724355,17.687981,45.730812,-28.042828,-7.896282,-7.350114,-10.717958,-28.571103,17.869716,12.911014,12.828289,5.668705,-4.958702,-8.745251,-1.671324,-0.001656,12.318354,0.117335,-0.045234,0.031838,0.028356,0.262424,0.084338,-0.028454,-0.044465,-0.301128,-0.554677,-0.036834,-0.130031,-2.783592,-2.60662,-5.390212,-4.022547,0.0,-2.824022,-10.706438,-1.2015,-0.998268,-0.203232,0.0,-0.248755,-0.222852,-0.134088,0.0,-0.030537,-0.125866,-0.096986,-0.679774,-0.626985,-0.691912,-0.506613,-0.185299,-0.598716,-0.000115,-0.250188,-0.348913,-0.828382,-42.275915,-3.950157,-0.253037,-0.318148,-2.29064,-3.447583,-0.040043,-9.408469,-0.212137,-11.955314,-1.019293,-1.473446,0.37916,-1187.905082,1153.745337,-0.17799,0.536143,-34.159687,612.888161,-0.343415,-0.08972,-0.278878,-0.433135,-0.024048,-89.211948,-119.674411,-208.886358,0.691923,-0.040451,0.414871,-3.618164,7.318359,1.426717,0.531557,-0.017283,1358.329628,-1.143223,-2.722591,-0.825973,-1.935988,0.0,-1.276187,-0.020137,-0.042636,-8.313681,-6.287472,-12.96363,29.111675,-3.861461,0.482836,-0.007024,-0.143269,20.582329,-0.212646,-0.019562,-4.4e-05,-0.000379,-2.548856,-0.261309,-0.536315,-0.061481,-0.152157,-0.002595,-4.678214,-0.014542,0.240492,19.816834,55.95691,4.679915,-14.96816,15.038772,45.6711,1.385503,27.578281,-0.028857,-0.063214,-0.019198,-0.033778,-0.003149,-0.005184,-0.001431,-0.00189,-1.257363,-2.793637,-1.932758,-5.008096,-0.978121,-28.084054,-2.354808,-19.770371,-0.034569,-0.163184,-19.577438,-0.466683,-1.929048,-0.623737,-0.228106,-0.251959,-0.000567,-0.433736,-0
.000708,-0.02921,-0.104665,-0.001358,0.0,0.039208,-0.334356,-0.008999,7515.359176,-45175.257711,-0.622901,-30.716053,-61790.157098,-0.243136,-42051.166127,-9239.707081,-2.10805,-8.3e-05,-0.622896,-4e-05,-0.620772,-0.012257,-0.107878,966190100.0,-45.834078,-440560400.0,-120441800.0,5.699332,2.534809,-33.302382,45.871014,-39.772492,-0.364694,-0.133771,-0.209468,-35.356505,-109.884564,-854.69102,13.631719,-247.110707,-108.409742,-510.437331,-35.617978,-17.295406,1257.626154,495.229208,-19.996269,-37.630448,-108.747724,-25.832889,-0.694428,-12.175933,-0.45614,1.0
4,3754468,4.0,1531083600,0.0,83.620029,535.610888,72.219214,503.343202,66.801754,502.66821,-16.08618,-65.076097,-6.78366,-30.006538,-2.736081,-4.007526,-2.558912,-54.359127,100.78189,-0.000725,-0.016435,-0.107041,-5.41746,-1.058521,-11.400815,-10.744164,-0.094251,-0.001733,-0.009327,-2.082209,-0.799862,-0.00909,-0.351862,-0.214366,-0.211608,-0.001884,-2.3e-05,-3e-05,-2.65939,-0.065583,-0.700765,39.973041,-2942.440404,-1186.765837,-1755.674564,-89.504287,-119.724355,21.837981,27.780806,-5.942824,-7.896282,-6.737805,-10.717958,-28.571103,-7.130284,-10.088986,-12.171711,-1.331295,-2.958702,-7.745251,-1.671324,-0.001656,-24.681646,-1.882665,-0.265234,-0.088162,0.088356,-0.237576,-0.245662,-0.028454,-0.044465,-0.301128,-0.554677,-0.036834,-0.130031,-2.783592,-2.60662,-5.390212,-4.022547,0.0,1.835978,-1.806438,-1.2015,-0.998268,-0.203232,0.0,-0.248755,-0.222852,-0.134088,0.0,-0.030537,-0.125866,-0.096986,-0.679774,-0.626985,-0.691912,-0.506613,-0.185299,-0.598716,-0.000115,-0.250188,-0.348913,-0.828382,-42.275915,-3.950157,-0.253037,-0.318148,65.44936,1.212417,-0.040043,162.121531,-0.212137,159.574686,-1.019293,-1.473446,3.37916,-1187.961723,-1757.811263,-0.36799,-0.393857,-2945.772987,-2298.725139,-0.343415,-0.08972,-0.278878,-0.433135,-0.024048,-89.211948,-119.674411,-208.886358,-0.058077,-0.060451,-0.575129,3.381836,16.318359,1.426717,0.531557,0.182717,-190.670372,-1.143223,-2.722591,-0.825973,-1.935988,0.0,-1.276187,-0.020137,-0.042636,5.669651,104.312528,-27.96363,4.111675,-1.861461,0.132836,-0.007024,-0.143269,-4.417671,-0.212646,-0.019562,-4.4e-05,-0.000379,-2.548856,-0.261309,-0.536315,-0.061481,0.347843,-0.002595,-2.678214,-0.014542,0.150492,27.816834,38.006904,49.679915,7.131844,23.038772,27.721094,-1.614497,-23.855054,-0.028857,-0.063214,-0.019198,-0.033778,-0.003149,-0.005184,-0.001431,-0.00189,-1.257363,-2.793637,-1.932758,-5.008096,44.021879,-5.98405,22.645192,-11.653702,-0.034569,-0.163184,-109.036398,0.533317,4.070952,0.376263,-0.228106,-0.251959,-0.000567
,0.566264,-0.000708,-0.02921,-0.104665,-0.001358,0.0,0.039208,-0.334356,-0.008999,57034.827176,-45175.257711,2.377099,-30.716053,-61790.157098,-0.243136,-40701.166127,-9239.707081,-2.10805,-8.3e-05,2.377104,-4e-05,2.379228,-0.012257,-0.107878,973447700.0,-56.854078,982044000.0,-120441800.0,-14.659491,-1.465191,-33.302382,-266.128986,-39.772492,-0.364694,-0.133771,-0.209468,-35.356505,-109.884564,-876.69102,-5.368281,-247.110707,-108.409742,-512.437331,-106.617978,-17.295406,-977.373846,-613.770792,-25.996269,-37.630448,-306.747724,-25.832889,-0.694428,-12.175933,-0.45614,0.0


In [13]:
data_train_merged.id.nunique() == data.id.nunique()  # sanity check: the number of unique ids is unchanged after the merge

True

In [14]:
data_train_merged.buy_time.nunique()

26
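
The `buy_time` column stores Unix timestamps, so the raw integers are hard to read. The sketch below uses three `buy_time` values that occur in `data_test.csv` and converts them to dates with `pd.to_datetime`. Treating the integers as seconds since the epoch in UTC is an assumption; the task description only says the column is a timestamp.

```python
import pandas as pd

# Three buy_time values that occur in data_test.csv
buy_time = pd.Series([1546808400, 1547413200, 1548018000])

# Assumption: the integers are seconds since the Unix epoch, read as UTC
dates = pd.to_datetime(buy_time, unit="s", utc=True)

# Gap between consecutive values, expressed in days
gaps_days = buy_time.diff().dropna() / 86400

print(dates.dt.strftime("%Y-%m-%d %H:%M").tolist())
print(gaps_days.tolist())
```

For these three values the gaps are exactly seven days, which suggests a weekly grid, at least for this part of the data.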

In [24]:
data_train_merged.shape

(831653, 257)

In [18]:
# do the same for the test data
data_test = pd.read_csv('data_test.csv', sep=',').drop(columns=['Unnamed: 0'])

data_test.head()  

Unnamed: 0,id,vas_id,buy_time
0,3130519,2.0,1548018000
1,2000860,4.0,1548018000
2,1099444,2.0,1546808400
3,1343255,5.0,1547413200
4,1277040,2.0,1546808400


In [19]:
# merge_asof requires the frames to be sorted by the 'on' key (buy_time)
data_test = data_test.sort_values(by=['buy_time'])

In [20]:
data_test_merged = pd.merge_asof(data_test, features, on='buy_time', by='id', direction='nearest')
data_test_merged.head()

Unnamed: 0,id,vas_id,buy_time,0,1,2,3,...,250,251,252
0,2905850,5.0,1546808400,326.930029,227.410888,312.989214,200.223202,...,-2.175933,-0.45614,0.0
1,31619,2.0,1546808400,-96.799971,100.290888,-62.040786,250.953202,...,-12.175933,-0.45614,0.0
2,1427271,6.0,1546808400,-87.299971,-368.999112,339.439214,48.733202,...,38.824067,-0.45614,0.0
3,2162521,6.0,1546808400,-96.799971,-20.459112,-110.740786,-34.936798,...,-11.175933,-0.45614,0.0
4,1529304,6.0,1546808400,-96.799971,-394.439112,-110.740786,-447.046798,...,-12.175933,-0.45614,1.0


In [22]:
data_test_merged.id.count() == data_test.id.count()

True
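
`pd.merge_asof` behaves like a left join, so the merged frame has the same number of rows as the left frame whether or not a matching feature row was found. A check on missing feature values may therefore be more informative than the row count. The toy example below is a sketch on hypothetical data (the `offers` and `snapshots` frames and the `feat` column are invented for illustration). It shows how `direction='nearest'` with `by='id'` picks the closest snapshot for each subscriber, and what happens when a subscriber has no snapshot at all.

```python
import pandas as pd

# Hypothetical offers; merge_asof needs the 'on' key sorted
offers = pd.DataFrame({
    "id": [1, 2, 3, 1],
    "buy_time": [100, 105, 150, 200],
}).sort_values("buy_time")

# Hypothetical feature snapshots; subscriber 3 has none
snapshots = pd.DataFrame({
    "id": [1, 1, 2],
    "buy_time": [90, 190, 300],
    "feat": [0.1, 0.2, 0.3],
}).sort_values("buy_time")

# For each offer, take the snapshot of the same id with the closest buy_time
merged = pd.merge_asof(offers, snapshots, on="buy_time", by="id",
                       direction="nearest")

print(merged)
print("rows without features:", merged["feat"].isna().sum())
```

Subscriber 2 is matched to a snapshot taken much later than the offer, because `nearest` also looks forward in time and no `tolerance` is set. Whether that is acceptable for this task is a modelling decision.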

In [23]:
# save the prepared data
data_train_merged.to_pickle('data_train_m.pickle')
data_test_merged.to_pickle('data_test_m.pickle')
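
A later step would presumably read these files back with `pd.read_pickle`. The sketch below shows the round trip on a small hypothetical frame written to a temporary directory (the frame contents and the file name `frame.pickle` are made up). It confirms that values and dtypes survive the save and load unchanged.

```python
import os
import tempfile

import pandas as pd

# Hypothetical stand-in for a merged frame
frame = pd.DataFrame({
    "id": [1, 2],
    "buy_time": [1546808400, 1547413200],
    "0": [0.5, -0.25],
})

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "frame.pickle")
    frame.to_pickle(path)            # same call as in the cell above
    restored = pd.read_pickle(path)  # load it back

# Raises AssertionError if values, dtypes, or index differ
pd.testing.assert_frame_equal(frame, restored)
print(restored.dtypes)
```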