## Notebook with feature engineering process

### Curated Features list (in addition to columns)

#### 1 Features based on Sessions actions:  
1. [x] Create features event_i with according to:  
    * event_i means that it's action_info event of order i  
    * take its first order in session, i.e. if events are show_nan_nan, show_view_p3 then values for show_view_p3 is 2  
    * normalize by deviding by total number of events in user's session
2. [x] COUNT for each action_type
3. [x] MEAN, MAX and other descriptive statistics of secs_elapsed deltas

#### 2 Aggregated on Sessions:  
1. [x] COUNT DISTINCT of device_type
2. [ ] % time spent on each action type
3. [ ] count sessions per each device, MODE of Device type  
4. [ ] given that timestamp_first_active is the start of the session, analyze hour (0-23) of activity

#### 3 Transformed from users:
1. [x] Hour of first activity - users['hour_factive'] = users.timestamp_first_active.dt.hour
2. [x] date of week of account_created

**TODO**: use age_gender_bktd and countries data for features generation

In [1]:
import pandas as pd
from datetime import datetime
from tqdm.notebook import tqdm
import numpy as np
from scipy import stats
from collections import Counter

pd.options.display.float_format = "{:.2f}".format
pd.set_option('display.max_columns', None)
pd.set_option('display.max_rows', 100)
tqdm.pandas()
%load_ext autotime

time: 410 µs (started: 2021-08-16 10:05:02 +00:00)


### 0. Loading Data

In [2]:
users = pd.read_parquet('../data/processed/test_users.parquet')
users.shape

(62096, 14)

time: 135 ms (started: 2021-08-16 10:05:02 +00:00)


In [3]:
sessions = pd.read_parquet('../data/processed/sessions_test.parquet')
sessions.shape

(4934245, 7)

time: 1.8 s (started: 2021-08-16 10:05:02 +00:00)


In [4]:
users.date_account_created = users.date_account_created.apply(lambda x: datetime.strptime(x, '%Y-%m-%d'))
users.timestamp_first_active = users.timestamp_first_active.apply(lambda x: datetime.strptime(str(x), '%Y%m%d%H%M%S'))

time: 3.09 s (started: 2021-08-16 10:05:04 +00:00)


In [5]:
users.head()

Unnamed: 0,id,date_account_created,timestamp_first_active,gender,age,signup_method,signup_flow,language,affiliate_channel,affiliate_provider,first_affiliate_tracked,signup_app,first_device_type,first_browser
0,5uwns89zht,2014-07-01,2014-07-01 00:00:06,FEMALE,35.0,facebook,0,en,direct,direct,untracked,Moweb,iPhone,Mobile Safari
1,jtl0dijy2j,2014-07-01,2014-07-01 00:00:51,-unknown-,,basic,0,en,direct,direct,untracked,Moweb,iPhone,Mobile Safari
2,xx0ulgorjt,2014-07-01,2014-07-01 00:01:48,-unknown-,,basic,0,en,direct,direct,linked,Web,Windows Desktop,Chrome
3,6c6puo6ix0,2014-07-01,2014-07-01 00:02:15,-unknown-,,basic,0,en,direct,direct,linked,Web,Windows Desktop,IE
4,czqhjk3yfe,2014-07-01,2014-07-01 00:03:05,-unknown-,,basic,0,en,direct,direct,untracked,Web,Mac Desktop,Safari


time: 37.3 ms (started: 2021-08-16 10:05:07 +00:00)


In [6]:
sessions.head()

Unnamed: 0,user_id,action,action_type,action_detail,device_type,secs_elapsed,action_info
0,5uwns89zht,show,view,user_profile,-unknown-,79.0,show_view_user_profile
1,5uwns89zht,search,click,view_search_results,-unknown-,17962.0,search_click_view_search_results
2,5uwns89zht,search,click,view_search_results,-unknown-,64883.0,search_click_view_search_results
3,5uwns89zht,show,view,p3,-unknown-,31180.0,show_view_p3
4,5uwns89zht,authenticate,submit,login,iPhone,,authenticate_submit_login


time: 33.3 ms (started: 2021-08-16 10:05:07 +00:00)


### 2. Getting features based on Sessions

In [7]:
sessions.secs_elapsed.fillna(-1, inplace=True)
sessions.sort_values(['user_id', 'secs_elapsed'], inplace=True)
sessions.reset_index(drop=True, inplace=True)
sessions.shape

(4934245, 7)

time: 4.84 s (started: 2021-08-16 10:05:07 +00:00)


In [8]:
sessions.head(10)

Unnamed: 0,user_id,action,action_type,action_detail,device_type,secs_elapsed,action_info
0,0010k6l0om,callback,partner_callback,oauth_response,Mac Desktop,-1.0,callback_partner_callback_oauth_response
1,0010k6l0om,search_results,click,view_search_results,Mac Desktop,3.0,search_results_click_view_search_results
2,0010k6l0om,similar_listings_v2,,,Mac Desktop,9.0,similar_listings_v2_nan_nan
3,0010k6l0om,show,view,p3,Mac Desktop,22.0,show_view_p3
4,0010k6l0om,show,,,Mac Desktop,26.0,show_nan_nan
5,0010k6l0om,show,,,Mac Desktop,30.0,show_nan_nan
6,0010k6l0om,similar_listings_v2,,,Mac Desktop,34.0,similar_listings_v2_nan_nan
7,0010k6l0om,show,,,Mac Desktop,36.0,show_nan_nan
8,0010k6l0om,show,,,Mac Desktop,39.0,show_nan_nan
9,0010k6l0om,show,,,Mac Desktop,45.0,show_nan_nan


time: 21.1 ms (started: 2021-08-16 10:05:12 +00:00)


### 2.1 Generating features based on action_info Events vs its Order in the session stream with Normalization

Sessions actions:  
Create features event_i with according to:  
    * event_i means that it's action_info event of order i  
    * take its first order in session, i.e. if events are show_nan_nan, show_view_p3 then values for show_view_p3 is 2  
    * normalize by deviding by total number of events in user's session  

In [9]:
actions_info = list(sessions.action_info.unique())
len(actions_info)

336

time: 634 ms (started: 2021-08-16 10:05:12 +00:00)


In [10]:
tmp = sessions[['user_id', 'action_info']].groupby('user_id', as_index=False).agg(list)
tmp.shape

(61664, 2)

time: 2.43 s (started: 2021-08-16 10:05:13 +00:00)


In [11]:
tmp['size'] = tmp.action_info.apply(lambda x: len(x))

time: 55.4 ms (started: 2021-08-16 10:05:15 +00:00)


In [12]:
tmp.head()

Unnamed: 0,user_id,action_info,size
0,0010k6l0om,"[callback_partner_callback_oauth_response, sea...",62
1,0031awlkjq,"[authenticate_view_login_page, dashboard_view_...",8
2,00378ocvlh,"[create_submit_create_user, similar_listings_v...",73
3,0048rkdgb1,"[create_submit_signup, show_view_user_profile,...",46
4,0057snrdpu,"[authenticate_view_login_page, show_view_p3, s...",28


time: 23.8 ms (started: 2021-08-16 10:05:15 +00:00)


In [13]:
tmp.columns = ['user_id', 'action_info', 'seassion_length']

time: 1.31 ms (started: 2021-08-16 10:05:15 +00:00)


In [14]:
def find_action_info_pos(ai, ais):
    try:
        return ais.index(ai) + 1
    except ValueError:
        return None

time: 6.37 ms (started: 2021-08-16 10:05:15 +00:00)


In [15]:
for ai in tqdm(actions_info):
    tmp[f'ai_{ai}'] = tmp.action_info.apply(lambda x: find_action_info_pos(ai, x)) / tmp.size    

  0%|          | 0/336 [00:00<?, ?it/s]

  tmp[f'ai_{ai}'] = tmp.action_info.apply(lambda x: find_action_info_pos(ai, x)) / tmp.size


time: 1min (started: 2021-08-16 10:05:15 +00:00)


In [16]:
tmp.head()

Unnamed: 0,user_id,action_info,seassion_length,ai_callback_partner_callback_oauth_response,ai_search_results_click_view_search_results,ai_similar_listings_v2_nan_nan,ai_show_view_p3,ai_show_nan_nan,ai_how_it_works_-unknown-_-unknown-,ai_dashboard_view_dashboard,ai_header_userpic_data_header_userpic,ai_ajax_refresh_subtotal_click_change_trip_characteristics,ai_personalize_data_wishlist_content_update,ai_index_view_view_search_results,ai_index_-unknown-_-unknown-,ai_authenticate_view_login_page,ai_referrer_status_-unknown-_-unknown-,ai_create_multiple_-unknown-_-unknown-,ai_create_submit_create_user,ai_show_view_user_profile,ai_ask_question_submit_contact_host,ai_profile_pic_-unknown-_-unknown-,ai_ajax_check_dates_click_change_contact_host_dates,ai_edit_view_edit_profile,ai_update_-unknown-_-unknown-,ai_show_personalize_data_user_profile_content_update,ai_notifications_view_account_notification_settings,ai_delete_-unknown-_-unknown-,ai_open_graph_setting_-unknown-_-unknown-,ai_update_submit_update_user_profile,ai_ajax_lwlb_contact_click_contact_host,ai_impressions_view_p4,ai_message_to_host_focus_click_message_to_host_focus,ai_ajax_image_upload_-unknown-_-unknown-,ai_update_submit_update_listing,ai_populate_help_dropdown_-unknown-_-unknown-,ai_nan_message_post_message_post,ai_create_submit_signup,ai_reviews_data_listing_reviews,ai_active_-unknown-_-unknown-,ai_index_data_reservations,ai_unavailabilities_data_unavailable_dates,ai_campaigns_nan_nan,ai_payment_instruments_data_payment_instruments,ai_social_connections_data_user_social_connections,ai_search_click_view_search_results,ai_show_view_view_listing,ai_confirm_email_click_confirm_email_link,ai_collections_view_user_wishlists,ai_authenticate_submit_login,ai_set_user_submit_create_listing,ai_manage_listing_view_manage_listing,ai_phone_verification_modal_-unknown-_-unknown-,ai_show_view_p1,ai_index_view_listing_descriptions,ai_new_view_list_your_space,ai_create_view_list_your_space,ai_verify_-unknown-_-unknown-,ai_create_submit_create_phone_numbers,ai_update_submit_update_listing_description,ai_pending_booking_request_pending,ai_agree_terms_check_-unknown-_-unknown-,ai_requested_view_p5,ai_requested_submit_post_checkout_action,ai_message_to_host_change_click_message_to_host_change,ai_similar_listings_data_similar_listings,ai_lookup_nan_nan,ai_index_view_message_thread,ai_qt2_view_message_thread,ai_settings_-unknown-_-unknown-,ai_show_-unknown-_-unknown-,ai_update_submit_update_user,ai_notifications_submit_notifications,ai_cancellation_policies_view_cancellation_policies,ai_signup_login_view_signup_login_page,ai_cancellation_policy_click_click_cancellation_policy_click,ai_recommend_-unknown-_-unknown-,ai_phone_verification_number_sucessfully_submitted_-unknown-_-unknown-,ai_phone_verification_number_submitted_for_sms_-unknown-_-unknown-,ai_endpoint_error_-unknown-_-unknown-,ai_login_view_login_page,ai_index_view_message_inbox,ai_available_data_trip_availability,ai_index_nan_nan,ai_create_-unknown-_-unknown-,ai_reviews_data_user_reviews,ai_connect_submit_oauth_login,ai_track_page_view_nan_nan,ai_popular_view_popular_wishlists,ai_payment_methods_-unknown-_-unknown-,ai_coupon_code_click_click_coupon_code_click,ai_read_policy_click_click_read_policy_click,ai_other_hosting_reviews_first_-unknown-_-unknown-,ai_edit_verification_view_profile_verifications,ai_travel_plans_current_view_your_trips,ai_qt_reply_v2_submit_send_message,ai_handle_vanity_url_-unknown-_-unknown-,ai_phone_number_widget_-unknown-_-unknown-,ai_update_notifications_-unknown-_-unknown-,ai_languages_multiselect_-unknown-_-unknown-,ai_listings_view_user_listings,ai_apply_code_-unknown-_-unknown-,ai_create_submit_create_listing,ai_identity_-unknown-_-unknown-,ai_kba_-unknown-_-unknown-,ai_kba_update_-unknown-_-unknown-,ai_jumio_token_-unknown-_-unknown-,ai_at_checkpoint_booking_request_at_checkpoint,ai_jumio_redirect_-unknown-_-unknown-,ai_transaction_history_view_account_transaction_history,ai_account_-unknown-_-unknown-,ai_recommendations_data_listing_recommendations,ai_faq_-unknown-_-unknown-,ai_faq_category_-unknown-_-unknown-,ai_localization_settings_nan_nan,ai_ajax_photo_widget_form_iframe_-unknown-_-unknown-,ai_notifications_data_notifications,ai_reviews_new_-unknown-_-unknown-,ai_signature_-unknown-_-unknown-,ai_spoken_languages_data_user_languages,ai_click_click_book_it,ai_calendar_tab_inner2_-unknown-_-unknown-,ai_privacy_view_account_privacy_settings,ai_references_view_profile_references,ai_complete_status_-unknown-_-unknown-,ai_complete_redirect_-unknown-_-unknown-,ai_phone_verification_success_click_phone_verification_success,ai_terms_view_terms_and_privacy,ai_click_click_instant_book,ai_ajax_statsd_-unknown-_-unknown-,ai_tell_a_friend_-unknown-_-unknown-,ai_signup_modal_view_signup_modal,ai_facebook_auto_login_-unknown-_-unknown-,ai_phone_verification_phone_number_removed_-unknown-_-unknown-,ai_request_new_confirm_email_click_request_new_confirm_email,ai_receipt_view_guest_receipt,ai_index_view_your_listings,ai_change_currency_-unknown-_-unknown-,ai_delete_submit_delete_phone_numbers,ai_guest_billing_receipt_-unknown-_-unknown-,ai_phone_verification_number_submitted_for_call_-unknown-_-unknown-,ai_populate_from_facebook_-unknown-_-unknown-,ai_ajax_google_translate_description_-unknown-_-unknown-,ai_hospitality_-unknown-_-unknown-,ai_host_summary_view_host_home,ai_my_listings_view_your_reservations,ai_ajax_google_translate_reviews_click_translate_listing_reviews,ai_country_options_-unknown-_-unknown-,ai_payout_preferences_view_account_payout_preferences,ai_index_data_user_tax_forms,ai_top_destinations_-unknown-_-unknown-,ai_change_view_change_or_alter,ai_ajax_price_and_availability_click_alteration_field,ai_create_submit_create_alteration_request,ai_recent_reservations_-unknown-_-unknown-,ai_listings_-unknown-_-unknown-,ai_decision_tree_-unknown-_-unknown-,ai_index_view_user_wishlists,ai_my_view_user_wishlists,ai_mobile_landing_page_-unknown-_-unknown-,ai_currencies_nan_nan,ai_show_view_wishlist,ai_add_note_submit_wishlist_note,ai_p4_terms_click_p4_terms,ai_p4_refund_policy_terms_click_p4_refund_policy_terms,ai_itinerary_view_guest_itinerary,ai_webcam_upload_-unknown-_-unknown-,ai_guest_booked_elsewhere_message_post_message_post,ai_uptodate_nan_nan,ai_review_page_-unknown-_-unknown-,ai_glob_-unknown-_-unknown-,ai_apply_coupon_error_click_apply_coupon_error,ai_apply_coupon_error_type_-unknown-_-unknown-,ai_apply_reservation_submit_apply_coupon,ai_coupon_field_focus_click_coupon_field_focus,ai_login_modal_view_login_modal,ai_apply_coupon_click_click_apply_coupon_click,ai_airbnb_picks_view_airbnb_picks_wishlists,ai_push_notification_callback_-unknown-_-unknown-,ai_about_us_-unknown-_-unknown-,ai_pay_-unknown-_-unknown-,ai_supported_-unknown-_-unknown-,ai_mobile_oauth_callback_-unknown-_-unknown-,ai_upload_-unknown-_-unknown-,ai_10_message_post_message_post,ai_clear_reservation_-unknown-_-unknown-,ai_click_click_request_to_book,ai_set_password_view_set_password_page,ai_signed_out_modal_nan_nan,ai_set_password_submit_set_password,ai_salute_-unknown-_-unknown-,ai_edit_-unknown-_-unknown-,ai_update_cached_data_admin_templates,ai_pending_tickets_-unknown-_-unknown-,ai_this_hosting_reviews_click_listing_reviews_page,ai_show_data_translations,ai_reservation_-unknown-_-unknown-,ai_issue_-unknown-_-unknown-,ai_zendesk_login_jwt_-unknown-_-unknown-,ai_contact_new_-unknown-_-unknown-,ai_become_user_-unknown-_-unknown-,ai_submit_contact_-unknown-_-unknown-,ai_apply_coupon_click_success_click_apply_coupon_click_success,ai_office_location_-unknown-_-unknown-,ai_position_-unknown-_-unknown-,ai_transaction_history_paginated_-unknown-_-unknown-,ai_requirements_-unknown-_-unknown-,ai_countries_-unknown-_-unknown-,ai_agree_terms_uncheck_-unknown-_-unknown-,ai_cancel_submit_guest_cancellation,ai_travel_plans_previous_view_previous_trips,ai_overview_-unknown-_-unknown-,ai_rate_-unknown-_-unknown-,ai_trust_-unknown-_-unknown-,ai_update_country_of_residence_-unknown-_-unknown-,ai_friends_view_friends_wishlists,ai_show_code_-unknown-_-unknown-,ai_click_click_complete_booking,ai_12_message_post_message_post,ai_search_-unknown-_-unknown-,ai_terms_and_conditions_-unknown-_-unknown-,ai_qt_with_data_lookup_message_thread,ai_authorize_-unknown-_-unknown-,ai_11_message_post_message_post,ai_request_photography_-unknown-_-unknown-,ai_photography_-unknown-_-unknown-,ai_pending_-unknown-_-unknown-,ai_destroy_-unknown-_-unknown-,ai_guarantee_view_host_guarantee,ai_why_host_-unknown-_-unknown-,ai_check_nan_nan,ai_ajax_payout_edit_-unknown-_-unknown-,ai_payout_update_-unknown-_-unknown-,ai_ajax_payout_options_by_country_-unknown-_-unknown-,ai_recommendations_data_user_friend_recommendations,ai_toggle_starred_thread_click_toggle_starred_thread,ai_friends_new_-unknown-_-unknown-,ai_forgot_password_click_forgot_password,ai_patch_-unknown-_-unknown-,ai_toggle_archived_thread_click_toggle_archived_thread,ai_remove_dashboard_alert_click_remove_dashboard_alert,ai_email_wishlist_click_email_wishlist_button,ai_email_share_submit_email_wishlist,ai_confirm_email_click_confirm_email,ai_update_hide_from_search_engines_-unknown-_-unknown-,ai_update_friends_display_-unknown-_-unknown-,ai_phone_verification_call_taking_too_long_-unknown-_-unknown-,ai_email_itinerary_colorbox_-unknown-_-unknown-,ai_department_-unknown-_-unknown-,ai_departments_-unknown-_-unknown-,ai_invalid_action_-unknown-_-unknown-,ai_click_click_contact_host,ai_status_-unknown-_-unknown-,ai_create_airbnb_-unknown-_-unknown-,ai_forgot_password_submit_forgot_password,ai_update_nan_nan,ai_delete_submit_delete_listing,ai_jumio_-unknown-_-unknown-,ai_15_message_post_message_post,ai_change_password_submit_change_password,ai_email_by_key_-unknown-_-unknown-,ai_ajax_google_translate_-unknown-_-unknown-,ai_delete_submit_delete_listing_description,ai_update_reservation_requirements_-unknown-_-unknown-,ai_change_availability_submit_change_availability,ai_tos_confirm_-unknown-_-unknown-,ai_signup_weibo_referral_-unknown-_-unknown-,ai_weibo_signup_referral_finish_-unknown-_-unknown-,ai_home_safety_landing_-unknown-_-unknown-,ai_redirect_-unknown-_-unknown-,ai_listing_view_p3,ai_phone_verification_error_-unknown-_-unknown-,ai_place_worth_view_place_worth,ai_ajax_worth_submit_calculate_worth,ai_clickthrough_-unknown-_-unknown-,ai_create_ach_-unknown-_-unknown-,ai_remove_dashboard_alert_-unknown-_-unknown-,ai_complete_-unknown-_-unknown-,ai_locations_-unknown-_-unknown-,ai_localized_-unknown-_-unknown-,ai_domains_-unknown-_-unknown-,ai_acculynk_session_obtained_-unknown-_-unknown-,ai_acculynk_bin_check_success_-unknown-_-unknown-,ai_acculynk_load_pin_pad_-unknown-_-unknown-,ai_new_session_-unknown-_-unknown-,ai_open_hard_fallback_modal_-unknown-_-unknown-,ai_image_order_-unknown-_-unknown-,ai_life_-unknown-_-unknown-,ai_press_news_-unknown-_-unknown-,ai_envoy_form_-unknown-_-unknown-,ai_city_count_-unknown-_-unknown-,ai_print_confirmation_-unknown-_-unknown-,ai_other_hosting_reviews_-unknown-_-unknown-,ai_detect_fb_session_-unknown-_-unknown-,ai_founders_-unknown-_-unknown-,ai_united-states_-unknown-_-unknown-,ai_signup_weibo_-unknown-_-unknown-,ai_qt_reply_v2_-unknown-_-unknown-,ai_show_view_alteration_request,ai_respond_submit_respond_to_alteration_request,ai_sublets_-unknown-_-unknown-,ai_slideshow_-unknown-_-unknown-,ai_create_paypal_-unknown-_-unknown-,ai_questions_-unknown-_-unknown-,ai_media_resources_-unknown-_-unknown-,ai_views_-unknown-_-unknown-,ai_photography_update_-unknown-_-unknown-,ai_payoneer_account_redirect_-unknown-_-unknown-,ai_social_-unknown-_-unknown-,ai_change_default_payout_-unknown-_-unknown-,ai_home_safety_terms_-unknown-_-unknown-,ai_ajax_special_offer_dates_available_click_special_offer_field,ai_payout_delete_-unknown-_-unknown-,ai_sync_-unknown-_-unknown-,ai_new_-unknown-_-unknown-,ai_toggle_availability_-unknown-_-unknown-,ai_load_more_-unknown-_-unknown-,ai_acculynk_pin_pad_inactive_-unknown-_-unknown-,ai_add_guests_-unknown-_-unknown-,ai_set_default_-unknown-_-unknown-,ai_preapproval_message_post_message_post,ai_envoy_bank_details_redirect_-unknown-_-unknown-,ai_badge_-unknown-_-unknown-,ai_payoneer_signup_complete_-unknown-_-unknown-,ai_social-media_-unknown-_-unknown-,ai_approve_submit_host_respond,ai_booking_booking_response_booking,ai_has_profile_pic_-unknown-_-unknown-,ai_feed_-unknown-_-unknown-,ai_southern-europe_-unknown-_-unknown-,ai_maybe_information_message_post_message_post,ai_special_offer_message_post_message_post,ai_track_activity_nan_nan,ai_phone_verification_nan_nan,ai_satisfy_nan_nan,ai_ajax_payout_split_edit_-unknown-_-unknown-,ai_reputation_-unknown-_-unknown-,ai_recommendation_page_-unknown-_-unknown-,ai_approve_-unknown-_-unknown-
0,0010k6l0om,"[callback_partner_callback_oauth_response, sea...",62,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,
1,0031awlkjq,"[authenticate_view_login_page, dashboard_view_...",8,,,,,,,0.0,0.0,,,,0.0,0.0,0.0,0.0,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,
2,00378ocvlh,"[create_submit_create_user, similar_listings_v...",73,,,0.0,0.0,0.0,,0.0,0.0,0.0,0.0,,,,,,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,
3,0048rkdgb1,"[create_submit_signup, show_view_user_profile,...",46,,,,0.0,,,,,,,,,,,,,0.0,,,,,,,,,,,,,,,,,,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,
4,0057snrdpu,"[authenticate_view_login_page, show_view_p3, s...",28,,,,0.0,,,0.0,0.0,,,,,0.0,,,,0.0,,,,,,,,,,,,,,,,,,,0.0,0.0,,0.0,,,,0.0,0.0,0.0,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,


time: 270 ms (started: 2021-08-16 10:06:15 +00:00)


In [17]:
tmp.drop('action_info', axis=1, inplace=True)

time: 178 ms (started: 2021-08-16 10:06:16 +00:00)


#### Checking counts of missing values per each column

In [18]:
not_missing = pd.DataFrame(tmp.notna().sum()).reset_index()
not_missing.columns = ['col', 'counts']
not_missing['ratio'] = not_missing['counts'].apply(lambda x: round(x / len(users), 4))
not_missing.shape

(338, 3)

time: 116 ms (started: 2021-08-16 10:06:16 +00:00)


In [19]:
not_missing.head()

Unnamed: 0,col,counts,ratio
0,user_id,61664,0.99
1,seassion_length,61664,0.99
2,ai_callback_partner_callback_oauth_response,9094,0.15
3,ai_search_results_click_view_search_results,18356,0.3
4,ai_similar_listings_v2_nan_nan,18097,0.29


time: 16.4 ms (started: 2021-08-16 10:06:16 +00:00)


In [20]:
threshold = 0.00005
mask = not_missing.ratio > threshold
mask.sum()

303

time: 10.2 ms (started: 2021-08-16 10:06:16 +00:00)


#### Dropping all columns that are lower than the above threshold

In [21]:
keep_columns = not_missing[mask].col.tolist()
len(keep_columns)

303

time: 11.8 ms (started: 2021-08-16 10:06:16 +00:00)


In [22]:
keep_columns[0], keep_columns[-1]

('user_id', 'ai_add_guests_-unknown-_-unknown-')

time: 9.19 ms (started: 2021-08-16 10:06:16 +00:00)


In [23]:
features1 = tmp[keep_columns].copy(deep=True)
features1.shape

(61664, 303)

time: 147 ms (started: 2021-08-16 10:06:16 +00:00)


### 2.1.1 Count of each action_type normalized

In [24]:
col = 'action_type'
col_values = list(sessions[col].unique())
len(col_values)

10

time: 396 ms (started: 2021-08-16 10:06:16 +00:00)


In [25]:
tmp = sessions[['user_id', col]].groupby('user_id', as_index=False).agg(list)
tmp.shape

(61664, 2)

time: 3.04 s (started: 2021-08-16 10:06:17 +00:00)


In [26]:
tmp['size'] = tmp[col].apply(lambda x: len(x))

time: 75.9 ms (started: 2021-08-16 10:06:20 +00:00)


In [27]:
tmp['counts'] = tmp[col].apply(lambda x: dict(Counter(x)))

time: 791 ms (started: 2021-08-16 10:06:20 +00:00)


In [28]:
tmp.head()

Unnamed: 0,user_id,action_type,size,counts
0,0010k6l0om,"[partner_callback, click, None, view, None, No...",62,"{'partner_callback': 1, 'click': 16, None: 15,..."
1,0031awlkjq,"[view, view, -unknown-, data, view, -unknown-,...",8,"{'view': 3, '-unknown-': 4, 'data': 1}"
2,00378ocvlh,"[submit, None, data, click, view, None, submit...",73,"{'submit': 5, None: 4, 'data': 4, 'click': 5, ..."
3,0048rkdgb1,"[submit, view, data, view, view, view, -unknow...",46,"{'submit': 1, 'view': 20, 'data': 19, '-unknow..."
4,0057snrdpu,"[view, view, view, data, view, view, view, dat...",28,"{'view': 11, 'data': 8, 'click': 7, '-unknown-..."


time: 32.4 ms (started: 2021-08-16 10:06:21 +00:00)


In [29]:
tmp = pd.concat([tmp, pd.json_normalize(tmp['counts'])], axis=1)

time: 710 ms (started: 2021-08-16 10:06:21 +00:00)


In [30]:
tmp.drop(['action_type', 'counts'], axis=1, inplace=True)

time: 19.6 ms (started: 2021-08-16 10:06:21 +00:00)


In [31]:
tmp.head()

Unnamed: 0,user_id,size,partner_callback,click,NaN,view,-unknown-,data,submit,message_post,booking_request,booking_response
0,0010k6l0om,62,1.0,16.0,15.0,17.0,4.0,9.0,,,,
1,0031awlkjq,8,,,,3.0,4.0,1.0,,,,
2,00378ocvlh,73,,5.0,4.0,34.0,20.0,4.0,5.0,1.0,,
3,0048rkdgb1,46,,4.0,1.0,20.0,1.0,19.0,1.0,,,
4,0057snrdpu,28,,7.0,,11.0,2.0,8.0,,,,


time: 40.1 ms (started: 2021-08-16 10:06:21 +00:00)


In [32]:
cols = list(tmp)[2:]
cols = [f'at_{e}' for e in cols]

time: 1.18 ms (started: 2021-08-16 10:06:21 +00:00)


In [33]:
tmp.columns = ['user_id', 'size'] + cols

time: 859 µs (started: 2021-08-16 10:06:21 +00:00)


In [34]:
for e in cols:
    tmp[e] = tmp[e] / tmp['size']

time: 25.3 ms (started: 2021-08-16 10:06:21 +00:00)


In [35]:
tmp.head()

Unnamed: 0,user_id,size,at_partner_callback,at_click,at_None,at_view,at_-unknown-,at_data,at_submit,at_message_post,at_booking_request,at_booking_response
0,0010k6l0om,62,0.02,0.26,0.24,0.27,0.06,0.15,,,,
1,0031awlkjq,8,,,,0.38,0.5,0.12,,,,
2,00378ocvlh,73,,0.07,0.05,0.47,0.27,0.05,0.07,0.01,,
3,0048rkdgb1,46,,0.09,0.02,0.43,0.02,0.41,0.02,,,
4,0057snrdpu,28,,0.25,,0.39,0.07,0.29,,,,


time: 29.9 ms (started: 2021-08-16 10:06:22 +00:00)


In [36]:
tmp.drop(['size'], axis=1, inplace=True)

time: 27.5 ms (started: 2021-08-16 10:06:22 +00:00)


In [37]:
tmp.fillna(0, inplace=True)

time: 32.3 ms (started: 2021-08-16 10:06:22 +00:00)


In [38]:
tmp.head()

Unnamed: 0,user_id,at_partner_callback,at_click,at_None,at_view,at_-unknown-,at_data,at_submit,at_message_post,at_booking_request,at_booking_response
0,0010k6l0om,0.02,0.26,0.24,0.27,0.06,0.15,0.0,0.0,0.0,0.0
1,0031awlkjq,0.0,0.0,0.0,0.38,0.5,0.12,0.0,0.0,0.0,0.0
2,00378ocvlh,0.0,0.07,0.05,0.47,0.27,0.05,0.07,0.01,0.0,0.0
3,0048rkdgb1,0.0,0.09,0.02,0.43,0.02,0.41,0.02,0.0,0.0,0.0
4,0057snrdpu,0.0,0.25,0.0,0.39,0.07,0.29,0.0,0.0,0.0,0.0


time: 29.1 ms (started: 2021-08-16 10:06:22 +00:00)


In [39]:
features1a = tmp.copy(deep=True)
features1a.shape

(61664, 11)

time: 14.8 ms (started: 2021-08-16 10:06:22 +00:00)


### 2.2 Generating features based on seconds elapsed info

In [40]:
tmp = sessions[['user_id', 'secs_elapsed']].groupby('user_id', as_index=False).agg(list)
tmp.shape

(61664, 2)

time: 3.19 s (started: 2021-08-16 10:06:22 +00:00)


In [41]:
tmp.head()

Unnamed: 0,user_id,secs_elapsed
0,0010k6l0om,"[-1.0, 3.0, 9.0, 22.0, 26.0, 30.0, 34.0, 36.0,..."
1,0031awlkjq,"[-1.0, 388.0, 642.0, 719.0, 1675.0, 2795.0, 97..."
2,00378ocvlh,"[-1.0, 14.0, 19.0, 23.0, 24.0, 155.0, 166.0, 2..."
3,0048rkdgb1,"[-1.0, 7.0, 19.0, 27.0, 48.0, 49.0, 52.0, 59.0..."
4,0057snrdpu,"[-1.0, 41.0, 54.0, 65.0, 84.0, 112.0, 178.0, 2..."


time: 20.9 ms (started: 2021-08-16 10:06:25 +00:00)


In [42]:
tmp.secs_elapsed = tmp.secs_elapsed.apply(lambda x: [0] + x[1:])

time: 650 ms (started: 2021-08-16 10:06:25 +00:00)


In [43]:
tmp['deltas'] = tmp['secs_elapsed'].apply(lambda x: [int(j - i) for i, j in zip(x[:-1], x[1:])])

time: 1.9 s (started: 2021-08-16 10:06:26 +00:00)


In [44]:
tmp.head()

Unnamed: 0,user_id,secs_elapsed,deltas
0,0010k6l0om,"[0, 3.0, 9.0, 22.0, 26.0, 30.0, 34.0, 36.0, 39...","[3, 6, 13, 4, 4, 4, 2, 3, 6, 1, 3, 4, 30, 8, 1..."
1,0031awlkjq,"[0, 388.0, 642.0, 719.0, 1675.0, 2795.0, 9797....","[388, 254, 77, 956, 1120, 7002, 13761]"
2,00378ocvlh,"[0, 14.0, 19.0, 23.0, 24.0, 155.0, 166.0, 269....","[14, 5, 4, 1, 131, 11, 103, 23, 15, 23, 6, 29,..."
3,0048rkdgb1,"[0, 7.0, 19.0, 27.0, 48.0, 49.0, 52.0, 59.0, 5...","[7, 12, 8, 21, 1, 3, 7, 0, 1, 1, 8, 4, 23, 1, ..."
4,0057snrdpu,"[0, 41.0, 54.0, 65.0, 84.0, 112.0, 178.0, 248....","[41, 13, 11, 19, 28, 66, 70, 134, 30, 10, 222,..."


time: 29.5 ms (started: 2021-08-16 10:06:28 +00:00)


In [45]:
def get_statistics(x):
    if not x:
        return None, None, None, None
    x = np.array(x)
    return x.mean(), x.std(), x.max(), np.median(x)

time: 7.96 ms (started: 2021-08-16 10:06:28 +00:00)


In [46]:
def get_statistics_no_outliers(x):
    if not x:
        return None, None, None, None, None
    x = np.array(x)
    initial_size = len(x)
    x = [e for e in x if e <= x.mean() + x.std()]
    outliers_count = initial_size - len(x)
    x = np.array(x)
    return x.mean(), x.std(), x.max(), np.median(x), outliers_count

time: 7.64 ms (started: 2021-08-16 10:06:28 +00:00)


In [47]:
get_statistics(tmp.iloc[0].deltas)

(2096.688524590164, 6051.6816645092, 34874, 31.0)

time: 10.3 ms (started: 2021-08-16 10:06:28 +00:00)


In [48]:
get_statistics_no_outliers(tmp.iloc[0].deltas)

(704.280701754386, 1636.6660780474087, 8000, 24.0, 4)

time: 13.2 ms (started: 2021-08-16 10:06:28 +00:00)


In [49]:
tmp = pd.concat([tmp, tmp.deltas.progress_apply(lambda x: pd.Series(get_statistics(x)))], axis=1)
tmp.shape

  0%|          | 0/61664 [00:00<?, ?it/s]

(61664, 7)

time: 36.9 s (started: 2021-08-16 10:06:28 +00:00)


In [50]:
tmp.columns = ['user_id', 'secs_elapsed', 'deltas', 'deltas_mean', 'deltas_std', 'deltas_max', 'deltas_median']

time: 1.62 ms (started: 2021-08-16 10:07:05 +00:00)


In [51]:
tmp.head()

Unnamed: 0,user_id,secs_elapsed,deltas,deltas_mean,deltas_std,deltas_max,deltas_median
0,0010k6l0om,"[0, 3.0, 9.0, 22.0, 26.0, 30.0, 34.0, 36.0, 39...","[3, 6, 13, 4, 4, 4, 2, 3, 6, 1, 3, 4, 30, 8, 1...",2096.69,6051.68,34874.0,31.0
1,0031awlkjq,"[0, 388.0, 642.0, 719.0, 1675.0, 2795.0, 9797....","[388, 254, 77, 956, 1120, 7002, 13761]",3365.43,4803.27,13761.0,956.0
2,00378ocvlh,"[0, 14.0, 19.0, 23.0, 24.0, 155.0, 166.0, 269....","[14, 5, 4, 1, 131, 11, 103, 23, 15, 23, 6, 29,...",8391.38,46492.0,386212.0,78.0
3,0048rkdgb1,"[0, 7.0, 19.0, 27.0, 48.0, 49.0, 52.0, 59.0, 5...","[7, 12, 8, 21, 1, 3, 7, 0, 1, 1, 8, 4, 23, 1, ...",1992.78,7326.2,48094.0,35.0
4,0057snrdpu,"[0, 41.0, 54.0, 65.0, 84.0, 112.0, 178.0, 248....","[41, 13, 11, 19, 28, 66, 70, 134, 30, 10, 222,...",37878.26,181708.52,964245.0,402.0


time: 83.4 ms (started: 2021-08-16 10:07:05 +00:00)


In [52]:
tmp = pd.concat([tmp, tmp.deltas.progress_apply(lambda x: pd.Series(get_statistics_no_outliers(x)))], axis=1)
tmp.shape

  0%|          | 0/61664 [00:00<?, ?it/s]

(61664, 12)

time: 6min 59s (started: 2021-08-16 10:07:05 +00:00)


In [53]:
tmp.columns = [
    'user_id', 'secs_elapsed', 'deltas', 'deltas_mean', 'deltas_std', 'deltas_max', 'deltas_median', 
    'deltas_no_mean', 'deltas_no_std', 'deltas_no_max', 'deltas_no_median', 'deltas_no_num_outliers'
]

time: 1.51 ms (started: 2021-08-16 10:14:04 +00:00)


In [54]:
tmp.head()

Unnamed: 0,user_id,secs_elapsed,deltas,deltas_mean,deltas_std,deltas_max,deltas_median,deltas_no_mean,deltas_no_std,deltas_no_max,deltas_no_median,deltas_no_num_outliers
0,0010k6l0om,"[0, 3.0, 9.0, 22.0, 26.0, 30.0, 34.0, 36.0, 39...","[3, 6, 13, 4, 4, 4, 2, 3, 6, 1, 3, 4, 30, 8, 1...",2096.69,6051.68,34874.0,31.0,704.28,1636.67,8000.0,24.0,4.0
1,0031awlkjq,"[0, 388.0, 642.0, 719.0, 1675.0, 2795.0, 9797....","[388, 254, 77, 956, 1120, 7002, 13761]",3365.43,4803.27,13761.0,956.0,1632.83,2429.69,7002.0,672.0,1.0
2,00378ocvlh,"[0, 14.0, 19.0, 23.0, 24.0, 155.0, 166.0, 269....","[14, 5, 4, 1, 131, 11, 103, 23, 15, 23, 6, 29,...",8391.38,46492.0,386212.0,78.0,1721.44,5115.18,28696.0,71.0,2.0
3,0048rkdgb1,"[0, 7.0, 19.0, 27.0, 48.0, 49.0, 52.0, 59.0, 5...","[7, 12, 8, 21, 1, 3, 7, 0, 1, 1, 8, 4, 23, 1, ...",1992.78,7326.2,48094.0,35.0,513.98,1290.47,6839.0,26.5,3.0
4,0057snrdpu,"[0, 41.0, 54.0, 65.0, 84.0, 112.0, 178.0, 248....","[41, 13, 11, 19, 28, 66, 70, 134, 30, 10, 222,...",37878.26,181708.52,964245.0,402.0,2248.77,3531.96,12416.0,328.5,1.0


time: 42.7 ms (started: 2021-08-16 10:14:04 +00:00)


In [55]:
tmp.drop(['secs_elapsed', 'deltas'], axis=1, inplace=True)

time: 39.2 ms (started: 2021-08-16 10:14:04 +00:00)


In [56]:
features2 = tmp.copy(deep=True)
features2.shape

(61664, 10)

time: 15.6 ms (started: 2021-08-16 10:14:04 +00:00)


### 2.3 Generating features based on device type info

In [57]:
tmp = sessions[['user_id', 'device_type']].groupby('user_id', as_index=False).agg(set)
tmp.shape

(61664, 2)

time: 3.35 s (started: 2021-08-16 10:14:04 +00:00)


In [58]:
tmp['size'] = tmp.device_type.apply(lambda x: len(x))

time: 109 ms (started: 2021-08-16 10:14:08 +00:00)


In [59]:
tmp.drop('device_type', axis=1, inplace=True)

time: 36.5 ms (started: 2021-08-16 10:14:08 +00:00)


In [60]:
tmp.head()

Unnamed: 0,user_id,size
0,0010k6l0om,1
1,0031awlkjq,1
2,00378ocvlh,1
3,0048rkdgb1,1
4,0057snrdpu,2


time: 23.9 ms (started: 2021-08-16 10:14:08 +00:00)


In [61]:
tmp.columns = ['user_id', 'device_count']

time: 1.06 ms (started: 2021-08-16 10:14:08 +00:00)


In [62]:
tmp.head()

Unnamed: 0,user_id,device_count
0,0010k6l0om,1
1,0031awlkjq,1
2,00378ocvlh,1
3,0048rkdgb1,1
4,0057snrdpu,2


time: 27.3 ms (started: 2021-08-16 10:14:08 +00:00)


In [63]:
features3 = tmp.copy(deep=True)
features3.shape

(61664, 2)

time: 9.36 ms (started: 2021-08-16 10:14:08 +00:00)


### 3.1 Features based on Users table

In [64]:
users['dow_registered'] = users.date_account_created.dt.weekday

time: 22.2 ms (started: 2021-08-16 10:14:08 +00:00)


In [65]:
users['hr_registered'] = users.timestamp_first_active.dt.hour

time: 29 ms (started: 2021-08-16 10:14:08 +00:00)


In [66]:
users.sample(5)

Unnamed: 0,id,date_account_created,timestamp_first_active,gender,age,signup_method,signup_flow,language,affiliate_channel,affiliate_provider,first_affiliate_tracked,signup_app,first_device_type,first_browser,dow_registered,hr_registered
17509,zjoqoueotj,2014-07-26,2014-07-26 13:46:12,-unknown-,,basic,0,en,direct,direct,linked,Moweb,iPhone,Mobile Safari,5,13
8083,ezbr8c52bv,2014-07-15,2014-07-15 15:42:34,FEMALE,55.0,facebook,0,en,content,google,omg,Web,Windows Desktop,IE,1,15
24308,1jip44syma,2014-08-04,2014-08-04 23:33:25,-unknown-,,basic,23,ko,direct,direct,untracked,Android,Android Phone,-unknown-,0,23
26686,dkrva2il0x,2014-08-08,2014-08-08 01:45:58,FEMALE,105.0,basic,0,en,direct,direct,untracked,Web,Mac Desktop,Chrome,4,1
62009,st4ctniau6,2014-09-30,2014-09-30 22:02:27,MALE,38.0,basic,25,en,direct,direct,untracked,iOS,iPhone,-unknown-,1,22


time: 39.4 ms (started: 2021-08-16 10:14:08 +00:00)


### 3.1.1. Dropping redundand columns

In [67]:
users.drop(['date_account_created', 'timestamp_first_active'], axis=1, inplace=True)

time: 20.4 ms (started: 2021-08-16 10:14:08 +00:00)


In [68]:
users.columns = ['user_id'] + list(users)[1:]

time: 2.58 ms (started: 2021-08-16 10:14:08 +00:00)


In [69]:
users.head()

Unnamed: 0,user_id,gender,age,signup_method,signup_flow,language,affiliate_channel,affiliate_provider,first_affiliate_tracked,signup_app,first_device_type,first_browser,dow_registered,hr_registered
0,5uwns89zht,FEMALE,35.0,facebook,0,en,direct,direct,untracked,Moweb,iPhone,Mobile Safari,1,0
1,jtl0dijy2j,-unknown-,,basic,0,en,direct,direct,untracked,Moweb,iPhone,Mobile Safari,1,0
2,xx0ulgorjt,-unknown-,,basic,0,en,direct,direct,linked,Web,Windows Desktop,Chrome,1,0
3,6c6puo6ix0,-unknown-,,basic,0,en,direct,direct,linked,Web,Windows Desktop,IE,1,0
4,czqhjk3yfe,-unknown-,,basic,0,en,direct,direct,untracked,Web,Mac Desktop,Safari,1,0


time: 29.3 ms (started: 2021-08-16 10:14:08 +00:00)


In [70]:
users.shape

(62096, 14)

time: 7.37 ms (started: 2021-08-16 10:14:08 +00:00)


#### 4. Assembling all features into one dataset

In [71]:
df = users.merge(features1, on='user_id', how='inner')
df.shape

(61664, 316)

time: 385 ms (started: 2021-08-16 10:14:08 +00:00)


In [72]:
df = df.merge(features1a, on='user_id', how='inner')
df.shape

(61664, 326)

time: 406 ms (started: 2021-08-16 10:14:09 +00:00)


In [73]:
df = df.merge(features2, on='user_id', how='inner')
df.shape

(61664, 335)

time: 215 ms (started: 2021-08-16 10:14:09 +00:00)


In [74]:
df = df.merge(features3, on='user_id', how='inner')
df.shape

(61664, 336)

time: 197 ms (started: 2021-08-16 10:14:09 +00:00)


In [75]:
df.to_parquet('../data/processed/features_test.parquet')

time: 1.59 s (started: 2021-08-16 10:14:10 +00:00)
