DavidRomanovizc/Data_Fusion_Contest

Data Fusion 2022 Contest

8th place solution for Data Fusion 2022 Contest.

Task      Public rank  Private rank
Matching  6            8

Technologies used

Python, Jupyter, NumPy, pandas, scikit-learn

Approach

  1. The main part of the task was feature engineering. Before analyzing the transactional data, we create useful features from the transaction_dttm and transaction_amt columns. These features expose the data along several dimensions (such as time of day and day of the week) and are then used as inputs to the machine learning models.

  2. Training:

    • CatBoostRanker with the YetiRank loss, trained for 9000 iterations.
    • An ensemble of 2 CatBoost models with different parameters.
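The feature step in item 1 can be illustrated with a minimal sketch. It uses only the standard library and is not the code from this repository: the function name, the chosen features, the timestamp format, and the sign convention for transaction_amt (negative meaning spending) are all assumptions made for illustration.

```python
import math
from datetime import datetime


def make_transaction_features(transaction_dttm: str, transaction_amt: float) -> dict:
    """Derive simple time and amount features from one transaction.

    Hypothetical helper: the feature set is an illustration, not the
    one used in the actual solution.
    """
    # Assumes an ISO-like timestamp such as "2021-03-06 18:30:00".
    ts = datetime.fromisoformat(transaction_dttm)
    # Encode the hour on a circle so that 23:00 and 00:00 end up close together.
    hour_angle = 2 * math.pi * ts.hour / 24
    return {
        "hour": ts.hour,
        "day_of_week": ts.weekday(),  # Monday = 0, Sunday = 6
        "is_weekend": int(ts.weekday() >= 5),
        "hour_sin": math.sin(hour_angle),
        "hour_cos": math.cos(hour_angle),
        # Assumption: positive amounts are incoming, negative are spending.
        "is_income": int(transaction_amt > 0),
        # log1p of the absolute amount compresses the long tail of large values.
        "log_abs_amt": math.log1p(abs(transaction_amt)),
    }


features = make_transaction_features("2021-03-06 18:30:00", -1250.0)
```

Per-transaction features like these would typically be aggregated per user (counts, means, shares by hour or weekday) before being passed to a ranking model.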

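The description above does not say how the two models were combined. One common option for ranking models is to average per-group ranks rather than raw scores, since scores from differently parameterized rankers are not necessarily on the same scale. The sketch below shows that idea under this assumption; the function names, the equal weights, and the toy data are hypothetical.

```python
from collections import defaultdict


def rank_within_groups(group_ids, scores):
    """Scale each item's rank inside its group to (0, 1]; 1.0 is the top score."""
    by_group = defaultdict(list)
    for idx, group in enumerate(group_ids):
        by_group[group].append(idx)

    ranks = [0.0] * len(scores)
    for members in by_group.values():
        # Sort from lowest to highest score; ties keep their input order.
        ordered = sorted(members, key=lambda i: scores[i])
        for pos, i in enumerate(ordered):
            ranks[i] = (pos + 1) / len(ordered)
    return ranks


def blend(group_ids, scores_a, scores_b, weight_a=0.5):
    """Weighted average of the per-group ranks of two models' scores."""
    ranks_a = rank_within_groups(group_ids, scores_a)
    ranks_b = rank_within_groups(group_ids, scores_b)
    return [weight_a * a + (1 - weight_a) * b for a, b in zip(ranks_a, ranks_b)]


# Toy example: two query groups, scores from two hypothetical models.
group_ids = ["u1", "u1", "u1", "u2", "u2"]
scores_a = [0.2, 0.9, 0.5, 1.0, 3.0]
scores_b = [10.0, 30.0, 20.0, 7.0, 5.0]
blended = blend(group_ids, scores_a, scores_b)
```

In the first group both models agree on the ordering, so the blend preserves it; in the second group they disagree, and with equal weights the two candidates end up tied.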
Data

  1. General data for all tasks in tabular .csv format: transactions.zip, clickstream.zip, and the target variable train_matching.csv.
  2. Common accompanying data for all tasks in tabular .csv format: mcc_codes.csv, click_categories.csv, and currency_rk.csv.
  3. Baselines and example solutions for the container-based Matching task: a random solution, sample_submission.zip, and baseline_catboost.zip, an example solution based on the catboost library that uses a GPU.
