## Modeling

> - Author: Shenghui, Yibo
> - Date: 2023/05

We have packaged the model, data, training, and grid-search code into `./horse/`. Below are **useful commands for different scenarios (Part 1)**.

Note that most models are trained on GPU by default.

The same code can also be used from **scripts in notebooks (Part 2)**.

## 1 Easy Devs with Commands

### 1.1 Machine Learning Models

In [None]:
!python ./horse/ml_train.py --logistic --C 0.0001

### 1.2 Deep Learning Models
   

In [None]:
# Point-wise
!python ./horse/train.py --model_name EmbMLP --k_dim_field 4 --k_dim_id 16 --num_layers 5 --epoch 8 --batch_size 20 --learning_rate 5e-5 --weight_decay 1e-3

In [None]:
# Pair-wise
!python ./horse/train_pairwise.py --model_name EmbMLP --k_dim_field 4 --k_dim_id 16 --num_layers 5 --epoch 8 --batch_size 20 --learning_rate 5e-5 --weight_decay 1e-3
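
The two commands above differ only in the training script, so it may help to see what "point-wise" and "pair-wise" usually mean at the loss level. The sketch below is an assumption for illustration: the losses actually used by `train.py` and `train_pairwise.py` are not shown in this notebook, and the function names `pointwise_bce` and `pairwise_logistic` are hypothetical.

```python
import math


def pointwise_bce(score: float, label: int) -> float:
    """Binary cross-entropy for a single horse, scored independently of the others."""
    p = 1.0 / (1.0 + math.exp(-score))  # sigmoid turns the raw score into a probability
    return -(label * math.log(p) + (1 - label) * math.log(1.0 - p))


def pairwise_logistic(score_pos: float, score_neg: float) -> float:
    """Logistic ranking loss for a pair: small when the better horse gets the higher score."""
    return math.log1p(math.exp(-(score_pos - score_neg)))


# Point-wise: each (horse, label) example contributes its own loss term.
print(pointwise_bce(0.0, 1))

# Pair-wise: only the score difference between the two horses matters.
print(pairwise_logistic(2.0, 0.0), pairwise_logistic(0.0, 2.0))
```

A point-wise objective treats every runner as a separate classification example, while a pair-wise objective only cares about the relative order of two runners, typically taken from the same race.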

### 1.3 Grid Search for Hyperparameter Tuning

In [None]:
# ML
!python ./horse/grid_search.py --model_name logistic

In [None]:
# DL: Point-wise
!python ./horse/dl_grid_search.py --train_file_path ./horse/train.py --model_name EmbMLP

In [None]:
# DL: Pair-wise
!python ./horse/dl_grid_search.py --train_file_path ./horse/train_pairwise.py --model_name EmbMLP 
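
Since `dl_grid_search.py` takes the training script as `--train_file_path`, one plausible reading is that it launches that script once per hyperparameter combination. The sketch below shows this idea only; it is not the actual implementation of `dl_grid_search.py`. The helper `build_commands` and the grid values are hypothetical, while the flags `--model_name`, `--learning_rate`, and `--num_layers` are the ones used in the training commands above.

```python
import itertools
import shlex


def build_commands(train_file_path, model_name, grid):
    """Return one training command per combination of hyperparameter values."""
    keys = sorted(grid)  # fixed key order keeps the output reproducible
    commands = []
    for values in itertools.product(*(grid[k] for k in keys)):
        args = ["python", train_file_path, "--model_name", model_name]
        for key, value in zip(keys, values):
            args += [f"--{key}", str(value)]
        commands.append(shlex.join(args))
    return commands


# Hypothetical search space: 2 learning rates x 2 depths = 4 runs.
grid = {"learning_rate": [5e-5, 1e-4], "num_layers": [3, 5]}
cmds = build_commands("./horse/train.py", "EmbMLP", grid)
for cmd in cmds:
    print(cmd)
```

Each generated string could then be passed to `subprocess.run(shlex.split(cmd))`, with the validation metric of every run collected to pick the best setting.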

## 2 Try with scripts!

### 2.1 Load Dataset

In [1]:
from horse.data.load_data import DataSet


data = DataSet(
    # z-score standardization for numeric features
    scaling=True
    # map categorical values to integer indices for embedding lookups
    , do_categorization=True
    # use the best 22 features; see the feature selection part for details
    , use_best_feats=True
)
train, val, test = data.my_train_val_test_split([0.8, 0.1, 0.1])
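
To make the three `DataSet` options and the `[0.8, 0.1, 0.1]` split more concrete, here is a minimal pure-Python sketch of what such preprocessing commonly does. It is an illustration under assumptions, not the code in `horse.data.load_data`: the helpers `z_score`, `categorize`, and `split_by_ratio` are hypothetical, and the real split may well group rows by race or by date rather than shuffle them.

```python
import random
import statistics


def z_score(values):
    """Standardize a numeric column to zero mean and unit variance."""
    mean = statistics.fmean(values)
    std = statistics.pstdev(values)
    if std == 0:
        return [0.0 for _ in values]  # constant column: nothing to scale
    return [(v - mean) / std for v in values]


def categorize(values):
    """Map each distinct category to an integer index usable for embedding lookups."""
    index = {}
    codes = [index.setdefault(v, len(index)) for v in values]
    return codes, index


def split_by_ratio(rows, ratios, seed=0):
    """Shuffle rows and cut them into train/val/test parts by the given ratios."""
    if abs(sum(ratios) - 1.0) > 1e-9:
        raise ValueError("ratios must sum to 1")
    rows = list(rows)
    random.Random(seed).shuffle(rows)
    n_train = int(len(rows) * ratios[0])
    n_val = int(len(rows) * ratios[1])
    return rows[:n_train], rows[n_train:n_train + n_val], rows[n_train + n_val:]


scaled = z_score([1.0, 2.0, 3.0])
codes, mapping = categorize(["a", "b", "a"])
train_rows, val_rows, test_rows = split_by_ratio(range(100), [0.8, 0.1, 0.1])
print(len(train_rows), len(val_rows), len(test_rows))
```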

### 2.2 Call our models and design your own training!

Note that all models involved are callable. Machine learning models are used the same way as sklearn estimators. For deep learning models, we provide the model classes in torch, as well as the training files.

In [2]:
from horse.model import HKJC_models as ml_model
from horse.model import racing_model as dl_model



""" 
[Choices for dl_model]:
- dl_model.LinEmbConcat()
- dl_model.LinEmbDotProd()
- dl_model.LinEmbElemProd()
- dl_model.EmbMLP()

[Choices for ml_model]:
- LogisticRegression
- DecisionTreeClassifier
- RandomForestClassifier
- AdaBoostClassifier
- XGBoostClassifier
"""

# try it by yourself
