hcho3/catboost_python_repro

Goal: Reverse-engineer the CatBoost model object so that we can run prediction with it

1. Install CatBoost:
   pip install catboost
2. Install CityHash:
   pip install clickhouse-cityhash
3. Download the Adult data set from https://archive.ics.uci.edu/ml/datasets/Adult. Get the files
   adult.data and adult.test and place them in the same directory as the training script train.py.
4. Run train.py to fit the CatBoost model on the Adult dataset. The script saves the model
   as a JSON file named adult_model.json.
5. Run e2e_demo.py to see how we preprocess the features and evaluate the CatBoost trees.
   Read the script itself for the full details.
6. Run target_encoder_demo.py to observe that most of the preprocessing for categorical data can
   be abstracted as a target encoding operator. The target encoding operator is a lookup table
   that maps each possible categorical input to a real number.
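
The lookup-table idea in step 6 can be sketched in a few lines of Python. This is an
illustration only, not the code in target_encoder_demo.py: the function names, the
smoothed-mean formula, the prior values, and the toy rows are all assumptions made for
the example, and the statistics CatBoost actually computes for categorical features may
differ.

```python
from collections import defaultdict


def fit_target_encoder(categories, targets, prior=0.5, prior_weight=1.0):
    """Build a lookup table mapping each category to a smoothed target mean.

    Hypothetical helper: the smoothing formula is an assumption, not
    necessarily the statistic CatBoost uses.
    """
    sums = defaultdict(float)
    counts = defaultdict(int)
    for cat, y in zip(categories, targets):
        sums[cat] += y
        counts[cat] += 1
    return {
        cat: (sums[cat] + prior * prior_weight) / (counts[cat] + prior_weight)
        for cat in counts
    }


def encode(table, category, default):
    """Map one categorical value to a real number; unseen values get `default`."""
    return table.get(category, default)


# Toy rows shaped like the Adult "workclass" column (made up, not real data).
workclass = ["Private", "Private", "State-gov", "Private", "State-gov"]
income_gt_50k = [1, 0, 0, 1, 1]

table = fit_target_encoder(workclass, income_gt_50k)
print(table)
print(encode(table, "Never-worked", default=0.5))
```

Once the table is built, encoding a row is a single dictionary lookup, which is why the
categorical preprocessing can be treated as one operator placed in front of the trees.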
