Catboost_spark doesn't recognise setAutoClassWeights and setScalePosWeight #2470

Closed
kamranesmaeili opened this issue Aug 6, 2023 · 2 comments

kamranesmaeili commented Aug 6, 2023

Problem: the CatBoost PySpark implementation doesn't recognise setAutoClassWeights and setScalePosWeight. For instance, when I try setScalePosWeight(3.0) I get an error saying unknown option {scale_pos_weight} with value "3". Without these setters I can train the model without any issues. The code is as follows:

  • catboost_spark.CatBoostClassifier(featuresCol='features', labelCol='label', evalMetric='AUC') - works well.
  • catboost_spark.CatBoostClassifier(featuresCol='features', labelCol='label', evalMetric='AUC').setAutoClassWeights('Balanced')
  • catboost_spark.CatBoostClassifier(featuresCol='features', labelCol='label', evalMetric='AUC').setScalePosWeight(3.0)
catboost version: catboost-spark_3.2_2.12:1.2 (maven) - spark 3.2.1 and scala 2.12
Operating System: EMR cluster
CPU: master node of m4.2xlarge and executor nodes of r4.xlarge (4 and 8V cores and 32 GiB memory)
GPU: N/A
Member

andrey-khropov commented Aug 28, 2023

  1. AutoClassWeights issue.

You should pass an enum value of type catboost_spark.EAutoClassWeightsType instead of a string, as described in the API documentation.

The proper syntax is:

catboost_spark.CatBoostClassifier(featuresCol='features', labelCol='label', evalMetric='AUC').setAutoClassWeights(catboost_spark.EAutoClassWeightsType.Balanced)

I've checked and it works.
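
For readers who want the enum-based call in context, here is a minimal sketch rather than a definitive setup. It assumes the Maven group id is ai.catboost (the report only gives the artifact and version), that the package is pulled in through spark.jars.packages, and that catboost_spark is imported after the session is created. The function name, the constant and the app name are hypothetical.

```python
# Artifact and version are from the report above; the "ai.catboost" group id
# is an assumption. Adjust to your own Spark/Scala build.
CATBOOST_SPARK_PACKAGE = "ai.catboost:catboost-spark_3.2_2.12:1.2"


def build_balanced_classifier(package=CATBOOST_SPARK_PACKAGE):
    """Create a CatBoostClassifier that uses balanced automatic class weights."""
    # Imported lazily so this module still loads on machines without Spark.
    from pyspark.sql import SparkSession

    spark = (
        SparkSession.builder
        .appName("catboost-auto-class-weights")  # hypothetical app name
        .config("spark.jars.packages", package)
        .getOrCreate()
    )

    # Import catboost_spark only once the session (and its packages) exist.
    import catboost_spark

    classifier = catboost_spark.CatBoostClassifier(
        featuresCol="features", labelCol="label", evalMetric="AUC"
    )
    # Pass the enum member, not the string 'Balanced'.
    classifier.setAutoClassWeights(catboost_spark.EAutoClassWeightsType.Balanced)
    return spark, classifier
```

Calling build_balanced_classifier() on a cluster where the package resolves should return the session together with a classifier ready for fit().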

@andrey-khropov
Member

  2. scalePosWeight support has been fixed in commit b37c1d5.
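
As a rough illustration of using the setter once a build containing that commit is installed (the thread does not say which release that is), the hypothetical helper below applies a positive-class weight to an existing classifier. It assumes only that the object exposes setScalePosWeight, as the classifier in the report does.

```python
def with_scale_pos_weight(classifier, weight):
    """Set scalePosWeight on a CatBoostClassifier-like object and return it.

    Hypothetical helper: it relies only on the setScalePosWeight setter
    named in the report, and needs a build that includes the fix.
    """
    # The setter takes a float, e.g. 3.0 as in the original report.
    classifier.setScalePosWeight(float(weight))
    return classifier
```

With the classifier from the report, this would be called as with_scale_pos_weight(catboost_spark.CatBoostClassifier(featuresCol='features', labelCol='label', evalMetric='AUC'), 3.0).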
