This repository has been archived by the owner on Apr 8, 2024. It is now read-only.

Successfully installed catboost wants to import as _catboost #40 #915

Open
rm-minus-r-star opened this issue Jan 18, 2024 · 9 comments
Labels
bug Something isn't working

Comments

@rm-minus-r-star

Describe the bug
Attempt to import catboost results in error that module _catboost cannot be found -- a leading underscore is picked up somewhere.

Your environment

  • OS: Xubuntu 20.04
  • Package Versions:
dbt:1.5.9
fal:1.5.4
  • Adapter being used:
postgres:1.5.9

How to reproduce
I'm trying to use a model trained outside dbt to predict labels via python under dbt-fal

fal-project.yml:

environments:
  - name: ml
    type: venv
    requirements:
      - scipy
      - pandas
      - numpy
      - statsmodels
      - catboost

catboost was just added to this environment; models that use the other listed libraries work well. The first run of the model below produced a long installation log on stdout, ending with

[builder] [info] Successfully installed [...] catboost-1.2.2 [...]

Running the Python model below with dbt run --select ... gives me the error that follows:

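import pandas
from catboost import CatBoostRegressor
from pandas import Series, concat

def model(dbt, fal):
    dbt.config(fal_environment="ml")

    df: pandas.DataFrame = dbt.ref("tr_rep_gentrification_prediction_inputs")

    # Drop the non-feature columns and fill missing values before predicting
    X = df\
        .drop(['col0', 'col1', 'col2'], axis=1)\
        .fillna(0.0)

    # Load the model that was trained outside dbt
    catb = CatBoostRegressor()
    catb.load_model('cb_model.cbm')

    # predict() returns a NumPy array, which concat() rejects, so wrap it in a
    # Series first; the column name "prediction" is an assumed placeholder
    pred = Series(catb.predict(X), index=df.index, name="prediction")
    results = concat([df, pred], axis=1)

    return results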

stdout:

No module named '_catboost'
22:55:01  1 of 1 ERROR creating python table model trans.tr_rep_gentrification_prediction_outputs  [ERROR in 42.02s]
22:55:02  
22:55:02  Finished running 1 table model in 0 hours 0 minutes and 58.89 seconds (58.89s).
22:55:02  
22:55:02  Completed with 1 error and 0 warnings:
22:55:02  
22:55:02  No module named '_catboost'

If I remove catboost from the fal-project.yml file, I still get an import error (as expected), but the module name in the message no longer has the leading underscore.

I also tried importing inside the model function instead, as recommended by @mederka at fal-ai/fal#40 (comment), but I get the same error.

Expected behavior
I expect catboost to be imported the same as every other library

Actual behavior
model fails to run owing to _catboost not being found -- a leading underscore is being added.

Screenshots
None

Additional context
Also posted Here in case there's a more generally obvious solution

@chamini2
Member

It seems that _catboost is a module inside the library itself, so I think this is more about how catboost installs than about dbt-fal.

https://github.com/catboost/catboost/blob/d6172a4e4b11f485c416368461feae3f3ce98745/catboost/python-package/catboost/_catboost.pyx

@rm-minus-r-star
Author

rm-minus-r-star commented Jan 18, 2024

Hmm. It installs fine outside of dbt-fal though.

CatBoostRegressor appears to be exported from core.py through the package-level __init__.py. I'm not sure why a Cython source file in the same directory would interfere.
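
For context on the question above: a .pyx file is Cython source that is compiled at build time into a binary extension module, so the pure-Python part of a package can depend on a compiled module shipped next to it. The sketch below is one way to check, from inside the environment that actually runs the model, whether such a compiled file is present. The helper name find_extension_files is hypothetical, and the assumption that the binary sits in the package directory under a name starting with _catboost is inferred from the .pyx path linked above, not confirmed anywhere in this thread.

```python
import importlib.machinery
import importlib.util
from pathlib import Path
from typing import List


def find_extension_files(package: str, stem: str) -> List[str]:
    """Return compiled extension files named `stem` inside `package`'s directory.

    Hypothetical helper: it only inspects the filesystem and never imports
    the package, so it can be called even when the import itself fails.
    """
    spec = importlib.util.find_spec(package)
    if spec is None or not spec.submodule_search_locations:
        return []  # not installed in this interpreter, or not a package

    # Platform-specific suffixes for binary modules, such as ".so" or ".pyd"
    suffixes = tuple(importlib.machinery.EXTENSION_SUFFIXES)
    found = []
    for location in spec.submodule_search_locations:
        directory = Path(location)
        if not directory.is_dir():
            continue
        for path in directory.iterdir():
            if path.name.startswith(stem + ".") and path.name.endswith(suffixes):
                found.append(str(path))
    return sorted(found)


if __name__ == "__main__":
    import sys

    # Shows which interpreter is running and what it can see on disk
    print("interpreter:", sys.executable)
    print("compiled files:", find_extension_files("catboost", "_catboost"))
```

An empty list from inside the model's environment would suggest that the binary was not installed there, or that a different interpreter than expected is running the code.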

@chamini2
Member

can you add more details around

[builder]  [info]    Successfully installed [...] catboost-1.2.2 [...]

see if we can find a hint there

@rm-minus-r-star
Author

tmperr.txt

I had a look over this too, nothing jumped out at me, but I'm not an expert.

This log ended with a silly error on my part when trying to run the python model -- after fixing the obvious, I get the errors as quoted in the bug report.

@chamini2
Member

Can you try to build it with a conda environment instead?

environments:
  - name: ml
    type: conda
    packages:
      - scipy
      - pandas
      - numpy
      - statsmodels
      - catboost

@rm-minus-r-star
Author

rm-minus-r-star commented Jan 23, 2024

(

@chamini2 noted, but seriously struggling to get conda functional. I've tried so many things. Should this be a no-brainer? Or does this actually give you info?

No matter what I try, I get

Could not find conda executable. If conda executable is not available by default, please point isolate to the path where conda binary is available 'ISOLATE_CONDA_HOME'.

)

@chamini2
Member

You need to have conda installed to be able to use this, but I think it will make your use case work.

@rm-minus-r-star
Author

"You need to have conda installed to be able to use this, but I think it will make your use case work."

Yeah, I installed conda, tried setting the env var to every level of the install location, and activated it in the same shell, all with no joy. Great to hear that it sounds positive for the venv type.
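
The error quoted earlier names two places where the conda binary may be looked up: the default search path and ISOLATE_CONDA_HOME. A minimal sketch for checking both from the shell that launches dbt is below. The function locate_conda is hypothetical, and treating ISOLATE_CONDA_HOME as the directory that directly contains the conda executable is an assumption based on the wording of the message, not on isolate's source.

```python
import os
import shutil
from typing import Mapping, Optional


def locate_conda(env: Mapping[str, str] = os.environ) -> Optional[str]:
    """Return the path of a conda executable, or None if none is found.

    Assumption: ISOLATE_CONDA_HOME, when set, is the directory holding the binary.
    """
    home = env.get("ISOLATE_CONDA_HOME")
    if home:
        # Restrict the search to that one directory
        hit = shutil.which("conda", path=home)
        if hit:
            return hit
    # Otherwise fall back to the PATH seen by this process
    return shutil.which("conda", path=env.get("PATH", os.defpath))


if __name__ == "__main__":
    print("ISOLATE_CONDA_HOME:", os.environ.get("ISOLATE_CONDA_HOME"))
    print("conda found at:", locate_conda())
```

If this prints None in the same shell where dbt is started, the variable is probably pointing at a directory that does not directly contain an executable named conda.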

@rm-minus-r-star
Author

rm-minus-r-star commented Mar 10, 2024

"[...], but I think it will make your use case work."

Any luck here?
