[BUG] Lambda function for passing data frame to the setup function in pycaret doesn't work #527

Closed
rcshetty3 opened this issue Nov 14, 2023 · 1 comment

Comments

@rcshetty3

Minimal Code To Reproduce

from pycaret.classification import *

setup(data=lambda: get_data("juice", verbose=False, profile=False), target = 'Purchase', session_id=0, n_jobs=1);

Error message -
1 from pycaret.classification import *
----> 3 setup(data=lambda: get_data("juice", verbose=False, profile=False), target = 'Purchase', session_id=0, n_jobs=1)

File c:\ProgramData\anaconda3\envs\pycaretenv\lib\site-packages\pycaret\classification\functional.py:595, in setup(data, data_func, target, index, train_size, test_data, ordinal_features, numeric_features, categorical_features, date_features, text_features, ignore_features, keep_features, preprocess, create_date_columns, imputation_type, numeric_imputation, categorical_imputation, iterative_imputation_iters, numeric_iterative_imputer, categorical_iterative_imputer, text_features_method, max_encoding_ohe, encoding_method, rare_to_value, rare_value, polynomial_features, polynomial_degree, low_variance_threshold, group_features, drop_groups, remove_multicollinearity, multicollinearity_threshold, bin_numeric_features, remove_outliers, outliers_method, outliers_threshold, fix_imbalance, fix_imbalance_method, transformation, transformation_method, normalize, normalize_method, pca, pca_method, pca_components, feature_selection, feature_selection_method, feature_selection_estimator, n_features_to_select, custom_pipeline, custom_pipeline_position, data_split_shuffle, data_split_stratify, fold_strategy, fold, fold_shuffle, fold_groups, n_jobs, use_gpu, html, session_id, system_log, log_experiment, experiment_name, experiment_custom_tags, log_plots, log_profile, log_data, verbose, memory, profile, profile_kwargs)
593 exp = _EXPERIMENT_CLASS()
594 set_current_experiment(exp)
--> 595 return exp.setup(
596 data=data,
597 data_func=data_func,
598 target=target,
599 index=index,
600 train_size=train_size,
601 test_data=test_data,
602 ordinal_features=ordinal_features,
603 numeric_features=numeric_features,
604 categorical_features=categorical_features,
605 date_features=date_features,
606 text_features=text_features,
607 ignore_features=ignore_features,
608 keep_features=keep_features,
609 preprocess=preprocess,
610 create_date_columns=create_date_columns,
...
93 if data is not None:
94 if not isinstance(data, pd.DataFrame):
95 # Assign default column names (dict already has column names)

TypeError: 'function' object is not subscriptable
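The final line of the traceback can be reproduced without pycaret. The sketch below is only an illustration: `load_data` is a hypothetical stand-in for the lambda passed as `data=`, and it returns a plain dict with made-up values rather than a real data frame. It shows that indexing the function object itself, instead of the value it returns, raises this TypeError.

```python
# Hypothetical stand-in for the lambda passed as data=...
# It returns a plain dict (placeholder values) so the snippet needs no pycaret or pandas.
load_data = lambda: {"Purchase": ["A", "B"]}

try:
    # Index the function object itself, not the value it returns.
    load_data["Purchase"]
except TypeError as exc:
    message = str(exc)

print(message)

# Calling the function first and indexing its result works.
print(load_data()["Purchase"])
```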
Describe the bug
Passing a lambda that returns the data frame as the data argument of pycaret's setup() fails with the TypeError shown above.

Expected behavior
setup() should complete successfully when the data frame is supplied through a lambda.

Environment (please complete the following information):

  • Backend: dask
  • Backend version: not sure (latest as of Nov 14, 2023)
  • Python version: 3.8
  • OS: Windows
@kvnkho
Collaborator

kvnkho commented Nov 14, 2023

Hi @rcshetty3, this doesn't seem like a Fugue issue, right? The traceback is entirely pycaret code, inside the pycaret setup() function. Looking at their code, setup() accepts either data or data_func: data is expected to be the dataset itself, while data_func takes a callable that returns it. See this.

Maybe you can try:

setup(data_func=lambda: get_data("juice", verbose=False, profile=False), target = 'Purchase', session_id=0, n_jobs=1);

I think this should be posted in the pycaret GitHub repository, though.
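To illustrate the difference between the two arguments, here is a minimal sketch of how a function might resolve a dataset from either `data` or `data_func`. `resolve_data` is a hypothetical helper written for this comment, not pycaret's implementation, and the rule that exactly one of the two must be passed is an assumption made for the sketch.

```python
from typing import Any, Callable, Optional


def resolve_data(
    data: Optional[Any] = None,
    data_func: Optional[Callable[[], Any]] = None,
) -> Any:
    """Hypothetical helper: return the dataset from exactly one of the two arguments."""
    # Assumption for this sketch: exactly one of data / data_func must be given.
    if (data is None) == (data_func is None):
        raise ValueError("pass exactly one of data or data_func")
    if data_func is not None:
        # A callable is invoked to produce the dataset.
        return data_func()
    if callable(data):
        # Fail early with a clearer message than "'function' object is not subscriptable".
        raise TypeError("data must be the dataset itself; pass callables as data_func")
    return data


# A plain dict with made-up values stands in for a data frame here.
rows = resolve_data(data_func=lambda: {"Purchase": ["A", "B"]})
print(rows)
```

With a helper like this, passing the lambda as `data` would be rejected up front with a message that points at `data_func`, instead of failing later when the function object is indexed.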

@kvnkho kvnkho closed this as completed Jan 2, 2024