AttributeError: split not found #1712

Open
WithIbadKhan opened this issue Sep 25, 2022 · 3 comments
@WithIbadKhan

WithIbadKhan commented Sep 25, 2022

Issue description

When I run the logistic regression code below, I get an AttributeError.

Code example/repro steps

from sklearn.model_selection import GridSearchCV
from sklearn.pipeline import Pipeline
from sklearn.linear_model import LogisticRegression
from sklearn.feature_extraction.text import TfidfVectorizer



tfidf = TfidfVectorizer(strip_accents=None,
                        lowercase=False,
                        preprocessor=None)

param_grid = [{'vect__ngram_range': [(1, 1)],
               'vect__stop_words': [stop_words, None],
               'vect__tokenizer': [tokenizer, tokenizer_porter],
               'vect__preprocessor': [None, preprocessor],
               'clf__penalty': ['l1', 'l2'],
               'clf__C': [1.0, 10.0, 100.0]},
              {'vect__ngram_range': [(1,1)],
               'vect__stop_words': [stop_words, None],
               'vect__tokenizer': [tokenizer, tokenizer_porter],
               'vect__preprocessor': [None, preprocessor],
               'vect__use_idf':[False],
               'vect__norm':[None],
               'clf__penalty': ['l1', 'l2'],
               'clf__C': [1.0, 10.0, 100.0]},
              ]

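lr_tfidf = Pipeline([('vect', tfidf),
                     # 'l1' in the grid needs a solver that supports it; the default 'lbfgs' does not
                     ('clf', LogisticRegression(random_state=0, solver='liblinear'))])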
# Tune the pipeline's hyperparameters with a cross-validated grid search over param_grid
gs_lr_tfidf = GridSearchCV(lr_tfidf, param_grid,
                           scoring='accuracy',
                           cv=5,
                           verbose=1,
                           n_jobs=-1)
gs_lr_tfidf.fit(X_train, y_train)

After running this code, I get the "split not found" error.

AttributeError: split not found
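
A hedged diagnostic sketch, not a confirmed diagnosis: `split` is a `str` method, so one plausible cause is that some item reaching the tokenizer is not a string (for example a missing value, a number, or a row of an already-vectorized matrix). This assumes the tokenizer, which is not shown in the issue, calls `text.split()`. The helper name `find_non_strings` and the sample data are hypothetical; the real `X_train` is not shown.

```python
def find_non_strings(docs, limit=5):
    """Return up to `limit` (index, type name) pairs for items that are not str."""
    problems = []
    for i, doc in enumerate(docs):
        if not isinstance(doc, str):
            problems.append((i, type(doc).__name__))
            if len(problems) >= limit:
                break
    return problems


# Stand-in data; replace with the actual X_train before calling fit().
sample_docs = ["a good movie", None, "bad plot", 3.5]
print(find_non_strings(sample_docs))
```

An empty result would suggest the inputs are plain strings and the cause lies elsewhere, in which case the full stack trace would be the next thing to look at.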

@bhancock8
Member

@humzaiqbal would you take a look?

@github-actions

This issue is stale because it has been open 90 days with no activity. Remove stale label or comment or this will be closed in 7 days.

@bhancock8 added the no-stale label (Auto-stale bot skips this issue) and removed the no-issue-activity label on Dec 27, 2022
@humzaiqbal
Contributor

humzaiqbal commented Dec 28, 2022

Hi @WithIbadKhan,

To make sure I understand correctly: are you trying to use Snorkel and hitting this issue somewhere in the process? From the code snippet you've posted, this looks like an issue with the scikit-learn library you are using rather than with Snorkel. If that's not the case and you are using Snorkel code, adding it to the snippet would help a lot with triaging.

Thanks!
