AttributeError: 'DataFrame' object has no attribute 'name' #666

islrnd · 2019-12-18T11:48:14Z

AttributeError                            Traceback (most recent call last)
<ipython-input-32-8d38637b6cc6> in <module>
      6 
      7 oversampler=SMOTE(random_state=42)
----> 8 smote_train, smote_target = oversampler.fit_resample(X,y)
      9 
     10 print("Before OverSampling, counts of label '0', '1':", smote_target['label'].value_counts())

~\Anaconda3\lib\site-packages\imblearn\base.py in fit_resample(self, X, y)
     73         """
     74         check_classification_targets(y)
---> 75         X, y, binarize_y = self._check_X_y(X, y)
     76 
     77         self.sampling_strategy_ = check_sampling_strategy(

~\Anaconda3\lib\site-packages\imblearn\base.py in _check_X_y(self, X, y, accept_sparse)
    148         if hasattr(y, "loc"):
    149             # store information to build a series
--> 150             self._y_name = y.name
    151             self._y_dtype = y.dtype
    152         else:

~\Anaconda3\lib\site-packages\pandas\core\generic.py in __getattr__(self, name)
   5065             if self._info_axis._can_hold_identifiers_and_holds_name(name):
   5066                 return self[name]
-> 5067             return object.__getattribute__(self, name)
   5068 
   5069     def __setattr__(self, name, value):

AttributeError: 'DataFrame' object has no attribute 'name'

chkoar · 2019-12-18T13:16:50Z

@glemaitre I think that is related to the sanity check that we were discussing, no? The user passes something that has .loc but is not a Series as we expect. It is a DataFrame.

glemaitre · 2019-12-18T15:02:37Z

Yes, but type checking is not Pythonic. We should probably look for attributes that "quack" more specifically like a Series.
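As a rough illustration of the duck-typing idea (not the check that imbalanced-learn actually adopted), a Series can be told apart from a DataFrame by attributes rather than by type. The helper name `looks_like_series` is hypothetical, and the choice of attributes is an assumption made for this sketch.

```python
import numpy as np
import pandas as pd


def looks_like_series(y):
    """Hypothetical helper: duck-typed check for a pandas-Series-like target."""
    return (
        hasattr(y, "loc")  # pandas-style label indexer
        and getattr(y, "ndim", None) == 1  # one-dimensional, unlike a DataFrame
        and hasattr(y, "dtype")  # a single dtype (a DataFrame exposes .dtypes)
    )


print(looks_like_series(pd.Series([0, 1], name="label")))
print(looks_like_series(pd.DataFrame({"label": [0, 1]})))
print(looks_like_series(np.array([0, 1])))
```

Checking `ndim` avoids relying on `.name` alone, which a DataFrame would also appear to have if one of its columns happened to be called `name`.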

glemaitre · 2019-12-18T15:05:48Z

@islrnd you should pass a NumPy array or a pandas Series as y, not a DataFrame.
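A minimal sketch of the difference between selecting the target as a DataFrame and as a Series. The toy data and variable names are assumptions for illustration; only the `label` column name comes from the traceback at the top of the thread.

```python
import pandas as pd

# Hypothetical toy data; "label" mirrors the column name used in the traceback.
df = pd.DataFrame({"f1": [0.1, 0.2, 0.3, 0.4], "label": [0, 0, 0, 1]})

X = df.drop(columns="label")

y_frame = df[["label"]]  # double brackets give a one-column DataFrame
y_series = df["label"]  # single brackets give a Series with .name and .dtype

# A one-column DataFrame can also be converted after the fact.
y_from_frame = y_frame.squeeze("columns")  # y_frame.iloc[:, 0] works as well

print(type(y_frame).__name__, hasattr(y_frame, "name"))
print(type(y_series).__name__, y_series.name)
print(type(y_from_frame).__name__, y_from_frame.name)
```

With the imbalanced-learn versions affected by this issue, passing `y_series` (or `y_series.to_numpy()`) instead of `y_frame` to `fit_resample` avoids the `y.name` lookup on a DataFrame.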

00krishna · 2019-12-18T16:30:21Z

This is a confusing issue. If I pass in a DataFrame, I get the error about no attribute name. If I pass in a Series, I get a different error: ValueError: Found array with 0 feature(s) (shape=(7788867, 0)) while a minimum of 1 is required. So no matter how I pass the data in, it raises an error.

chkoar · 2019-12-18T16:33:39Z

@00krishna please post sample code to reproduce your error.

00krishna · 2019-12-18T16:36:29Z

@chkoar Sorry, I realized I have all categorical features and that is causing the problem.
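The thread does not say how the empty feature matrix arose. One possible way, assumed here purely for illustration, is keeping only numeric columns from a frame whose features are all categorical, which leaves a matrix of shape `(n_samples, 0)`. The column names and data below are made up, and one-hot encoding is shown only as one option among several.

```python
import pandas as pd

# Hypothetical all-categorical features (names are invented for this sketch).
X = pd.DataFrame({"color": ["red", "blue", "red"], "size": ["S", "M", "L"]})

# Selecting numeric columns from such a frame leaves no features at all.
numeric_only = X.select_dtypes(include="number")
print(numeric_only.shape)

# One-hot encoding is one way to obtain a numeric feature matrix instead.
encoded = pd.get_dummies(X)
print(encoded.shape)
```

Printing `X.shape` right before calling the sampler is a cheap way to catch this situation early.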

parth-radonc · 2020-01-13T06:47:28Z

@00krishna I am facing a similar issue. Can you help me with it?

chkoar · 2020-01-13T12:13:41Z

@parth-mango You are probably passing a DataFrame as y. You should pass a Series object. If you want, provide a minimal reproducible example.

flowersw · 2020-01-26T15:16:53Z

I had the same issue, and was accidentally passing in a DataFrame for y instead of a Series.

chkoar · 2020-01-26T19:12:06Z

@flowersw #673 or a follow-up PR will hopefully solve this problem.

serjko · 2020-01-28T20:01:11Z

Hi there,

I'm facing a similar issue that wasn't resolved after converting the initial DataFrame to a Series. Please take a look at my question on Data Science Stack Exchange: https://datascience.stackexchange.com/questions/67141/passing-data-to-smote-after-applying-train-test-split

chkoar · 2020-01-28T20:07:35Z

@serjko please post a minimal reproducible example and your package versions.

serjko · 2020-01-28T21:52:13Z

Hi @chkoar, thanks for jumping in so quickly. This is a false alarm. The root cause was my dataset and not SMOTE. I made it work after cleaning up the data and passing a Series as the second argument to fit_sample. Apologies for bothering you and thanks again for the answer!

ertanuj96 · 2020-02-05T09:15:14Z

@chkoar, hey, I am facing a similar issue when I apply a regex string match to an entire DataFrame.
Note: I cannot share the xlsx file here; you can ask me for it in private.
Code snippet:

import os
import time
import sys
import subprocess
import re
import pandas as pd
from openpyxl import load_workbook
import pyperclip

from tkinter import Tk
from tkinter.filedialog import askopenfilename

def get_lookup_excel_path():
    parent_dir = '/root/Desktop/lookup.xlsx'
    return parent_dir

def upload_spreadsheet(path, active_sheet_only=True):
    '''Returns pandas dataframe. Returns empty dataframe if fails'''
    # check if file exists
    if not os.path.isfile(path):
        return pd.DataFrame()

    # check file type
    file_type = os.path.splitext(path)[1]

    if file_type == '.csv':
        df = pd.read_csv(path)
    elif file_type in ['.xlsx', '.xlsm', '.xltx', '.xltm']:
        wb = load_workbook(path)

        # convert sheets to pandas dataframe
        if active_sheet_only:
            df = pd.DataFrame(wb.active.values)
        else:
            # combine all sheets into one dataframe
            frames = [pd.DataFrame(sheet.values) for sheet in wb.worksheets]
            df = pd.concat(frames)
        wb.close()
    else:
        return pd.DataFrame()

    return df

def combine_dataframe(dataframe):
    '''Converts pandas dataframe into one list'''
    return [cell for column in dataframe for cell in dataframe[column]]

def get_springer_books_excel():
    parent_dir = '/root/Desktop/springnature.xlsx'
    return parent_dir

def main():
    file_path = get_lookup_excel_path()
    if file_path == '':
        return
    df = upload_spreadsheet(file_path, active_sheet_only=True)
    search_values = ['Engineering', 'Computer']
    df[df.name.str.contains('|'.join(search_values))]

if __name__ == '__main__':
    main()
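The error in this snippet most likely comes from `df.name`: a frame built with `pd.DataFrame(wb.active.values)` gets integer column labels, so there is no `name` column to look up. A minimal sketch of one way around it, assuming the first spreadsheet row holds the header and one of the columns is called `name`; the `rows` list is invented data standing in for `wb.active.values`.

```python
import pandas as pd

# Made-up rows in the shape a worksheet's values would take (tuples per row).
rows = [
    ("name", "year"),
    ("Engineering Mathematics", 2019),
    ("Computer Vision", 2020),
    ("Botany", 2018),
]

# Promote the first row to column labels instead of keeping it as data.
header, *records = rows
df = pd.DataFrame(records, columns=list(header))

search_values = ["Engineering", "Computer"]
pattern = "|".join(search_values)

# Bracket access makes a missing column fail with a clear KeyError.
matches = df[df["name"].str.contains(pattern, na=False)]
print(matches)
```

This is unrelated to imbalanced-learn; it only concerns how the DataFrame columns are labeled.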

chkoar · 2020-02-05T09:45:16Z

Hey I am facing similar issue when I am using regex a string on entire dataframe .

@ertanuj96 sorry, I do not get it. Is this related to imbalanced-learn?

ramiazmi · 2020-02-16T07:25:02Z

I have experienced this before. I resolved it by passing a DataFrame as X and a Series as y.

chkoar · 2020-02-16T09:42:14Z

@ramiazmi #681 has solved this issue, so 0.6.2 (which is not released yet) will fix such problems.

glemaitre · 2020-02-16T11:18:40Z

0.6.2 is out on PyPI and will shortly be available on conda-forge. Locking this issue.
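A small sketch for checking whether the installed imbalanced-learn is at least the 0.6.2 release mentioned above. It uses only the standard library; `version_tuple` is a hypothetical helper that does a rough comparison and ignores pre-release suffixes.

```python
import re
from importlib.metadata import PackageNotFoundError, version

MINIMUM = (0, 6, 2)  # release named in the thread as containing the fix


def installed_version(package):
    """Return the installed version string of `package`, or None if absent."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


def version_tuple(text):
    """Hypothetical helper: first three numeric components of a version string."""
    return tuple(int(part) for part in re.findall(r"\d+", text)[:3])


found = installed_version("imbalanced-learn")
if found is None:
    print("imbalanced-learn is not installed")
elif version_tuple(found) >= MINIMUM:
    print(f"imbalanced-learn {found} includes the fix")
else:
    print(f"imbalanced-learn {found} predates 0.6.2; consider upgrading")
```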

chkoar mentioned this issue Jan 7, 2020

Accept column vectors when having binary or multiclass targets #673

Merged

chkoar mentioned this issue Feb 3, 2020

[MRG] Better in-out support #681

Merged

glemaitre closed this as completed in #681 Feb 12, 2020

scikit-learn-contrib locked as resolved and limited conversation to collaborators Feb 16, 2020
