regression issue #11

Closed
prashanthGit945 opened this issue Jan 2, 2021 · 11 comments

Comments

@prashanthGit945

Creating New Features with Features Selection

[GML] The 1 step feature engineering process could generate up to 49 features.
[GML] With 5429 data points this new feature matrix would use about 0.00 gb of space.
[FEATURE_ENGINEERING] Step 1: transformation of original features
[FEATURE_ENGINEERING] Generated 14 transformed features from 7 original features - done.
[FEATURE_ENGINEERING] Generated altogether 17 new features in 1 steps
[FEATURE_ENGINEERING] Removing correlated features, as well as additions at the highest level

AttributeError Traceback (most recent call last)
in
3 numeric_cols =['session_id','session_number','client_agent','device_details']
4
----> 5 fe = FeatureEngineering(train,'time_spent',fill_missing_data=True, method_cat='Mode',cat_cols = cat_cols,numeric_cols = numeric_cols,
6 method_num='Mean',encode_data=True,normalize=True, remove_outliers=False,new_features=True,feateng_steps=1,task ='regression')
7

~\Anaconda3\lib\site-packages\GML\FEATURE_ENGINEERING.py in __init__(self, data, label, fill_missing_data, method_cat, method_num, drop, cat_cols, numeric_cols, thresh_cat, thresh_numeric, encode_data, method, thresh, normalize, method_transform, thresh_numeric_transform, remove_outliers, qu_fence, new_features, task, test_data, verbose, feateng_steps)
166 except:
167 pass
--> 168 X = afc.fit_transform(X, y)
169 if not test_data == None:
170 test_data = afc.transform(test_data)

~\Anaconda3\lib\site-packages\GML\AUTO_FEATURE_ENGINEERING\autofeat.py in fit_transform(self, X, y)
294 target_sub = target.copy()
295 # generate features
--> 296 df_subs, self.feature_formulas_ = engineer_features(df_subs, self.feateng_cols_, _parse_units(self.units, verbose=self.verbose),
297 self.feateng_steps, self.transformations, self.verbose)
298 # select predictive features

~\Anaconda3\lib\site-packages\GML\AUTO_FEATURE_ENGINEERING\feateng.py in engineer_features(df_org, start_features, units, max_steps, transformations, verbose)
339 print("[FEATURE_ENGINEERING] Generated altogether %i new features in %i steps" % (len(feature_pool) - len(start_features), max_steps))
340 print("[FEATURE_ENGINEERING] Removing correlated features, as well as additions at the highest level")
--> 341 feature_pool = {c: feature_pool[c] for c in feature_pool if c in uncorr_features and not feature_pool[c].func == sympy.add.Add}
342 cols = [c for c in list(df.columns) if c in feature_pool and c not in df_org.columns] # categorical cols not in feature_pool
343 if cols:

~\Anaconda3\lib\site-packages\GML\AUTO_FEATURE_ENGINEERING\feateng.py in (.0)
339 print("[FEATURE_ENGINEERING] Generated altogether %i new features in %i steps" % (len(feature_pool) - len(start_features), max_steps))
340 print("[FEATURE_ENGINEERING] Removing correlated features, as well as additions at the highest level")
--> 341 feature_pool = {c: feature_pool[c] for c in feature_pool if c in uncorr_features and not feature_pool[c].func == sympy.add.Add}
342 cols = [c for c in list(df.columns) if c in feature_pool and c not in df_org.columns] # categorical cols not in feature_pool
343 if cols:

AttributeError: module 'sympy' has no attribute 'add'
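
The last frame fails on the attribute lookup `sympy.add.Add`, before any column data is examined. The sketch below is a minimal illustration, not the library's own fix: it assumes an installed SymPy in which the `add` submodule is not reachable as `sympy.add` (which is what the error message reports), and it uses hypothetical symbols `x` and `y` in place of real `feature_pool` entries. It shows that the same comparison can be written against `sympy.Add`, which is the same class as `sympy.core.add.Add`.

```python
import sympy
from sympy.core.add import Add as CoreAdd

# Hypothetical stand-ins for entries of feature_pool (not taken from GML).
x, y = sympy.symbols("x y")
expr_sum = x + y    # top-level operation is an addition
expr_prod = x * y   # top-level operation is a multiplication

# sympy.Add is the public name of the addition class.
same_class = sympy.Add is CoreAdd

# The check from feateng.py line 341, rewritten without `sympy.add`.
is_sum = expr_sum.func == sympy.Add
is_prod_a_sum = expr_prod.func == sympy.Add

print(same_class, is_sum, is_prod_a_sum)
```

If this runs cleanly in the same environment, the failure is likely tied to how the installed SymPy version exposes its submodules rather than to the column types passed to `FeatureEngineering`.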

@prashanthGit945
Author

If you understand the issue, please respond.

@Muhammad4hmed
Owner

Hi, sorry, I could not respond quickly because of academic finals. Can you please share a snapshot of X and y?

@prashanthGit945
Author

Screenshot 2021-01-04 203007
Screenshot 2021-01-04 203107

@prashanthGit945
Author

The problem is with creating new features.

@Muhammad4hmed
Owner

The problem is that session_id, client_agent and device_details are not numeric, so pass them as cat_cols. You can also check which columns are categorical or numeric with X.dtypes.
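
A minimal sketch of that dtype check, assuming pandas and a hypothetical toy frame that borrows only the column names from this issue (the values are invented for illustration, since the real data appears only in screenshots):

```python
import pandas as pd

# Hypothetical toy data; only the column names come from the issue.
X = pd.DataFrame({
    "session_id": ["a1", "b2", "c3"],
    "session_number": [10, 20, 30],
    "client_agent": ["Chrome", "Firefox", "Chrome"],
    "device_details": ["Desktop", "Mobile", "Desktop"],
    "purchased": [0, 1, 0],
})

# Inspect the dtype pandas inferred for each column.
print(X.dtypes)

# Split the columns by dtype instead of typing the lists by hand.
numeric_cols = X.select_dtypes(include="number").columns.tolist()
cat_cols = X.select_dtypes(exclude="number").columns.tolist()

print("numeric_cols:", numeric_cols)
print("cat_cols:", cat_cols)
```

The resulting lists can then be passed as the `numeric_cols` and `cat_cols` arguments shown in the original call.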

@prashanthGit945
Author

It is still giving the same error.

@prashanthGit945
Author

Why don't you add a regression demo on GitHub? Please check once whether regression is working or not.

@prashanthGit945
Author

Screenshot 2021-01-05 172958

@prashanthGit945
Author

Unique values in session_id 5429
Unique values in session_number 610
Unique values in client_agent 698
Unique values in device_details 17
Unique values in date 342
Unique values in purchased 2
Unique values in added_in_cart 2
Unique values in checked_out 2
Unique values in time_spent 5235

@Muhammad4hmed
Owner

Either drop the date column or pass it in cat_cols.
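
Both options can be sketched as follows, assuming pandas and a hypothetical two-row frame named `train` (the values are invented; only the column names come from the issue):

```python
import pandas as pd

# Hypothetical toy data standing in for the real training frame.
train = pd.DataFrame({
    "date": ["2020-01-01", "2020-01-02"],
    "session_number": [1, 2],
    "time_spent": [12.5, 30.0],
})

# Option 1: drop the date column before feature engineering.
train_no_date = train.drop(columns=["date"])

# Option 2: keep the column and list it among the categorical columns.
cat_cols = ["date"]

print(train_no_date.columns.tolist())
print(cat_cols)
```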

@prashanthGit945
Author

Yes, I already dropped the date column at the very start.

This issue was closed.