First, run the notebook and preprocess the dataset with the given steps:
Steps:
1. Replacing null values: If null values are present, handle them intelligently.
2. Remove unwanted features or rows: If there are low-variance or constant features, omit them. If there are duplicate rows in the dataset, omit them too.
The update should reflect in the notebook.
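The steps above can be sketched roughly as follows with pandas. This is a minimal illustration on a toy DataFrame, not the actual notebook code; the column names (`tempo`, `energy`, `constant_col`) and the median-fill strategy are assumptions for the example.

```python
import pandas as pd

# Toy stand-in for the dataset (hypothetical columns).
df = pd.DataFrame({
    "tempo": [120.0, None, 98.5, 120.0],
    "energy": [0.8, 0.6, 0.7, 0.8],
    "constant_col": [1, 1, 1, 1],
})

# 1. Replace null values: fill numeric columns with the median,
#    which is robust to outliers (one reasonable "intelligent" choice).
for col in df.select_dtypes("number"):
    df[col] = df[col].fillna(df[col].median())

# 2a. Drop constant features (a single unique value = zero variance).
constant = [c for c in df.columns if df[c].nunique() == 1]
df = df.drop(columns=constant)

# 2b. Drop duplicate rows.
df = df.drop_duplicates()

print(constant)   # → ['constant_col']
print(len(df))    # → 3 (one duplicate row removed)
```

For a variance-based cutoff instead of the exact-constant check, scikit-learn's `VarianceThreshold` can be swapped in for step 2a.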
Need clarification on step 2 (remove unwanted features or rows: omit low-variance or constant features, and omit duplicates).
The dataset has 19 columns, 4 of them categorical ('artists', 'id', 'name', and 'release_date'). 'id' is already dropped. Some of the other features also have low variance, but I think those features are needed for future processing. Do the low-variance features need to be omitted?
Since the numeric features can carry importance when clustering, you can avoid dropping them. Instead, if you find any feature with an almost constant distribution, feel free to drop it.
P.S. Remember to state the modifications you made in your PR. Happy contributing!