What is the efficient way to fill null values in a column with an arbitrary string in a Dataframe? #766

Open
jamalromero opened this issue Apr 13, 2024 · 3 comments

Comments

@jamalromero

As we know, many data sources have missing values. After reading a data source (a CSV file, for example), is there a way to fill the missing entries in the DataFrame with an arbitrary value? For comparison, with a Python pandas DataFrame we can just call dataframe['some_column_name'].fillna('Missing').
Is that possible? Also, is there a forum or user group where we can post questions like this?
Thanks

@haifengl
Owner

There are several algorithms for handling missing values in the smile.feature.imputation package. SimpleImputer may be used to fill missing entries with a fixed value. I would suggest trying the other, more advanced algorithms in the package too.

For simplicity, I will add some methods like fillna to Vector classes.

@haifengl
Owner

Feel free to ask questions by creating tickets.

@haifengl
Owner

I added DataFrame.fillna(), which applies to FloatVector and DoubleVector columns.
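For readers who want to see what a fill operation amounts to, here is a minimal plain-Java sketch. It does not use Smile and is not the implementation of DataFrame.fillna(); the class name FillNaSketch and the helpers fillNaN and fillNull are hypothetical names chosen for illustration. It assumes missing numeric entries are represented as NaN and missing strings as null, which mirrors the pandas call in the question.

```java
import java.util.Arrays;

// Hypothetical illustration only: plain arrays stand in for DataFrame columns.
public class FillNaSketch {

    /** Returns a copy of a numeric column with every NaN replaced by value. */
    static double[] fillNaN(double[] column, double value) {
        double[] filled = Arrays.copyOf(column, column.length);
        for (int i = 0; i < filled.length; i++) {
            // NaN never equals itself, so Double.isNaN is the reliable check.
            if (Double.isNaN(filled[i])) {
                filled[i] = value;
            }
        }
        return filled;
    }

    /** Returns a copy of a string column with every null replaced by value. */
    static String[] fillNull(String[] column, String value) {
        String[] filled = Arrays.copyOf(column, column.length);
        for (int i = 0; i < filled.length; i++) {
            if (filled[i] == null) {
                filled[i] = value;
            }
        }
        return filled;
    }

    public static void main(String[] args) {
        double[] prices = {1.5, Double.NaN, 3.0};
        String[] cities = {"Oslo", null, "Lima"};

        System.out.println(Arrays.toString(fillNaN(prices, 0.0)));       // [1.5, 0.0, 3.0]
        System.out.println(Arrays.toString(fillNull(cities, "Missing"))); // [Oslo, Missing, Lima]
    }
}
```

Both helpers return a copy rather than mutating the input, so the original column stays available for comparison. Whether Smile's fillna() modifies the vectors in place is not stated in this thread.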
