df.dropna(how='any')
df['price'].fillna(df['price'].mean())    # fill with the column mean
df['price'].fillna(df['price'].median())  # or fill with the column median
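To make the difference between the two fills concrete, here is a minimal sketch. The data is made up for illustration (only the `price` column name comes from the notes), and the `price_mean` / `price_median` columns are hypothetical names used to keep both results side by side. Note that `fillna` returns a new Series, so the result has to be assigned somewhere to take effect.

```python
import numpy as np
import pandas as pd

# Hypothetical data: 'price' has two missing entries and one large outlier (100.0).
df = pd.DataFrame({"price": [10.0, 12.0, np.nan, 11.0, 100.0, np.nan]})

# fillna returns a new Series; store each result in its own column.
df["price_mean"] = df["price"].fillna(df["price"].mean())
df["price_median"] = df["price"].fillna(df["price"].median())

print(df)
```

With an outlier in the column, the mean fill is pulled toward it while the median fill is not, which is one reason the median is often preferred for skewed data.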
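* Hot-deck imputation: fill in the value taken from a similar object
* K-nearest-neighbor method (K-means clustering): fill in from similar samples found with an unsupervised machine-learning clustering method
* Fitting the missing values
* Regression prediction: build a regression equation on the complete part of the dataset and use it to compute the missing values
* Maximum likelihood estimation
* Multiple imputation
* Random forest
* Dummy variable: define a new binary variable that records whether the feature value is missing
* No handling: some models can cope with missing values on their own, so no preprocessing is needed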
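Two of the approaches listed above, the dummy variable and regression prediction, can be sketched in a few lines. This is only an illustration under stated assumptions: the `area` predictor, the toy values, and the `price_missing` / `price_filled` column names are hypothetical, and a simple straight-line fit with `np.polyfit` stands in for whatever regression model would actually be used.

```python
import numpy as np
import pandas as pd

# Hypothetical data: 'price' depends linearly on 'area'; two prices are missing.
df = pd.DataFrame({
    "area":  [50.0, 60.0, 70.0, 80.0, 90.0, 100.0],
    "price": [100.0, 120.0, np.nan, 160.0, np.nan, 200.0],
})

# Dummy variable: 1 where 'price' was missing, 0 otherwise.
df["price_missing"] = df["price"].isna().astype(int)

# Regression prediction: fit price = slope * area + intercept on complete rows only.
complete = df.dropna(subset=["price"])
slope, intercept = np.polyfit(complete["area"], complete["price"], deg=1)

# Use the fitted line to fill only the entries that were missing.
predicted = slope * df["area"] + intercept
df["price_filled"] = df["price"].fillna(predicted)

print(df)
```

Creating the indicator before filling matters: once the gaps are filled, the information about which rows were originally missing is gone.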
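Missing data
Causes of missing data
Types of missing data
Methods for handling missing data
The simplest and crudest approach
Sacrifices a large amount of data
When the proportion of missing values is large, the remaining data become biased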
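A small sketch of how much data row-wise deletion can cost. The table below is invented for illustration (the `area` and `rooms` columns are hypothetical); each column is only 20% missing, yet dropping every row that contains any missing value removes most of the rows.

```python
import numpy as np
import pandas as pd

# Hypothetical table: each column is missing exactly one of five values.
df = pd.DataFrame({
    "price": [10.0, np.nan, 12.0, 11.0, 13.0],
    "area":  [50.0, 60.0, np.nan, 70.0, 80.0],
    "rooms": [2.0, 3.0, 3.0, np.nan, 4.0],
})

missing_ratio = df.isna().mean()   # share of missing values per column
kept = df.dropna(how="any")        # drop every row with at least one NaN
rows_lost = len(df) - len(kept)

print(missing_ratio)
print(f"rows kept: {len(kept)}, rows lost: {rows_lost}")
```

Checking the per-column missing ratio before calling `dropna` gives a quick sense of whether deletion is affordable or whether one of the filling methods is the safer choice.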