SettingWithCopyWarning can appear when using pandas.Series.map #6

@mmiyahara

from sklearn.model_selection import train_test_split
import pandas as pd

# Titanic dataset
data = pd.read_csv('../data/train.csv')
df, df_test = train_test_split(data, test_size = 0.2, random_state = 1)
df['Male'] = df['Sex'].map({'female': 0, 'male': 1})
<ipython-input-9-43fe6ec5c3c0>:7: SettingWithCopyWarning: 
A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
  df['Male'] = df['Sex'].map({'female': 0, 'male': 1})

The likely cause is as follows.

  • df is derived from data and still refers back to it.
  • For df['Male'] = df['Sex'].map({'female': 0, 'male': 1}), pandas cannot tell whether
    the intent is to add the Male column to df or to data.
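
The ambiguity described above can be reproduced without the Titanic CSV. The following is only a minimal sketch: the four-row `data` frame is a hypothetical stand-in for the real data, and `data.iloc[[0, 2, 3]]` is assumed to approximate the row subset that `train_test_split` returns. Whether a warning is recorded depends on the pandas version, so the sketch prints whatever was caught rather than assuming a result.

```python
import warnings

import pandas as pd

# Hypothetical stand-in for the Titanic data; only the 'Sex' column matters here.
data = pd.DataFrame({"Sex": ["male", "female", "female", "male"]})

# Take a row subset of data (an assumption about what train_test_split hands back).
df = data.iloc[[0, 2, 3]]

with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    # Same assignment as in the report: add a column to the subset.
    df["Male"] = df["Sex"].map({"female": 0, "male": 1})

# Names of any warnings recorded during the assignment (version dependent).
print([w.category.__name__ for w in caught])

# The new column ends up on df only; data keeps its original columns.
print(list(df.columns))
print(list(data.columns))
```

In this sketch the assignment lands on `df` and `data` is left unchanged, so the warning (when it appears) is about the unclear intent rather than about lost data.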

Writing it as follows made the warning go away.

from sklearn.model_selection import train_test_split
import pandas as pd

data = pd.read_csv('../data/train.csv')

df, df_test = train_test_split(data, test_size = 0.2, random_state = 1)
df = df.copy()
df['Male'] = df['Sex'].map({'female': 0, 'male': 1})
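
As a rough check of the fix above, the sketch below records warnings while applying the same `.copy()` pattern to a small hypothetical frame (not the Titanic data), and counts any `SettingWithCopyWarning`. It also shows `DataFrame.assign` as one possible alternative that returns a new DataFrame instead of assigning in place; that variant is not part of the original report.

```python
import warnings

import pandas as pd

# Hypothetical stand-in for the Titanic data.
data = pd.DataFrame({"Sex": ["male", "female", "female", "male"]})
mapping = {"female": 0, "male": 1}

with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")

    # Fix from the report: make df an independent DataFrame before assigning.
    df = data.iloc[[0, 2, 3]].copy()
    df["Male"] = df["Sex"].map(mapping)

    # Alternative sketch: assign() builds and returns a new DataFrame.
    df_alt = data.iloc[[0, 2, 3]].assign(Male=lambda d: d["Sex"].map(mapping))

# Keep only the warning this issue is about.
swc = [w for w in caught if w.category.__name__ == "SettingWithCopyWarning"]
print(len(swc))
```

Both variants leave `data` untouched; the explicit `.copy()` simply makes it clear to pandas (and to readers) that `df` is meant to be modified on its own.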
