# 26. How do I avoid a SettingWithCopyWarning in pandas?

In [1]:
import pandas as pd
movies = pd.read_csv('data/imdbratings.csv')
movies.head()

Unnamed: 0,star_rating,title,content_rating,genre,duration,actors_list
0,9.3,The Shawshank Redemption,R,Crime,142,"'Tim Robbins', 'Morgan Freeman', 'Bob Gunton'"
1,9.2,The Godfather,R,Crime,175,"'Marlon Brando', 'Al Pacino', 'James Caan'"
2,9.1,The Godfather: Part II,R,Crime,200,"'Al Pacino', 'Robert De Niro', 'Robert Duvall'"
3,9.0,The Dark Knight,PG-13,Action,152,"'Christian Bale', 'Heath Ledger', 'Aaron Eckhart'"
4,8.9,Pulp Fiction,R,Crime,154,"'John Travolta', 'Uma Thurman', 'Samuel L. Jackson'"


In [2]:
# This is a common way to count the missing values in a Series.
movies['content_rating'].isnull().sum()

3

In [3]:
# This is a common way to show the rows that contain missing values.
movies[movies['content_rating'].isnull()]

Unnamed: 0,star_rating,title,content_rating,genre,duration,actors_list
187,8.2,Butch Cassidy and the Sundance Kid,,Biography,110,"'Paul Newman', 'Robert Redford', 'Katharine Ross'"
649,7.7,Where Eagles Dare,,Action,158,"'Richard Burton', 'Clint Eastwood', 'Mary Ure'"
936,7.4,True Grit,,Adventure,128,"'John Wayne', 'Kim Darby', 'Glen Campbell'"


In [4]:
# The output of 'value_counts()' shows 'NOT RATED' entries, which should be represented as missing values.
movies['content_rating'].value_counts()

R            460
PG-13        189
PG           123
NOT RATED     65
APPROVED      47
UNRATED       38
G             32
NC-17          7
PASSED         7
X              4
GP             3
TV-MA          1
Name: content_rating, dtype: int64

In [5]:
import numpy as np
# Chained indexing: the assignment acts on a temporary copy and triggers a SettingWithCopyWarning.
movies[movies['content_rating'] == 'NOT RATED'].content_rating = np.nan

A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

See the caveats in the documentation: http://pandas.pydata.org/pandas-docs/stable/indexing.html#indexing-view-versus-copy
  self[name] = value


In [6]:
# We check the number of missing values again and find that it is unchanged
# (the assignment was not applied to 'movies').
movies['content_rating'].isnull().sum()

3
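
The count is unchanged because the statement in In [5] is really two operations: boolean indexing first builds a new DataFrame, and the assignment then lands on that temporary object. The sketch below separates the two steps on a small made-up DataFrame (`demo` and `not_rated` are hypothetical names, not part of the original notebook). It uses bracket assignment rather than attribute assignment, and it hides the warning only so the demonstration runs quietly.

```python
import warnings

import numpy as np
import pandas as pd

# Hypothetical miniature stand-in for the 'movies' DataFrame.
demo = pd.DataFrame({
    'title': ['A', 'B', 'C'],
    'content_rating': ['R', 'NOT RATED', 'PG'],
})

# Step 1: boolean indexing builds a new DataFrame holding copies of the matching rows.
not_rated = demo[demo['content_rating'] == 'NOT RATED']

# Step 2: the assignment targets that new object, so 'demo' is never touched.
with warnings.catch_warnings():
    # Hide the SettingWithCopyWarning on pandas versions that emit it.
    warnings.simplefilter('ignore')
    not_rated['content_rating'] = np.nan

missing_in_original = int(demo['content_rating'].isnull().sum())
missing_in_subset = int(not_rated['content_rating'].isnull().sum())
print(missing_in_original, missing_in_subset)
```

Only the intermediate object gains a missing value; the original keeps every rating it started with.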

In [7]:
# The proper way to modify column values is to use 'loc'.
movies.loc[movies['content_rating'] == 'NOT RATED', 'content_rating'] = np.nan

In [8]:
# We check the number of missing values again and find that the change
# has been applied without a warning.
movies['content_rating'].isnull().sum()

68
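
The same `loc` pattern can be wrapped in a small helper when several labels should be treated as missing. This is only a sketch under assumptions: `mark_as_missing` and the `demo` DataFrame are hypothetical names introduced for illustration, and `isin` is swapped in for the `==` comparison used above so that more than one label can be passed.

```python
import numpy as np
import pandas as pd


def mark_as_missing(df, column, labels):
    """Set every cell of `column` whose value is in `labels` to NaN, in place."""
    mask = df[column].isin(labels)
    # Row selector and column label go into ONE .loc call, so the
    # assignment is applied to `df` itself rather than to a temporary copy.
    df.loc[mask, column] = np.nan
    # Report how many cells were changed.
    return int(mask.sum())


# Hypothetical miniature stand-in for the 'movies' DataFrame.
demo = pd.DataFrame({'content_rating': ['R', 'NOT RATED', 'PG', 'NOT RATED']})
changed = mark_as_missing(demo, 'content_rating', ['NOT RATED'])
missing_after = int(demo['content_rating'].isnull().sum())
print(changed, missing_after)
```

Because the row mask and the column label are passed together, pandas performs a single assignment on `demo` and has no intermediate object to warn about.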

In [9]:
top_movies = movies.loc[movies['star_rating'] >= 9, :]
top_movies

Unnamed: 0,star_rating,title,content_rating,genre,duration,actors_list
0,9.3,The Shawshank Redemption,R,Crime,142,"'Tim Robbins', 'Morgan Freeman', 'Bob Gunton'"
1,9.2,The Godfather,R,Crime,175,"'Marlon Brando', 'Al Pacino', 'James Caan'"
2,9.1,The Godfather: Part II,R,Crime,200,"'Al Pacino', 'Robert De Niro', 'Robert Duvall'"
3,9.0,The Dark Knight,PG-13,Action,152,"'Christian Bale', 'Heath Ledger', 'Aaron Eckhart'"


In [10]:
# Here we try to modify a movie's duration. We get a warning,
# but the assignment still works.
top_movies.loc[0, 'duration'] = 150
top_movies

A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

See the caveats in the documentation: http://pandas.pydata.org/pandas-docs/stable/indexing.html#indexing-view-versus-copy
  self.obj[item] = s


Unnamed: 0,star_rating,title,content_rating,genre,duration,actors_list
0,9.3,The Shawshank Redemption,R,Crime,150,"'Tim Robbins', 'Morgan Freeman', 'Bob Gunton'"
1,9.2,The Godfather,R,Crime,175,"'Marlon Brando', 'Al Pacino', 'James Caan'"
2,9.1,The Godfather: Part II,R,Crime,200,"'Al Pacino', 'Robert De Niro', 'Robert Duvall'"
3,9.0,The Dark Knight,PG-13,Action,152,"'Christian Bale', 'Heath Ledger', 'Aaron Eckhart'"


In [11]:
# The proper way to avoid the warning is to make an explicit copy with 'copy()',
# so pandas knows that 'top_movies' is an independent DataFrame.
top_movies = movies.loc[movies['star_rating'] >= 9, :].copy()

In [12]:
top_movies.loc[0, 'duration'] = 150
top_movies

Unnamed: 0,star_rating,title,content_rating,genre,duration,actors_list
0,9.3,The Shawshank Redemption,R,Crime,150,"'Tim Robbins', 'Morgan Freeman', 'Bob Gunton'"
1,9.2,The Godfather,R,Crime,175,"'Marlon Brando', 'Al Pacino', 'James Caan'"
2,9.1,The Godfather: Part II,R,Crime,200,"'Al Pacino', 'Robert De Niro', 'Robert Duvall'"
3,9.0,The Dark Knight,PG-13,Action,152,"'Christian Bale', 'Heath Ledger', 'Aaron Eckhart'"
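
One consequence of `copy()` is worth checking: the copy is independent, so edits to it are not written back to the DataFrame it came from. A minimal sketch, assuming a small made-up DataFrame (`movies_demo` and `top_demo` are hypothetical names, not the real dataset):

```python
import pandas as pd

# Hypothetical miniature stand-in for the 'movies' DataFrame.
movies_demo = pd.DataFrame({
    'title': ['A', 'B', 'C'],
    'star_rating': [9.3, 9.2, 8.0],
    'duration': [142, 175, 120],
})

# An explicit copy tells pandas that the subset is a separate DataFrame.
top_demo = movies_demo.loc[movies_demo['star_rating'] >= 9, :].copy()

# This assignment changes the copy only and raises no warning.
top_demo.loc[0, 'duration'] = 150

print(top_demo.loc[0, 'duration'])
print(movies_demo.loc[0, 'duration'])
```

If the change is meant to reach the original DataFrame as well, it has to be made there directly, for example with `loc` on the original.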
