# Understanding Pandas' "SettingWithCopyWarning"

In this notebook we will talk about a common Pandas warning that can be confusing. We explain it here so that you not only understand the warning but also avoid the mistake it is trying to warn you about.


# How to use this Notebook

The best way to use this notebook is to follow along with the lecture and then apply what you learn to your own data files, or (if you do not have any data of your own) to practice using these functions and methods on the provided data. A little practice goes a long way towards understanding and retaining! It would be easy to just skim this notebook, but you will learn more by doing!

# A Motivating Example:

In [1]:
# In this cell we import pandas and load the datafile.
import pandas as pd
import os

filepath = os.path.join(os.getcwd(), 'data', 'ShiftManagerApp_LaborSheet.csv')
labor_sheet_data = pd.read_csv(filepath, parse_dates=[['Date', 'Ending_Hour'], 'Timestamp'])
labor_sheet_data.head(1)

Unnamed: 0,Date_Ending_Hour,Store_ID,Manager,Projected_Sales,Sales,DT_TTL,Car_Count,KVS_Total,Scheduled_People,Actual_People,Reason_for_Labor_Diff,Reason_for_High_TTLs,Manager_Entering_Data,Timestamp,OEPE,Park_Percentage
0,2017-01-23 08:00:00,4462,JillianA,540.0,420.0,170.0,,100.0,,,,,,2017-01-23 09:52:14,,


In [2]:
store_4462 = labor_sheet_data.loc[labor_sheet_data["Store_ID"]==4462, :]
store_4462["Sales"] = store_4462["Sales"] + 100

A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

See the caveats in the documentation: http://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
  


# Why Did We Get This Warning?

It is unclear whether `store_4462` is a _view_ of `labor_sheet_data` or a _copy_. We can't know this from the code alone, because it is determined under the hood by the memory layout of the data. Pandas does not know whether we wanted to update the original values in `labor_sheet_data` or whether we actually wanted to create a copy in `store_4462` and update the values in the copy.
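One way to make the view-versus-copy distinction concrete is to ask NumPy whether two columns share memory. The sketch below is only an illustration: `toy` is a small made-up DataFrame (not the labor sheet), and it assumes float columns, for which `Series.to_numpy()` does not need to make its own copy.

```python
import numpy as np
import pandas as pd

# Hypothetical toy data standing in for the labor sheet.
toy = pd.DataFrame({"Store_ID": [4462, 4462, 1001],
                    "Sales": [420.0, 510.0, 300.0]})

# Select rows with a boolean mask, as in the motivating example.
subset = toy.loc[toy["Store_ID"] == 4462, :]

# Compare the memory behind the original column and the selected column.
shares = np.shares_memory(toy["Sales"].to_numpy(), subset["Sales"].to_numpy())
print(shares)  # False: the filtered rows live in their own memory
```

Whether other kinds of selection share memory with the original depends on the pandas version and on how the data is laid out, which is why it is safer not to rely on either behavior.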

### The Motivating Example Is Equivalent to What We See Below:

See how we use `.loc[]` and then use `[]` to access Sales. Using `[]` after another indexing method is called chained indexing, and it should be avoided in pandas for the reason explained above! Note that the issue is not the combination of `loc` and `[]` as such; it is the extra indexing step applied to the result of the first one.

In [3]:
labor_sheet_data.loc[labor_sheet_data["Store_ID"]==4462, :]["Sales"] = \
    labor_sheet_data.loc[labor_sheet_data["Store_ID"]==4462, :]["Sales"] + 100

A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

See the caveats in the documentation: http://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
  """Entry point for launching an IPython kernel.


# What You Should Do:

## If You Want a Copy of a DataFrame Because You Plan on Updating the Data Independently of the Original, Be Explicit and Use the `.copy()` Method

In [4]:
store_4462 = labor_sheet_data.loc[labor_sheet_data["Store_ID"]==4462, :].copy()
store_4462["Sales"] = store_4462["Sales"] + 100

## If You Want to Update the Values in the Original DataFrame, Put All Indexing into One `.loc[]` Call:

In [5]:
labor_sheet_data.loc[labor_sheet_data["Store_ID"]==4462, "Sales"] = \
    labor_sheet_data.loc[labor_sheet_data["Store_ID"]==4462, "Sales"] + 100

# Lesson Summary:
In this lesson you learned:
* What the SettingWithCopyWarning is.
* How to avoid this warning and also avoid data manipulation mistakes.

## Questions or Comments About This Notebook?
Feel free to contact me via my LinkedIn: https://www.linkedin.com/in/william-j-henry <br>
You can also email me at will@henryanalytics.com <br>