Improve clarity around when SettingWithCopyWarning can be ignored (if ever?) #8730
I [Edit: thought I got] a
The docs seem to imply that I can safely ignore this error.
# passed via reference (will stay)
dfb['c'][dfb.a.str.startswith('o')] = 42
Is it therefore generally true that I can always ignore errors when the command I'm running is of the form:
I use this all the time, and I think it always works. If that's true, it would be great if it was specifically called out in a 'you can ignore this warning when...' section. If that's not true (e.g. it works for scalar x / numeric x but not for series x), it would be great if that were called out too.
Sorry if this seems trivial, but I'm trying to explain this to a colleague who's new to pandas and he's confused, and it looks like even old hands can be confused when confronted with this warning (see #6757).
[Edit: looking at his code again, I suspect the SettingWithCopy warning was from a different line of code - (I wish warnings made it clear where they were from). All the same, it would still be great if the docs could be clear if there are circumstances where you always get a reference rather than a copy]
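One way to find out which line emits a warning is to turn warnings into exceptions so that a full traceback is available. Below is a minimal sketch using only the standard library `warnings` and `traceback` modules; the helper name `run_with_warnings_as_errors` and the `suspect` function are hypothetical stand-ins, not part of pandas. Note that for SettingWithCopyWarning the traceback may point at the line that performs the assignment, which is not necessarily the line that created the copy or view.

```python
import traceback
import warnings


def run_with_warnings_as_errors(func):
    """Run func(); return the formatted traceback if a warning fires, else None."""
    with warnings.catch_warnings():
        # Inside this block every warning is raised as an exception.
        warnings.simplefilter("error")
        try:
            func()
        except Warning:
            return traceback.format_exc()
    return None


def suspect():
    # Hypothetical stand-in for the code that triggers a library warning.
    warnings.warn("stand-in for a library warning", UserWarning)


report = run_with_warnings_as_errors(suspect)
print(report)
```

The printed traceback names the function and line where the warning was issued, which can narrow down the search in a larger code base.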
There are no cases that I am aware of where you should actually ignore this warning. Your code may still work; that's why it's a warning. It has to do with whether what you are modifying is actually a copy or a view. In general, if it's a single-dtyped frame it mostly will work, BUT NOT ALWAYS. And that's the rub. With certain types of indexing it will never work; with others it will. You are really playing with fire.
This is kept on as an actual error in the entire pandas test suite. You should really never do this.
Otherwise you are just holding a gun, maybe shooting yourself when you least expect it.
If you think the docs are unclear, pls submit a pull-request to clarify. It is important from a user's perspective that they are very clear.
That example shows a case where chained indexing DOES WORK, but does not imply that you should use it. (Again, if it's unclear from reading, pls lmk / submit a PR to fix the docs.)
The reason you would ignore the warning is that AFAIK pandas can't tell the difference between
# WRONG
frame[columnone][frame[columntwo]>x] = y  # this warns
# frame is unsafe here

# WORKS (but using .ix is still better)
temp = frame[columnone]
temp[frame[columntwo]>x] = y  # this warns
# frame is unsafe here, but temp is safe
frame = temp  # now frame is safe
no these are treated the same
there is an is_copy flag that is set on the created frames, and it propagates
however they can act differently if the reference variable goes out of scope
that said, in a single dtype case you can get away with this
but my recommendation still holds to always heed the warning and not use potentially unsafe constructs
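For reference, a minimal sketch of the single-step form that avoids chaining altogether: the row mask and the column label are passed to `.loc` in one call, so the assignment is applied to the frame itself. The data, the column names, and the values of `x` and `y` are made up for illustration.

```python
import pandas as pd

# Hypothetical data; column names mirror the placeholders used above.
frame = pd.DataFrame(
    {"columnone": [1, 2, 3, 4], "columntwo": [10, 20, 30, 40]}
)
x, y = 15, 0

# One indexing call: rows where columntwo > x, column "columnone".
frame.loc[frame["columntwo"] > x, "columnone"] = y

result = frame["columnone"].tolist()
print(result)
```

Because there is no intermediate object, there is no question of whether a copy or a view was modified.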
Following up on this, I just ran:
Isn't this what I'm supposed to do? (irrespective of the warning - it does appear to have done the right thing on my ~50k row dataframe)
(Once I can write an example of how to do this in a way that always works, I will submit a PR as this is super unclear to me and I bet others too)
Sorry for the lack of info before - assuming my use of
Answer - my code looks like this:
dataframe=loadDataFrame()
dataframe=dataframe[colnames]  # don't need all the columns
... code code code...
dataframe.loc[dataframe[colname]==colvalue, newcolname]=1  # triggers warning
Replacing the offending line with
dataframe=dataframe.drop([col for col in dataframe.columns if col not in colnames], axis=1)
means I don't get the warning
So, seems like the documentation 'fix' in this case (or rather helpful pointer) is that the warning can be triggered by an action that takes place some way away from the line that raises the warning - thanks for your help!
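Another way to make the intent explicit, sketched here as an assumption rather than as the thread's answer, is to take a `.copy()` of the column subset before assigning into it. The frame contents and the names `original`, `subset`, and `flag` are hypothetical; the check matches the warning by class name so it does not depend on where a given pandas version exposes the class.

```python
import warnings

import pandas as pd

# Hypothetical stand-in for the loaded data.
original = pd.DataFrame(
    {"a": [1, 2, 3], "b": ["x", "y", "x"], "unused": [0.1, 0.2, 0.3]}
)
colnames = ["a", "b"]

# Explicit copy: subset is an independent object, not tied to original.
subset = original[colnames].copy()

with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    # Add a new column on the rows that match the condition.
    subset.loc[subset["b"] == "x", "flag"] = 1

# Collect any SettingWithCopyWarning that was emitted.
copy_warnings = [
    w for w in caught if w.category.__name__ == "SettingWithCopyWarning"
]
print(len(copy_warnings), list(original.columns))
```

The original frame is left untouched, and the new column exists only on the copy.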
well the warning is exactly correct.
you are taking a 'view' of the dataframe (e.g. the
when you are setting a value via loc you are then effectively setting BOTH the original and the new one. This is what the error is guarding against, namely propagation of the view.
simply enough to
This actually accomplishes quite a bit: it provides you with a new smaller object, and the old one gets cleaned up (and no copy warnings),
is the proper idiom here
or you don't even need that at all
this ONLY sets that 1 column so the rest don't matter
Further you can also
One of my users has code similar to this:
And gets the warning. However, in the code base, lines 2 and 3 are quite far removed. So it's almost impossible to realize the warning stems from line 2, (because the warning happens on execution of line 3, far away). I don't know if anything can be done about this, but it can be a head scratcher to track down the source of the warnings.
(Or, even why there's a warning here, without understanding the subtle semantics of copy vs view)
The reason the above triggers the warning is that the check meant for the case below is triggered. Basically you reindex a frame to another frame, which 'remembers' that this happened (via the
What you are doing in your example and what I am showing are different, e.g. you are adding a column, while the above is a masked indexing on the rows. But the only way to detect this is to set
Now I can turn off this case, so what you are doing will not trigger the warning, but my example above will no longer warn.
I suspect that what you are doing is much more common, so this would eliminate this false positive, at the cost of NOT warning when it should for one type of chained indexing.
As you can guess detecting chained indexing is actually quite complex.