Description
Feature Type
- Adding new functionality to pandas
- Changing existing functionality in pandas
- Removing existing functionality in pandas
Problem Description
I am working on a library that extends the functionality of DataFrames by subclassing DataFrame. It relies heavily on descriptors (such as cached_property), and this is the only issue I have run into. In this example, the user is trying to assign a value to a cached property, but NDFrame.__setattr__ first reads the attribute, which invokes the getter. This can lead to unintended processing or infinite recursion:
    from functools import cached_property
    from pandas import DataFrame

    class Frame(DataFrame):
        @cached_property
        def attr(self):
            print('unintended')

    Frame().attr = ...
Here, the user only wants to set Frame.attr, but "unintended" is printed because the getter runs first.
I am hesitant to override the method, so for now I have been working around the issue by calling object.__setattr__ directly rather than setattr.
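The workaround can be illustrated without pandas. The sketch below uses a hypothetical stand-in class, `EagerBase`, which is an assumption that only mirrors the read-before-write step of NDFrame.__setattr__ and none of its column handling. It records how often the getter runs for a normal assignment versus a direct `object.__setattr__` call.

```python
from functools import cached_property

calls = []  # records every time the getter actually runs


class EagerBase:
    """Hypothetical stand-in for NDFrame: reads the attribute before writing it."""

    def __setattr__(self, name, value):
        try:
            # Mirrors the preemptive lookup; this is what fires the descriptor.
            object.__getattribute__(self, name)
            return object.__setattr__(self, name, value)
        except AttributeError:
            pass
        object.__setattr__(self, name, value)


class Frame(EagerBase):
    @cached_property
    def attr(self):
        calls.append("getter ran")
        return "computed"


via_setattr = Frame()
via_setattr.attr = "assigned"  # goes through EagerBase.__setattr__
calls_after_setattr = len(calls)

via_object = Frame()
object.__setattr__(via_object, "attr", "assigned")  # bypasses the override
calls_after_workaround = len(calls) - calls_after_setattr
```

In this sketch the normal assignment runs the getter once, while the direct `object.__setattr__` call does not run it at all. Both instances end up with the assigned value.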
Feature Description
The user-facing code should be identical. That is, the user should be able to work with cached_property just as if object were the base class. When the attribute is set, the getter should not run, so nothing should be printed.
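For reference, this is the baseline behavior being asked for, shown with a plain class whose base is object. The class name `Plain` and the `getter_runs` list are illustrative assumptions. Because cached_property defines no `__set__`, assignment goes straight into the instance `__dict__`.

```python
from functools import cached_property

getter_runs = []  # stays empty if the getter is never invoked


class Plain:  # implicitly derives from object
    @cached_property
    def attr(self):
        getter_runs.append("ran")
        return "computed"


plain = Plain()
plain.attr = "assigned"  # stored in the instance __dict__; getter is skipped
result = plain.attr      # the instance value shadows the descriptor
```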
Alternative Solutions
The simplest solution is to remove the preemptive lookup block. I have overridden __setattr__ in my DataFrame extension with this function and it seems to work fine.
    @final
    def __setattr__(self, name: str, value) -> None:
        """
        After regular attribute access, try setting the name
        This allows simpler access to columns for interactive use.
        """
        # if this fails, go on to more involved attribute setting
        # (note that this matches __getattr__, above).
        if name in self._internal_names_set:
            object.__setattr__(self, name, value)
        elif name in self._metadata:
            object.__setattr__(self, name, value)
Additional Context
Is it truly necessary to preemptively call object.__getattribute__(self, name)? What does this achieve? Can this safely be done differently?
Here is the code from commit d5963dc authored by @jakevdp that causes this behavior:
    def __setattr__(self, name: str, value) -> None:
        """
        After regular attribute access, try setting the name
        This allows simpler access to columns for interactive use.
        """
        # first try regular attribute access via __getattribute__, so that
        # e.g. ``obj.x`` and ``obj.x = 4`` will always reference/modify
        # the same attribute.
        try:
            object.__getattribute__(self, name)
            return object.__setattr__(self, name, value)
        except AttributeError:
            pass

        # if this fails, go on to more involved attribute setting
        # (note that this matches __getattr__, above).
        if name in self._internal_names_set:
            object.__setattr__(self, name, value)
        elif name in self._metadata:
            object.__setattr__(self, name, value)
        else:
            try:
                existing = getattr(self, name)
                if isinstance(existing, Index):
                    object.__setattr__(self, name, value)
                elif name in self._info_axis:
                    self[name] = value
                else:
                    object.__setattr__(self, name, value)
            except (AttributeError, TypeError):
                if isinstance(self, ABCDataFrame) and (is_list_like(value)):
                    warnings.warn(
                        "Pandas doesn't allow columns to be "
                        "created via a new attribute name - see "
                        "https://pandas.pydata.org/pandas-docs/"
                        "stable/indexing.html#attribute-access",
                        stacklevel=find_stack_level(),
                    )
                object.__setattr__(self, name, value)
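On the question of whether the check could be done differently, one possible direction is to test for an existing attribute without evaluating it, for example with `inspect.getattr_static`, which does not invoke descriptors. The sketch below is not pandas code and has not been tested against pandas itself. `StaticCheckBase` and its `_columns` dict are hypothetical stand-ins for NDFrame and its column store, used only to show the idea.

```python
import inspect
from functools import cached_property

getter_runs = []  # stays empty if the getter is never invoked


class StaticCheckBase:
    """Hypothetical base: checks for an existing attribute without evaluating it."""

    def __init__(self, columns):
        # Bypass the override while the column store does not exist yet.
        object.__setattr__(self, "_columns", dict(columns))

    def __setattr__(self, name, value):
        try:
            # Looks in the instance __dict__ and the MRO without calling
            # descriptors, so a cached_property is found but not run.
            inspect.getattr_static(self, name)
        except AttributeError:
            exists = False
        else:
            exists = True

        if exists or name not in self._columns:
            object.__setattr__(self, name, value)
        else:
            # Stand-in for ``self[name] = value`` on a real frame.
            self._columns[name] = value


class Frame(StaticCheckBase):
    @cached_property
    def attr(self):
        getter_runs.append("ran")
        return "computed"


frame = Frame({"a": 1})
frame.attr = "assigned"  # existing class attribute: set directly, getter not run
frame.a = 2              # no such attribute, but a column: routed to the column store
```

This keeps the stated intent of the original comment (an existing attribute takes precedence over a column of the same name) while leaving the cached property unevaluated. Whether the trade-offs are acceptable for pandas, such as the cost of the static lookup, would need to be checked separately.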