Feature Type
Adding new functionality to pandas
Changing existing functionality in pandas
Removing existing functionality in pandas
Problem Description
The memory_usage() method can be called to get information about the memory used by some of the Pandas objects.
However, in some cases the cached data aren't included.
For example, MultiIndex.memory_usage() includes memory used by:
levels
codes
names
_engine (if initialised)
but it doesn't consider:
_engine.values (it could be included in engine.sizeof)
values (cached in _values)
dtypes and a few other negligible cached properties
Example code (using the current main branch using this code):
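The following is a minimal sketch of the behaviour described above, not the original example. It assumes a recent pandas release; the index shape and the level names "a" and "b" are arbitrary choices made for illustration. It reads memory_usage() before and after accessing values, so the two reported figures can be compared with the size of the cached array.

```python
import pandas as pd

# Arbitrary illustrative index: 1000 x 100 combinations.
mi = pd.MultiIndex.from_product([range(1000), range(100)], names=["a", "b"])

reported_before = mi.memory_usage()

# Accessing .values materialises an object array of tuples; per the
# description above, pandas keeps it cached in _values.
values = mi.values
reported_after = mi.memory_usage()

print("reported before accessing .values:", reported_before)
print("reported after accessing .values: ", reported_after)
# nbytes counts only the array of object pointers, not the tuples themselves.
print("nbytes of the values array:       ", values.nbytes)
```

If the two reported figures are equal, the cached array is not being counted, which is the gap this issue describes.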
Feature Description
memory_usage() could accept an optional bool parameter cache with default value False.
If True, it should also include the cached data.
If False, it should keep the existing behaviour (although including the engine data might not be the most intuitive thing after adding the cache parameter).
Alternative Solutions
Alternatively, the signature of memory_usage() can remain the same, but the result should include the cached data.
However, it may be surprising for the user if the result changes, depending on what properties have been called (but this is already happening for the engine, and it can be documented).
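To make the proposed cache flag concrete, here is a rough sketch of how the extra accounting might look from the outside. The helper memory_usage_with_cache is hypothetical and is not a pandas API. It also assumes that cached properties are stored in the private _cache dict under the key "_values", which is an implementation detail that may change between versions.

```python
import pandas as pd


def memory_usage_with_cache(index: pd.MultiIndex, cache: bool = False,
                            deep: bool = False) -> int:
    """Hypothetical helper (not part of pandas) sketching a `cache` flag."""
    total = index.memory_usage(deep=deep)
    if cache:
        # ASSUMPTION: `_cache` is a private dict holding cached properties,
        # and the materialised tuples live under the "_values" key.
        cached_values = getattr(index, "_cache", {}).get("_values")
        if cached_values is not None:
            # Shallow size only: the pointer array, not the tuple objects.
            total += cached_values.nbytes
    return int(total)


mi = pd.MultiIndex.from_product([range(1000), range(100)])
_ = mi.values  # populate the cache

without_cache = memory_usage_with_cache(mi, cache=False)
with_cache = memory_usage_with_cache(mi, cache=True)
print(without_cache, with_cache)
```

With cache=False the helper returns exactly what memory_usage() reports today, so existing callers would see no change. The sketch only counts the values array; _engine.values and the smaller cached properties listed in the problem description would need similar handling.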
Additional Context
If memory_usage is used to inspect the memory usage of Pandas objects, it would be better to return a value as close as possible to the actually used memory.