DataFrame memory management utility, del_inactive_dataframes #115
Merged
…h optional cleanup and reporting
Collaborator
Great job; did some minor cleanup and added unit tests; looks great!
lshpaner added a commit that referenced this pull request on Dec 20, 2025:
DataFrame memory management utility, del_inactive_dataframes
DataFrame memory management
The del_inactive_dataframes function is a utility for managing pandas DataFrames in
interactive, cloud, and local Python environments. It helps reduce memory pressure by
identifying inactive DataFrames and optionally deleting them from a given namespace,
while preserving a specified set of active DataFrames.
This function was designed for exploratory and long-running workflows, such as Jupyter
or cloud notebook sessions, where intermediate DataFrames and hidden references
(for example, IPython output cache variables like _14, _15) can accumulate and
contribute to session instability or crashes.
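The idea described above can be illustrated with a minimal sketch. This is not the implementation of del_inactive_dataframes; the helper names find_inactive_dataframes and drop_inactive_dataframes, the keep and dry_run parameters, and the example namespace are all hypothetical and exist only for illustration.

```python
import pandas as pd


def find_inactive_dataframes(namespace, keep):
    """Return sorted names in `namespace` bound to DataFrames not listed in `keep`."""
    keep = set(keep)
    return sorted(
        name
        for name, value in namespace.items()
        if isinstance(value, pd.DataFrame) and name not in keep
    )


def drop_inactive_dataframes(namespace, keep, dry_run=True):
    """List inactive DataFrames; delete them from `namespace` unless dry_run is True."""
    inactive = find_inactive_dataframes(namespace, keep)
    if not dry_run:
        for name in inactive:
            del namespace[name]
    return inactive


# Hypothetical namespace standing in for a notebook's globals().
# "_14" mimics an IPython output cache variable holding a DataFrame.
ns = {
    "df_main": pd.DataFrame({"a": [1, 2, 3]}),
    "df_tmp": pd.DataFrame({"b": [4, 5]}),
    "_14": pd.DataFrame({"c": [6]}),
    "n_rows": 3,
}

listed = drop_inactive_dataframes(ns, keep={"df_main"}, dry_run=True)    # report only
removed = drop_inactive_dataframes(ns, keep={"df_main"}, dry_run=False)  # delete
```

In a notebook session the namespace would typically be globals(). The dry run shows what would be removed before anything is deleted, and non-DataFrame names such as n_rows are never touched.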
At a high level, the function can:
Requirements
Optional dependencies
This functionality is designed to run with minimal dependencies. Optional packages
enable enhanced output and diagnostics but are not required.
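One common way to keep such packages optional is to attempt the import and fall back to the standard library when it fails. The sketch below assumes rich as the pretty-printing package, which this text does not name; the show alias and the report helper are hypothetical.

```python
# Assumption: rich is the optional pretty-printing package.
try:
    from rich import print as show  # enhanced console output when installed
    HAS_RICH = True
except ImportError:
    show = print  # plain built-in print as the fallback
    HAS_RICH = False


def report(lines):
    """Print each line with whichever printer is available; return the line count."""
    for line in lines:
        show(line)
    return len(lines)


count = report(["Removed 2 DataFrames", "Kept 1 DataFrame"])
```

With this pattern the core behavior is identical whether or not the optional package is present; only the appearance of the output differs.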
Pretty console output (optional)
Process memory reporting (optional)
Memory reporting
The function can optionally report memory usage before and after DataFrame deletion.
This is especially useful in long-running or cloud notebook sessions where memory
pressure can build up over time.
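A before-and-after DataFrame total can be approximated with pandas' own memory_usage method. The following is a sketch under stated assumptions, not the function's actual reporting code; dataframe_bytes, total_dataframe_mb, and the sample namespace are hypothetical names.

```python
import pandas as pd


def dataframe_bytes(df):
    """Bytes used by one DataFrame, including its index and object (string) data."""
    return int(df.memory_usage(index=True, deep=True).sum())


def total_dataframe_mb(namespace):
    """Sum the memory of every DataFrame found in `namespace`, in megabytes."""
    total = sum(
        dataframe_bytes(value)
        for value in namespace.values()
        if isinstance(value, pd.DataFrame)
    )
    return total / 1024**2


# Hypothetical namespace with one DataFrame to keep and one to delete.
ns = {
    "keep_me": pd.DataFrame({"x": range(1_000)}),
    "tmp": pd.DataFrame({"y": ["some text"] * 1_000}),
}

before = total_dataframe_mb(ns)
del ns["tmp"]
after = total_dataframe_mb(ns)

print(f"DataFrame memory: {before:.3f} MB -> {after:.3f} MB")
```

Passing deep=True makes pandas inspect object columns such as strings, which gives a fuller estimate than the default shallow count at the cost of a slower scan.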
Default behavior
Optional process-level reporting
What is process RSS?
Process RSS (Resident Set Size) is the amount of physical memory currently allocated to
the running Python process by the operating system. It includes:
RSS represents total process memory, not just DataFrame memory.
How to interpret process RSS
Process RSS is provided as an advisory metric and should be interpreted with care:
For most users, DataFrame memory totals are the primary signal that cleanup succeeded.
Process RSS is best used to observe long-term memory trends rather than immediate
before-and-after changes.
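For readers who want to observe RSS themselves, the sketch below reads it through psutil when that package is installed. Using psutil here is an assumption, since this text does not name the package behind process-level reporting, and process_rss_mb is a hypothetical helper.

```python
import os


def process_rss_mb():
    """Return the current process RSS in megabytes, or None if psutil is missing."""
    try:
        import psutil  # assumption: psutil as the optional dependency
    except ImportError:
        return None
    rss_bytes = psutil.Process(os.getpid()).memory_info().rss
    return rss_bytes / 1024**2


rss = process_rss_mb()
if rss is None:
    print("Process RSS unavailable (psutil not installed)")
else:
    print(f"Process RSS: {rss:.1f} MB")
```

Sampling this value at several points across a session, rather than once before and once after a cleanup, fits the advisory role described above: memory freed by Python is not always returned to the operating system right away, so RSS may stay flat even when DataFrame totals drop.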