
bring back a non-broken get_memory_usage #33637

Open
williamstein opened this issue Apr 3, 2022 · 12 comments

Comments

@williamstein
Contributor

See #32656

and also my comments below explaining that just reverting #32656 won't help, since get_memory_usage was evidently broken anyway.

CC: @orlitzky @koffie

Component: misc

Issue created by migration from https://trac.sagemath.org/ticket/33637

williamstein added this to the sage-9.7 milestone on Apr 3, 2022
@williamstein
Contributor Author

comment:1

I tried the last few versions of Sage (at least on CoCalc, so Ubuntu Linux), and get_memory_usage returns total nonsense on all of them. Maybe it is counting the huge amount of virtual address space that PARI allocates (I don't know).

~$ sage-9.4
┌────────────────────────────────────────────────────────────────────┐
│ SageMath version 9.4, Release Date: 2021-08-22                     │
│ Using Python 3.9.5. Type "help()" for help.                        │
└────────────────────────────────────────────────────────────────────┘
sage: get_memory_usage()                                                                                                    
16626.8125
sage:                                                                                                                       
Exiting Sage (CPU time 0m0.76s, Wall time 0m4.99s).
sag~$ sage-9.3
┌────────────────────────────────────────────────────────────────────┐
│ SageMath version 9.3, Release Date: 2021-05-09                     │
│ Using Python 3.9.2. Type "help()" for help.                        │
└────────────────────────────────────────────────────────────────────┘
sage: get_memory_usage()                                                                                                    
16609.203125
sage:                                                                                                                       
Exiting Sage (CPU time 0m0.09s, Wall time 0m2.84s).
sage~$ sage-9.
sage-9.1  sage-9.2  sage-9.3  sage-9.4  sage-9.5  
~$ sage-9.1
┌────────────────────────────────────────────────────────────────────┐
│ SageMath version 9.1, Release Date: 2020-05-20                     │
│ Create a "Sage Worksheet" file for the notebook interface.         │
│ Enhanced for CoCalc.                                               │
│ Using Python 3.7.3. Type "help()" for help.                        │
└────────────────────────────────────────────────────────────────────┘
sage: get_memory_usage()
16542.44140625
sage: 

The "smem -ntk" command (which is I think written in Python) does give perfectly useful memory information about the above Sage process:

~$ smem -ntk |grep 32967
33075 2001     grep --color=auto 32967            0   424.0K   456.0K     2.3M 
32967 2001     python3 /ext/sage/9.4/src/b        0   136.5M   171.1M   213.0M 

so this isn't an unsolvable problem.
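
For comparison, here is a minimal Linux-only sketch (not Sage's implementation; the helper name rss_mb is made up) that reads the resident set size of the current process from /proc, which is essentially the kind of number smem is reporting:

def rss_mb():
    """Resident set size of the current process in MB (Linux only)."""
    with open("/proc/self/status") as f:       # same data smem aggregates
        for line in f:
            if line.startswith("VmRSS:"):      # e.g. "VmRSS:   139776 kB"
                return int(line.split()[1]) / 1024.0
    raise RuntimeError("no VmRSS line found in /proc/self/status")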

I'll update this ticket title to include "not broken".

williamstein changed the title from "bring back get_memory_usage" to "bring back a non-broken get_memory_usage" on Apr 3, 2022
@dimpase
Member

dimpase commented Apr 3, 2022

comment:3

smem is Linux-only.

@williamstein
Contributor Author

comment:4

To clarify: when I tested our existing get_memory_usage in sage-9.4, which does not provide useful output, I was testing on Linux. The same goes for the smem test.

@orlitzky
Contributor

orlitzky commented Apr 6, 2022

comment:5

With WSL widely available and Cygwin consistently broken, our effective list of supported platforms is

  • linux
  • macOS
  • other BSD
  • WSL

With that in mind, what number should get_memory_usage() return? I suppose we could have different notions of "memory usage" on different platforms, so long as it's documented.

Even on Linux, however, there are a number of options (see e.g. the /proc/[pid]/smaps section in man proc), each of which can be misleading in its own special way. Sage now makes use of system libraries that can be shared with other processes, and modern operating systems do things like page deduplication that make accounting a headache.
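
To illustrate, here is a rough sketch (assuming a kernel new enough to provide /proc/[pid]/smaps_rollup) that pulls out several of those competing numbers for the current process; Rss counts every resident page, while Pss divides shared pages among the processes sharing them, so the two can differ substantially once shared libraries are involved:

def memory_metrics_kb():
    """Return the smaps_rollup metrics (Rss, Pss, ...) in kB (Linux only)."""
    metrics = {}
    with open("/proc/self/smaps_rollup") as f:
        for line in f:
            parts = line.split()
            if len(parts) == 3 and parts[2] == "kB":   # skip the header line
                metrics[parts[0].rstrip(":")] = int(parts[1])
    return metrics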

There's also the question of subprocesses (like pexpect maxima) that would have to be accounted for separately, if at all.

This doesn't have to be perfect, but I think we should have an idea of what it should do this time around.

@williamstein
Contributor Author

comment:6

Thanks for supporting bringing this back, and for taking such a systematic approach.

For me personally, I can list some reasons that I use get_memory_usage:

  • I allocate something that I think should be "big", e.g., a list of primes, and want to get a rough estimate of just how big it actually is. For this, I would call get_memory_usage before and after the allocation, and I only care about the difference. This is nice because it avoids a lot of the traps you mention above about system libraries, etc.

  • I write a little snippet of code and suspect a blatant memory leak of some sort, e.g., possibly due to some implicit caching in the Sage library. Running get_memory_usage before and after the computation can provide me with a little bit of additional information about my concerns.

  • I want to do a memory-intensive big computation on my computer with 16GB of RAM. I do some smaller cases, extrapolate from data computed using get_memory_usage before and after each run, and decide that yes, this computation will probably fit in 16GB of RAM. This is much easier than trying to understand how all the underlying code and libraries work and correctly predict RAM usage. Again, all I need is deltas. Often the linear algebra algorithms I use (which underlie a lot of modular forms computations) are more worrisome in terms of memory requirements than time requirements (unlike combinatorics, say).

In Sage there is a common construction:

t = cputime()
...
cputime(t)

which outputs the elapsed CPU time; similarly for walltime. We could support and encourage something similar for get_memory_usage, e.g.,

m = get_memory_usage()
...
get_memory_usage(m)
# prints change in memory

Acknowledgement: I copied this "cputime(t)" approach (which gives the delta) from Magma.
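
A minimal sketch of that delta-style interface, again assuming a Linux-only /proc reading purely for illustration (a real implementation would need a cross-platform backend):

def memory_usage(baseline=None):
    """Current RSS in MB, or the change relative to a previous reading."""
    with open("/proc/self/status") as f:               # Linux only
        rss_kb = next(int(line.split()[1])
                      for line in f if line.startswith("VmRSS:"))
    current = rss_kb / 1024.0
    return current if baseline is None else current - baseline

Usage would mirror the cputime(t) pattern:

m = memory_usage()
v = list(primes(10**7))
memory_usage(m)    # approximate MB consumed by the allocation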

@slel
Member

slel commented Apr 6, 2022

comment:7

Suggestion: how about memory_usage instead of get_memory_usage?

To everyone's satisfaction we use cputime and not get_cputime.

@williamstein
Contributor Author

comment:8

For what it is worth, the names we currently use were copied (by me) from Magma, which has:

  • GetMemoryUsage
  • Cputime
  • Walltime

I would provide a link, but the Magma website seems to be down, or at least what Google points at is down.

@jhpalmieri
Member

comment:10

A few thoughts:

  • The IPython magic commands %time and %timeit work well, I think, and at least some uses of cputime could be replaced by those. Could we create an analogous magic command that measures the memory usage of a single command?
  • Could we instead, or in addition, implement something to be used as a Python context manager (a "with" block)?

with memory() as m:
    # do some stuff

and then report the change in memory at the end.
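
Something along these lines could work; this is only a sketch, with a made-up memory() name and a Linux-only /proc reading standing in for a real cross-platform measurement:

from contextlib import contextmanager

def _rss_mb():
    with open("/proc/self/status") as f:    # Linux only
        return next(int(line.split()[1])
                    for line in f if line.startswith("VmRSS:")) / 1024.0

@contextmanager
def memory():
    """Report the change in resident memory (MB) over the body of the block."""
    before = _rss_mb()
    try:
        yield before            # starting measurement, if the caller wants it
    finally:
        print("memory delta: %.1f MB" % (_rss_mb() - before))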

@jhpalmieri
Member

comment:11

See also https://pypi.org/project/memory-profiler/. I don't know how well that functions on different platforms.

@dimpase
Member

dimpase commented Aug 31, 2022

comment:12

Replying to @jhpalmieri:

See also https://pypi.org/project/memory-profiler/. I don't know how well that functions on different platforms.

It depends on psutil, so nothing too interesting.
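
For reference, the psutil calls underneath look roughly like this; rss and vms are available everywhere psutil runs, while the uss/pss numbers from memory_full_info() exist only on some platforms:

import psutil

p = psutil.Process()            # the current process
info = p.memory_info()
print(info.rss / 2**20, "MB resident")
print(info.vms / 2**20, "MB of virtual address space")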

@jhpalmieri
Member

comment:13

Replying to @dimpase:

Replying to @jhpalmieri:

See also https://pypi.org/project/memory-profiler/. I don't know how well that functions on different platforms.

it depends on psutil, so nothing too interesting.

The interface is worth looking at, including the IPython integration via %mprun and %memit. Someone who can articulate the flaws of psutil could also point them out to the authors, and maybe they can come up with something. "Building the car, not reinventing the wheel" and all that.
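
For anyone who wants to try it, the memory-profiler IPython integration is used roughly like this (per its documentation; the per-platform behavior ultimately comes from psutil):

%load_ext memory_profiler
%memit [p for p in range(10**6)]        # peak and increment for one statement

and from plain Python something like:

from memory_profiler import memory_usage
memory_usage(-1, interval=0.1, timeout=1)   # sample the current process's usage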
