Is multi-processing supported? #35

Closed
semaphore-egg opened this issue Apr 21, 2022 · 6 comments
@semaphore-egg

Thank you guys for this amazing beautiful cool tool!

Feature Request

I am dealing with some memory problems related to pytorch dataloader for several days. And just tried memray with a simple script below. I found that in the live mode, the information of main process is reported but all processes are detected as threads and no information is reported.

from torch.utils.data import Dataset, DataLoader
import numpy as np
import torch
import sys

class DataIter(Dataset):
    def __init__(self):
        n = int(2.4e7)
        self.data = [x for x in range(n)]

    def __len__(self):
        return len(self.data)

    def __getitem__(self, idx):
        data = self.data[idx]
        data = np.array([data], dtype=np.int64)
        return torch.tensor(data)


train_data = DataIter()
train_loader = DataLoader(train_data, batch_size=300,
                          shuffle=True,
                          drop_last=True,
                          pin_memory=False,
                          num_workers=12)

for i, item in enumerate(train_loader):
    if i % 1000 == 0:
        print(i, end='\t', flush=True)

Screenshot of the main process: [Screenshot from 2022-04-21 22-56-22]

Screenshot of a worker process: [Screenshot from 2022-04-21 22-50-01]

The following command was used: memray run --live simple_multi_worker.py

Is there a way to observe multi-processing information?

@godlygeek
Contributor

Is there a way to observe multi-processing information?

Not with live mode. We don't currently have a way for one UI to ingest data from multiple processes.

What we do have is the --follow-fork option for memray run. That will cause it to write one output file per child process, and you can then inspect each of those output files individually, for instance by using memray flamegraph to generate a flame graph for each that you can open up in a browser.

This will only work if it's forking and not exec'ing - meaning that it will be able to gather meaningful data if you use a multiprocessing.Pool, but not if you use a subprocess.run() call. As far as I can tell at a quick glance, though, DataLoader seems to be using multiprocessing, and so this ought to work.

--follow-fork mode is pretty new, so there may still be some kinks to work out - try it and let me know if you hit any issues.
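Roughly, the workflow looks like this (a sketch only; the capture file names below are illustrative, since memray picks them based on the script name and the PIDs of the processes involved):

# one capture file for the main process, plus one per forked worker
memray run --follow-fork simple_multi_worker.py

# then render each capture file separately (file names here are placeholders)
memray flamegraph memray-simple_multi_worker.py.<pid>.bin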

godlygeek added the question label on Apr 21, 2022
@rossjp

rossjp commented Apr 21, 2022

Are there any plans to create/extend a reporter to accept and integrate data from multiple capture files? I'm wrapping a multi-worker gunicorn process with memray and I end up with a capture file per worker. Inspecting them separately is useful, but inspecting them all merged together would also provide some insights.

@godlygeek
Contributor

There aren't any such plans. When we discussed amongst ourselves, the consensus was that trying to analyze information from multiple processes at the same time was likely to cause more confusion than anything else, and we had trouble coming up with any cases where seeing, say, multiple workers at once would tell you anything that you wouldn't be able to identify by analyzing them individually.

In fact, for the gunicorn case, I would think that what would make the most sense is just to drop the number of workers down to 1 while you're investigating it, so that all requests are reaching the same worker instance.
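Something along these lines, for instance (a sketch only; myapp.wsgi:application, the output path, and the exact way you launch gunicorn are placeholders for whatever your deployment actually uses):

# run gunicorn with a single worker under memray and write one capture file
memray run --output gunicorn-worker.bin $(which gunicorn) --workers 1 myapp.wsgi:application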

But you might be seeing something we didn't - can you describe a case where there's some interesting feature of the memory usage of a pool of worker processes that would be difficult to identify by looking at their allocations individually, but easy to identify by looking at their allocations in aggregate?

@semaphore-egg
Author

semaphore-egg commented Apr 22, 2022

Great, --follow-fork works! Here is another question.

The script I provided was meant to trace the copy-on-write caused by accessing Python objects from a forked process. Accessing a Python object from a forked process changes its reference count and thus triggers page duplication. It seems that memray does not report memory consumption related to COW.

So does that mean we cannot use memray to trace COW?

@pablogsal
Member

So does that mean we cannot use memray to trace COW?

Memray traces two things:

  • Requests for allocations to the system allocators: these include malloc, mmap, calloc, realloc, valloc... and a bunch more.
  • The resident size, sampled every few milliseconds directly from the kernel.

When a process is forked, the memory maps are shared between the parent and the child until a write happens, as you indicate. When the write happens, it triggers an implicit interrupt (a page fault) generated directly by the MMU, which in turn causes the kernel to update the page table with new (writable) pages, decrement the number of references, and perform the write.

This means that all of this happens in kernel space, and therefore memray cannot really "see" anything here. The only thing memray will be able to see is that the resident size is increased by the kernel when that happens. We don't really have a way to know what operation caused this, as it is deeply underneath us and would require instrumentation or similar.

So the answer is, sadly, that it is very unlikely you can use most common profilers to properly trace COW unless they allow instrumentation (like valgrind does).
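
For illustration, here is a minimal, Linux-only sketch of the effect (this is not memray output; it reads /proc/self/smaps_rollup, which needs a reasonably recent kernel, and the exact numbers will vary by system):

import os

def private_dirty_mb():
    # Memory this process has had to copy for itself (Linux >= 4.14).
    with open("/proc/self/smaps_rollup") as f:
        for line in f:
            if line.startswith("Private_Dirty:"):
                return int(line.split()[1]) / 1024  # value is reported in kB
    return 0.0

# A large list of distinct int objects, allocated before the fork.
data = list(range(10_000_000))

pid = os.fork()
if pid == 0:
    before = private_dirty_mb()
    # The child only *reads* the list, but iterating it increments each
    # element's refcount, dirtying the shared pages and forcing the kernel
    # to copy them into the child.
    total = sum(data)
    after = private_dirty_mb()
    print(f"child private dirty memory: {before:.0f} MiB -> {after:.0f} MiB (sum={total})")
    os._exit(0)
else:
    os.waitpid(pid, 0)

The child makes no new Python-level allocations, so allocator hooks see nothing, yet its private (copied) memory grows by roughly the size of the list.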

@semaphore-egg
Author

This is pretty reasonable. Thank you guys so much!
