Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Provide a cleaner way to collect threads and processes in MultiprocessIterator #4155

Merged
merged 1 commit into from Apr 3, 2018

Conversation

kuenishi
Copy link
Member

@kuenishi kuenishi commented Dec 27, 2017

This is rather a request for comment than a change request - __del__ is often harmful because the timing and order of objects of __del__ is unpredictable especially when those objects have cyclic references. This commit tries to remove __del__ and to make iterators as context manager, available in with statement. Pros and cons in my thought follows on this:

Pros

Cons

  • Changes the usage of iterators; may or may not affect existing codes that do important finalization in __del__. See 'Side effects' below for further discussions.
  • Possibly one more indentation needed to custom loop training codes, like train_mnist_custom_loop.py like this:
    with MultiprocessIterator(train, args.batchsize) as train_iter, \
        MultiprocessIterator(test, args.batchsize,
                             repeat=False, shuffle=False) as test_iter:

        sum_accuracy = 0
        sum_loss = 0

        while train_iter.epoch < args.epoch:
            batch = train_iter.next()
            ...

Side effects of removing __del__:

  • As far as I grepped the whole Chainer master branch there are no usage of __del__ method other than MultiprocessIterator.
  • I tried to see effect of not calling finalize() in garbage collection (via __del__ ), with MNist example, so far I cannot find any unexpected behavior such as remaining shared memory or child processes but saw things cleanly removed by operating system.
  • MultiprocessIterator's bulk_mem is shared (ctypes memory) buffer, which may possibly remain after parent process termination, but as far as I tested with Python 3.6.3 it was tied to a temporary file and was removed right after memory allocation succeeded. See strace result of running sharedctypes.RawArray('b', 1024 * 1024) below:
openat(AT_FDCWD, "/tmp/pymp-p30fkagk/pym-9895-4bozg6h7", O_RDWR|O_CREAT|O_EXCL|O_NOFOLLOW|O_CLOEXEC, 0600) = 3
unlink("/tmp/pymp-p30fkagk/pym-9895-4bozg6h7") = 0
getpid()                                = 9895
fstat(3, {st_mode=S_IFREG|0600, st_size=0, ...}) = 0
ioctl(3, TCGETS, 0x7ffe0072ac40)        = -1 ENOTTY (Inappropriate ioctl for device)
lseek(3, 0, SEEK_CUR)                   = 0
mmap(NULL, 1052672, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7f9c0216e000
write(3, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 1048576) = 1048576
munmap(0x7f9c0216e000, 1052672)         = 0
lseek(3, 0, SEEK_CUR)                   = 1048576
fstat(3, {st_mode=S_IFREG|0600, st_size=1048576, ...}) = 0
fcntl(3, F_DUPFD_CLOEXEC, 0)            = 4
mmap(NULL, 1048576, PROT_READ|PROT_WRITE, MAP_SHARED, 3, 0) = 0x7f9c0216f000
  • Also, process pool owned by MultiprocessIterator seemed to gracefully terminate after last task finished, even when I killed it forcibly with SIGKILL.

@okuta okuta assigned okuta and unassigned bkvogel Feb 15, 2018
@okuta okuta added the cat:enhancement Implementation that does not break interfaces. label Feb 15, 2018
@okuta
Copy link
Member

okuta commented Feb 15, 2018

I think that preserving __del__ is better.
Some test is failed on travis-ci. You can see there.

@okuta okuta added the st:awaiting-author State indicating that response is needed from contributors, often authors of pull request. label Feb 28, 2018
@okuta
Copy link
Member

okuta commented Feb 28, 2018

@kuenishi Do you have time to fix this PR?

@kuenishi kuenishi force-pushed the iterator-with branch 2 times, most recently from 2f1034f to 1c38436 Compare March 19, 2018 07:55
@okuta okuta added this to the v5.0.0a1 milestone Apr 3, 2018
@okuta
Copy link
Member

okuta commented Apr 3, 2018

jenkins, test this please.

@okuta okuta added to-be-backported Pull request that should be backported. cat:feature Implementation that introduces new interfaces. and removed cat:enhancement Implementation that does not break interfaces. st:awaiting-author State indicating that response is needed from contributors, often authors of pull request. labels Apr 3, 2018
@okuta
Copy link
Member

okuta commented Apr 3, 2018

LGTM!

@okuta okuta merged commit 4b9bab9 into chainer:master Apr 3, 2018
@kuenishi kuenishi deleted the iterator-with branch April 3, 2018 07:16
kmaehashi pushed a commit to kmaehashi/chainer that referenced this pull request Apr 17, 2018
Provide a cleaner way to collect threads and processes in MultiprocessIterator
kmaehashi pushed a commit that referenced this pull request May 11, 2018
Provide a cleaner way to collect threads and processes in MultiprocessIterator
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
cat:feature Implementation that introduces new interfaces. to-be-backported Pull request that should be backported.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants