This repository has been archived by the owner on Mar 5, 2020. It is now read-only.

Cannot receive on channel after restarting. Bug? #14

Open
danielniccoli opened this issue Dec 6, 2016 · 4 comments

Comments

@danielniccoli
Contributor

danielniccoli commented Dec 6, 2016

I have an issue and I am not sure if that is by design. When my program restarts it does not receive any more messages on a channel.

I prepared an example, which you can find further down. At the 5th iteration I simulate a restart of the program by simply doing this:

channel_layer_receive = None
channel_layer_receive = asgi.IPCChannelLayer(prefix="my_prefix")

After that the script keeps printing (None, None) although it is still sending on the other channel layer.

Is that by design or a bug?

Example

import asgi_ipc as asgi

channel_layer_receive = asgi.IPCChannelLayer(prefix="my_prefix")
channel_layer_send = asgi.IPCChannelLayer(prefix="my_prefix")

# Send 10 messages; after the 5th, simulate a restart of the receiver.
for i in range(1, 11):
    msg = "Message %s" % i
    try:
        channel_layer_send.send("my_channel", {"text": msg})
        print("Sending %s" % msg)
    except asgi.BaseChannelLayer.ChannelFull:
        print("Dropped %s" % msg)

    print(channel_layer_receive.receive(["my_channel"]))

    if i == 5:
        # Drop the old layer and create a fresh one with the same prefix.
        channel_layer_receive = None
        channel_layer_receive = asgi.IPCChannelLayer(prefix="my_prefix")

print("Done!")

Output

Sending Message 1
('my_channel', {'text': 'Message 1'})
Sending Message 2
('my_channel', {'text': 'Message 2'})
Sending Message 3
('my_channel', {'text': 'Message 3'})
Sending Message 4
('my_channel', {'text': 'Message 4'})
Sending Message 5
('my_channel', {'text': 'Message 5'})
Sending Message 6
(None, None)
Sending Message 7
(None, None)
Sending Message 8
(None, None)
Sending Message 9
(None, None)
Sending Message 10
(None, None)
Done!
Exception ignored in: <bound method MemoryDict.__del__ of <asgi_ipc.MemoryDict object at 0x7f59a875c390>>
Traceback (most recent call last):
  File "/usr/local/lib/python3.4/dist-packages/asgi_ipc.py", line 311, in __del__
posix_ipc.ExistentialError: No shared memory exists with the specified name
Exception ignored in: <bound method MemoryDict.__del__ of <asgi_ipc.MemoryDict object at 0x7f59a88a6748>>
Traceback (most recent call last):
  File "/usr/local/lib/python3.4/dist-packages/asgi_ipc.py", line 311, in __del__
posix_ipc.ExistentialError: No shared memory exists with the specified name
@andrewgodwin
Member

andrewgodwin commented Dec 6, 2016

This is not by design, so it's probably a bug - while technically channel layers are allowed to drop messages, dropping 70 left in the queue is a bit much.

I would recommend using the Redis channel layer if you want something more reliable, it's much more proven.

@danielniccoli
Contributor Author

danielniccoli commented Dec 6, 2016

I only have a small project on one machine. Redis would be overhead I'd like to avoid. Thanks for the hint, though!

Oh, and I was not dropping 70 messages. I edited/condensed the code to make it more obvious what is happening. I also get an exception at the end, and I am not sure why. When I step through the code line by line I don't get one. Maybe an issue with execution speed or something?

@danielniccoli
Contributor Author

danielniccoli commented Dec 7, 2016

The issue is caused by self.shm.unlink().

unlink() marks the shared memory for destruction once all processes have unmapped it.
Source: http://semanchuk.com/philip/posix_ipc/

I am not exactly sure how this works here, because within a single process we have two SharedMemory objects that map the same shared memory. However, if I unlink() one of the objects, the shared memory name gets destroyed even though the second object still has it mapped.

What I observe matches the following, except for the part that says "after the last shm_unlink()":

"Even if the object continues to exist after the last shm_unlink(), reuse of the name shall subsequently cause shm_open() to behave as if no shared memory object of this name exists (that is, shm_open() will fail if O_CREAT is not set, or will create a new shared memory object if O_CREAT is set)."
Source: http://www.opengroup.org/onlinepubs/009695399/functions/shm_unlink.html
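The behaviour quoted above can be reproduced without asgi_ipc. The following is only a minimal sketch, assuming a POSIX system and Python 3.8+, and it uses the standard library's multiprocessing.shared_memory as a stand-in for posix_ipc (it is not the code asgi_ipc runs). The function name unlink_then_recreate and the segment name are hypothetical.

```python
import os
from multiprocessing import shared_memory


def unlink_then_recreate(name):
    """Return (byte seen before unlink, byte seen by the re-created segment)."""
    # Two handles on the same named segment, like the two channel layers.
    sender = shared_memory.SharedMemory(name=name, create=True, size=16)
    receiver = shared_memory.SharedMemory(name=name)
    sender.buf[0] = 1
    seen_before = receiver.buf[0]  # both handles map the same memory

    # Simulated restart: the old receiver goes away and unlinks the name.
    receiver.close()
    receiver.unlink()

    # The name is free again, so this creates a brand-new segment.
    new_receiver = shared_memory.SharedMemory(name=name, create=True, size=16)
    sender.buf[0] = 42             # written to the old, now nameless segment
    seen_after = new_receiver.buf[0]

    # Clean up: close both mappings and remove the new name.
    sender.close()
    new_receiver.close()
    new_receiver.unlink()
    return seen_before, seen_after


if __name__ == "__main__":
    print(unlink_then_recreate("issue14_demo_%d" % os.getpid()))
```

On Linux I would expect this to print (1, 0): the sender keeps writing into the segment it still has mapped, while the new receiver reads from a different segment that merely shares the name. That would explain the (None, None) results after the simulated restart.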

The quick fix is to simply not call unlink(), but then the shared memory needs to be unlinked manually by calling unlink_shared_memory(name).
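A rough sketch of that quick fix, again assuming a POSIX system and Python 3.8+ and using the standard library's multiprocessing.shared_memory instead of posix_ipc: handles only close() their mapping, and the name is removed by one explicit cleanup call. The helpers open_segment, remove_segment and restart_without_unlink are hypothetical names, not part of asgi_ipc or posix_ipc.

```python
import os
from multiprocessing import shared_memory


def open_segment(name, size=16):
    """Attach to the named segment, creating it only if it does not exist."""
    try:
        return shared_memory.SharedMemory(name=name)
    except FileNotFoundError:
        return shared_memory.SharedMemory(name=name, create=True, size=size)


def remove_segment(name):
    """Explicit cleanup step; returns False if the name no longer exists."""
    try:
        segment = shared_memory.SharedMemory(name=name)
    except FileNotFoundError:
        return False
    segment.close()
    segment.unlink()
    return True


def restart_without_unlink(name):
    """Return the byte a 'restarted' receiver sees when nothing is unlinked."""
    sender = open_segment(name)
    receiver = open_segment(name)
    sender.buf[0] = 7
    receiver.close()               # "restart": close the mapping, keep the name
    receiver = open_segment(name)  # re-attaches to the same segment
    seen = receiver.buf[0]
    sender.close()
    receiver.close()
    remove_segment(name)           # manual cleanup once everyone is done
    return seen


if __name__ == "__main__":
    print(restart_without_unlink("issue14_keep_%d" % os.getpid()))
```

The trade-off is the one mentioned above: if nobody calls the cleanup step, the segment outlives the program. The attach-or-create helper is also not race-free when two processes start at the same time.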

Initially I had the issue with send() and receive() being in two separate scripts/processes. I will have to test whether the issue is actually the same.

@andrewgodwin
Member

Urgh, yes, that seems to be what this is; it behaves differently if it's two inside one process versus two in different processes.

I'm still very much tempted to try out a sqlite-based backend as a replacement for this shared memory stuff, given that the performance testing we did showed this was surprisingly slow.
