bpo-30931: Asyncore alternative fix #2764

beltran · 2017-07-19T13:28:41Z

This PR is related to #2707 and is trying to fix the same problem.
Thank you @nirs for the review,

the-knights-who-say-ni · 2017-07-19T13:28:43Z

Hello, and thanks for your contribution!

I'm a bot set up to make sure that the project can legally accept your contribution by verifying you have signed the PSF contributor agreement (CLA).

Unfortunately our records indicate you have not signed the CLA. For legal reasons we need you to sign this before we can look at your contribution. Please follow the steps outlined in the CPython devguide to rectify this issue.

Thanks again for your contribution and we look forward to looking at it!

nirs · 2017-07-19T13:39:10Z

Lib/asyncore.py


        for obj in r:
-            if obj.fileno() is None:
+            if obj.fileno() is -1:


This may fail: there may be multiple -1 instances, unlike the single None instance. Better to use == when comparing numbers. But see my comment below about using obj.closing instead.

Yes... that's right
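To illustrate the identity-versus-equality point above, here is a minimal sketch. `FdLike` and `CLOSED` are hypothetical names invented for this example and are not part of asyncore; the int subclass simply stands in for "another object whose value is -1". In CPython, small integers are usually cached, so `is -1` can appear to work, but that is an implementation detail and should not be relied on.

```python
class FdLike(int):
    """Hypothetical int subclass: a distinct object whose value is -1."""


CLOSED = -1          # hypothetical sentinel for "no file descriptor"
fd = FdLike(-1)

equal_by_value = (fd == CLOSED)   # compares the numeric value
same_object = (fd is CLOSED)      # compares object identity

print(equal_by_value, same_object)
```

Only the `==` comparison reflects what the check is meant to express: "is this descriptor value -1?".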

nirs · 2017-07-19T13:40:00Z

Lib/asyncore.py

            if flags:
                pollster.register(fd, flags)

        ready = []


We use ready only after the poll(), so better introduce it there.

nirs · 2017-07-19T13:43:02Z

Lib/asyncore.py

        self.del_channel()
        if self.socket is not None:
            try:
+                self._fileno = -1


_fileno is initialized to None in __init__, and set to None in del_channel, so we should change it there.

nirs · 2017-07-19T13:50:48Z

Lib/asyncore.py


        def fileno(self):
-            return self.socket.fileno()
+            return self._fileno


I wonder if there is code out there that assumes file_dispatcher.fileno() raises socket.error(EBADF) when the socket is closed; this changes the behavior in an incompatible way.

So maybe it is better to use self.socket.fileno() as you did before, and use another check to find out whether a dispatcher is closed. For example, we have the "closing" flag, which asyncore.dispatcher defines but never uses. We can do:

--- a/Lib/asyncore.py
+++ b/Lib/asyncore.py
@@ -391,6 +391,9 @@ class dispatcher:
         return self._fileno

     def close(self):
+        if self.closing:
+            return
+        self.closing = True
         self.connected = False
         self.accepting = False
         self.connecting = False

And then use:

if obj.closing:

instead of:

if obj.fileno() == -1:

It may also be more efficient.

Yes that makes sense, there are certainly some subtleties that I'm missing.
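The suggestion above can be sketched outside asyncore as follows. `ToyDispatcher` and `cleanup_runs` are hypothetical names for illustration only; the real `dispatcher.close()` does more work (removing the channel, closing the socket). The sketch only shows how a `closing` flag makes `close()` idempotent and gives callers a cheap check.

```python
class ToyDispatcher:
    """Hypothetical stand-in for asyncore.dispatcher, not the real class."""

    def __init__(self):
        self.closing = False
        self.cleanup_runs = 0   # how many times the cleanup body actually ran

    def close(self):
        # A second call is a no-op, mirroring the proposed guard.
        if self.closing:
            return
        self.closing = True
        self.cleanup_runs += 1  # placeholder for del_channel() / socket.close()


d = ToyDispatcher()
state_before = d.closing
d.close()
d.close()                       # repeated close must not redo the cleanup
state_after = d.closing
```

A polling loop could then test `obj.closing` instead of calling `obj.fileno()`, which avoids changing what `fileno()` returns or raises.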

nirs · 2017-07-19T13:54:53Z

@bjmb, I think we can use the same issue number, having several solutions for a bug is a good thing.

nirs · 2017-07-19T16:59:25Z

Lib/asyncore.py

            self._fileno = self.socket.fileno()
            self.add_channel()
+
+        def fileno(self):


I did not notice before that you added this - since we inherit from dispatcher, we don't need to implement it.

nirs · 2017-07-19T17:10:34Z

Lib/asyncore.py

                    raise

+    def fileno(self):
+        return self.socket.fileno()


This is compatible with 2.7 and older 3.x, when this class had __getattr__, and fileno() was delegated to self.socket.

But I found that we have an incompatible implementation in file_wrapper:

    def close(self):
        if self.fd < 0:
            return
        os.close(self.fd)
        self.fd = -1

    def fileno(self):
        return self.fd

I think we should have a consistent implementation of fileno(). I would check how it behaves in regular file objects and make both implementations the same: both should either return -1 once the fd is closed, or raise EBADF. For backward compatibility, raising EBADF should be less risky.

Finally, using self.socket means it will raise AttributeError if the socket is None. This was fixed recently in Python 3 in close(), by checking whether the socket is None. If we go with socket.fileno(), we need to check for None and raise EBADF.

It looks like close() only sets self.fd to -1 in Python 3; in Python 2 it just calls os.close(self.fd). Calling fileno() on a closed regular file object raises:

>>> f.close()
>>> f.fileno()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ValueError: I/O operation on closed file

Would raising this be a possible solution?

dispatcher and file_wrapper try to behave like a socket, so using socket.fileno() seems like the best approach, and it is also compatible with older code that was calling dispatcher.fileno() and dispatcher.socket.fileno().

I think we can leave file_wrapper as is for now, since the patch fixes the issue without this change.

So the only issue in this version of the patch is calling fileno() when dispatcher.socket is None. I'm not sure this is a real issue; it means that the instance was not created or configured properly.

Regarding setting self.fd to -1 - this is a bug in 2.7 that should be fixed.
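The different fileno() behaviours compared in this thread can be checked with a short sketch under Python 3. `fileno_or_ebadf` is a hypothetical helper written for this example (it mirrors the "check for None and raise EBADF" idea, it is not an asyncore API); `socket.error` is an alias of `OSError` in Python 3, so `OSError` is raised here.

```python
import errno
import os
import socket


def fileno_or_ebadf(sock):
    """Hypothetical helper: delegate to sock.fileno(), EBADF if there is no socket."""
    if sock is None:
        raise OSError(errno.EBADF, "Bad file descriptor")
    return sock.fileno()


# A closed Python 3 socket object reports -1 instead of raising.
sock = socket.socket()
sock.close()
closed_socket_fd = fileno_or_ebadf(sock)

# A closed regular file object raises ValueError from fileno().
f = open(os.devnull)
f.close()
try:
    f.fileno()
    file_error = None
except ValueError as exc:
    file_error = exc

# A missing socket is reported as EBADF by the helper.
try:
    fileno_or_ebadf(None)
    none_errno = None
except OSError as exc:
    none_errno = exc.errno
```

So "behave like a socket" and "behave like a regular file" lead to different results after close, which is why the thread treats consistency with socket.fileno() as the compatibility-preserving choice.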

beltran · 2017-07-19T20:59:24Z

Another idea is to wrap the socket with an int, and use it in select like this:

class int_with_socket(int):
    def __new__(cls, socket):
        obj = int.__new__(cls, socket._fileno)
        obj.socket = socket
        return obj

then we would append that object to the list

r.append(int_with_socket(obj))

and we would call the socket in the int:

for int_wrapper in r:
    read(int_wrapper.socket)

We wouldn't have to worry about compatibility issues with fileno(); the downside is that it is probably less efficient.

nirs · 2017-07-20T08:06:19Z

Wrapping a dispatcher with an int is interesting, but I think it fits a David Beazley talk more than production code. We have a very simple and elegant fileno() interface, and using it would be the best choice.

nirs · 2017-07-20T14:36:17Z

Lib/asyncore.py

-            if obj is None:
-                continue
-            readwrite(obj, flags)
+            ready.append((map[fd], flags))


This loop can be simplified to:

ready = [(map[fd], flags) for fd, flags in r]
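As a minimal sketch of why the snapshot helps, the following replaces real dispatchers with strings and simulates a handler that empties the map. `fd_map`, `handled`, and this `readwrite` are hypothetical stand-ins for the example, not the asyncore functions of the same name.

```python
fd_map = {3: "dispatcher-a", 4: "dispatcher-b"}   # stand-in for the socket map
r = [(3, 1), (4, 1)]                              # (fd, flags) pairs as poll() would return
handled = []


def readwrite(obj, flags):
    """Toy handler: record the call, then drop every channel from the map."""
    handled.append(obj)
    fd_map.clear()


# Resolve every fd to its object *before* any handler runs, so later
# lookups cannot fail when a handler mutates the map.
ready = [(fd_map[fd], flags) for fd, flags in r]

for obj, flags in ready:
    readwrite(obj, flags)
```

Note that the second object is still dispatched even though the first handler removed it from the map; whether that is acceptable is exactly what the rest of the review debates.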

nirs · 2017-07-20T14:37:54Z

Lib/asyncore.py

+        if self.closing:
+            return
+        self.closing = True
+


You separated this to make the closing state change clearer, right?

Yes, visually that is what I would do in my own code.

nirs

Looks good to me, we need to find someone who can merge this :-)

beltran · 2017-07-20T17:54:11Z

👍 thank you for the reviews and your patience!

nirs · 2017-07-20T20:13:21Z

@bjmb, you need to sign the CLA, look at the comments from @the-knights-who-say-ni.

beltran · 2017-07-20T20:20:21Z

Yes, I did it on Friday and haven't heard back since. I guess it may take some days to get processed.

vstinner · 2017-07-20T22:02:07Z

Lib/asyncore.py

            time.sleep(timeout)
            return

        r, w, e = select.select(r, w, e, timeout)


asyncore is close to its death. I'm not able to accept such large changes. I would prefer changes as small as possible, so please use the same method as for poll():

ready = [(map[fd], flags) for fd, flags in r]

The thing is that the asyncore code base is old and error-prone, and the test suite is very small. I cannot afford the risk of introducing a regression.

It can be done like this, but I think the closing variable should remain.

Just write a second PR when the first one is merged. It would just be easier to review. IMHO you are fixing two bugs at once, which makes the review harder.

vstinner · 2017-07-20T22:05:17Z

Lib/asyncore.py

+        ready = [(map[fd], flags) for fd, flags in r]
+
+        for obj, flags in ready:
+            if not obj.closing:


Would you mind first addressing the first bug (the map being modified in this loop) in a first PR, and then working on a new PR (once the first PR is merged) to add closing?

Again, I prefer very small changes.

Adding closing is part of fixing the bug: the point is not to call readwrite on a socket that was closed by another socket in a previous readwrite.

Using "ready = [...]", I don't see how you could have such a race condition. Again, let's move slowly, step by step.

But if you store all the objects in ready, you can't just call all of them, because an object earlier in the list might have closed a later one.

vstinner · 2017-07-20T22:07:11Z

Lib/test/test_asyncore.py

+                    raise unittest.SkipTest("The test is meaningful only if the fd for the old and "
+                                            "the new dispatcher are the same")
+
+                self.flag = True


Sorry, I don't understand why we need such complicated code to test an altered map. Can't we just have two readable objects where the first one clears the map, so the second loop iteration is supposed to fail? Just make sure that the read handlers of both objects are called even if the map was cleared.

I don't think I understand this. If the first readable object closes the second and removes it from the map, then it will hit the comparison
if obj is None:
and the second readable object won't be called, as it shouldn't be.

Using "ready = [...]", you can remove "if obj is None:" from the loop. Problem solved.

Same as before: some check must be done. You have to check whether the object was closed by a previous one while readwrite was being called.

Sorry, we are talking about two different bugs. I'm talking about a first class of bugs. I didn't say that a fix PR using "ready=[...]" would fix all bugs. Just that it would be easier to review, easy to merge, backport, etc. And then it would become easier to discuss solutions for more complex bugs.
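The two bugs being distinguished here can be sketched with toy objects. `ToyChannel`, `victim`, and `dispatch` are hypothetical names for this example only; they imitate the shape of the patch (snapshot plus `closing` check) without using asyncore itself.

```python
class ToyChannel:
    """Hypothetical channel that may close another channel while handling a read."""

    def __init__(self, name, fd, fd_map):
        self.name = name
        self.fd = fd
        self.fd_map = fd_map
        self.closing = False
        self.victim = None          # channel to close from handle_read(), if any
        fd_map[fd] = self

    def close(self):
        if self.closing:
            return
        self.closing = True
        self.fd_map.pop(self.fd, None)

    def handle_read(self):
        if self.victim is not None:
            self.victim.close()


def dispatch(fd_map, r):
    """Snapshot the ready objects, then skip any closed by an earlier handler."""
    called = []
    ready = [(fd_map[fd], flags) for fd, flags in r]   # bug 1: map mutated mid-loop
    for obj, flags in ready:
        if not obj.closing:                            # bug 2: stale object in snapshot
            called.append(obj.name)
            obj.handle_read()
    return called


fd_map = {}
a = ToyChannel("a", 3, fd_map)
b = ToyChannel("b", 4, fd_map)
a.victim = b                       # handling "a" closes "b"
called = dispatch(fd_map, [(3, 1), (4, 1)])
```

Without the `closing` check, the snapshot alone would still call the handler of "b" after "a" closed it, which is the case beltran describes; the snapshot alone only addresses lookups failing because the map changed.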

vstinner · 2017-07-20T22:08:26Z

Lib/asyncore.py

+    def fileno(self):
+        if self.socket is None:
+            raise socket.error(EBADF, 'Bad file descriptor')
+        return self.socket.fileno()


Please remove this method.

beltran

Thanks for the review @Haypo

beltran · 2017-07-20T22:47:28Z

Lib/asyncore.py

+        ready = [(map[fd], flags) for fd, flags in r]
+
+        for obj, flags in ready:
+            if not obj.closing:


Adding closing is part of fixing the bug: the point is not to call readwrite on a socket that was closed by another socket in a previous readwrite.

beltran · 2017-07-20T23:05:26Z

Lib/test/test_asyncore.py

+                    raise unittest.SkipTest("The test is meaningful only if the fd for the old and "
+                                            "the new dispatcher are the same")
+
+                self.flag = True


I don't think I understand this. If the first readable object closes the second and removes it from the map, then it will hit the comparison
if obj is None:
and the second readable object won't be called, as it shouldn't be.

beltran · 2017-07-20T23:08:16Z

Lib/asyncore.py

            time.sleep(timeout)
            return

        r, w, e = select.select(r, w, e, timeout)


It can be done like this, but I think the closing variable should remain.

vstinner · 2017-07-20T23:14:13Z

Lib/asyncore.py

            time.sleep(timeout)
            return

        r, w, e = select.select(r, w, e, timeout)


Just write a second PR when the first one is merged. It would just be easier to review. IMHO you are fixing two bugs at once, which makes the review harder.

vstinner · 2017-07-20T23:15:29Z

Lib/asyncore.py

+        ready = [(map[fd], flags) for fd, flags in r]
+
+        for obj, flags in ready:
+            if not obj.closing:


Using "ready = [...]", I don't see how you could have such a race condition. Again, let's move slowly, step by step.

vstinner · 2017-07-20T23:16:27Z

Lib/test/test_asyncore.py

+                    raise unittest.SkipTest("The test is meaningful only if the fd for the old and "
+                                            "the new dispatcher are the same")
+
+                self.flag = True


Using "ready = [...]", you can remove "if obj is None:" from the loop. Problem solved.

nirs · 2017-07-21T00:22:40Z

@bjmb, I think we should first fix the closing issue - this is a very small and safe patch that will be easy to get merged, and also to backport to older versions. We also need a test for this.

terryjreedy · 2017-07-24T22:00:10Z

nirs and haypo, the way to fix [CLA not signed], once it has been signed, is to delete the tag. The knight will then recheck and confirm.

vstinner · 2017-07-24T22:30:53Z

I wrote an alternative to this alternative fix: PR #2854.

vstinner · 2017-07-26T12:25:14Z

IMHO adding a closing attribute carries too high a risk of regression: see http://bugs.python.org/issue30985 for the discussion. I prefer to only rely on the map, your PR 2707 or my PR 2854.

bjmb added 3 commits July 17, 2017 00:16

Added failing tests for asyncore

5aeb009

Fix asyncore

ab29440

Fixed review comments

33f1819

the-knights-who-say-ni added the CLA not signed label Jul 19, 2017

nirs reviewed Jul 19, 2017


beltran changed the title ~~Asyncore alternative fix~~ bpo-30931: Asyncore alternative fix Jul 19, 2017

Use closing variable

8d9aec1

nirs reviewed Jul 19, 2017


Review comments

d9dfa94

nirs reviewed Jul 20, 2017


Simplified loop

1b692ed

nirs approved these changes Jul 20, 2017


vstinner requested changes Jul 20, 2017


vstinner reviewed Jul 20, 2017


beltran commented Jul 20, 2017


vstinner requested changes Jul 20, 2017


terryjreedy removed the CLA not signed label Jul 24, 2017

the-knights-who-say-ni added the CLA signed label Jul 24, 2017

beltran mentioned this pull request Jul 25, 2017

bpo-30931: Fix asyncore race condition #2854

Closed

vstinner closed this Jul 26, 2017
