-
Notifications
You must be signed in to change notification settings - Fork 45
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
RuntimeError: Reader to writer privilege escalation not allowed #36
Comments
After a bit of initial investigation, it looks like this doesn't reproduce consistently or in isolation. Running the failed tests by themselves works fine. I was able to reproduce one failure by running the entire functional test suite for Nova, but that's a rather heavy reproducer, and suggests there may be race condition going on here. I'm going to keep poking at it, but since it will likely take a while (each Nova functional test run takes about 15 minutes for me) I thought I'd provide my initial findings. |
With 467ed75 reverted I haven't been able to reproduce the Nova functional test failure. Previously it reproduced pretty consistently for me. My guess would be that there's some sort of race in the eventlet fix that hits us when we're running highly concurrent things like functional tests. That's a WAG though. I'm going to push a PR to revert that change. While the workaround may be a little ugly, at least it works and it only affects monkey-patched environments (which are ugly by definition ;-). |
This reverts commit 467ed75. It seems there is a race in the Eventlet fix for this that is causing issues in the OpenStack Nova functional tests. Fixes harlowja#36
* Update requirements from branch 'master' - Merge "Updated from generate-constraints" - Updated from generate-constraints -grpcio===1.21.1 +grpcio===1.15.0 taskflow email, needs a release as well to unblock -pypowervm===1.1.21 +pypowervm===1.1.20 powervm/pypowervm#12 - will mask the bad version -tornado===5.1.1;python_version=='2.7' +tornado===4.5.3;python_version=='2.7' py27 keeps pulling in too new versions, will need to either maintain a secondary requirements list for for py27 caps in constraints only libraries (those not tracked by global-requirements.txt). -urllib3===1.25.3 +urllib3===1.24.3 -kubernetes===9.0.0 +kubernetes===8.0.1 openshift is capping things, they don't seem to want to change perhaps we should request that the only two projects using openshift to not use it anymore. openstack/kuryr-tempest-plugin x/flame -jsonschema===3.0.1 +jsonschema===2.6.0 bunch of stuff in openstack still has a cap -fasteners===0.15 +fasteners===0.14.1 harlowja/fasteners#36 Change-Id: Ida5b084be8b68c0eaa021f548199ad477398e008
-grpcio===1.21.1 +grpcio===1.15.0 taskflow email, needs a release as well to unblock -pypowervm===1.1.21 +pypowervm===1.1.20 powervm/pypowervm#12 - will mask the bad version -tornado===5.1.1;python_version=='2.7' +tornado===4.5.3;python_version=='2.7' py27 keeps pulling in too new versions, will need to either maintain a secondary requirements list for for py27 caps in constraints only libraries (those not tracked by global-requirements.txt). -urllib3===1.25.3 +urllib3===1.24.3 -kubernetes===9.0.0 +kubernetes===8.0.1 openshift is capping things, they don't seem to want to change perhaps we should request that the only two projects using openshift to not use it anymore. openstack/kuryr-tempest-plugin x/flame -jsonschema===3.0.1 +jsonschema===2.6.0 bunch of stuff in openstack still has a cap -fasteners===0.15 +fasteners===0.14.1 harlowja/fasteners#36 Change-Id: Ida5b084be8b68c0eaa021f548199ad477398e008
* Update requirements from branch 'master' - Updated from generate-constraints -grpcio===1.21.1 +grpcio===1.15.0 taskflow email, needs a release as well to unblock -tornado===5.1.1;python_version=='2.7' +tornado===4.5.3;python_version=='2.7' py27 keeps pulling in too new versions, will need to either maintain a secondary requirements list for for py27 caps in constraints only libraries (those not tracked by global-requirements.txt). -urllib3===1.25.3 +urllib3===1.24.3 -kubernetes===9.0.0 +kubernetes===8.0.1 openshift is capping things, they don't seem to want to change perhaps we should request that the only two projects using openshift to not use it anymore. openstack/kuryr-tempest-plugin x/flame -jsonschema===3.0.1 +jsonschema===2.6.0 bunch of stuff in openstack still has a cap -fasteners===0.15 +fasteners===0.14.1 harlowja/fasteners#36 - will mask bad version -SQLAlchemy===1.3.4 +SQLAlchemy===1.2.19 waiting on a neutron-lib release - waiting on periodic jobs to be fixed then run updated the old to 1.2.19 from 1.2.18 as we should still update within the minor release Change-Id: Ib7193870baa1cd4f9f3b9ad2d9108758090b8a55
-grpcio===1.21.1 +grpcio===1.15.0 taskflow email, needs a release as well to unblock -tornado===5.1.1;python_version=='2.7' +tornado===4.5.3;python_version=='2.7' py27 keeps pulling in too new versions, will need to either maintain a secondary requirements list for for py27 caps in constraints only libraries (those not tracked by global-requirements.txt). -urllib3===1.25.3 +urllib3===1.24.3 -kubernetes===9.0.0 +kubernetes===8.0.1 openshift is capping things, they don't seem to want to change perhaps we should request that the only two projects using openshift to not use it anymore. openstack/kuryr-tempest-plugin x/flame -jsonschema===3.0.1 +jsonschema===2.6.0 bunch of stuff in openstack still has a cap -fasteners===0.15 +fasteners===0.14.1 harlowja/fasteners#36 - will mask bad version -SQLAlchemy===1.3.4 +SQLAlchemy===1.2.19 waiting on a neutron-lib release - waiting on periodic jobs to be fixed then run updated the old to 1.2.19 from 1.2.18 as we should still update within the minor release Change-Id: Ib7193870baa1cd4f9f3b9ad2d9108758090b8a55
@cybertron I hope you still remember this 1.5 year old issue :) I'm hesitating to revert the eventlet thing in PR #37, so I'd like to investigate this a little more. You write that you can hit this in the Nova functional test suite, do you have some pointers on how to run it? And what system were you using, Linux? |
Okay, luckily I was able to dredge up some memories of this. :-) It does still reproduce with the nova functional tests. I created a fresh Fedora VM and installed the necessary bits, then tweaked the package versions to use the latest fasteners, and my first test run failed on this error. Here are the highlights of what I did:
You don't necessarily have to use Fedora, bindep is distro-aware so it will give you the right package names for your distro. You'll need to figure out the tox and gcc package names if you don't already have them, of course. |
I can confirm, that this replicates, on ubuntu 20.04 in my case. I'm having some success with running a limited test
Which can be repeated until failure with this nice trick:
|
Repro 🎉 🎉 🎉
So at the moment fasteners do not support eventlet green threads. Like at all. Question for those who know - are there any other "thread" like objects we would like fasteners to support? Is there a list of libraries of this sort? |
Actually scratch that, this is not even supposed to work, as it lacks the magic
Adding this repro example works. Back to investigation :( |
OK, I got it now - nova is not properly I don't like monkey patching, so I'm leaning towards supporting it, but I would have to think what is the impact to our API. I'd love to keep it clear and simple, with no magic, and definitely as an optional dependency. Would love to hear some opinions! Details from the investigation: Fasteners ReaderWriterLock is used in the I added print statements above each call to
And the output is as follows:
This means that the Then, the error itself is triggered when read lock is aquired at line 472, method yields |
Huh, that's strange. The functional tests should be fully monkey patched. The functional module has https://opendev.org/openstack/nova/src/branch/master/nova/tests/functional/__init__.py#L23 which calls https://opendev.org/openstack/nova/src/branch/master/nova/monkey_patch.py#L63 unless the debugger is enabled, in which case the thread module would be excluded, but at least in my env I wasn't using the debugger. There is some funny business with the threading module later in that function though. Apparently it's related to eventlet/eventlet#592 . Maybe that's causing an issue here? |
I'm closing this, as the issue here most likely was related to I don't have any guidance on how nova should migrate to fixed versions, the project is very complicated :) Let me know if you ever try the fixed version and it does not work. |
Patch Set 2: > Patch Set 2: > > expected, but only nova is good I'm looking at this and have found that we're hitting this issue again: harlowja/fasteners#36 due to this commit which was released in fasteners==0.15: harlowja/fasteners@467ed75 Inside of the nova test fixture, we are somehow getting the unpatched version of threading.current_thread instead of the eventlet version that we need. This implies that at the time of the creation of our ReaderWriterLock inside the fixture, the threading module is not eventlet monkey patched: harlowja/fasteners#36 (comment) I don't know how or why this is the case, same as bnemec mentioned here: harlowja/fasteners#36 (comment) I'm going to keep looking but will need help from other nova devs who know more about eventlet. Patch-set: 2 CC: Gerrit User 4690 <4690@4a232e18-c5a9-48ee-94c0-e04e7cca6543>
Patch Set 3: > Patch Set 2: > > > Patch Set 2: > > > > expected, but only nova is good > > I'm looking at this and have found that we're hitting this issue again: > > harlowja/fasteners#36 > > due to this commit which was released in fasteners==0.15: > > harlowja/fasteners@467ed75 > > Inside of the nova test fixture, we are somehow getting the unpatched version of threading.current_thread instead of the eventlet version that we need. This implies that at the time of the creation of our ReaderWriterLock inside the fixture, the threading module is not eventlet monkey patched: > > harlowja/fasteners#36 (comment) > > I don't know how or why this is the case, same as bnemec mentioned here: > > harlowja/fasteners#36 (comment) > > I'm going to keep looking but will need help from other nova devs who know more about eventlet. Correction: the threading module *is* eventlet monkey patched at that point but we are going into the fallback to the native threading.current_thread because we are apparently not in a greenthread at that point: https://github.com/eventlet/eventlet/blob/690464aee772b62a54eedf5275843dce30318597/eventlet/green/threading.py#L94-L96 Patch-set: 3
Patch Set 3: > Patch Set 3: > > > Patch Set 2: > > > > > Patch Set 2: > > > > > > expected, but only nova is good > > > > I'm looking at this and have found that we're hitting this issue again: > > > > harlowja/fasteners#36 > > > > due to this commit which was released in fasteners==0.15: > > > > harlowja/fasteners@467ed75 > > > > Inside of the nova test fixture, we are somehow getting the unpatched version of threading.current_thread instead of the eventlet version that we need. This implies that at the time of the creation of our ReaderWriterLock inside the fixture, the threading module is not eventlet monkey patched: > > > > harlowja/fasteners#36 (comment) > > > > I don't know how or why this is the case, same as bnemec mentioned here: > > > > harlowja/fasteners#36 (comment) > > > > I'm going to keep looking but will need help from other nova devs who know more about eventlet. > > Correction: the threading module *is* eventlet monkey patched at that point but we are going into the fallback to the native threading.current_thread because we are apparently not in a greenthread at that point: > > https://github.com/eventlet/eventlet/blob/690464aee772b62a54eedf5275843dce30318597/eventlet/green/threading.py#L94-L96 That ^ was also wrong. I put in some print statements and see that it's weirder than I expected. We're hitting a condition where the current thread is not of type GreenThread ... it's just 'greenlet.greenlet': https://github.com/eventlet/eventlet/blob/v0.32.0/eventlet/green/threading.py#L124-L128 so it's falling back to the native threading.current_thread() method bc of that. Patch-set: 3
-grpcio===1.21.1 +grpcio===1.15.0 taskflow email, needs a release as well to unblock -pypowervm===1.1.21 +pypowervm===1.1.20 powervm/pypowervm#12 - will mask the bad version -tornado===5.1.1;python_version=='2.7' +tornado===4.5.3;python_version=='2.7' py27 keeps pulling in too new versions, will need to either maintain a secondary requirements list for for py27 caps in constraints only libraries (those not tracked by global-requirements.txt). -urllib3===1.25.3 +urllib3===1.24.3 -kubernetes===9.0.0 +kubernetes===8.0.1 openshift is capping things, they don't seem to want to change perhaps we should request that the only two projects using openshift to not use it anymore. openstack/kuryr-tempest-plugin x/flame -jsonschema===3.0.1 +jsonschema===2.6.0 bunch of stuff in openstack still has a cap -fasteners===0.15 +fasteners===0.14.1 harlowja/fasteners#36 Change-Id: Ida5b084be8b68c0eaa021f548199ad477398e008
-grpcio===1.21.1 +grpcio===1.15.0 taskflow email, needs a release as well to unblock -pypowervm===1.1.21 +pypowervm===1.1.20 powervm/pypowervm#12 - will mask the bad version -tornado===5.1.1;python_version=='2.7' +tornado===4.5.3;python_version=='2.7' py27 keeps pulling in too new versions, will need to either maintain a secondary requirements list for for py27 caps in constraints only libraries (those not tracked by global-requirements.txt). -kubernetes===9.0.0 +kubernetes===8.0.1 openshift is capping things, they don't seem to want to change perhaps we should request that the only two projects using openshift to not use it anymore. openstack/kuryr-tempest-plugin x/flame -jsonschema===3.0.1 +jsonschema===2.6.0 bunch of stuff in openstack still has a cap -fasteners===0.15 +fasteners===0.14.1 harlowja/fasteners#36 Change-Id: Ida5b084be8b68c0eaa021f548199ad477398e008
Patch Set 1: Code-Review-1 I'd make this not dependant on the upper-constraint review, also, it is causing some issues with nova tests see harlowja/fasteners#36 Patch-set: 1 Reviewer: Gerrit User 14288 <14288@4a232e18-c5a9-48ee-94c0-e04e7cca6543> Label: Code-Review=-1
Abandoned Since issue [1] not proposing this right now, making abandon [1]: harlowja/fasteners#36 Patch-set: 1 Status: abandoned
467ed75
Hi, I think this commit caused an issue with nova (at the very least). Had an oslo guy (bnemec) look at it and he pointed to the commit.
Here's the log while it's still up.
http://logs.openstack.org/54/660254/2/check/cross-nova-functional/a0617a9/testr_results.html.gz
Backtraces follow.
The errors look more or less the same to me. If I could get indication if I should mask 0.15 in global-requirements.txt (as a known bad release) that'd be nice, right now it's just constraints that are keeping openstack safe.
Thanks
The text was updated successfully, but these errors were encountered: