State messaging channel for services to report on async work for better tests #266
Conversation
Codecov Report — Base: 93.95% // Head: 94.17% // Increases project coverage by +0.21%.
Additional details and impacted files:

```
@@            Coverage Diff             @@
##             main     #266      +/-   ##
==========================================
+ Coverage   93.95%   94.17%   +0.21%
==========================================
  Files          59       59
  Lines        4928     5027      +99
==========================================
+ Hits         4630     4734     +104
+ Misses        298      293       -5
```
Cool!
Small comments for now; I'll look at it a little closer soon, as I'm only looking at it from my phone. Wondering if we could a) give our two "status channels" more distinctive meaning and b) make our API more ergonomic to use.
Also I'd like to look again at the underlying problem, just to understand if it's really expected behavior and not a bug (so far I think it's not).
@sandreae, after some experimentation I thought it might be a good idea to keep this PR open for now and only merge the parts which fix the race conditions. You can find them here: #269 - What do you think? Is this a good idea? I think it will get us to what we want faster: fixing the annoying race conditions.

The newly introduced feature is very helpful and definitely more elegant, but is maybe not "solved" yet, as we're both not happy with its API design. It doesn't give us much functionality outside of tests but already affects the API for other non-test parts of the stack, making them harder to grasp. I'd rather merge the important bit for now and keep this open until we find something pretty!

Leaving your feature changes unmerged still keeps the second class of race conditions unfixed, but we can at least "control" them by bumping the waiting times. Also not beautiful, but it would come with no big changes for now. We can keep this PR open and make a new ticket which deals with introducing that feature? If you're happy with this, I'll open #269 for review and create the ticket.

In any case, thanks so much for digging into all of that! What a ride!
In this and the new branch I sadly still had the following failing test (as I've mentioned in the chat):
Hey! Thanks for looking into this. I think what you're suggesting is a good move. It's a shame we might not use a lot of the code I wrote here, but maybe that was unavoidable; in any case it's good research which will feed into the next design phase!
Hmm... interesting... let me see if I can reproduce it.

Tried reproducing this, but so far I've never seen a fail; ran it 20x or so... 😞

Let's move this conversation to the new PR though 👍
I've created two new issues to tackle the separate topics:
This is definitely not going to waste; I think we got some really good learnings from this which will help us keep working on it 👍
I've closed this PR for now (keeping the …).

My suggestion would be to have a "timeout helper" which checks for a condition every x milliseconds and times out if that condition wasn't reached after y seconds. That way we can test that async conditions are eventually reached. I've seen that pattern in other projects as well and think this is the way to go, at least for test-related problems.
Turns out there were at least two places causing tests to sometimes fail, all quite interesting in the end. Although this PR was mainly about getting the tests working consistently, I actually ended up adding a little feature ;-)
Cross test pollution of mutably shared schema provider
```rust
static SCHEMA_PROVIDER: Lazy<Mutex<Vec<Schema>>> = Lazy::new(|| Mutex::new(Vec::new()));
```
We're using the above to allow the GraphQL schema builder access to the current schemas, in particular in `create_type_info`. This works, but when running multiple tests concurrently I think `SCHEMA_PROVIDER` is being overwritten in one process before it is read in another.

We take a `MutexGuard` in `save_static_schemas()` here, but it is dropped as it goes out of scope, and this leaves an opening where another lock can be taken out before the value is read in the same test. What we want to do then is keep the lock until we read again. We can return the guard, keeping it alive longer, but if we keep it too long we simply block our own read, and we can't pass it on because we don't have access to the `async_graphql` methods. So then maybe we need a yielding mutex like `tokio::sync::Mutex`, which wakes a future when a lock is released? But we can't, because that's `async` and `create_type_info` isn't.

SOLUTION: We can force the problematic tests to run in series: https://docs.rs/serial_test/0.4.0/serial_test/ ⭐
Waiting for stuff to happen
SOLUTION: a new state messaging channel for all services ⭐
closes: #247
📋 Checklist
- `CHANGELOG.md`