
Replaces httpmock with mockito #194

Merged · 1 commit into talaia-labs:master · Mar 13, 2023

Conversation

sr-gi (Member) commented Feb 23, 2023

This PR replaces our current mock for the API's HTTP interface (httpmock) with mockito. The main reason for the switch is that the latter is more actively maintained and developed, and it offers some functionality that is useful for us but that the former lacked.

On top of that, httpmock was the root of some issues that we had worked around but were still carrying, such as #119 and #165.

It should be noted that, for now, we have replaced some of our time-based wait-and-check tests for the Retrier with loops that wait for a condition to be met. This is due to mockito showing some unexpected behavior with API delays when tests are run in parallel. Notice how this was also an issue with httpmock, even though we had the timings better nailed down (#193). We may be able to get rid of the loops once lipanski/mockito#160 is fixed, though it may be worth discussing whether wait-and-check is better than polling for a certain condition.

I've also spent some time redesigning the Retrier and the RetryManager to add some instrumentation points (such as adding a channel between the WTClient and the Retrier so the former can be notified when the state of the latter changes). However, after several different iterations, I don't think it is worth the additional complexity for something that is merely used in testing (and can be checked with an ugly-but-simple loop).

sr-gi (Member, Author) commented Mar 2, 2023

I've added an additional commit improving the design, given that the mockito devs added a new way of building mocks that considers the request body when building the response (based on this issue I opened when I started working on the revamp: lipanski/mockito#159).

Both commits can be squashed before merging; I'm leaving them separate for now so the PR is easier to review. The delay issue seems to remain for now. If it happens to get fixed before we merge this, I may add a last commit addressing it.
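For reference, a hedged sketch of the request-aware builder mentioned above, based on mockito's with_body_from_request (exact Request accessors and builder signatures vary across mockito versions, and the /status path and JSON payload are made up for this example):

let mut server = mockito::Server::new();
// Derive the response body from the incoming request rather than a fixed body.
let mock = server
    .mock("GET", mockito::Matcher::Any)
    .with_status(200)
    .with_header("content-type", "application/json")
    .with_body_from_request(|request| {
        if request.path() == "/status" {
            br#"{"status": "ok"}"#.to_vec()
        } else {
            b"{}".to_vec()
        }
    })
    .create();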

mariocynicys (Collaborator) left a comment:

Only the 4 failing tests in #193 are patched with a loop, but I suspect other tests might fail too at some point.

note: The middle commit message says Fixes Carrier but it is actually about the WTClient retrier.

.with_status(200)
.with_header("content-type", "application/json")
.with_body_from_fn(move |w| {
thread::sleep(Duration::from_secs_f64(API_DELAY));
mariocynicys (Collaborator):

#[tokio::test] runs on a single-threaded async runtime by default. I think such an std::thread::sleep will block the whole runtime; shouldn't we use an async alternative?
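For illustration, a minimal sketch of the hazard being described (assuming the blocking sleep runs on the same thread as the default current-thread test runtime):

use std::time::Duration;

#[tokio::test] // defaults to the current-thread runtime
async fn blocking_sleep_stalls_the_runtime() {
    let task = tokio::spawn(async {
        tokio::time::sleep(Duration::from_millis(10)).await; // yields to the scheduler
    });
    // Parks the only worker thread: no other task can be polled meanwhile.
    std::thread::sleep(Duration::from_millis(100));
    // `task` only gets a chance to run once we reach this await.
    task.await.unwrap();
}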

mariocynicys (Collaborator):

Replacing this line with thread::sleep(Duration::from_secs_f64(7.0)) had a very weird side effect: it made other tests fail. For me those were retrier::tests::test_manage_retry_unreachable & retrier::tests::test_manage_retry_while_idle.
Any idea how that is possible?

sr-gi (Member, Author):

I think it has to do with the mock having a synchronous wait. However, the function does not allow async closures atm.

I PRed a change to add delays to the repo, but the maintainers seemed not to like it the way it was (standalone, instead of through something like this method), so I may rework it so async closures are accepted (I'm still debating it).

sr-gi (Member, Author) commented Mar 6, 2023

This should do. I ended up going for the loop alternative after seeing that, even when the delays work properly (as with the .with_delay method rejected by mockito), tests can sometimes fail. Wait-based tests with hardcoded times suck, especially when the underlying function is an exponential backoff. It'd be nice to have with_body_from_fn accept async closures so tokio::time::sleep could be used instead, but I'm honestly not fluent enough with async to make that change (nor do I think it's worth putting in the time atm given the use case).

I've added a macro to do the polling-based wait so things are less verbose.

All commits can be squashed.
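For context, a hedged sketch of what such a polling macro could look like (the PR's actual wait_until! implementation may differ; the retry count and interval below are arbitrary):

macro_rules! wait_until {
    ($cond:expr) => {{
        let mut tries = 0;
        // Poll the condition instead of sleeping for a hardcoded time.
        while !$cond {
            tries += 1;
            assert!(tries < 200, "timed out waiting for condition");
            tokio::time::sleep(std::time::Duration::from_millis(50)).await;
        }
    }};
}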

sr-gi force-pushed the replace-httpmock branch 2 times, most recently from 2e51b57 to 86e3f2a, on March 8, 2023
sr-gi requested a review from mariocynicys on March 8, 2023
mariocynicys (Collaborator) left a comment:

Some nits, looks good otherwise.

@@ -600,6 +601,19 @@ mod tests {
const MAX_INTERVAL_TIME: u16 = 1;
const MAX_RUN_TIME: f64 = 0.2;

#[macro_export]
mariocynicys (Collaborator):

No need to export the macro.

sr-gi (Member, Author):

True, this was originally intended to go in teos-common, but I changed my mind 😅

.with_status(200)
.with_header("content-type", "application/json")
.with_body_from_request(move |_| {
std::thread::sleep(Duration::from_secs_f64(API_DELAY));
mariocynicys (Collaborator):

We imported std::thread in the tests; let's omit the std:: namespace.

sr-gi (Member, Author):

That must have been vscode on its own. I'll remove the std::thread import instead, since that'll create a smaller diff.

(Two resolved review threads on watchtower-plugin/src/retrier.rs)
Comment on lines 707 to 715
{
let state = wt_client.lock().unwrap();
assert!(state.get_tower_status(&tower_id).unwrap().is_reachable());
assert!(!state.retriers.contains_key(&tower_id));
assert!(!state
.towers
.get(&tower_id)
.unwrap()
.pending_appointments
.contains(&appointment.locator));
}
api_mock.assert_async().await;
mariocynicys (Collaborator):

I guess this doesn't have to be in a nested scope, no?

sr-gi (Member, Author):

This is actually due to clippy:

error: this `MutexGuard` is held across an `await` point
   --> watchtower-plugin/src/retrier.rs:705:13
    |
705 |         let state = wt_client.lock().unwrap();
    |             ^^^^^
    |
    = help: consider using an async-aware `Mutex` type or ensuring the `MutexGuard` is dropped before calling await
note: these are all the `await` points this lock is held through
   --> watchtower-plugin/src/retrier.rs:705:9
    |
705 | /         let state = wt_client.lock().unwrap();
706 | |         assert!(state.get_tower_status(&tower_id).unwrap().is_reachable());
707 | |         assert!(!state.retriers.contains_key(&tower_id));
708 | |         assert!(!state
...   |
717 | |         task.abort();
718 | |     }
    | |_____^
    = help: for further information visit https://rust-lang.github.io/rust-clippy/master/index.html#await_holding_lock
    = note: `-D clippy::await-holding-lock` implied by `-D warnings`
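An equivalent way to satisfy the lint, shown here only for illustration using the same test state, is to drop the guard explicitly instead of using a nested scope:

let state = wt_client.lock().unwrap();
assert!(!state.retriers.contains_key(&tower_id));
drop(state); // release the std::sync::Mutex guard before the await point
api_mock.assert_async().await; // no MutexGuard held across this await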

Comment on lines 1182 to 1186
{
let state = wt_client.lock().unwrap();
assert!(!state.retriers.contains_key(&tower_id));
let tower = state.towers.get(&tower_id).unwrap();
assert!(tower.status.is_reachable());
assert!(tower.pending_appointments.is_empty());
}
mariocynicys (Collaborator):

Also a redundant(?) scope here.

{
let state = wt_client.lock().unwrap();
assert!(state.get_tower_status(&tower_id).unwrap().is_reachable());
assert!(!state.retriers.contains_key(&tower_id));
mariocynicys (Collaborator):

This should be analogous to state.get_retrier_status(&tower_id).is_none(), which we already waited on above.

sr-gi (Member, Author):

Indeed

.unwrap()
.get_tower_status(&tower_id)
.unwrap()
.is_reachable());
assert!(!wt_client.lock().unwrap().retriers.contains_key(&tower_id));
mariocynicys (Collaborator):

Same as the one above.

Comment on lines 1026 to 1035
wait_until!(wt_client
.lock()
.unwrap()
.get_tower_status(&tower_id)
.unwrap()
.is_misbehaving());

// Retriers are wiped every polling interval, so we'll need to wait a bit more to check it
tokio::time::sleep(Duration::from_secs(POLLING_TIME)).await;
tokio::time::sleep(Duration::from_secs_f64(1.5 * POLLING_TIME as f64)).await;
assert!(!wt_client.lock().unwrap().retriers.contains_key(&tower_id));
mariocynicys (Collaborator):

Can we wait_until! the retrier is removed first and then assert that it's misbehaving, rather than the other way around?

// Wait until the tower is no longer being retried.
wait_until!(wt_client
            .lock()
            .unwrap()
            .get_retrier_status(&tower_id)
            .is_none());

// Retriers are wiped every polling interval, so we'll need to wait a bit more to check it
tokio::time::sleep(Duration::from_secs_f64(1.5 * POLLING_TIME as f64)).await;

// The tower should have a misbehaving status.
assert!(wt_client
            .lock()
            .unwrap()
            .get_tower_status(&tower_id)
            .unwrap()
            .is_misbehaving());

sr-gi (Member, Author):

We may even be able to get rid of the intermediate sleep in this case, given the retrier should have already been wiped, couldn't we?

mariocynicys (Collaborator):

> We may even be able to get rid of the intermediate sleep in this case, given the retrier should have already been wiped, couldn't we?

That's correct. The sleep was guarding against asserting too early that the retrier is not found, so we can drop it.

assert!(tower.pending_appointments.is_empty());
{
let state = wt_client.lock().unwrap();
assert!(!state.retriers.contains_key(&tower_id));
mariocynicys (Collaborator):

Same here: we're asserting that tower_id is not in retriers, but we already waited on that above.

Also:

- Replaced time-based waits with loops that wait for the expected conditions, given the former is way more error prone.
- Updated tests that needed a response based on the request (the main reason why we're switching to mockito).
sr-gi (Member, Author) commented Mar 8, 2023

Addressed all comments but the ones related to #194 (comment)

mariocynicys (Collaborator) left a comment:

lgtm

sr-gi merged commit 67aa345 into talaia-labs:master on Mar 13, 2023