Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add poll watcher support to spec watcher for M1 #8449

Merged
merged 29 commits into from Mar 18, 2022
Merged

Conversation

pozsgaic
Copy link
Contributor

@pozsgaic pozsgaic commented Feb 17, 2022

Signed-off-by: pozsgaic pozsgai@progress.com

closes #8406

Note that this is only for SpecWatcher - there is more work for PeerWatcher and UserConfigWatcher

@netlify
Copy link

netlify bot commented Feb 17, 2022

👷 Deploy Preview for chef-habitat processing.

🔨 Explore the source changes: 53a660f

🔍 Inspect the deploy log: https://app.netlify.com/sites/chef-habitat/deploys/62337a85d3c93f0009e7aead

@pozsgaic pozsgaic force-pushed the cjp/spec_watcher_update branch 2 times, most recently from 63aca58 to d8ab682 Compare February 24, 2022 13:18
@pozsgaic pozsgaic marked this pull request as ready for review February 24, 2022 13:19
Copy link
Contributor

@atrniv atrniv left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've mentioned some changes that could ensure the code is more idiomatic

components/hab/src/command/studio/enter.rs Show resolved Hide resolved
components/hab/src/command/studio/enter.rs Outdated Show resolved Hide resolved
components/hab/src/command/studio/enter.rs Outdated Show resolved Hide resolved
components/sup/src/manager/file_watcher.rs Outdated Show resolved Hide resolved
components/sup/src/manager/sup_watcher.rs Outdated Show resolved Hide resolved
@themightychris
Copy link
Contributor

themightychris commented Mar 2, 2022

Doesn't the supervisor also have a file watcher for user.toml files? Is that covered here too?

Edit: looks like yes: https://github.com/habitat-sh/habitat/pull/8449/files#diff-dc64834de212928eec8772c9ea784a58977fd1fc5e5ffd10100dd930bc23ac0a

Copy link
Contributor

@mwrock mwrock left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure these tests are getting run in CI. Looking at both of the linux sup builds, all the tests are filtered out according to the output. The windows tests are eunning the unit tests but these watcher tests are all linux only.

}
assert_eq!(expected_members, members);
env::remove_var("HAB_STUDIO_HOST_ARCH");
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

let's extract the common code in the above 2 functions into a common function that both call

Copy link
Contributor Author

@pozsgaic pozsgaic Mar 10, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It seems that these tests need to be run in order otherwise the environmental variable will be incorrect.
By default rust will run cargo tests in multiple threads.
If I use the --test-threads=1 switch then the results are as expected.

UPDATE: Use locked_env_var! for concurrency for these tests.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It still looks like the above 2 tests are mostly the same. Was this overlooked?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No. The original was refactored by introducing the function peer_watcher_member_load_test(). These two tests use different values in for the Member constructors. We could pass in the raw data into the above function as well, but I did not think that it added value or readability.

};
env::remove_var(TEST_STUDIO_HOST_ARCH_ENVVAR);
assert_eq!(watcher_type, "Fallback");
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we also want a test validating that you get a native watcher when the env variable is not set.

Copy link
Contributor Author

@pozsgaic pozsgaic Mar 10, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These functions will need to be run sequentially to work. If we can run tests for this file in a single thread it should work.

UPDATE: Use locked_env_var! for concurrency for these tests.

}

env::remove_var("HAB_STUDIO_HOST_ARCH");
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lets extract the common code from these 2 tests into a sharable function.

match &self.watcher_type {
WatcherType::NotifyWatcherType => false,
WatcherType::PollWatcherType => true,
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

do we need watcher_type? Could you use watcher.get_mut_underlying_watcher().real_watcher instead?

1
} else {
5
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here and all the other places where you are returning the number of events feels hacky to handle here because the interface is now misleasing. Seems like you are trying to return an arbitrary number of events for the pollwatcher which does not actually match up with any actual event count. Looks like this is all in service of bumping the iteration count later. Seems like things would be more clear and involve less code changes if you consolidated this in spin_watcher and have that multiply the iterations for the poll watcher.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This was refactored

pozsgaic added 17 commits March 11, 2022 14:09
Signed-off-by: pozsgaic <pozsgai@progress.com>
Signed-off-by: pozsgaic <pozsgai@progress.com>
Signed-off-by: pozsgaic <pozsgai@progress.com>
Signed-off-by: pozsgaic <pozsgai@progress.com>
Signed-off-by: pozsgaic <pozsgai@progress.com>
Signed-off-by: pozsgaic <pozsgai@progress.com>
Signed-off-by: pozsgaic <pozsgai@progress.com>
Signed-off-by: pozsgaic <pozsgai@progress.com>
Signed-off-by: pozsgaic <pozsgai@progress.com>
Signed-off-by: pozsgaic <pozsgai@progress.com>
Signed-off-by: pozsgaic <pozsgai@progress.com>
Signed-off-by: pozsgaic <pozsgai@progress.com>
Signed-off-by: pozsgaic <pozsgai@progress.com>
Signed-off-by: pozsgaic <pozsgai@progress.com>
Signed-off-by: pozsgaic <pozsgai@progress.com>
Signed-off-by: pozsgaic <pozsgai@progress.com>
Signed-off-by: pozsgaic <pozsgai@progress.com>
pozsgaic added 7 commits March 11, 2022 14:09
Signed-off-by: pozsgaic <pozsgai@progress.com>
Signed-off-by: pozsgaic <pozsgai@progress.com>
Signed-off-by: pozsgaic <pozsgai@progress.com>
Signed-off-by: pozsgaic <pozsgai@progress.com>
Signed-off-by: pozsgaic <pozsgai@progress.com>
Signed-off-by: pozsgaic <pozsgai@progress.com>
#[test]
fn file_watcher() {
let lock = lock_env_var();
lock.set("");
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this could just be lock.unset() which will clear out the variable.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK - there are a number of other instances as well across our test cases, so will fix all of them.

Signed-off-by: pozsgaic <pozsgai@progress.com>
lock.set("aarch64-darwin");

// When using the PollWatcher variant of SupWatcher, the
// behavior is different than the NotifyWatcher. The
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can you briefly describe the difference and how that affects those cases?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Description updated in the code:
// When using the PollWatcher variant of SupWatcher, the
// behavior is different than the NotifyWatcher. The NotifyWatcher
// receives event callbacks in response to a file or directory change,
// while the PollWatcher must poll and check for changes.
// As such, there were observed differences in the timing and number
// of events that the PollWatcher will handle. It was determined
// through analyzing the output that the poll watcher as written is not handling
// test cases correctly after the second test case. It does not receive
// the required events to pass the test case regardless of timing.

}
assert_eq!(expected_members, members);
env::remove_var("HAB_STUDIO_HOST_ARCH");
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It still looks like the above 2 tests are mostly the same. Was this overlooked?

let mut iterations = expected_event_count;
if is_poll_watcher {
thread::sleep(Duration::from_secs(15));
iterations *= 5;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can you explain why the iteration count needs to be bumped?

Copy link
Contributor Author

@pozsgaic pozsgaic Mar 15, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"Through experimentation it was determined that the PollWatcher is less responsive
and emits more events than the NotifyWatcher. The initial sleep used in NotifyWatcher
was not adequate to pass the tests and was increased as a result. Also the number of iterations
required is larger for the PollWatcher case as there were intermediate events observed that would lead to
test case failure with the original iteration count used. The test cases will fail if the desired events are not emitted, so the iteration count was increased to account for the increased number of events in the PollWatcher."

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we should be able to make this 3 iterations

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we should be able to make this 3 iterations

let mut iterations = expected_event_count;
if is_poll_watcher {
thread::sleep(Duration::from_secs(15));
iterations *= 5;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can you explain why the iteration count needs to be bumped?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"Through experimentation it was determined that the PollWatcher is less responsive
and emits more events than the NotifyWatcher. The initial sleep used in NotifyWatcher
was not adequate to pass the tests and was increased as a result. Also the number of iterations
required is larger for the PollWatcher case as there were intermediate events observed that would lead to
test case failure with the original iteration count used. The test cases will fail if the desired events are not emitted, so the iteration count was increased to account for the increased number of events in the PollWatcher."


let mut iterations = expected_event_count;
if is_poll_watcher {
thread::sleep(Duration::from_secs(15));
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

15 seconds seems really high. Did we play much with smaller delays?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I originally got things working with a 30 second delay and then reduced it to 15 seconds and did not find it to be a problem. I can try. Will reduce if it will continue to pass the tests.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Update - 5 second sleep failed. 10 second sleep failed. Reverting to 15 second delay.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we should be able to bring this to 5

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we should be able to bring this to 5


let mut iterations = expected_event_count;
if is_poll_watcher {
thread::sleep(Duration::from_secs(15));
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

15 seconds seems really high. Did we play much with smaller delays?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I originally got things working with a 30 second delay and then reduced it to 15 seconds and did not find it to be a problem. I can try. Will reduce if it will continue to pass the tests.

// For the poll watcher tehre are more than one Debounced event received
// and this test will fail. Instead we can look at the last event and
// ensure that it is correct, as it will be the last entry that will
// determine the final state of the watched.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should we have both tests just look at the last one if that is the only one that matters?

Copy link
Contributor Author

@pozsgaic pozsgaic Mar 15, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note the reason for testing only the last element in the PollWatcher tests as it can't test the entire set of events in order as the NotifyWatcher can. That would be a degradation of the NotifyWatcher test. It does work though if you think we should add it as a test for NotifyWatcher.

// For the poll watcher tehre are more than one Debounced event received
// and this test will fail. Instead we can look at the last event and
// ensure that it is correct, as it will be the last entry that will
// determine the final state of the watched.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should we have both tests just look at the last one if that is the only one that matters?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Duplicate question

pozsgaic added 2 commits March 15, 2022 18:11
Signed-off-by: pozsgaic <pozsgai@progress.com>
Signed-off-by: pozsgaic <pozsgai@progress.com>
lock.set("aarch64-darwin");

// When using the PollWatcher variant of SupWatcher, the
// behavior is different than the NotifyWatcher. The
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can you briefly describe the difference and how that affects the tests?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Updated


if is_poll_watcher {
thread::sleep(Duration::from_secs(5));
};
self.test_dirs(&step.dirs, &setup.watcher.paths.dirs);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

based on my buildkite and local tests, we can remove this sleep.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Updated


if is_poll_watcher {
thread::sleep(Duration::from_secs(5));
};
self.test_dirs(&step.dirs, &setup.watcher.paths.dirs);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

based on my buildkite and local tests, we can remove this sleep.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Updated


let mut iterations = expected_event_count;
if is_poll_watcher {
thread::sleep(Duration::from_secs(15));
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we should be able to bring this to 5


let mut iterations = expected_event_count;
if is_poll_watcher {
thread::sleep(Duration::from_secs(15));
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we should be able to bring this to 5

let mut iterations = expected_event_count;
if is_poll_watcher {
thread::sleep(Duration::from_secs(15));
iterations *= 5;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we should be able to make this 3 iterations

let mut iterations = expected_event_count;
if is_poll_watcher {
thread::sleep(Duration::from_secs(15));
iterations *= 5;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we should be able to make this 3 iterations

pozsgaic added 2 commits March 17, 2022 18:07
Signed-off-by: pozsgaic <pozsgai@progress.com>
Signed-off-by: pozsgaic <pozsgai@progress.com>
@pozsgaic pozsgaic merged commit 1ace224 into main Mar 18, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Supervisor crashes in Docker studio on M1 macbooks
4 participants