New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Implement FastTrack handshake with CRP #2514
Conversation
@@ -219,6 +219,7 @@ def save_to_history(etag, text): | |||
pass | |||
except VmSettingsError as exception: | |||
save_to_history(exception.etag, exception.vm_settings_text) | |||
raise |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this error needs to be propagated to the caller, I missed that in a previous commit
@@ -480,8 +480,9 @@ def format_message(msg): | |||
logger.info(message) | |||
add_event(op=WALAEventOperation.HostPlugin, message=message, is_success=True) | |||
|
|||
# Don't support HostGAPlugin versions older than 115 | |||
if vm_settings.host_ga_plugin_version < FlexibleVersion("1.0.8.115"): | |||
# Don't support HostGAPlugin versions older than 123 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
123 contains all the fixes we need on the HostGAPlugin before releasing to Prod. Many VMs in the canary region are still on 117 so using that for now
Codecov Report
@@ Coverage Diff @@
## develop #2514 +/- ##
===========================================
+ Coverage 71.77% 71.88% +0.11%
===========================================
Files 101 101
Lines 15116 15147 +31
Branches 2398 2401 +3
===========================================
+ Hits 10850 10889 +39
- Misses 3780 3781 +1
+ Partials 486 477 -9
Continue to review full report at Codecov.
|
@@ -1,26 +0,0 @@ | |||
<Extensions version="1.0.0.0" goalStateIncarnation="9"><GuestAgentExtension xmlns:i="http://www.w3.org/2001/XMLSchema-instance"> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
see comments in tests/protocol/mockwiredata.py
@@ -111,9 +113,11 @@ def generate_put_handler(*emulators): | |||
def mock_put_handler(url, *args, **_): | |||
|
|||
if HttpRequestPredicates.is_host_plugin_status_request(url): | |||
return | |||
status_blob = WireProtocolData.get_status_blob_from_hostgaplugin_put_status_request(args[0]) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
see comments in tests/protocol/mockwiredata.py
from tests.protocol import mockwiredata | ||
from tests.protocol.HttpRequestPredicates import HttpRequestPredicates | ||
|
||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I moved ReportStatusTestCase to its own file.
test_report_status_should_log_errors_only_once_per_goal_state is basically the same as in the original file. The other tests are new or full rewrites of the older tests.
|
||
yield update_handler | ||
|
||
def test_update_handler_should_report_status_when_fetch_goal_state_fails(self): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
full rewrite of older tests
raise HttpError() | ||
|
||
@patch("azurelinuxagent.common.conf.get_enable_fast_track", return_value=True) | ||
def test_update_handler_should_report_status_even_on_failed_goal_state_fetch(self, _): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This test and the one below have been problematic for several of the code iterations I've been working on.
I just went ahead and rewrote them as test_update_handler_should_report_status_when_fetch_goal_state_fails
@@ -2868,171 +2868,6 @@ def test_it_should_process_goal_state_only_on_new_goal_state(self): | |||
self.assertEqual(2, remote_access_handler.run.call_count, "remote_access_handler.run() should have been called on a new goal state") | |||
|
|||
|
|||
class ReportStatusTestCase(AgentTestCase): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This class was moved to its own file
@@ -305,17 +302,18 @@ def mock_http_put(self, url, data, **_): | |||
|
|||
if url.endswith('/vmAgentLog'): | |||
self.call_counts['/vmAgentLog'] += 1 | |||
elif url.endswith('/StatusBlob'): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
"/StatusBlob" is a fake url that was used in the test file that I removed. It's usage was making the put status on the hostgaplugin fail, and forcing the request to be retried as a direct call to storage.
I removed this and added support in the mock protocol for direct and hostgaplugin PUT status requests
@@ -327,6 +325,12 @@ def mock_gen_trans_cert(self, trans_prv_file, trans_cert_file): | |||
with open(trans_cert_file, 'w+') as cert_file: | |||
cert_file.write(self.trans_cert) | |||
|
|||
@staticmethod |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
when the status is PUT via the hostgaplugin, it is wrapped in an enclosing object as property "content" and base64-encoded.
this function unwraps it and decodes it
@@ -1,27 +0,0 @@ | |||
# Copyright (c) Microsoft Corporation. All rights reserved. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These 2 tests were moved to test_extensions_goal_state_from_extensions_config/from_vm_settings.py, along with the other parse tests. They are now named test_it_should_default_to_block_blob_when_the_status_blob_type_is_not_valid
|
||
@patch("azurelinuxagent.common.conf.get_enable_fast_track", return_value=True) | ||
def test_extension_goal_state_should_parse_requested_version_properly(self, _): | ||
def test_it_should_parse_vm_settings(self, _): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I modified these tests to go thru update_goal_state instead of create_from_* to make them more consistend with other goal state tests
remove_timestamps(second_status) | ||
|
||
self.assertEqual(first_status, second_status) | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
existing test moved from the previous file; almost identical to the original
update_handler._try_update_goal_state(exthandlers_handler.protocol) | ||
update_handler._report_status(exthandlers_handler) | ||
self.assertEqual(2, logger_warn.call_count, "UpdateHandler._report_status() should continue reporting errors after a new goal state") | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The rest are new tests
data_file["ext_conf"] = "hostgaplugin/ext_conf-invalid_blob_type.xml" | ||
with mock_wire_protocol(data_file) as protocol: | ||
extensions_goal_state = protocol.get_goal_state().extensions_goal_state | ||
self.assertEqual("BlockBlob", extensions_goal_state.status_upload_blob_type, 'Expected BlockBob for an invalid statusBlobType') |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit typo: BlockBob
-> BlockBlob
in the error message
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
thanks, fixed
@@ -45,3 +53,8 @@ def test_it_should_parse_empty_depends_on_as_dependency_level_0(self): | |||
|
|||
self.assertEqual(0, extensions[0].settings[0].dependencyLevel, "Incorrect dependencyLevel}") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
another typo, if you want to fix it in this PR.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
what's the typo? thanks
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I see it. fixed
self.assertEqual(2, len(extensions_goal_state.agent_manifests), "Incorrect number of agent manifests. Got: {0}".format(extensions_goal_state.agent_manifests)) | ||
self.assertEqual("Prod", extensions_goal_state.agent_manifests[0].family, "Incorrect agent family.") | ||
self.assertEqual(2, len(extensions_goal_state.agent_manifests[0].uris), "Incorrect number of uris. Got: {0}".format(extensions_goal_state.agent_manifests[0].uris)) | ||
self.assertEqual("https://zrdfepirv2cdm03prdstr01a.blob.core.windows.net/7d89d439b79f4452950452399add2c90/Microsoft.OSTCLinuxAgent_Prod_uscentraleuap_manifest.xml", extensions_goal_state.agent_manifests[0].uris[0], "Incorrect number of uris.") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this error message can be better. Instead of, Incorrect number of uris
, maybe first URI does not match. Expected: {0}, got {1}
or something to that effect.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
fixed
# Don't support HostGAPlugin versions older than 115 | ||
if vm_settings.host_ga_plugin_version < FlexibleVersion("1.0.8.115"): | ||
# Don't support HostGAPlugin versions older than 123 | ||
# TODO: update the minimum version to 1.0.8.117 before release |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just to clarify, you mean minimum version to 123 before release?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yes, i'll fix the comment
if vm_settings.host_ga_plugin_version < FlexibleVersion("1.0.8.115"): | ||
# Don't support HostGAPlugin versions older than 123 | ||
# TODO: update the minimum version to 1.0.8.117 before release | ||
if vm_settings.host_ga_plugin_version < FlexibleVersion("1.0.8.117"): | ||
raise_not_supported(reset_state=True) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If not supported, how does this update supports_fast_track
flag as false
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
the flag is maintained by the UpdateHandler and computed like this
supports_fast_track = self._goal_state.extensions_goal_state.source_channel == GoalStateChannel.HostGAPlugin
If the HostGAPlugin does not support vmsettings (or is the wrong version) we fallback to the WireServer
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Note also that the HostGAPlugin may support fast track, but the feature may be disabled in waagent.conf. The above condition takes care of that too.
The naming is a little awkward: CRP needs FastTrack to be added to the supportedFeatures, but what the flag really means is that the agent is requesting CRP to start using FastTrack. Of course, if the agent sets this flag, it must support FT. But the agent could support FT and still not request CRP to use it (e.g. the user disabled the feature, or the hostgaplugin is not the correct version).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
got you. Thanks for explaining.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am doing another pass of the code (separate PRs) and will add comments and tests to explain the behavior of the feature, as well as to address the TODOs (the behavior is already described in the spec for fast track on the linux agent, but i should capture that in tests)
If the HostGAPlugin supports the VM settings API, the agent needs to add FastTrack to the supportedFeatures in the status report. When CRP detects this, it enables the feature and starts producing FastTrack goal states for that VM.
The PR includes some test fixes; I added comments in the PR for those.