Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Update WebHookHandler to run as background thread (SC-456) #1491

Merged
merged 7 commits into from Jun 27, 2022

Conversation

TheRealFalcon
Copy link
Member

Proposed Commit Message

Update WebHookHandler to run as background thread

When reporting via the WebHookHandler, the events are handled on the
main thread. This slows the execution of cloud-init significantly.
If there are timeouts in POSTing, this can make cloud-init appear to be
stuck.

This commit updates the WebHookHandler to run in the background as to
not block the main thread.

Documentation was also added as this feature wasn't previously
documented outside of examples.

LP: #1910552

Additional Context

https://bugs.launchpad.net/maas/+bug/1910552

I intentionally didn't document the Azure-specific reporter as I'm not sure the current state of it or if they would want it enabled arbitrarily. I plan on discussing it at our next Azure sync.

Test Steps

I used docker run -p 55555:55555 darklynx/request-baskets on my host machine to ensure requests are coming through or canceled as expected.

Note that I moved two unit test files into a subdirectory for organization. No changes to those two files other than the move, and I added a 3rd file for testing this behavior.

Checklist:

  • My code follows the process laid out in the documentation
  • I have updated or added any unit tests accordingly
  • I have updated or added any documentation accordingly

Copy link
Contributor

@aciba90 aciba90 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for this improvement and schema addition. I really like the documentation to help users configuring this. I have left some suggestions to harden the schema addition.

cloudinit/config/schemas/schema-cloud-config-v1.json Outdated Show resolved Hide resolved
"properties": {
"type": {
"type": "string",
"enum": ["hyperv"],
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would not be useful to document this for the user?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I brought it up in our Azure meeting this week. This isn't really intended to be user configurable and gets enabled automatically in /etc/cloud on Azure instances. I asked if they wanted this to be more broadly documented at our last sync up and they said they'd get back to me. For now, let's leave it.

Copy link
Member

@holmanb holmanb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overall, I like the way this is looking. I have a couple of small nits, and found some issues in the schema definition. I think we'll probably want some tests for the schema once we get it fixed too.

I wasn't very familiar with the reporting code, but I like the use of polymorphism with the publish()/flush() interface used.

Note: I have more to review, but I have enough text stored in ram/cache that I'd rather save my work and give you something to get started chewing on for this round of review.

cloudinit/reporting/handlers.py Show resolved Hide resolved
doc/rtd/topics/logging.rst Outdated Show resolved Hide resolved
cloudinit/reporting/__init__.py Show resolved Hide resolved
"type": "string",
"format": "uri",
"description": "The URL to send the event to.",
"required": true
Copy link
Member

@holmanb holmanb Jun 17, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I ran a manual validation test and noticed an unrelated issue that the schema validation isn't actually validating our required keys:

> PYTHONPATH=. python3 -m cloudinit.cmd.main schema -c 1.yml
Valid cloud-config: 1.yml
> cat 1.yml                 
#cloud-config
reporting:
  reporter_1:
    consumer_key: "42"

The above validation should fail (no schema match).

Copy link
Member

@holmanb holmanb Jun 21, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, I think I figured it out. The required key must contain at least one string in Draft4.

I think the change requires something like this at the object level:

                   "required": ["type", "endpoint"],

@TheRealFalcon
Copy link
Member Author

@holmanb , I believe I have addressed all of your existing comments.

Copy link
Member

@holmanb holmanb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I previously made a comment about the schema failing to validate schemas with missing required keys. I made an inline comment about the cause, here's what I did to get schema validation to warn about required properties.

diff --git a/cloudinit/config/schemas/schema-cloud-config-v1.json b/cloudinit/config/schemas/schema-cloud-config-v1.json
index 7f2971d2..e8e4232d 100644
--- a/cloudinit/config/schemas/schema-cloud-config-v1.json
+++ b/cloudinit/config/schemas/schema-cloud-config-v1.json
@@ -2220,6 +2220,7 @@
       "properties": {
         "reporting": {
           "type": "object",
+          "additionalProperties": false,
           "patternProperties": {
             "^.+$": {
               "label": "<arbitrary_name>",
@@ -2254,6 +2255,7 @@
                 },
                 {
                   "additionalProperties": false,
+                  "required": ["type", "endpoint"],
                   "properties": {
                     "type": {
                       "type": "string",

After applying the above changes, schema validation fails (as expected), albeit not the most intuitive failure message (1.yml contains correct keys, 2.yml is missing required keys):

arc~/cloud-init(reporting|✚1…) PYTHONPATH=. python3 -m cloudinit.cmd.main schema -c 1.yml
Valid cloud-config: 1.yml
arc~/cloud-init(reporting|✚1…) PYTHONPATH=. python3 -m cloudinit.cmd.main schema -c 2.yml
Error:
Cloud config schema errors: reporting.reporter_1: {'consumer_key': '42'} is not valid under any of the given schemas
arc~/cloud-init(reporting|✚1…) cat 2.yml                                                 
#cloud-config
reporting:
  reporter_1:
    consumer_key: "42"

In addition to that diff I think we'll want to clean up the required: true/false statements which don't appear to do anything.

"type": "string",
"format": "uri",
"description": "The URL to send the event to.",
"required": true
Copy link
Member

@holmanb holmanb Jun 21, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, I think I figured it out. The required key must contain at least one string in Draft4.

I think the change requires something like this at the object level:

                   "required": ["type", "endpoint"],

@TheRealFalcon
Copy link
Member Author

Thanks for all the help here @holmanb . I updated the schema based on your comments and also pushed new schema unit tests as a new commit. The later force push is just a rebase and didn't change anything else.

Copy link
Member

@holmanb holmanb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@TheRealFalcon: I think you've addressed my previous concerns. I have a couple more small change requests, but I think we're getting closer to merge than before :)

cloudinit/reporting/handlers.py Outdated Show resolved Hide resolved
cloudinit/config/schemas/schema-cloud-config-v1.json Outdated Show resolved Hide resolved
cloudinit/config/schemas/schema-cloud-config-v1.json Outdated Show resolved Hide resolved
cloudinit/config/schemas/schema-cloud-config-v1.json Outdated Show resolved Hide resolved
@holmanb holmanb closed this Jun 27, 2022
@holmanb holmanb reopened this Jun 27, 2022
@holmanb
Copy link
Member

holmanb commented Jun 27, 2022

I just close/reopened to force a re-run of tests, but I think this will need a rebase to pick up the setuptools pin added last week.

TheRealFalcon and others added 7 commits June 27, 2022 13:57
When reporting via the WebHookHandler, the events are handled on the
main thread. This slows the execution of cloud-init significantly.
If there are timeouts in POSTing, this can make cloud-init appear to be
stuck.

This commit updates the WebHookHandler to run in the background as to
not block the main thread.

Documentation was also added as this feature wasn't previously
documented outside of examples.

LP: #1910552
Co-authored-by: Alberto Contreras <aciba90@gmail.com>
Co-authored-by: Alberto Contreras <aciba90@gmail.com>
Copy link
Member

@holmanb holmanb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice, thanks for the test additions and fixes. I have one more concern around the threading code. Please let me know if I missed anything.

cloudinit/reporting/handlers.py Show resolved Hide resolved
Copy link
Member

@holmanb holmanb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good to me!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants