Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix vendordata arrays and dicts #837

Merged
merged 12 commits into from
Mar 16, 2021

Conversation

eb3095
Copy link
Contributor

@eb3095 eb3095 commented Mar 10, 2021

Fix vendordata arrays and dicts

Fix a bug with vendordata_raw when arrays or arrays of dicts are used

summary: An error is thrown when arrays or arrays of dicts are used

When you use an array or an array of dicts with vendordata_raw it is passed to write_files as a string from _store_rawdata causing an error to be thrown. Bellow is the error thrown and a sample of the code used from another PR to trigger the bug.
2021-03-10 05:00:19,275 - util.py[WARNING]: failed stage init
2021-03-10 05:00:19,275 - util.py[DEBUG]: failed stage init
Traceback (most recent call last):
  File "/usr/local/lib/python3.6/site-packages/cloud_init-21.1-py3.6.egg/cloudinit/cmd/main.py", line 652, in status_wrapper
    ret = functor(name, args)
  File "/usr/local/lib/python3.6/site-packages/cloud_init-21.1-py3.6.egg/cloudinit/cmd/main.py", line 376, in main_init
    init.update()
  File "/usr/local/lib/python3.6/site-packages/cloud_init-21.1-py3.6.egg/cloudinit/stages.py", line 368, in update
    'vendordata')
  File "/usr/local/lib/python3.6/site-packages/cloud_init-21.1-py3.6.egg/cloudinit/stages.py", line 398, in _store_rawdata
    util.write_file(self._get_ipath('%s_raw' % datasource), data, 0o600)
  File "/usr/local/lib/python3.6/site-packages/cloud_init-21.1-py3.6.egg/cloudinit/util.py", line 1837, in write_file
    content = encode_text(content)
  File "/usr/local/lib/python3.6/site-packages/cloud_init-21.1-py3.6.egg/cloudinit/util.py", line 117, in encode_text
    return text.encode(encoding)
AttributeError: 'list' object has no attribute 'encode'

Additional Context

@smoser had given me two examples of usage for vendordata_raw for use in the bellow listed PR. However when attempting to use it I encountered an unexpected error regarding writing it to a file. The methods he provided worked otherwise leading me to believe this may be a bug.

#827 (comment)

Test Steps

This is the code I used to trigger this failure. It originates from the following PR.

#827

    self.vendordata_raw = []
    for uscript in user_scripts:
        self.vendordata_raw.append(uscript)
    self.vendordata_raw.append("#cloud-config\n%s" % json.dumps(config))

Checklist:

  • My code follows the process laid out in the documentation
  • I have updated or added any unit tests accordingly
  • I have updated or added any documentation accordingly

@eb3095 eb3095 mentioned this pull request Mar 10, 2021
3 tasks
@smoser
Copy link
Collaborator

smoser commented Mar 10, 2021

@eb3095 It definitely feels like you found a bug here, that was introduced by #777 (3cebe0d). But I'm not sure that this is the right fix.

@andrewbogott, @TheRealFalcon perhaps you have more context?

Openstack datasource gets vendordata and vendordata2 from read_v2. If I'm reading that correctly, that will return the loaded json object, so it could be any json object (dict, list, string ... ). Then it calls convert_vendordata on that. convert_vendordata says clearly that it will return a string or list.

End result is that @eb3095 is correct, and that store_rawdata has to take any type, not just bytes.

It feels like I must be reading this wrong though... because otherwise I don't see how it would have passed any actual testing.

Help?

@TheRealFalcon
Copy link
Member

@smoser I don't think this is a new issue. I can go back a few years and set vendordata_raw to a list and still get this traceback:

Traceback (most recent call last):
  File "/usr/lib/python3/dist-packages/cloudinit/cmd/main.py", line 650, in status_wrapper
    ret = functor(name, args)
  File "/usr/lib/python3/dist-packages/cloudinit/cmd/main.py", line 372, in main_init
    init.update()
  File "/usr/lib/python3/dist-packages/cloudinit/stages.py", line 363, in update
    self._store_vendordata()
  File "/usr/lib/python3/dist-packages/cloudinit/stages.py", line 392, in _store_vendordata
    util.write_file(self._get_ipath('vendordata_raw'), raw_vd, 0o600)
  File "/usr/lib/python3/dist-packages/cloudinit/util.py", line 1766, in write_file
    content = encode_text(content)
  File "/usr/lib/python3/dist-packages/cloudinit/util.py", line 157, in encode_text
    return text.encode(encoding)
AttributeError: 'list' object has no attribute 'encode'

@smoser
Copy link
Collaborator

smoser commented Mar 10, 2021

@smoser I don't think this is a new issue. I can go back a few years and set vendordata_raw to a list and still get this traceback:

Traceback (most recent call last):
  File "/usr/lib/python3/dist-packages/cloudinit/cmd/main.py", line 650, in status_wrapper
    ret = functor(name, args)
  File "/usr/lib/python3/dist-packages/cloudinit/cmd/main.py", line 372, in main_init
    init.update()
  File "/usr/lib/python3/dist-packages/cloudinit/stages.py", line 363, in update
    self._store_vendordata()
  File "/usr/lib/python3/dist-packages/cloudinit/stages.py", line 392, in _store_vendordata
    util.write_file(self._get_ipath('vendordata_raw'), raw_vd, 0o600)
  File "/usr/lib/python3/dist-packages/cloudinit/util.py", line 1766, in write_file
    content = encode_text(content)
  File "/usr/lib/python3/dist-packages/cloudinit/util.py", line 157, in encode_text
    return text.encode(encoding)
AttributeError: 'list' object has no attribute 'encode'

yeah, you're right. I'm really not sure how this worked at all. this is really a trainwreck.
many datasources (IBM, MAAS, OpenStack, ConfigDrive) use convert_vendordata, which will raise error if the data passed to it is other than string, list, or dict({'cloud-init': []}). Even bytes there will raise ValueError. So the options are then:
a.) fix all the 'vendordata_raw' settings to be bytes... actually raw data.
b.) accept that, for vendordata, the 'raw' is a string, or list or bytes.

'b' is much easier. So what I'd suggest is handling vendordata (and vendordata2) specially here and adding 'store_raw_vendordata' method that accepts those as input and serializes a list with json.

@eb3095 eb3095 force-pushed the vendordata-array-fix branch 3 times, most recently from e868022 to 3f1b045 Compare March 10, 2021 20:26
@eb3095
Copy link
Contributor Author

eb3095 commented Mar 10, 2021

Is this more in line with what you were looking for here?

Eric Benner added 2 commits March 10, 2021 15:26
Refactor to request

Syntax error

Syntax error

Syntax error
Copy link
Collaborator

@smoser smoser left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thank you.

One comment.

cloudinit/stages.py Show resolved Hide resolved
cloudinit/stages.py Show resolved Hide resolved
@smoser
Copy link
Collaborator

smoser commented Mar 16, 2021

@eb3095 could you write a single coherent commit message?

You can either put it here (in the issue description) or squash your commits and write a single good message and then force-push.

Here is a suggested commit message:

 Fix stack trace if vendordata_raw contained an array.

 The implementation in existing datasources means that vendordata_raw is
 not "raw" as it ideally would be. Instead, actual values may include
 bytes, string or list.  If the value was a list, then the attempt to
 persist that data to a file in '_store_rawdata' would raise a
 TypeError.

 The change is to encode with util.json_dumps (which is safe for
 binary data) before writing.

 Co-authored-by: Eric Benner <ebenner@vultr.com>

We want the commit message to have less than 74 chars per line.

Thanks for your patience.

@TheRealFalcon TheRealFalcon merged commit f35181f into canonical:master Mar 16, 2021
@eb3095
Copy link
Contributor Author

eb3095 commented Mar 17, 2021

@eb3095 could you write a single coherent commit message?

You can either put it here (in the issue description) or squash your commits and write a single good message and then force-push.

Here is a suggested commit message:

 Fix stack trace if vendordata_raw contained an array.

 The implementation in existing datasources means that vendordata_raw is
 not "raw" as it ideally would be. Instead, actual values may include
 bytes, string or list.  If the value was a list, then the attempt to
 persist that data to a file in '_store_rawdata' would raise a
 TypeError.

 The change is to encode with util.json_dumps (which is safe for
 binary data) before writing.

 Co-authored-by: Eric Benner <ebenner@vultr.com>

We want the commit message to have less than 74 chars per line.

Thanks for your patience.

Looks like this was merged, did you still need that?

@TheRealFalcon
Copy link
Member

Looks like this was merged, did you still need that?

Sorry, no. I merged with the suggested commit message.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants