Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Empty profile is parsed as a package with an empty name #593

Closed
ppisar opened this issue Apr 13, 2022 · 6 comments
Closed

Empty profile is parsed as a package with an empty name #593

ppisar opened this issue Apr 13, 2022 · 6 comments
Assignees

Comments

@ppisar
Copy link
Collaborator

ppisar commented Apr 13, 2022

Having this document (empty profiles are legal https://docs.fedoraproject.org/en-US/modularity/policies/packaging-guidelines/#_profiles):

document: modulemd
version: 2
data:
    name: test
    stream: A
    summary: Test.
    description: >
        Test.
    license:
        module: [ MIT ]
    profiles:
        profileA:
          rpms:

reading and writing results into:

document: modulemd
version: 2
data:
  name: test
  stream: "A"
  summary: Test.
  description: >
    Test.
  license:
    module:
    - MIT
  profiles:
    profileA:
      rpms:
      -

This is a bug on parser side:

$ cat /tmp/test.py 
#!/usr/bin/python3
import gi
gi.require_version('Modulemd', '2.0')
from gi.repository import Modulemd

module = Modulemd.read_packager_file('/tmp/out')
print("name={}".format(module.get_module_name()))
profile = module.get_profile('profileA')
print("profile={}".format(profile.get_name()))
packages = profile.get_rpms()
for package in packages:
    print("<{}>".format(package))
[test@fedora-37 libmodulemd-devel]$ GI_TYPELIB_PATH=/tmp/build/modulemd LD_LIBRARY_PATH=/tmp/build/modulemd PYTHONPATH=$PWD python3 /tmp/test.py 
name=test
profile=profileA
<>

Possibly all rpms lists are affected.

@ppisar ppisar self-assigned this Apr 13, 2022
@ppisar
Copy link
Collaborator Author

ppisar commented Apr 13, 2022

The cause is that libyaml returns YAML_SCALAR_EVENT for "" instead of YAML_MAPPING_END_EVENT:

(modulemd-validator:4969): libmodulemd-DEBUG: 16:52:57.284: Parser event: YAML_SCALAR_EVENT: profileA
(modulemd-validator:4969): libmodulemd-DEBUG: 16:52:57.284: TRACE: Entering modulemd_profile_parse_yaml
(modulemd-validator:4969): libmodulemd-DEBUG: 16:52:57.284: Parser event: YAML_MAPPING_START_EVENT
(modulemd-validator:4969): libmodulemd-DEBUG: 16:52:57.284: Parser event: YAML_SCALAR_EVENT: rpms
(modulemd-validator:4969): libmodulemd-DEBUG: 16:52:57.284: Parser event: YAML_SCALAR_EVENT:
(modulemd-validator:4969): libmodulemd-DEBUG: 16:52:57.284: Parsing scalar:   
(modulemd-validator:4969): libmodulemd-DEBUG: 16:52:57.284: Parser event: YAML_MAPPING_END_EVENT
(modulemd-validator:4969): libmodulemd-DEBUG: 16:52:57.284: TRACE: Exiting modulemd_profile_parse_yaml

That's probably in line with YAML. I think that YAML represents empty/undefined/missing mapping value node for a mapping key by an empty scalar. I will have to check YAML specification. Otherwise, it's a bug in libyaml.

Then there is a special handling for this in modulemd_yaml_parse_string_set():

        case YAML_SCALAR_EVENT:
          g_debug ("Parsing scalar: %s",
                   (const gchar *)event.data.scalar.value);
          g_hash_table_add (result,
                            g_strdup ((const gchar *)event.data.scalar.value));

          if (!in_list)
            {
              /* We got a scalar instead of a sequence. Treat it as a list with
               * a single entry
               */
              done = TRUE;
            }
          break;

which is responsible for the bogus empty package name.

@sgallagher
Copy link
Collaborator

Your original document is faulty.

  profiles:
    profileA:
      rpms:
      -

Literally says that there is one entry under rpms: and it's an empty scalar.

What you actually want is:

  profiles:
    profileA:
      rpms: []

This will give you a zero-length array, which will behave as you expect. The docs may need to clarify this, but the way it's behaving is as expected.

@ppisar
Copy link
Collaborator Author

ppisar commented Apr 13, 2022

My original document is without the trailing hyphen:

  profiles:
    profileA:
      rpms:

@sgallagher
Copy link
Collaborator

sgallagher commented Apr 13, 2022

That's still incorrect, because we expect a YAML_SEQUENCE_START_EVENT here and we get nothing. The libyaml is probably just trying to treat the lack of content here as a zero-length scalar, but that's pretty much just trying to avoid aborting on an invalid syntax.

@ppisar
Copy link
Collaborator Author

ppisar commented Apr 13, 2022

I'm not sure whether it's invalid syntax. E.g. Perl YAML::XS is fine with it and reports the missing value as undef:

$ perl -MYAML::XS -MData::Dumper -e 'print Data::Dumper::Dumper(YAML::XS::LoadFile(q{/tmp/out}))'
$VAR1 = {
          'document' => 'modulemd',
          'data' => {
                      'summary' => 'Test.',
                      'stream' => 'A',
                      'license' => {
                                     'module' => [
                                                   'MIT'
                                                 ]
                                   },
                      'profiles' => {
                                      'profileA' => {
                                                      'rpms' => undef
                                                    }
                                    },
                      'description' => 'Test.
',
                      'name' => 'test'
                    },
          'version' => 2
        };

@sgallagher
Copy link
Collaborator

sgallagher commented Apr 13, 2022

OK, https://yaml.org/spec/1.2.2/#72-empty-nodes says that they are technically valid and are to be interpreted as a zero-length scalar, which is commonly treated as synonymous with a "null" value.

It's still a violation of the libmodulemd specification which requires a sequence here, but I suppose we can be overly-cautious and catch this unusual case. If we get a zero-length scalar here, just treat it as a zero-length array of rpm names.

ppisar added a commit to ppisar/libmodulemd that referenced this issue Apr 14, 2022
Parsing a profile with an undefined rpms list:

  document: modulemd
  version: 2
  data:
      name: test
      stream: A
      summary: Test.
      description: >
          Test.
      license:
          module: [ MIT ]
      profiles:
          profileA:
            rpms:

lead to a nonempty list containing an empty-name package:

  profiles:
    profileA:
      rpms:
      -

while user probably wanted:

  profiles:
    profileA:
      rpms: []

This patch changes interpretation of the top-most excerpt from the
middle one to the last one.

fedora-modularity#593
ppisar added a commit to ppisar/libmodulemd that referenced this issue Apr 14, 2022
Parsing a profile with an undefined rpms list:

  document: modulemd
  version: 2
  data:
      name: test
      stream: A
      summary: Test.
      description: >
          Test.
      license:
          module: [ MIT ]
      profiles:
          profileA:
            rpms:

lead to a nonempty list containing an empty-name package:

  profiles:
    profileA:
      rpms:
      -

while user probably wanted:

  profiles:
    profileA:
      rpms: []

This patch changes interpretation of the top-most excerpt from the
middle one to the last one.

fedora-modularity#593
@ppisar ppisar closed this as completed Apr 14, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants