OID as names and multiple modules #72

vpiserchia · 2021-01-09T18:18:20Z

Allow user to specify two new arguments

--skip-missing: will ignore missing identifier not defined in the current module
--use-oid: use OID for creating the resulting file. ONLY in conjuction of --split

Together these two new switch will allow working with multiple modules using a coherent naming schema

- --skip-missing: will ignore missing identifier not defined in the current module - --use-oid: use OID for creating the resulting file. ONLY in conjuction of --split Together these two new switch will allow working with multiple modules using a coherent naming schema

kimgr

Thank you for this!

Can I ask that you split the two changes into discrete commits? The --skip-missing in one and the --use-oid in one (unless they're interdependent, but it doesn't look like it).

Some more review feedback inline.

kimgr · 2021-01-10T10:42:42Z

asn1ate/pyasn1gen.py

+
+        for imp, values in self.sema_module.imports.imports.iteritems():
+            self.writer.write_line('from _%s import %s' %
+                    ('_'.join([str(d.number) for d in imp.descendants() if isinstance(d, NameAndNumberForm)]),', '.join(map(_sanitize_identifier, values))))


Please use two temporaries for these expressions;

for imp, values in self.sema_module.imports.imports.iteritems(): module = '_'.join([str(d.number) for d in imp.descendants() if isinstance(d, NameAndNumberForm)]) symbols = ', '.join(map(_sanitize_identifier, values)))) self.writer.write_line('from _%s import %s' % (module, symbols)

kimgr · 2021-01-10T10:43:38Z

asn1ate/pyasn1gen.py

+
+        for imp, values in self.sema_module.imports.imports.iteritems():
+            self.writer.write_line('from _%s import %s' %
+                    ('_'.join([str(d.number) for d in imp.descendants() if isinstance(d, NameAndNumberForm)]),', '.join(map(_sanitize_identifier, values))))


No need for list literal inside join([...]), join works on a generator expression too.

kimgr · 2021-01-10T10:44:50Z

asn1ate/pyasn1gen.py

+
+        for imp, values in self.sema_module.imports.imports.iteritems():
+            self.writer.write_line('from _%s import %s' %
+                    ('_'.join([str(d.number) for d in imp.descendants() if isinstance(d, NameAndNumberForm)]),', '.join(map(_sanitize_identifier, values))))


I tend to prefer generator expressions over map, but I guess that expression becomes more complicated as join(_sanitize_identifier(v) for v in values). Actually, it doesn't look so bad, let's use that.

kimgr · 2021-01-10T10:46:09Z

asn1ate/pyasn1gen.py

@@ -129,6 +129,10 @@ def __init__(self, sema_module, out_stream, referenced_modules):
    def generate_code(self):
        self.writer.write_line('# %s' % self.sema_module.name)
        self.writer.write_line('from pyasn1.type import univ, char, namedtype, namedval, tag, constraint, useful')
+
+        for imp, values in self.sema_module.imports.imports.iteritems():


Does this even work without --use-oid?

kimgr · 2021-01-10T10:47:45Z

asn1ate/pyasn1gen.py

+
+        for imp, values in self.sema_module.imports.imports.iteritems():
+            self.writer.write_line('from _%s import %s' %
+                    ('_'.join([str(d.number) for d in imp.descendants() if isinstance(d, NameAndNumberForm)]),', '.join(map(_sanitize_identifier, values))))
        for module in self.referenced_modules:


Seeing this, I'm wondering why the OID-named module is not part of self.referenced_modules... Would it be possible to fold it into the same infrastructure? I don't remember the design here, so it might be impossible/messy, but it looks like it might fit better there.

need to make some experiments

I haven't found a quick way to solve this, help needed here

I looked through this, and it seems like referenced_modules is the "module corpus" being defined by the ASN.1 input (i.e. there can be more than one module definition in the same input file).

So prior to your changes, we would only allow imports and type resolution between fully defined modules.

The pyasn1 generation would also generate a distinct Python module for each ASN.1 module. Since the ASN.1 was free to refer to types and values using e.g. Module2.MyType from Module1.MyConstructedType, the generated Python would add imports of all modules in the "corpus", so they were available.

Now you're trying to introduce something similar for IMPORTS. I think the difference is that there's not necessarily any ASN.1 definition available to asn1ate for an imported construct (there would need to exist some way to map an ASN.1 module ref to an external ASN.1 file, which would also need to be parsed, etc. None of this currently exists).

So I'm a little curious how well this use_oid flag generalizes. What happens for the existing tests in testdata/imports.asn? My guess is we will codegen imports of OID-named Python modules that don't exist (e.g. _2_5_1_0_4.py for ImportMultipleWithAssignedOID in that test file).

And not all IMPORTS are NameAndNumberForm, so the module names will not necessarily be complete/correct.

Hmm. This is all pretty complex, which is probably why I never got around to implementing it properly.

It seems to me your changes must work for some particular case -- do you have a (possibly anonymized) ASN.1 input you could share that requires these changes and for which the generated pyasn1 is functional?

my use case is to allow building a module without this having all the declaration available in the same file; now I see I misused the tool and this creates more problems than it solves (for my lack of knowledge). That said I think it's better to close this PR, unless you see anything useful.

I'm thinking something must have gone right :). I like the use case of half-translating ASN.1 and leave some manual fixup to the user. It's much easier than doing it all by hand. But we want to make sure we don't silently generate something that's broken or make asn1ate more self-inconsistent.

kimgr · 2021-01-10T10:48:41Z

asn1ate/pyasn1gen.py

@@ -668,7 +672,10 @@ def main(args):

    for module in modules:
        if args.split:
-            outfile = _sanitize_module(module.name) + '.py'
+            if args.use_oid:
+                outfile = "_" + "_".join(module.oid) + '.py'


Add a _sanitize_oid?

kimgr · 2021-01-10T10:49:13Z

asn1ate/pyasn1gen.py

@@ -694,6 +701,10 @@ def main_cli():
                            help='output multiple modules to separate files')
    arg_parser.add_argument('--include-asn1', action='store_true',
                            help='output ASN.1 source as part of generated code')
+    arg_parser.add_argument('--use-oid', action='store_true',
+                            help='output module using the OID')


More like name output module using the OID, right?

kimgr · 2021-01-10T10:52:20Z

asn1ate/sema.py

    """ Build a semantic model of the ASN.1 definition
    from a syntax tree generated by asn1ate.parser.
    """
+    global skip_missing


I'm not too happy about the global, but I can't think of a nice way to trickle the flag down all the way to where it's used. 🤷, it's fine, I guess.

Actually, since it's only used in resolve_type_decl, why not make it an argument to that function instead, and keep it as an instance member in the Pyasn1Backend?

instance member is fine, I'll fix in the next version of the PR

resolve_type_decl is part of the sema.py not defined in Pyasn1Backend, cannot find a better way to accomplish this

# sema.py class Module(SemaNode): ... def resolve_type_decl(self, type_decl, referenced_modules, skip_missing=False): ...

# pyasn1gen.py class Pyasn1Backend(object): ... def __init__(self, ..., skip_missing=False): self.skip_missing = skip_missing def some_other_method(self): self.sema_module.resolve_type_decl(..., self.skip_missing) def generate_pyasn1(sema_module, out_stream, referenced_modules, header=None, footer=None, skip_missing=False): ...

kimgr · 2021-01-10T10:57:36Z

asn1ate/sema.py

@@ -302,6 +307,12 @@ def resolve_type_decl(self, type_decl, referenced_modules):
                raise Exception('Unrecognized referenced module %s in %s.' % (type_decl.module_ref.name,
                                                                              [module.name for module in
                                                                               referenced_modules]))
+
+            if type_decl.type_name not in module.user_types():


So what are the semantics of --skip-missing? Maybe it would be nice to describe it in a comment here.

It looks a little dangerous to return None here, because there are several places in pyasn1gen where the resolved type is used without None- or type checking.

+1 for comment.

please suggest a safer return value

I'm not sure there's a safer value as such -- the problem is that all callers need to have a way to deal with a missing type here now. Search for resolve_type_decl in pyasn1gen.py. I don't know what a code generator should do if the semantic model has holes in it like this.

kimgr · 2021-01-10T10:59:54Z

asn1ate/sema.py

+
+            if type_decl.type_name not in module.user_types():
+                if skip_missing:
+                    for k, value in self.imports.imports.items():


Particularly, I don't understand this part. We only skip it if it's part of the imports?

that's the idea, we skip IFF the declared type was in the IMPORTS section of the ASN.1 schema otherwise it will keep the same behaviour as before, e.g. raising an KeyError exception.

I don't really understand why, though. It's been a couple of years since I had my head in ASN.1, so I'm a little rusty...

I can see that non-imported types (defined in the same module) can never be missing. But in a correct multi-module schema, the imported ones should never be missing either, right? If a comment would say # HACK: Can't figure out imported symbols so ignore them for now that would tell one story. If it would say # It's OK for imported symbols to be missing because XYZ, it would be easier to understand depending on what XYZ says :)

maybe this would solve a use case like: no correct multi-module schema is available and the user still wants to generate a code that he can after manually change to resolve the imports?

Yeah, I think that makes sense. I wonder if you could substitute something like ASN.1 ANY to indicate that there's a hole in the schema, and print a warning that the original type is unknown.

what I can think of is to collect together all the missing referenced identifier (from all the modules in the corpus) and add them to the end of the generated output; e.g. if IPAddress is referenced in the schema but imported from a missing ASN.1 (let's call it missingschema) , then asn1ate could generate:

try: from missingschema import IPAddress except ImportError: from pyasn1.type import univ def IPAddress(univ.Any): pass

Shouldn't you be able to know that missingschema is unavailable already at codegen time? Doesn't seem very useful to force an ImportError. But on the other hand it's actually a nice visual indicator in the generated code that something is off.

sorry for the delay, I had some urgencies..
did you make your mind clear about this?

No, I've also been busy with other things. It would be nice to just conclude on something, but I think it takes some iteration to figure out what the best solution is.

vpiserchia · 2021-01-10T11:14:34Z

I'll wait for your comments where requested and going to close this PR for a v2.

kimgr · 2021-01-10T11:25:23Z

I'll wait for your comments where requested and going to close this PR for a v2.

Don't close it; just rewrite the branch in place and force-push to the existing PR branch using git push -f.

…-missing

vpiserchia mentioned this pull request Jan 9, 2021

interdependent asn1 files to py files #71

Closed

kimgr reviewed Jan 10, 2021

View reviewed changes

vitopiserchia added 7 commits January 10, 2021 16:33

skip missing imports IFF they are IMPORTS using the new switch --skip…

9599770

…-missing

generate import statements for IMPORTS

91ed499

use OID for imports and output file

dd2fc45

pass the use_oid arguments to the code generator

e718734

simplify the code for skip missing declaration

7393cf4

fix typo

a5dcbfb

rework the passing of the skip_missing arguments

037387f

OID as names and multiple modules #72

Are you sure you want to change the base?

OID as names and multiple modules #72

Conversation

vpiserchia commented Jan 9, 2021

kimgr left a comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

This comment was marked as resolved.

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

kimgr Jan 10, 2021 • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

kimgr Jan 10, 2021 • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

vpiserchia commented Jan 10, 2021

kimgr commented Jan 10, 2021

kimgr Jan 10, 2021 •

edited

Loading

kimgr Jan 10, 2021 •

edited

Loading