Skip to content

Conversation

vecchiot-aws
Copy link
Contributor

Issue #, if available:

Description of changes:

This PR introduces the ability to specify a source file language in order to have language-specific functionality. The language is set with a flag, and defaults to C to allow currently existing uses of cbmc-viewer to continue functioning the same as before.

This PR additionally introduces language-specific regex patterns for pretty names in order to fix some issues with trace visualizations in Rust.

The two changes are coupled together in a single PR in order to discuss the design of the new Language class.

By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice.

@vecchiot-aws vecchiot-aws force-pushed the source_language_flag branch from a417f67 to 33d3b02 Compare June 18, 2021 19:03
)
return parser

def source_language(parser):

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What happens if we have mixed language code, e.g. Rust that calls C, which is then linked together.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure how to approach this. We could try to infer source language from the file extension, but if we want some way of explicitly passing it in, we'd need to pass in a mapping from file to language, which seems clunky.

Language.C: C,
Language.RUST: Rust
}
return name_map[name]

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if the language isn't in, do we want to return None or throw an exception

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This was meant to act like an enum, in which case a missing look-up shouldn't happen. I'm not sure which is better in that case (None is more generally usable, but more prone to bugs in Python).

Copy link
Contributor

@markrtuttle markrtuttle left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd like to discuss the Language class.

'--source-language',
default=Language.C,
choices=Language.ALL_LANGUAGE_NAMES,
help='Set the source language of the analyzed file.'
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How about "The source language compiled to produce the goto binary."

logging.warning("Use xml or json instead of text for "
"better results: %s", filenames)

def set_default_language(args):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This function appears to be unnecessary, since you set a default and you list all possible choices, so python itself seems to do the work of this function.

return cls._default_language

@classmethod
def set_default_language(cls, lang):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Where is this function called? I'm not sure this style of using classes and subclasses is pythonic.

# specific language implementations

@classmethod
def get_name(cls):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would remove the "get_" prefix since you always invoke them with full qualification, and since they really could be class attributes. And maybe they should be (initialized by the subclass init that I expected to see).

return '^([a-zA-Z0-9_<>: ]+)$', 1

@staticmethod
def get_symbol_regex():
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are you sure the symbol name and pretty name have different regular expressions? Are you sure there are no prefixes like "tag-" and "struct-" that should be considered?

if match:
return match.group(2)
# Match the shape of the input format
line_match = re.match("^Symbol\.\.\.\.\.\.: (.*)$", sym)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Prefer to use the original construct modified to be sym.strip.split(":", 1)[-1]. This would also allow removing the following if statement.

if re.match('^[a-zA-Z0-9_]+$', name):
return name
# Match the shape of the input format
line_match = re.match("^Pretty name\.: (.*)$", sym)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Prefer to use the original construct modified to be sym.strip.split(":", 1)[-1]. This would also allow removing the following if statement.


@staticmethod
def get_pretty_name_regex():
return '.*[^\S]([a-zA-Z0-9_]+)$', 1
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is different from the regexp you are replacing. Please confirm that the C regexps match the regexps you are replacing.

return cls._default_language

@classmethod
def get_default_language_or_fail(cls):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Delete get_default_language and make this get_default_language.

@vecchiot-aws vecchiot-aws force-pushed the source_language_flag branch from 20a56df to d1c1ec6 Compare June 22, 2021 20:03
@vecchiot-aws vecchiot-aws force-pushed the source_language_flag branch from d1c1ec6 to 3d5bc8a Compare June 22, 2021 21:16
markrtuttle pushed a commit to markrtuttle/aws-viewer-for-cbmc that referenced this pull request Aug 5, 2021
This commit has the effect of doubling the number of C symbols since
the list now includes fully-qualified symbols like function::1::x that
denotes local variable x in scope 1 of function.  But this doesn't
change the html generated by viewer for C code.

Proposing this pull request in favor of
model-checking#42
@markrtuttle
Copy link
Contributor

Please consider a simpler solution #48 and try it out on Rust.

@markrtuttle
Copy link
Contributor

Closing in favor of #48

@markrtuttle markrtuttle closed this Aug 6, 2021
markrtuttle pushed a commit that referenced this pull request Aug 6, 2021
This commit has the effect of doubling the number of C symbols since
the list now includes fully-qualified symbols like function::1::x that
denotes local variable x in scope 1 of function.  But this doesn't
change the html generated by viewer for C code.

Proposing this pull request in favor of
#42
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants