Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Frontend code cleanup + audits #3891

Open
Baltoli opened this issue Dec 22, 2023 · 1 comment
Open

Frontend code cleanup + audits #3891

Baltoli opened this issue Dec 22, 2023 · 1 comment

Comments

@Baltoli
Copy link
Collaborator

Baltoli commented Dec 22, 2023

Introduction

As discussed in our year-end review, we want to be making a push towards simplicity and removal of un-needed code across the various K projects in 2024.

The biggest offender in this respect is the K Frontend; this project has been maintained continuously for the best part of 15 years at this point and so presents the greatest opportunity to remove code that is no longer needed. Actionable steps that we identified in the review meeting were to:

  • Run some kind of Java code coverage tool over the set of large semantics & regression tests that we maintain; this will give us an idea if there are non-statically dead code paths that never get run in practice.
  • Schedule a review meeting with relevant parties to comb through the code tree together in a meeting.
    • Important to make sure that we record the outcome of this as documentation.
    • I have invited @dwightguth and @Scott-Guest initially; we can grow this group if desirable.

Other bits of knowledge transfer that we might want to do as part of this process:

K Frontend Deep Dive (17 Jan 24)

We went through the K Frontend in some detail in a meeting; a brief summary of the main points that we identified there were:

  • The kore module is poorly named as a consequence of history; it should be renamed now that KORE refers to an actual language.
    • We may be able to fully remove this module in the near future; the main reason that it exists is to hive away the Scala code in the frontend.
    • Once we can Update JDK to LTS 21 #3699, it should be possible to remove the Scala code entirely by using the new Java pattern matching features.
    • This is blocked only on updating our build and release infrastructure to Ubuntu Noble (due April 2024), so I think we can wait for that update to go through before taking a further look at the Scala code.
  • There are a number of different data structures defined in the frontend across several different locations:
    • The "kore" ADT, which is actually a representation of K's inner syntax1. The compiler pipeline that lowers a K definition is implemented over this representation.
    • The TreeNodes structure that is generated by the parser; this is the parse tree precursor to the "kore" structure's AST.
    • A representation of what we refer to today as KORE (i.e. the language understood by the various backends). This representation is used in a few code paths (kast conversions from KORE, the LLVM backend's pattern matching compiler2).
    • A largely-legacy format called KIL that today is only really used as an intermediate step between the parser and our inner syntax representation.
  • There may be some code left over from the legacy (pre kprovex) Java prover that can be removed.
  • Can we make some progress on Coupling the compilation of Kore backends #1989? The main blockers there are:
    • Conditional compilation steps depending on the backend; these should be factored out and any remaining differences in behaviour somehow pushed into the KORE.
    • Tools understanding new directory layouts.
  • There are duplicated implementations of the K unparser in the frontend and LLVM backend. We can perhaps try to unify these two implementations; this will need a bit of an audit to see where the Java path can be used without the LLVM backend being present.
  • We have a binary format that serializes inner syntax. Does anything ever actually use this?
  • There are a couple of JavaCC parsers that we might be able to deprecate as well (for an outdated representation of K inner syntax - confusingly referred to here as KAST - and for output from Z3).
  • One place that complexity gets hidden inside the compiler is in the various compilation pipelines. It's not very clear from reading these what the ordering / dependencies between phases are, or how the intermediate code looks after each one. We should do a focused cleanup and documentation effort on these:

K Distribution Deep Dive (24 Jan 24)

A summary of the main points covered when doing our deep dive into the broader K distribution beyond the core Java code:

  • Mechanisms for loading backends; dynamic / reflection based registration and lookup for these to allow them to be developed semi-separately to the rest of the K repository.
    • Logic for discarding modules relevant should we begin to look at coupled compilation.
  • Possible locations of dead code and other cruft:
  • We should document the compiler <> standard library glue in kast.md better
  • Document the Makefile infrastructure that runs the test suite
  • Document the packaging work that we do
  • Down the line, we might consider separate sessions for Nix and CI setup if there's interest

LLVM Backend Deep Dive (31 Jan 24)

upcoming

Footnotes

  1. An interesting question would be to compare this inner syntax representation to Pyk's. What representations / decisions did we do differently in each place? Did we learn any lessons representing K inner syntax in Pyk that could be backported to the frontend?

  2. Maybe this code could be pulled out into a separate library? The deployment of these data structures caused us some weirdnesses when we needed to change them during the change to make \and and \or n-ary.

@Baltoli Baltoli changed the title Dead or near-dead frontend code audit Frontend code cleanup + audits Jan 18, 2024
@Baltoli Baltoli pinned this issue Jan 26, 2024
rv-jenkins pushed a commit that referenced this issue Feb 8, 2024
As part of the cleanup discussed in #3891, this PR removes statically
dead code as identified by IntelliJ.

IntelliJ's analysis reports some false positives, e.g. with dependency
injection and usages in Scala code, so I had to manually review and
delete everything. It also seems to miss some dead Scala code.

The "Batch N" commits just delete dead code, while the following commits
perform some easy simplifications I noticed.
@Baltoli
Copy link
Collaborator Author

Baltoli commented Mar 26, 2024

Moving this out of the main overview section as it's run its course wrt. active development, but we might still want to take a look at it at some point.

@PetarMax PetarMax unpinned this issue Apr 25, 2024
@PetarMax PetarMax pinned this issue Apr 25, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

1 participant