Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Duplicate rules and sentences in compiled output #3906

Open
ehildenb opened this issue Jan 16, 2024 · 12 comments
Open

Duplicate rules and sentences in compiled output #3906

ehildenb opened this issue Jan 16, 2024 · 12 comments

Comments

@ehildenb
Copy link
Member

Using the following definition:

module TEST
    imports BOOL
    imports INT

    configuration <k> $PGM:Pgm </k>

    syntax Pgm ::= Bool | Int

endmodule

module TEST-SYMBOLIC [kore, symbolic]
    imports TEST

    rule A andBool B => B andBool A [simplification]
endmodule

module TEST-LLVM [symbolic]
    imports TEST-SYMBOLIC
endmodule

I call kompile test.k, to get output test-kompiled/compiled.txt.

Then I search for a rule that looks like this (the isK sort predicate):

  rule isK(K)=>#token("true","Bool") requires #token("true","Bool") ensures #token("true","Bool") [UNIQUE_ID(16ff77cff0ef50026a8b3f4614b87bda465701918596b7ad2280baffff56f847)]

And find that there are 50 occurrences of the same rule! This is not the only sort predicate with many duplicate occurrences.

Can we eliminate these duplicates by:

  • Having the compiler do something about it? Like factor out the sort predicates into their own module that is imported where it's needed?
  • Refactoring our prelude/import structure a bit so that commonly duplicated sorts like K and KItem and List are declared early so their sort predicates only occur once?
  • Note that we have an instance where module KSEQ declares the isK predicate, and module KSEQ-SYMBOLIC imports KSEQ but still also redeclares this sort predicate! So maybe there is a way to minimize the number of sentences directly already by not regenerating sort predicates if they already exist in imported modules.

Example from compiled.txt (snipped):

module KSEQ
  imports KAST
  imports K-TOP-SORT
  syntax {Sort} Sort ::= "(" Sort ")" [applyPriority(1), bracket, bracketLabel((_)_KSEQ_Sort_Sort{Sort}), defaultBracket]
  syntax associativity left #KSequence [org.kframework.attributes.Location(Location(104,3,104,25)), org.kframework.attributes.Source(Source(/nix/store/hdn1n6x3lclacnv59xc8wzmgravia6jw-k-6.1.77-18ce58892d77b05f8aa5
  syntax associativity left #KSequence
  syntax K ::= K "~>" K [assoc, klabel(#KSequence), left, symbol, unit(#EmptyK)]
  syntax K ::= ".K" [klabel(#EmptyK), symbol, unparseAvoid]
  syntax K ::= ".::K" [klabel(#EmptyK), symbol, unparseAvoid]
  syntax K ::= "." [klabel(#EmptyK), symbol] 
    ... 
  rule isK(K)=>#token("true","Bool") requires #token("true","Bool") ensures #token("true","Bool") [UNIQUE_ID(16ff77cff0ef50026a8b3f4614b87bda465701918596b7ad2280baffff56f847)] 
    ... 
endmodule


module KSEQ-SYMBOLIC
  imports ML-SYNTAX
  imports KVARIABLE-SYNTAX
  imports KSEQ
  syntax KConfigVar ::= r"(?<![A-Za-z0-9_\\$!\\?@])(\\$)([A-Z][A-Za-z0-9'_]*)" [token]
  syntax KBott ::= KConfigVar
  syntax KBott ::= #KVariable
  syntax #KVariable ::= r"(?<![A-Za-z0-9_\\$!\\?@])(\\!|\\?|@)?([A-Z][A-Za-z0-9'_]*|_|_[A-Z][A-Za-z0-9'_]*)" [prec(1), token]
  syntax #KVariable ::= #UpperId [token] 
    ... 
  rule isK(K)=>#token("true","Bool") requires #token("true","Bool") ensures #token("true","Bool") [UNIQUE_ID(16ff77cff0ef50026a8b3f4614b87bda465701918596b7ad2280baffff56f847)] 
    ... 
endmodule
@ehildenb
Copy link
Member Author

Related: #3540

@dwightguth
Copy link
Member

The most common cause of duplicated sentences in the frontend is due to incorrectly implemented functions which call sentences rather than localSentences. It's probably worth bisecting to see if we can find a revision where this is not happening.

If it does indeed prove to be specific to sort predicates, the most likely explanation is that we're declaring the same sort in multiple modules and then combining those modules later. It might be worth making sorts unable to be redeclared explicitly or implicitly.

@ehildenb
Copy link
Member Author

Hmmmm, perhaps here we are using allSorts and should be using local sorts?

for (Sort sort : iterable(mod.allSorts())) {

@dwightguth
Copy link
Member

dwightguth commented Jan 16, 2024

That class only adds the sentence if it doesn't already exist in one of its submodules, so it should be fine. You were noticing duplicated rules in the example you mentioned above; that pass only adds productions. Did you see duplicated productions also?

@Baltoli
Copy link
Collaborator

Baltoli commented Jan 18, 2024

Related to the compiler pipeline auditing in #3891

@gtrepta
Copy link
Contributor

gtrepta commented Jan 23, 2024

I've found that the GenerateSortPredicateRules and GenerateSortPredicateSyntax passes in the pipeline are indeed iterating over sorts in supermodules and creating redundant rules/productions. Here's a diff that cuts the number of isK rules in the original example from 50 down to 11, and compiled.txt down from 1600 lines to 1114 lines:

diff --git a/kernel/src/main/java/org/kframework/compile/GenerateSortPredicateRules.java b/kernel/src/main/java/org/kframework/compile/GenerateSortPredicateRules.java
index 9ec11724e1..fa49185a5c 100644
 --- a/kernel/src/main/java/org/kframework/compile/GenerateSortPredicateRules.java
 +++ b/kernel/src/main/java/org/kframework/compile/GenerateSortPredicateRules.java
@@ -43,7 +43,7 @@ public class GenerateSortPredicateRules {
         mod.imports(),
         (Set<Sentence>)
             mod.localSentences()
-                .$bar(stream(mod.allSorts()).flatMap(this::gen).collect(Collections.toSet())),
+                .$bar(stream(mod.localSorts()).flatMap(this::gen).collect(Collections.toSet())),
         mod.att());
   }

diff --git a/kernel/src/main/java/org/kframework/compile/GenerateSortPredicateSyntax.java b/kernel/src/main/java/org/kframework/compile/GenerateSortPredicateSyntax.java
index f1a5b0b757..485a2c0252 100644
 --- a/kernel/src/main/java/org/kframework/compile/GenerateSortPredicateSyntax.java
 +++ b/kernel/src/main/java/org/kframework/compile/GenerateSortPredicateSyntax.java
@@ -20,7 +20,7 @@ public class GenerateSortPredicateSyntax {

   public Module gen(Module mod) {
     Set<Sentence> res = new HashSet<>();
-    for (Sort sort : iterable(mod.allSorts())) {
+    for (Sort sort : iterable(mod.localSorts())) {
       res.addAll(gen(mod, sort));
     }
     if (!res.isEmpty()) {
diff --git a/kore/src/main/scala/org/kframework/definition/outer.scala b/kore/src/main/scala/org/kframework/definition/outer.scala
index 3afffbc43c..23f8e93e1a 100644
 --- a/kore/src/main/scala/org/kframework/definition/outer.scala
 +++ b/kore/src/main/scala/org/kframework/definition/outer.scala
@@ -342,11 +342,19 @@ case class Module(
   private def mergeAttributes[T <: Sentence](p: Set[T]) =
     Att.mergeAttributes(p.map(_.att))

+  private def sortsFromSentences(s: Set[Sentence]) =
+    s.collect {
+      case SyntaxSort(_, sort, _)                                      => sort
+      case p @ Production(_, _, sort, _, _) if !p.isSortVariable(sort) => sort
+    }
+
   lazy val definedSorts: Set[SortHead] = (productions
     .filter(p => !p.isSortVariable(p.sort))
     .map(_.sort.head)) ++ (sortDeclarations.filter(s => s.params.isEmpty).map {
     _.sort.head
   }) ++ definedInstantiations.values.flatten.flatMap(_.params).filter(_.isNat).map(_.head)
+  lazy val importedSorts: Set[Sort] = sortsFromSentences(importedSentences)
+  lazy val localSorts: Set[Sort]    = sortsFromSentences(localSentences) -- importedSorts
   lazy val definedInstantiations: Map[SortHead, Set[Sort]] = {
     val nonempty = ((productions
       .filter { p =>

However, this solution doesn't work at the moment. If the definition uses any parameterized sorts like MInt then it breaks, and I'm still investigating why. An exception gets thrown during the llvm backend's matching tree generation for a missing SortWidth key, presumably the Width parameter in MInt{Width} is somehow making its way into the localSorts val that I created.

@Baltoli
Copy link
Collaborator

Baltoli commented Jan 23, 2024

@gtrepta there is an allSorts method that may not collect the sort parameters

@gtrepta
Copy link
Contributor

gtrepta commented Jan 25, 2024

Drafted a PR for this ^^^.

There are still a number of duplicated rules after this, one reason for this is because a sort will be declared locally in multiple modules but not in any of those modules' imports, and the pipeline doesn't have a way to know whether or not the sort predicates in any of those modules will be necessary.

Additionally, there are stages in the pipeline before sort predicate generation that add sort declarations of common sorts like K to the module and predicates for those will be added for the reason above.

Interestingly, I also noticed that the pipeline is doing the predicate generation and some other stages more than once:

.andThen(generateSortPredicateSyntax)
.andThen(generateSortProjections)
.andThen(constantFolding)
.andThen(propagateMacroToRules)
.andThen(expandMacros)
.andThen(checkSimplificationRules)
.andThen(guardOrs)
.andThen(AddImplicitComputationCell::transformDefinition)
.andThen(resolveFreshConfigConstants)
.andThen(resolveFreshConstants)
.andThen(generateSortPredicateSyntax)
.andThen(generateSortProjections)

generateSortPredicateSyntax and generateSortProjections are listed twice in this snippet. I played around with removing the duplicates and moving them around but it would just break the compiler. Looks like something to look into in your audit of the pipeline, @Baltoli.

@Baltoli
Copy link
Collaborator

Baltoli commented Jan 25, 2024

Interesting, well spotted @gtrepta!

rv-jenkins added a commit that referenced this issue Jan 26, 2024
…3923)

For #3906

This cuts down on the number of duplicated sort predicate rules by only
generating ones for sorts that are declared locally in a module and not
anywhere in its imported modules.

Co-authored-by: rv-jenkins <admin@runtimeverification.com>
@gtrepta
Copy link
Contributor

gtrepta commented Feb 5, 2024

Unassigning myself after finishing my PR for this. There is still some work left to do based on my comment above.

@gtrepta gtrepta removed their assignment Feb 5, 2024
@Baltoli
Copy link
Collaborator

Baltoli commented Feb 6, 2024

The work remaining here is to work out which passes in the compilation pipeline are declaring rules / productions over (say) the K sort; the compiler can't globally identify where there are redundancies and so ends up repeating work.

@Baltoli
Copy link
Collaborator

Baltoli commented Mar 26, 2024

Moving to the backlog for now as we've made some headway, but will need more of the ongoing work on the compiler pipeline to be resolved to make further progress here.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

4 participants