Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Verify register kinds and regions in BIR #995

Open
1 of 2 tasks
ushirask opened this issue Apr 20, 2022 · 16 comments
Open
1 of 2 tasks

Verify register kinds and regions in BIR #995

ushirask opened this issue Apr 20, 2022 · 16 comments

Comments

@ushirask
Copy link
Contributor

ushirask commented Apr 20, 2022

Description:

Verify register kinds and regions in the BIR verfier

Check if

  • Temp registers are assigned once and only once as a result of an instruction.

  • List/map operand in ListSetInsn / MappingSetInsn is not a final register.

  • Temp registered are assigned in a previous instruction in the block before being used as a operand in another instruction.

  • Narrow register is a subtype of underlying unnarrowed register.

  • Verify regions

Check if

  • All blocks in the function code can be reached from block 0 via forward edges
  • Backward branches must point to a block that is a loop head
  • Region records are created for each single-entry single-exit region in the function code
  • Graphs formed by forward edges must be acyclic

Suggested Labels:

Suggested Assignees:

Affected Product Version:

OS, DB, other environment details and versions:

Steps to reproduce:

Related Issues:

@jclark
Copy link
Contributor

jclark commented May 2, 2022

Graphs are acyclic along forward edges and along backward edges

What does this mean? It's the graph formed by the forward edges that is acyclic.

@jclark
Copy link
Contributor

jclark commented May 2, 2022

All blocks in the function code can be reached when navigating the graph

Reached from where? Must be reachable via forward edges only.

@jclark
Copy link
Contributor

jclark commented May 2, 2022

Backward branches will only be allowed in a region that is a loop

Point is that backward branch must point to a block that is the entry of a loop.

@jclark
Copy link
Contributor

jclark commented May 2, 2022

Region records are created for each single-entry single-exit region in the function code

Really? I don't think so. How are you going to find every single-entry/single-exit region.

@ushirask
Copy link
Contributor Author

ushirask commented May 2, 2022

Region records are created for each single-entry single-exit region in the function code

Really? I don't think so. How are you going to find every single-entry/single-exit region.

Can't we determine the location of single-entry/single-exit regions with conditional branch instructions?

@ushirask
Copy link
Contributor Author

ushirask commented May 3, 2022

When the graph created by edges in the function code diverges into a different path and eventually rejoin, such a branching is considered a region to be used for WASM.

@jclark
Copy link
Contributor

jclark commented May 4, 2022

@ushirask Does it mean that every conditional branch is the entry point of a single-entry, single-exit region? Or does it mean every place that is a join has to be the exit point of a region? Or something else?

@manuranga
Copy link
Contributor

We had a chat yesterday and concluded, yes, every conditional should be a region entry, but exit should only appear when all the paths that diverged at the entry are merged. Given that Ballerina is a structured language that may mean every merge, but Ushira need to find out how thereturn, break, continue affects this.

Definition is not finalized since we found some complicated cases where it was not clear to us how the regions should be created.
Eg:

import ballerina/io;

public function main() {
    int i = 6;
    while i < 10 {
        io:println(i);
        if i == 8 {
            break;
        }
        i += 1;
    }
    io:println(true);
}

Current algorithm gives:

[{"entry":1,"exit":3,"parent":null,"kind":"REGION_LOOP"},{"entry":2,"exit":5,"parent":0,"kind":"REGION_COND"}]

We'll have another chat with Poorna to see if this is what is expected.

@jclark
Copy link
Contributor

jclark commented May 4, 2022

We had a chat yesterday and concluded, yes, every conditional should be a region entry

Sounds plausible.

but exit should only appear when all the paths that diverged at the entry are merged.

Otherwise it wouldn't be single exit, right?

@manuranga
Copy link
Contributor

but exit should only appear when all the paths that diverged at the entry are merged.

Otherwise it wouldn't be single exit, right?

Correct.

@manuranga
Copy link
Contributor

manuranga commented May 4, 2022

We had a long chat today. We came to some rough conclusions. After some reflection, I have simplified them as follows.

  • A given block is a region's entry if and only if:

    • If it is a target of a backward branch. In this case the region kind is REGION_LOOP.
      Else
    • If its terminator is a conditional, unless one (or both) of its paths solely leads to the region's or ancestor's exit (without going through the entry) or returns. [1]
      In this case the region kind is REGION_COND.
  • To define exit, let's consider all paths originating from entry, ignoring the paths that return:

    • For REGION_LOOP:
      • If all paths lead back to entry, exit is nil.
      • Otherwise, exit is the first acyclic block where all the other paths merge.
    • For REGION_COND:
      • exit is the first block where all paths merge excluding any path that lead to an ancestor's exit.

[1] Latter half of the 2nd requirement (following "unless") is not currently implemented. As far as I can see, it doesn't affect the correctness of the wback. But I think we should fix it.

@manuranga
Copy link
Contributor

manuranga commented May 6, 2022

Though about the above "unless ..." part. But It's only complicated because I am explaining how it should appear in the BIR. If we think about Ballerina code, it's an if where only one or more arms doesn't complete normally. In other words, if it really has a merge.

So I think generating it will be trivial🤞. Verification may take a bit of work.

It is correct even without "unless ..." part. But since it's trivial to generate and harder to recover in the backend, I suggest we'll include that as a part of the definition.

@ushirask
Copy link
Contributor Author

ushirask commented May 12, 2022

In the given graph according to the above definition the loop region starting from bb1 would be 1 to 3. However the region currently created in the frontend is from 1 to 4, which also seems to be allowed by WASM. We can either,
  1. Modify the region creation in the frontend to generate the 1 to 3 region
  2. Modify the definition to include that any block after the current exit, as long as it can be only reached via that, can also be considered an exit
public function main() {
    int i = 0;
    while true {
        if !(i < 3) {
            break;
        }
        io:println(i);
        i = i + 1;
    }
}

@manuranga
Copy link
Contributor

LLVM's loop definition has multiple exit blocks https://llvm.org/docs/LoopTerminology.html#terminology
In our exit block is merging of what LLVM calls exit blocks.

@ushirask
Copy link
Contributor Author

To define exit, let's consider all paths originating from entry, ignoring the paths that return:

* For `REGION_LOOP`:
  
  * If all paths lead back to `entry`, `exit` is nil.
  * Otherwise, `exit` is the first acyclic block where all the other paths merge.
  • If paths don't merge exit is nil.

@ushirask
Copy link
Contributor Author

Tmp registers should only be used within the block it was initialized, but this is not currently correct in the front end and should be fixed.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
4 participants