Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Partly out of range part-select gives wrong runtime result #1027

Open
veripoolbot opened this issue Jan 24, 2016 · 9 comments
Open

Partly out of range part-select gives wrong runtime result #1027

veripoolbot opened this issue Jan 24, 2016 · 9 comments

Comments

@veripoolbot
Copy link

@veripoolbot veripoolbot commented Jan 24, 2016


Author Name: Johan Bjork
Original Redmine Issue: 1027 from https://www.veripool.org
Original Date: 2016-01-24


According to 1800-2012:
"A part-select that addresses a range of bits that are completely out of the address bounds of the vector, packed array, packed structure, parameter or concatenation, or a part-select that is x or z shall yield the value x when read and shall have no effect on the data stored when written. Part-selects that are partially out of range shall, when read, return x for the bits that are out of range and shall, when written, only affect the bits that are in range." (11.5.1)

Code in V3Unknown.cpp seems to handle the first case, but not partly out of range assignments

     void replaceBoundLvalue(AstNode* nodep, AstNode* condp) {
	// Spec says a out-of-range LHS SEL results in a NOP.
	// This is a PITA.  We could:
	//  1. IF(...) around an ASSIGN,
	//     but that would break a "foo[TOO_BIG]=$fopen(...)".
	//  2. Hack to extend the size of the output structure
	//     by one bit, and write to that temporary, but never read it.
	//     That makes there be two widths() and is likely a bug farm.
	//  3. Make a special SEL to choose between the real lvalue
	//     and a temporary NOP register.
	//  4. Assign to a temp, then IF that assignment.
	//     This is suspected to be nicest to later optimizations.
	// 4 seems best but breaks later optimizations.  3 was tried,
	// but makes a mess in the emitter as lvalue switching is needed.  So 4.
	// SEL(...) -> temp
	//             if (COND(LTE(bit<=maxlsb))) ASSIGN(SEL(...)),temp)

Basic hardcoded example (Silly, but still exhibits the issue, more relevant when using variables)

module t (clk);
    input clk;

    reg [135:0] uneven;
    initial begin
       // Static
       uneven = '0;
       uneven[7   -: 64] = '1;
       uneven[71  -: 64] = '1;
       uneven[135 -: 64] = '1;
       $display("uneven: %B", uneven);
       if (uneven != '1) $stop;
       $write("*-* All Finished *-*\n");
	  $finish;
    end
endmodule

@veripoolbot

This comment has been minimized.

Copy link
Author

@veripoolbot veripoolbot commented Jan 29, 2016


Original Redmine Comment
Author Name: Wilson Snyder (@wsnyder)
Original Date: 2016-01-29T03:50:24Z


This is related to #�. The problem is fixing it requires significant runtime code which will may massively slow simulations. More thought is needed on how to reduce this burden.

@veripoolbot

This comment has been minimized.

Copy link
Author

@veripoolbot veripoolbot commented Feb 5, 2016


Original Redmine Comment
Author Name: Johan Bjork
Original Date: 2016-02-05T00:17:11Z


Did you have something in progress on this one (as mentioned in the linked ticket)? we have multiple designs impacted by this at the moment so if not I'll likely take a look soon.

@veripoolbot

This comment has been minimized.

Copy link
Author

@veripoolbot veripoolbot commented Feb 5, 2016


Original Redmine Comment
Author Name: Wilson Snyder (@wsnyder)
Original Date: 2016-02-05T00:25:52Z


I don't. Basically right now it looks at the msb/lsb range and makes a Bound check, which either turns on or turns off the indexing. It needs to become a mask. (and perhaps leave the Bound conditional as all-on will be the common case.) Thanks

@veripoolbot

This comment has been minimized.

Copy link
Author

@veripoolbot veripoolbot commented Feb 10, 2016


Original Redmine Comment
Author Name: Johan Bjork
Original Date: 2016-02-10T18:07:31Z


So I started with this, and it's a complete mess -- Any advice how to make this less terrible would be appreciated.

A couple of high level concerns:

  1. V3Width sets the dtype for the AstSel indexing to the number of bits required to index the variable. However, this needs to now allow negative values (to be able to correctly calculate offsets to use in a [var -: const] case. I've simply added 1 to allow for the sign bit and disabled the warning. Any advice how to improve the situation here?

  2. In the code to figure out the masks/shifting the correct values, I use a fair number of AstAdd/AstSub. I realized that these operations simply use the dtype of the lhsp and truncates the result. I've thrown in a few AstExtendS on the AstSel selection index (hardcoded to 32 at the moment, could be smarter there), but it feels very ugly.

I've uploaded a very early draft here, it still has plenty of failing tests and indentation/comments is a mess. Some early feedback would be very appreciated, just on a high-level whether this approach seems about right?

phb/verilator-dev@e530ee4#diff-6c80b812607617c162e46fd56e6579c8R951

Thanks
/Johan

@veripoolbot

This comment has been minimized.

Copy link
Author

@veripoolbot veripoolbot commented Feb 10, 2016


Original Redmine Comment
Author Name: Wilson Snyder (@wsnyder)
Original Date: 2016-02-10T18:24:01Z


  1. The spec says a select index is an integer. If we're going to do the right thing, do that.

Changing all of this will have huge runtime penalties unless we can eliminate most of this generated code by the end. I wonder if it would be better to have V3Width protect all selects with some new Ast type (similar to AstCondBound), then have later stages tune as needed, and teach V3Const how to simplify them away. V3Unknown would do the final cleanup searchng for that new node type and add the AstIf's.

  1. Try to reduce it to a constant when you insert it. Otherwise huge expansions sometimes result and while V3Const will clean them up it add a lot of runtime and mess when debugging is needed.
@veripoolbot

This comment has been minimized.

Copy link
Author

@veripoolbot veripoolbot commented Feb 10, 2016


Original Redmine Comment
Author Name: Johan Bjork
Original Date: 2016-02-10T18:29:13Z


Realized I broke one of the tests I have been caring about in the last upload, again relating to the question on how to do sign extensions and Adds/Subtracts properly. Ie, I get results like this:

- V3Const.cpp:615:    BICONST -> 32'h1b4
  const_old:  SUB 0x7ff5e2c8ee50 <e5834#> {e27} @dt=0x7ff5e2c30b60@(G/sw32)
  const_old: 1: CONST 0x7ff5e2c6f920 <e5830#> {e27} @dt=0x7ff5e2c30b60@(G/sw32)  32'sh4
  const_old: 2: CONST 0x7ff5e2c6fa50 <e5831#> {e27} @dt=0x7ff5e2c43280@(G/w9)  9'h50
        _new:  CONST 0x7ff5e2c8ef00 <e5844#> {e27} @dt=0x7ff5e2c30b60@(G/sw32)  32'h1b4
- V3Const__gen.cpp:3341:0x7ff5e2c6fec0 TREEOP ( AstNodeBiCom !$lhsp.castConst, $rhsp.castConst , swapSides(nodep) )
- V3Const__gen.cpp:4372:0x7ff5e2c6fe10 TREEOP ( AstSub $lhsp.castAdd, operandSubAdd(nodep) , AstAdd AstSub $lhsp->castAdd()->lhsp(),$rhsp , $lhsp->castAdd()->rhsp() )
- V3Const__gen.cpp:3391:0x7ff5e2c8f0c0 TREEOPC( AstNodeBiop $lhsp.castConst, $rhsp.castConst , replaceConst(nodep) )
- V3Const.cpp:615:    BICONST -> 32'h1b4

(for t_select_plus)

@veripoolbot

This comment has been minimized.

Copy link
Author

@veripoolbot veripoolbot commented Feb 10, 2016


Original Redmine Comment
Author Name: Johan Bjork
Original Date: 2016-02-10T18:42:13Z


Wilson Snyder wrote:

  1. The spec says a select index is an integer. If we're going to do the right thing, do that.

Makes sense, will do.

Changing all of this will have huge runtime penalties unless we can eliminate most of this generated code by the end. I wonder if it would be better to have V3Width protect all selects with some new Ast type (similar to AstCondBound), then have later stages tune as needed, and teach V3Const how to simplify them away. V3Unknown would do the final cleanup searchng for that new node type and add the AstIf's.

Agreed, the generated code looks awful at the moment. I'm trying to think of specific optimizations that could be done to optimize away this new node type. Are you thinking about checking the range of the input variable (or constant) and remove it in the cases where it's always in range. Any others?

  1. Try to reduce it to a constant when you insert it. Otherwise huge expansions sometimes result and while V3Const will clean them up it add a lot of runtime and mess when debugging is needed.
    Will do.
@veripoolbot

This comment has been minimized.

Copy link
Author

@veripoolbot veripoolbot commented Feb 10, 2016


Original Redmine Comment
Author Name: Wilson Snyder (@wsnyder)
Original Date: 2016-02-10T18:49:36Z


Agreed, the generated code looks awful at the moment. I'm trying to think of specific optimizations that could be done to optimize away this new node type. Are you thinking about checking the range of the input variable (or constant) and remove it in the cases where it's always in range. Any others?

Yes. Another common pattern will be var[var2] of width 1. If var is a power-of-two, and var-2 is log2 that size, then there's no way to go out of range.

Practically I think the thing to do is take one of your large designs (plus those in test_regress), run with the old version, run with the new version, and do a tree diff with var names removed, to see what new spots got ugly and to see if they can be optimized away.

Also if after all of that we still have too many extra instructions we may want a switch to disable this.

@veripoolbot

This comment has been minimized.

Copy link
Author

@veripoolbot veripoolbot commented Feb 10, 2016


Original Redmine Comment
Author Name: Johan Bjork
Original Date: 2016-02-10T19:30:00Z


I'll keep whacking away and see if I get somewhere.

More about adding various math operators to the Ast after V3Width. Is the contract that the two values passed in are the same type? Or that dtype lhsp is large enough to store the result, but length of rhsp does not really matter?
Ie, for the case I mentioned earlier, V3Number::opSub() will negate the rhs value using length of rhs, then performing the addition without any further extensions, ie, incorrect results if L(lhsp) > L(rhsp). Changing it to take the width of lhs makes it correct in my case (and maybe more correct in general?), but not sure if it matters because I broke the assumption that the two widths should be identical?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
1 participant
You can’t perform that action at this time.