Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

HDL generation failure with clash 1.8.1 ghc 9.4.7 #2628

Open
pbreuer opened this issue Dec 19, 2023 · 14 comments · May be fixed by #2668
Open

HDL generation failure with clash 1.8.1 ghc 9.4.7 #2628

pbreuer opened this issue Dec 19, 2023 · 14 comments · May be fixed by #2668

Comments

@pbreuer
Copy link

pbreuer commented Dec 19, 2023

I'm bringing this over from the google group discussion as it appears to be a bug in clash, not with me , and Christiaan says to!
Code compiles and runs fine under clash 1.6.4 and ghc 9.0.7, and generates verilog fine too. Under clash 1.8.1 it compiles and runs without complaint, but verilog generation fails. The error arises from Netlist.hs function mkDcApplication in a case switch. With debugging provided by Christiaan the error message is

...
Clash: Normalization took 0.421s
:
Clash error call:
Clash.Netlist(1041): Under-applied constructor Product "GHC.Tuple.(,)" Nothing [Vector 2 (SP "GHC.Maybe.Maybe" > [("GHC.Maybe.Nothing",[]),("GHC.Maybe.Just",[Product "GHC.Tuple.(,)" Nothing [Signed 126,SP "GHC.Maybe.Maybe" [("GHC.Maybe.Nothing",[]),("GHC.Maybe.Just",[Signed 32])]]])]),Signed 126]

Applied args:
[]

The newly instrumented line 1041 is:

   case compare (length dcArgs) (length argExprsFiltered) of
     EQ -> return (HW.DataCon dstHType (DC (dstHType,0)) argExprsFiltered)
     LT -> error $ $(curLoc) ++ "Over-applied constructor"
     GT -> error $ $(curLoc) ++ "Under-applied constructor "  <-- HERE
                             ++ show dstHType
                             ++ "\n\nApplied args:\n"
                             ++ show argExprsFiltered
                             ++ "\n\nFunction args:\n"
                             ++ show (dstHType,dc,args,argHWTys)

I will append the part with the printout of the mkDcApplication function args as a separate file. Please tell me if that is (or is not)
sufficient info to duplicate the problem. Christian says the tuple constructor has not enough term level arguments but I do not know exactly what is meant although yes, I see nothing in the error message that I would call a term, I only see types. I see type ( Vec 1 (Maybe(Signed 126,Signed 32), Signed 126 ) and that is both OK and recognizable as something from my source code. Here is the "Function args" part of the error message:

err2.txt

Simplifying the type constraints in the source code has brought the error message down to this and pinpointed this clause in the case switch. Earlier it was just dropping straight through to the catchall error at the bottom. The error then had a type equality constraint/assertion as the top constructor and is attached here:

err.txt

I don't see anything in common between the two, but presumably the latter represents an earlier stage in the transformations
that are being applied. Hopefully y'all have a pretty prenter for that lot!

If you do figure out the bug or want to try something, please post patches or suggestions and I will try them. I can recompile.

Regards

PTB

@pbreuer
Copy link
Author

pbreuer commented Dec 19, 2023

PS. I have been able to locate the problem as stemming from a particular source code stanza. This does NOT generate error:

        -- strong inval, weak inval req
        f (Just x1,Just x2,Nothing,Just (idx1,v1),Just (idx2,v2)) = 
            let (idx1',tag1) = tacache_split_cxdr x1
                (idx2',tag2) = tacache_split_cxdr x2
            in
            if idx1' /= idx2' then
              let v1' = fst $ tazcache_line_inval_step v1 tag1
                  v2' = fst $ tazcache_line_weak_inval_step v2 tag2
              in ( if idx1 == idx1' then Just(idx1',v1') else Nothing
                 , if idx2 == idx2' then Just(idx2',v2') else Nothing
                 )
            else
              let v1' = fst $ tazcache_line_inval_step v1 tag1
                  v2' = fst $ tazcache_line_weak_inval_step v1' tag2
              in ( if idx2 == idx2' then Just(idx2',v2') else Nothing
                 , Nothing
                 )

This DOES generate error:

        -- strong inval, weak inval req
        f (Just x1,Just x2,Nothing,Just (idx1,v1),Just (idx2,v2)) = 
            let (idx1',tag1) = tacache_split_cxdr x1
                (idx2',tag2) = tacache_split_cxdr x2
            in
            if idx1' /= idx2' then
              let **(v1',_)** = tazcache_line_inval_step v1 tag1
                  **(v2' ,_)** = tazcache_line_weak_inval_step v2 tag2
              in ( if idx1 == idx1' then Just(idx1',v1') else Nothing
                 , if idx2 == idx2' then Just(idx2',v2') else Nothing
                 )
            else
              let **(v1',_)** = tazcache_line_inval_step v1 tag1
                  **(v2',_)** = tazcache_line_weak_inval_step __v1'__ tag2
              in ( if idx2 == idx2' then Just(idx2',v2') else Nothing
                 , Nothing
                 )

I've marked the difference in bold (hopefully - nope no hope, oh well, look for the double asterisks). It is pattern match for the first of a pair, vs applying fst. Also with v1 instead of v1' in that final else (marked with italics - ibid, look for the extra underlines fore and aft), everything is good with or without pattern match.

@pbreuer
Copy link
Author

pbreuer commented Dec 21, 2023

That bug only appears when the function above is in a where clause to another function. Raise the function to top level (with the same type declarations) and there is no problem generating verilog.

win1,win2 :: Maybe (idx, Vec (2^m) (Maybe (tag, Maybe addr)))
(win1,win2) = f (dx, d_x, dw, out1, out2)
where f :: ( Maybe cxdr , Maybe cxdr, Maybe(cxdr,addr), Maybe (idx,Vec(2^m)(Maybe(tag,Maybe addr))), Maybe (idx,Vec(2^m) (Maybe(tag,Maybe addr))))
-> ( Maybe(idx,CacheLine m tag addr), Maybe(idx,CacheLine m tag addr))
....

Yes, that is also a where clause in another function. The specialization of those type variables was originally left to much higher in the call hierarchy but in the effort to pin down the problem I specialized them lower and lower down and the governing constraints are now:

forall (m::Nat) (n::Nat) (p::Nat) (q::Nat) cxdr addr idx tag
. ( KnownNat q, KnownNat n, KnownNat m, KnownNat p
, n <= p, cxdr ~ Signed p, addr ~ Signed q, idx ~ Signed n, tag ~ Signed (p-n)
)

but that has never made any difference one way or another.

So, in summary, at top level the function "f" causes no verilog generation problems. In a (second level) where clause it stops verilog generation with error indicating not enough lowe level terms in a tuple, so they have been disapeared. That is repaired even when in the where clause by making the trivial change of replacing a (foo,_) = pattern with a foo = fst $ .

@leonschoorl
Copy link
Member

I would be very helpful if you could provide a Short, Self Contained, Correct (Compilable), Example that demonstrates the problem.

@pbreuer
Copy link
Author

pbreuer commented Dec 21, 2023 via email

@pbreuer
Copy link
Author

pbreuer commented Dec 22, 2023

To add some context, this error usually disappears when I try to pin it, which is not easy because it has always manifested first as "never produces output" in large compilations. So finding out which part of the code it is in requires binary search and use of a stopwatch.

It is quite common ... it seems to be at about the 1:30000 lines of code level. That is, three instances per 100000 lines of code. The commonalities in the instances I have seen (roughly) are that they all manipulate streams, not data. The example above I obtained by replacing the manipulation of streams by a mealy machine with a nil state. The state transform part then got sliced up until I got down to the stanza above which evokes the hdl generation error shown. Putting that part as a top level function instead of a local function inside a "where" for the mealy machine state transform fixes the error and hdl generation works again.

I have just seen an instance in which foo ... bar ... generates verilog perfectly, but when bar is replaced with gum which also generates verilog perfectly, then it doesn't work. Foo and bar and gum are noinline and defined in different modules. By "doesn't work" I mean silently never produces output, the clash verilog generation just runs forever.

All these codes work fine for hdl generation under clash 1.6.4 on the same machine. I am wondering if perhaps the way this clash was compiled may be the problem (I am using the one from debian, except that I recompiled it to put in more error reporting, but I would have used the same infrastructure as the debian build). About all the info I can get out is "Glasgow Haskell Compiler, Version 9.4.7, stage 2 booted by GHC version 9.4.7". Maybe optimization flags were overoptimistic or something similar.

I'll make an effort today to further isolate a piece of code that you can use as a testbed, but I would urge a look at the mkDcApplication function to see what it does with those arguments from the error printout. Hopefully supplying them to it is "just" a matter of editing the error output.

Regards

PTB

@pbreuer
Copy link
Author

pbreuer commented Dec 22, 2023

I've gathered code that evokes the error into the attached self-contained file. It's a delicate bug so I haven't dared trim beyond what is there. I have commented with FIX changes to make verilog generation work. It is to replace (v,_) = with v = fst $ at the indicated places, and to replace the indicated case stanzas each with a call to a new top level function that contains just that stanza. Please run verilog generation on the code and confirm the error.

I don't think any of the cache_line support functions are involved, I have replaced them with dummies of the same type in the past without affecting things one way or the other. Please cut and slice away. I'll try reducing this further too. It's just so tricky to pin that I want to record this and see if you agree that it evokes an error on your platform. Instructions on command line to run are at top of file, with the recorded output.

Regards

PTB
Test.hs.txt

@bgamari
Copy link
Contributor

bgamari commented Dec 30, 2023

I have independently stumbled upon this issue and found that it may be related to separate compilation. See #2634.

@pbreuer
Copy link
Author

pbreuer commented Dec 30, 2023

I'm not doing anything special with compilation .. it's just one file, compiled in one go. This is the command line:

clash -XCPP -fconstraint-solver-iterations=0 -package silently
-fclash-no-render-enums -fclash-spec-limit=80 -fclash-inline-limit=80
-freduction-depth=400 -fclash-clear --verilog Test.hs

Reformatting the source code makes the error go away. No functionality change. Just replace (a,_) = .. with a= fst $ ... and make some case statement stanzas in a where subdefinition each into calls to separately defined top level functions containing the stanza.

Perhaps I misunderstand what is meant by "separate compilation"? Two object files, linked together?

@pbreuer
Copy link
Author

pbreuer commented Dec 31, 2023

I have checked against the cross-referenced bug #2628 above and -fno-worker-wrapper does not solve the issue here, though it is reported as doing so there. Shrug - maybe it's the same bug at heart, maybe not, can't tell. If some transformation going wrong is at the bottom of it then that could show up many ways.

I'd welcome a confirming report that generating hdl from the Test.hs file of this thread produces error on another platform. It should do. I recompiled clash as well as using the debian version.

@pbreuer
Copy link
Author

pbreuer commented Jan 4, 2024

I've simplified the test code more. See attached source file. It no longer makes much sense as code, but it still errors on generating verilog, not otherwise.

As expected, the little support functions could all be replaced by identity or simple constants and it was/is only necessary that they produce a pair output and be opaque. The error seems to say a tuple doesn't have enough elements after transformations have been applied.

Test5.hs.txt

@pbreuer
Copy link
Author

pbreuer commented Jan 4, 2024

Now reduced to logic only, still evoking error generating verilog with clash 1.8.1.

A couple of dummy support functions marked NOINLINE are needed to opaquely produce a pair value, of which only the first component is wanted by the error-producing code.

Three sites in this short code are marked. All of them together are required to produce the error. Fixing any one
of them removes the error. The fixes are commented.

I don't think it can get much shorter and still generate error.

Source file attached below. Compilation: clash -fconstraint-solver-iterations=0 -package silently -fclash-no-render-enums -fclash-spec-limit=80 -fclash-inline-limit=80 -freduction-depth=400 -fclash-clear --verilog FOO.hs

Test7a.hs.txt

PTB

@pbreuer
Copy link
Author

pbreuer commented Jan 23, 2024

Is there any progress on this? The source above is just a few lines, and the fixes are marked.

PTB

christiaanb added a commit that referenced this issue Feb 14, 2024
Create multiple selectors, one for each non-shared argument.

The previous code was a big mess where we partioned arguments
into shared and non-shared and then filtered the case-tree
depending on whether they were part of the shared arguments
or not. But then with the normalisation of type arguments,
the second filter did not work properly. This then resulted in
shared arguments becoming part of the tuple in the alternatives
of the case-expression for the non-shared arguments.

The new code is also more robust in the sense that shared and
non-shared arguments no longer need to be partioned (shared
occur left-most, non-shared occur right-most). They can now
be interleaved. The old code would also generate bad Core if
ever type and term arguments occured interleaved, this is no
longer the case for the new code.

Fixes #2628
christiaanb added a commit that referenced this issue Feb 14, 2024
Create multiple selectors, one for each non-shared argument.

The previous code was a big mess where we partioned arguments
into shared and non-shared and then filtered the case-tree
depending on whether they were part of the shared arguments
or not. But then with the normalisation of type arguments,
the second filter did not work properly. This then resulted in
shared arguments becoming part of the tuple in the alternatives
of the case-expression for the non-shared arguments.

The new code is also more robust in the sense that shared and
non-shared arguments no longer need to be partioned (shared
occur left-most, non-shared occur right-most). They can now
be interleaved. The old code would also generate bad Core if
ever type and term arguments occured interleaved, this is no
longer the case for the new code.

Fixes #2628
@christiaanb christiaanb linked a pull request Feb 14, 2024 that will close this issue
2 tasks
@christiaanb
Copy link
Member

I believe I have fixed the error. #2668 compiles your Test7a.hs without errors.

@pbreuer
Copy link
Author

pbreuer commented Feb 14, 2024

Wonderful! Well done and thank you.
I'll have a look to see if that works also on the full unabridged original code.

Regards
PTB

christiaanb added a commit that referenced this issue Feb 15, 2024
Create multiple selectors, one for each non-shared argument.

The previous code was a big mess where we partioned arguments
into shared and non-shared and then filtered the case-tree
depending on whether they were part of the shared arguments
or not. But then with the normalisation of type arguments,
the second filter did not work properly. This then resulted in
shared arguments becoming part of the tuple in the alternatives
of the case-expression for the non-shared arguments.

The new code is also more robust in the sense that shared and
non-shared arguments no longer need to be partioned (shared
occur left-most, non-shared occur right-most). They can now
be interleaved. The old code would also generate bad Core if
ever type and term arguments occured interleaved, this is no
longer the case for the new code.

Fixes #2628
christiaanb added a commit that referenced this issue Feb 16, 2024
Create multiple selectors, one for each non-shared argument.

The previous code was a big mess where we partioned arguments
into shared and non-shared and then filtered the case-tree
depending on whether they were part of the shared arguments
or not. But then with the normalisation of type arguments,
the second filter did not work properly. This then resulted in
shared arguments becoming part of the tuple in the alternatives
of the case-expression for the non-shared arguments.

The new code is also more robust in the sense that shared and
non-shared arguments no longer need to be partioned (shared
occur left-most, non-shared occur right-most). They can now
be interleaved. The old code would also generate bad Core if
ever type and term arguments occured interleaved, this is no
longer the case for the new code.

Fixes #2628
christiaanb added a commit that referenced this issue Feb 28, 2024
Create multiple selectors, one for each non-shared argument.

The previous code was a big mess where we partioned arguments
into shared and non-shared and then filtered the case-tree
depending on whether they were part of the shared arguments
or not. But then with the normalisation of type arguments,
the second filter did not work properly. This then resulted in
shared arguments becoming part of the tuple in the alternatives
of the case-expression for the non-shared arguments.

The new code is also more robust in the sense that shared and
non-shared arguments no longer need to be partioned (shared
occur left-most, non-shared occur right-most). They can now
be interleaved. The old code would also generate bad Core if
ever type and term arguments occured interleaved, this is no
longer the case for the new code.

Fixes #2628
christiaanb added a commit that referenced this issue Feb 29, 2024
The previous code was a big mess where we partioned arguments
into shared and non-shared and then filtered the case-tree
depending on whether they were part of the shared arguments
or not. But then with the normalisation of type arguments,
the second filter did not work properly. This then resulted in
shared arguments becoming part of the tuple in the alternatives
of the case-expression for the non-shared arguments.

The new code is also more robust in the sense that shared and
non-shared arguments no longer need to be partioned (shared
occur left-most, non-shared occur right-most). They can now
be interleaved. The old code would also generate bad Core if
ever type and term arguments occured interleaved, this is no
longer the case for the new code.

Fixes #2628
christiaanb added a commit that referenced this issue Feb 29, 2024
The previous code was a big mess where we partioned arguments
into shared and non-shared and then filtered the case-tree
depending on whether they were part of the shared arguments
or not. But then with the normalisation of type arguments,
the second filter did not work properly. This then resulted in
shared arguments becoming part of the tuple in the alternatives
of the case-expression for the non-shared arguments.

The new code is also more robust in the sense that shared and
non-shared arguments no longer need to be partioned (shared
occur left-most, non-shared occur right-most). They can now
be interleaved. The old code would also generate bad Core if
ever type and term arguments occured interleaved, this is no
longer the case for the new code.

Fixes #2628
christiaanb added a commit that referenced this issue Feb 29, 2024
The previous code was a big mess where we partioned arguments
into shared and non-shared and then filtered the case-tree
depending on whether they were part of the shared arguments
or not. But then with the normalisation of type arguments,
the second filter did not work properly. This then resulted in
shared arguments becoming part of the tuple in the alternatives
of the case-expression for the non-shared arguments.

The new code is also more robust in the sense that shared and
non-shared arguments no longer need to be partioned (shared
occur left-most, non-shared occur right-most). They can now
be interleaved. The old code would also generate bad Core if
ever type and term arguments occured interleaved, this is no
longer the case for the new code.

Fixes #2628
christiaanb added a commit that referenced this issue Feb 29, 2024
The previous code was a big mess where we partioned arguments
into shared and non-shared and then filtered the case-tree
depending on whether they were part of the shared arguments
or not. But then with the normalisation of type arguments,
the second filter did not work properly. This then resulted in
shared arguments becoming part of the tuple in the alternatives
of the case-expression for the non-shared arguments.

The new code is also more robust in the sense that shared and
non-shared arguments no longer need to be partioned (shared
occur left-most, non-shared occur right-most). They can now
be interleaved. The old code would also generate bad Core if
ever type and term arguments occured interleaved, this is no
longer the case for the new code.

Fixes #2628
christiaanb added a commit that referenced this issue Apr 8, 2024
The previous code was a big mess where we partioned arguments
into shared and non-shared and then filtered the case-tree
depending on whether they were part of the shared arguments
or not. But then with the normalisation of type arguments,
the second filter did not work properly. This then resulted in
shared arguments becoming part of the tuple in the alternatives
of the case-expression for the non-shared arguments.

The new code is also more robust in the sense that shared and
non-shared arguments no longer need to be partioned (shared
occur left-most, non-shared occur right-most). They can now
be interleaved. The old code would also generate bad Core if
ever type and term arguments occured interleaved, this is no
longer the case for the new code.

Fixes #2628
christiaanb added a commit that referenced this issue Apr 8, 2024
The previous code was a big mess where we partioned arguments
into shared and non-shared and then filtered the case-tree
depending on whether they were part of the shared arguments
or not. But then with the normalisation of type arguments,
the second filter did not work properly. This then resulted in
shared arguments becoming part of the tuple in the alternatives
of the case-expression for the non-shared arguments.

The new code is also more robust in the sense that shared and
non-shared arguments no longer need to be partioned (shared
occur left-most, non-shared occur right-most). They can now
be interleaved. The old code would also generate bad Core if
ever type and term arguments occured interleaved, this is no
longer the case for the new code.

Fixes #2628
christiaanb added a commit that referenced this issue Apr 22, 2024
The previous code was a big mess where we partioned arguments
into shared and non-shared and then filtered the case-tree
depending on whether they were part of the shared arguments
or not. But then with the normalisation of type arguments,
the second filter did not work properly. This then resulted in
shared arguments becoming part of the tuple in the alternatives
of the case-expression for the non-shared arguments.

The new code is also more robust in the sense that shared and
non-shared arguments no longer need to be partioned (shared
occur left-most, non-shared occur right-most). They can now
be interleaved. The old code would also generate bad Core if
ever type and term arguments occured interleaved, this is no
longer the case for the new code.

Fixes #2628
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

4 participants