New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Optimization produces incorrect results #603

Closed
rahulmutt opened this Issue Jan 5, 2018 · 36 comments

Comments

Projects
None yet
2 participants
@rahulmutt
Member

rahulmutt commented Jan 5, 2018

@jarekratajski reported this:

fibbtcoinner 0 sum presum  = sum
fibbtcoinner n sum presum = fibbtcoinner  (n-1) (sum + presum) sum

fibbtco n = fibbtcoinner n 1 0

fibbnaive  0 = 1
fibbnaive  1 = 1
fibbnaive  n = fibbnaive ( n-1) + fibbnaive ( n - 2)

main = do
        putStrLn $ show $ fibbtco <$> arr
        putStrLn $ show $ fibbnaive <$> arr
        where
arr = [0..10]

produces

[1,1,2,4,8,16,32,64,128,256,512]
[1,1,2,3,5,8,13,21,34,55,89]

when it should produce:

[1,1,2,3,5,8,13,21,34,55,89]
[1,1,2,3,5,8,13,21,34,55,89]

Reproduced on v0.7.

@rahulmutt rahulmutt added the bug label Jan 5, 2018

@rahulmutt

This comment has been minimized.

Show comment
Hide comment
@rahulmutt
Member

rahulmutt commented Jan 5, 2018

@jarekratajski

This comment has been minimized.

Show comment
Hide comment
@jarekratajski

jarekratajski Jan 5, 2018

Contributor

To be more precise: with -O0 problem prevails.
Correct behavior only with -O0 AND BangPatterns params. I am just looking at code of eta compiler... do you have any hint where to look at first?

Contributor

jarekratajski commented Jan 5, 2018

To be more precise: with -O0 problem prevails.
Correct behavior only with -O0 AND BangPatterns params. I am just looking at code of eta compiler... do you have any hint where to look at first?

@jarekratajski

This comment has been minimized.

Show comment
Hide comment
@jarekratajski

jarekratajski Jan 5, 2018

Contributor

While analyzing decompiled code :

     public final Closure apply6(StgContext var1, Closure var2, Closure var3, Closure var4, Closure var5, 
             Closure var6, Closure var7) {
            Object var9 = null; //i guess n
            Object var10 = null; //i guess sum
            Object var11 = null; //i guess presum
            boolean var8 = false;

            while(true) {
                while(var8) {
                    Main.sat_s7YH var12 = new Main.sat_s7YH(var3);
                    var1.R1 = var2;
                    Closure var13 = Classes.zeze().enter(var1).apply2(var1, (Closure)var9, var12);
                    if (!(var13 instanceof False)) {
                        return ((Closure)var10).evaluate(var1);
                    }

                    Main.sat_s7YM var14 = new Main.sat_s7YM(var4, (Closure)var10, (Closure)var11); //this must be sum + presum
                    Main.sat_s7YL var15 = new Main.sat_s7YL(var3, (Closure)var9); // this must be n-1
                    var9 = var15;
                    var10 = var14; // here is the error IMO 
                    var11 = var14; // 
                  /* this would be correct
                    var11 = var10; 
                    var10 = var14; 
                   
                     */
                    var8 = true;
                }

                var9 = var5;
                var10 = var6;
                var11 = var7;
                var8 = true;
            }
        }

I am only guessing... but this brought me to idea of swapping order of params :

   fibbtcoinner 0 presum sum  = sum
   fibbtcoinner n presum sum = fibbtcoinner  (n-1) sum (sum + presum)


   fibbtco n = fibbtcoinner n 0 1

and now this code gives correct results also in O2.
Hope this helps.

Contributor

jarekratajski commented Jan 5, 2018

While analyzing decompiled code :

     public final Closure apply6(StgContext var1, Closure var2, Closure var3, Closure var4, Closure var5, 
             Closure var6, Closure var7) {
            Object var9 = null; //i guess n
            Object var10 = null; //i guess sum
            Object var11 = null; //i guess presum
            boolean var8 = false;

            while(true) {
                while(var8) {
                    Main.sat_s7YH var12 = new Main.sat_s7YH(var3);
                    var1.R1 = var2;
                    Closure var13 = Classes.zeze().enter(var1).apply2(var1, (Closure)var9, var12);
                    if (!(var13 instanceof False)) {
                        return ((Closure)var10).evaluate(var1);
                    }

                    Main.sat_s7YM var14 = new Main.sat_s7YM(var4, (Closure)var10, (Closure)var11); //this must be sum + presum
                    Main.sat_s7YL var15 = new Main.sat_s7YL(var3, (Closure)var9); // this must be n-1
                    var9 = var15;
                    var10 = var14; // here is the error IMO 
                    var11 = var14; // 
                  /* this would be correct
                    var11 = var10; 
                    var10 = var14; 
                   
                     */
                    var8 = true;
                }

                var9 = var5;
                var10 = var6;
                var11 = var7;
                var8 = true;
            }
        }

I am only guessing... but this brought me to idea of swapping order of params :

   fibbtcoinner 0 presum sum  = sum
   fibbtcoinner n presum sum = fibbtcoinner  (n-1) sum (sum + presum)


   fibbtco n = fibbtcoinner n 0 1

and now this code gives correct results also in O2.
Hope this helps.

@rahulmutt

This comment has been minimized.

Show comment
Hide comment
@rahulmutt

rahulmutt Jan 5, 2018

Member

Thanks for taking a look! So swapping the order of params fixes it. We still need to investigate why the original problem happened.

The way to debug stuff like this is:

  1. Check that the optimized intermediate code (in this case, STG) does what's expected. To check this, you can:
  • Add -ddump-stg -ddump-to-file flags when compiling directly with eta

  • Add eta-options: -ddump-stg -ddump-to-file in the executable stanza of your Cabal file if compiling with etlas

    STG code is fairly simple to understand if you ignore all the unnecessary info, think of it as a very simple functional language.

    If you have trouble following it, share in a Gist and I can help you out with the parts that are tricky.

  1. Now if everything looks fine at the STG level, the next thing to look at is what you did above - the generated Java code. We have to make sure that the translation is happening correctly - most bugs come out of a mismatch between STG & the generated code, and most of the time the issue is with the generated code.
Member

rahulmutt commented Jan 5, 2018

Thanks for taking a look! So swapping the order of params fixes it. We still need to investigate why the original problem happened.

The way to debug stuff like this is:

  1. Check that the optimized intermediate code (in this case, STG) does what's expected. To check this, you can:
  • Add -ddump-stg -ddump-to-file flags when compiling directly with eta

  • Add eta-options: -ddump-stg -ddump-to-file in the executable stanza of your Cabal file if compiling with etlas

    STG code is fairly simple to understand if you ignore all the unnecessary info, think of it as a very simple functional language.

    If you have trouble following it, share in a Gist and I can help you out with the parts that are tricky.

  1. Now if everything looks fine at the STG level, the next thing to look at is what you did above - the generated Java code. We have to make sure that the translation is happening correctly - most bugs come out of a mismatch between STG & the generated code, and most of the time the issue is with the generated code.
@jarekratajski

This comment has been minimized.

Show comment
Hide comment
@jarekratajski

jarekratajski Jan 5, 2018

Contributor

Disclaimer: I am total newbie to Haskell and GHC and even compilers. Sorry.
But looking at STG I think it is proper

Fibb.fibbtcoinner
  :: forall a_a7U9 a1_a7UA.
     (GHC.Classes.Eq a_a7U9, GHC.Num.Num a_a7U9, GHC.Num.Num a1_a7UA) =>
     a_a7U9 -> a1_a7UA -> a1_a7UA -> a1_a7UA
[GblId,
 Arity=6,
 Caf=NoCafRefs,
 Str=<S(C(C(S))L),U(C(C1(U)),A)><L,U(A,C(C1(U)),A,A,A,A,C(U))><L,U(C(C1(U)),A,A,A,A,A,A)><L,U><L,U><L,U>,
 Unf=OtherCon []] =
    \r srt:SRT:[] [$dEq_s8MJ
                   $dNum_s8MK
                   $dNum1_s8ML
                   eta_s8MM
                   eta1_s8MN
                   eta2_s8MO]
        let {
          lvl1_s8MP [Occ=OnceL] :: a_a7UF
          [LclId, Str=] =
              \u srt:SRT:[] []
                  GHC.Num.fromInteger $dNum_s8MK Fibb.fibbnaive3; } in
        let {
          lvl2_s8MQ [Occ=OnceL] :: a_a7UF
          [LclId, Str=] =
              \u srt:SRT:[] [] GHC.Num.fromInteger $dNum_s8MK Fibb.fibbnaive1;
        } in 
          let-no-escape {
            fibbtcoinner1_s8MR [Occ=LoopBreaker]
              :: a_a7UF -> a1_a7UG -> a1_a7UG -> a1_a7UG
            [LclId, Arity=3, Str=<L,U><L,U><L,U>, Unf=OtherCon []] =
                sat-only \r srt:SRT:[] [ds_s8MS sum_s8MT presum_s8MU]
                    case GHC.Classes.== $dEq_s8MJ ds_s8MS lvl2_s8MQ of _ [Occ=Dead] {
                      GHC.Types.False ->
                          let {
                            sat_s8MX [Occ=Once] :: a1_a7UG
                            [LclId, Str=] =
                                \u srt:SRT:[] [] GHC.Num.+ $dNum1_s8ML sum_s8MT presum_s8MU; } in
                          let {
                            sat_s8MW [Occ=Once] :: a_a7UF
                            [LclId, Str=] =
                                \u srt:SRT:[] [] GHC.Num.- $dNum_s8MK ds_s8MS lvl1_s8MP;
                          } in  fibbtcoinner1_s8MR sat_s8MW sat_s8MX sum_s8MT;
                      GHC.Types.True -> sum_s8MT;
                    };
          } in  fibbtcoinner1_s8MR eta_s8MM eta1_s8MN eta2_s8MO;

Because the call (loop?) has in fact 2 distinct variables:
} in fibbtcoinner1_s8MR sat_s8MW sat_s8MX sum_s8MT;
Seems that Java produced somehow messes sat_s8MX with sum_s8MT.
But at least here at STG level they seem to be different.

Contributor

jarekratajski commented Jan 5, 2018

Disclaimer: I am total newbie to Haskell and GHC and even compilers. Sorry.
But looking at STG I think it is proper

Fibb.fibbtcoinner
  :: forall a_a7U9 a1_a7UA.
     (GHC.Classes.Eq a_a7U9, GHC.Num.Num a_a7U9, GHC.Num.Num a1_a7UA) =>
     a_a7U9 -> a1_a7UA -> a1_a7UA -> a1_a7UA
[GblId,
 Arity=6,
 Caf=NoCafRefs,
 Str=<S(C(C(S))L),U(C(C1(U)),A)><L,U(A,C(C1(U)),A,A,A,A,C(U))><L,U(C(C1(U)),A,A,A,A,A,A)><L,U><L,U><L,U>,
 Unf=OtherCon []] =
    \r srt:SRT:[] [$dEq_s8MJ
                   $dNum_s8MK
                   $dNum1_s8ML
                   eta_s8MM
                   eta1_s8MN
                   eta2_s8MO]
        let {
          lvl1_s8MP [Occ=OnceL] :: a_a7UF
          [LclId, Str=] =
              \u srt:SRT:[] []
                  GHC.Num.fromInteger $dNum_s8MK Fibb.fibbnaive3; } in
        let {
          lvl2_s8MQ [Occ=OnceL] :: a_a7UF
          [LclId, Str=] =
              \u srt:SRT:[] [] GHC.Num.fromInteger $dNum_s8MK Fibb.fibbnaive1;
        } in 
          let-no-escape {
            fibbtcoinner1_s8MR [Occ=LoopBreaker]
              :: a_a7UF -> a1_a7UG -> a1_a7UG -> a1_a7UG
            [LclId, Arity=3, Str=<L,U><L,U><L,U>, Unf=OtherCon []] =
                sat-only \r srt:SRT:[] [ds_s8MS sum_s8MT presum_s8MU]
                    case GHC.Classes.== $dEq_s8MJ ds_s8MS lvl2_s8MQ of _ [Occ=Dead] {
                      GHC.Types.False ->
                          let {
                            sat_s8MX [Occ=Once] :: a1_a7UG
                            [LclId, Str=] =
                                \u srt:SRT:[] [] GHC.Num.+ $dNum1_s8ML sum_s8MT presum_s8MU; } in
                          let {
                            sat_s8MW [Occ=Once] :: a_a7UF
                            [LclId, Str=] =
                                \u srt:SRT:[] [] GHC.Num.- $dNum_s8MK ds_s8MS lvl1_s8MP;
                          } in  fibbtcoinner1_s8MR sat_s8MW sat_s8MX sum_s8MT;
                      GHC.Types.True -> sum_s8MT;
                    };
          } in  fibbtcoinner1_s8MR eta_s8MM eta1_s8MN eta2_s8MO;

Because the call (loop?) has in fact 2 distinct variables:
} in fibbtcoinner1_s8MR sat_s8MW sat_s8MX sum_s8MT;
Seems that Java produced somehow messes sat_s8MX with sum_s8MT.
But at least here at STG level they seem to be different.

@rahulmutt

This comment has been minimized.

Show comment
Hide comment
@rahulmutt

rahulmutt Jan 5, 2018

Member

Here's the output with -O2 which is less code to look at and makes the problem a lot more clear (it's just as you pointed out):

  public static Closure main_fibbtcoinner(StgContext paramStgContext, Closure paramClosure1, Closure paramClosure2, Closure paramClosure3)
  {
    for (;;)
    {
      Type.eqIntegerzh(paramStgContext, paramClosure1, main6());
      int i = paramStgContext.I1;
      if (i == 1) {
        break;
      }
      Closure localClosure1 = Type.plusInteger(paramStgContext, paramClosure2, paramClosure3);
      Closure localClosure2 = Type.minusInteger(paramStgContext, paramClosure1, main5());
      paramClosure1 = localClosure2;
      paramClosure2 = localClosure1;
      paramClosure3 = paramClosure2;
    }
    return paramClosure2.evaluate(paramStgContext);
  }
Member

rahulmutt commented Jan 5, 2018

Here's the output with -O2 which is less code to look at and makes the problem a lot more clear (it's just as you pointed out):

  public static Closure main_fibbtcoinner(StgContext paramStgContext, Closure paramClosure1, Closure paramClosure2, Closure paramClosure3)
  {
    for (;;)
    {
      Type.eqIntegerzh(paramStgContext, paramClosure1, main6());
      int i = paramStgContext.I1;
      if (i == 1) {
        break;
      }
      Closure localClosure1 = Type.plusInteger(paramStgContext, paramClosure2, paramClosure3);
      Closure localClosure2 = Type.minusInteger(paramStgContext, paramClosure1, main5());
      paramClosure1 = localClosure2;
      paramClosure2 = localClosure1;
      paramClosure3 = paramClosure2;
    }
    return paramClosure2.evaluate(paramStgContext);
  }
@rahulmutt

This comment has been minimized.

Show comment
Hide comment
@rahulmutt

rahulmutt Jan 5, 2018

Member

This problem is extremely subtle which is why we didn't encounter this yet. It is more likely to happen with very small and simple recursive functions which don't transform all their arguments when making a recursive call.

The problem is that last line paramClosure3 = paramClosure2 which corresponds to passing the sum parameter in the recursive call in a different position. The issue is that this needs to be re-ordered. A topological sort must be done to ensure that we don't set the bindings in incorrect order.

It should be done like this:

      paramClosure3 = paramClosure2;
      paramClosure1 = localClosure2;
      paramClosure2 = localClosure1;

or even this:

      paramClosure1 = localClosure2;
      paramClosure3 = paramClosure2;
      paramClosure2 = localClosure1;

You made nice observations for not looking at this stuff before :)

Member

rahulmutt commented Jan 5, 2018

This problem is extremely subtle which is why we didn't encounter this yet. It is more likely to happen with very small and simple recursive functions which don't transform all their arguments when making a recursive call.

The problem is that last line paramClosure3 = paramClosure2 which corresponds to passing the sum parameter in the recursive call in a different position. The issue is that this needs to be re-ordered. A topological sort must be done to ensure that we don't set the bindings in incorrect order.

It should be done like this:

      paramClosure3 = paramClosure2;
      paramClosure1 = localClosure2;
      paramClosure2 = localClosure1;

or even this:

      paramClosure1 = localClosure2;
      paramClosure3 = paramClosure2;
      paramClosure2 = localClosure1;

You made nice observations for not looking at this stuff before :)

@jarekratajski

This comment has been minimized.

Show comment
Hide comment
@jarekratajski

jarekratajski Jan 5, 2018

Contributor

Does it have to do something with?

-- TODO: Verify that this is valid in all cases,
--       otherwise fall back on the strongly connected components
--       algorithm a la GHC
multiAssign :: [CgLoc] -> [Code] -> Code
multiAssign locs codes = fold $ zipWith storeLoc locs codes

in Layout.sh

Contributor

jarekratajski commented Jan 5, 2018

Does it have to do something with?

-- TODO: Verify that this is valid in all cases,
--       otherwise fall back on the strongly connected components
--       algorithm a la GHC
multiAssign :: [CgLoc] -> [Code] -> Code
multiAssign locs codes = fold $ zipWith storeLoc locs codes

in Layout.sh

@rahulmutt

This comment has been minimized.

Show comment
Hide comment
@rahulmutt

rahulmutt Jan 5, 2018

Member

Yes exactly that :) It turns out that #478 has the same problem so that will also be fixed if we fix this.

The tricky part here is that we can't really inspect inside a Code data type since it just has a monadic action that generates it when supplied with a constant pool and some other parameters. This lack of introspection actually cripples our ability to do bytecode optimization, so we really need to create an API to work with the direct bytecode AST instead.

We'll need to expose a simple analysis in codec-jvm that runs the Instr monad and inspects the resulting ByteString for the *load instructions and returns a [Int] - all the slots that were loaded from in the code. Would you be interested in giving this a try?

Member

rahulmutt commented Jan 5, 2018

Yes exactly that :) It turns out that #478 has the same problem so that will also be fixed if we fix this.

The tricky part here is that we can't really inspect inside a Code data type since it just has a monadic action that generates it when supplied with a constant pool and some other parameters. This lack of introspection actually cripples our ability to do bytecode optimization, so we really need to create an API to work with the direct bytecode AST instead.

We'll need to expose a simple analysis in codec-jvm that runs the Instr monad and inspects the resulting ByteString for the *load instructions and returns a [Int] - all the slots that were loaded from in the code. Would you be interested in giving this a try?

@rahulmutt

This comment has been minimized.

Show comment
Hide comment
@rahulmutt

rahulmutt Jan 5, 2018

Member

For reference, this is GHC's code to handle the problem:
https://github.com/ghc/ghc/blob/master/compiler/codeGen/StgCmmUtils.hs#L375-L444

Member

rahulmutt commented Jan 5, 2018

For reference, this is GHC's code to handle the problem:
https://github.com/ghc/ghc/blob/master/compiler/codeGen/StgCmmUtils.hs#L375-L444

@jarekratajski

This comment has been minimized.

Show comment
Hide comment
@jarekratajski

jarekratajski Jan 5, 2018

Contributor

I understand the problem and more or less get the code. I will try - but as for now I am too weak to really solve it. So please do not count on me too much.

Contributor

jarekratajski commented Jan 5, 2018

I understand the problem and more or less get the code. I will try - but as for now I am too weak to really solve it. So please do not count on me too much.

@rahulmutt

This comment has been minimized.

Show comment
Hide comment
@rahulmutt

rahulmutt Jan 5, 2018

Member

Sure no worries. Give it a try and share any attempts here.

Member

rahulmutt commented Jan 5, 2018

Sure no worries. Give it a try and share any attempts here.

@rahulmutt

This comment has been minimized.

Show comment
Hide comment
@rahulmutt

rahulmutt Jan 6, 2018

Member

@jarekratajski I was taking a look at multiAssign usages and it turns out that both code locations allow you to access CgLoc (Code-generation locations) where they are currently sending in Code. CgLoc's you can at least pattern match on so it would be easier to construct the topological sort.

Member

rahulmutt commented Jan 6, 2018

@jarekratajski I was taking a look at multiAssign usages and it turns out that both code locations allow you to access CgLoc (Code-generation locations) where they are currently sending in Code. CgLoc's you can at least pattern match on so it would be easier to construct the topological sort.

jarekratajski added a commit to jarekratajski/eta that referenced this issue Jan 7, 2018

@jarekratajski

This comment has been minimized.

Show comment
Hide comment
@jarekratajski

jarekratajski Jan 7, 2018

Contributor

I did small trial where I fix only one call of multiAssign and it seems to work.
There are some dirty places there.
The fix also does not solve the problem of potential Cycles. I've created another sample with cycle and it is still broken. I will continue working on this.

Contributor

jarekratajski commented Jan 7, 2018

I did small trial where I fix only one call of multiAssign and it seems to work.
There are some dirty places there.
The fix also does not solve the problem of potential Cycles. I've created another sample with cycle and it is still broken. I will continue working on this.

@rahulmutt

This comment has been minimized.

Show comment
Hide comment
@rahulmutt

rahulmutt Jan 7, 2018

Member

Ok great. I gave some feedback on your work so far - great start! Please put your test cases in tests/basic/multiAssign/. We don't run those tests yet, but we will eventually and it'll be great to keep track of them.

Member

rahulmutt commented Jan 7, 2018

Ok great. I gave some feedback on your work so far - great start! Please put your test cases in tests/basic/multiAssign/. We don't run those tests yet, but we will eventually and it'll be great to keep track of them.

@rahulmutt

This comment has been minimized.

Show comment
Hide comment
@rahulmutt

rahulmutt Jan 9, 2018

Member

@jarekratajski Do you have a timeline for the fix for this? Ideally, I'd like to see a fix by the end of the week since this is a pretty important bug. If your current schedule does not allow for that, please create a PR with the work you've done so far and I'll finish up the remaining bits.

Member

rahulmutt commented Jan 9, 2018

@jarekratajski Do you have a timeline for the fix for this? Ideally, I'd like to see a fix by the end of the week since this is a pretty important bug. If your current schedule does not allow for that, please create a PR with the work you've done so far and I'll finish up the remaining bits.

@jarekratajski

This comment has been minimized.

Show comment
Hide comment
@jarekratajski

jarekratajski Jan 9, 2018

Contributor

@rahulmutt I will have some time in next 48 hours for that - so hopefully I can do it or at least push something better.

Contributor

jarekratajski commented Jan 9, 2018

@rahulmutt I will have some time in next 48 hours for that - so hopefully I can do it or at least push something better.

@rahulmutt

This comment has been minimized.

Show comment
Hide comment
@rahulmutt

rahulmutt Jan 9, 2018

Member

@jarekratajski Great, thanks for the update.

Member

rahulmutt commented Jan 9, 2018

@jarekratajski Great, thanks for the update.

jarekratajski added a commit to jarekratajski/eta that referenced this issue Jan 10, 2018

@jarekratajski

This comment has been minimized.

Show comment
Hide comment
@jarekratajski

jarekratajski Jan 10, 2018

Contributor

@rahulmutt I need some help.
I've tried to port 1 by 1 iines from GHC with Digraph
and it does not build packages anymore.

And I do not know how to "debug" it efficiently.

I suspect I've mixed somehow "stores" with "loads" in Layout.hs unscramble
or I did not get how CodeGen monad works
Line 161 - 162 in Expr.hs ... are not codes generated by emitMultiAssign lost?

If I only knew how to print debugs during build process sensibly maybe I would find out.

Contributor

jarekratajski commented Jan 10, 2018

@rahulmutt I need some help.
I've tried to port 1 by 1 iines from GHC with Digraph
and it does not build packages anymore.

And I do not know how to "debug" it efficiently.

I suspect I've mixed somehow "stores" with "loads" in Layout.hs unscramble
or I did not get how CodeGen monad works
Line 161 - 162 in Expr.hs ... are not codes generated by emitMultiAssign lost?

If I only knew how to print debugs during build process sensibly maybe I would find out.

@rahulmutt

This comment has been minimized.

Show comment
Hide comment
@rahulmutt

rahulmutt Jan 10, 2018

Member

@jarekratajski Can you show me a diff of your changes? I can take a look at where you are and see if I can help you forward.

Are you getting compile-time errors or run-time errors? I can help out with debugging methods if you're specific on what kind of info you need.

Member

rahulmutt commented Jan 10, 2018

@jarekratajski Can you show me a diff of your changes? I can take a look at where you are and see if I can help you forward.

Are you getting compile-time errors or run-time errors? I can help out with debugging methods if you're specific on what kind of info you need.

@jarekratajski

This comment has been minimized.

Show comment
Hide comment
@jarekratajski

jarekratajski Jan 10, 2018

Contributor

moment... maybe I did very stupid thing and compiler even says me what
code is as ion jarekratajski@3ac051d
basically I was preparing PR there

Contributor

jarekratajski commented Jan 10, 2018

moment... maybe I did very stupid thing and compiler even says me what
code is as ion jarekratajski@3ac051d
basically I was preparing PR there

@rahulmutt

This comment has been minimized.

Show comment
Hide comment
@rahulmutt

rahulmutt Jan 10, 2018

Member

I noticed that you changed the version number from 0.7.0 to 0.0.9 - that may be causing problems. The build relies on the fact that all the version numbers are consistent. What is happening when you run ./cleaninstall.sh?

Member

rahulmutt commented Jan 10, 2018

I noticed that you changed the version number from 0.7.0 to 0.0.9 - that may be causing problems. The build relies on the fact that all the version numbers are consistent. What is happening when you run ./cleaninstall.sh?

jarekratajski added a commit to jarekratajski/eta that referenced this issue Jan 10, 2018

@jarekratajski

This comment has been minimized.

Show comment
Hide comment
@jarekratajski

jarekratajski Jan 10, 2018

Contributor

What I need is to compile eta - without recompiling base packages.
I've "screweed" something but I need to test it on a single small file.
Otherwise I wil not find out.

I did a stupid pull request to easily track changes.

Contributor

jarekratajski commented Jan 10, 2018

What I need is to compile eta - without recompiling base packages.
I've "screweed" something but I need to test it on a single small file.
Otherwise I wil not find out.

I did a stupid pull request to easily track changes.

@jarekratajski

This comment has been minimized.

Show comment
Hide comment
@jarekratajski

jarekratajski Jan 10, 2018

Contributor

Version number was changed before - that is now fixed.

Contributor

jarekratajski commented Jan 10, 2018

Version number was changed before - that is now fixed.

@jarekratajski

This comment has been minimized.

Show comment
Hide comment
@jarekratajski

jarekratajski Jan 10, 2018

Contributor

cleaninstall.sh results in


uilding library for ghc-prim-0.4.0.0..
[1 of 5] Compiling GHC.Types        ( GHC/Types.hs, dist/build/GHC/Types.jar )
[2 of 5] Compiling GHC.Tuple        ( GHC/Tuple.hs, dist/build/GHC/Tuple.jar )
[3 of 5] Compiling GHC.CString      ( GHC/CString.hs, dist/build/GHC/CString.jar )
eta: panic! (the 'impossible' happened)
  (Eta version 0.7.0b1):
        Prelude.head: empty list

Must be that I somehow do not emit correct assigns now.

Contributor

jarekratajski commented Jan 10, 2018

cleaninstall.sh results in


uilding library for ghc-prim-0.4.0.0..
[1 of 5] Compiling GHC.Types        ( GHC/Types.hs, dist/build/GHC/Types.jar )
[2 of 5] Compiling GHC.Tuple        ( GHC/Tuple.hs, dist/build/GHC/Tuple.jar )
[3 of 5] Compiling GHC.CString      ( GHC/CString.hs, dist/build/GHC/CString.jar )
eta: panic! (the 'impossible' happened)
  (Eta version 0.7.0b1):
        Prelude.head: empty list

Must be that I somehow do not emit correct assigns now.

jarekratajski added a commit to jarekratajski/eta that referenced this issue Jan 10, 2018

@jarekratajski

This comment has been minimized.

Show comment
Hide comment
@jarekratajski

jarekratajski Jan 10, 2018

Contributor

Or after last change in Layout.hs I get now.
[4 of 5] Compiling GHC.Magic ( GHC/Magic.hs, dist/build/GHC/Magic.jar )
[5 of 5] Compiling GHC.Classes ( GHC/Classes.hs, dist/build/GHC/Classes.jar )
eta: panic! (the 'impossible' happened)
(Eta version 0.7.0b1):
getCgIdInfo: no module eta_B1

which probably happens here:
getGetDepCgLoad (NonVoid (StgVarArg var)) = Right <$> cgLocation <$> getCgIdInfo var

Contributor

jarekratajski commented Jan 10, 2018

Or after last change in Layout.hs I get now.
[4 of 5] Compiling GHC.Magic ( GHC/Magic.hs, dist/build/GHC/Magic.jar )
[5 of 5] Compiling GHC.Classes ( GHC/Classes.hs, dist/build/GHC/Classes.jar )
eta: panic! (the 'impossible' happened)
(Eta version 0.7.0b1):
getCgIdInfo: no module eta_B1

which probably happens here:
getGetDepCgLoad (NonVoid (StgVarArg var)) = Right <$> cgLocation <$> getCgIdInfo var

@rahulmutt

This comment has been minimized.

Show comment
Hide comment
@rahulmutt

rahulmutt Jan 10, 2018

Member

@jarekratajski A nice way to debug codegen stuff is to add -ddump-cg-trace into the project you want to debug. In this case it's ghc-prim - so go to libraries/ghc-prim/ghc-prim.cabal and add -ddump-cg-trace -ddump-to-file to the ghc-options: field. In the trace, you can see which precise part of the codegen choked on your changes and where eta_B1 comes from.

Member

rahulmutt commented Jan 10, 2018

@jarekratajski A nice way to debug codegen stuff is to add -ddump-cg-trace into the project you want to debug. In this case it's ghc-prim - so go to libraries/ghc-prim/ghc-prim.cabal and add -ddump-cg-trace -ddump-to-file to the ghc-options: field. In the trace, you can see which precise part of the codegen choked on your changes and where eta_B1 comes from.

@rahulmutt

This comment has been minimized.

Show comment
Hide comment
@rahulmutt

rahulmutt Jan 10, 2018

Member

That issue typically happens when you try to load an Id which hasn't been inserted into the bindings yet in the CodeGen monad. Seems like an ordering problem.

Member

rahulmutt commented Jan 10, 2018

That issue typically happens when you try to load an Id which hasn't been inserted into the bindings yet in the CodeGen monad. Seems like an ordering problem.

@jarekratajski

This comment has been minimized.

Show comment
Hide comment
@jarekratajski

jarekratajski Jan 10, 2018

Contributor

Thanks.
I've changed it
ghc-options: -ddump-cg-trace -ddump-to-file -this-unit-id ghc-prim

but still do not see any traceCG outputs. even in the created file

jarek@ubuntu:~/.etlas/logs/eta-0.7.0.1$ cat ghc-prim-0.4.0.0-Jhi6UgHuZdoBZWUpVo3WKE.log 
etlas: Entering directory '.'
Configuring ghc-prim-0.4.0.0...
Preprocessing library for ghc-prim-0.4.0.0..
Building library for ghc-prim-0.4.0.0..
[1 of 5] Compiling GHC.Types        ( GHC/Types.hs, dist/build/GHC/Types.jar )
[2 of 5] Compiling GHC.Tuple        ( GHC/Tuple.hs, dist/build/GHC/Tuple.jar )
[3 of 5] Compiling GHC.CString      ( GHC/CString.hs, dist/build/GHC/CString.jar )
[4 of 5] Compiling GHC.Magic        ( GHC/Magic.hs, dist/build/GHC/Magic.jar )
[5 of 5] Compiling GHC.Classes      ( GHC/Classes.hs, dist/build/GHC/Classes.jar )
eta: panic! (the 'impossible' happened)
  (Eta version 0.7.0b1):
	getCgIdInfo: no module eta_B1

Please report this as a Eta bug: http://github.com/typelead/eta/issues

etlas: Leaving directory '.'
Contributor

jarekratajski commented Jan 10, 2018

Thanks.
I've changed it
ghc-options: -ddump-cg-trace -ddump-to-file -this-unit-id ghc-prim

but still do not see any traceCG outputs. even in the created file

jarek@ubuntu:~/.etlas/logs/eta-0.7.0.1$ cat ghc-prim-0.4.0.0-Jhi6UgHuZdoBZWUpVo3WKE.log 
etlas: Entering directory '.'
Configuring ghc-prim-0.4.0.0...
Preprocessing library for ghc-prim-0.4.0.0..
Building library for ghc-prim-0.4.0.0..
[1 of 5] Compiling GHC.Types        ( GHC/Types.hs, dist/build/GHC/Types.jar )
[2 of 5] Compiling GHC.Tuple        ( GHC/Tuple.hs, dist/build/GHC/Tuple.jar )
[3 of 5] Compiling GHC.CString      ( GHC/CString.hs, dist/build/GHC/CString.jar )
[4 of 5] Compiling GHC.Magic        ( GHC/Magic.hs, dist/build/GHC/Magic.jar )
[5 of 5] Compiling GHC.Classes      ( GHC/Classes.hs, dist/build/GHC/Classes.jar )
eta: panic! (the 'impossible' happened)
  (Eta version 0.7.0b1):
	getCgIdInfo: no module eta_B1

Please report this as a Eta bug: http://github.com/typelead/eta/issues

etlas: Leaving directory '.'
@rahulmutt

This comment has been minimized.

Show comment
Hide comment
@rahulmutt

rahulmutt Jan 10, 2018

Member

I changed those exact same options and I was able to find it here:

libraries/ghc-prim/dist/build/GHC/CString.dump-cg-trace (for example)

Member

rahulmutt commented Jan 10, 2018

I changed those exact same options and I was able to find it here:

libraries/ghc-prim/dist/build/GHC/CString.dump-cg-trace (for example)

@jarekratajski

This comment has been minimized.

Show comment
Hide comment
@jarekratajski

jarekratajski Jan 10, 2018

Contributor

Thanks. That was it. Sorry, had no idea which file to look at.
I see as last lines of
Classes.dump-cg-trace

StgOpApp tagToEnum# [sat_sZF7] Bool
generating  $wccall
creating new closure...$wccall
StgOpApp __pkg_ccall_value ghc-prim False|False|2,True,False,False,"java/lang/Comparable","compareTo","(Ljava/lang/Object;)I" [sZFA] [ds_sZF8,
                                                                                                                                      ds1_sZF9,
                                                                                                                                      eta_B1] (# State#
                                                                                                                                                   RealWorld,
                                                                                                                                                 Int# #)
StgApp sat_sZFB [eta_B1]
cgIdApp: JumpToIt

so it must fail on
deps <- dependencies args
and then probably on this line

getGetDepCgLoad (NonVoid (StgVarArg var)) = Right <$> cgLocation <$> getCgIdInfo var

where getCgIdInfo fails

Contributor

jarekratajski commented Jan 10, 2018

Thanks. That was it. Sorry, had no idea which file to look at.
I see as last lines of
Classes.dump-cg-trace

StgOpApp tagToEnum# [sat_sZF7] Bool
generating  $wccall
creating new closure...$wccall
StgOpApp __pkg_ccall_value ghc-prim False|False|2,True,False,False,"java/lang/Comparable","compareTo","(Ljava/lang/Object;)I" [sZFA] [ds_sZF8,
                                                                                                                                      ds1_sZF9,
                                                                                                                                      eta_B1] (# State#
                                                                                                                                                   RealWorld,
                                                                                                                                                 Int# #)
StgApp sat_sZFB [eta_B1]
cgIdApp: JumpToIt

so it must fail on
deps <- dependencies args
and then probably on this line

getGetDepCgLoad (NonVoid (StgVarArg var)) = Right <$> cgLocation <$> getCgIdInfo var

where getCgIdInfo fails

@jarekratajski

This comment has been minimized.

Show comment
Hide comment
@jarekratajski

jarekratajski Jan 10, 2018

Contributor

After addeing some debug

JumpToIt label cgLocs mLne -> do
      traceCg (str "cgIdApp: JumpToIt")
      --  codes <- getNonVoidArgCodes args
      printBindings
      traceCg (ppr args)
      deps <- dependencies args
      traceCg (str "JAREK: deps calculated")
      emitMultiAssign cgLocs deps
      emit $ maybe  mempty
                                              (\(target, targetLoc) ->
                                               storeLoc targetLoc (iconst (locFt targetLoc) $ fromIntegral target))
                                               mLne
                  <>   goto label

I get the file as in attachment:
Classes.dump-cg-trace.txt

I do not know why getCgIdInfo var does not work even though I've copied more or less code from the previous getNonVoidArgCodes function.
(Very likely it is something obvious and stupid that I do not see. My experience with haskell is mostly trying to solve this bug + doing exercises with quicksort and fibonacci ... not that great).

Contributor

jarekratajski commented Jan 10, 2018

After addeing some debug

JumpToIt label cgLocs mLne -> do
      traceCg (str "cgIdApp: JumpToIt")
      --  codes <- getNonVoidArgCodes args
      printBindings
      traceCg (ppr args)
      deps <- dependencies args
      traceCg (str "JAREK: deps calculated")
      emitMultiAssign cgLocs deps
      emit $ maybe  mempty
                                              (\(target, targetLoc) ->
                                               storeLoc targetLoc (iconst (locFt targetLoc) $ fromIntegral target))
                                               mLne
                  <>   goto label

I get the file as in attachment:
Classes.dump-cg-trace.txt

I do not know why getCgIdInfo var does not work even though I've copied more or less code from the previous getNonVoidArgCodes function.
(Very likely it is something obvious and stupid that I do not see. My experience with haskell is mostly trying to solve this bug + doing exercises with quicksort and fibonacci ... not that great).

@jarekratajski

This comment has been minimized.

Show comment
Hide comment
@jarekratajski

jarekratajski Jan 10, 2018

Contributor

And I see.
I have not fully copied logic from getNonVoidArgCodes .. cause I forgot about nonvoid check.

Contributor

jarekratajski commented Jan 10, 2018

And I see.
I have not fully copied logic from getNonVoidArgCodes .. cause I forgot about nonvoid check.

jarekratajski added a commit to jarekratajski/eta that referenced this issue Jan 10, 2018

jarekratajski added a commit to jarekratajski/eta that referenced this issue Jan 10, 2018

jarekratajski added a commit to jarekratajski/eta that referenced this issue Jan 10, 2018

@jarekratajski

This comment has been minimized.

Show comment
Hide comment
@jarekratajski

jarekratajski Jan 10, 2018

Contributor

PR is is ready for review.
I've added test cases that work. Cycles also seem to work.

One strange thing is I had to actually reverse results of "topological sort" from Digraph
see Line 83 from Layout.hs - but with this reverse all makes sense.

For reference commit #2582101 contains lot of debugs that can be used (show how graph looks like).

@rahulmutt Thank You for all the help!
That was really nice way for me to learn Haskell (and do sth else than boring JavaEE/Tomcat stuff :-) I do at work ).
I will have some time to fix polish this code if you have further hints. (Sorry for style and naming - probably I am doing Javaskell a little).

Contributor

jarekratajski commented Jan 10, 2018

PR is is ready for review.
I've added test cases that work. Cycles also seem to work.

One strange thing is I had to actually reverse results of "topological sort" from Digraph
see Line 83 from Layout.hs - but with this reverse all makes sense.

For reference commit #2582101 contains lot of debugs that can be used (show how graph looks like).

@rahulmutt Thank You for all the help!
That was really nice way for me to learn Haskell (and do sth else than boring JavaEE/Tomcat stuff :-) I do at work ).
I will have some time to fix polish this code if you have further hints. (Sorry for style and naming - probably I am doing Javaskell a little).

jarekratajski added a commit to jarekratajski/eta that referenced this issue Jan 10, 2018

@rahulmutt

This comment has been minimized.

Show comment
Hide comment
@rahulmutt

rahulmutt Jan 11, 2018

Member

Don't worry about styling/naming - I'll do it later in a cleanup commit. You got the core fix working which is enough :) I'll merge it now.

Member

rahulmutt commented Jan 11, 2018

Don't worry about styling/naming - I'll do it later in a cleanup commit. You got the core fix working which is enough :) I'll merge it now.

rahulmutt added a commit that referenced this issue Jan 11, 2018

rahulmutt added a commit that referenced this issue Jan 11, 2018

rahulmutt added a commit that referenced this issue Jan 11, 2018

rahulmutt added a commit that referenced this issue Jan 11, 2018

rahulmutt added a commit that referenced this issue Jan 11, 2018

@rahulmutt

This comment has been minimized.

Show comment
Hide comment
@rahulmutt

rahulmutt Jan 11, 2018

Member

Verified the master's fix is working. Closing. Thanks again @jarekratajski.

Member

rahulmutt commented Jan 11, 2018

Verified the master's fix is working. Closing. Thanks again @jarekratajski.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment