Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Test] Dump UPLC for 'strictLetRec' #5963

Merged
merged 3 commits into from
May 30, 2024

Conversation

effectfully
Copy link
Contributor

As requested.

As a non-cylon I also made it Readable.

@effectfully effectfully force-pushed the effectfully/test/dump-uplc-for-strictLetRec branch from a4bfde3 to 0faf38e Compare May 7, 2024 00:53
@effectfully effectfully added the No Changelog Required Add this to skip the Changelog Check label May 7, 2024
@effectfully effectfully requested review from Unisay and removed request for Unisay May 7, 2024 00:53
@effectfully effectfully force-pushed the effectfully/test/dump-uplc-for-strictLetRec branch from 0faf38e to f8967ac Compare May 7, 2024 00:55
@effectfully effectfully requested a review from Unisay May 7, 2024 00:55
@Unisay
Copy link
Contributor

Unisay commented May 7, 2024

I definitely see how this PR improves the golden file readability.

Unfortunately it sacrifices something else (something that I personally wouldn't sacrifice) in order to achieve it:
it expands the scope of the PIR test to also include TPLC, UPLC compilation steps thus sacrificing my ability to reason about PIR transformation in isolation from effects (changes) contributed by the PLC compilation.

Let me explain it pseudographically.
Before:

... ----> Original PIR -----------------------> Transformed PIR 
          |   a mismatch with the golden file shows        |
          |   that something is wrong in this scope        |
          |________________________________________________|

After:

... ----> Original PIR -------> Transformed PIR -------> TPLC ---------> UPLC
          |   a mismatch with the golden file shows                       |
          |   that something is wrong in this scope                       |
          |_______________________________________________________________|

The first diagram depicts something I can call a unit test, as it tests a single transformation step, unit.
The second diagram depicts something integration test-like, as several compilation steps are tested in integration.

The https://martinfowler.com/bliki/TestPyramid.html explains that we should prefer unit tests

image

you should prioritize fast unit tests above other slower types of testing

from here.

Copy link
Contributor

@Unisay Unisay left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd would try to improve readability of PIR without sacrificing the test scope (isolated PIR transformation vs. integrated with TPLC + UPLC)

@effectfully
Copy link
Contributor Author

Unfortunately it sacrifices something else (something that I personally wouldn't sacrifice) in order to achieve it:
it expands the scope of the PIR test to also include TPLC, UPLC compilation steps thus sacrificing my ability to reason about PIR transformation in isolation from effects (changes) contributed by the PLC compilation.

I agree with Ziyang's point: nobody's gonna reason about the 150-lines golden file. It's very common that people just gloss over changes in golden files because we're so accustomed to seeing hundreds and hundreds of them, and I'm no exception here. Next time you see a change in a golden file that is larger than a few lines, try to reason about it, I can almost guarantee you that it won't go anywhere, because it's just so damn complicated.

Otherwise I agree with your point that testing the smallest possible thing is ideal. But here we don't really care that the produced example is of the very specific shape that we see in the PIR file. In particular, we could get rid of the CPS there, that would be a completely unreadable change and it wouldn't affect anything that we've done to fix the overly lazy fix being generated.

Anyways, I can't care less whether we have PIR or UPLC or whatever here, I just wanna have a test that the original program evaluates properly, everything else I could do manually if the test starts failing. So I'll just dump both PIR and UPLC to satisfy both Ziyang and you.

I guess I'll do that after my refactoring of the TestNested machinery, otherwise I'm gonna be stepping on my own toes too much, so I'm converting this PR to a draft.

Copy link
Member

@zliu41 zliu41 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

Unfortunately it sacrifices something else (something that I personally wouldn't sacrifice) in order to achieve it:
it expands the scope of the PIR test to also include TPLC, UPLC compilation steps

There are exceptions to every general guideline, and this is a pretty good example where an exception should be made. From the golden UPLC, it is quite easy to tell whether "hello" is printed, and why. From the golden PIR, it is almost impossible, thus defeating the purpose.

@Unisay
Copy link
Contributor

Unisay commented May 14, 2024

From the golden UPLC, it is quite easy to tell whether "hello" is printed, and why. From the golden PIR, it is almost impossible, thus defeating the purpose.

If that's all we care about (if "hello" is printed or not) then I'd say that golden test is a wrong tool for the job: a better tool would be a test that collects printed output and verifies that it contains a sub-string "hello", never requiring a human to eyeball IR or UPLC.

My understanding that a question answered by this test is "what IR generated by the (`compileLetsPassSC` RecTerms) pass?".

@zliu41
Copy link
Member

zliu41 commented May 14, 2024

a better tool would be a test that collects printed output and verifies that it contains a sub-string "hello"

We have that already, and it is useful to have the golden files to look at in case the test fails. It's the same with the budget tests: we not only report the budget numbers, but also have the golden PIR and golden UPLC, so that if the budget of a script goes up or down, we can see why by looking at the golden files.

My understanding that a question answered by this test is "what IR generated by the (compileLetsPassSC RecTerms) pass?".

On the one hand it's a useful question to answer. but on the other hand we also need to be careful not to train people to ignore test results. There's a reason why we have much more golden PIR and golden UPLC than golden TPLC: TPLC is too hard to read. By adding it to all the tests, people will just be trained to ignore them.

@Unisay
Copy link
Contributor

Unisay commented May 15, 2024

a better tool would be a test that collects printed output and verifies that it contains a sub-string "hello"

We have that already, and it is useful to have the golden files to look at in case the test fails. It's the same with the budget tests: we not only report the budget numbers, but also have the golden PIR and golden UPLC, so that if the budget of a script goes up or down, we can see why by looking at the golden files.

Okay, so according to this, the question being answered is "What went wrong?" in case if unit test fails.
This is, indeed, useful. It reminds me the annotate[Show] functions in Hedgehog (and I very often used them).
While golden files can help to answer such question I am not really sure that's the best tool for the job ("What went wrong?") because it does more than needed: it comes with and expectation and is also evaluated every time not only upon a failure.

On the one hand it's a useful question to answer. but on the other hand we also need to be careful not to train people to ignore test results. There's a reason why we have much more golden PIR and golden UPLC than golden TPLC: TPLC is too hard to read. By adding it to all the tests, people will just be trained to ignore them.

If something went wrong with the IR.compileLetsPassSC and I have a UPLC output in the golden file then I'd ignore UPLC output and produce IR output just to be sure that UPLC compilation didn't obscure the actual result 🤷🏼‍♂️ At this point I don't care as much about readability, as I care about the isolation from unrelated effects.

…to effectfully/test/dump-uplc-for-strictLetRec
@effectfully
Copy link
Contributor Author

If something went wrong with the IR.compileLetsPassSC and I have a UPLC output in the golden file then I'd ignore UPLC output and produce IR output just to be sure that UPLC compilation didn't obscure the actual result 🤷🏼‍♂️ At this point I don't care as much about readability, as I care about the isolation from unrelated effects.

PIR is not only noisy and unwieldy, it's also vulnerable to changes in the compiler. You're not isolating a single thing with that golden test, there's the whole compilation of recursive functions monstrocity going on as well and that can be subject to unrelated changes (like e.g. changes in how types are compiled etc).

produce IR output just to be sure that UPLC compilation didn't obscure the actual result

The point is that you're going to have a very hard time establishing the equivalence (or lack thereof) between the original and the updated PIR code. And given that a test failure in that case may not necessarily be indicative of any misbehavior, we shouldn't introduce the possibility of getting false positives when figuring out whether the positive is false is that complicated.

So overall I agree with Ziyang more than I agree with you (I do think you have a point), so I'm going to pretend that Ziyang decided we have to have UPLC and make that golden test UPLC without adding the unreadable PIR alongside. I don't really think we even need to have that golden test at all, but I think having it is better than not having it, so let's make it the minimal readable UPLC and be done with bikeshedding.

@effectfully effectfully merged commit cd20770 into master May 30, 2024
8 checks passed
@effectfully effectfully deleted the effectfully/test/dump-uplc-for-strictLetRec branch May 30, 2024 19:43
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
No Changelog Required Add this to skip the Changelog Check Plutus IR Test
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants