Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Adding --proof-hint flag to PyK KRun #4532

Open
wants to merge 9 commits into
base: develop
Choose a base branch
from

Conversation

Robertorosmaninho
Copy link
Contributor

This PR attempts to add the --proof-hint krun's flag to PyK infrastructure so we can use it in semantics that currently relies on PyK for building itself and execute programs.

@Robertorosmaninho Robertorosmaninho added feature pyk Issues transferred from runtimeverification/pyk labels Jul 18, 2024
@Robertorosmaninho Robertorosmaninho self-assigned this Jul 18, 2024
@rv-jenkins rv-jenkins changed the base branch from master to develop July 18, 2024 23:30
)

if proof_hint:
# Print the binary proof hint to stdout regardless of the output option
write(1, result.stdout)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure what the intent of the code here is; it seems confusing that the proof hint mode will unconditionally escape the Python framework and dump directly to stdout.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree, it would be better to expose this operation mode as a separate function.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is the actual behavior of the flag in the krun. kprint can't handle this kind of binary, so the idea is just to dump it in the console as krun does. The alternative, from my point of view, would be to Dump into a file but escape the Python framework anyway.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For instance, krun only cat .../result.kore when --prof-hint is passed as opposite to call kore-print ... result.kore when this flag isn't in use.

So I believe this mimics the same behavior we already have.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The alternative, from my point of view, would be to Dump into a file but escape the Python framework anyway.

The point is that we manage stdout and stderr through the CompletedProcess object. If the hint has been captured by the subprocess library, then the way to get it to the calling app's stdout is not to bypass that management entirely; you should probably instead have the calling code write it out to a file yourself.

pyk/src/pyk/ktool/krun.py Outdated Show resolved Hide resolved
)

if proof_hint:
# Print the binary proof hint to stdout regardless of the output option
write(1, result.stdout)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree, it would be better to expose this operation mode as a separate function.

@@ -115,6 +118,7 @@ def run(
pipe_stderr: bool = True,
bug_report: BugReport | None = None,
debugger: bool = False,
proof_hint: bool = False,
Copy link
Contributor

@tothtamas28 tothtamas28 Jul 19, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unfortunately, this approach won't work: under the hood it leads to a call like

subprocess.run(['krun', 'test.imp', '--proof-hint'], text=True, stdout=subprocess.PIPE)

which will raise an exception:

UnicodeDecodeError: 'utf-8' codec can't decode byte 0xff in position 8: invalid start byte

This is because

  • stdout=subprocess.PIPE, so CompletedProcess.stdout will be populated.
  • text=True, so the function will try to decode the bytes output to produce CompletedProcess.stdout.

You drop any of these arguments and it'll work.

Please add a test to reproduce this issue.

Workarounds:

  1. Reuse _build_arg_list and call subprocess.run directly.
  2. If we know how the bytes output is supposed to be decoded, we can thread argument encoding through to subprocess.run (standard encodings).
  3. You can pass pipe_stdout=False to run_process, so that you have stdout=None in subprocess.run.
  4. We can patch run_process (and it's newer version, run_process_2 ) to accept text=False, but that's not straightforward.

I'd prefer a solution that won't prevent us from dropping run_process in favor of run_process_2 in the future, i.e. (1) or (2).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You're right! I dropped text=True locally and forgot to commit this.

Copy link
Contributor Author

@Robertorosmaninho Robertorosmaninho Jul 19, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

About your workarounds:

  1. Isn't that just a different way to create a third run_process?
  2. I don't think we can do this as the encoding was created by us on the llvm backend serializer; we really need to output the binary itself, not its text representation/deserialization.
  3. That's a good approach; we can have pipe_stdout = not prof_hint.
  4. Indeed, that's not trivial and isn't worth spending time on.

If you don't mind, I would prefer to stick to option 3 to unblock Pi2 folks, and when we need to drop run_process in favor of run_process_2, we can sit together again to brainstorm a way to make it compatible with proof hints.

Robertorosmaninho and others added 2 commits July 19, 2024 07:57
Co-authored-by: Tamás Tóth <tothtamas28@users.noreply.github.com>
Baltoli
Baltoli previously approved these changes Jul 19, 2024
Copy link
Collaborator

@Baltoli Baltoli left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good to me post revision but let @tothtamas28 review before you merge!

Copy link
Contributor

@tothtamas28 tothtamas28 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've left a few comments for the current approach (i.e. (3) from the list). However, I personally think (1) is a better solution, which you can implement as follows:

def run_proof_hint(...) -> bytes:
    args = _build_arg_list(...)
    ...
    proc_res = subprocess.run(args, text=False, stdout=subprocess.PIPE, ...)
    return proc_res.stdout

This is easy to implement and test, gives you a better interface, and does not prevent us from dropping run_process later.

pyk/src/pyk/ktool/krun.py Outdated Show resolved Hide resolved
pyk/src/pyk/ktool/krun.py Outdated Show resolved Hide resolved
@@ -23,6 +23,7 @@
('search_final', False),
('no_pattern', False),
('debugger', False),
('proof_hint', False),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please add an integration test for checking the proof hints output for some execution.

@Robertorosmaninho
Copy link
Contributor Author

I've left a few comments for the current approach (i.e. (3) from the list). However, I personally think (1) is a better solution, which you can implement as follows:

def run_proof_hint(...) -> bytes:
    args = _build_arg_list(...)
    ...
    proc_res = subprocess.run(args, text=False, stdout=subprocess.PIPE, ...)
    return proc_res.stdout

This is easy to implement and test, gives you a better interface, and does not prevent us from dropping run_process later.

The problem with that approach is that we have to have almost all logic duplicated just to get a different way to output the program instead of only modifying the function that outputs the result itself. I can understand that this makes things more coupled right now for a possible future refactoring, but I'm sure that in future refactoring, we'll not need to have hundreds of code duplicated.

@tothtamas28
Copy link
Contributor

The problem with that approach is that we have to have almost all logic duplicated

I don't think there's a lot to duplicate honestly, the interesting part is _build_arg_list.

just to get a different way to output the program

That's a significant difference in my opinion.

instead of only modifying the function that outputs the result itself.

It'd be awesome if we could do stdout=subprocess.PIPE by only modifying the function that outputs the result, but we can't unfortunately. Hence option 1 as a suggestion.

I can understand that this makes things more coupled right now for a possible future refactoring, but I'm sure that in future refactoring, we'll not need to have hundreds of code duplicated.

That's my point, I'd rather have some code duplication than dependency to deprecated code. I'm going to be able deal with run_process even if you introduce a call to it now, but it's just easier to make these considerations before publishing new API.

@Baltoli
Copy link
Collaborator

Baltoli commented Jul 19, 2024

My 2c is that this is a perfectly legitimate duplication to introduce. The proof hints pipeline behaves very differently to the other ways that krun works, so it's legitimate to fully separate their python entrypoints.

@Baltoli Baltoli dismissed their stale review July 23, 2024 09:17

Waiting for final approval from Tamas on design choices

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
feature pyk Issues transferred from runtimeverification/pyk
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants