Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[fact] Remove wrapper for capturing output of subprocess #981

Merged
merged 10 commits into from
Feb 4, 2022
Merged

[fact] Remove wrapper for capturing output of subprocess #981

merged 10 commits into from
Feb 4, 2022

Conversation

algomaster99
Copy link
Contributor

I feel the wrapper captured_run is not required because the output was actually captured in just one method. Moreover, the parameters of subprocess.run are not a lot and quite explanatory of what we are trying to achieve.

@algomaster99 algomaster99 changed the title refactor: remove wrapper for subprocess execution refactor: remove wrapper for capturing output of subprocess Dec 29, 2021
@algomaster99 algomaster99 changed the title refactor: remove wrapper for capturing output of subprocess [fact] Remove wrapper for capturing output of subprocess Dec 29, 2021
Copy link
Collaborator

@slarse slarse left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi @algomaster99 ,

Your idea is not bad, but you've missed a side effect of capturing the output, see comment.

@@ -69,7 +69,7 @@ def _ensure_repo_dir_exists(clone_spec: CloneSpec) -> None:


def _git_init(dirpath):
captured_run(["git", "init"], cwd=str(dirpath))
subprocess.run("git init".split(), cwd=str(dirpath))
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think you're missing something here. If the output is not captured, then it will go to stdout/stderr. That means that this function now causes the output of git init to be displayed to the user, which is not desirable.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, I thought the default behaviour would be to discard the output. In more technical terms, it would redirect the output to /dev/null. Since we do not care about the output here, should we do redirect the stdout and stderr to /dev/null?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, I thought the default behaviour would be to discard the output.

A child process inherits the file descriptor table of its parent process (this is consistent across all operating systems and programming languages I know of), so if the parent process outputs to stdout and stderr, so will the child.

Since we do not care about the output here, should we do redirect the stdout and stderr to /dev/null?

We could do that, but the benefit over just using captured_run isn't all too clear to me. Perhaps there's something different we can do here? Rename the function perhaps?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I feel that using subprocess.run is very clear by itself given it is a common python function and we have passed very few arguments to it. If it needed a lot of extra arguments, a wrapper would be desirable in that case.

When I saw this function, my first thought was to reuse this everywhere where subprocess.run was used. However, it was definitely not needed because of the same arguments above. Moreover, for this specific case, we don't even want to capture output so a function name captured_run shouldn't be invoked in my opinion.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I feel that using subprocess.run is very clear

It's not that it isn't clear, it's just an incredibly noisy function that takes way too many arguments.

Moreover, for this specific case, we don't even want to capture output so a function name captured_run shouldn't be invoked in my opinion.

So, originally there were two functions: captured_run and quiet_run, I removed quiet_run because it was semantically identical to invoking captured_run without using the captured output. This is why I suggested a rename of the function such that it's less confusing, because semantically this is what we want. Whether or not the output goes to devnull or to a temporary buffer that's then flushed doesn't matter.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it's just an incredibly noisy function that takes way too many arguments.

In our case, we just pass four arguments and two of them are common - stdout and stderr. In my opinion, it was not enough for me to create a wrapper around it.

This is why I suggested a rename of the function such that it's less confusing

It is not confusing at all. I wanted to use captured_run wherever we invoke subprocess.run for the sake of uniformity. In most of the places where the latter is invoked, the same set of arguments is provided. However, there were a lot of occurrences of subprocess.run and they were readable as well. Thus, I thought to remove captured_run instead to achieve the goal I thought of.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is not confusing at all.

To some extent it has to be confusing, as in refactoring, you inadvertently changed the semantics of the program by letting the output go to stdout/stderr.

Anyway, there's a neater option available as of #986, because in Python 3.7+ there's a capture_output argument to subprocess.run. So we can just set capture_output=True, which would more conclusively obviate the need for a wrapper function.

@slarse
Copy link
Collaborator

slarse commented Jan 27, 2022

@algomaster99 Now we can actually make this update more objectively worthwhile. As we've now dropped support for Python 3.6, we can use the capture_output parameter to supbrocess.run to capture output :)

@algomaster99
Copy link
Contributor Author

@slarse

make this update more objectively worthwhile

I agree. The wrapper is no longer needed now :)

@algomaster99
Copy link
Contributor Author

@slarse The failing test passes on my local so it is making it harder to debug. Do you have an idea what could be the reason?

@slarse
Copy link
Collaborator

slarse commented Feb 3, 2022

It failed for me with Python 3.10.1, but it seems unrelated to your changes. Something probably changed with asyncio going from 3.10.0 to 3.10.1.

I've pushed a potential fix, let's see if it works :)

@slarse
Copy link
Collaborator

slarse commented Feb 3, 2022

Indeed, get_event_loop was deprecated in 3.10: https://docs.python.org/3/library/asyncio-eventloop.html#asyncio.get_event_loop

But tests are still failing. I'm relatively certain it's not due to your changes. My lunch hour is up soon so I'll have to return to this tonight :)

@slarse
Copy link
Collaborator

slarse commented Feb 4, 2022

Hmm, so this is now passing for me locally. Not quite sure what the problem is.

@slarse
Copy link
Collaborator

slarse commented Feb 4, 2022

Pretty sure this has something to do with the version of Git and how the repos are initialized. This error message is prevalent:

fatal: couldn't find remote ref HEAD

But the expected error message is with an uppercase c:

fatal: Couldn't find remote ref HEAD

So they probably made it lowercase at some point. I just need to casefold here and it should be fixed.

@algomaster99
Copy link
Contributor Author

Thanks for debugging it :)
I think this is ready for merge? We can ignore the complaint by codecov.

@slarse
Copy link
Collaborator

slarse commented Feb 4, 2022

Agreed, thanks @algomaster99 :)

@slarse slarse merged commit 9295e75 into repobee:master Feb 4, 2022
@algomaster99 algomaster99 deleted the remove-subprocess-wrapper branch February 4, 2022 17:37
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants