Miscellaneous fixups of a miscellaneous nature #171
Conversation
I am glad you found the way back out of the rabbit hole! Going to review and test your changes in detail.
src/aurman/wrappers.py
-expac_base_query = "expac {} '{}'".format(option, "?!".join(["%{}".format(formatter) for formatter in formatting]))
-base_query_length = len(expac_base_query.encode("utf8"))
-queries_parameters = split_query_helper(max_query_length, base_query_length, 1, targets)
+cmd = ["expac", option, "?!".join(["%{}".format(formatter) for formatter in formatting])] + targets
I don't even understand what this used to be doing. Why does this need to be split? ARG_MAX is quite reasonable, but if not, then it should accept names on stdin.
When run with subprocess.run, the command may be too long to be executed. I researched those values when I wrote that part; the query really does need to be split if it is too long.
Well, splitting at 131071 doesn't even make sense; how do you know what ARG_MAX is?
But like I said, if you're genuinely afraid the number of targets will overflow ARG_MAX, the appropriate response is not to split it into multiple commands, but to switch modes of operation and tell expac to read from stdin, which is not limited by ARG_MAX.
It's not about the number of arguments, but the number of characters in the query.
ARG_MAX is the kernel-defined character (not argument) limit of a command line passed through exec(2). Is there something different, something expac-specific, that you are referring to?
Because the number in question is the current value of ARG_MAX from uapi/linux/limits.h, but again this is not really a reliable way of determining what is safe, and it's probably better to skip the issue entirely.
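As an aside, rather than hardcoding 131071, the current limit can at least be queried at runtime. This is a sketch, not aurman's code; note the usable budget is still smaller than the reported value, since the environment also counts against it:

```python
import os

# Query the kernel's ARG_MAX at runtime instead of hardcoding a
# value copied from uapi/linux/limits.h. The environment block and
# argv pointers also count against this limit, so treat it as an
# upper bound, not an exact budget.
arg_max = os.sysconf("SC_ARG_MAX")
print(arg_max)
```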
Oh, sorry, I misinterpreted the meaning of ARG_MAX then; I guess that's the value I looked up back when I researched the maximum limit. However, I do not see any way listed on https://github.com/falconindy/expac to make expac read from stdin.
If you use `-` as a target (and /dev/stdin is not a terminal), expac will read targets from standard input. For example (using printf so standard input is not a terminal):
$ printf "%s\n" aurman coreutils pacman
aurman
coreutils
pacman
$ printf "%s\n" aurman coreutils pacman | expac -Q "%n %v" -
aurman 2.12-1
coreutils 8.29-1
pacman 5.1.0-2
Yup, like I said, and specifically implemented in 4ac65c9
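The same pattern from Python, as a minimal sketch: subprocess.run can feed all the targets through stdin via input=, so argv stays tiny regardless of target count. Since expac is Arch-specific, `sort` stands in for `expac ... -` here:

```python
import subprocess

# Feed the targets through stdin instead of argv; the command line
# itself stays constant-size no matter how many targets there are.
# `sort` is a stand-in for `expac -Q "%n %v" -`, which is Arch-only.
targets = ["pacman", "aurman", "coreutils"]
proc = subprocess.run(
    ["sort"],
    input="\n".join(targets) + "\n",
    stdout=subprocess.PIPE,
    universal_newlines=True,
    check=True,
)
print(proc.stdout)  # → aurman, coreutils, pacman (one per line)
```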
src/docker_tests/install_tests.py
 test_command("rm -rf expac-git/")

 # install aurman
-test_command("sudo python setup.py install --optimize=1", "/home/aurman/aurman-git")
+test_command("sudo python setup.py install --optimize=1", dir_to_execute=d(d(d(a(__file__)))))
This makes the tests less reliant on being run inside docker; hardcoded paths kind of suck.
src/docker_tests/install_tests.py
-    return_code = run(command, shell=True, stdout=DEVNULL, stderr=DEVNULL).returncode
-else:
-    return_code = run(command, shell=True, stdout=DEVNULL, stderr=DEVNULL, cwd=dir_to_execute).returncode
+test_output = run(command, shell=True, stdout=DEVNULL, stderr=PIPE, universal_newlines=True, cwd=dir_to_execute)
This whole function actually could be used a lot more generically everywhere. I stopped trying to see error output on Travis partway through, so it only touched this file. Oh well.
It often seems to claim failure; rebased with a more thorough fix for error codes, which might fix the timing or else expose the error more obviously.
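A minimal sketch of the capture-stderr-on-failure pattern the reworked test helper uses. The function name here is illustrative, not aurman's actual helper:

```python
from subprocess import run, DEVNULL, PIPE

def check_command(command, cwd=None):
    """Run a shell command, discarding stdout but capturing stderr
    so failures can actually be reported. Illustrative stand-in for
    the reworked test_command."""
    result = run(command, shell=True, stdout=DEVNULL, stderr=PIPE,
                 universal_newlines=True, cwd=cwd)
    if result.returncode != 0:
        print("Error with: '{}'".format(command))
        print(result.stderr)
    return result

# A deliberately failing command: its stderr survives for reporting.
r = check_command("echo oops >&2; exit 3")
```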
Actually I had planned to not use "cower" anymore, due to the needed PGP key. I had noticed that too, and wanted to change all that when rewriting most of the docker tests. I had planned to do that after my exams, which start on Monday... fun times
src/aurman/classes.py
 # if last commit seen hash file does not exist - create
 if not os.path.isfile(last_commit_hash_file):
     empty_tree_hash = run(
-        "git hash-object -t tree --stdin < /dev/null",
-        shell=True,
+        ["git", "hash-object", "-t", "tree", "/dev/null"],
E.g. here: how do you know that this is doing exactly the same as executing `git hash-object -t tree --stdin < /dev/null`?
Well, running them both produces the same result, because they're both reading the contents of /dev/null. The hash is actually a constant in the git source code, but using hash-object is recommended because it does not depend on git using SHA1.
Okay, maybe I have not made clear what I wanted to know. The list form `["git", "hash-object", "-t", "tree", "/dev/null"]` leaves out `--stdin <`, so why is exactly that still being executed?
`["git", "hash-object", "-t", "tree", "/dev/null"]` will execute `git hash-object -t tree /dev/null`.
`git hash-object <file>` is equivalent to `git hash-object --stdin < <file>` (see `man git-hash-object`).
`git hash-object -t tree /dev/null` is the recommended method of deriving the empty tree object (see, for example, the official sample pre-commit hook).
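The "constant in the git source code" mentioned above can be reproduced directly: a git object id is SHA-1 over a `<type> <size>\0<content>` header plus the content, so the empty tree hashes a fixed 7-byte header and nothing else. A quick check in plain Python:

```python
import hashlib

# git object id = sha1(b"<type> <size>\x00" + content); for the
# empty tree the content is empty, so only the header is hashed.
empty_tree_hash = hashlib.sha1(b"tree 0\x00").hexdigest()
print(empty_tree_hash)  # → 4b825dc642cb6eb9a060e54bf8d69288fbee4904
```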
src/docker_tests/install_tests.py
 test_command("rm -rf expac-git/")

 # install aurman
-test_command("sudo python setup.py install --optimize=1", "/home/aurman/aurman-git")
+test_command("sudo python setup.py install --optimize=1", dir_to_execute=d(d(d(a(__file__)))))
In the previous invocation, you omit `dir_to_execute`. Since this invocation is in the same directory, doesn't it make sense to omit it here?
Who omitted what?
I modified an existing instance to derive the `dir_to_execute` from the code itself rather than making assumptions about where the testsuite is cloned to.
Ah, I was referring to the fact that the tests are executed from /home/aurman anyway. But your approach seems more robust.
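For reference, the `d(d(d(a(__file__))))` spelling in the diff is just nested `dirname` calls over the absolute path of the current file, climbing three directory levels up to the repository root wherever it happens to be cloned. A minimal sketch:

```python
from os.path import abspath as a, dirname as d

# Three nested dirname() calls climb three directory levels above
# this file's location, independent of where the repo was cloned.
repo_root = d(d(d(a(__file__))))
print(repo_root)
```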
src/docker_tests/archive_tests.py
     print("Error with: '{}'".format(command))

-if CurrentTest.to_return == 0 and return_code != 0:
+if CurrentTest.to_return == 0 and test_output.returncode != 0:
Why is it necessary to test for `CurrentTest.to_return == 0`?
Not especially relevant to this PR, I guess.
Thanks to @eli-schwartz for your effort, and also thanks to @saleemrashid for the explanations and the effort trying to further improve this PR. I am currently preparing for exams, hence I am not going to merge this over the next days (I guess), because I also want to look through the whole PR.
Rebased on top of #173
Have changed the docker tests, see: 63b126a. Currently still using …
I changed the keyserver for the docker tests; I guess using the "pool" was the problem. I took the ubuntu keyserver for the joke of it.
Force-pushed from 321746f to de85983
Status?
Got my last exam next week; will look into and merge this PR probably on the weekend, so 6-8 July.
Rebased to resolve conflicts with #186
I have cherry-picked some commits; the rest to be done next week.
Don't fork subprocesses in order to handle things the stdlib can do in-process (python has a very decent C module for this). Also DRY. Also better chained error messages.
In nearly every case it is better to run subprocess.run() without the burden of forking a bash shell too. All it takes is to turn strings into lists of strings and *remove* some use of formatting. Aurman, specifically, does not use *any* of the fancy shell interpretation that sometimes makes forking a shell to fork the commands a worthwhile endeavor. So don't do so. This is further cleanup after the removal of subprocesses for file operations. Only changes where this cleanly applies have been done for now; in a followup commit, subprocesses that handle arguments as heavily-quoted strings will be handled.
Some functions pass around lots of string-formatted arguments. Handling this properly would require major refactoring to use lists everywhere. For now, use shlex to convert strings to a proper list-based array that subprocess.run can elegantly handle without running its own shell parser.
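A sketch of that interim approach: shlex.split understands shell quoting rules, so a heavily-quoted command string becomes a list that subprocess.run can execute without shell=True. The expac command here is just an illustrative input:

```python
import shlex

# shlex.split honors shell quoting, so a quoted argument like
# '%n %v' survives as a single list element rather than two words.
cmd = shlex.split("expac -Q '%n %v' -")
print(cmd)  # → ['expac', '-Q', '%n %v', '-']
```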
Instead of decoding the stdout of subprocess.run and then doing our own writes, just hand subprocess the file object and let it write directly.
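A sketch of that change: subprocess.run accepts any real file object for stdout, and the child process writes to it directly, with no decode-and-rewrite step in Python. The file path here is a throwaway example:

```python
import os
import subprocess
import tempfile

# Hand subprocess a real file object; the child writes into it
# directly instead of us decoding captured stdout and re-writing it.
path = os.path.join(tempfile.mkdtemp(), "out.txt")
with open(path, "w") as f:
    subprocess.run(["echo", "hello"], stdout=f, check=True)

with open(path) as f:
    print(f.read())  # → hello
```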
Appending is a bit faster than constructing new lists on top of the old ones. It also avoids the need to use two lists.
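The difference that commit describes, sketched with throwaway names: concatenation allocates a fresh list each time, while extend grows the existing one in place.

```python
names = ["pacman"]
new_names = ["aurman", "expac"]

# Concatenation builds a brand-new list object every time:
combined = names + new_names

# extend() (or +=) grows the existing list in place instead:
names.extend(new_names)

print(names)  # → ['pacman', 'aurman', 'expac']
```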
The develop branch looks good, and has the heroic quality of making aurman actually use native lists for this instead of the shlex dance, which is definitely nice. Thanks!
see: 5a40935...32e2ac6
Initially motivated by the desire to update the PGP error string, then I found subprocesses using shells everywhere and jumped down the rabbit hole.