-
Notifications
You must be signed in to change notification settings - Fork 53
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Git file install: support for file creation from Git repositories #2754
Git file install: support for file creation from Git repositories #2754
Conversation
28e76e2
to
689784e
Compare
Let us know when you want us to have a review at this. |
Ready for review, not ready to go in IMO! |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
@benfitzpatrick, what's left to do? |
I'd like to do a bit of in-anger user-case testing if we feel it's close enough as-is? Shouldn't take too long. |
I've done some testing against real cases I could find in our workflows, found and fixed some issues in afddb2b. I tested a case with 6 GitHub file installs in a single rose-app.conf - we were worried about hitting a rate limit. It worked fine and looked like these happen sequentially, at least from the verbose output? Happy to demo offline. Otherwise I'm happy |
Curious, I'll check that's because GitHub is serving the requests sequentially rather than Rose failing to run them in parallel. |
Little bump |
After a bit of digging I'm happy that these operations are being run concurrently, resulting in their subprocesses being run in parallel. To track subprocesses, try the following diff: diff --git a/metomi/rose/popen.py b/metomi/rose/popen.py
index 8993fb93..8ec00c94 100644
--- a/metomi/rose/popen.py
+++ b/metomi/rose/popen.py
@@ -372,6 +372,7 @@ class RosePopener:
proc = Popen(args[0], **kwargs)
else:
command = ' '.join(map(shlex.quote, args))
+ print(f'# {command}') # process started
proc = await asyncio.create_subprocess_shell(command, **kwargs)
except OSError as exc:
if exc.filename is None and args:
@@ -392,7 +393,9 @@ class RosePopener:
if isinstance(kwargs.get("stdin"), str):
stdin = kwargs.get("stdin")
stdout, stderr = await proc.communicate(stdin)
+ command = ' '.join(map(shlex.quote, args))
await proc.wait()
+ print(f'$ {command}') # process returned
return proc.returncode, stdout, stderr
async def run_ok_async(self, *args, **kwargs):
Using this diff with the following config: [command]
default=true
[file:cylc/scheduler.py]
source=git:git@github.com:cylc/cylc-flow.git::cylc/flow/scheduler.py::master
[file:cylc/task_proxy.py]
source=git:https://github.com/cylc/cylc-flow.git::cylc/flow/task_proxy.py::master
[file:cylc/async_util.py]
source=git:https://github.com/cylc/cylc-flow.git::cylc/flow/async_util.py::master I get this output showing up to three subprocesses running simultaneously:
You can also test it like this: diff --git a/metomi/rose/popen.py b/metomi/rose/popen.py
index 8993fb93..35bdbe27 100644
--- a/metomi/rose/popen.py
+++ b/metomi/rose/popen.py
@@ -371,7 +371,8 @@ class RosePopener:
if kwargs.get("shell"):
proc = Popen(args[0], **kwargs)
else:
- command = ' '.join(map(shlex.quote, args))
+ # command = ' '.join(map(shlex.quote, args))
+ command = 'sleep 5'
proc = await asyncio.create_subprocess_shell(command, **kwargs)
except OSError as exc:
if exc.filename is None and args: The time taken should increase linearly as the sleep is increased, but only see a small increase (parsing overheads) when more files are configured for installation. |
Ah, I know why I thought they were sequential - the parsing is serial, the actual pulling can be concurrent. Sorry-thanks! |
if loc.loc_type == "tree": | ||
dest += "/" | ||
cmd = self.manager.popen.get_cmd("rsync", name, dest) | ||
await self.manager.popen.run_ok_async(*cmd) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm getting a failure here:
[INFO] 2024-06-03T11:56:18+0100 rsync -a '--exclude=.*' --timeout=1800 '--rsh=ssh -oBatchMode=yes -oStrictHostKeyChecking=no -oConnectTimeout=8' /var/tmp/tmpzpxb2lpj/README.md /var/tmp/tmpjul8aid7/e99d80506f38bfee39893fd042f2f5d6
[FAIL] 2024-06-03T11:56:19+0100 rsync -a '--exclude=.*' --timeout=1800 '--rsh=ssh -oBatchMode=yes -oStrictHostKeyChecking=no -oConnectTimeout=8' /var/tmp/tmpzpxb2lpj/README.md /var/tmp/tmpjul8aid7/e99d80506f38bfee39893fd042f2f5d6 # return-code=3, stderr=
[FAIL] 2024-06-03T11:56:19+0100 rsync: change_dir#3 "/var/tmp/tmpjul8aid7" failed: No such file or directory (2)
[FAIL] 2024-06-03T11:56:19+0100 rsync error: errors selecting input/output files, dirs (code 3) at main.c(694) [Receiver=3.1.2]
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(status update - we're finding it hard to independently reproduce this one, currently a mystery)
ad11e99
to
5a16399
Compare
Co-authored-by: Oliver Sanders <oliver.sanders@metoffice.gov.uk>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Code review - LGTM, some small comments.
@@ -856,7 +857,7 @@ def parse(self, loc, conf_tree): | |||
# Scheme specified in the configuration. | |||
handler = self.get_handler(loc.scheme) | |||
if handler is None: | |||
raise ValueError(loc.name) | |||
raise ValueError(f"don't support scheme {loc.scheme}") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
[minor] maybe a slightly cryptic error message for users to digest:
raise ValueError(f"don't support scheme {loc.scheme}") | |
raise ValueError(f"Rose doesn't support scheme {loc.scheme}") |
@@ -865,7 +866,7 @@ def parse(self, loc, conf_tree): | |||
if handler is None: | |||
handler = self.guess_handler(loc) | |||
if handler is None: | |||
raise ValueError(loc.name) | |||
raise ValueError(f"don't know how to process {loc.name}") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
[minor] maybe a slightly cryptic error message for users to digest:
raise ValueError(f"don't know how to process {loc.name}") | |
raise ValueError(f"Rose doesn't know how to process {loc.name}") |
# sparse-checkout available and suitable for this case. | ||
await self.manager.popen.run_ok_async( | ||
"git", git_dir_opt, "sparse-checkout", "set", path, | ||
"--no-cone" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm guessing this is required, but worth noting the documented warnings:
"git", git_dir_opt, f"--work-tree={tmpdirname}", "checkout", | ||
loc.key | ||
) | ||
name = tmpdirname + "/" + path |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
[minor] os.path.join
would be nicer.
You should set ``git config uploadpack.allowFilter true`` and | ||
``git config uploadpack.allowAnySHA1InWant true`` on repositories | ||
if you are setting them up to pull from. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe worth explaining that this only applies to repos within your management (e.g. local repos).
E.G. this doesn't apply to a GitHub hosted repository (I think)?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🚀
I've opened #2784 - it looks like the bug I found is unrelated to this work. |
Co-authored-by: Oliver Sanders <oliver.sanders@metoffice.gov.uk>
Co-authored-by: Ronnie Dutta <61982285+MetRonnie@users.noreply.github.com>
Whilst working on #2785, I've noticed that "cannot connect to" type errors appear to surface with the following message:
Where the underlying cause here is that GitHub SSH access has not been configured. |
I could do a "or remote not contactable?" on the end of the error message? |
If it's possible to differentiate between these cases, great, but if not that'll be fine. |
git: differentiate between no such branch and no such repo
Waiting for tests to pass again, then ready to merge. |
Good news, the tests pass. Bad news, if some of the tests skip, then the test fails, e.g: diff --git a/t/rose-app-run/28-git.t b/t/rose-app-run/28-git.t
index 1b4eaaa90..92a411611 100644
--- a/t/rose-app-run/28-git.t
+++ b/t/rose-app-run/28-git.t
@@ -72,12 +72,12 @@ remote_locations=("$HOSTNAME:$TEST_DIR/hellorepo/" "http://localhost:$GIT_WS_POR
for i in 0 1 2; do
remote_mode="${remote_test_modes[$i]}"
remote="${remote_locations[$i]}"
- if [[ "$remote_mode" == "ssh" ]] && ! ssh -n -q -oBatchMode=yes $HOSTNAME true 1>'/dev/null' 2>/dev/null; then
+ if true; then
skip 14 "cannot ssh to localhost $HOSTNAME"
echo "Skip $remote" >/dev/tty
continue
fi
- if [[ "$remote_mode" == "http" ]] && ! curl --head --silent --fail $remote >/dev/null 2>&1; then
+ if true; then
skip 14 "failed to launch http on localhost"
echo "Skip $remote" >/dev/tty
continue :(
Not a release blocker. Suggest bumping this to future work.
|
@MetRonnie, could you check the last couple of commits and merge if you're happy with the tests being pushed to a bugfix release milestone. |
Brill, that's fixed it (will close issue). Tested with all possible skip combinations, test passed each time. |
(Mac OS tests failing due to gh-actions DNS issues, safe to ignore) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
merge if you're happy with the tests being pushed to a bugfix release milestone.
It's a minor release rather than a bugfix release though?
(the test fixes would have been on a bugfix release milestone, however, there is no need for a post-fix, Ben has sorted it on this PR) |
Work in progress!
Aims to address #1419.
I'm not that happy with the configuration syntax, although I am happy with the three main elements of it as elements.
The mechanism itself uses filtering and either partial clones or proper sparse checkouts where the Git version allows.