-
Notifications
You must be signed in to change notification settings - Fork 23.7k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add archive option in git plugin #23332
Conversation
The test
The test
|
@s-hertel I have started working on this PR |
@Akasurde Hey, I'm sorry I keep missing you on IRC. I will get any messages you send to me there. I'm PTO today and tomorrow so I haven't been at the computer much. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just a few things I noticed that could be improved or fixed. @Akasurde I'm on irc (randynobx) if you want to PM me. I'm happy to give any assistance I can.
@@ -899,6 +917,8 @@ def main(): | |||
recursive=dict(default='yes', type='bool'), | |||
track_submodules=dict(default='no', type='bool'), | |||
umask=dict(default=None, type='raw'), | |||
archive=dict(default='no', type='bool'), | |||
archive_format=dict(default="zip"), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You can force choices here by adding the choices key:
archive_format=dict(default='zip', choices=['zip', 'tar']
By doing this, AnsibleModule will handle the argument input for you (what default is, and to fail if a bad choice is given)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK.
@@ -977,6 +999,17 @@ def main(): | |||
track_submodules = module.params['track_submodules'] | |||
|
|||
result.update(before=None) | |||
|
|||
if archive_fmt not in ['zip', 'tar']: | |||
archive_fmt = 'tar' |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You don't need this if
block here, since the AnsibleModule arg handling will do it for you.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Cool.
archive_fmt = 'tar' | ||
|
||
if archive: | ||
cmd = "%s archive --format=%s --remote=%s --output=%s master" \ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should this command not be using a variable for the branch, in the case that it is specified by the version
argument in a playbook?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK. I will replace master with version
argument in a playbook.
archive_fmt = 'tar' | ||
|
||
if archive: | ||
cmd = "%s archive --format=%s --remote=%s --output=%s master" \ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Will this command work on all supported platforms? I tried this patch on mac and it failed.
Example I used:
- git:
repo: 'https://github.com/ganeshrn/ansible.git'
dest: /var/tmp/ansible
archive: yes
TASK [git] ************************************************************************************************************************************************
fatal: [localhost]: FAILED! => {"changed": false, "failed": true, "msg": "Failed to perform archive operation"}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
git-archive does not work with http URLs, only git:// URLs.
If you run the command:
% git archive --format=tar --remote='https://github.com/ganeshrn/ansible.git' --output=/var/tmp/ansible master
You get the following:
fatal: Operation not supported by protocol.
This is probably something that should be in the documentation, and checked for in the code.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks good. Gets a shipit from me after a few things are addressed.
cmd = "%s clone %s %s" %(git_path, repo, repo_name) | ||
(rc, out, err) = module.run_command(cmd, cwd=dest) | ||
if rc != 0: | ||
module.fail_json(msg="Failed to clone source") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This will fail if the destination exists. Is that the desired behavior? Since it is making a directory, maybe it should check if the directory already exists and if not create it? If it does exist, maybe we should be able to continue onto the archiving step.
In addition, here's a hypothetical scenario: if this archiving fails the first time because the user uses an invalid URL to archive this part of the operation will succeed and the directory will be created. When the URL is fixed and this operation will run a second time it will fail here and not be readily apparent why. I think importing traceback and then failing with module.fail_json(msg="Failed to clone source", exception=traceback.format_exc())
would be really helpful here for future debugging.
(rc, out, err) = module.run_command(cmd, cwd="%s/%s" % (dest, | ||
repo_name)) | ||
if rc != 0: | ||
module.fail_json(msg="Failed to perform archive operation") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Adding traceback (module.fail_json(msg="Failed to perform archive operation", exception=traceback.format_exc())
) to help here too would be nice, since git archive doesn't always work right out of the box.
repo_name)) | ||
if rc != 0: | ||
module.fail_json(msg="Failed to perform archive operation") | ||
module.exit_json(**result) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This isn't showing changed=True for me when is succeeds. In fact, I don't know if it will ever complete without changing something, since right now we're failing if the git archive command isn't successful and if the directory creation isn't successful. Maybe a nice addition would be to compare the checksums of the contents of the archive destination to see if they are different than initially so changed could be False. But that would entail putting adding the zipfile to the directory, since I think we wouldn't want to rewrite the possibly existing file. Not sure what the best course of action is here. But exit with changed=True here, for now at least.
@@ -977,6 +1011,39 @@ def main(): | |||
track_submodules = module.params['track_submodules'] | |||
|
|||
result.update(before=None) | |||
|
|||
if archive: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is behaving as expected now. :) I noticed one more thing. Since this module supports check mode and archive will always change things (by creating directories, cloning, creating and archive - and replacing if it was pre-existing) lines 1016-1044 should only report the things that will change if check mode is used rather than actually implementing it. Here is an example of it being used elsewhere in this module.
shipit |
@Jmainguy Yes. I have tested on following OS's
and, this works as expected. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @Akasurde for this PR and on integrating the previous comments.
Besides the inline comment one idea that could be implement in here or in a future PR:
The changed
detection in checkmode could be improved by taking the changed
state of the checkout and if that is unchanged generate the archive and compare it.
# Git archive is not supported by all git servers, so | ||
# we will first clone and perform git archive from local directory | ||
if dest and not os.path.exists(gitconfig): | ||
clone(git_path, module, repo, dest, remote, depth, version, bare, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why do the clone here and not use the clone/fetch/... code that already exists?
The archive option in this way works only with a local checkout, which the module already does, so I'd rather use the existing checkout setup and run archive afterwards.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I meant at the end of the module, so after https://github.com/ansible/ansible/blob/devel/lib/ansible/modules/source_control/git.py#L1067
At that point you can assume dest exists and has the version you want checked out, so you can remove this if/clone statement and just move the rest of the code down.
Since the main routine is quite long it's probably a good idea to put all the archive code in a separate routine, such that in main you have something like
if archive:
create_archive(module, archive, git_path, version)
# If git archive file exists, then create md5sum of existing file | ||
# and compare it with new git archive file. if match, do nothing | ||
# if does not match, then replace existing with temp archive file. | ||
md5sum_old = hashlib.md5(open(archive, 'rb').read()).hexdigest() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Instead of calculating the hash, you can compare the file content directly using e.g. https://docs.python.org/2/library/filecmp.html
Since you read both files full anyhow, I don't see how the hash helps.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK.
@robinro Take a look, I included all review comments. |
# filecmp is supposed to be efficient than md5sum checksum | ||
if filecmp.cmp(new_archive, archive): | ||
result.update(changed=False) | ||
module.exit_json(**result) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
replace this with return
, the actual exit_json
call happens later and there is some ssh_wrapper code in the main routine that should run.
Where do you remove tempdir/new_archive? Maybe that code got lost in one of the last refactorings.
Remove the temp part should also happen before the exit
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK. I will remove new temp archive after comparison.
module.exit_json(**result) | ||
|
||
create_archive(git_path, module, dest, archive, version, repo, result) | ||
result.update(changed=True) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please move the result.update
into the create_archive
such that
if os.path.exists(..):
...
else:
result.update(changed=True)
git_archive()
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK.
|
||
create_archive(git_path, module, dest, archive, version, repo, result) | ||
result.update(changed=True) | ||
module.exit_json(**result) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Remove this.
The final module exit is 10 lines down from here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK. Done.
# If git archive file exists, then compare it with new git archive file. | ||
# if match, do nothing | ||
# if does not match, then replace existing with temp archive file. | ||
tempdir = tempfile.mkdtemp() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The remove is in the wrong place and tempdir is not removed. Better:
tempdir = tempfile.mkdtemp()
new_archive = os.path.join(tempdir, 'archive.' + archive_fmt)
git_archive(git_path, module, dest, new_archive, archive_fmt, version)
archive_unchanged = filecmp.cmp(new_archive, archive)
shutil.rmtree(tempdir)
if archive_unchanged:
result.update(changed=False)
else:
...
ready_for_review |
except OSError: | ||
pass | ||
# filecmp is supposed to be efficient than md5sum checksum | ||
if filecmp.cmp(new_archive, archive): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You remove before the comparison, see my previous comment.
Please test these code paths.
ready_for_review |
This fix will add archive option in git plugin Fixes ansible#22943 Signed-off-by: Abhijeet Kasurde <akasurde@redhat.com>
Signed-off-by: Abhijeet Kasurde <akasurde@redhat.com>
ready_for_review |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks OK for me.
@willthames you have an open review: Can you review the latest version? |
@willthames Could you please review this PR? |
As far as I cen tell all of @willthames comments are included, so: |
Thanks everyone for comments and review. |
* Add archive option in git plugin This fix will add archive option in git plugin Fixes ansible#22943 Signed-off-by: Abhijeet Kasurde <akasurde@redhat.com> * Update code as per code review comments Signed-off-by: Abhijeet Kasurde <akasurde@redhat.com>
SUMMARY
This fix will add archive option in git plugin
Fixes #22943
ISSUE TYPE
COMPONENT NAME
lib/ansible/modules/source_control/git.py
ANSIBLE VERSION