Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add archive option in git plugin #23332

Merged
merged 2 commits into from
May 15, 2017
Merged

Add archive option in git plugin #23332

merged 2 commits into from
May 15, 2017

Conversation

Akasurde
Copy link
Member

@Akasurde Akasurde commented Apr 6, 2017

SUMMARY

This fix will add archive option in git plugin

Fixes #22943

ISSUE TYPE
  • Feature Pull Request
COMPONENT NAME

lib/ansible/modules/source_control/git.py

ANSIBLE VERSION
ansible 2.2.1.0

@ansibot ansibot added WIP This issue/PR is a work in progress. Nevertheless it was shared for getting input from peers. affects_2.4 This issue/PR affects Ansible v2.4 core_review In order to be merged, this PR must follow the core review workflow. feature_pull_request module This issue/PR relates to a module. needs_triage Needs a first human triage before being processed. labels Apr 6, 2017
@ansibot
Copy link
Contributor

ansibot commented Apr 6, 2017

The test ansible-test sanity --test pep8 failed with the following error:

lib/ansible/modules/source_control/git.py:196:1: W293 blank line contains whitespace

The test ansible-test sanity --test validate-modules failed with the following error:

lib/ansible/modules/source_control/git.py:0:0: E309 version_added for new option (archive) should be 2.4. Currently 2.1

click here for bot help

@ansibot ansibot added ci_verified Changes made in this PR are causing tests to fail. and removed core_review In order to be merged, this PR must follow the core review workflow. labels Apr 6, 2017
@Akasurde
Copy link
Member Author

Akasurde commented Apr 6, 2017

@s-hertel I have started working on this PR

@s-hertel s-hertel removed the needs_triage Needs a first human triage before being processed. label Apr 6, 2017
@s-hertel
Copy link
Contributor

s-hertel commented Apr 6, 2017

@Akasurde Hey, I'm sorry I keep missing you on IRC. I will get any messages you send to me there. I'm PTO today and tomorrow so I haven't been at the computer much.

@ansibot
Copy link
Contributor

ansibot commented Apr 6, 2017

@ansibot ansibot added core_review In order to be merged, this PR must follow the core review workflow. and removed ci_verified Changes made in this PR are causing tests to fail. labels Apr 7, 2017
Copy link

@randynobx randynobx left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just a few things I noticed that could be improved or fixed. @Akasurde I'm on irc (randynobx) if you want to PM me. I'm happy to give any assistance I can.

@@ -899,6 +917,8 @@ def main():
recursive=dict(default='yes', type='bool'),
track_submodules=dict(default='no', type='bool'),
umask=dict(default=None, type='raw'),
archive=dict(default='no', type='bool'),
archive_format=dict(default="zip"),

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You can force choices here by adding the choices key:
archive_format=dict(default='zip', choices=['zip', 'tar']
By doing this, AnsibleModule will handle the argument input for you (what default is, and to fail if a bad choice is given)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK.

@@ -977,6 +999,17 @@ def main():
track_submodules = module.params['track_submodules']

result.update(before=None)

if archive_fmt not in ['zip', 'tar']:
archive_fmt = 'tar'

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You don't need this if block here, since the AnsibleModule arg handling will do it for you.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Cool.

archive_fmt = 'tar'

if archive:
cmd = "%s archive --format=%s --remote=%s --output=%s master" \

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should this command not be using a variable for the branch, in the case that it is specified by the version argument in a playbook?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK. I will replace master with version argument in a playbook.

archive_fmt = 'tar'

if archive:
cmd = "%s archive --format=%s --remote=%s --output=%s master" \
Copy link
Member

@ganeshrn ganeshrn Apr 8, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will this command work on all supported platforms? I tried this patch on mac and it failed.

Example I used:

  - git:
      repo: 'https://github.com/ganeshrn/ansible.git'
      dest: /var/tmp/ansible
      archive: yes
TASK [git] ************************************************************************************************************************************************
fatal: [localhost]: FAILED! => {"changed": false, "failed": true, "msg": "Failed to perform archive operation"}

Copy link

@randynobx randynobx Apr 8, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

git-archive does not work with http URLs, only git:// URLs.

If you run the command:
% git archive --format=tar --remote='https://github.com/ganeshrn/ansible.git' --output=/var/tmp/ansible master
You get the following:
fatal: Operation not supported by protocol.

This is probably something that should be in the documentation, and checked for in the code.

Copy link
Contributor

@s-hertel s-hertel left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks good. Gets a shipit from me after a few things are addressed.

cmd = "%s clone %s %s" %(git_path, repo, repo_name)
(rc, out, err) = module.run_command(cmd, cwd=dest)
if rc != 0:
module.fail_json(msg="Failed to clone source")
Copy link
Contributor

@s-hertel s-hertel Apr 11, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This will fail if the destination exists. Is that the desired behavior? Since it is making a directory, maybe it should check if the directory already exists and if not create it? If it does exist, maybe we should be able to continue onto the archiving step.
In addition, here's a hypothetical scenario: if this archiving fails the first time because the user uses an invalid URL to archive this part of the operation will succeed and the directory will be created. When the URL is fixed and this operation will run a second time it will fail here and not be readily apparent why. I think importing traceback and then failing with module.fail_json(msg="Failed to clone source", exception=traceback.format_exc()) would be really helpful here for future debugging.

(rc, out, err) = module.run_command(cmd, cwd="%s/%s" % (dest,
repo_name))
if rc != 0:
module.fail_json(msg="Failed to perform archive operation")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Adding traceback (module.fail_json(msg="Failed to perform archive operation", exception=traceback.format_exc())) to help here too would be nice, since git archive doesn't always work right out of the box.

repo_name))
if rc != 0:
module.fail_json(msg="Failed to perform archive operation")
module.exit_json(**result)
Copy link
Contributor

@s-hertel s-hertel Apr 11, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This isn't showing changed=True for me when is succeeds. In fact, I don't know if it will ever complete without changing something, since right now we're failing if the git archive command isn't successful and if the directory creation isn't successful. Maybe a nice addition would be to compare the checksums of the contents of the archive destination to see if they are different than initially so changed could be False. But that would entail putting adding the zipfile to the directory, since I think we wouldn't want to rewrite the possibly existing file. Not sure what the best course of action is here. But exit with changed=True here, for now at least.

@ansibot ansibot removed the core_review In order to be merged, this PR must follow the core review workflow. label Apr 11, 2017
@ansibot ansibot added core_review In order to be merged, this PR must follow the core review workflow. and removed core_review In order to be merged, this PR must follow the core review workflow. labels Apr 15, 2017
@@ -977,6 +1011,39 @@ def main():
track_submodules = module.params['track_submodules']

result.update(before=None)

if archive:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is behaving as expected now. :) I noticed one more thing. Since this module supports check mode and archive will always change things (by creating directories, cloning, creating and archive - and replacing if it was pre-existing) lines 1016-1044 should only report the things that will change if check mode is used rather than actually implementing it. Here is an example of it being used elsewhere in this module.

@Akasurde Akasurde changed the title [WIP] Add archive option in git plugin Add archive option in git plugin Apr 18, 2017
@s-hertel
Copy link
Contributor

shipit

@ansibot ansibot removed the WIP This issue/PR is a work in progress. Nevertheless it was shared for getting input from peers. label Apr 18, 2017
@Akasurde
Copy link
Member Author

@Jmainguy Yes. I have tested on following OS's

  1. Fedora 25
  2. RHEL7
  3. MacOS 10.12

and, this works as expected.

Copy link
Contributor

@robinro robinro left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @Akasurde for this PR and on integrating the previous comments.

Besides the inline comment one idea that could be implement in here or in a future PR:
The changed detection in checkmode could be improved by taking the changed state of the checkout and if that is unchanged generate the archive and compare it.

# Git archive is not supported by all git servers, so
# we will first clone and perform git archive from local directory
if dest and not os.path.exists(gitconfig):
clone(git_path, module, repo, dest, remote, depth, version, bare,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why do the clone here and not use the clone/fetch/... code that already exists?

The archive option in this way works only with a local checkout, which the module already does, so I'd rather use the existing checkout setup and run archive afterwards.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@robinro Do you mean after this code ? I added archive code here because I didn't want to complicate other part of code.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I meant at the end of the module, so after https://github.com/ansible/ansible/blob/devel/lib/ansible/modules/source_control/git.py#L1067

At that point you can assume dest exists and has the version you want checked out, so you can remove this if/clone statement and just move the rest of the code down.

Since the main routine is quite long it's probably a good idea to put all the archive code in a separate routine, such that in main you have something like

if archive:
    create_archive(module, archive, git_path, version)

# If git archive file exists, then create md5sum of existing file
# and compare it with new git archive file. if match, do nothing
# if does not match, then replace existing with temp archive file.
md5sum_old = hashlib.md5(open(archive, 'rb').read()).hexdigest()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Instead of calculating the hash, you can compare the file content directly using e.g. https://docs.python.org/2/library/filecmp.html
Since you read both files full anyhow, I don't see how the hash helps.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK.

@Akasurde
Copy link
Member Author

@robinro Take a look, I included all review comments.

# filecmp is supposed to be efficient than md5sum checksum
if filecmp.cmp(new_archive, archive):
result.update(changed=False)
module.exit_json(**result)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

replace this with return, the actual exit_json call happens later and there is some ssh_wrapper code in the main routine that should run.

Where do you remove tempdir/new_archive? Maybe that code got lost in one of the last refactorings.
Remove the temp part should also happen before the exit.

Copy link
Member Author

@Akasurde Akasurde Apr 19, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK. I will remove new temp archive after comparison.

module.exit_json(**result)

create_archive(git_path, module, dest, archive, version, repo, result)
result.update(changed=True)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please move the result.update into the create_archive such that

if os.path.exists(..):
   ...
else:
   result.update(changed=True)
   git_archive()

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK.


create_archive(git_path, module, dest, archive, version, repo, result)
result.update(changed=True)
module.exit_json(**result)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Remove this.
The final module exit is 10 lines down from here.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK. Done.

# If git archive file exists, then compare it with new git archive file.
# if match, do nothing
# if does not match, then replace existing with temp archive file.
tempdir = tempfile.mkdtemp()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The remove is in the wrong place and tempdir is not removed. Better:

tempdir = tempfile.mkdtemp()
new_archive = os.path.join(tempdir,  'archive.' + archive_fmt)
git_archive(git_path, module, dest, new_archive, archive_fmt, version)

archive_unchanged = filecmp.cmp(new_archive, archive)
shutil.rmtree(tempdir)

if archive_unchanged:
    result.update(changed=False)
else:
...

@Akasurde
Copy link
Member Author

ready_for_review

except OSError:
pass
# filecmp is supposed to be efficient than md5sum checksum
if filecmp.cmp(new_archive, archive):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You remove before the comparison, see my previous comment.
Please test these code paths.

@Akasurde
Copy link
Member Author

ready_for_review

This fix will add archive option in git plugin

Fixes ansible#22943

Signed-off-by: Abhijeet Kasurde <akasurde@redhat.com>
Signed-off-by: Abhijeet Kasurde <akasurde@redhat.com>
@Akasurde
Copy link
Member Author

Akasurde commented May 4, 2017

ready_for_review

Copy link
Contributor

@robinro robinro left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks OK for me.

@robinro
Copy link
Contributor

robinro commented May 4, 2017

@willthames you have an open review: Can you review the latest version?

@Akasurde
Copy link
Member Author

@willthames Could you please review this PR?

@robinro
Copy link
Contributor

robinro commented May 15, 2017

As far as I cen tell all of @willthames comments are included, so:
shipit

@ansibot ansibot added stale_ci This PR has been tested by CI more than one week ago. Close and re-open this PR to get it retested. and removed stale_ci This PR has been tested by CI more than one week ago. Close and re-open this PR to get it retested. labels May 15, 2017
@s-hertel s-hertel merged commit e967274 into ansible:devel May 15, 2017
@Akasurde
Copy link
Member Author

Thanks everyone for comments and review.

@Akasurde Akasurde deleted the gitarchive branch May 16, 2017 01:49
KKoukiou pushed a commit to KKoukiou/ansible that referenced this pull request May 22, 2017
* Add archive option in git plugin

This fix will add archive option in git plugin

Fixes ansible#22943

Signed-off-by: Abhijeet Kasurde <akasurde@redhat.com>

* Update code as per code review comments

Signed-off-by: Abhijeet Kasurde <akasurde@redhat.com>
@ansibot ansibot added feature This issue/PR relates to a feature request. and removed feature_pull_request labels Mar 4, 2018
@ansible ansible locked and limited conversation to collaborators Apr 26, 2019
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
affects_2.4 This issue/PR affects Ansible v2.4 feature This issue/PR relates to a feature request. module This issue/PR relates to a module. needs_revision This PR fails CI tests or a maintainer has requested a review/revision of the PR.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Archive option for git module
8 participants