Archive module does not add empty (zero-size) files to dest archive #34569

Open
chromium58 opened this issue Jan 8, 2018 · 15 comments

@chromium58
commented Jan 8, 2018

ISSUE TYPE
  • Bug Report
COMPONENT NAME

archive

ANSIBLE VERSION
ansible 2.4.2.0
CONFIGURATION
OS / ENVIRONMENT

N/A

SUMMARY

The archive module does not add empty (zero-size) files to the archive. I think it is caused by this condition: https://github.com/ansible/ansible/blob/devel/lib/ansible/modules/files/archive.py#L319

This condition compares each file being added to the archive with the just-created archive. For an empty file that comparison cannot work correctly. Why do we need this condition?

STEPS TO REPRODUCE
  1. Create a directory with a few empty files in it
  2. Fill all files except one with some data
  3. Apply the playbook to this directory
$ ls -la ~/test-archive-bug/
total 12
drwxrwxrwx 0 user user 4096 Jan  8 13:20 .
drwxr-xr-x 0 user user 4096 Jan  8 13:16 ..
-rw-rw-rw- 1 user user    2 Jan  8 13:12 1.txt
-rw-rw-rw- 1 user user    2 Jan  8 13:12 2.txt
-rw-rw-rw- 1 user user    0 Jan  8 13:12 3.txt
$ cat test.yml
- hosts: all
  tasks:
    - name: Test archive
      archive:
        path: /home/user/test-archive-bug
        dest: /home/user/test.tar.gz
        format: gz
$ ansible-playbook test.yml -c local -i localhost,
  4. See that the zero-size file was not added to the archive
EXPECTED RESULTS

All files, including the empty file, are added to the archive

ACTUAL RESULTS

The empty (zero-size) file was not added to the archive

@ansibot

Contributor

commented Jan 8, 2018

Files identified in the description:

If these files are inaccurate, please update the component name section of the description or use the !component bot command.


@alikins

Contributor

commented Jan 8, 2018

Looking through the git history, 2aa3f52 seems to have introduced the filecmp.cmp call.

It looks like that was added to handle some corner cases with empty archives and similar odd scenarios, so it's possible that it doesn't handle empty files correctly.

I haven't reproduced or bisected the bug yet, so that's not necessarily the cause.

@alikins alikins removed the needs_triage label Jan 8, 2018

@chromium58

Author

commented Jan 8, 2018

I did some research, and it looks like filecmp.cmp causes this issue. In my test case the file 3.txt is zero-size, and filecmp.cmp reports it as equal to my just-created tar.gz archive :)

$ ls -la /home/user/test-archive-bug
Total 16
drwxrwxr-x  2 user user 4096 Jan  8 20:02 .
drwxr-xr-x 87 user user 4096 Jan  8 20:36 ..
-rw-rw-r--  1 user user    2 Jan  8 20:03 1.txt
-rw-rw-r--  1 user user    2 Jan  8 20:03 2.txt
-rw-rw-r--  1 user user    0 Jan  8 20:02 3.txt
$ cat test-archive-bug.py
import filecmp
import os
import tarfile
import time

epoch_time = int(time.time())
path = '/home/user/test-archive-bug/'
dest = '/home/user/test' + str(epoch_time) + '.tar.gz'

# Open the archive for writing but do not add anything to it yet
arcfile = tarfile.open(dest, 'w|gz')

# Compare every source file with the freshly opened archive
for dirpath, dirnames, filenames in os.walk(path, topdown=True):
    for filename in filenames:
        fullpath = os.path.join(dirpath, filename)
        result = filecmp.cmp(fullpath, dest)
        print('Comparing ' + fullpath + ' with ' + dest + ' --- result is ' + str(result))

arcfile.close()
$ python --version
Python 2.7.12
$ python test-archive-bug.py
Comparing /home/user/test-archive-bug/2.txt with /home/user/test1515433022.tar.gz --- result is False
Comparing /home/user/test-archive-bug/3.txt with /home/user/test1515433022.tar.gz --- result is True
Comparing /home/user/test-archive-bug/1.txt with /home/user/test1515433022.tar.gz --- result is False
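
The same result can be reproduced without tarfile. A minimal sketch, assuming Python 3 and only the standard library: filecmp.cmp reports two distinct zero-byte files as equal, which is the state the empty source file and the still-empty archive are in when the comparison runs. The function name and the temporary file names are hypothetical.

```python
import filecmp
import os
import tempfile


def compare_empty_and_nonempty():
    """Return (empty_vs_empty, data_vs_empty) as reported by filecmp.cmp."""
    with tempfile.TemporaryDirectory() as tmp:
        empty_source = os.path.join(tmp, "3.txt")         # stands in for the zero-size file
        empty_archive = os.path.join(tmp, "test.tar.gz")  # stands in for the still-empty archive
        data_source = os.path.join(tmp, "1.txt")          # stands in for a file with content

        open(empty_source, "w").close()
        open(empty_archive, "w").close()
        with open(data_source, "w") as handle:
            handle.write("a\n")

        # Two empty regular files have the same size and the same (empty) content,
        # so filecmp.cmp treats them as equal.
        empty_vs_empty = filecmp.cmp(empty_source, empty_archive)
        # Files of different sizes are reported as different.
        data_vs_empty = filecmp.cmp(data_source, empty_archive)
        return empty_vs_empty, data_vs_empty


if __name__ == "__main__":
    print(compare_empty_and_nonempty())
```
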
@alikins

Contributor

commented Jan 8, 2018

I can reproduce this:

files:

[newswoop:F27:archive_empty_files_34569 (master % u=)]$ ls -laR
.:
total 20
drwxrwxr-x.  4 adrian adrian 4096 Jan  8 11:47 .
drwxrwxr-x. 71 adrian adrian 4096 Jan  8 11:41 ..
drwxrwxr-x.  2 adrian adrian 4096 Jan  8 11:42 test-archive-bug
-rw-rw-r--.  1 adrian adrian  179 Jan  8 11:47 test.yml
drwxrwxr-x.  2 adrian adrian 4096 Jan  8 11:47 the_dest

./test-archive-bug:
total 16
drwxrwxr-x. 2 adrian adrian 4096 Jan  8 11:42 .
drwxrwxr-x. 4 adrian adrian 4096 Jan  8 11:47 ..
-rw-rw-r--. 1 adrian adrian    6 Jan  8 11:42 1.txt
-rw-rw-r--. 1 adrian adrian   11 Jan  8 11:42 2.txt
-rw-rw-r--. 1 adrian adrian    0 Jan  8 11:42 3.txt

./the_dest:
total 8
drwxrwxr-x. 2 adrian adrian 4096 Jan  8 11:47 .
drwxrwxr-x. 4 adrian adrian 4096 Jan  8 11:47 ..

test.yml:

---
- hosts: localhost
  gather_facts: false
  tasks:
    - name: Test archive
      archive:
        path: test-archive-bug
        dest: the_dest/test.tar.gz
        format: gz

command output:

[newswoop:F27:archive_empty_files_34569 (master % u=)]$ ansible-playbook -i hosts -vv test.yml 
ansible-playbook 2.5.0 (devel 42dd48a6d0) last updated 2018/01/08 10:22:02 (GMT -400)
  config file = None
  configured module search path = [u'/home/adrian/.ansible/plugins/modules', u'/usr/share/ansible/plugins/modules']
  ansible python module location = /home/adrian/src/ansible/lib/ansible
  executable location = /home/adrian/src/ansible/bin/ansible-playbook
  python version = 2.7.14 (default, Dec 11 2017, 14:52:53) [GCC 7.2.1 20170915 (Red Hat 7.2.1-2)]
No config file found; using defaults

PLAYBOOK: test.yml ***************************************************************************************************
1 plays in test.yml

PLAY [localhost] *****************************************************************************************************
META: ran handlers

TASK [Test archive] **************************************************************************************************
task path: /home/adrian/src/ansible-bug-repro/archive_empty_files_34569/test.yml:5
changed: [localhost] => {"archived": ["test-archive-bug/1.txt", "test-archive-bug/2.txt"], "arcroot": "/", "attempts": 1, "changed": true, "dest": "the_dest/test.tar.gz", "expanded_exclude_paths": [], "expanded_paths": ["test-archive-bug"], "gid": 1000, "group": "adrian", "missing": [], "mode": "0664", "owner": "adrian", "secontext": "unconfined_u:object_r:user_home_t:s0", "size": 183, "state": "file", "uid": 1000}
META: ran handlers
META: ran handlers

PLAY RECAP ***********************************************************************************************************
localhost                  : ok=1    changed=1    unreachable=0    failed=0   

the_dest/

[newswoop:F27:archive_empty_files_34569 (master % u=)]$ ls -lart the_dest/
total 12
drwxrwxr-x. 4 adrian adrian 4096 Jan  8 11:50 ..
drwxrwxr-x. 2 adrian adrian 4096 Jan  8 11:50 .
-rw-rw-r--. 1 adrian adrian  183 Jan  8 11:50 test.tar.gz

the_dest/test.tar.gz contents:

[newswoop:F27:archive_empty_files_34569 (master % u=)]$ tar tlf the_dest/test.tar.gz 
test-archive-bug/1.txt
test-archive-bug/2.txt
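
The missing entry can also be checked programmatically. A minimal sketch, assuming Python 3 and that member names are stored relative to the parent of the source directory, as in the listing above; the helper name missing_from_archive is hypothetical.

```python
import os
import tarfile


def missing_from_archive(source_dir, archive_path):
    """Return files under source_dir that have no matching member in the archive.

    Assumes member names are stored relative to the parent of source_dir
    (for example "test-archive-bug/1.txt").
    """
    parent = os.path.dirname(os.path.abspath(source_dir))
    with tarfile.open(archive_path, "r:*") as archive:
        members = {os.path.normpath(name) for name in archive.getnames()}

    missing = []
    for dirpath, _dirnames, filenames in os.walk(source_dir):
        for filename in filenames:
            full_path = os.path.abspath(os.path.join(dirpath, filename))
            member_name = os.path.relpath(full_path, parent)
            if member_name not in members:
                missing.append(member_name)
    return sorted(missing)
```

For the archive shown above, missing_from_archive("test-archive-bug", "the_dest/test.tar.gz") would be expected to report test-archive-bug/3.txt.
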
@alikins

Contributor

commented Jan 8, 2018

Bisecting is somewhat ambiguous (lots of API changes), but it narrows the candidates down to these commits:

e10e1e358c1c6ebbdfb36b99d346b6233d4638d1
506f3f68f14b8e554bb9e4993f37918a8b6a4c9d
d4b51adf728b1c61b1f38d6d3209e64017755a66
102ee6a3b47f2c7aebaa90356e96bd282236df23
c9291e06f6d9f3e7e4558ed64ec588c93bb5d1c0
3c8d788c1196963c26277691ed95fd54b9c710fe
8fc2a22b4ce4066aafefe550876acd63d6de965c
fb03fc8eb104f5c85da39e8daccd8bf84026e032

Of those, these look like suspects:

pinging @bendoh

@lck


commented Jul 25, 2018

Can anyone please finally fix this bug? It has been there since filecmp.cmp was introduced, and it really can produce unpredictable results in production! Thank you.

@abadger

Member

commented Dec 14, 2018

The fix for this bug should be to stat both files. If they have a different number of bytes, treat them as different. If they are both zero bytes, treat them as the same. Otherwise, run filecmp on them to see whether they are the same or different. It's a reasonably easy fix, but this module needs a maintainer to work on it.
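
The comparison described above could be sketched roughly as follows. This is only an illustration of the three steps (stat, size check, content check), not the module's code; the helper name files_look_identical is hypothetical, and shallow=False is an assumption made so the last step compares file contents.

```python
import filecmp
import os


def files_look_identical(path_a, path_b):
    """Size-first comparison: stat both files, then compare contents only if needed."""
    size_a = os.stat(path_a).st_size
    size_b = os.stat(path_b).st_size

    # Different byte counts: the files cannot be the same.
    if size_a != size_b:
        return False
    # Both zero bytes: treated as the same, with no content to read.
    if size_a == 0:
        return True
    # Same non-zero size: fall back to a content comparison.
    return filecmp.cmp(path_a, path_b, shallow=False)
```

Under this rule an empty source file and a still-empty archive continue to compare as the same, so the caller would still need to decide what to do with that case.
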

@Alexander198961

Contributor

commented Dec 25, 2018

Just to make sure I understand the logic correctly: if the file and the archive are different, we currently handle the file (it is added to the archive), but if they are the same, we don't add the file to the archive. So maybe it would be better to also write the file when it is zero bytes (because we have the situation where the archive and the file are both zero bytes and are treated as the same)? As far as I can tell, in all cases we compare a file with the newly created archive. I am not sure we need to compare the file (which we would like to add to the archive) with the dest archive at all.
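
If the only purpose of the comparison is to keep the destination archive from being added to itself (an assumption; the thread does not confirm this), a path-identity check would avoid looking at file contents at all. A minimal sketch with a hypothetical helper name, not the module's actual fix:

```python
import os


def is_destination(candidate, dest):
    """Return True only when candidate and dest refer to the same file on disk."""
    try:
        return os.path.samefile(candidate, dest)
    except OSError:
        # One of the paths does not exist (for example, dest has not been created yet).
        return False
```

With a check like this, an empty file would be skipped only if it actually is the destination archive, not merely because both happen to be zero bytes.
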

@outreal


commented Feb 9, 2019

I've been having this very problem with ansible 2.7.7.
It breaks some of my deployments, as they depend on some files existing, whether they're zero-sized or not.

@NitigyaS


commented Feb 12, 2019

Hi, I am a newcomer here. I have submitted a fix for this in #52117.
Could someone please review it and suggest any changes?

@jpmens

Contributor

commented Jun 7, 2019

This bug report was opened 1.5 years ago. It is marked as easy_fix and bug. It still demonstrates the same behavior (missing files in archive == data loss) in ansible 2.8.0.

cc: @bsdlme

@gundalow

Contributor

commented Jun 10, 2019

Does #52117 fix the issue for people?

@jpmens

Contributor

commented Jun 10, 2019

#56139 works for me.

@jpmens

Contributor

commented Jun 10, 2019

#52117 works for me.
