Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Multi-process bagit.validate breaks AIP re-ingest #708

Closed
jrwdunham opened this issue Aug 10, 2017 · 3 comments
Closed

Multi-process bagit.validate breaks AIP re-ingest #708

jrwdunham opened this issue Aug 10, 2017 · 3 comments
Assignees
Labels
Type: bug A flaw in the code that causes the software to produce an incorrect or unexpected result.
Milestone

Comments

@jrwdunham
Copy link
Contributor

To re-create: using AM qa/1.x and SS qa/0.x, create an AIP and then attempt to do a metadata-only reingest on it. AM should spin its wheels for 5 seconds and then notify you that "Error re-ingesting package: An unknown error occurred." If you insert some logging lines before and after https://github.com/artefactual/archivematica-storage-service/blob/qa/0.x/storage_service/locations/models/package.py#L843, you will see that the log messages after that line do not get called: the call to bag.validate(processes=4) hangs indefinitely. If you replace that call with the original bag.validate() everything works as normal.

Note that the bag should be valid and validation should pass—and does pass when the processes kwarg is NOT passed in.

I have tried to recreate this issue in a simplified environment but have not had success. The following python script tries multi-process BagIt validation (on a separate thread for good measure) using a bag created on the fly as well as on a preexisting bag called 'breaking-transfer' that you must place in the same directory as the script. Create a file called testbagit.py and copy this code into it and then call python testbagit.py; you should see bagit behaving correctly:

import bagit
import os
import sys
import shutil
import time
from uuid import uuid4
from threading import Thread


DIR_TO_CREATE = 'bagittestdir'
REAL_BROKEN_BAG = 'breaking-transfer'
REAL_BROKEN_BAG_COPY = REAL_BROKEN_BAG + '-copy'


def create_test_dir():
    try:
        os.mkdir(DIR_TO_CREATE)
    except OSError:
        pass
    for _ in range(20):
        fname = str(uuid4()) + '.txt'
        fpath = os.path.join(DIR_TO_CREATE, fname)
        with open(fpath, 'w') as fd:
            fd.write('bagit test!')


def create_bag():
    create_test_dir()
    bag = bagit.make_bag(DIR_TO_CREATE, {'Contact-Name': 'John Kunze'})


def _validate_bag(bag_path):
    print 'starting to validate bag %s' % bag_path
    sys.stdout.flush()
    bag = bagit.Bag(bag_path)
    try:
        success = bag.validate(processes=4)
        print 'bag %s is valid!' % bag_path
    except bagit.BagValidationError as e:
        print 'got error on bag validate on %s' % bag_path
        print e
    print 'done validating bag %s' % bag_path
    sys.stdout.flush()


def validate_bag(bag_path):
    #_validate_bag(bag_path)
    t = Thread(target=_validate_bag, args=(bag_path,))
    t.start()
    t.join()


def validate_created_bag():
    validate_bag(DIR_TO_CREATE)


def invalidate_created_bag():
    path = os.path.join(DIR_TO_CREATE, 'data')
    fname = os.listdir(path)[0]
    fpath = os.path.join(path, fname)
    with open(fpath, 'w') as fd:
        fd.write('I changed this file!')


def destroy_test_dir():
    shutil.rmtree(DIR_TO_CREATE)


def destroy_broken_bag_copy():
    shutil.rmtree(REAL_BROKEN_BAG_COPY)


def copy_broken_bag():
    shutil.copytree(REAL_BROKEN_BAG, REAL_BROKEN_BAG_COPY)


def validate_broken_bag():
    validate_bag(REAL_BROKEN_BAG_COPY)


def experiment1():
    """Create a dir, make it into a bag, invalidate the bag by changing a file,
    attempt to validate it with multiple processes, then destroy it.
    """
    create_bag()
    invalidate_created_bag()
    validate_created_bag()
    destroy_test_dir()


def experiment2():
    """Copy an existing purportedly broken bag, attempt to validate it on a
    separate thread with multiple processes, then destroy the copy.
    """
    copy_broken_bag()
    validate_broken_bag()
    destroy_broken_bag_copy()


if __name__ == '__main__':
    experiment1()
    experiment2()

Does anyone have any idea what is causing this issue? Are you able to re-create the issue given my description above? I plan to create a SS PR that reverts the bag.validate(processes=4) calls to bag.validate() but obviously I would prefer to keep the multi-process call figure out the real cause of the problem...

@jrwdunham jrwdunham added this to the 1.7.0 milestone Aug 10, 2017
@jrwdunham jrwdunham added the Type: bug A flaw in the code that causes the software to produce an incorrect or unexpected result. label Aug 10, 2017
@jrwdunham
Copy link
Contributor Author

Thanks to sevein for figuring out the source of this: it's an incompatibility between Gunicorn's gevent worker class and BagIt's use of multiprocessing. In order for bag.validate(processes=4) to work, Gunicorn must be using the sync worker class.

qubot pushed a commit to artefactual/archivematica-storage-service that referenced this issue Aug 11, 2017
Allows user to configure number of processes to use when calling
BagIt-python's ``validate`` method. Using more than one when the
Gunicorn worker class is 'gevent' causes BagIt validate to hang. This
change warns the user/installer of Archivematica about that in two
places.
qubot pushed a commit to artefactual/archivematica-storage-service that referenced this issue Aug 11, 2017
Allows user to configure number of processes to use when calling
BagIt-python's ``validate`` method. Using more than one when the
Gunicorn worker class is 'gevent' causes BagIt validate to hang. This
change warns the user/installer of Archivematica about that in two
places.
qubot pushed a commit to artefactual/archivematica-storage-service that referenced this issue Aug 11, 2017
Allows user to configure number of processes to use when calling
BagIt-python's ``validate`` method. Using more than one when the
Gunicorn worker class is 'gevent' causes BagIt validate to hang. This
change warns the user/installer of Archivematica about that in two
places.
@jrwdunham
Copy link
Contributor Author

helenst pushed a commit to JiscSD/archivematica-storage-service that referenced this issue Aug 23, 2017
Allows user to configure number of processes to use when calling
BagIt-python's ``validate`` method. Using more than one when the
Gunicorn worker class is 'gevent' causes BagIt validate to hang. This
change warns the user/installer of Archivematica about that in two
places.
helenst added a commit to JiscSD/archivematica-storage-service that referenced this issue Aug 23, 2017
Allows user to configure number of processes to use when calling
BagIt-python's ``validate`` method. Using more than one when the
Gunicorn worker class is 'gevent' causes BagIt validate to hang. This
change warns the user/installer of Archivematica about that in two
places.
@kellyannewithane
Copy link

Bug still present (augustus.qa)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Type: bug A flaw in the code that causes the software to produce an incorrect or unexpected result.
Projects
None yet
Development

No branches or pull requests

4 participants