Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Importing old exported histories failing #4268

Merged
merged 3 commits into from Jul 12, 2017

Conversation

Projects
None yet
5 participants
@cche
Copy link
Contributor

commented Jun 29, 2017

I have exported histories on older versions of galaxy (<17.01) in order to keep them as backup. I can not import them anymore in newer version.

While exporting histories on older versions of galaxy (using the bioblend API) there was a subdirectory created at the base of the archive (Galaxy-History-). This is not the format expected by the import history tool which expects all the data to be found on the base of the archive and the datasets in a 'datasets' subdirectory. This produces a 'history_attrs.txt' file not found error and the import fails.

The workaround implemented is to check for the existence of 'history_attrs.txt' on the base of the archive and if not found check for the existence in the subdirectory of the archive.

Cheers,
Cristian

@galaxybot galaxybot added the triage label Jun 29, 2017

@galaxybot galaxybot added this to the 17.09 milestone Jun 29, 2017

dirs = os.listdir(archive_dir)
for d in dirs:
if os.path.isdir(os.path.join(archive_dir, d)):
archive_dir = os.path.join(archive_dir, d)

This comment has been minimized.

Copy link
@mvdbeek

mvdbeek Jun 29, 2017

Member

The indentation is off here, it should be 4 spaces always.
Also I think you can add a break after this, since we expect only load a single archive_dir.

This comment has been minimized.

Copy link
@cche

cche Jun 29, 2017

Author Contributor

Ok. Done!

@@ -74,6 +74,11 @@ def get_tag_str( tag, value ):
#
# Create history.
#
if not os.path.exists(os.path.join( archive_dir, 'history_attrs.txt')):

This comment has been minimized.

Copy link
@mvdbeek

mvdbeek Jun 30, 2017

Member

Sorry for the piecemeal review, but could you drop the whitespace before archive_dir, move the added code before the existing comment and add a comment saying something like '# Galaxy pre 17.01 stored exported histories in a subdirectory ?

@mvdbeek

This comment has been minimized.

Copy link
Member

commented Jun 30, 2017

So I tried exporting a history from 16.07 and importing to dev, and that worked without the PR (which is not to say it works all the time of course). Do you have a small example of a failing import ? Otherwise the code looks fine to me.

@cche

This comment has been minimized.

Copy link
Contributor Author

commented Jun 30, 2017

@cche

This comment has been minimized.

Copy link
Contributor Author

commented Jun 30, 2017

@nsoranzo

This comment has been minimized.

Copy link
Member

commented Jun 30, 2017

@cche Some of us prefer adding a space between a parenthesis and the content, and many don't. But that should be done consistently and both after the opening and before the closing parenthesis.

Cristian
Moved the code up and added an explanatory comment.
Also added spaces within the parens to match the existing style.
@mvdbeek

This comment has been minimized.

Copy link
Member

commented Jun 30, 2017

@galaxybot test this

@jmchilton

This comment has been minimized.

Copy link
Member

commented Jul 4, 2017

@cche I don't think that history attached - I don't see it anywhere in the PR. Can you try again using the web interface somehow or send it to jmchilton@gmail.com?

@cche

This comment has been minimized.

Copy link
Contributor Author

commented Jul 4, 2017

@jmchilton I answered directly to the mail. Here is the history that I exported from a Galaxy instance that, according to the dates of the exported files, was almost surely version 16.01.
deseq-hzu-109-110-111.tar.gz

@jmchilton

This comment has been minimized.

Copy link
Member

commented Jul 12, 2017

It took some effort - but definitely able to replicate this locally with that file and I was able to get Galaxy to generate those files. I'd like to find out why this works sometimes and not others - but for now I really appreciate the fix!

@jmchilton jmchilton merged commit 641a268 into galaxyproject:dev Jul 12, 2017

5 checks passed

api test Build finished. 279 tests run, 0 skipped, 0 failed.
Details
continuous-integration/travis-ci/pr The Travis CI build passed
Details
framework test Build finished. 150 tests run, 0 skipped, 0 failed.
Details
integration test Build finished. 34 tests run, 0 skipped, 0 failed.
Details
toolshed test Build finished. 579 tests run, 0 skipped, 0 failed.
Details
@jmchilton

This comment has been minimized.

Copy link
Member

commented Jul 12, 2017

For future-@jmchilton or others to look into this more - here is a patch to allow testing the import of that history:

diff --git a/lib/galaxy/tools/imp_exp/__init__.py b/lib/galaxy/tools/imp_exp/__init__.py
index 3fde45703f..62af7f4605 100644
--- a/lib/galaxy/tools/imp_exp/__init__.py
+++ b/lib/galaxy/tools/imp_exp/__init__.py
@@ -69,6 +69,8 @@ class JobImportHistoryArchiveWrapper( object, UsesAnnotations ):
         if jiha:
             try:
                 archive_dir = jiha.archive_dir
+                log.info("Archive dir is %s" % archive_dir)
+                log.info("Archive dir contents are %s" % os.listdir(archive_dir))
                 user = jiha.job.user

                 #
diff --git a/test/api/test_histories.py b/test/api/test_histories.py
index 6a36e7ae5d..8d8538cfbd 100644
--- a/test/api/test_histories.py
+++ b/test/api/test_histories.py
@@ -212,6 +212,12 @@ class HistoriesApiTestCase( api.ApiTestCase ):
         )
         assert imported_content == "1 2 3\n"

+    def test_import_from_older(self):
+        import_data = dict( archive_source="https://github.com/galaxyproject/galaxy/files/1122357/deseq-hzu-109-110-111.tar.gz", archive_type="url" )
+        import_response = self._post( "histories", data=import_data )
+        import time
+        time.sleep(60)
+
     def test_create_tag( self ):
         post_data = dict( name="TestHistoryForTag" )
         history_id = self._post( "histories", data=post_data ).json()["id"]

Run the tests with:

./run_tests.sh -api test/api/test_histories.py:HistoriesApiTestCase.test_import_from_older
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
You can’t perform that action at this time.