Analysis refactor sync with master - DEPENDS ON #2138 (#2139)
* fix #1505

* improving some GUI stuff

* improving some GUI stuff - missing lines

* addressing all comments

* ready for review

* fix #1987

* initial commit

* requested changes

* fix filter job list

* Fixing server cert (#2051)

* fix get_studies

* flake8

* fix #503

* fix #2010

* fix #1913

* fix errors

* addressing @josenavas comment

* flake8

* fix #1010

* fix #1066 (#2058)

* addressing @josenavas comments

* fix #1961

* fix #1837

* Automatic jobs & new stats (#2057)

* fix #814, fix #1636

* fixing error in test-env

* fixing stats.html call

* adding img

* addressing @josenavas comments

* rm for loops

* addressing @ElDeveloper comments

* generalizing this functionality

* fix #1805

* adding button

* fix errors

* fix #1816

* fixing failing tests

* fix #1959

* addressing @josenavas comments

* addressing @josenavas comments

* fixing error

* fixed?

* addressing @josenavas comments

* addressing @wasade comments

* fix flake8

* generate biom and metadata release (#2066)

* initial commit

* adding portal

* addressing @josenavas comments

* pid -> qiita_artifact_id

* addressing @josenavas comments

* addressing @ElDeveloper comments

* rm 50.sql

* database changes to fix 969

* adding delete

* addressing @josenavas comments

* addressing @ElDeveloper comments

* duh!

* fix generate_biom_and_metadata_release (#2072)

* fix generate_biom_and_metadata_release

* addressing @ElDeveloper comment

* Removing qiita ware code that will not be used anymore

* Organizing the handlers and new analysis description page

* fixing timestamp

* rm formats

* st -> pt

* Connecting the analysis creation and making interface responsive

* Addressing @antgonza's comments

* Initial artifact GUI refactor

* Removing unused code

* moving to ISO 8601 - wow :'(

* fix errors

* addressing @wasade comments

* Adding can_edit call to the analysis

* Fixing artifact rest API since not all artifacts have study

* Adding can_be_publicized call to analysis

* Adding QiitaHTTPError to handle errors gracefully

* Adding safe_execution contextmanager

* Fixing typo

* Adding qiita test checker

* Adapting some artifact handlers

* Abstracting the graph reloading and adding some documentation

* Fixing typo

* Fixing changing artifact visibility

* Fixing delete

* Fixing artifact deletion

* Adding default parameters to the commands

* Fixing processing page

* Fixing variable name

* fixing private/public studies

* Changing bdiv metrics to single choice

* sanbox-to-sandbox

* flake8

* Fixing patch

* fixing other issues

* adding share documentation

* psycopg2 <= 2.7

* psycopg2 < 2.7

* Various small fixes to be able to run tests on the plugins

* Adding private module

* Fixing processing job completion

* Fixing patch 52

* Fixing call

* Fixing complete

* small fixes

* init commit

* fixing errors

* fixing errors due to update

* Making the download work

* Fixing tests

* working status

* adding tags, the right way!

* fix error

* Addressing @antgonza's comments

* Adding missing test

* Ignoring tgz - thanks @antgonza

* addressing @josenavas comments

* list study tags

* fix error

* adding tags to public

* adding docs

* addressing @wasade comment

* addressing @josenavas and @wasade comments

* addressing @wasade request

* fix #2091

* option 2: @ElDeveloper and @josenavas

* A minimal REST API for Qiita (#2094)

* TST: Add initial test cases for study handler

* ENH: Add initial study rest api

* API: test if a study exists

* ENH: oauth2 forced

* Get back basic study deets

* TST: test for samples collection

* API: rest get sample IDs from a study

* ENH: samples/info handler

* broken routes

* API: request sample metadata

* ENH/API: Add methods to check for a study person

* ENH/API: Add POST methods for study person

* TST: Add tests for from_name_and_affiliation

* TST: study creation

* BUG: Add headers to tests

* ENH: create study

* Adjust GET on study description

* API: Add endpoints for preparation creation

* TST: 200 :D

* TST: Correctly verify study instantiation

* TST: prep artifact creation

* ENH/API: associate artifacts with a preparation

* TST: test study status

* ENH: study status

* Removed trailing whitespace

* STY: PEP8

* MAINT: refactor, centralize setup boilerplate

* REFACTOR: Remove repeated code

* DOC: Remove unnecessary comments

* REFACTOR: Missing removal of pattern

* STY: Fix PEP8 errors

* BUG: Incorrectly changed error code

* BUG/TST: Fix typo in tests

* Addressing an @antgonza comment

* Another @antgonza comment

* RVW: Address review comments

* ENH: Cleanup webserver and name-spaces

* ENH: Improve error messages

* ENH: Add more descriptive error message

* TST: Exercise different argument types

* DOC: Add documentation for REST API

* ENH: Remove extra comma

* ENH/DOC: update/add samples to sample information via rest (#2097)

* Changing how artifact visibility works (#2098)

* changing how artifact visibility works

* fixing code

* fix errors

* fixing edit check access

* fix

* fix #2086

* flake8

* addressing @ElDeveloper comments + fixes

* adding the final changes

* fix failures

* get_qiita_version -> generate_biom_and_metadata_release

* download raw data

* adding missing empty files

* Adding endpoint to retrieve list of person (#2103)

* Adding missing endpoint

* Addressing @ElDeveloper comment

* fix #2086 (#2102)

* fix #2086

* flake8

* addressing @ElDeveloper comments + fixes

* adding the final changes

* fix failures

* get_qiita_version -> generate_biom_and_metadata_release

* addressing @wasade comments and fix errors

* fix error?

* rm vfabu + addressing @wasade and @josenavas comments + fix errors

* just being dumb!

* Deblur quality mention (#2107)

* Revised rst values used for section headers (#2108)

* Adding processing handlers

* Fixing latlongs (#2120)

* public studies are being shown in the user's own studies

* fix #2069 - adding tests

* flake8

* Fixing url and bug on processing job workflow

* Adding the private script runner

* Adding is_analysis column to the command

* Adding retrieval of commands excluding analysis commands

* Addressing bug on retrieving information from redis

* Enabling the command register endpoint to provide if the command is analysis only

* Improving study list speed (#2123)

* init commit

* reading if ...

* fixing tests

* rethinking listing

* split SQL

* resetting sql

* ignoring prep_total_samples

* finishing changes

* adding comment: @ElDeveloper

* adding message: @josenavas

* addressing @ElDeveloper, @josenavas @jdereus comments

* fixing download buttons show (#2127)

* fixing download buttons show

* addressing @ElDeveloper comment

* connecting tutorials to CMI

* adding link in main page

* fixing prep getting-started

* mv complex designs

* fix idents

* Addressing @antgonza's comments

* Addressing @wasade's comments

* Supporting multiple choice

* Adding documentation

* limiting number of jobs retrieved

* Modifying handler to pass allow_change_optionals

* returning optional parameters

* Addressing bug found by @antgonza

* Enabling changing the default parameters

* Adding correct class

* Allowing user to change default parameters

* Fixing bug with commands listing

* Enabling arbitrary htmls in the summary

* Prepping for merge hell

* Addressing @wasade's comments

* Addressing @antgonza's comment
josenavas authored and antgonza committed Jun 1, 2017
1 parent 2a1129a commit f5cff17
Showing 79 changed files with 3,867 additions and 1,722 deletions.
18 changes: 17 additions & 1 deletion qiita_core/tests/test_util.py
@@ -12,7 +12,8 @@
 from qiita_core.util import (
     send_email, qiita_test_checker, execute_as_transaction, get_qiita_version,
-    is_test_environment)
+    is_test_environment, get_release_info)
+from qiita_db.meta_util import generate_biom_and_metadata_release
 import qiita_db as qdb

@@ -64,6 +65,21 @@ def test_get_qiita_version(self):
         # testing just the version
         self.assertEqual(exp_version, qdb.__version__)
 
+    def test_get_release_info(self):
+        # making sure there is a release
+        generate_biom_and_metadata_release('private')
+        # just checking that it is not empty because the MD5 will change on
+        # every run
+        md5sum, filepath, timestamp = get_release_info('private')
+        self.assertNotEqual(md5sum, '')
+        self.assertNotEqual(filepath, '')
+        self.assertNotEqual(timestamp, '')
+
+        md5sum, filepath, timestamp = get_release_info('public')
+        self.assertEqual(md5sum, '')
+        self.assertEqual(filepath, '')
+        self.assertEqual(timestamp, '')
+
 
 if __name__ == '__main__':
     main()
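
Note that test_get_release_info depends on generate_biom_and_metadata_release having populated Redis first, which is why the test calls it before asserting. A hedged sketch of running just this module outside the full suite, assuming a configured Qiita test environment (database plus the Redis instance behind moi's r_client):

    # Sketch only: run qiita_core/tests/test_util.py on its own; assumes the
    # Qiita test environment (database and Redis) is already set up and the
    # package is importable from the current directory.
    import unittest

    suite = unittest.TestLoader().discover(
        'qiita_core/tests', pattern='test_util.py')
    unittest.TextTestRunner(verbosity=2).run(suite)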
30 changes: 30 additions & 0 deletions qiita_core/util.py
@@ -11,6 +11,7 @@
 from os.path import dirname
 from git import Repo
 from git.exc import InvalidGitRepositoryError
+from moi import r_client
 
 from qiita_core.qiita_settings import qiita_config
 from qiita_pet import __version__ as qiita_pet_lib_version
@@ -141,3 +142,32 @@ def get_qiita_version():
         sha = ''
 
     return (qiita_pet_lib_version, sha)
+
+
+def get_release_info(study_status='public'):
+    """Returns the MD5, filepath and timestamp of the study-status release
+
+    Parameters
+    ----------
+    study_status : str, optional
+        The study status to search for. Note that this should always be set
+        to 'public' but having this exposed helps with testing. The other
+        options are 'private' and 'sandbox'
+
+    Returns
+    -------
+    str, str, str
+        The release MD5, filepath and timestamp
+    """
+    portal = qiita_config.portal
+    md5sum = r_client.get('%s:release:%s:md5sum' % (portal, study_status))
+    filepath = r_client.get('%s:release:%s:filepath' % (portal, study_status))
+    timestamp = r_client.get('%s:release:%s:time' % (portal, study_status))
+    if md5sum is None:
+        md5sum = ''
+    if filepath is None:
+        filepath = ''
+    if timestamp is None:
+        timestamp = ''
+
+    return md5sum, filepath, timestamp
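
get_release_info only reads back what generate_biom_and_metadata_release (added to qiita_db/meta_util.py further down) stores in Redis under <portal>:release:<status>:<key>. A minimal sketch of that round trip, assuming moi's r_client is configured; the stored values are made up:

    # Sketch only: exercise the key scheme from the diff above with dummy
    # values, then read them back through get_release_info.
    from moi import r_client
    from qiita_core.qiita_settings import qiita_config
    from qiita_core.util import get_release_info

    portal, status = qiita_config.portal, 'public'
    for k, v in [('md5sum', 'abc123'),
                 ('filepath', 'releases/QIITA-public.tgz'),
                 ('time', '06-01-17 12:00:00')]:
        r_client.set('%s:release:%s:%s' % (portal, status, k), v)

    # returns ('abc123', 'releases/QIITA-public.tgz', '06-01-17 12:00:00');
    # each value falls back to '' when the corresponding key is missing
    md5sum, filepath, timestamp = get_release_info('public')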
30 changes: 17 additions & 13 deletions qiita_db/artifact.py
@@ -274,9 +274,11 @@ def create(cls, filepaths, artifact_type, name=None, prep_template=None,
 
         Notes
         -----
-        The visibility of the artifact is set by default to `sandbox`
-        The timestamp of the artifact is set by default to `datetime.now()`
-        The value of `submitted_to_vamps` is set by default to `False`
+        The visibility of the artifact is set by default to `sandbox` if
+        prep_template is passed but if parents is passed we will inherit the
+        most closed visibility.
+        The timestamp of the artifact is set by default to `datetime.now()`.
+        The value of `submitted_to_vamps` is set by default to `False`.
         """
         # We need at least one file
         if not filepaths:
@@ -689,19 +691,21 @@ def visibility(self, value):
             only applies when the new visibility is more open than before.
         """
         with qdb.sql_connection.TRN:
+            # In order to correctly propagate the visibility we need to find
+            # the root of this artifact and then propagate to all the artifacts
+            sql = "SELECT * FROM qiita.find_artifact_roots(%s)"
+            qdb.sql_connection.TRN.add(sql, [self.id])
+            root_id = qdb.sql_connection.TRN.execute_fetchlast()
+            root = qdb.artifact.Artifact(root_id)
+            # these are the ids of all the children from the root
+            ids = [a.id for a in root.descendants.nodes()]
+
             sql = """UPDATE qiita.artifact
                      SET visibility_id = %s
-                     WHERE artifact_id = %s"""
-            qdb.sql_connection.TRN.add(
-                sql, [qdb.util.convert_to_id(value, "visibility"), self.id])
+                     WHERE artifact_id IN %s"""
+            vis_id = qdb.util.convert_to_id(value, "visibility")
+            qdb.sql_connection.TRN.add(sql, [vis_id, tuple(ids)])
             qdb.sql_connection.TRN.execute()
-            # In order to correctly propagate the visibility upstream, we need
-            # to go one step at a time. By setting up the visibility of our
-            # parents first, we accomplish that, since they will propagate
-            # the changes to its parents
-            for p in self.parents:
-                visibilites = [[d.visibility] for d in p.descendants.nodes()]
-                p.visibility = qdb.util.infer_status(visibilites)
 
     @property
     def artifact_type(self):
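
The visibility setter now issues a single UPDATE over every artifact id hanging off the root, instead of walking parents one step at a time. The WHERE artifact_id IN %s form works because psycopg2 adapts a Python tuple parameter into a parenthesized list. A standalone sketch of that adaptation (not Qiita code; the database name and ids are made up):

    # Illustration of psycopg2 tuple adaptation, which the new single UPDATE
    # relies on; connection parameters and ids are hypothetical.
    import psycopg2

    conn = psycopg2.connect(dbname='qiita_test')
    with conn.cursor() as cur:
        vis_id, ids = 4, (1, 2, 3)
        # mogrify shows the rendered SQL: ... WHERE artifact_id IN (1, 2, 3)
        print(cur.mogrify("UPDATE qiita.artifact SET visibility_id = %s"
                          " WHERE artifact_id IN %s", (vis_id, ids)))
        cur.execute("UPDATE qiita.artifact SET visibility_id = %s"
                    " WHERE artifact_id IN %s", (vis_id, ids))
    conn.commit()

Note an empty tuple would render as IN () and raise a SQL syntax error, so the caller must guarantee at least one id; here the list always contains the root's descendants.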
126 changes: 117 additions & 9 deletions qiita_db/meta_util.py
@@ -25,7 +25,8 @@
 from __future__ import division
 
 from moi import r_client
-from os import stat
+from os import stat, makedirs, rename
+from os.path import join, relpath, exists
 from time import strftime, localtime
 import matplotlib.pyplot as plt
 import matplotlib as mpl
@@ -34,8 +35,11 @@
 from StringIO import StringIO
 from future.utils import viewitems
 from datetime import datetime
+from tarfile import open as topen, TarInfo
+from hashlib import md5
 
 from qiita_core.qiita_settings import qiita_config
+from qiita_core.configuration_manager import ConfigurationManager
 import qiita_db as qdb


@@ -126,14 +130,21 @@ def validate_filepath_access_by_user(user, filepath_id):
                 # the prep access is given by it's artifacts, if the user has
                 # access to any artifact, it should have access to the prep
                 # [0] cause we should only have 1
-                a = qdb.metadata_template.prep_template.PrepTemplate(
-                    pid[0]).artifact
-                if (a.visibility == 'public' or a.study.has_access(user)):
-                    return True
+                pt = qdb.metadata_template.prep_template.PrepTemplate(
+                    pid[0])
+                a = pt.artifact
+                # however, the prep info file may not have any artifacts
+                # attached; in that case we will use the study access level
+                if a is None:
+                    return qdb.study.Study(pt.study_id).has_access(user)
                 else:
-                    for c in a.descendants.nodes():
-                        if (c.visibility == 'public' or c.study.has_access(user)):
-                            return True
+                    if (a.visibility == 'public' or a.study.has_access(user)):
+                        return True
+                    else:
+                        for c in a.descendants.nodes():
+                            if ((c.visibility == 'public' or
+                                    c.study.has_access(user))):
+                                return True
                 return False
             # analyses
             elif anid:
@@ -305,7 +316,8 @@ def get_lat_longs():
                     WHERE table_name SIMILAR TO 'sample_[0-9]+'
                         AND table_schema = 'qiita'
                         AND column_name IN ('latitude', 'longitude')
-                        AND SPLIT_PART(table_name, '_', 2)::int IN %s;"""
+                        AND SPLIT_PART(table_name, '_', 2)::int IN %s
+                    GROUP BY table_name HAVING COUNT(column_name) = 2;"""
         qdb.sql_connection.TRN.add(sql, [tuple(portal_table_ids)])
 
         sql = [('SELECT CAST(latitude AS FLOAT), '
@@ -319,3 +331,99 @@
         qdb.sql_connection.TRN.add(sql)
 
     return qdb.sql_connection.TRN.execute_fetchindex()
+
+
+def generate_biom_and_metadata_release(study_status='public'):
+    """Generate a list of biom/metadata filepaths and a tgz of those files
+
+    Parameters
+    ----------
+    study_status : str, optional
+        The study status to search for. Note that this should always be set
+        to 'public' but having this exposed helps with testing. The other
+        options are 'private' and 'sandbox'
+    """
+    studies = qdb.study.Study.get_by_status(study_status)
+    qiita_config = ConfigurationManager()
+    working_dir = qiita_config.working_dir
+    portal = qiita_config.portal
+    bdir = qdb.util.get_db_files_base_dir()
+    time = datetime.now().strftime('%m-%d-%y %H:%M:%S')
+
+    data = []
+    for s in studies:
+        # [0] latest is first, [1] only getting the filepath
+        sample_fp = relpath(s.sample_template.get_filepaths()[0][1], bdir)
+
+        for a in s.artifacts(artifact_type='BIOM'):
+            if a.processing_parameters is None:
+                continue
+
+            cmd_name = a.processing_parameters.command.name
+
+            # this loop is necessary as in theory an artifact can be
+            # generated from multiple prep info files
+            human_cmd = []
+            for p in a.parents:
+                pp = p.processing_parameters
+                pp_cmd_name = pp.command.name
+                if pp_cmd_name == 'Trimming':
+                    human_cmd.append('%s @ %s' % (
+                        cmd_name, str(pp.values['length'])))
+                else:
+                    human_cmd.append('%s, %s' % (cmd_name, pp_cmd_name))
+            human_cmd = ', '.join(human_cmd)
+
+            for _, fp, fp_type in a.filepaths:
+                if fp_type != 'biom' or 'only-16s' in fp:
+                    continue
+                fp = relpath(fp, bdir)
+                # format: (biom_fp, sample_fp, prep_fp, qiita_artifact_id,
+                #          human readable name)
+                for pt in a.prep_templates:
+                    for _, prep_fp in pt.get_filepaths():
+                        if 'qiime' not in prep_fp:
+                            break
+                    prep_fp = relpath(prep_fp, bdir)
+                    data.append((fp, sample_fp, prep_fp, a.id, human_cmd))
+
+    # writing text and tgz file
+    ts = datetime.now().strftime('%m%d%y-%H%M%S')
+    tgz_dir = join(working_dir, 'releases')
+    if not exists(tgz_dir):
+        makedirs(tgz_dir)
+    tgz_name = join(tgz_dir, '%s-%s-building.tgz' % (portal, study_status))
+    tgz_name_final = join(tgz_dir, '%s-%s.tgz' % (portal, study_status))
+    txt_hd = StringIO()
+    with topen(tgz_name, "w|gz") as tgz:
+        # writing header for txt
+        txt_hd.write(
+            "biom_fp\tsample_fp\tprep_fp\tqiita_artifact_id\tcommand\n")
+        for biom_fp, sample_fp, prep_fp, artifact_id, human_cmd in data:
+            txt_hd.write("%s\t%s\t%s\t%s\t%s\n" % (
+                biom_fp, sample_fp, prep_fp, artifact_id, human_cmd))
+            tgz.add(join(bdir, biom_fp), arcname=biom_fp, recursive=False)
+            tgz.add(join(bdir, sample_fp), arcname=sample_fp, recursive=False)
+            tgz.add(join(bdir, prep_fp), arcname=prep_fp, recursive=False)
+
+        txt_hd.seek(0)
+        info = TarInfo(name='%s-%s-%s.txt' % (portal, study_status, ts))
+        info.size = len(txt_hd.buf)
+        tgz.addfile(tarinfo=info, fileobj=txt_hd)
+
+    with open(tgz_name, "rb") as f:
+        md5sum = md5()
+        for c in iter(lambda: f.read(4096), b""):
+            md5sum.update(c)
+
+    rename(tgz_name, tgz_name_final)
+
+    vals = [
+        ('filepath', tgz_name_final[len(working_dir):], r_client.set),
+        ('md5sum', md5sum.hexdigest(), r_client.set),
+        ('time', time, r_client.set)]
+    for k, v, f in vals:
+        redis_key = '%s:release:%s:%s' % (portal, study_status, k)
+        # important to "flush" variables to avoid errors
+        r_client.delete(redis_key)
+        f(redis_key, v)
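
The release the function above builds is a .tgz whose BIOM, sample and prep files are indexed by the tab-separated .txt member it writes last. A hedged sketch of reading one back; the filename is an assumption:

    # Sketch of consuming a release produced by the function above; the path
    # is illustrative. The first line of the .txt member is the
    # biom_fp/sample_fp/prep_fp/qiita_artifact_id/command header.
    import tarfile

    with tarfile.open('releases/QIITA-public.tgz', 'r:gz') as tgz:
        index = [m for m in tgz.getmembers() if m.name.endswith('.txt')][0]
        rows = tgz.extractfile(index).read().decode('utf-8').splitlines()
        for row in rows[1:]:  # skip the header line
            biom_fp, sample_fp, prep_fp, artifact_id, command = row.split('\t')
            print(artifact_id, biom_fp)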
