Analysis refactor sync with master - DEPENDS ON #2138 (#2139)
* fix #1505

* improving some GUI stuff

* improving some GUI stuff - missing lines

* addressing all comments

* ready for review

* fix #1987

* initial commit

* requested changes

* fix filter job list

* Fixing server cert (#2051)

* fix get_studies

* flake8

* fix #503

* fix #2010

* fix #1913

* fix errors

* addressing @josenavas comment

* flake8

* fix #1010

* fix #1066 (#2058)

* addressing @josenavas comments

* fix #1961

* fix #1837

* Automatic jobs & new stats (#2057)

* fix #814, fix #1636

* fixing error in test-env

* fixing stats.html call

* adding img

* addressing @josenavas comments

* rm for loops

* addressing @ElDeveloper comments

* generalizing this functionality

* fix #1805

* adding button

* fix errors

* fix #1816

* fixing failing tests

* fix #1959

* addressing @josenavas comments

* addressing @josenavas comments

* fixing error

* fixed?

* addressing @josenavas comments

* addressing @wasade comments

* fix flake8

* generate biom and metadata release (#2066)

* initial commit

* adding portal

* addressing @josenavas comments

* pid -> qiita_artifact_id

* addressing @josenavas comments

* addressing @ElDeveloper comments

* rm 50.sql

* database changes to fix 969

* adding delete

* addressing @josenavas comments

* addressing @ElDeveloper comments

* duh!

* fix generate_biom_and_metadata_release (#2072)

* fix generate_biom_and_metadata_release

* addressing @ElDeveloper comment

* Removing qiita ware code that will not be used anymore

* Organizing the handlers and new analysis description page

* fixing timestamp

* rm formats

* st -> pt

* Connecting the analysis creation and making interface responsive

* Addressing @antgonza's comments

* Initial artifact GUI refactor

* Removing unused code

* moving to ISO 8601 - wow :'(

* fix errors

* addressing @wasade comments

* Adding can_edit call to the analysis

* Fixing artifact rest API since not all artifacts have study

* Adding can_be_publicized call to analysis

* Adding QiitaHTTPError to handle errors gracefully

* Adding safe_execution contextmanager

* Fixing typo

* Adding qiita test checker

* Adapting some artifact handlers

* Abstracting the graph reloading and adding some documentation

* Fixing typo

* Fixing changing artifact visibility

* Fixing delete

* Fixing artifact deletion

* Adding default parameters to the commands

* Fixing processing page

* Fixing variable name

* fixing private/public studies

* Changing bdiv metrics to single choice

* sanbox-to-sandbox

* flake8

* Fixing patch

* fixing other issues

* adding share documentation

* psycopg2 <= 2.7

* psycopg2 < 2.7

* Various small fixes to be able to run tests on the plugins

* Adding private module

* Fixing processing job completion

* Fixing patch 52

* Fixing call

* Fixing complete

* small fixes

* init commit

* fixing errors

* fixing errors due to update

* Making the download work

* Fixing tests

* working status

* adding tags, the right way!

* fix error

* Addressing @antgonza's comments

* Adding missing test

* Ignoring tgz - thanks @antgonza

* addressing @josenavas comments

* list study tags

* fix error

* adding tags to public

* adding docs

* addressing @wasade comment

* addressing @josenavas and @wasade comments

* addressing @wasade request

* fix #2091

* option 2: @ElDeveloper and @josenavas

* A minimal REST API for Qiita (#2094)

* TST: Add initial test cases for study handler

* ENH: Add initial study rest api

* API: test if a study exists

* ENH: oauth2 forced

* Get back basic study deets

* TST: test for samples collection

* API: rest get sample IDs from a study

* ENH: samples/info handler

* broken routes

* API: request sample metadata

* ENH/API: Add methods to check for a study person

* ENH/API: Add POST methods for study person

* TST: Add tests for from_name_and_affiliation

* TST: study creation

* BUG: Add headers to tests

* ENH: create study

* Adjust GET on study description

* API: Add endpoints for preparation creation

* TST: 200 :D

* TST: Correctly verify study instantiation

* TST: prep artifact creation

* ENH/API: associate artifacts with a preparation

* TST: test study status

* ENH: study status

* Removed trailing whitespace

* STY: PEP8

* MAINT: refactor, centralize setup boilerplate

* REFACTOR: Remove repeated code

* DOC: Remove unnecessary comments

* REFACTOR: Missing removal of pattern

* STY: Fix PEP8 errors

* BUG: Incorrectly changed error code

* BUG/TST: Fix typo in tests

* Addressing an @antgonza comment

* Another @antgonza comment

* RVW: Address review comments

* ENH: Cleanup webserver and name-spaces

* ENH: Improve error messages

* ENH: Add more descriptive error message

* TST: Exercise different argument types

* DOC: Add documentation for REST API

* ENH: Remove extra comma

* ENH/DOC: update/add samples to sample information via rest (#2097)

* Changing how artifact visibility works (#2098)

* changing how artifact visibility works

* fixing code

* fix errors

* fixing edit check access

* fix

* fix #2086

* flake8

* addressing @ElDeveloper comments + fixes

* adding the final changes

* fix failures

* get_qiita_version -> generate_biom_and_metadata_release

* download raw data

* adding missing empty files

* Adding endpoint to retrieve list of person (#2103)

* Adding missing endpoint

* Addressing @ElDeveloper comment

* fix #2086 (#2102)

* fix #2086

* flake8

* addressing @ElDeveloper comments + fixes

* adding the final changes

* fix failures

* get_qiita_version -> generate_biom_and_metadata_release

* addressing @wasade comments and fix errors

* fix error?

* rm vfabu + addressing @wasade and @josenavas comments + fix errors

* just being dumb!

* Deblur quality mention (#2107)

* Revised rst values used for section headers (#2108)

* Adding processing handlers

* Fixing latlongs (#2120)

* public studies are being shown in the user's own studies

* fix #2069 - adding tests

* flake8

* Fixing url and bug on processing job workflow

* Adding the private script runner

* Adding is_analysis column to the command

* Adding retrieval of commands excluding analysis commands

* Addressing bug on retrieving information from redis

* Enabling the command register endpoint to provide if the command is analysis only

* Improving study list speed (#2123)

* init commit

* reading if ...

* fixing tests

* rethinking listing

* split SQL

* resetting sql

* ignoring prep_total_samples

* finishing changes

* adding comment: @ElDeveloper

* adding message: @josenavas

* addressing @ElDeveloper, @josenavas @jdereus comments

* fixing download buttons show (#2127)

* fixing download buttons show

* addressing @ElDeveloper comment

* connecting tutorials to CMI

* adding link in main page

* fixing prep getting-started

* mv complex designs

* fix idents

* Addressing @antgonza's comments

* Addressing @wasade's comments

* Supporting multiple choice

* Adding documentation

* limiting number of jobs retrieved

* Modifying handler to pass allow_change_optionals

* returning optional parameters

* Addressing bug found by @antgonza

* Enabling changing the default parameters

* Adding correct class

* Allowing user to change default parameters

* Fixing bug with commands listing

* Enabling arbitrary htmls in the summary

* Prepping for merge hell

* Addressing @wasade's comments

* Addressing @antgonza's comment
josenavas authored and antgonza committed Jun 1, 2017
1 parent 2a1129a commit f5cff17
Showing 79 changed files with 3,867 additions and 1,722 deletions.
18 changes: 17 additions & 1 deletion qiita_core/tests/test_util.py
@@ -12,7 +12,8 @@
 from qiita_core.util import (
     send_email, qiita_test_checker, execute_as_transaction, get_qiita_version,
-    is_test_environment)
+    is_test_environment, get_release_info)
+from qiita_db.meta_util import generate_biom_and_metadata_release
 import qiita_db as qdb

@@ -64,6 +65,21 @@ def test_get_qiita_version(self):
         # testing just the version
         self.assertEqual(exp_version, qdb.__version__)
 
+    def test_get_release_info(self):
+        # making sure there is a release
+        generate_biom_and_metadata_release('private')
+        # just checking that it is not empty because the MD5 will change on
+        # every run
+        md5sum, filepath, timestamp = get_release_info('private')
+        self.assertNotEqual(md5sum, '')
+        self.assertNotEqual(filepath, '')
+        self.assertNotEqual(timestamp, '')
+
+        md5sum, filepath, timestamp = get_release_info('public')
+        self.assertEqual(md5sum, '')
+        self.assertEqual(filepath, '')
+        self.assertEqual(timestamp, '')
+
 
 if __name__ == '__main__':
     main()
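
Note that test_get_release_info depends on generate_biom_and_metadata_release having populated Redis first, which is why the test calls it before asserting. A hedged sketch of running just this module outside the full suite, assuming a configured Qiita test environment (database plus the Redis instance behind moi's r_client):

    # Sketch only: run qiita_core/tests/test_util.py on its own; assumes the
    # Qiita test environment (database and Redis) is already set up and the
    # package is importable from the current directory.
    import unittest

    suite = unittest.TestLoader().discover(
        'qiita_core/tests', pattern='test_util.py')
    unittest.TextTestRunner(verbosity=2).run(suite)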
30 changes: 30 additions & 0 deletions qiita_core/util.py
@@ -11,6 +11,7 @@
 from os.path import dirname
 from git import Repo
 from git.exc import InvalidGitRepositoryError
+from moi import r_client
 
 from qiita_core.qiita_settings import qiita_config
 from qiita_pet import __version__ as qiita_pet_lib_version
@@ -141,3 +142,32 @@ def get_qiita_version():
         sha = ''
 
     return (qiita_pet_lib_version, sha)
+
+
+def get_release_info(study_status='public'):
+    """Returns the MD5, filepath and timestamp of the study-status release
+
+    Parameters
+    ----------
+    study_status : str, optional
+        The study status to search for. Note that this should always be set
+        to 'public' but having this exposed helps with testing. The other
+        options are 'private' and 'sandbox'
+
+    Returns
+    -------
+    str, str, str
+        The release MD5, filepath and timestamp
+    """
+    portal = qiita_config.portal
+    md5sum = r_client.get('%s:release:%s:md5sum' % (portal, study_status))
+    filepath = r_client.get('%s:release:%s:filepath' % (portal, study_status))
+    timestamp = r_client.get('%s:release:%s:time' % (portal, study_status))
+    if md5sum is None:
+        md5sum = ''
+    if filepath is None:
+        filepath = ''
+    if timestamp is None:
+        timestamp = ''
+
+    return md5sum, filepath, timestamp
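
get_release_info only reads back what generate_biom_and_metadata_release (added to qiita_db/meta_util.py further down) stores in Redis under <portal>:release:<status>:<key>. A minimal sketch of that round trip, assuming moi's r_client is configured; the stored values are made up:

    # Sketch only: exercise the key scheme from the diff above with dummy
    # values, then read them back through get_release_info.
    from moi import r_client
    from qiita_core.qiita_settings import qiita_config
    from qiita_core.util import get_release_info

    portal, status = qiita_config.portal, 'public'
    for k, v in [('md5sum', 'abc123'),
                 ('filepath', 'releases/QIITA-public.tgz'),
                 ('time', '06-01-17 12:00:00')]:
        r_client.set('%s:release:%s:%s' % (portal, status, k), v)

    # returns ('abc123', 'releases/QIITA-public.tgz', '06-01-17 12:00:00');
    # each value falls back to '' when the corresponding key is missing
    md5sum, filepath, timestamp = get_release_info('public')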
30 changes: 17 additions & 13 deletions qiita_db/artifact.py
@@ -274,9 +274,11 @@ def create(cls, filepaths, artifact_type, name=None, prep_template=None,
 
         Notes
         -----
-        The visibility of the artifact is set by default to `sandbox`
-        The timestamp of the artifact is set by default to `datetime.now()`
-        The value of `submitted_to_vamps` is set by default to `False`
+        The visibility of the artifact is set by default to `sandbox` if
+        prep_template is passed but if parents is passed we will inherit the
+        most closed visibility.
+        The timestamp of the artifact is set by default to `datetime.now()`.
+        The value of `submitted_to_vamps` is set by default to `False`.
         """
         # We need at least one file
         if not filepaths:
@@ -689,19 +691,21 @@ def visibility(self, value):
             only applies when the new visibility is more open than before.
         """
         with qdb.sql_connection.TRN:
+            # In order to correctly propagate the visibility we need to find
+            # the root of this artifact and then propagate to all the artifacts
+            sql = "SELECT * FROM qiita.find_artifact_roots(%s)"
+            qdb.sql_connection.TRN.add(sql, [self.id])
+            root_id = qdb.sql_connection.TRN.execute_fetchlast()
+            root = qdb.artifact.Artifact(root_id)
+            # these are the ids of all the children from the root
+            ids = [a.id for a in root.descendants.nodes()]
+
             sql = """UPDATE qiita.artifact
                      SET visibility_id = %s
-                     WHERE artifact_id = %s"""
-            qdb.sql_connection.TRN.add(
-                sql, [qdb.util.convert_to_id(value, "visibility"), self.id])
+                     WHERE artifact_id IN %s"""
+            vis_id = qdb.util.convert_to_id(value, "visibility")
+            qdb.sql_connection.TRN.add(sql, [vis_id, tuple(ids)])
             qdb.sql_connection.TRN.execute()
-            # In order to correctly propagate the visibility upstream, we need
-            # to go one step at a time. By setting up the visibility of our
-            # parents first, we accomplish that, since they will propagate
-            # the changes to its parents
-            for p in self.parents:
-                visibilites = [[d.visibility] for d in p.descendants.nodes()]
-                p.visibility = qdb.util.infer_status(visibilites)
 
     @property
     def artifact_type(self):
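
The visibility setter now issues a single UPDATE over every artifact id hanging off the root, instead of walking parents one step at a time. The WHERE artifact_id IN %s form works because psycopg2 adapts a Python tuple parameter into a parenthesized list. A standalone sketch of that adaptation (not Qiita code; the database name and ids are made up):

    # Illustration of psycopg2 tuple adaptation, which the new single UPDATE
    # relies on; connection parameters and ids are hypothetical.
    import psycopg2

    conn = psycopg2.connect(dbname='qiita_test')
    with conn.cursor() as cur:
        vis_id, ids = 4, (1, 2, 3)
        # mogrify shows the rendered SQL: ... WHERE artifact_id IN (1, 2, 3)
        print(cur.mogrify("UPDATE qiita.artifact SET visibility_id = %s"
                          " WHERE artifact_id IN %s", (vis_id, ids)))
        cur.execute("UPDATE qiita.artifact SET visibility_id = %s"
                    " WHERE artifact_id IN %s", (vis_id, ids))
    conn.commit()

Note an empty tuple would render as IN () and raise a SQL syntax error, so the caller must guarantee at least one id; here the list always contains the root's descendants.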
126 changes: 117 additions & 9 deletions qiita_db/meta_util.py
@@ -25,7 +25,8 @@
 from __future__ import division
 
 from moi import r_client
-from os import stat
+from os import stat, makedirs, rename
+from os.path import join, relpath, exists
 from time import strftime, localtime
 import matplotlib.pyplot as plt
 import matplotlib as mpl
@@ -34,8 +35,11 @@
 from StringIO import StringIO
 from future.utils import viewitems
 from datetime import datetime
+from tarfile import open as topen, TarInfo
+from hashlib import md5
 
 from qiita_core.qiita_settings import qiita_config
+from qiita_core.configuration_manager import ConfigurationManager
 import qiita_db as qdb


@@ -126,14 +130,21 @@ def validate_filepath_access_by_user(user, filepath_id):
                 # the prep access is given by it's artifacts, if the user has
                 # access to any artifact, it should have access to the prep
                 # [0] cause we should only have 1
-                a = qdb.metadata_template.prep_template.PrepTemplate(
-                    pid[0]).artifact
-                if (a.visibility == 'public' or a.study.has_access(user)):
-                    return True
+                pt = qdb.metadata_template.prep_template.PrepTemplate(
+                    pid[0])
+                a = pt.artifact
+                # however, the prep info file may not have any artifacts
+                # attached; in that case we will use the study access level
+                if a is None:
+                    return qdb.study.Study(pt.study_id).has_access(user)
                 else:
-                    for c in a.descendants.nodes():
-                        if (c.visibility == 'public' or c.study.has_access(user)):
-                            return True
+                    if (a.visibility == 'public' or a.study.has_access(user)):
+                        return True
+                    else:
+                        for c in a.descendants.nodes():
+                            if ((c.visibility == 'public' or
+                                    c.study.has_access(user))):
+                                return True
                 return False
             # analyses
             elif anid:
@@ -305,7 +316,8 @@ def get_lat_longs():
                     WHERE table_name SIMILAR TO 'sample_[0-9]+'
                         AND table_schema = 'qiita'
                         AND column_name IN ('latitude', 'longitude')
-                        AND SPLIT_PART(table_name, '_', 2)::int IN %s;"""
+                        AND SPLIT_PART(table_name, '_', 2)::int IN %s
+                    GROUP BY table_name HAVING COUNT(column_name) = 2;"""
         qdb.sql_connection.TRN.add(sql, [tuple(portal_table_ids)])
 
         sql = [('SELECT CAST(latitude AS FLOAT), '
@@ -319,3 +331,99 @@
         qdb.sql_connection.TRN.add(sql)
 
     return qdb.sql_connection.TRN.execute_fetchindex()
+
+
+def generate_biom_and_metadata_release(study_status='public'):
+    """Generate a list of biom/metadata filepaths and a tgz of those files
+
+    Parameters
+    ----------
+    study_status : str, optional
+        The study status to search for. Note that this should always be set
+        to 'public' but having this exposed helps with testing. The other
+        options are 'private' and 'sandbox'
+    """
+    studies = qdb.study.Study.get_by_status(study_status)
+    qiita_config = ConfigurationManager()
+    working_dir = qiita_config.working_dir
+    portal = qiita_config.portal
+    bdir = qdb.util.get_db_files_base_dir()
+    time = datetime.now().strftime('%m-%d-%y %H:%M:%S')
+
+    data = []
+    for s in studies:
+        # [0] latest is first, [1] only getting the filepath
+        sample_fp = relpath(s.sample_template.get_filepaths()[0][1], bdir)
+
+        for a in s.artifacts(artifact_type='BIOM'):
+            if a.processing_parameters is None:
+                continue
+
+            cmd_name = a.processing_parameters.command.name
+
+            # this loop is necessary as in theory an artifact can be
+            # generated from multiple prep info files
+            human_cmd = []
+            for p in a.parents:
+                pp = p.processing_parameters
+                pp_cmd_name = pp.command.name
+                if pp_cmd_name == 'Trimming':
+                    human_cmd.append('%s @ %s' % (
+                        cmd_name, str(pp.values['length'])))
+                else:
+                    human_cmd.append('%s, %s' % (cmd_name, pp_cmd_name))
+            human_cmd = ', '.join(human_cmd)
+
+            for _, fp, fp_type in a.filepaths:
+                if fp_type != 'biom' or 'only-16s' in fp:
+                    continue
+                fp = relpath(fp, bdir)
+                # format: (biom_fp, sample_fp, prep_fp, qiita_artifact_id,
+                #          human readable name)
+                for pt in a.prep_templates:
+                    for _, prep_fp in pt.get_filepaths():
+                        if 'qiime' not in prep_fp:
+                            break
+                    prep_fp = relpath(prep_fp, bdir)
+                    data.append((fp, sample_fp, prep_fp, a.id, human_cmd))
+
+    # writing text and tgz file
+    ts = datetime.now().strftime('%m%d%y-%H%M%S')
+    tgz_dir = join(working_dir, 'releases')
+    if not exists(tgz_dir):
+        makedirs(tgz_dir)
+    tgz_name = join(tgz_dir, '%s-%s-building.tgz' % (portal, study_status))
+    tgz_name_final = join(tgz_dir, '%s-%s.tgz' % (portal, study_status))
+    txt_hd = StringIO()
+    with topen(tgz_name, "w|gz") as tgz:
+        # writing header for txt
+        txt_hd.write(
+            "biom_fp\tsample_fp\tprep_fp\tqiita_artifact_id\tcommand\n")
+        for biom_fp, sample_fp, prep_fp, artifact_id, human_cmd in data:
+            txt_hd.write("%s\t%s\t%s\t%s\t%s\n" % (
+                biom_fp, sample_fp, prep_fp, artifact_id, human_cmd))
+            tgz.add(join(bdir, biom_fp), arcname=biom_fp, recursive=False)
+            tgz.add(join(bdir, sample_fp), arcname=sample_fp, recursive=False)
+            tgz.add(join(bdir, prep_fp), arcname=prep_fp, recursive=False)
+
+        txt_hd.seek(0)
+        info = TarInfo(name='%s-%s-%s.txt' % (portal, study_status, ts))
+        info.size = len(txt_hd.buf)
+        tgz.addfile(tarinfo=info, fileobj=txt_hd)
+
+    with open(tgz_name, "rb") as f:
+        md5sum = md5()
+        for c in iter(lambda: f.read(4096), b""):
+            md5sum.update(c)
+
+    rename(tgz_name, tgz_name_final)
+
+    vals = [
+        ('filepath', tgz_name_final[len(working_dir):], r_client.set),
+        ('md5sum', md5sum.hexdigest(), r_client.set),
+        ('time', time, r_client.set)]
+    for k, v, f in vals:
+        redis_key = '%s:release:%s:%s' % (portal, study_status, k)
+        # important to "flush" variables to avoid errors
+        r_client.delete(redis_key)
+        f(redis_key, v)
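
The release the function above builds is a .tgz whose BIOM, sample and prep files are indexed by the tab-separated .txt member it writes last. A hedged sketch of reading one back; the filename is an assumption:

    # Sketch of consuming a release produced by the function above; the path
    # is illustrative. The first line of the .txt member is the
    # biom_fp/sample_fp/prep_fp/qiita_artifact_id/command header.
    import tarfile

    with tarfile.open('releases/QIITA-public.tgz', 'r:gz') as tgz:
        index = [m for m in tgz.getmembers() if m.name.endswith('.txt')][0]
        rows = tgz.extractfile(index).read().decode('utf-8').splitlines()
        for row in rows[1:]:  # skip the header line
            biom_fp, sample_fp, prep_fp, artifact_id, command = row.split('\t')
            print(artifact_id, biom_fp)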
