Let's get redbiom-ing #2118

Merged: 34 commits, merged May 11, 2017

Commits (changes from all commits)
06bc8b2  installing redbiom (antgonza, Apr 26, 2017)
41f0fcd  classifiers=classifiers ?? (antgonza, Apr 26, 2017)
574c538  one of those days! (antgonza, Apr 26, 2017)
4210a61  init commit (antgonza, Apr 29, 2017)
2c194fe  Merge branch 'master' of https://github.com/biocore/qiita into redbio… (antgonza, May 3, 2017)
36c41bf  apt redis (antgonza, May 3, 2017)
8a7e9b7  redis-server & (antgonza, May 3, 2017)
a23c360  redis-server (antgonza, May 3, 2017)
a870b3c  Merge branch 'master' of https://github.com/biocore/qiita into redbio… (antgonza, May 5, 2017)
acda0d0  adding redbiom tests (antgonza, May 5, 2017)
0e72f2a  redbiom setup.py (antgonza, May 5, 2017)
a03f1cd  redbiom #egg (antgonza, May 5, 2017)
07fb481  ready for test env (antgonza, May 5, 2017)
0e56fa0  Merge branch 'redbiom' of https://github.com/biocore/qiita into redbi… (antgonza, May 5, 2017)
2da0aaf  should fix tests (antgonza, May 5, 2017)
c2a844b  redis 7777 & (antgonza, May 5, 2017)
f1840f7  before script redis-server (antgonza, May 5, 2017)
0ac26f1  sudo: required (antgonza, May 5, 2017)
27d9968  adding more tests and REDBIOM_HOST (antgonza, May 6, 2017)
e2bcaf9  rm redbiom from conf (antgonza, May 6, 2017)
369628a  retriving artifact info via SQL (antgonza, May 7, 2017)
cc5afbb  retriving info from preps (antgonza, May 7, 2017)
9bd1166  creating redbiom context (antgonza, May 7, 2017)
28a9902  adding total of samples (antgonza, May 7, 2017)
17e0780  rm button when not logged (antgonza, May 7, 2017)
d378518  addressing @wasade initial comments (antgonza, May 9, 2017)
16fc478  new gui + search on categories (antgonza, May 10, 2017)
e41cdb4  adding a connection error message (antgonza, May 10, 2017)
babbede  request -> requests (antgonza, May 10, 2017)
0fc122b  rm query from connection error (antgonza, May 10, 2017)
d46392e  changing callback placing (antgonza, May 10, 2017)
c64a019  addressing @josenavas comments (antgonza, May 11, 2017)
f886d32  flake8 (antgonza, May 11, 2017)
6e7cb7e  rm duplicated js imports (antgonza, May 11, 2017)
19 changes: 17 additions & 2 deletions .travis.yml
@@ -8,6 +8,7 @@ env:
- TEST_ADD_STUDIES=True
before_install:
- redis-server --version
- redis-server /etc/redis/redis.conf --port 7777 &
- wget http://repo.continuum.io/miniconda/Miniconda3-latest-Linux-x86_64.sh -O miniconda.sh
- chmod +x miniconda.sh
- ./miniconda.sh -b
@@ -28,9 +29,14 @@ install:
- source activate qiita
- pip install -U pip
- pip install sphinx sphinx-bootstrap-theme coveralls 'ipython[all]==2.4.1'
- travis_retry pip install . --process-dependency-links
- 'echo "backend: Agg" > matplotlibrc'
script:
- git clone https://github.com/nicolasff/webdis
- pushd webdis
- make
- ./webdis &
- popd
- travis_retry pip install . --process-dependency-links
before_script:
- export MOI_CONFIG_FP=`pwd`/qiita_core/support_files/config_test.cfg
- if [ ${TRAVIS_PULL_REQUEST} == "false" ]; then
export QIITA_CONFIG_FP=`pwd`/qiita_core/support_files/config_test_travis.cfg;
@@ -39,6 +45,15 @@ script:
- ipython profile create qiita-general --parallel
- qiita-env start_cluster qiita-general
- qiita-env make --no-load-ontologies
# loading redbiom with Qiita's test set
# first let's make sure redis is empty
- curl -s http://127.0.0.1:7379/FLUSHALL > /dev/null
- redbiom admin create-context --name "qiita-test" --description "qiita-test context"
- redbiom admin load-sample-metadata --metadata `pwd`/qiita_db/support_files/test_data/templates/1_19700101-000000.txt
- redbiom admin load-sample-metadata-search --metadata `pwd`/qiita_db/support_files/test_data/templates/1_19700101-000000.txt
- redbiom admin load-observations --table `pwd`/qiita_db/support_files/test_data/processed_data/1_study_1001_closed_reference_otu_table.biom --context qiita-test
- redbiom admin load-sample-data --table `pwd`/qiita_db/support_files/test_data/processed_data/1_study_1001_closed_reference_otu_table.biom --context qiita-test
script:
- if [ ${TEST_ADD_STUDIES} == "True" ]; then test_data_studies/commands.sh ; fi
- if [ ${TEST_ADD_STUDIES} == "True" ]; then qiita-cron-job ; fi
- if [ ${TEST_ADD_STUDIES} == "False" ]; then qiita-test-install ; fi
29 changes: 29 additions & 0 deletions INSTALL.md
@@ -60,6 +60,7 @@ Install the non-python dependencies

* [PostgreSQL](http://www.postgresql.org/download/) (minimum required version 9.3.5, we have tested most extensively with 9.3.6)
* [redis-server](http://redis.io) (we have tested most extensively with 2.8.17)
* [webdis](https://github.com/nicolasff/webdis) (the latest version should be fine, but we have tested most extensively with 9ee6fe2 - Feb 6, 2016)

There are several options to install these dependencies depending on your needs:

@@ -87,6 +88,28 @@ brew update
brew install homebrew/versions/redis28
```

### webdis

Note that this is the only package that assumes Qiita is already installed (due to library dependencies). Also note that the general suggestion is to run two redis servers, one for webdis/redbiom and the other for Qiita, so that the redbiom cache can be flushed without impacting the operation of the Qiita server itself (a short sketch of such a flush follows the commands below).

The following instructions install and compile webdis and pre-populate the redbiom redis DB; they assume that redis is running on the default port and that Qiita is fully installed, since the redbiom package is installed with Qiita.

```
git clone https://github.com/nicolasff/webdis
pushd webdis
make
./webdis &
popd
# note that this assumes that Qiita is already installed
fp=`python -c 'import qiita_db; print qiita_db.__file__'`
qdbd=`dirname $fp`
redbiom admin create-context --name "qiita-test" --description "qiita-test context"
redbiom admin load-sample-metadata --metadata ${qdbd}/support_files/test_data/templates/1_19700101-000000.txt
redbiom admin load-sample-metadata-search --metadata ${qdbd}/support_files/test_data/templates/1_19700101-000000.txt
redbiom admin load-observations --table ${qdbd}/support_files/test_data/processed_data/1_study_1001_closed_reference_otu_table.biom --context qiita-test
redbiom admin load-sample-data --table ${qdbd}/support_files/test_data/processed_data/1_study_1001_closed_reference_otu_table.biom --context qiita-test
```
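
Since the two redis servers are separate, the redbiom cache can be cleared at any time without touching Qiita's own redis. A minimal Python sketch of such a flush, assuming webdis is listening on its default port as configured above:

```python
# Hedged sketch: flush only the redis instance that sits behind webdis/redbiom;
# Qiita's own redis (on the default port) is not affected.
import requests

WEBDIS_URL = 'http://127.0.0.1:7379'  # webdis default

# webdis maps redis commands onto URL paths, e.g. /PING or /FLUSHALL
resp = requests.get('%s/FLUSHALL' % WEBDIS_URL)
resp.raise_for_status()
print(resp.text)  # webdis answers with a small JSON acknowledgement
```
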


Install Qiita development version and its python dependencies
-------------------------------------------------------------
@@ -163,6 +186,12 @@ Next, make a test environment:
qiita-env make --no-load-ontologies
```

Finally, redbiom relies on the REDBIOM_HOST environment variable to set the URL to query. By default it is set to http://127.0.0.1:7379, which is the webdis default. For example, you could:

```bash
export REDBIOM_HOST=http://my_host.com:7329
```
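
A quick way to confirm that redbiom is talking to the host set above is to run, from a Python shell, the same metadata search the new Qiita handler performs. This is only a sketch; the query string is an arbitrary example.

```python
# Sketch: exercise redbiom against whatever REDBIOM_HOST points to.
# metadata_full is the same call used by Qiita's redbiom handler.
import os
import redbiom.search

os.environ.setdefault('REDBIOM_HOST', 'http://127.0.0.1:7379')  # webdis default

# sample identifiers whose metadata match the query
samples = list(redbiom.search.metadata_full('soil', categories=False))
print('%d samples matched' % len(samples))
```
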

## Start Qiita

Start postgres (instructions vary depending on operating system and install method).
173 changes: 173 additions & 0 deletions qiita_pet/handlers/qiita_redbiom.py
@@ -0,0 +1,173 @@
# -----------------------------------------------------------------------------
# Copyright (c) 2014--, The Qiita Development Team.
#
# Distributed under the terms of the BSD 3-clause License.
#
# The full license is in the file LICENSE, distributed with this software.
# -----------------------------------------------------------------------------

from requests import ConnectionError
import redbiom.summarize
import redbiom.search
import redbiom._requests
import redbiom.util
import redbiom.fetch
from tornado.gen import coroutine, Task

from qiita_core.util import execute_as_transaction

from .base_handlers import BaseHandler


class RedbiomPublicSearch(BaseHandler):
@execute_as_transaction
def get(self, search):
self.render('redbiom.html')

@execute_as_transaction
def _redbiom_search(self, query, search_on, callback):
error = False
message = ''
results = []

try:
df = redbiom.summarize.contexts()
except ConnectionError:
Review thread on this line:

Contributor: What if there is some other error raised? Are we allowing the 500 to be raised on the user screen or should we catch it?

Member Author: I'm not sure, I can see pros/cons on both (catch everything or just specific errors). I decided to go with specific so we can narrow down why it's failing. Note that we can see the errors in the logs if not caught. -- BTW I think we have caught all of them at this point.

Contributor: Ok, just wanted to make sure that this was the desired behavior.

error = True
message = 'Redbiom is down - contact admin, thanks!'

if not error:
contexts = df.ContextName.values
query = query.lower()
features = []

if search_on in ('metadata', 'categories'):
try:
features = redbiom.search.metadata_full(
query, categories=(search_on == 'categories'))
except TypeError:
error = True
message = (
'Not a valid search: "%s", are you sure this is a '
'valid metadata %s?' % (
query, 'value' if search_on == 'metadata' else
'category'))
except ValueError:
error = True
message = (
'Not a valid search: "%s", your query is too small '
'(too few letters), try a longer query' % query)
elif search_on == 'observations':
features = [s.split('_', 1)[1] for context in contexts
for s in redbiom.util.samples_from_observations(
query.split(' '), True, context)]
else:
error = True
message = ('Incorrect search by: you can use observations '
'or metadata and you passed: %s' % search_on)

if not error:
import qiita_db as qdb
import qiita_db.sql_connection as qdbsc
if features:
if search_on in ('metadata', 'observations'):
sql = """
WITH main_query AS (
SELECT study_title, study_id, artifact_id,
array_agg(DISTINCT sample_id) AS samples,
qiita.artifact_descendants(artifact_id) AS
children
FROM qiita.study_prep_template
JOIN qiita.prep_template USING (prep_template_id)
JOIN qiita.prep_template_sample USING
(prep_template_id)
JOIN qiita.study USING (study_id)
WHERE sample_id IN %s
GROUP BY study_title, study_id, artifact_id)
SELECT study_title, study_id, samples,
name, command_id,
(main_query.children).artifact_id AS artifact_id
FROM main_query
JOIN qiita.artifact a ON
(main_query.children).artifact_id = a.artifact_id
JOIN qiita.artifact_type at ON (
at.artifact_type_id = a.artifact_type_id
AND artifact_type = 'BIOM')
ORDER BY artifact_id
"""
with qdbsc.TRN:
qdbsc.TRN.add(sql, [tuple(features)])
results = []
commands = {}
for row in qdbsc.TRN.execute_fetchindex():
title, sid, samples, name, cid, aid = row
nr = {'study_title': title, 'study_id': sid,
'artifact_id': aid, 'aname': name,
'samples': samples}
if cid is not None:
if cid not in commands:
c = qdb.software.Command(cid)
commands[cid] = {
'sfwn': c.software.name,
'sfv': c.software.version,
'cmdn': c.name
}
nr['command'] = commands[cid]['cmdn']
nr['software'] = commands[cid]['sfwn']
nr['version'] = commands[cid]['sfv']
else:
nr['command'] = None
nr['software'] = None
nr['version'] = None
results.append(nr)
else:
sql = """
WITH get_studies AS (
SELECT
trim(table_name, 'sample_')::int AS
study_id,
array_agg(column_name::text) AS columns
FROM information_schema.columns
WHERE column_name IN %s
AND table_name LIKE 'sample_%%'
AND table_name NOT IN (
'prep_template',
'prep_template_sample')
GROUP BY table_name)
SELECT study_title, get_studies.study_id, columns
FROM get_studies
JOIN qiita.study ON get_studies.study_id =
qiita.study.study_id"""
with qdbsc.TRN:
results = []
qdbsc.TRN.add(sql, [tuple(features)])
for row in qdbsc.TRN.execute_fetchindex():
title, sid, cols = row
nr = {'study_title': title, 'study_id': sid,
'artifact_id': None, 'aname': None,
'samples': cols,
'command': ', '.join(cols),
'software': None, 'version': None}
results.append(nr)
else:
error = True
message = 'No samples were found! Try again ...'
callback((results, message))

@coroutine
@execute_as_transaction
def post(self, search):
search = self.get_argument('search', None)
search_on = self.get_argument('search_on', None)

data = []
if search is not None and search and search != ' ':
if search_on in ('observations', 'metadata', 'categories'):
data, msg = yield Task(
self._redbiom_search, search, search_on)
else:
msg = 'Not a valid option for search_on'
else:
msg = 'Nothing to search for ...'

self.write({'status': 'success', 'message': msg, 'data': data})
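
For reference, the handler above expects only two form fields, `search` and `search_on`, and replies with a JSON document containing `status`, `message` and `data`. A hypothetical client call is sketched below; the base URL and route are illustrative guesses, since the URL mapping is not part of this file, while the form fields and response keys come straight from `post()` and `_redbiom_search()`.

```python
# Hypothetical client for the handler above; base URL and route are assumptions.
import requests

resp = requests.post('http://localhost:8080/redbiom/',
                     data={'search': 'soil', 'search_on': 'metadata'})
payload = resp.json()
print(payload['status'], payload['message'])
for hit in payload['data']:
    # keys as built in _redbiom_search above
    print(hit['study_id'], hit['study_title'], hit['artifact_id'], hit['command'])
```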