-
Notifications
You must be signed in to change notification settings - Fork 80
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Let's get redbiom-ing #2118
Let's get redbiom-ing #2118
Changes from 25 commits
06bc8b2
41f0fcd
574c538
4210a61
2c194fe
36c41bf
8a7e9b7
a23c360
a870b3c
acda0d0
0e72f2a
a03f1cd
07fb481
0e56fa0
2da0aaf
c2a844b
f1840f7
0ac26f1
27d9968
e2bcaf9
369628a
cc5afbb
9bd1166
28a9902
17e0780
d378518
16fc478
e41cdb4
babbede
0fc122b
d46392e
c64a019
f886d32
6e7cb7e
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -60,6 +60,7 @@ Install the non-python dependencies | |
|
||
* [PostgreSQL](http://www.postgresql.org/download/) (minimum required version 9.3.5, we have tested most extensively with 9.3.6) | ||
* [redis-server](http://redis.io) (we have tested most extensively with 2.8.17) | ||
* [webdis] (https://github.com/nicolasff/webdis) (latest version should be fine) | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Can you make a comment similar to the one in redis? i.e. Latest version should be fine but we have tested most extensively with XXXX. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. k |
||
|
||
There are several options to install these dependencies depending on your needs: | ||
|
||
|
@@ -87,6 +88,26 @@ brew update | |
brew install homebrew/versions/redis28 | ||
``` | ||
|
||
### webdis | ||
|
||
Note that this is the only package that assumes that Qiita is already installed (due to library dependencies). Also, that the general suggestion is to have 2 redis servers running, one for webdis/redbiom and the other for Qiita. The default configuration has webdis/redbiom running on the default redis port and Qiita's on 7777. | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Can you add: "The reason for multiple redis servers is so that the redbiom cache can be flushed without impacting the operation of the qiita server itself." There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. k |
||
|
||
The following instructions install, compile and pre-populates the redbiom redis DB so we assume that redis is running on the default port and that Qiita is fully installed as the redbiom package is installed with Qiita. | ||
|
||
``` | ||
git clone https://github.com/nicolasff/webdis | ||
pushd webdis | ||
make | ||
./webdis & | ||
popd | ||
wget https://raw.githubusercontent.com/wasade/redbiom/master/Makefile | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. The instructions outlined here looks like they're for setting up a test environment, but I don't believe that is the goal of this documentation. I think it should list the commands used to generate the release from Qiita, populate the redbiom DB from that file and then how to set up a job so that the cache gets updated periodically. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. yup, this is a copy of the first version with loaded the redbiom db; the new one actually loads the qiita data so changing |
||
wget https://raw.githubusercontent.com/wasade/redbiom/master/test.txt | ||
wget https://raw.githubusercontent.com/wasade/redbiom/master/test.biom | ||
wget https://raw.githubusercontent.com/wasade/redbiom/master/test_with_alts.txt | ||
wget https://raw.githubusercontent.com/wasade/redbiom/master/test_with_alts.biom | ||
make test_db | ||
``` | ||
|
||
|
||
Install Qiita development version and its python dependencies | ||
------------------------------------------------------------- | ||
|
@@ -163,6 +184,12 @@ Next, make a test environment: | |
qiita-env make --no-load-ontologies | ||
``` | ||
|
||
Finally, redbiom rely on the REDBIOM_HOST environment variable to set the URL to query. By default is set to http://127.0.0.1:7379, which is the webdis default. For example you could: | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. rely -> relies once we shift to a release, it's be super awesome to have a default There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. k, makes sense for default installations but it might be better to leave as is as the idea of these instructions is to have local installations for development. |
||
|
||
```bash | ||
export REDBIOM_HOST=http://my_host.com:7329 | ||
``` | ||
|
||
## Start Qiita | ||
|
||
Start postgres (instructions vary depending on operating system and install method). | ||
|
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,118 @@ | ||
from tornado.gen import coroutine, Task | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. This file is missing the copyright header. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. k |
||
|
||
from qiita_core.util import execute_as_transaction | ||
|
||
from .base_handlers import BaseHandler | ||
import redbiom.summarize | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Minor, but these imports are not following the PEP8 standard (and they're not flagged by pep8). The order of the imports should be:
Also, is there any problem on simply doing There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more.
In [1]: import scipy
In [2]: scipy.sparse.csr_matrix
---------------------------------------------------------------------------
AttributeError Traceback (most recent call last)
<ipython-input-2-c4d1267aee28> in <module>()
----> 1 scipy.sparse.csr_matrix
AttributeError: module 'scipy' has no attribute 'sparse'
In [3]: import scipy.sparse
In [4]: scipy.sparse.csr_matrix
Out[4]: scipy.sparse.csr.csr_matrix There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. changing blocks, thanks @wasade for your comment. |
||
import redbiom.search | ||
import redbiom._requests | ||
import redbiom.util | ||
import redbiom.fetch | ||
|
||
|
||
class RedbiomPublicSearch(BaseHandler): | ||
@execute_as_transaction | ||
def _get_context(self, callback): | ||
df = redbiom.summarize.contexts() | ||
result = df.ContextName.values | ||
callback(result) | ||
|
||
@coroutine | ||
@execute_as_transaction | ||
def get(self, search): | ||
contexts = yield Task(self._get_context) | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. just curious, how annoying was it to deal with these There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Once you get the drift, pretty easy. |
||
self.render('redbiom.html', contexts=contexts) | ||
|
||
@execute_as_transaction | ||
def _redbiom_search(self, context, query, search_on, callback): | ||
df = redbiom.summarize.contexts() | ||
contexts = df.ContextName.values | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. per gchat, we may want to filter these. And one other possibility in addition to what we discussed was is to filter on load of the cache as well. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. yup, let's chat this afternoon |
||
if context not in contexts: | ||
callback(([], 'The given context is not valid: %s - %s' % ( | ||
context, contexts))) | ||
else: | ||
if search_on == 'metadata': | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. we may want an additional There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. yup, let's chat this afternoon There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. It seems that this block of code can be simplified to avoid code duplication: if search_on in ('metadata', 'categories'):
try:
samples = redbiom.serach.metadata_full(query, categories=(search_on == 'categories'))
except ... # Perform error handling as in the "categories" case.
elif search_on == 'observations':
... There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. k |
||
samples = redbiom.search.metadata_full(query, categories=None) | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. k |
||
elif search_on == 'observations': | ||
# from_or_nargs first parameter is the file handler so it uses | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I don't understand this comment. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. in one of the previous versions I was using from_or_nargs but then replaced with current code and forgot to remove that comment, removing. |
||
# that as the query input. None basically will force to take | ||
# the values from query | ||
samples = [s.split('_', 1)[1] | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. What this is doing is converting from a redbiom ID to an ambiguous ID, and it does not seem to be enforcing that the ID is unique through a There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. ok, this is one of the other things we should discuss this afternoon. |
||
for s in redbiom.util.samples_from_observations( | ||
query.split(' '), True, context)] | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. the There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. yup, also within our discussion today. |
||
else: | ||
callback(([], 'Incorrect search by: you can use observations ' | ||
'or metadata and you passed: %s' % search_on)) | ||
|
||
if bool(samples): | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. why cast to There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I read somewhere that was the safest way to test for an empty set ... |
||
import qiita_db as qdb | ||
import qiita_db.sql_connection as qdbsc | ||
|
||
sql = """ | ||
WITH main_query AS ( | ||
SELECT study_title, study_id, artifact_id, | ||
array_agg(DISTINCT sample_id) AS samples, | ||
qiita.artifact_descendants(artifact_id) AS children | ||
FROM qiita.study_prep_template | ||
JOIN qiita.prep_template USING (prep_template_id) | ||
JOIN qiita.prep_template_sample USING (prep_template_id) | ||
JOIN qiita.study USING (study_id) | ||
WHERE sample_id IN %s | ||
GROUP BY study_title, study_id, artifact_id) | ||
SELECT study_title, study_id, samples, name, command_id, | ||
(main_query.children).artifact_id AS artifact_id | ||
FROM main_query | ||
JOIN qiita.artifact a ON (main_query.children).artifact_id = | ||
a.artifact_id | ||
JOIN qiita.artifact_type at ON ( | ||
at.artifact_type_id = a.artifact_type_id | ||
AND artifact_type = 'BIOM') | ||
ORDER BY artifact_id | ||
""" | ||
with qdbsc.TRN: | ||
qdbsc.TRN.add(sql, [tuple(samples)]) | ||
results = [] | ||
commands = {} | ||
for row in qdbsc.TRN.execute_fetchindex(): | ||
title, sid, samples, name, cid, aid = row | ||
nr = {'study_title': title, 'study_id': sid, | ||
'artifact_id': aid, 'aname': name, | ||
'samples': samples} | ||
if cid is not None: | ||
if cid not in commands: | ||
c = qdb.software.Command(cid) | ||
commands[cid] = { | ||
'sfwn': c.software.name, | ||
'sfv': c.software.version, | ||
'cmdn': c.name | ||
} | ||
nr['command'] = commands[cid]['cmdn'] | ||
nr['software'] = commands[cid]['sfwn'] | ||
nr['version'] = commands[cid]['sfv'] | ||
else: | ||
nr['command'] = None | ||
nr['software'] = None | ||
nr['version'] = None | ||
results.append(nr) | ||
callback((results, '')) | ||
else: | ||
callback(([], 'No samples where found! Try again ...')) | ||
|
||
@coroutine | ||
@execute_as_transaction | ||
def post(self, search): | ||
context = self.get_argument('context', None) | ||
search = self.get_argument('search', None) | ||
search_on = self.get_argument('search_on', None) | ||
|
||
data = [] | ||
if search is not None and search and search != ' ': | ||
if search_on in ('observations', 'metadata'): | ||
data, msg = yield Task( | ||
self._redbiom_search, context, search, search_on) | ||
else: | ||
msg = 'Not a valid option for search_on' | ||
else: | ||
msg = 'Nothing to search for ...' | ||
|
||
self.write({'status': 'success', 'message': msg, 'data': data}) |
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -117,3 +117,90 @@ function show_hide_process_list() { | |
$("#qiita-processing").hide(); | ||
} | ||
} | ||
|
||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. i may not be the best person for review of changes to this file There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. it's mainly moving code around so the libraries are available everywhere vs. just in the study listing There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Can you add documentation to the JavaScript functions? I know at the beginning we were not doing it but I started doing it based on the Emperor documentation and it is really useful (i.e. it serves the same goal as the numpy doc). There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. tried to add as many as possible There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Thanks! |
||
function send_samples_to_moi(aids, samples) { | ||
bootstrapAlert('We are adding ' + aids.length + ' artifact(s) to the analysis. This ' + | ||
'might take some time based on the number of samples on each artifact.', "warning", 10000); | ||
|
||
if (typeof samples === 'undefined') { | ||
$.get('/artifact/samples/', {ids:aids}) | ||
.done(function ( data ) { | ||
if (data['status']=='success') { | ||
moi.send('sel', data['data']); | ||
} else { | ||
bootstrapAlert('ERROR: ' + data['msg'], "danger", 10000); | ||
} | ||
}); | ||
} else { | ||
var data = {} | ||
data[aids[0]] = samples | ||
console.log(data) | ||
moi.send('sel', data); | ||
} | ||
} | ||
|
||
function redbiom_send_to_moi(aid, row) { | ||
var row_data = $('#redbiom-table').dataTable().fnGetData(row); | ||
send_samples_to_moi([aid], row_data.samples); | ||
} | ||
|
||
function sel_study(name, row) { | ||
var row_data = $('#'+name).dataTable().fnGetData(row); | ||
var aids = [] | ||
|
||
for(var i=0;i<row_data.proc_data_info.length;i++){ | ||
aids.push(row_data['proc_data_info'][i]['pid']); | ||
} | ||
send_samples_to_moi(aids); | ||
} | ||
|
||
function sel_proc_data(aid) { | ||
send_samples_to_moi([aid]); | ||
} | ||
|
||
function remove_pd_from_html(data) { | ||
pid = data.proc_data; | ||
sid = data.sid; | ||
$('#proc' + pid).remove(); | ||
$('#proc' + pid + '-samples').remove(); | ||
// remove study if all proc data removed | ||
if($('#study'+ sid + '-table tbody').children().length === 1) { $('#study'+sid).remove(); } | ||
check_empty(); | ||
} | ||
|
||
function check_empty() { | ||
if($('.row').length <= 1) { | ||
$('#dflt-sel-info').removeAttr('style'); | ||
$('.topfloat').hide(); | ||
$('#no-selected').show(); | ||
} | ||
} | ||
|
||
function remove_sample_from_html(data) { | ||
pid = data.proc_data; | ||
sample = data.samples[0]; | ||
sid = data.sid; | ||
document.getElementById(pid + '@' + sample).remove(); | ||
//decriment sample count for pid | ||
var count = $('#proc' + pid + '-sample-count'); | ||
count.text(parseInt(count.text(), 10) - 1); | ||
// remove proc data if all samples removed | ||
if($('#proc' + pid + '-samples-table tbody').children().length === 0) { $('#proc'+pid).remove(); $('#proc' + pid + '-samples').remove(); } | ||
// remove study if all proc data removed | ||
if($('#study'+ sid + '-table tbody').children().length === 1) { $('#study'+sid).remove(); } | ||
check_empty(); | ||
} | ||
|
||
function clear_from_html(data) { | ||
$.each($('.row'), function(index, value) { value.remove(); }); | ||
check_empty(); | ||
} | ||
|
||
function error(evt) { | ||
$('#ws-error').html("<b>Server communication error. Sample removal will not be recorded. Please try again later.</b>"); | ||
}; | ||
|
||
function show_alert(data) { | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I think a different name for this function will be more useful, since it is not simply showing an alert, but showing the alert regarding the number of samples selected. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. added doc and changed the name |
||
bootstrapAlert(data + ' samples selected.', "success", 10000); | ||
$('#dflt-sel-info').css('color', 'rgb(0, 160, 0)'); | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this line and the one below can be removed. Then instead of using
${qdbd}
on the other lines you can simply use `pwd`/qiita_db/. According to lines 40, 42, and 43 that should work w/o a problemThere was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
k