Gene suggestions for 'Clock' yields Per2 #45

Open
pkerpedjiev opened this issue May 19, 2017 · 2 comments
@pkerpedjiev

(hg-server) ===========================================================================================================
[peter@mbi-cw-l10381 higlass-server] [master]$ python
Python 2.7.11 (default, Nov 17 2016, 17:37:33)
[GCC 4.2.1 Compatible Apple LLVM 8.0.0 (clang-800.0.42.1)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> import tilesets.suggestions as ts
>>> ts.get_gene_suggestions('data/gene-annotations-mm9.db', 'Clock')
[{'chr': u'chr1', 'score': 394.0, 'geneName': u'Per2', 'txEnd': 93355905, 'txStart': 93312558}, {'chr': u'chr5', 'score': 308.0, 'geneName': u'Clock', 'txEnd': 76733817, 'txStart': 76638892}, {'chr': u'chr11', 'score': 283.0, 'geneName': u'Per1', 'txEnd': 68923459, 'txStart': 68912457}, {'chr': u'chr4', 'score': 69.0, 'geneName': u'Per3', 'txEnd': 150418774, 'txStart': 150377763}, {'chr': u'chr10', 'score': 41.0, 'geneName': u'Timeless', 'txEnd': 127689997, 'txStart': 127669118}, {'chr': u'chr12', 'score': 19.0, 'geneName': u'Cipc', 'txEnd': 88306316, 'txStart': 88287990}]
@mccalluc

Nils reported a similar issue when searching for "pam". Searching for "perm" or "tan" causes the same kind of problem. See higlass/higlass#87.

@pkerpedjiev

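This is because genes are ranked by importance. As long as the query text appears in the gene name or its description, the results are ordered by the importance of the gene, not by how closely the name matches the query. That is why Per2 is listed ahead of Clock in the output above.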

The fix would be to give extra weight to search terms which match the gene name exactly.
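
A minimal sketch of that idea, not the actual higlass-server code: it post-processes a result list shaped like the output above and moves exact gene-name matches ahead of everything else, keeping the score ordering within each group. The function name `rerank_suggestions` is hypothetical, and the case-insensitive comparison is an assumption.

```python
def rerank_suggestions(suggestions, query):
    """Return suggestions with exact gene-name matches first.

    Hypothetical helper: within each group (exact match / everything else)
    the original descending score order is preserved.
    """
    q = query.strip().lower()

    def sort_key(suggestion):
        # Assumption: matching ignores case, so 'clock' still finds 'Clock'.
        is_exact = suggestion['geneName'].lower() == q
        # False sorts before True, so negate: exact matches get 0, others 1.
        return (0 if is_exact else 1, -suggestion['score'])

    return sorted(suggestions, key=sort_key)


# Entries copied from the output in the issue description.
sample = [
    {'chr': 'chr1', 'score': 394.0, 'geneName': 'Per2',
     'txStart': 93312558, 'txEnd': 93355905},
    {'chr': 'chr5', 'score': 308.0, 'geneName': 'Clock',
     'txStart': 76638892, 'txEnd': 76733817},
    {'chr': 'chr11', 'score': 283.0, 'geneName': 'Per1',
     'txStart': 68912457, 'txEnd': 68923459},
]

reranked = rerank_suggestions(sample, 'Clock')
print([s['geneName'] for s in reranked])
```

A strict two-tier sort like this always puts the exact match on top. Adding a fixed bonus to the score instead would be a softer variant, but the bonus size would need tuning against real importance scores.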
