Replace SILVA databases with unencumbered-licence source #13

Open
tseemann opened this issue Jul 28, 2015 · 8 comments
@tseemann
Owner

@satta has been trying to package Barrnap into Debian-Med but has reported that the SILVA alignments (23S) have a licence which is incompatible with Debian.

(It's only free for academic/non-commercial use: http://www.arb-silva.de/silva-license-information/ )

The goal would be to construct new 23S alignments from RefSeq and build our own models.

@tseemann tseemann self-assigned this Jul 28, 2015
@satta

satta commented Feb 6, 2016

FYI, I have started to work on this a bit. Please find in https://github.com/satta/barrnap/tree/build_own_hmms/build a version of barrnap with a changed build pipeline which

  • downloads 23S for {bac,arc} and 28S for euk from RefSeq (needs Biopython),
  • aligns them with MAFFT, and
  • builds HMMs from them.
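
For readers who want a feel for these three steps, here is a minimal sketch. It is not the code from the branch. The function names, file names, the Entrez query and the `retmax` default are assumptions made for illustration; only the Biopython `Entrez` calls, `mafft --auto` and `hmmbuild --rna -n` are standard tool usage. Running it needs Biopython, network access, and `mafft` and `hmmbuild` on the `PATH`.

```python
import subprocess
from pathlib import Path


def fetch_refseq_fasta(term, out_fasta, email, retmax=500):
    """Download nucleotide records matching `term` as FASTA.

    Needs Biopython and network access. The query string is up to the
    caller; nothing here is taken from the actual branch.
    """
    from Bio import Entrez  # lazy import: the rest works without Biopython

    Entrez.email = email  # NCBI asks for a contact address
    search = Entrez.read(Entrez.esearch(db="nucleotide", term=term, retmax=retmax))
    ids = search["IdList"]
    if not ids:
        return 0
    handle = Entrez.efetch(db="nucleotide", id=",".join(ids),
                           rettype="fasta", retmode="text")
    Path(out_fasta).write_text(handle.read())
    handle.close()
    return len(ids)


def mafft_command(in_fasta):
    """MAFFT writes the alignment to stdout, so the caller redirects it."""
    return ["mafft", "--auto", str(in_fasta)]


def hmmbuild_command(name, aln_fasta, out_hmm):
    """hmmbuild takes the output HMM first, then the alignment file."""
    return ["hmmbuild", "--rna", "-n", name, str(out_hmm), str(aln_fasta)]


def build_model(name, in_fasta, workdir="."):
    """Align `in_fasta` with MAFFT and build an HMM named `name` from it."""
    workdir = Path(workdir)
    aln = workdir / f"{name}.aln"
    hmm = workdir / f"{name}.hmm"
    with open(aln, "w") as out:
        subprocess.run(mafft_command(in_fasta), stdout=out, check=True)
    subprocess.run(hmmbuild_command(name, aln, hmm), check=True)
    return hmm
```

A hypothetical run for the bacterial 23S model could be `fetch_refseq_fasta("23S ribosomal RNA[Title] AND bacteria[Organism] AND refseq[filter]", "bac_23S.fasta", "you@example.org")` followed by `build_model("23S_rRNA", "bac_23S.fasta")`. The query is only a guess at a reasonable filter and would likely need tuning.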

The changed pipeline replaces the SILVA step completely and results in a new set of HMMs (committed in the branch as well), which mostly produce identical matches on the example data. Sometimes the start positions differ by about 1 bp and the score values differ slightly. Only a few hits are missed completely (2 in the fungal set). You can take a look and use the compare_results.lua script (needs gt) to compare old and new results.
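
The kind of comparison described above (identical hits, hits shifted by about 1 bp, hits missed completely) can be sketched in Python as follows. This is not a port of compare_results.lua and does not use gt. It assumes both result sets are GFF3 text and that each feature carries a `Name` attribute identifying the rRNA type; the `Hit` tuple, the function names and the default tolerance of 1 bp are made up for illustration.

```python
from collections import namedtuple

Hit = namedtuple("Hit", "seqid name strand start end")


def parse_gff3(text):
    """Parse GFF3 text into Hit records, skipping comments and blank lines."""
    hits = []
    for line in text.splitlines():
        if not line.strip() or line.startswith("#"):
            continue
        cols = line.split("\t")
        if len(cols) < 9:
            continue
        attrs = dict(kv.split("=", 1) for kv in cols[8].split(";") if "=" in kv)
        name = attrs.get("Name", cols[2])  # fall back to the feature type
        hits.append(Hit(cols[0], name, cols[6], int(cols[3]), int(cols[4])))
    return hits


def compare_hits(old, new, tolerance=1):
    """Sort old hits into identical / shifted / missed; leftover new hits are extra."""
    result = {"identical": [], "shifted": [], "missed": [], "extra": []}
    unused = list(new)
    for o in old:
        # Candidates: same sequence, rRNA type and strand, both ends within tolerance.
        candidates = [
            n for n in unused
            if (n.seqid, n.name, n.strand) == (o.seqid, o.name, o.strand)
            and abs(n.start - o.start) <= tolerance
            and abs(n.end - o.end) <= tolerance
        ]
        if not candidates:
            result["missed"].append(o)
            continue
        best = min(candidates, key=lambda n: abs(n.start - o.start) + abs(n.end - o.end))
        unused.remove(best)
        key = "identical" if (best.start, best.end) == (o.start, o.end) else "shifted"
        result[key].append((o, best))
    result["extra"] = unused
    return result
```

Calling `compare_hits(parse_gff3(old_text), parse_gff3(new_text))` returns the four lists, so the missed hits can be inspected directly. Score differences are ignored in this sketch.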

I'd be happy to get some suggestions if you can think of any improvements. Preprocessing or filtering the raw RefSeq downloads comes to mind, but I'm not enough of an expert on these RNAs to make any judgment calls there.

Thanks,
Sascha

@wwood

wwood commented Dec 4, 2016

Hi there,
I'm similarly interested, for reasons (packaging for GNU Guix). Has there been any update? Given the small differences you observed, @satta, would it make sense to offer your HMMs as an official alternative to SILVA?
Thanks!

@satta

satta commented Feb 21, 2017

Hi @wwood, sorry I missed your comment. Given my lack of practical experience as a user, I would probably need some more tests and/or confirmation by an expert that the models are an alternative to the SILVA ones. No idea how 'bad' missing results are for an end user.
Hence my request for @tseemann's comments. In Debian we currently do not ship the SILVA-derived HMMs.

@tseemann
Owner Author

@satta I think this problem may be solved soon?

https://www.arb-silva.de/silva-license-information/

Change of SILVA license model for commercial users in Fall 2018 - free of cost for any usage.

With the next full database release which is expected for Fall 2018, the SILVA project will resign the current dual licensing model and the SILVA datasets will become free also for commercial/non-academic users. With this change SILVA is following the recommendations of an ELIXIR Core Data Resource.

@satta

satta commented May 1, 2018

Good news! I guess we can then finally ship all HMMs. :)

@tseemann
Owner Author

tseemann commented May 5, 2018

Since then everyone has moved to bioconda :P

@sfehrmann

they certainly want to ensure the license transition is carried out thoroughly

Change in SILVA license model

expected for Summer 2019

@tseemann tseemann pinned this issue Oct 4, 2019
@bwlang

bwlang commented Jan 1, 2021

I think this can be closed... SILVA is now fully open.
