Problems configuring RepeatClassifier on docker. #36
Comments
Hi, can you provide the command you used to invoke RepeatClassifier? The first thing I want to confirm is that the configured library is being mounted into the container properly.
Another possible problem is that the automatic update script only extracts the curated families from the .h5 files. It's possible that it's just not including the families that would be useful to you, and if that's the case I can help work around that.
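One way to check the mount from inside the container is to count the FamDB partition files visible under the bound library path. This is only a minimal sketch: the helper name `count_h5` is hypothetical, and it assumes the .h5 files sit directly inside one directory (the thread mentions `/opt/RepeatMasker/Libraries/famdb` for `rmlib.config`, but whether the partitions live there too is an assumption).

```shell
# count_h5 DIR: print the number of *.h5 files directly inside DIR.
# Hypothetical helper, not part of TETools or RepeatMasker.
count_h5() {
    dir=$1
    count=0
    for f in "$dir"/*.h5; do
        # If nothing matches, the glob stays literal, so test that the file exists.
        if [ -e "$f" ]; then
            count=$((count + 1))
        fi
    done
    echo "$count"
}
```

Inside the container, something like `count_h5 /opt/RepeatMasker/Libraries/famdb` should then agree with the number of partitions that tetoolsDfamUpdate.pl reports.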
I use "singularity shell --bind /mnt/pixstor/data/magmt/:/home/magmt/data2
--bind
/mnt/pixstor/sxwf7-lab/RepeateModeler/Libraries:/opt/RepeatMasker/Libraries
~/data/singularity/tetools_latest.sif" to start the container, then delete
/opt/RepeatMasker/Libraries/famdb/rmlib.config then run ./tetoolsDfamUpdate.pl.
tetoolsDfamUpdate seems to see both partitions (it says "2 Partitions
Present"), and then I set the environment variable "export
LIBDIR=/path/to/Libraries".
I then run RepeatClassifier with "RepeatClassifier -consensi my-families.fa"
-Matt
|
I thought it might be because I'm running the container with a read-only
file system. I've tried it with "singularity shell --bind
/mnt/pixstor/data/magmt/:/home/magmt/data2 --bind
/mnt/pixstor/sxwf7-lab/RepeateModeler/Libraries:/opt/RepeatMasker/Libraries
-w ~/data/singularity/tetools_latest.sif", which should make it read/write,
but singularity complains. Is this where I am messing things up?
-Matt
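A quick way to test the read-only suspicion from inside the container is to try creating a scratch file in the library directory. The sketch below is only illustrative: `can_write` is a hypothetical helper, and the probe file name is arbitrary.

```shell
# can_write DIR: succeed if a scratch file can be created (and removed) inside DIR.
# Hypothetical helper, not part of TETools or Singularity.
can_write() {
    dir=$1
    probe="$dir/.write_probe.$$"
    # Run the redirection in a subshell so a failure cannot abort the caller.
    if ( : > "$probe" ) 2>/dev/null; then
        rm -f "$probe"
        return 0
    fi
    return 1
}
```

For example, `can_write /opt/RepeatMasker/Libraries && echo writable || echo read-only` would show whether the bound library directory can be modified in the current session.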
|
A read-only file system could certainly be causing issues.
Sorry to message you again. I was investigating the 'not writable' angle by
creating an image that loads properly with the shell -w command. I did this
by executing "singularity build --sandbox writable_tetools.sif
/home/magmt/data/singularity/tetools_latest.sif". This writable sif loads
properly with the shell -w command, and I verified that it is writable by
making files in the root directory. In this writable image, I
ran tetoolsDfamUpdate.pl as previously specified and it does update the
file RepeatMasker.lib, but I'm still getting mostly unknowns. I may try
downloading a few more dfam38 libraries, but I'm pretty sure Daphnia
(branchiopod crustaceans) belong in dfam38_full.8.h5. Any other ideas?
-Matt
On Fri, Feb 9, 2024 at 2:14 PM Anthony ***@***.***> wrote:
A read-only file system could certainly be causing issues.
tetoolsDfamUpdate.pl rewrites /opt/RepeatMasker/Libraries/RepeatMasker.lib,
so you could confirm if that's working by checking the timestamp with ls
-al.
Can you share the error message when you use the shell -w command?
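The timestamp check suggested above can also be scripted. This is a rough sketch, assuming a `find` that supports `-maxdepth` and `-mmin` (GNU and BSD find do; they are not POSIX options); `recently_modified` is a hypothetical helper name.

```shell
# recently_modified FILE [MINUTES]: succeed if FILE was modified less than
# MINUTES minutes ago (default 10). Hypothetical helper for illustration.
recently_modified() {
    file=$1
    minutes=${2:-10}
    # find prints the path only when the modification time is recent enough;
    # a missing file produces no output, so the test fails.
    [ -n "$(find "$file" -maxdepth 0 -mmin "-$minutes" 2>/dev/null)" ]
}
```

Run right after tetoolsDfamUpdate.pl, `recently_modified /opt/RepeatMasker/Libraries/RepeatMasker.lib 10 && echo rewritten` gives the same information as reading the date from `ls -al`.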
|
Thanks for confirming that the container is working. It looks like the issue is that [...]
First, I ran a few lineage queries to confirm what is in the FamDB files:
This confirms that there are 937 uncurated families in file 8, though all are from Daphnia pulicaria. To extract them, you can use the [...]
More information regarding FamDB commands can be found here. If you need to add other FamDB files, just be aware that [...]
Hopefully that produces better results for you, but let me know either way.
Sorry, you may be right that this should be closed, but I just got around
to running your fix (I'd been messing with Transposon Ultimate) and I'm not
getting what I want yet. I could still be doing something stupid.
-Matt
On Wed, Mar 13, 2024 at 3:26 PM Anthony ***@***.***> wrote:
Closed #36 as completed.
|
Sorry, I just closed it because of inactivity.
Either way, it's sounding less like an issue with TETools, and more like an
issue with RepeatClassifier.
If you like, you can send your data to us, and we can try to run it against
all of Dfam to see what we can find. I'm not super familiar with
RepeatClassifier, but you could send it to me or to the author of
RepeatClassifier:
agray at systemsbiology.org
or
rhubley at systemsbiology.org
as you prefer.
|
When classifying repeats for Daphnia pulex, a crustacean, I get >95% unknown families. I've downloaded dfam38_full.8.h5 and have dfam38_full.0.h5 in the RepeatMasker directory. I delete rmlib.config, and when I run ./tetoolsDfamUpdate.pl it seems to correctly detect the Dfam library. I set the environment variable, but when I run RepeatClassifier I still get mostly unknowns. Does RepeatClassifier use RepeatMasker's library? Is there some step I am forgetting? Thanks in advance!