"Exception: Not valid subsetter: 1" while using epic2-df #32
Comments
Is this reproducible with just the head? Will look at it on Monday :) Thanks for bothering to report :) |
I tried it with the head and again with head -n100000. Looks like it works for those files... Hmm, so maybe there are some wonky lines in the files? How do you think we can pinpoint the problem? |
The error seems to be in my pyranges library. The error message says that the chromosome is an int, but it should always be a string. Dunno why it happens, but I am trying to fix it :) Can you check your version of pyranges with
|
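For readers following along, one way to report an installed package version without importing the package is sketched below. This is an illustration, not necessarily the command from the comment above. The helper name `installed_version` is hypothetical, and `importlib.metadata` assumes Python 3.8 or newer.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(package):
    """Return the installed version string of `package`, or None if it is not installed."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


# Prints the version string if pyranges is installed, otherwise None.
print(installed_version("pyranges"))
```

On older interpreters, importing the package and reading its `__version__` attribute gives the same information.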
Ahaaaa. I think I might know why... I used bowtie2 to map my fastq and then converted them to bedpe using bedtools. The scaffold names are "1, 2, 3...X, Y, MT", instead of "chr1, chr2, chr3...chrX, chrY, chrM". Indeed I had to use a custom chrom.sizes file that lists the scaffolds as 1,2,3. Do you think this could be the cause? |
The error is in epic2-df after it has successfully run epic on both KO and WT. So the error happens when it works on the result of those epic2 runs. |
No, but I wondered why you used a custom genome sizes file for hg19. When I realized why you did it I added a warning message to epic2 when the chromosome size names and chromosome names in the read file are incompatible. |
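A rough sketch of the kind of compatibility check described above: compare the chromosome names in the chromosome sizes file with those in the read file and warn on a mismatch. This is not epic2's actual implementation; the function names and the sample lines are hypothetical, and both inputs are assumed to be tab-separated with the chromosome name in the first column.

```python
def chromosome_names(lines, column=0):
    """Collect the distinct names found in one tab-separated column."""
    names = set()
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        names.add(line.split("\t")[column])
    return names


def naming_mismatch(chromsizes_lines, read_lines):
    """Return read-file chromosome names that are absent from the chromsizes file."""
    known = chromosome_names(chromsizes_lines)
    seen = chromosome_names(read_lines)
    return sorted(seen - known)


# Illustrative data: UCSC-style names in chromsizes, plain names in the reads.
chromsizes = ["chr1\t249250621", "chrX\t155270560"]
reads = ["1\t100\t300\t1\t500\t700", "X\t10\t210\tX\t400\t600"]

missing = naming_mismatch(chromsizes, reads)
if missing:
    print("Warning: chromosomes not found in chromsizes:", missing)
```

With a chromsizes file that lists the scaffolds as 1, 2, 3 (as in the custom file mentioned above), the same check returns an empty list and no warning is printed.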
That is okay, I am hoping the error is due to your pyranges being old :) |
Looks like it's version 0.0.53
Indeed, the individual outputs work well and I get two files in the output folder. So I agree with your assessment. |
That is the latest version. Do you have the opportunity to send the zipped
dataset to me via dropbox or google drive? I will treat it as confidential.
Then debugging would be easy :)
…On Fri, Sep 6, 2019 at 4:37 PM wescaiju ***@***.***> wrote:
Looks like it's version 0.0.53
`(/gpfs/ysm/project/wc376/conda_envs/for_epic2) ***@***.*** ~]$ python
Python 3.6.7 | packaged by conda-forge | (default, Jul 2 2019, 02:18:42)
[GCC 7.3.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import pyranges as pr
>>> pr.__version__
'0.0.53'`
|
Yes, I can send you a google drive link. Which email should I use? |
endrebak85 # gmail.com. Thanks!
|
I have sent you an invite via google drive! Thanks for your help. |
I have downloaded the files and am running the analysis now. I have some potential fixes that I will attempt tomorrow :) |
I was able to reproduce the error. Hooray! Will continue tomorrow. Thanks for sharing a reproducible example :) |
(Did not mean to close) |
(Notes to self) The error seems to be due to the following: When pandas reads a table it guesses the types of the columns. For our files it guesses that the chromosome is of type int since it starts with
So you end up with the following different chromosomes:
So initially, it uses an int for lookup. I have fixed this in epic2 now; I will also need to find a fix that works for PyRanges in general. Try it again, and feel free to reopen if this did not fix it for you :) |
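The type-guessing behaviour described in the notes above can be reproduced with a small sketch. This is not the fix that went into epic2; the column names, the tiny in-memory table, and the `chromosome_sizes` dictionary are assumptions made for illustration. It shows that pandas infers an integer column when the chromosome names look numeric, that an integer key then fails a lookup against string-keyed data, and that passing an explicit `dtype` keeps the names as strings.

```python
import io

import pandas as pd

# A tiny tab-separated table whose chromosome names are all numeric.
bed_text = "1\t100\t200\n2\t300\t400\n"
columns = ["Chromosome", "Start", "End"]

# Default behaviour: pandas guesses the column types from the values.
inferred = pd.read_csv(io.StringIO(bed_text), sep="\t", header=None, names=columns)

# Explicit dtype: the chromosome column is read as strings.
explicit = pd.read_csv(
    io.StringIO(bed_text),
    sep="\t",
    header=None,
    names=columns,
    dtype={"Chromosome": str},
)

first_inferred = inferred["Chromosome"].iloc[0]  # an integer
first_explicit = explicit["Chromosome"].iloc[0]  # the string "1"

# Hypothetical lookup table keyed by chromosome name as a string.
chromosome_sizes = {"1": 249250621}

print(first_inferred in chromosome_sizes)  # False: the int 1 is not the key "1"
print(first_explicit in chromosome_sizes)  # True
```

Reading the chromosome column with an explicit string dtype avoids depending on what the first rows of a file happen to look like.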
I'm trying to analyze knockout and wildtype samples (including input for each) using epic2-df. However I get the following error: "Exception: Not valid subsetter: 1"
Here's the full output:
epic2-output.txt
Here are examples (head -n100 of the input files):
TKO: Sample_2D_KDM2A_me3.mqsd.head100.bedpe.txt
CKO: Sample_2D_KDM2A_input.mqsd.head100.bedpe.txt
TWT: Sample_2D_Arab2_me3.mqsd.head100.bedpe.txt
CWT: Sample_2D_Arab2_input.mqsd.head100.bedpe.txt
Here's my command:
epic2-df \
  --treatment-knockout Sample_2D_KDM2A_me3.mqsd.bedpe \
  --control-knockout Sample_2D_KDM2A_input.mqsd.bedpe \
  --treatment-wildtype Sample_2D_Arab2_me3.mqsd.bedpe \
  --control-wildtype Sample_2D_Arab2_input.mqsd.bedpe \
  --genome hg19 \
  --false-discovery-rate-cutoff 0.01 \
  --false-discovery-rate-comparison 0.01 \
  --bin-size 200 \
  --gaps-allowed 3 \
  --fragment-size 200 \
  --chromsizes hg19.chrom.sizes \
  --output-knockout Sample_2D_KDM2A_me3.mqsd \
  --output-wildtype Sample_2D_Arab2_me3.mqsd
Interestingly, some of the commands worked (with another set of bedpe files), so it may be an incompatibility between some of my bedpe files? Any assistance would be appreciated!