Enrichment Module Errors #72

Closed
udube opened this Issue Jan 23, 2019 · 9 comments

udube commented Jan 23, 2019

Hello,

I receive the following errors when running the circtools enrichment module using a gencode GTF file.

Input command:

circtools enrich -c circCoordinates -b predictions.hg19.bed-hg38 -a gencode.v26.primary_assembly.annotation.gtf -g sizes.genome -i 10 -p 20 -P 1 -T 1

Warnings and error:

Processed intersections for iteration 6
***** WARNING: File /tmp/pybedtools.a1akbyv9.tmp has inconsistent naming convention for record:
1       11869   14409   DDX11L1 0       +

***** WARNING: File /tmp/pybedtools.a1akbyv9.tmp has inconsistent naming convention for record:
1       11869   14409   DDX11L1 0       +

Processed intersections for iteration 4
Traceback (most recent call last):
  File "/root/.local/bin/circtools", line 18, in <module>
    import circtools
  File "/root/.local/lib/python3.5/site-packages/circtools/__init__.py", line 2, in <module>
    main()
  File "/root/.local/lib/python3.5/site-packages/circtools/circtools.py", line 31, in main
    CircTools()
  File "/root/.local/lib/python3.5/site-packages/circtools/circtools.py", line 76, in __init__
    getattr(self, args.command)()
  File "/root/.local/lib/python3.5/site-packages/circtools/circtools.py", line 204, in enrich
    enrich.run_module()
  File "/root/.local/lib/python3.5/site-packages/circtools/enrichment/enrichment_check.py", line 205, in run_module
    self.process_intersection(self.results[0][1], linear_start=True)
  File "/root/.local/lib/python3.5/site-packages/circtools/enrichment/enrichment_check.py", line 692, in process_intersection
    str(tmp_data["feature_length"]) + "_" +
KeyError: 'feature_length'

Input command:

circtools enrich -c circCoordinates -b predictions.hg19.bed-hg38 -a gencode.v26.primary_assembly.annotation.gtf -g sizes.genome -i 10 -p 20 -P 1 -T 1 -I exon

This command runs to completion, but the output .csv file contains only a header (output_10_2019_01_23__20_43.zip).

Thanks,

tjakobi self-assigned this Jan 24, 2019

tjakobi commented Jan 25, 2019

Dear @udube,

could you send me more of the program output, ideally the full log of the run? The beginning of the log should contain some statistics about the BED input files. If possible, the predictions.hg19.bed-hg38 file would also be useful (I assume that is where the warnings are coming from). The bedtools warnings may point to problems reading the BED files, which could in turn cause the error you got.

Cheers,
Tobias

udube commented Jan 25, 2019

Sure, please find the files you requested attached.

Output.zip

EDIT: Including logs from a run with no output and a run with the error.

Thanks!

tjakobi commented Jan 26, 2019

Thank you for providing the files. I'll take a look at the issue. I suspect that the chromosome names are not being handled correctly. The test data usually had "1" instead of "chr1", so that may be the problem.
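
One quick way to test this suspicion is to compare the chromosome naming used in each input file. The following is a minimal sketch, not part of circtools: the helper names `chromosome_names` and `uses_chr_prefix` are hypothetical, and the in-memory sample lines stand in for the real BED and genome sizes files (the BED record is the one from the warning above, the sizes are illustrative values).

```python
import io


def chromosome_names(handle):
    """Collect the first whitespace-separated field of every data line."""
    names = set()
    for line in handle:
        line = line.strip()
        # Skip blank lines, comments, and BED header lines.
        if not line or line.startswith(("#", "track", "browser")):
            continue
        names.add(line.split()[0])
    return names


def uses_chr_prefix(names):
    """True if every name starts with 'chr', False if none does, None if mixed or empty."""
    if not names:
        return None
    prefixed = {name.startswith("chr") for name in names}
    return prefixed.pop() if len(prefixed) == 1 else None


if __name__ == "__main__":
    # Stand-ins for the real files; replace with open("...") on actual data.
    bed = io.StringIO("1\t11869\t14409\tDDX11L1\t0\t+\n")
    sizes = io.StringIO("chr1\t248956422\nchr2\t242193529\n")
    print("BED uses chr prefix:  ", uses_chr_prefix(chromosome_names(bed)))
    print("sizes uses chr prefix:", uses_chr_prefix(chromosome_names(sizes)))
```

If the two calls disagree, the files use different naming conventions, which is the kind of mismatch bedtools reports as an "inconsistent naming convention".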

tjakobi commented Jan 28, 2019

Dear @udube,

I tried to reproduce the issue with mock CircCoordinates files and do not get any errors when using the same GTF and BED files as in your example. Would it be possible to also provide the CircCoordinates file?

Cheers,
Tobias

udube commented Jan 28, 2019

Thank you for continuing to look into this issue. Please find attached the CircCoordinates file as well as all output produced, including a text file with the program's output to screen.

ExampleCircCoordinates.zip

tjakobi commented Jan 30, 2019

Dear @udube,

Thank you for providing the files. I used the NPHP4 file as input and still cannot reproduce the error. However, my output is also empty. Could you please check your Python3 environment for me and run pip3 list to see which package versions you have installed right now? The pybedtools version would be especially interesting (I'm running pybedtools 0.8.0).
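
As an alternative to scanning the full pip3 list output, the relevant versions can be queried directly from Python. This is only a sketch: `installed_version` is a hypothetical helper, and it relies on `importlib.metadata`, which requires Python 3.8 or newer (so it would not run on the Python 3.5 installation shown in the traceback).

```python
from importlib import metadata


def installed_version(package):
    """Return the installed version string of a package, or None if it is not installed."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    # Packages most relevant to the enrichment module.
    for name in ("pybedtools", "circtools", "pandas"):
        print(name, installed_version(name) or "not installed")
```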

Cheers,
Tobias

udube commented Jan 30, 2019

To clarify, I only receive the error when I do not include "-I exon". When I include "-I exon" I have the empty output.

Please find the information you requested below:

biopython (1.73)
circtools (1.1.0.7)
HTSeq (0.11.2)
numpy (1.16.0)
pandas (0.23.4)
patsy (0.5.1)
Pillow (5.4.1)
pip (9.0.1)
pybedtools (0.8.0)
pysam (0.15.2)
python-dateutil (2.7.5)
pytz (2018.9)
reportlab (3.5.13)
scipy (1.2.0)
setuptools (33.1.1)
six (1.12.0)
statsmodels (0.9.0)
wheel (0.32.3)

tjakobi added a commit that referenced this issue Jan 31, 2019

tjakobi added a commit that referenced this issue Jan 31, 2019

tjakobi commented Jan 31, 2019

Dear @udube,

I traced the error back to some old code that I have not used for months. In the latest commit, I changed the default for -I to "gene" mode, meaning that shuffling will take place over all annotated gene regions (which more or less mirrors not specifying -I in earlier versions). Please update circtools to the latest version from the repository to get the updated code.

That should fix your problem. On a related note, the bedtools warning is caused by the genome sizes file, which contains "chr1" etc. as IDs, whereas circtools internally converts everything to "1". I will add an automatic conversion of the genome sizes file in a later release to address that issue as well. For now, you can use sed to strip the "chr" prefix from that file and get rid of the warning.
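
For readers who prefer Python over sed, the same workaround can be sketched as follows. This is an assumption-laden illustration rather than the conversion circtools performs internally: `strip_chr_prefix` is a hypothetical helper that simply drops a leading "chr" from each line (equivalent to `sed 's/^chr//'`), and the sample sizes are illustrative values. Names such as "chrM" become "M", which may or may not match what the annotation uses.

```python
import io


def strip_chr_prefix(lines):
    """Yield each line with a leading 'chr' removed; other lines pass through unchanged."""
    for line in lines:
        yield line[3:] if line.startswith("chr") else line


if __name__ == "__main__":
    # Stand-in for the genome sizes file; replace with open("sizes.genome").
    sizes = io.StringIO("chr1\t248956422\nchrX\t156040895\n")
    print("".join(strip_chr_prefix(sizes)), end="")
```

Writing the result to a new file and passing that file to -g leaves the original genome sizes file untouched.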

Cheers,
Tobias

udube commented Jan 31, 2019

Thank you. I am able to run the command without any errors. I appreciate your perseverance.

udube closed this Jan 31, 2019
