Minor issue: GAT hangs for missing IDs in isochore files #4

Closed
skiaphrene opened this issue Sep 6, 2016 · 2 comments

@skiaphrene

Dear Andreas,

I've come across a minor issue while using GAT over the past months.

For some reason, if the ID field (4th column) of the isochores BED file is set to ".", GAT hangs with the following Python error:

Traceback (most recent call last):
  File "/software/UHTS/Analysis/gat/1.2/bin/gat-run.py", line 335, in <module>
    sys.exit(main(sys.argv))
  File "/software/UHTS/Analysis/gat/1.2/bin/gat-run.py", line 313, in main
    annotator_results = fromSegments(options, args)
  File "/software/UHTS/Analysis/gat/1.2/bin/gat-run.py", line 242, in fromSegments
    num_threads=options.num_threads)
  File "/software/UHTS/Analysis/gat/1.2/lib64/python2.7/site-packages/gat/__init__.py", line 915, in run
    track, counts, counters, segs, annotations, workspace, outfiles)
  File "/software/UHTS/Analysis/gat/1.2/lib64/python2.7/site-packages/gat/__init__.py", line 622, in sample
    contig_annotations.fromIsochores()
  File "GatEngine.pyx", line 2979, in GatEngine.IntervalCollection.fromIsochores (GatEngine/GatEngine.c:36284)
  File "GatEngine.pyx", line 2767, in GatEngine.IntervalDictionary.fromIsochores (GatEngine/GatEngine.c:32377)
ValueError: too many values to unpack (expected 2)

It runs fine if the "." is set to some other constant (e.g. "isoc" for isochore).
This is not the case for the workspace BED file: "ws" or "." both work just fine.
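
The workaround above (replacing "." in the 4th column with a constant) can be scripted. This is only a minimal sketch, not part of GAT: the function name `replace_missing_ids` and the default placeholder "isoc" are chosen here for illustration, and it assumes a tab-separated BED file.

```python
def replace_missing_ids(lines, placeholder="isoc"):
    """Yield BED lines with a '.' in the 4th column replaced by `placeholder`.

    Hypothetical helper for illustration; it is not a GAT function.
    """
    for line in lines:
        stripped = line.rstrip("\n")
        # Pass blank lines, comments, and track/browser headers through unchanged.
        if not stripped or stripped.startswith(("#", "track", "browser")):
            yield stripped
            continue
        fields = stripped.split("\t")
        # Only touch records that have an ID column equal to ".".
        if len(fields) >= 4 and fields[3] == ".":
            fields[3] = placeholder
        yield "\t".join(fields)


example = ["chr1\t0\t1000\t.", "chr1\t1000\t2000\tisoc"]
fixed = list(replace_missing_ids(example))
```

For a real file, the same function can be applied to an open file handle and the result written to a new BED file that is then passed to GAT.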

It actually took me a while to find the source of this problem, so I just thought I'd point it out here!

Best regards,

-- Alex

@AndreasHeger
Owner

Thanks, I will take a look.

@AndreasHeger AndreasHeger self-assigned this Sep 26, 2016
@rdalbanus

rdalbanus commented Jul 12, 2018

Hi, Andreas,

Is there any specification for the isochore file format? Right now I'm using a single BED4 file with the isochore groups in the 4th column (e.g. gc_quintile_1, gc_quintile_2) and getting the same error as above. Should I be using a separate file for each isochore group, or should I put the actual numeric value of whatever metric I'm "isochoring" with in the 4th column, say, the GC content as an integer from 0-100?
This is definitely worth expanding in the official documentation.

Regards,
Ricardo

Update: I solved this by using a BED4 format with consecutive numeric categories in the 4th column (e.g. 1-8).

What doesn't work:

  • Isochore categories split into different files, whether by giving the list of files to the --isochore-file argument (e.g. file_1.bed file2.bed file3.bed) or by using glob expansion (e.g. file*.bed). In either case, GAT seems to read only the first file. Wrapping the glob in parentheses as asked (file(*).bed or (file*.bed)) breaks bash syntax.
  • Non-consecutive isochore categories in the 4th column (e.g. gc_content_bin_1, gc_content_bin2). This gives the "too many values to unpack" error.
  • The actual isochore value (e.g. GC content) in the 4th column. This also gives the "too many values to unpack" error.

Hope this helps anyone who stumbles here.
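
For anyone who wants to script the workaround described above, here is a minimal sketch that rewrites arbitrary 4th-column labels as consecutive integers (1, 2, 3, ...) in order of first appearance. The function name `relabel_isochores` is made up for this example and is not part of GAT. That consecutive integers are what GAT needs is only what was observed in this thread, not something taken from the documentation.

```python
def relabel_isochores(lines):
    """Return (new_lines, mapping) with 4th-column labels replaced by "1".."N".

    Hypothetical helper for illustration; it is not a GAT function.
    Labels are numbered in order of first appearance.
    """
    mapping = {}
    out = []
    for line in lines:
        stripped = line.rstrip("\n")
        # Keep blank lines, comments, and track/browser headers as they are.
        if not stripped or stripped.startswith(("#", "track", "browser")):
            out.append(stripped)
            continue
        fields = stripped.split("\t")
        if len(fields) < 4:
            raise ValueError(f"expected at least 4 columns, got: {stripped!r}")
        label = fields[3]
        # Assign the next integer the first time a label is seen.
        if label not in mapping:
            mapping[label] = str(len(mapping) + 1)
        fields[3] = mapping[label]
        out.append("\t".join(fields))
    return out, mapping


example = [
    "chr1\t0\t100\tgc_quintile_1",
    "chr1\t100\t200\tgc_quintile_2",
    "chr2\t0\t50\tgc_quintile_1",
]
relabeled, mapping = relabel_isochores(example)
```

The returned mapping can be saved alongside the rewritten BED file so the numeric categories in the GAT output can be translated back to the original labels.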
