create_ob_graph failure #11

Open
byoo opened this issue Jan 28, 2021 · 6 comments

@byoo

byoo commented Jan 28, 2021

Hello, I would like to ask your advice on creating an offset-based graph using create_ob_graph.
Could you advise how to resolve the error below? The input JSON for create_ob_graph comes from a vg file that was converted from a GFA file created by minigraph.

2021-01-28 01:43:03,592, INFO: Setting sequences using vg json graph graph_p.json
Traceback (most recent call last):
  File "graph_peak_caller", line 8, in <module>
    sys.exit(main())
  File "graph_peak_caller/command_line_interface.py", line 36, in main
    run_argument_parser(sys.argv[1:])
  File "graph_peak_caller/command_line_interface.py", line 673, in run_argument_parser
    args.func(args)
  File "graph_peak_caller/preprocess_interface.py", line 67, in create_ob_graph
    sequence_graph.set_sequences_using_vg_json_graph(args.vg_json_file_name)
  File "offsetbasedgraph/sequencegraph.py", line 71, in set_sequences_using_vg_json_graph
    self.set_sequence(int(node_object["id"]), node_object["sequence"])
  File "offsetbasedgraph/sequencegraph.py", line 94, in set_sequence
    assert node_size == len(sequence), "Invalid sequence. Does not cover whole node"
AssertionError: Invalid sequence. Does not cover whole node
@ivargr
Member

ivargr commented Jan 28, 2021

Hi!

It seems that it crashes because it thinks there is a node in the graph having a sequence that doesn't match the node size.
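To make the failing check concrete, here is a minimal sketch of the comparison the assertion performs, written to collect every mismatching node instead of stopping at the first one. It assumes the vg JSON is line-delimited, with a `node` list whose entries carry the `id` and `sequence` keys seen in the traceback. `find_size_mismatches` and the `node_sizes` mapping are hypothetical names for illustration, not part of offsetbasedgraph.

```python
import json


def find_size_mismatches(json_lines, node_sizes):
    """Return (node_id, expected_size, sequence_length) for each mismatch.

    json_lines: iterable of strings, each one JSON object (assumed layout).
    node_sizes: hypothetical dict mapping node id -> size stored in the graph.
    """
    mismatches = []
    for line in json_lines:
        line = line.strip()
        if not line:
            continue
        record = json.loads(line)
        for node in record.get("node", []):
            node_id = int(node["id"])
            sequence = node.get("sequence", "")
            expected = node_sizes.get(node_id)
            # Same comparison as the assertion in the traceback, but the
            # result is collected rather than raised.
            if expected != len(sequence):
                mismatches.append((node_id, expected, len(sequence)))
    return mismatches


if __name__ == "__main__":
    # Toy data: node 2 is recorded with size 5 but carries a 3-base sequence.
    lines = ['{"node": [{"id": "1", "sequence": "ACGT"}, {"id": "2", "sequence": "ACG"}]}']
    sizes = {1: 4, 2: 5}
    print(find_size_mismatches(lines, sizes))
```

A node that is missing from `node_sizes` is also reported, with `None` as its expected size, which may help when the graph and the JSON were built from different inputs.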

Would you be able to send med the vg graph your are using, and I could check whether there is an error in the code or something wrong with the graph?

Thanks!

@byoo
Author

byoo commented Jan 28, 2021 via email

@ivargr
Member

ivargr commented Jan 28, 2021

Sorry, I had a typo there, I meant to ask "Would you be able to send me the vg graph you are using...?". I see you have a file called graph_p.json, you could alternatively just send me that (but I guess you also have a .vg-file that is smaller).

@byoo
Author

byoo commented Jan 28, 2021

Sorry, I sent that message by mistake. Yes, you are right. BTW, even the vg file is over 3 GB, so I wonder if it is possible to find the data causing the error and extract just that. I am new to working with vg files, so I'd appreciate any guidance. Thank you.

@ivargr
Member

ivargr commented Jan 28, 2021

It is a bit tricky without having the graph, since it seems like there might be an error in the graph. Maybe you could explain the steps/pipeline you used to create the graph, and I can see if I can understand how you got the error from there?

@byoo
Author

byoo commented Jan 28, 2021

The steps to create the graph are: 1) perform de novo assembly using hifiasm, 2) build a GFA using minigraph, 3) convert GFA to vg to JSON using vg.
The error occurs in a small subset of the graph here. Thanks!

Sorry, I just read that everything in the graph needs to be connected. The graph includes all the chromosomes, which may be part of the issue.
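One way to check the connectivity concern is to count the weakly connected components in the JSON. This is a rough sketch, assuming the same line-delimited layout as above and that each entry in the `edge` list has `from` and `to` node ids. Edge orientation is ignored, and `count_components` is a hypothetical helper, not something provided by vg or graph_peak_caller.

```python
import json


def count_components(json_lines):
    """Count weakly connected components among nodes in vg-style JSON lines."""
    parent = {}

    def find(x):
        # Union-find lookup with path halving; registers unseen ids.
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    def union(a, b):
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_a] = root_b

    for line in json_lines:
        line = line.strip()
        if not line:
            continue
        record = json.loads(line)
        for node in record.get("node", []):
            find(int(node["id"]))  # register isolated nodes too
        for edge in record.get("edge", []):
            union(int(edge["from"]), int(edge["to"]))

    return len({find(x) for x in list(parent)})


if __name__ == "__main__":
    # Toy data: nodes 1-2 and 3-4 form two separate components.
    lines = [
        '{"node": [{"id": "1", "sequence": "A"}, {"id": "2", "sequence": "C"}]}',
        '{"node": [{"id": "3", "sequence": "G"}, {"id": "4", "sequence": "T"}],'
        ' "edge": [{"from": "1", "to": "2"}, {"from": "3", "to": "4"}]}',
    ]
    print(count_components(lines))
```

A result greater than 1 would mean the graph has disconnected parts, as expected for a graph holding several chromosomes.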
