Error in m6anet-inference step #8

Closed
bhargava-morampalli opened this issue May 13, 2021 · 7 comments

Comments

@bhargava-morampalli

Hi Chris,

Thanks for releasing the new model for m6anet. I have run the nanopolish and data-prep steps successfully, but I am encountering an error in the inference step. Here's the command I used and the error that I got.

[Image: Screenshot 2021-05-13 at 12 50 52 PM]

Here are the contents of the m6anet data-prep output directory.

data.index  data.json  data.log  data.readcount  eventalign.index 

And here are the first few lines of each file.

data.index

transcript_id,transcript_position,start,end
gnl|X|GEFLOABG_1,495,0,241
gnl|X|GEFLOABG_1,554,241,805
gnl|X|GEFLOABG_1,566,805,1074
gnl|X|GEFLOABG_1,609,1074,1460
gnl|X|GEFLOABG_1,641,1460,1702
gnl|X|GEFLOABG_1,794,1702,1935
gnl|X|GEFLOABG_1,929,1935,2312
gnl|X|GEFLOABG_1,1276,2312,2415
gnl|X|GEFLOABG_1,2798,2415,2768

data.json

{"gnl|X|GEFLOABG_1":{"495":{"AAAACCA":[[0.01184074074074074,2.4917777777777776,110.4,0.00564,3.395,103.8,0.00752,2.2332285714285716,83.8],[0.01029,3.492,106.7,0.0156,4.1739999999999995,106.2,0.008974782608695654,1.8518405797101452,85.8]]}}}
{"gnl|X|GEFLOABG_1":{"554":{"CGAACTT":[[0.003320000000000001,6.228,113.0,0.003519512195121951,2.2558536585365854,98.7,0.00548,1.915878787878788,94.2],[0.0046864,5.01176,111.9,0.005838703703703704,2.561796296296296,98.3,0.004930512820512821,1.81474358974359,93.4],[0.010001636363636365,4.834199999999999,116.8,0.00299,2.9989999999999997,96.2,0.010620000000000001,2.162,91.2],[0.008946176470588235,5.474882352941177,116.5,0.018920000000000003,3.4560000000000004,102.8,0.00465,1.288,92.7],[0.00498,3.87,112.3,0.00598,2.3705,100.1,0.00299,1.9419999999999997,93.7]]}}}
{"gnl|X|GEFLOABG_1":{"566":{"GGGACTC":[[0.00465,2.62,125.0,0.011222000000000001,2.3234,126.6,0.0070901960784313725,4.266843137254902,91.4],[0.00232,3.6639999999999993,118.4,0.023049523809523808,8.358790476190476,126.5,0.0033529999999999996,3.2972499999999996,91.1]]}}}
{"gnl|X|GEFLOABG_1":{"609":{"AAAACTT":[[0.0069680930232558155,2.397232558139535,110.0,0.006640000000000002,3.582,112.1,0.00465,3.0060000000000002,97.6],[0.01727975,3.17975,112.6,0.004538723404255319,2.678808510638298,111.9,0.005993611111111112,1.6218055555555557,90.4],[0.010791319148936171,2.3994510638297872,111.8,0.00465,2.253,108.8,0.0073202272727272725,2.005272727272727,95.2]]}}}
{"gnl|X|GEFLOABG_1":{"641":{"AAAACAT":[[0.01029,3.805,104.3,0.0166,4.933,96.9,0.004095833333333333,2.7229166666666664,86.7],[0.00797,1.8795,111.6,0.009260961538461539,2.988711538461539,101.2,0.0063100000000000005,3.5839999999999996,88.1]]}}}
{"gnl|X|GEFLOABG_1":{"794":{"AAAACTG":[[0.00996,5.541,111.1,0.007640000000000001,6.316,107.9,0.005605757575757576,2.3050909090909095,91.1],[0.00365,3.0589999999999993,109.3,0.00266,2.226,100.3,0.0037576000000000003,2.78356,93.4]]}}}
{"gnl|X|GEFLOABG_1":{"929":{"CGAACTG":[[0.005724387755102042,4.217897959183674,103.9,0.006640000000000002,4.3839999999999995,100.0,0.003927636363636363,2.333509090909091,94.9],[0.010001636363636365,5.474981818181819,116.8,0.00232,1.39,102.4,0.00797,1.676,97.8],[0.011672112676056338,7.767183098591549,118.6,0.00299,2.261,104.7,0.008669074074074076,2.5933148148148146,94.3]]}}}
{"gnl|X|GEFLOABG_1":{"1276":{"ATAACAT":[[0.00365,1.327,87.2,0.00299,1.148,91.1,0.00365,2.305,96.9]]}}}
{"gnl|X|GEFLOABG_1":{"2798":{"CTGACAT":[[0.016830176991150442,3.0475840707964603,107.3,0.00498,13.562000000000001,115.8,0.0234909649122807,2.8029298245614034,81.8],[0.00365,2.01,105.9,0.0093,8.011000000000001,113.2,0.0083,2.9219999999999997,78.3],[0.010674254545454544,3.1419200000000003,106.3,0.0049552,9.74708,109.6,0.007640000000000001,3.4,83.2]]}}}
{"gnl|X|GEFLOABG_1":{"2872":{"GTGACAC":[[0.006475897435897436,3.2191025641025637,98.5,0.006076666666666667,9.85124,115.7,0.0052285185185185195,2.837111111111111,81.5],[0.008934782608695653,4.1631847826086945,103.2,0.011549135514018692,5.914228971962616,110.3,0.005172222222222222,4.274511111111111,84.8]]}}}

data.log

gnl|X|GEFLOABG_1: Data preparation ... Done.

data.readcount

transcript_id,transcript_position,n_reads
gnl|X|GEFLOABG_1,495,2
gnl|X|GEFLOABG_1,554,5
gnl|X|GEFLOABG_1,566,2
gnl|X|GEFLOABG_1,609,3
gnl|X|GEFLOABG_1,641,2
gnl|X|GEFLOABG_1,794,2
gnl|X|GEFLOABG_1,929,3
gnl|X|GEFLOABG_1,1276,1
gnl|X|GEFLOABG_1,2798,3

eventalign.index

transcript_id,read_index,pos_start,pos_end
gnl|X|GEFLOABG_1,9,172,40517
gnl|X|GEFLOABG_1,17,40517,124637
gnl|X|GEFLOABG_1,4,124637,184631
gnl|X|GEFLOABG_1,15,184631,306199
gnl|X|GEFLOABG_1,12,306199,361953
gnl|X|GEFLOABG_1,27,361953,440434
gnl|X|GEFLOABG_1,5,440434,546348
gnl|X|GEFLOABG_1,16,546348,605586
gnl|X|GEFLOABG_1,21,605586,686048

Happy to provide more details if needed. Please let me know how to resolve this error.

@chrishendra93
Collaborator

Hi @bhargava-morampalli, can I ask whether you modified any of the code? I think the script name should be m6anet-run_inference. Anyway, it seems that None is being passed to os.path.join somewhere in the script. Is it possible that the problem is that you used -i and -o instead of --input_dir and --output_dir, so the command line does not recognize the input and output directory arguments?
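
For readers hitting the same traceback: `os.path.join` raises a `TypeError` when it receives `None`, which is what happens if an option such as `--input_dir` is never set. The sketch below is a hypothetical stand-in for the script's argument handling, not m6anet's actual code. The parser, the `index_path` helper, and the use of `parse_known_args` are all assumptions made for illustration.

```python
import argparse
import os


def build_parser():
    # Hypothetical parser for illustration only; not m6anet's real one.
    parser = argparse.ArgumentParser(allow_abbrev=False)
    parser.add_argument("--input_dir", default=None)
    parser.add_argument("--out_dir", default=None)
    return parser


def index_path(argv):
    # parse_known_args() collects flags it does not recognise instead of
    # exiting, so "-i demo_data" leaves input_dir at its default of None.
    args, unknown = build_parser().parse_known_args(argv)
    if args.input_dir is None:
        # Fail early with a clear message rather than letting None reach
        # os.path.join(), which would raise a TypeError.
        raise ValueError(
            f"--input_dir was not set; unrecognised arguments: {unknown}"
        )
    return os.path.join(args.input_dir, "data.index")


if __name__ == "__main__":
    print(index_path(["--input_dir", "demo_data"]))
    try:
        index_path(["-i", "demo_data", "-o", "demo_data"])
    except ValueError as err:
        print(err)
```

Checking for `None` before building paths turns a confusing traceback into a message that names the missing option.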

@bhargava-morampalli
Author

Oh, sorry - I just followed the quick start thing. So, I should use run_inference instead of inference?

m6anet-inference -input_dir demo_data --out_dir demo_data ---n_processes 4

I have not touched any code. Also, --input_dir and --out_dir did not work with m6anet-inference, which is why I changed them to -i and -o. I will try m6anet-run_inference and let you know if it works.

@bhargava-morampalli
Author

Okay, I used m6anet-run_inference and got the data.result.csv.gz file. It's very small, only a few bytes. Is this normal?
Also, n_processes is for allocating the threads, is that correct?

@chrishendra93
Collaborator

chrishendra93 commented May 13, 2021

Ah, I see. It's odd that m6anet-inference can still be executed; perhaps I forgot to clear some files or cache, so let me check that on my end. Apologies for the typo in the documentation. I have just updated it so that people will not mistake the command. Thanks!

Anyway, you are right: --n_processes sets the number of threads. Also, by default m6anet requires each position to have at least 20 reads. May I know the size of the data.readcount file (how many rows it has, and whether many positions have at least 20 reads)? Also, can you check whether the entries inside data.result.csv.gz make sense?
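
One way to answer the read-count question is to count how many rows of data.readcount meet the threshold. The sketch below assumes only the three-column layout shown earlier in this thread (`transcript_id,transcript_position,n_reads`) and the default of 20 reads mentioned above. The `summarise_readcount` helper is a hypothetical name, not part of m6anet.

```python
import csv
import io


def summarise_readcount(handle, min_reads=20):
    """Return (total_positions, positions_with_enough_reads)."""
    total = 0
    passing = 0
    for row in csv.DictReader(handle):
        total += 1
        if int(row["n_reads"]) >= min_reads:
            passing += 1
    return total, passing


# The rows quoted in this issue, used here as sample input.
SAMPLE = """transcript_id,transcript_position,n_reads
gnl|X|GEFLOABG_1,495,2
gnl|X|GEFLOABG_1,554,5
gnl|X|GEFLOABG_1,566,2
gnl|X|GEFLOABG_1,609,3
gnl|X|GEFLOABG_1,641,2
gnl|X|GEFLOABG_1,794,2
gnl|X|GEFLOABG_1,929,3
gnl|X|GEFLOABG_1,1276,1
gnl|X|GEFLOABG_1,2798,3
"""

if __name__ == "__main__":
    total, passing = summarise_readcount(io.StringIO(SAMPLE))
    print(f"{passing} of {total} positions have at least 20 reads")
```

For a real file, pass `open("data.readcount", newline="")` instead of the `io.StringIO` wrapper. None of the nine rows quoted in this thread reaches 20 reads.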

@bhargava-morampalli
Copy link
Author

That's great, thanks. I will check the data.readcount and also about the data in csv.gz. I will let you know the results.
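
A minimal sketch for looking inside a gzipped CSV such as data.result.csv.gz, using only the Python standard library. It makes no assumption about the column names m6anet writes; it returns whatever header and rows the file contains. The `peek_gzipped_csv` helper is a hypothetical name introduced for illustration.

```python
import csv
import gzip
import itertools


def peek_gzipped_csv(path, max_rows=5):
    """Return (header, first_rows) from a gzip-compressed CSV file."""
    # "rt" opens the compressed stream in text mode so csv can read it.
    with gzip.open(path, "rt", newline="") as handle:
        reader = csv.reader(handle)
        header = next(reader, None)  # None if the file is empty
        rows = list(itertools.islice(reader, max_rows))
    return header, rows


if __name__ == "__main__":
    import sys

    if len(sys.argv) > 1:
        header, rows = peek_gzipped_csv(sys.argv[1])
        print("header:", header)
        print("rows read:", len(rows))
        for row in rows:
            print(row)
```

A header with no rows after it would be consistent with a file that is only a few bytes in size.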

@chrishendra93
Collaborator

Oh right, please try --input_dir instead of -input_dir, which is what the documentation stated before. That was a typo that I have now corrected in the documentation. Again, sorry for this.

@chrishendra93
Collaborator

Hi @bhargava-morampalli, have you managed to run this successfully? If so, I would like to close this issue; otherwise, please let me know of any problems you are facing with running m6anet.
