improving process_results #40
db_seqs_counts_a[subject_id_a] += 1
db_seqs_counts_b[subject_id_b] += 1
elif vals['a']['bit_score'] > vals['b']['bit_score']:
if not subject_id_b:
results[i]['perfect_interest'] += 1
results[i]['summary'].append('%s\t%s\t' % (seq_name,
subject_id_a))
results[i]['summary_fh'].write('%s\t%s\t\n' % (
The header on line 239 has 3 columns, but this is only writing two. Would it be possible to write an explicit null value so the resulting file is not jagged?
The results are not going to be jagged, are they?
You are right, it expects 3 values: seq_id, best_A, best_B. In this case, there is no best_B, so ending the line after the second tab is fine because there is no value to write there. I can add something more explicit, but I'm not sure what.
The results shouldn't be jagged. BTW there is a test that checks that the resulting files are the same.
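For reference, the `'%s\t%s\t\n'` format quoted above already yields three tab-separated fields, with the third left empty. If an explicit null value were preferred, it might look like the sketch below. This is only an illustration written for Python 3: `NULL_VALUE` and `format_summary_row` are hypothetical names, and `"NA"` is an assumed placeholder, not something the thread settles on.

```python
# Hypothetical placeholder for a missing best hit; the thread does not pick one.
NULL_VALUE = "NA"


def format_summary_row(seq_name, subject_id_a=None, subject_id_b=None):
    """Return one tab-separated row that always has exactly three columns."""
    fields = [
        seq_name,
        subject_id_a or NULL_VALUE,  # best_A, or the placeholder if missing
        subject_id_b or NULL_VALUE,  # best_B, or the placeholder if missing
    ]
    return "\t".join(fields) + "\n"


rows = [
    format_summary_row("seq1", "hit_a", "hit_b"),
    format_summary_row("seq2", "hit_a"),  # no best hit in database B
]

# Every row splits into the same number of columns, so the file is not jagged.
column_counts = [len(row.rstrip("\n").split("\t")) for row in rows]
```

An explicit placeholder mostly helps human readers and parsers that strip trailing whitespace; a strict tab-delimited parser would read the empty third field correctly either way.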
@ElDeveloper @wasade ready for another pass ... thanks!
for (perc_id_a, aln_len_a), (perc_id_b, aln_len_b) in izip(iter_a, iter_b):
    filename = "p1_%d-a1_%d_p2_%d-a2_%d" % (perc_id_a, aln_len_a,
                                            perc_id_b, aln_len_b)
    summary_filename = join(output_dir, "summary_" + filename + ".txt")
    summary_fh = open(summary_filename, 'w')
Hmmm, I just noticed that this file handle is not being closed, or am I missing something?
That's right, it's not, and I think that's fine because the handles will be closed once the program finishes. If we want to close them explicitly, we will need to add a for loop after this one to close them. Should I do that?
I think this would be a good idea, especially because if the number of files grows, we might run out of file handles 😱
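One possible way to guarantee that every summary handle is closed, as an alternative to a trailing for loop, is `contextlib.ExitStack`. The sketch below is an assumption-laden illustration in Python 3, not the PR's code: `write_summaries` is a hypothetical helper, the header line is a guess based on the three columns named in this thread, and the file name only mimics the `p1_%d-a1_%d_p2_%d-a2_%d` pattern from the diff.

```python
import os
import tempfile
from contextlib import ExitStack


def write_summaries(output_dir, names):
    """Open one summary file per name, write a header, then close them all."""
    handles = {}
    with ExitStack() as stack:
        for name in names:
            path = os.path.join(output_dir, "summary_" + name + ".txt")
            # enter_context registers the handle so it is closed when the
            # with-block exits, even if an exception is raised part-way.
            handles[name] = stack.enter_context(open(path, "w"))
        for fh in handles.values():
            fh.write("#seq_id\tbest_A\tbest_B\n")  # hypothetical header
    # All handles are closed at this point; returned only for inspection.
    return handles


with tempfile.TemporaryDirectory() as tmp_dir:
    closed_handles = write_summaries(tmp_dir, ["p1_97-a1_50_p2_97-a2_50"])
    all_closed = all(fh.closed for fh in closed_handles.values())
```

The explicit loop suggested above (`for each result: summary_fh.close()`) would work as well; the stack just makes the cleanup hold on error paths too.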
Just ☝️ comment.