A new step called fastqprocessing for Optimus to speed up #82
…oBam, Attach10XBarcodes, SplitBamFile, SplitBamByCellBarcodes--into one step in Optimus
Some comments and suggested changes, but most importantly I think we need to find a different solution that does not include checking in all the libStatGen files
Are there any tests for this code?
I think we said we were going to add tests with this ticket https://broadinstitute.atlassian.net/browse/GL-1179, or at least add more if there are some already.
…nd freeze a particular version
Codecov Report
@@           Coverage Diff           @@
##           master     #82   +/-   ##
=======================================
  Coverage   88.85%   88.85%
=======================================
  Files          28       28
  Lines        3007     3007
=======================================
  Hits         2672     2672
  Misses        335      335
@kishorikonwar I think you may have missed several of my comments that GitHub had hidden. Please take another look.
A couple more minor comments. Looks good overall!
// sam records in its buffer. This is the same behavior across all
// reader threads
if (r == block_size || !fastQFileR1.keepReadingFile()) {
    submit_block_tobe_written(samrecord_data, tindex);
- submit_block_tobe_written(samrecord_data, tindex);
+ submit_block_to_be_written(samrecord_data, tindex);
for (int j=0; j < 5; j++) {
    char c = tp[i];
    tp[i] = ATCG[j];
    /* if the mutation in any of the positions
Fair enough. Should we put a ticket in the backlog to look into this?
#!/bin/bash

# --tool=memcheck \
please remove unused code
}
/* getopt_long already printed an error message. */
return;
break;
Unnecessary break; after return;
namespace fs = std::experimental::filesystem;

/** @copydoc filesize */
int32_t filesize(const char *filename) {
Is int32 sufficient here, or is a long type needed for large files? If the file size is in bytes, doesn't this top out at about 2.1 GB?
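To make the limit concrete, here is a minimal sketch of the arithmetic behind this question. The names kMaxInt32Bytes and fits_in_int32 are hypothetical and are not part of the PR; they only illustrate where a signed 32-bit byte count stops being usable.

```cpp
#include <cstdint>
#include <limits>

// Largest byte count an int32_t can hold: 2^31 - 1 = 2,147,483,647 bytes,
// which is just under 2 GiB (roughly 2.1 GB in decimal units).
constexpr std::int64_t kMaxInt32Bytes =
    std::numeric_limits<std::int32_t>::max();

// Hypothetical helper: true when a byte count can be stored in an int32_t
// without overflowing.
constexpr bool fits_in_int32(std::int64_t bytes) {
    return bytes >= std::numeric_limits<std::int32_t>::min() &&
           bytes <= std::numeric_limits<std::int32_t>::max();
}
```

A 3 GiB fastq.gz would fail this check, so a 64-bit return type looks like the safer choice if inputs of that size are expected.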
FILE *f = fopen(filename, "rb"); /* open the file in read only */

int32_t size = 0;
if (fseek(f, 0, SEEK_END) ==0 ) /* seek was successful */
Isn't there a better way to get a file's size, such as fstat?
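One possible shape for this, as a sketch only: the function below uses the POSIX stat call (the path-based sibling of fstat) and returns a 64-bit value. The name file_size_bytes and the -1 error convention are assumptions made for illustration, not the PR's actual code. It also assumes a platform where off_t is 64 bits wide, which is typical on 64-bit Linux.

```cpp
#include <cstdint>
#include <sys/stat.h>

// Hypothetical alternative to filesize(): asks the OS for the size directly
// instead of opening the file and seeking to its end.
// Returns the size in bytes, or -1 if the file cannot be stat'ed.
std::int64_t file_size_bytes(const char *filename) {
    struct stat st;
    if (stat(filename, &st) != 0) {
        return -1;  // e.g. the file does not exist or is not accessible
    }
    return static_cast<std::int64_t>(st.st_size);
}
```

This avoids the fopen/fseek round trip and the unchecked fopen result. Since the file already aliases std::experimental::filesystem as fs, its file_size function (which returns std::uintmax_t) may be another option worth considering.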
@@ -0,0 +1 @@
./fastqproc ../../L8TX/L8TX_180221_01_F12_R1.fastq.gz ../../L8TX/L8TX_180221_01_F12_I1.fastq.gz ../../L8TX/L8TX_180221_01_F12_R2.fastq.gz ../../L8TX/L8TX_171026_01_F03_R1.fastq.gz ../../L8TX/L8TX_171026_01_F03_I1.fastq.gz ../../L8TX/L8TX_171026_01_F03_R2.fastq.gz
If these scripts refer to files that aren't checked in, what use are they for developers? This script in particular doesn't seem worth checking in once the development work is complete, since it's just calling fastqproc on a list of fastqs.
/**
 * @brief Compute the number of bam files
Does this match the function? It looks like it's computing the number of blocks in a bam file.
Purpose
To speed up Optimus for large datasets