Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[ADAM-1992] Make maximum FASTQ read length configurable. #2077

Merged
merged 3 commits into from Nov 7, 2018

Conversation

@heuermh
Copy link
Member

@heuermh heuermh commented Nov 5, 2018

Fixes #1992

Frank Austin Nothaft and others added 2 commits Jul 6, 2018
@AmplabJenkins
Copy link

@AmplabJenkins AmplabJenkins commented Nov 5, 2018

Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/ADAM-prb/2849/
Test PASSed.

@heuermh
Copy link
Member Author

@heuermh heuermh commented Nov 5, 2018

@akmorrow13 Ping for review and merge

public static final int DEFAULT_MAX_READ_LENGTH = 10000;

/** Maximum read length property name. */
public static final String MAX_READ_LENGTH_PROPERTY = "org.bdgenomics.adam.io.FastqRecordReader.MAX_READ_LENGTH";

This comment has been minimized.

@akmorrow13

akmorrow13 Nov 6, 2018
Contributor

Is there a way to set this in the cli or just the shell?

This comment has been minimized.

@heuermh

heuermh Nov 6, 2018
Author Member

Hmm, good point.

It is a property in the Hadoop configuration, so I suppose

adam-submit \
  --conf org.bdgenomics.adam.io.FastqRecordReader.MAX_READ_LENGTH=42 \
  --

might work

This comment has been minimized.

@heuermh

heuermh Nov 6, 2018
Author Member

See new commit

@coveralls
Copy link

@coveralls coveralls commented Nov 6, 2018

Coverage Status

Coverage decreased (-0.02%) to 79.186% when pulling 872da37 on heuermh:issue-1992 into fc67969 on bigdatagenomics:master.

@AmplabJenkins
Copy link

@AmplabJenkins AmplabJenkins commented Nov 6, 2018

Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/ADAM-prb/2850/
Test PASSed.

@@ -139,6 +140,8 @@ class TransformAlignmentsArgs extends Args4jBase with ADAMSaveAnyArgs with Parqu
var partitionByStartPos: Boolean = false
@Args4jOption(required = false, name = "-partition_bin_size", usage = "Partition bin size used in Hive-style partitioning. Defaults to 1Mbp (1,000,000) base pairs).")
var partitionedBinSize = 1000000
@Args4jOption(required = false, name = "-max_read_length", usage = "Maximum FASTQ read length, defaults to 10,000 base pairs (bp).")
var maxReadLength: Int = 0

This comment has been minimized.

@akmorrow13

akmorrow13 Nov 6, 2018
Contributor

Should the default be 10,000?

This comment has been minimized.

@heuermh

heuermh Nov 6, 2018
Author Member

The default is 10,000, as set in FastqRecordReader.
Since it isn't possible to initialize maxReadLength to null, 0 is used instead.

@heuermh
Copy link
Member Author

@heuermh heuermh commented Nov 7, 2018

@akmorrow13 Ping for merge

@@ -385,6 +388,10 @@ class TransformAlignments(protected val args: TransformAlignmentsArgs) extends B

val stringencyOpt = Option(args.stringency).map(ValidationStringency.valueOf(_))

if (args.maxReadLength > 0) {

This comment has been minimized.

@akmorrow13

akmorrow13 Nov 7, 2018
Contributor

Just a nit, it may make more sense to set the default args.maxReadLength to 10,000 and remove the if statement. Either way works though.

This comment has been minimized.

@heuermh

heuermh Nov 7, 2018
Author Member

If it is set to the default here in the cli class, then it is always set here, and will clobber the configuration property if defined on the command line or elsewhere.

@akmorrow13 akmorrow13 merged commit 1ad572b into bigdatagenomics:master Nov 7, 2018
1 check passed
1 check passed
@AmplabJenkins
default Merged build finished.
Details
@akmorrow13
Copy link
Contributor

@akmorrow13 akmorrow13 commented Nov 7, 2018

Thanks @heuermh!

@heuermh heuermh deleted the heuermh:issue-1992 branch Nov 7, 2018
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Linked issues

Successfully merging this pull request may close these issues.

None yet

4 participants