New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[ADAM-1992] Make maximum FASTQ read length configurable. #2077

Merged
merged 3 commits into from Nov 7, 2018

Conversation

Projects
None yet
5 participants
@heuermh
Copy link
Member

heuermh commented Nov 5, 2018

Fixes #1992

@AmplabJenkins

This comment has been minimized.

Copy link

AmplabJenkins commented Nov 5, 2018

Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/ADAM-prb/2849/
Test PASSed.

@heuermh

This comment has been minimized.

Copy link
Member

heuermh commented Nov 5, 2018

@akmorrow13 Ping for review and merge

public static final int DEFAULT_MAX_READ_LENGTH = 10000;

/** Maximum read length property name. */
public static final String MAX_READ_LENGTH_PROPERTY = "org.bdgenomics.adam.io.FastqRecordReader.MAX_READ_LENGTH";

This comment has been minimized.

@akmorrow13

akmorrow13 Nov 6, 2018

Contributor

Is there a way to set this in the cli or just the shell?

This comment has been minimized.

@heuermh

heuermh Nov 6, 2018

Member

Hmm, good point.

It is a property in the Hadoop configuration, so I suppose

adam-submit \
  --conf org.bdgenomics.adam.io.FastqRecordReader.MAX_READ_LENGTH=42 \
  --

might work

This comment has been minimized.

@heuermh

heuermh Nov 6, 2018

Member

See new commit

@coveralls

This comment has been minimized.

Copy link

coveralls commented Nov 6, 2018

Coverage Status

Coverage decreased (-0.02%) to 79.186% when pulling 872da37 on heuermh:issue-1992 into fc67969 on bigdatagenomics:master.

@AmplabJenkins

This comment has been minimized.

Copy link

AmplabJenkins commented Nov 6, 2018

Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/ADAM-prb/2850/
Test PASSed.

@@ -139,6 +140,8 @@ class TransformAlignmentsArgs extends Args4jBase with ADAMSaveAnyArgs with Parqu
var partitionByStartPos: Boolean = false
@Args4jOption(required = false, name = "-partition_bin_size", usage = "Partition bin size used in Hive-style partitioning. Defaults to 1Mbp (1,000,000) base pairs).")
var partitionedBinSize = 1000000
@Args4jOption(required = false, name = "-max_read_length", usage = "Maximum FASTQ read length, defaults to 10,000 base pairs (bp).")
var maxReadLength: Int = 0

This comment has been minimized.

@akmorrow13

akmorrow13 Nov 6, 2018

Contributor

Should the default be 10,000?

This comment has been minimized.

@heuermh

heuermh Nov 6, 2018

Member

The default is 10,000, as set in FastqRecordReader.
Since it isn't possible to initialize maxReadLength to null, 0 is used instead.

@heuermh

This comment has been minimized.

Copy link
Member

heuermh commented Nov 7, 2018

@akmorrow13 Ping for merge

@@ -385,6 +388,10 @@ class TransformAlignments(protected val args: TransformAlignmentsArgs) extends B

val stringencyOpt = Option(args.stringency).map(ValidationStringency.valueOf(_))

if (args.maxReadLength > 0) {

This comment has been minimized.

@akmorrow13

akmorrow13 Nov 7, 2018

Contributor

Just a nit, it may make more sense to set the default args.maxReadLength to 10,000 and remove the if statement. Either way works though.

This comment has been minimized.

@heuermh

heuermh Nov 7, 2018

Member

If it is set to the default here in the cli class, then it is always set here, and will clobber the configuration property if defined on the command line or elsewhere.

@akmorrow13 akmorrow13 merged commit 1ad572b into bigdatagenomics:master Nov 7, 2018

1 check passed

default Merged build finished.
Details
@akmorrow13

This comment has been minimized.

Copy link
Contributor

akmorrow13 commented Nov 7, 2018

Thanks @heuermh!

@heuermh heuermh deleted the heuermh:issue-1992 branch Nov 7, 2018

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment