Map IntervalList format column four to feature name #1159

Closed
wants to merge 2 commits into
from

@heuermh
Member

heuermh commented Sep 9, 2016

Fixes #1152, #1168

@AmplabJenkins


AmplabJenkins Sep 9, 2016

Test FAILed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/ADAM-prb/1477/

Build result: FAILURE

GitHub pull request #1159 of commit e1b6772 automatically merged.
Notifying endpoint 'HTTP:https://webhooks.gitter.im/e/ac8bb6e9f53357bc8aa8'
[EnvInject] - Loading node environment variables.
Building remotely on amp-jenkins-worker-05 (centos spark-test) in workspace /home/jenkins/workspace/ADAM-prb
 > /home/jenkins/git2/bin/git rev-parse --is-inside-work-tree # timeout=10
Fetching changes from the remote Git repository
 > /home/jenkins/git2/bin/git config remote.origin.url https://github.com/bigdatagenomics/adam.git # timeout=10
Fetching upstream changes from https://github.com/bigdatagenomics/adam.git
 > /home/jenkins/git2/bin/git --version # timeout=10
 > /home/jenkins/git2/bin/git -c core.askpass=true fetch --tags --progress https://github.com/bigdatagenomics/adam.git +refs/pull/:refs/remotes/origin/pr/ # timeout=15
 > /home/jenkins/git2/bin/git rev-parse origin/pr/1159/merge^{commit} # timeout=10
 > /home/jenkins/git2/bin/git branch -a --contains 6d7e26f # timeout=10
 > /home/jenkins/git2/bin/git rev-parse remotes/origin/pr/1159/merge^{commit} # timeout=10
Checking out Revision 6d7e26f (origin/pr/1159/merge)
 > /home/jenkins/git2/bin/git config core.sparsecheckout # timeout=10
 > /home/jenkins/git2/bin/git checkout -f 6d7e26f9c46f4086de5bd3974336b0ae56d35520
First time build. Skipping changelog.
Triggering ADAM-prb ? 2.6.0,2.11,1.5.2,centos
Triggering ADAM-prb ? 2.6.0,2.10,1.5.2,centos
Touchstone configurations resulted in FAILURE, so aborting...
Notifying endpoint 'HTTP:https://webhooks.gitter.im/e/ac8bb6e9f53357bc8aa8'
Test FAILed.


if (fields.length < 8 || fields.length > 9) {
- log.warn("Empty or invalid GTF/GFF2 line: {}", line)
- return Seq()
+ if (stringency == ValidationStringency.STRICT) {


@fnothaft

fnothaft Sep 12, 2016

Member

As an aside, I feel like we've got enough of this pattern (print error message if STRICT, log if LENIENT, and return None if not STRICT) around that we might as well wrap it up into a function and factor it out.


@heuermh

heuermh Sep 13, 2016

Member

The way I'm seeing it, if it were made a function, the error messages would always be formatted.


@fnothaft

fnothaft Sep 13, 2016

Member

How so?
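
For readers following this thread: one possible way to avoid always formatting the message inside a shared function is a Scala by-name parameter, which defers the format call until the message is actually used. The block below is only a minimal sketch under stated assumptions. `Stringency` is a stand-in for htsjdk's ValidationStringency so the example is self-contained, and `StringencyHandler`, `handleInvalid`, and the `warn` callback are hypothetical names that do not come from this pull request or from ADAM.

```scala
// Stand-in for htsjdk's ValidationStringency, so the sketch is self-contained.
object Stringency extends Enumeration {
  val STRICT, LENIENT, SILENT = Value
}

// Hypothetical helper object; not part of ADAM.
object StringencyHandler {
  /**
   * `message` is a by-name parameter (=> String), so the expression passed
   * at the call site is evaluated only when `message` is referenced below.
   */
  def handleInvalid[T](stringency: Stringency.Value,
                       warn: String => Unit)(message: => String): Option[T] = {
    stringency match {
      // STRICT: build the message and fail
      case Stringency.STRICT  => throw new IllegalArgumentException(message)
      // LENIENT: build the message, report it, and skip the record
      case Stringency.LENIENT => warn(message); None
      // SILENT: the message expression is never evaluated
      case _                  => None
    }
  }
}
```

Under these assumptions, a call like `StringencyHandler.handleInvalid[String](stringency, m => println(m))("Empty or invalid GTF/GFF2 line: %s".format(line))` would run the `format` call only in the STRICT and LENIENT cases.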


@AmplabJenkins


AmplabJenkins Sep 12, 2016

Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/ADAM-prb/1487/
Test PASSed.


- fb.setEnd(fields(2).toLong)
+ val f = Feature.newBuilder()
+ .setContigName(fields(0))
+ .setStart(fields(1).toLong) // NarrowPeak ranges are 0-based


@fnothaft

fnothaft Sep 12, 2016

Member

We should try/catch around the fields(x).toLong/fields(x).toDouble calls, both here and below and above and everywhere. We could factor it out like so:

def checkAndSet[T](field: String,
  convFn: String => T,
  setFn: T => Unit,
  stringency: ValidationStringency): Unit = {
  try {
    // convert the raw field, then hand the result to the builder setter
    val t = convFn(field)
    setFn(t)
  } catch {
    case e: Throwable => {
      if (stringency == ValidationStringency.LENIENT) {
        log.warn("Failed to convert %s.".format(field))
      } else if (stringency == ValidationStringency.STRICT) {
        throw new IllegalArgumentException("Setting field from %s failed with %s.".format(
          field,
          e.getMessage))
      }
      // SILENT: ignore the failure and leave the field unset
    }
  }
}

Perhaps we factor this out into a trait:

sealed trait FeatureConverterCanBeValidated {
  val stringency: ValidationStringency

  def validateOrNone(...): Option[Feature]

  def checkAndSet[T](field: String,
    convFn: String => T,
    setFn: T => Unit)
}

All of the classes in here could be turned into case classes that extend that trait. What do you think? Should you proceed, the trait name I proposed is terrible, so please change it.
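
To make the trait suggestion above more concrete, here is a rough, self-contained sketch of how it might look. Everything in it is an assumption for illustration: `Stringency` stands in for htsjdk's ValidationStringency, `SimpleBuilder` stands in for the Feature builder, and `FieldValidation`, `ExampleParser`, and `warn` are hypothetical names. It also catches `NonFatal` rather than `Throwable`, which is a substitution made here and not something proposed in the review.

```scala
import scala.util.control.NonFatal

// Stand-in for htsjdk's ValidationStringency, so the sketch is self-contained.
object Stringency extends Enumeration {
  val STRICT, LENIENT, SILENT = Value
}

// Hypothetical trait name; the review asked for the proposed name to be changed.
trait FieldValidation {
  def stringency: Stringency.Value

  // Where LENIENT warnings go; real code would presumably use its logger.
  def warn(message: String): Unit

  /** Converts `field` and passes the result to `setFn`, honoring stringency. */
  def checkAndSet[T](field: String, convFn: String => T, setFn: T => Unit): Unit = {
    try {
      setFn(convFn(field))
    } catch {
      case NonFatal(e) =>
        stringency match {
          case Stringency.STRICT =>
            throw new IllegalArgumentException(
              "Setting field from %s failed with %s.".format(field, e.getMessage), e)
          case Stringency.LENIENT =>
            warn("Failed to convert %s.".format(field))
          case _ => () // SILENT: leave the field unset without reporting
        }
    }
  }
}

// Minimal stand-in for a Feature builder, with only the two fields used here.
final class SimpleBuilder {
  var start: Option[Long] = None
  var end: Option[Long] = None
}

// Example converter written as a case class that extends the trait.
case class ExampleParser(stringency: Stringency.Value) extends FieldValidation {
  val warnings = scala.collection.mutable.ArrayBuffer.empty[String]
  def warn(message: String): Unit = warnings += message

  def parse(fields: Array[String]): SimpleBuilder = {
    val b = new SimpleBuilder
    checkAndSet[Long](fields(1), _.toLong, v => b.start = Some(v))
    checkAndSet[Long](fields(2), _.toLong, v => b.end = Some(v))
    b
  }
}
```

With this sketch, `ExampleParser(Stringency.LENIENT).parse(Array("chr1", "100", "abc"))` would set `start`, leave `end` unset, and record one warning, while the same input under `Stringency.STRICT` would throw an IllegalArgumentException.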


@heuermh

heuermh Sep 13, 2016

Member

Will check to see what htsjdk does with NumberFormatExceptions and validation stringency.


@heuermh

heuermh Sep 13, 2016

Member

This seems too fiddly to me, so I've punted for now.


@fnothaft


fnothaft Sep 12, 2016

Member

Generally LGTM! Thanks for taking this on @heuermh. I've dropped a variety of line notes inline.


@heuermh


heuermh Sep 13, 2016

Member

Addressed some review comments and rebased.


@AmplabJenkins


AmplabJenkins Sep 13, 2016

Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/ADAM-prb/1490/
Test PASSed.


@fnothaft


fnothaft Sep 15, 2016

Member

LGTM! I will leave this open until Friday morning to wait for any further review comments.


@AmplabJenkins


AmplabJenkins Sep 15, 2016

Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/ADAM-prb/1495/
Test PASSed.


@heuermh heuermh changed the title from "Map IntervalList format column four to feature name" to "[ADAM-1152] Map IntervalList format column four to feature name" Sep 20, 2016

@heuermh heuermh changed the title from "[ADAM-1152] Map IntervalList format column four to feature name" to "Map IntervalList format column four to feature name" Sep 20, 2016

@fnothaft


fnothaft Sep 26, 2016

Member

Thanks @heuermh! Merged as 53b8f48 and dff43ef.


@fnothaft fnothaft closed this Sep 26, 2016

@heuermh heuermh deleted the heuermh:interval-list-name branch Sep 26, 2016

@heuermh


heuermh Sep 26, 2016

Member

Thanks!
