Member

psobot commented Mar 4, 2019

Given a Scio job that uses the same input multiple times:

import com.spotify.scio._

object JobWithDuplicateInput {
  def main(cmdlineArgs: Array[String]): Unit = {
    val (sc, args) = ContextAndArgs(cmdlineArgs)
    // The same input path is read twice, which triggers the error in tests.
    sc.textFile(args("input"))
    sc.textFile(args("input"))
    sc.close()
  }
}

...Scio's TestDataManager currently throws an error under test that reads "There already exists test input for...", which implies that the test itself registers the same test input more than once. This is misleading, as the issue is actually with the job itself, not the test.

This PR clarifies the error message by changing it to s"Test input $key has already been used once.", which is hopefully clearer for users who encounter this error.
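
For readers unfamiliar with the check, here is a minimal, self-contained sketch of the kind of guard involved. `UsedInputs` and `markUsed` are hypothetical names made up for illustration; this is not Scio's actual TestDataManager code, only an assumption about how a "used once" check can be expressed with `require`.

```scala
import scala.collection.mutable
import scala.util.Try

// Hypothetical stand-in for the bookkeeping described above; not Scio's actual class.
final class UsedInputs {
  private val used = mutable.Set.empty[String]

  // Marks `key` as consumed; fails if the job has already read it once.
  def markUsed(key: String): Unit = {
    require(!used.contains(key), s"Test input $key has already been used once.")
    used += key
  }
}

object UsedInputsDemo {
  def main(args: Array[String]): Unit = {
    val inputs = new UsedInputs
    inputs.markUsed("in.txt")                   // first read succeeds
    val second = Try(inputs.markUsed("in.txt")) // second read of the same key fails
    println(second.failed.map(_.getMessage).getOrElse("no error"))
  }
}
```

Because `require` throws an `IllegalArgumentException`, the message a user sees is prefixed with "requirement failed: ", followed by the text above.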

(cc @clairemcginty, @regadas)

codecov bot commented Mar 4, 2019

Codecov Report

Merging #1720 into master will decrease coverage by 2.58%.
The diff coverage is 100%.

@@            Coverage Diff             @@
##           master    #1720      +/-   ##
==========================================
- Coverage    71.6%   69.02%   -2.59%     
==========================================
  Files         175      175              
  Lines        5364     5359       -5     
  Branches      328      314      -14     
==========================================
- Hits         3841     3699     -142     
- Misses       1523     1660     +137

Impacted Files                                        Coverage Δ
...ala/com/spotify/scio/testing/TestDataManager.scala 100% <100%> (ø) ⬆️
.../spotify/scio/bigquery/client/BigQueryConfig.scala 0% <0%> (-83.34%) ⬇️
...scala/com/spotify/scio/bigquery/client/Cache.scala 0% <0%> (-66.67%) ⬇️
.../spotify/scio/bigquery/BigQueryPartitionUtil.scala 0% <0%> (-58.98%) ⬇️
...scala/com/spotify/scio/bigquery/BigQueryUtil.scala 50% <0%> (-50%) ⬇️
...la/com/spotify/scio/bigquery/client/QueryOps.scala 0.79% <0%> (-48.42%) ⬇️
...com/spotify/scio/bigquery/types/BigQueryType.scala 37.5% <0%> (-25%) ⬇️
...la/com/spotify/scio/bigquery/client/TableOps.scala 0% <0%> (-22.37%) ⬇️
...la/com/spotify/scio/bigquery/client/BigQuery.scala 27.02% <0%> (-10.82%) ⬇️
...n/scala/com/spotify/scio/bigquery/BigQueryIO.scala 24.29% <0%> (-3.74%) ⬇️
... and 2 more

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 00f365e...bffdcdb.

Contributor

clairemcginty left a comment

awesome, thanks for the PR!

-  require(!s.contains(key),
-          s"There already exists test input for $key, currently " +
-            s"registered inputs: ${s.mkString("[", ", ", "]")}")
+  require(!s.contains(key), s"Test input $key has already been used once.")
Contributor

🎨 "used once" -> "read once"?

while you're at it, would you want to make a matching change to TestOutput? :)

Member Author

👍

}

-object JobWitDuplicateInput {
+object JobWithDuplicateInput {
Contributor

heh thanks for fixing the typos!

Member Author

psobot commented Mar 5, 2019

Changes made - thanks for the review, @clairemcginty!

@clairemcginty
Contributor

thanks for the contribution! looks great 👍

clairemcginty merged commit 71c8f0c into spotify:master Mar 5, 2019