Codecov Report
@@ Coverage Diff @@
## master #76 +/- ##
==========================================
- Coverage 87.05% 86.85% -0.21%
==========================================
Files 3 3
Lines 340 350 +10
Branches 6 10 +4
==========================================
+ Hits 296 304 +8
- Misses 44 46 +2
def normalizedBuildId(): Option[String] = {
  `environment.build`.flatMap(_.buildId) match {
    case Some(buildId: String) => {
      val buildIdDay = buildId.slice(0, 6).toString()
This is taking the YYYYMM of the build; you need YYYYMMDD.
Right you are
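To make the difference concrete, here is a small sketch of the two slices. The sample build ID is hypothetical and is assumed to follow a YYYYMMDDhhmmss layout; the object name is illustrative only.

```scala
object BuildIdSlice {
  // Hypothetical build ID, assumed to follow the YYYYMMDDhhmmss layout.
  val buildId: String = "20170602030204"

  // slice(0, 6) keeps only the year and month.
  val yearMonth: String = buildId.slice(0, 6) // "201706"

  // slice(0, 8) keeps the year, month, and day.
  val yearMonthDay: String = buildId.slice(0, 8) // "20170602"
}
```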
import spark.implicits._
val messages = TestUtils.generateMainMessages(
  1, Some(Map(
    "environment.build" -> """{"buildId": "20170602"""",
These tests are testing the case of a build after a submission date. We should be testing the opposite - a build BEFORE a submission date; specifically: a build < 6 months before a submission_date, and a build > 6 months before a submission date. Any builds after a submission_date should be ignored, since that makes no sense!
Good catch, I also found another bug while fixing this 😄
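The cases described in this thread could be expressed as a small predicate. This is only a sketch under stated assumptions: `isRecentBuild` is a hypothetical helper, not the function in this PR; it uses `java.time` instead of Joda-Time so that it runs without extra dependencies; and it assumes a build dated on the submission date counts as valid, while a build exactly six months old does not.

```scala
import java.time.LocalDate

object BuildWindow {
  // Hypothetical helper: true when the build date falls within the
  // `months` months leading up to (and including) the submission date.
  def isRecentBuild(
      buildDate: LocalDate,
      submissionDate: LocalDate,
      months: Int = 6
  ): Boolean = {
    val windowStart = submissionDate.minusMonths(months.toLong)
    // Builds dated after the submission date are ignored.
    !buildDate.isAfter(submissionDate) && buildDate.isAfter(windowStart)
  }
}
```

With a submission date of 2017-06-02, a build from 2017-03-01 passes, while builds from 2016-10-01 (older than six months) and 2017-07-01 (after the submission) are both rejected.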
This is RFAL
4aa35ae to fd82386
case Some(buildId: String) => {
  val buildIdDay = buildId.slice(0, 8).toString()
  val buildDateFormat = DateTimeFormat.forPattern("yyyyMMdd")
  val buildDateTime = buildDateFormat.parseDateTime(buildIdDay)
This will fail on improper dates, which our schema validation [0] doesn't necessarily catch. For example, 00000000 is validated, but it will cause this to fail with org.joda.time.IllegalFieldValueException: Cannot parse "00000000": Value 0 for monthOfYear must be in the range [1,12].
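One way to guard against this is to wrap the parse in `scala.util.Try` and treat failures as a missing build date. The sketch below is an assumption about how that might look, not the fix that landed in this PR: `parseBuildDate` is a hypothetical name, and it uses `java.time` rather than Joda-Time so the example has no external dependencies.

```scala
import java.time.LocalDate
import java.time.format.DateTimeFormatter
import scala.util.Try

object SafeBuildDate {
  private val buildDateFormat = DateTimeFormatter.ofPattern("yyyyMMdd")

  // Hypothetical helper: returns None instead of throwing when the first
  // eight characters of the build ID cannot be parsed as a date.
  def parseBuildDate(buildId: String): Option[LocalDate] =
    Try(LocalDate.parse(buildId.slice(0, 8), buildDateFormat)).toOption
}
```

Returning an `Option` lets the caller drop pings with unparseable build IDs rather than failing the whole job.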
This will reduce the number of aggregates we generate, which should speed up aggregation and reduce the size of the output parquet file.
bccb50b to 0d1a386
🚢 🤡