[SPARK-19721][SS] Good error message for version mismatch in log files #17070

Closed
wants to merge 3 commits into from

lw-lin (Contributor) commented Feb 26, 2017

Problem

There are several places where we write out version identifiers in various logs for structured streaming (usually v1). However, where we check these identifiers on read, a version mismatch produces a confusing error message.

What changes were proposed in this pull request?

This patch makes two major changes:

  1. Added a parseVersion(...) method and, based on it, fixed how the following places check versions (no other place needed this check):
HDFSMetadataLog
  - CompactibleFileStreamLog  ------------> fixed with this patch
    - FileStreamSourceLog  ---------------> inherited the fix of `CompactibleFileStreamLog`
    - FileStreamSinkLog  -----------------> inherited the fix of `CompactibleFileStreamLog`
  - OffsetSeqLog  ------------------------> fixed with this patch
  - anonymous subclass in KafkaSource  ---> fixed with this patch
  2. Changed the type of FileStreamSinkLog.VERSION, FileStreamSourceLog.VERSION, etc. from String to Int, so that newer versions can be identified via version > 1 instead of version != "v1".
    • Note that this does not break backward compatibility: we still write out "v1" and read back "v1".
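
For readers following along, the version check described in the list above can be sketched roughly as follows. This is a standalone illustration, not the code merged in this PR: the `LogVersionSketch` object and the wording of the "malformed" message are assumptions, while the method signature and the unsupported-version message follow the snippets and the exception quoted in this conversation.

```scala
object LogVersionSketch {

  /**
   * Parses a version line such as "v1" and returns its numeric part.
   * Throws IllegalStateException when the text is malformed or when the
   * version is newer than `maxSupportedVersion`.
   */
  def parseVersion(text: String, maxSupportedVersion: Int): Int = {
    // Assumed wording for the malformed case; not taken from the PR.
    def malformed: IllegalStateException = new IllegalStateException(
      s"Log file was malformed: failed to read correct log version from $text.")

    // A valid version line starts with 'v' followed by a positive integer.
    if (text.isEmpty || text(0) != 'v') throw malformed

    val version =
      try {
        text.substring(1).toInt
      } catch {
        case _: NumberFormatException => throw malformed
      }

    if (version <= 0) throw malformed

    // Comparing Ints lets a reader recognize any newer version, not just "not v1".
    if (version > maxSupportedVersion) {
      throw new IllegalStateException(
        s"UnsupportedLogVersion: maximum supported log version " +
          s"is v$maxSupportedVersion, but encountered v$version. The log file was produced " +
          s"by a newer version of Spark and cannot be read by this version. Please upgrade.")
    }
    version
  }
}
```

With this sketch, `LogVersionSketch.parseVersion("v1", 1)` returns 1, while `LogVersionSketch.parseVersion("v99", 1)` throws an IllegalStateException carrying the "Please upgrade" message.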

Exception message with this patch

java.lang.IllegalStateException: Failed to read log file /private/var/folders/nn/82rmvkk568sd8p3p8tb33trw0000gn/T/spark-86867b65-0069-4ef1-b0eb-d8bd258ff5b8/0. UnsupportedLogVersion: maximum supported log version is v1, but encountered v99. The log file was produced by a newer version of Spark and cannot be read by this version. Please upgrade.
	at org.apache.spark.sql.execution.streaming.HDFSMetadataLog.get(HDFSMetadataLog.scala:202)
	at org.apache.spark.sql.execution.streaming.OffsetSeqLogSuite$$anonfun$3$$anonfun$apply$mcV$sp$2.apply(OffsetSeqLogSuite.scala:78)
	at org.apache.spark.sql.execution.streaming.OffsetSeqLogSuite$$anonfun$3$$anonfun$apply$mcV$sp$2.apply(OffsetSeqLogSuite.scala:75)
	at org.apache.spark.sql.test.SQLTestUtils$class.withTempDir(SQLTestUtils.scala:133)
	at org.apache.spark.sql.execution.streaming.OffsetSeqLogSuite.withTempDir(OffsetSeqLogSuite.scala:26)
	at org.apache.spark.sql.execution.streaming.OffsetSeqLogSuite$$anonfun$3.apply$mcV$sp(OffsetSeqLogSuite.scala:75)
	at org.apache.spark.sql.execution.streaming.OffsetSeqLogSuite$$anonfun$3.apply(OffsetSeqLogSuite.scala:75)
	at org.apache.spark.sql.execution.streaming.OffsetSeqLogSuite$$anonfun$3.apply(OffsetSeqLogSuite.scala:75)
	at org.scalatest.Transformer$$anonfun$apply$1.apply$mcV$sp(Transformer.scala:22)
	at org.scalatest.OutcomeOf$class.outcomeOf(OutcomeOf.scala:85)

How was this patch tested?

unit tests

SparkQA commented Feb 26, 2017

Test build #73476 has finished for PR 17070 at commit dcae69b.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

srowen (Member) commented Feb 26, 2017

This is changing a lot of stuff to barely improve an error, and the PR has problems. I don't think this is worthwhile

@@ -226,7 +226,15 @@ class KafkaSourceSuite extends KafkaSourceTest {
source.getOffset.get // Read initial offset
}

- assert(e.getMessage.contains("Please upgrade your Spark"))
+ Seq(

Member

I don't think it's useful to assert about the exact message. Assert that it has key substrings.

Contributor Author

done; thanks!
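
A rough sketch of what asserting on key substrings might look like, written as plain Scala so that it runs without a test framework. The `KeySubstringAssertionSketch` object, `readNewerLog`, and `messageOf` are hypothetical names for illustration only; the message text is the one quoted in the PR description.

```scala
object KeySubstringAssertionSketch {

  // Hypothetical stand-in for the code under test; it fails the same way
  // as the exception quoted in the PR description.
  def readNewerLog(): Unit = throw new IllegalStateException(
    "UnsupportedLogVersion: maximum supported log version is v1, but encountered v99. " +
      "The log file was produced by a newer version of Spark and cannot be read " +
      "by this version. Please upgrade.")

  // Runs `body` and returns the message of the IllegalStateException it throws.
  def messageOf(body: => Unit): String =
    try {
      body
      throw new AssertionError("expected an IllegalStateException")
    } catch {
      case e: IllegalStateException => e.getMessage
    }

  def main(args: Array[String]): Unit = {
    val message = messageOf(readNewerLog())
    // Check the stable, meaningful fragments instead of the full sentence,
    // so that rewording the message does not break the test.
    Seq(
      "maximum supported log version is v1, but encountered v99",
      "produced by a newer version",
      "Please upgrade"
    ).foreach { fragment =>
      assert(message.contains(fragment), s"missing fragment: $fragment")
    }
  }
}
```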

@@ -100,7 +100,8 @@ private[kafka010] class KafkaSource(
override def serialize(metadata: KafkaSourceOffset, out: OutputStream): Unit = {
out.write(0) // A zero byte is written to support Spark 2.1.0 (SPARK-19517)
val writer = new BufferedWriter(new OutputStreamWriter(out, StandardCharsets.UTF_8))
- writer.write(VERSION)
+ writer.write("v" + VERSION)

Member

Write this as one string, or in 3 separate writes if you're worried about efficiency, rather than 2?

Contributor Author

done
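
One possible shape of the "write one string" variant is sketched below. It is an illustration under assumptions, not the merged code: `SerializeSketch`, the `payload` parameter, and the newline separator after the version are all hypothetical; only the zero byte, the writer setup, and the "v" + VERSION prefix come from the hunk above.

```scala
import java.io.{BufferedWriter, OutputStream, OutputStreamWriter}
import java.nio.charset.StandardCharsets

object SerializeSketch {
  val VERSION = 1

  // `payload` is a hypothetical stand-in for the serialized offset.
  def serialize(payload: String, out: OutputStream): Unit = {
    out.write(0) // A zero byte is written to support Spark 2.1.0 (SPARK-19517)
    val writer = new BufferedWriter(new OutputStreamWriter(out, StandardCharsets.UTF_8))
    // The version header is assembled once and written with a single call.
    // The trailing newline is an assumption of this sketch.
    writer.write("v" + VERSION + "\n")
    writer.write(payload)
    writer.flush()
  }
}
```

Since VERSION is now an Int, the "v" prefix is added only at write time, which keeps the on-disk format identical to what was written before.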

@@ -195,6 +196,11 @@ class HDFSMetadataLog[T <: AnyRef : ClassTag](sparkSession: SparkSession, path:
val input = fileManager.open(batchMetadataFile)
try {
Some(deserialize(input))
} catch {
case ise: IllegalStateException =>

Member

Why not just let the exception go?

Contributor Author

The low-level exception does not know the log file's path, so I catch it here and add the path to the error message to give users explicit information.

@@ -18,6 +18,7 @@
package org.apache.spark.sql.execution.streaming

import java.io._
import java.lang.IllegalStateException

Member

You don't need to import from java.lang

Contributor Author

ah, what a simple mistake :-)

s"is v${maxSupportedVersion}, but encountered v$version. The log file was produced " +
s"by a newer version of Spark and cannot be read by this version. Please upgrade.")
}
else {

Member

Put on previous line; no need to use return

Contributor Author

done

private[sql] def parseVersion(text: String, maxSupportedVersion: Int): Int = {
if (text.length > 0 && text(0) == 'v') {
val version =
try { text.substring(1, text.length).toInt }

Member

Brace style is wrong

Contributor Author

done

lw-lin (Contributor Author) commented Feb 26, 2017

@srowen thanks for the comments! I was trying to tackle SPARK-19721. Sorry the summary just said "WIP" without a JIRA number; I'm adding the JIRA number back.

lw-lin changed the title from "[WIP][SS] Good error message for version mismatch in log files" to "[SPARK-19721][SS] Good error message for version mismatch in log files" on Feb 26, 2017

SparkQA commented Feb 26, 2017

Test build #73487 has finished for PR 17070 at commit 18f77b0.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

marmbrus (Contributor) commented

/cc @zsxwing

lw-lin (Contributor Author) commented Mar 2, 2017

@zsxwing would you take a look when you've got a minute? Thanks!

lw-lin (Contributor Author) commented Mar 7, 2017

Jenkins retest this please

SparkQA commented Mar 7, 2017

Test build #74063 has finished for PR 17070 at commit 18f77b0.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

lw-lin (Contributor Author) commented Mar 9, 2017

@zsxwing would you take a look when you've got a minute? Thanks!

zsxwing (Member) left a comment

Looks pretty good. Just one nit.

case ise: IllegalStateException =>
// re-throw the exception with the log file path added
throw new IllegalStateException(
s"Failed to read log file $batchMetadataFile. ${ise.getMessage}")

Member

nit: please also add ise as the cause.

Contributor Author

done; thanks!
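
A minimal sketch of the re-throw with the original exception attached as the cause, assuming a simplified read path. `RethrowWithPathSketch`, `readBatch`, and its parameters are hypothetical; the message format follows the snippet under review, and the two-argument `IllegalStateException(String, Throwable)` constructor is standard Java.

```scala
import java.io.InputStream

object RethrowWithPathSketch {

  // Reads one batch from `input`, adding `path` to any IllegalStateException
  // raised by `deserialize`, which itself does not know the file path.
  def readBatch[T](path: String, input: InputStream)(deserialize: InputStream => T): T = {
    try {
      deserialize(input)
    } catch {
      case ise: IllegalStateException =>
        // Re-throw with the log file path added; passing `ise` as the cause
        // keeps the original stack trace available to the user.
        throw new IllegalStateException(
          s"Failed to read log file $path. ${ise.getMessage}", ise)
    } finally {
      input.close()
    }
  }
}
```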

SparkQA commented Mar 16, 2017

Test build #74648 has finished for PR 17070 at commit 32c0017.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

zsxwing (Member) commented Mar 16, 2017

LGTM. Merging to master and 2.1. Thanks!

zsxwing (Member) commented Mar 16, 2017

@lw-lin there are conflicts with 2.1. Could you submit a new PR for branch-2.1?

asfgit closed this in 2ea214d on Mar 16, 2017

lw-lin (Contributor Author) commented Mar 17, 2017

@zsxwing sure, please see #17327

lw-lin deleted the better-msg branch on March 18, 2017 05:01