Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[SPARK-46623][CORE][MLLIB][SQL] Replace SimpleDateFormat with DateTimeFormatter #44616

Closed
wants to merge 2 commits into from

Conversation

beliefer
Copy link
Contributor

@beliefer beliefer commented Jan 7, 2024

What changes were proposed in this pull request?

This PR propose to replace SimpleDateFormat with DateTimeFormatter.

Why are the changes needed?

According to the javadoc of SimpleDateFormat, it recommended to use DateTimeFormatter too.
1
In addition, DateTimeFormatter have better performance than SimpleDateFormat too.

Note: SimpleDateFormat and DateTimeFormatter are not completely compatible, for example, the formats supported by parse are not exactly the same.

Does this PR introduce any user-facing change?

'No'.

How was this patch tested?

GA and manual test.

Was this patch authored or co-authored using generative AI tooling?

'No'.

@github-actions github-actions bot added the CORE label Jan 7, 2024
@HyukjinKwon
Copy link
Member

cc @MaxGekk BTW, we would have to replace FastDateFormat too if possible.

@beliefer beliefer changed the title [WIP][CORE] Replace SimpleDateFormat with DateTimeFormatter [WIP][CORE][MLLIB][SQL] Replace SimpleDateFormat with DateTimeFormatter Jan 8, 2024
@beliefer beliefer changed the title [WIP][CORE][MLLIB][SQL] Replace SimpleDateFormat with DateTimeFormatter [SPARK-46623][CORE][MLLIB][SQL] Replace SimpleDateFormat with DateTimeFormatter Jan 8, 2024
@beliefer
Copy link
Contributor Author

beliefer commented Jan 8, 2024

ping @dongjoon-hyun @srowen @LuciferYang

@beliefer
Copy link
Contributor Author

cc @mridulm

Copy link
Contributor

@mridulm mridulm left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+CC @srowen as well

@@ -58,7 +59,10 @@ private[deploy] class Master(
private val appIdPattern = conf.get(APP_ID_PATTERN)

// For application IDs
private def createDateFormat = new SimpleDateFormat("yyyyMMddHHmmss", Locale.US)
private lazy val dateTimeFormatter =
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: Does not need to be lazy and can be moved to object Master, same in other places as well.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1

Comment on lines 107 to 113
@transient
private val dateTimeFormatter =
DateTimeFormatter
.ofPattern("yyyyMMddHHmmss", Locale.US)
.withZone(ZoneId.systemDefault())

private val jobTrackerId: String = dateTimeFormatter.format(new Date().toInstant)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We dont need this field

Suggested change
@transient
private val dateTimeFormatter =
DateTimeFormatter
.ofPattern("yyyyMMddHHmmss", Locale.US)
.withZone(ZoneId.systemDefault())
private val jobTrackerId: String = dateTimeFormatter.format(new Date().toInstant)
private val jobTrackerId: String = {
val dateTimeFormatter =
DateTimeFormatter
.ofPattern("yyyyMMddHHmmss", Locale.US)
.withZone(ZoneId.systemDefault())
dateTimeFormatter.format(new Date().toInstant)
}

Copy link
Contributor

@mridulm mridulm left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good to me. +CC @LuciferYang as well.

@@ -821,7 +820,7 @@ private[deploy] class Worker(
}

private def generateWorkerId(): String = {
workerIdPattern.format(createDateFormat.format(new Date), host, port)
workerIdPattern.format(Worker.DATE_TIME_FORMATTER.format(new Date().toInstant), host, port)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it possible to directly use Instant.now() instead of new Date().toInstant?

DateTimeFormatter
.ofPattern("yyyyMMddHHmmss", Locale.US)
.withZone(ZoneId.systemDefault())
dateTimeFormatter.format(new Date().toInstant)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ditto

@@ -34,7 +40,7 @@ private[mllib] trait PMMLModelExport {
val version = getClass.getPackage.getImplementationVersion
val app = new Application("Apache Spark MLlib").setVersion(version)
val timestamp = new Timestamp()
.addContent(new SimpleDateFormat("yyyy-MM-dd'T'HH:mm:ss", Locale.US).format(new Date()))
.addContent(DATE_TIME_FORMATTER.format(new Date().toInstant))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ditto

Timestamp.valueOf(
TextSocketReader.DATE_FORMAT.format(Calendar.getInstance().getTime()))
Timestamp.valueOf(TextSocketReader.DATE_TIME_FORMATTER.format(
Calendar.getInstance().getTime().toInstant))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it possible to directly use Instant.now() instead of new alendar.getInstance().getTime().toInstant?

@@ -41,6 +42,11 @@ class HiveTempPath(session: SparkSession, val hadoopConf: Configuration, path: P

lazy val externalTempPath: Path = getExternalTmpPath(path)

private lazy val dateTimeFormatter =
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This definition can also be moved to object HiveTempPath

@beliefer
Copy link
Contributor Author

The GA failure is unrelated.

@mridulm mridulm closed this in bede614 Jan 18, 2024
@mridulm
Copy link
Contributor

mridulm commented Jan 18, 2024

Merged to master.
Thanks for fixing this @beliefer !
Thanks for the review @LuciferYang :-)

@mridulm
Copy link
Contributor

mridulm commented Jan 18, 2024

I had to manually resolve the jira @beliefer - can you please check if I assigned it to you ? There were two id's with same name.

@beliefer
Copy link
Contributor Author

@mridulm @LuciferYang @HyukjinKwon Thank you!

@beliefer
Copy link
Contributor Author

I had to manually resolve the jira @beliefer - can you please check if I assigned it to you ? There were two id's with same name.

I checked. It's OK. Thank you!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
4 participants