Printing execution plan for merge operation with python API #893

Closed
dexianta opened this issue Jan 12, 2022 · 3 comments
Labels
acknowledged This issue has been read and acknowledged by Delta admins enhancement New feature or request

Comments

@dexianta

dexianta commented Jan 12, 2022

Hi, is there a way to see the execution plan for a merge built with the Python API?
e.g. deltaTable.alias('t1').merge(update_df.alias('t2'), 't1.id = t2.id').explain()

The example I saw from Databricks creates a temp view and runs the merge from SQL, as in spark.sql('EXPLAIN ...').
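
For anyone landing here, the temp-view route mentioned above could be sketched roughly as follows. This is only a sketch, not an official Delta API: the helper names build_explain_merge_sql and explain_merge, the view name updates_tmp, and the UPDATE SET * / INSERT * clauses are all assumptions made for illustration, and it assumes that EXPLAIN can wrap a MERGE INTO statement in your Spark and Delta setup.

```python
def build_explain_merge_sql(target, source_view, condition, mode=""):
    """Build an EXPLAIN statement that wraps a MERGE INTO statement.

    All names here are hypothetical helpers, not part of the Delta API.
    `mode` may be empty or a Spark EXPLAIN mode such as "FORMATTED".
    """
    prefix = f"EXPLAIN {mode}".strip()
    return (
        f"{prefix} MERGE INTO {target} AS t1 "
        f"USING {source_view} AS t2 ON {condition} "
        "WHEN MATCHED THEN UPDATE SET * "
        "WHEN NOT MATCHED THEN INSERT *"
    )


def explain_merge(spark, update_df, target, condition, source_view="updates_tmp"):
    """Register the source DataFrame as a temp view and return the plan text.

    Assumes `spark` is a SparkSession configured with the Delta extensions
    and that `target` names an existing Delta table.
    """
    update_df.createOrReplaceTempView(source_view)
    sql = build_explain_merge_sql(target, source_view, condition)
    # spark.sql on an EXPLAIN statement returns a one-row DataFrame
    # whose single column holds the plan as a string.
    return spark.sql(sql).first()[0]
```

Usage would look something like print(explain_merge(spark, update_df, 'my_table', 't1.id = t2.id')), where my_table is a placeholder for your own target table. Keeping the SQL construction in its own function makes it possible to inspect the statement without a running Spark session.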

@zsxwing zsxwing added the enhancement New feature or request label Jan 12, 2022
@zsxwing
Member

zsxwing commented Jan 12, 2022

Yep. This sounds like a useful API to add.

@zsxwing zsxwing added the acknowledged This issue has been read and acknowledged by Delta admins label Jan 12, 2022
@stikkireddy
Contributor

@zsxwing would something like this work?

  def explain(): Unit = {
    val sparkSession = targetTable.toDF.sparkSession
    val mergeIntoCommand = getMergeIntoCommand(sparkSession)

    // only simple is needed to describe the merge into command
    val explain = ExplainCommand(mergeIntoCommand, mode = ExplainMode.fromString("simple"))
    val plan = SparkSession.active.sessionState.executePlan(explain).executedPlan.executeCollect()
      .map(_.getString(0))
      .mkString("\n")
    // scalastyle:off println
    println(plan)
    // scalastyle:on println
  }

The snippet above is based on Spark's Dataset.explain().

where the output looks like the following:

== Physical Plan ==
Execute MergeIntoCommand
   +- MergeIntoCommand Project [_1#468 AS key2#473, _2#469 AS value2#474], Relation [key1#477,value1#478] parquet, Delta[version=0, ... :/private/var/folders/gy/dy_2wchs6wz7223r9nzckf980000gq/T/spark-5bd85319-3f6e-491e-9c6b-2cd52ce9481f], (key1#477 = key2#473), [Update [actions: [`key1` = key2#473, `value1` = value2#474]]], [Insert [actions: [`key1` = key2#473, `value1` = value2#474]]], StructType(StructField(key1,IntegerType,true), StructField(value1,IntegerType,true))

Would this be sufficient for an explain command?

@zsxwing
Member

zsxwing commented Mar 22, 2022

As I mentioned in #910 (comment), we are not going to build this right now. Closing the ticket.

@zsxwing zsxwing closed this as completed Mar 22, 2022