@@ -42,16 +42,14 @@ object CommandUtils extends Logging {

```diff
   /** Change statistics after changing data by commands. */
   def updateTableStats(sparkSession: SparkSession, table: CatalogTable): Unit = {
-    if (table.stats.nonEmpty) {
```
@sujith71955 (Contributor, Author) commented on Apr 7, 2019:

Because of this condition check, the insert table command is never able to recalculate the table size, even when the user enables `sparkSession.sessionState.conf.autoSizeUpdateEnabled`. The check holds good only when `autoSizeUpdateEnabled` is false, in which case the size is calculated from the Hadoop relation.

Member:

Thank you for pinging me, @sujith71955. Got it, I'll take a look. BTW, I updated the JIRA ID in the title.

@sujith71955 (Contributor, Author):

Sure, thanks.

```diff
-      val catalog = sparkSession.sessionState.catalog
-      if (sparkSession.sessionState.conf.autoSizeUpdateEnabled) {
-        val newTable = catalog.getTableMetadata(table.identifier)
-        val newSize = CommandUtils.calculateTotalSize(sparkSession, newTable)
-        val newStats = CatalogStatistics(sizeInBytes = newSize)
-        catalog.alterTableStats(table.identifier, Some(newStats))
-      } else {
-        catalog.alterTableStats(table.identifier, None)
-      }
-    }
+    val catalog = sparkSession.sessionState.catalog
+    if (sparkSession.sessionState.conf.autoSizeUpdateEnabled) {
+      val newTable = catalog.getTableMetadata(table.identifier)
+      val newSize = CommandUtils.calculateTotalSize(sparkSession, newTable)
+      val newStats = CatalogStatistics(sizeInBytes = newSize)
+      catalog.alterTableStats(table.identifier, Some(newStats))
+    } else if (table.stats.nonEmpty) {
+      catalog.alterTableStats(table.identifier, None)
+    }
   }
```
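The behavioral change can be summarized as: after the fix, `autoSizeUpdateEnabled` is checked first, so the size is recomputed even for a table that had no stats yet, whereas previously the whole method was a no-op for such tables. Below is a minimal sketch of that control flow in plain Scala; it is a toy model, not Spark's actual API, with `hadStats` standing for `table.stats.nonEmpty` and `autoSizeUpdate` for the config flag.

```scala
// Toy model of updateTableStats' decision logic, before and after the fix.
sealed trait Action
case object Recompute extends Action   // recalculate the size and write new stats
case object Invalidate extends Action  // clear the now-stale stats
case object NoOp extends Action        // leave the catalog untouched

// Old logic: nothing happens unless the table already had stats,
// so auto-update never fires for a table created without them.
def before(hadStats: Boolean, autoSizeUpdate: Boolean): Action =
  if (!hadStats) NoOp
  else if (autoSizeUpdate) Recompute
  else Invalidate

// New logic: auto-update wins regardless of existing stats;
// stale stats are still invalidated when auto-update is off.
def after(hadStats: Boolean, autoSizeUpdate: Boolean): Action =
  if (autoSizeUpdate) Recompute
  else if (hadStats) Invalidate
  else NoOp
```

For example, `before(hadStats = false, autoSizeUpdate = true)` yields `NoOp` (the reported bug), while `after` yields `Recompute`; the other three input combinations behave identically in both versions.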

@@ -337,6 +337,26 @@ class StatisticsCollectionSuite extends StatisticsCollectionTestBase with Shared
```diff
     }
   }

+  test("auto gather stats after insert command") {
+    val table = "change_stats_insert_datasource_table"
+    Seq(false, true).foreach { autoUpdate =>
+      withSQLConf(SQLConf.AUTO_SIZE_UPDATE_ENABLED.key -> autoUpdate.toString) {
```
Member:

Thank you for updating, @sujith71955. Here are two issues:

1. We need to test both cases by using `Seq(false, true).foreach { autoUpdate =>`.
2. The current indentation is wrong at line 343.

If you fix (1), (2) will be fixed together.

@sujith71955 (Contributor, Author):

Handled the comment, thanks.

```diff
+        withTable(table) {
+          sql(s"CREATE TABLE $table (i int, j string) USING PARQUET")
+          // insert into command
+          sql(s"INSERT INTO TABLE $table SELECT 1, 'abc'")
+          val stats = getCatalogTable(table).stats
+          if (autoUpdate) {
+            assert(stats.isDefined)
+            assert(stats.get.sizeInBytes >= 0)
+          } else {
+            assert(stats.isEmpty)
+          }
+        }
+      }
+    }
+  }
```

```diff
   test("invalidation of tableRelationCache after inserts") {
     val table = "invalidate_catalog_cache_table"
     Seq(false, true).foreach { autoUpdate =>
```