Expose Scoring History and Variable Importances on H2OMOJOModel #3212

Closed
exalate-issue-sync bot opened this issue May 22, 2023 · 2 comments

@exalate-issue-sync

Current workaround:

{code:scala}
val model = algo.fit(dataset)

import com.google.gson._
import hex.genmodel.attributes.ModelJsonReader
import hex.genmodel.attributes.Table
import hex.genmodel.attributes.Table.ColumnType
import org.apache.spark.sql.types._
import org.apache.spark.sql.catalyst.expressions.GenericRowWithSchema
import collection.JavaConverters._
import org.apache.spark.sql.{DataFrame, Row}

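// Converts an H2O genmodel Table into a Spark DataFrame.
// Assumes a SparkSession named `spark` is in scope (as in spark-shell).
def tableToDataFrame(table: Table): DataFrame = {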
  val columnTypes = table.getColTypes.map {
    case ColumnType.LONG => LongType
    case ColumnType.INT => IntegerType
    case ColumnType.DOUBLE => DoubleType
    case ColumnType.FLOAT => FloatType
    case ColumnType.STRING => StringType
  }
  val columns = table.getColHeaders.zip(columnTypes).map {
    case (columnName, columnType) => StructField(columnName, columnType, nullable = true)
  }
  val schema = StructType(columns)
  val rows = (0 until table.rows()).map { rowId =>
    val rowData = (0 until table.columns()).map(colId => table.getCell(colId, rowId)).toArray[Any]
    val row: Row = new GenericRowWithSchema(rowData, schema)
    row
  }.asJava
  spark.createDataFrame(rows, schema)
}

// Get JSON string describing the model
val json = model.getModelDetails()
val modelDetails = new GsonBuilder().create().fromJson(json, classOf[JsonObject])

// Extract scoring history
println("Scoring history:")
val scoringHistoryTable = ModelJsonReader.readTable(modelDetails, "scoring_history")
val scoringHistoryDF = tableToDataFrame(scoringHistoryTable)
scoringHistoryDF.show(numRows = 20, truncate = false) // Increase numRows to see the full result

// Extract variable importances
println("Variable importances:")
val variableImportancesTable = ModelJsonReader.readTable(modelDetails, "variable_importances")
val variableImportancesDF = tableToDataFrame(variableImportancesTable)
variableImportancesDF.show(truncate = false)

{code}
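
One possible hardening of the workaround above, offered only as a sketch: a given model may not carry both tables in its details JSON, so the lookup can be wrapped in an `Option` before converting. The helper names `readTableOption` and `showTable` are hypothetical and not part of Sparkling Water or H2O. The sketch assumes `ModelJsonReader.readTable` signals a missing table by returning `null`, which should be verified against the H2O version in use.

```scala
import com.google.gson.JsonObject
import hex.genmodel.attributes.{ModelJsonReader, Table}
import org.apache.spark.sql.DataFrame

// Hypothetical helper: turns an absent table into None.
// Assumption: readTable returns null when the table is missing.
def readTableOption(modelDetails: JsonObject, path: String): Option[Table] =
  Option(ModelJsonReader.readTable(modelDetails, path))

// Hypothetical helper: converts and prints a table only if it exists.
// The converter is passed in so this block does not depend on a particular implementation.
def showTable(modelDetails: JsonObject, path: String)(toDF: Table => DataFrame): Unit =
  readTableOption(modelDetails, path) match {
    case Some(table) => toDF(table).show(truncate = false)
    case None        => println(s"Table '$path' is not present in the model details.")
  }
```

With the definitions from the workaround in scope, a call might look like `showTable(modelDetails, "variable_importances")(tableToDataFrame)`.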

@DinukaH2O

JIRA Issue Migration Info

Jira Issue: SW-2559
Assignee: Marek Novotny
Reporter: Marek Novotny
State: Resolved
Fix Version: 3.32.1.3-1
Attachments: N/A
Development PRs: Available

Linked PRs from JIRA

#2525

@hasithjp
Member

JIRA Issue Migration Info Cont'd

Jira Issue Created Date: 2021-05-04T08:48:14.346-0700
