This library provides helpful Hive helper functions for apache spark users.
libraryDependencies += "com.brayanjules" %% "diane" % "0.0.4"
You can find diane releases for different Scala versions:
This function createOrReplaceHiveView
creates a hive view from a delta table. The View will contain all the columns
of the delta table, meaning that it will be like coping the table to a view not filtering or transformations are possible.
Here's how to use the function:
HiveHelpers.createOrReplaceHiveView(viewName = "students",deltaPath = "file:/path/to/your/delta-lake/table",deltaVersion = 100L)
Note that this function will create the hive view based on a specific version of the delta table.
The function getTableType
return the table type(Managed, External or Non-registered) of a given table name. The
return type is a enum value containing the label string.
Here's how to use the function:
HiveHelpers.getTableType(tableName = "students")
The result will be an HiveTableType:
HiveTableType.EXTERNAL(label = "EXTERNAL")
The function registerTable
adds metadata information of a parquet or delta table to the Hive metastore,
this enables it to be queried.
Here is how to use the function:
HiveHelpers.registerTable(tableName = "students",tableLoc = "file:/path/to/your/table", provider = HiveProvider.DELTA)
after that you would be able to query, i.e:
SparkSession.active.sql("select * from students").show
The function allTables
return a dataframe with the description of all the tables that are registered in the Hive metastore
for a specific database. It only supports parquet and delta table for now. This function is really helpful
when we want to explore which table we have registered and their nature. let's see how to use it in an example:
Using an environment with the spark session configured with the connection to your hive metastore, execute the following command:
HiveHelpers.allTables("your_hive_database_name")
The result of executing the previous command will be a table with the following values:
+--------+--------------+--------+------------+----------------+-------------+--------+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
|database|tableName |provider|owner |partitionColumns|bucketColumns|type |schema |detail |
+--------+--------------+--------+------------+----------------+-------------+--------+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
|default |e_new_table |delta |brayan_jules|[] |[] |EXTERNAL|{"type":"struct","fields":[{"name":"num","type":"string","nullable":true,"metadata":{}},{"name":"description","type":"string","nullable":true,"metadata":{}}]} |{tableProperties -> [delta.minReaderVersion=1,delta.minWriterVersion=2]} |
|default |inventory |delta |brayan_jules|[] |[] |EXTERNAL|{"type":"struct","fields":[{"name":"id","type":"integer","nullable":true,"metadata":{}},{"name":"name","type":"string","nullable":true,"metadata":{}},{"name":"description","type":"string","nullable":true,"metadata":{}},{"name":"quantity","type":"integer","nullable":true,"metadata":{}},{"name":"price","type":"double","nullable":true,"metadata":{}},{"name":"vendor_id","type":"integer","nullable":true,"metadata":{}},{"name":"category_id","type":"integer","nullable":true,"metadata":{}}]}|{tableProperties -> [delta.minReaderVersion=1,delta.minWriterVersion=2]} |
|default |lang_num_table|parquet |brayan_jules|[num] |[] |MANAGED |{"type":"struct","fields":[{"name":"description","type":"string","nullable":true,"metadata":{}},{"name":"num","type":"string","nullable":true,"metadata":{}}]} |{inputFormat -> org.apache.hadoop.hive.ql.io.parquet.MapredParquetInputFormat, outputFormat -> org.apache.hadoop.hive.ql.io.parquet.MapredParquetOutputFormat, serdeLibrary -> org.apache.hadoop.hive.ql.io.parquet.serde.ParquetHiveSerDe}|
|default |num_table |delta |brayan_jules|[] |[] |MANAGED |{"type":"struct","fields":[{"name":"value","type":"string","nullable":true,"metadata":{}}]} |{tableProperties -> [delta.minReaderVersion=1,delta.minWriterVersion=2]} |
|default |p_e_new_table |parquet |brayan_jules|[] |[] |EXTERNAL|{"type":"struct","fields":[{"name":"num","type":"string","nullable":true,"metadata":{}},{"name":"description","type":"string","nullable":true,"metadata":{}}]} |{inputFormat -> org.apache.hadoop.hive.ql.io.parquet.MapredParquetInputFormat, outputFormat -> org.apache.hadoop.hive.ql.io.parquet.MapredParquetOutputFormat, serdeLibrary -> org.apache.hadoop.hive.ql.io.parquet.serde.ParquetHiveSerDe}|
+--------+--------------+--------+------------+----------------+-------------+--------+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
We welcome contributions to this project, to contribute checkout our CONTRIBUTING.md file.
- SBT 1.8.2
- Java 8
- Scala 2.12.12
To compile, run
sbt compile
To test, run
sbt test
To generate artifacts, run
sbt package