Merged
34 changes: 34 additions & 0 deletions site/docs/hive.md
@@ -50,6 +50,40 @@ You should now be able to issue Hive SQL `SELECT` queries using the above table
SELECT * from table_a;
```

#### Using Hive Catalog
Iceberg tables created using `HiveCatalog` are automatically registered with Hive.

##### Create an Iceberg table
The first step is to create an Iceberg table using the Spark/Java/Python API and `HiveCatalog`. For the purposes of this documentation we will assume that the table is called `table_b` and that the table location is `s3://some_path/table_b`. For Iceberg to set up the Hive table correctly for querying, some configuration values need to be set. The two options for this are described below; use whichever suits your use case.

##### Hive Configuration
The value `iceberg.engine.hive.enabled` needs to be set to `true` in the Hive configuration file on the classpath of the application creating the table. This can be done by modifying the relevant `hive-site.xml`. Alternatively, this can be done programmatically like so:
```java
Configuration hadoopConfiguration = spark.sparkContext().hadoopConfiguration();
hadoopConfiguration.set(ConfigProperties.ENGINE_HIVE_ENABLED, "true"); //iceberg.engine.hive.enabled=true
HiveCatalog catalog = new HiveCatalog(hadoopConfiguration);
...
catalog.createTable(tableId, schema, spec);
```
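
For the `hive-site.xml` route mentioned above, the entry would follow the usual Hadoop-style property form. A minimal sketch, assuming the file already has its enclosing `<configuration>` element and that this is the only property being added:

```xml
<!-- Goes inside the existing <configuration> element of hive-site.xml -->
<property>
  <name>iceberg.engine.hive.enabled</name>
  <value>true</value>
</property>
```

The file must be on the classpath of the application that creates the table, as noted above, for the setting to be picked up.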

##### Table Property Configuration
The property `engine.hive.enabled` needs to be set to `true` in the table properties passed in when creating the Iceberg table. This can be done like so:
```java
Map<String, String> tableProperties = new HashMap<>();
tableProperties.put(TableProperties.ENGINE_HIVE_ENABLED, "true"); //engine.hive.enabled=true
catalog.createTable(tableId, schema, spec, tableProperties);
```

#### Query the Iceberg table via Hive
To query a Hive table created by either of the `HiveCatalog` methods described above, you first need to set a Hive configuration value like so:
```sql
SET iceberg.mr.catalog=hive;
```
You should now be able to issue Hive SQL `SELECT` queries using the above table and see the results returned from the underlying Iceberg table. Both the MapReduce and Tez query execution engines are supported.
```sql
SELECT * from table_b;
```

### Features

#### Predicate pushdown