Generate Hive CREATE TABLE statements from JSON data

ricaportela/json2hive

json2hive

json2hive is a command-line utility that automatically generates CREATE TABLE statements for Hive tables backed by JSON data.

Features

  • Automatically infers the schema of JSON data by analysing the records
  • Supports external and managed Hive tables
  • Can be used as a command-line utility or programmatically
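
To make the first two features concrete, here is a minimal, self-contained sketch of what schema inference and statement generation could look like. It is not json2hive's implementation: the helper names `hive_type`, `infer_columns`, and `create_table_sql`, the Python-to-Hive type mapping, the rule that conflicting types fall back to STRING, and the use of the `org.apache.hive.hcatalog.data.JsonSerDe` SerDe are all assumptions made for illustration.

```python
# Illustrative only: a toy version of schema inference, not json2hive's code.

def hive_type(value):
    """Map a decoded JSON value to a Hive type name (assumed mapping)."""
    if isinstance(value, bool):  # test bool first: bool is a subclass of int
        return "BOOLEAN"
    if isinstance(value, int):
        return "BIGINT"
    if isinstance(value, float):
        return "DOUBLE"
    if isinstance(value, list) and value:
        # Assume homogeneous arrays and use the first element's type.
        return f"ARRAY<{hive_type(value[0])}>"
    if isinstance(value, dict) and value:
        fields = ", ".join(f"{k}:{hive_type(v)}" for k, v in value.items())
        return f"STRUCT<{fields}>"
    return "STRING"  # strings, empty containers, and anything unrecognised


def infer_columns(records):
    """Collect column name -> Hive type over all records.

    Null values are ignored; if two records disagree on a column's type,
    the column falls back to STRING (an assumption of this sketch).
    """
    columns = {}
    for record in records:
        for key, value in record.items():
            if value is None:
                continue
            new_type = hive_type(value)
            old_type = columns.setdefault(key, new_type)
            if old_type != new_type:
                columns[key] = "STRING"
    return columns


def create_table_sql(table, columns, managed=True, location=None):
    """Build a CREATE TABLE statement for a managed or external table."""
    keyword = "CREATE TABLE" if managed else "CREATE EXTERNAL TABLE"
    cols = ",\n".join(f"  `{name}` {type_}" for name, type_ in columns.items())
    sql = (
        f"{keyword} `{table}` (\n{cols}\n)\n"
        "ROW FORMAT SERDE 'org.apache.hive.hcatalog.data.JsonSerDe'"
    )
    if not managed and location:
        # External tables point at data that Hive does not own.
        sql += f"\nLOCATION '{location}'"
    return sql


records = [{"name": "John", "age": 25}, {"name": "Mary", "age": 23}]
columns = infer_columns(records)
print(create_table_sql("example", columns, managed=True))
```

The only difference between the two table kinds in this sketch is the `EXTERNAL` keyword and the optional `LOCATION` clause; the statements json2hive actually emits may differ.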

Installation

You can install json2hive using pip:

$ pip install json2hive

It is strongly recommended that you install json2hive inside a virtual environment!

Usage

On the Command Line

Run the following command to see the available options and usage instructions:

$ json2hive --help

As a Library

from json2hive.utils import infer_schema
from json2hive.generators import generate_json_table_statement

# Infer the schema from a list of objects; these could be the result of json.loads(...)
object1 = {'name': 'John', 'age': 25}
object2 = {'name': 'Mary', 'age': 23}
schema = infer_schema([object1, object2])

# Generate CREATE TABLE statement
statement = generate_json_table_statement('example', schema, managed=True)
print(statement)
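
In practice the objects usually come from a file rather than from literals. The sketch below uses only the standard library and assumes the input holds one JSON object per line (JSON Lines); `load_records` is a hypothetical helper, not part of json2hive. The resulting list could then be passed to `infer_schema` as in the example above.

```python
import json
import os
import tempfile


def load_records(path):
    """Read one JSON object per line, skipping blank lines (hypothetical helper)."""
    records = []
    with open(path, encoding="utf-8") as handle:
        for line in handle:
            line = line.strip()
            if line:
                records.append(json.loads(line))
    return records


# Demo with a temporary file so the snippet runs on its own.
with tempfile.NamedTemporaryFile(
    "w", suffix=".jsonl", delete=False, encoding="utf-8"
) as tmp:
    tmp.write('{"name": "John", "age": 25}\n\n{"name": "Mary", "age": 23}\n')
    demo_path = tmp.name

records = load_records(demo_path)
os.remove(demo_path)  # clean up the temporary file
print(records)
```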
