This repository has been archived by the owner on Feb 4, 2021. It is now read-only.

mozilla/parquet2hive

parquet2hive

Hive import statement generator for Parquet datasets. Supports versioned datasets and schema evolution.

Installing from PyPI

To install this package from PyPI, run:

pip install parquet2hive

Updating the Package on PyPI

To upload the most recent version, run:

python setup.py sdist upload

Using the TestPyPI Servers

You will need a separate account on https://testpypi.python.org. To upload the package to the TestPyPI servers, ensure your ~/.pypirc contains the following:

[distutils]
index-servers=
    pypi
    pypitest

[pypitest]
repository = https://testpypi.python.org/pypi
username = testpypi_username
password = testpypi_password

[pypi]
repository = https://pypi.python.org/pypi
username = pypi_username
password = pypi_password

Upload the code using:

python setup.py sdist upload -r https://testpypi.python.org/pypi

Finally, install the most recent package from the test repository on any machine using:

pip install parquet2hive -i https://testpypi.python.org/pypi

Example usage

parquet2hive s3://telemetry-parquet/longitudinal | bash
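
Because the command above pipes the generated output straight into bash, whatever parquet2hive prints is executed immediately. If you would rather inspect the statements first, the following is a minimal sketch of one way to do that. The STATEMENTS temporary file and the installation check are illustrative assumptions, not part of parquet2hive itself.

```shell
#!/usr/bin/env bash
# Sketch: save the generated statements to a file and review them
# before running them, instead of piping directly into bash.

DATASET="s3://telemetry-parquet/longitudinal"
STATEMENTS="$(mktemp)"   # hypothetical scratch file for the generated output

if command -v parquet2hive >/dev/null 2>&1; then
    # Write the generated import statements to the scratch file.
    if parquet2hive "$DATASET" > "$STATEMENTS"; then
        # Print them for review.
        cat "$STATEMENTS"
        echo "Review the output above, then run: bash $STATEMENTS"
    else
        echo "parquet2hive failed; nothing will be executed" >&2
    fi
else
    echo "parquet2hive is not installed; run: pip install parquet2hive" >&2
fi
```

Once the output looks right, running bash on the saved file has the same effect as the one-line pipeline.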

To see the available command-line arguments, run parquet2hive -h.