This repository has been archived by the owner on Feb 2, 2023. It is now read-only.

Convert JSON files to Apache Arrow.

License

Apache-2.0 (LICENSE_APACHE.txt), MIT (LICENSE_MIT.txt)

domoritz/json2arrow


JSON to Arrow

This repo is archived; the code has moved to Arrow CLI Tools.


Convert JSON files to Apache Arrow. You may also be interested in csv2arrow, json2parquet, or csv2parquet.

Installation

Download prebuilt binaries

You can get the latest releases from https://github.com/domoritz/json2arrow/releases/.

With Cargo

cargo install json2arrow

Usage

Usage: json2arrow [OPTIONS] <JSON> [ARROW]

Arguments:
  <JSON>   Input JSON file
  [ARROW]  Output file, stdout if not present

Options:
  -s, --schema-file <SCHEMA_FILE>
          File with Arrow schema in JSON format
  -m, --max-read-records <MAX_READ_RECORDS>
          The number of records to infer the schema from. All rows if not present. Setting max-read-records to zero will stop schema inference and all columns will be string typed
  -p, --print-schema
          Print the schema to stderr
  -n, --dry
          Only print the schema
  -h, --help
          Print help information
  -V, --version
          Print version information

The --schema-file option expects the same JSON schema format that --dry and --print-schema print.

Limitations

Since we use the Arrow JSON loader, we are limited to what it supports. Right now, it supports only line-delimited JSON files, with one object per line:

{ "a": 42, "b": true }
{ "a": 12, "b": false }
{ "a": 7, "b": true }
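
If your data is stored as a single top-level JSON array rather than one object per line, it has to be rewritten before json2arrow can read it. The following is a minimal sketch of that preprocessing step using only the Python standard library. The function name `json_array_to_lines` is hypothetical and not part of json2arrow, and the sketch assumes the whole array fits in memory.

```python
import io
import json


def json_array_to_lines(src, dst):
    """Rewrite a JSON array of objects as line-delimited JSON; return the record count."""
    records = json.load(src)
    if not isinstance(records, list):
        raise ValueError("expected a top-level JSON array")
    for record in records:
        # json.dumps without `indent` never emits a raw newline,
        # so each record stays on exactly one line.
        dst.write(json.dumps(record, separators=(",", ":")) + "\n")
    return len(records)


# Demo with in-memory files; real code would pass open file handles instead.
src = io.StringIO('[{"a": 42, "b": true}, {"a": 12, "b": false}]')
dst = io.StringIO()
count = json_array_to_lines(src, dst)
print(dst.getvalue(), end="")
```

Writing the result to a file and passing that file as the <JSON> argument should then give the loader input in the layout shown above.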
