
Railgun

Description

Railgun is a simple and fast data processing tool built on go-dfl, go-reader, and go-simple-serializer (GSS).

Railgun uses the Dynamic Filter Language through go-dfl. See the *_test files in the dfl source folder on GitHub for comprehensive examples of the syntax.

go-reader can read from stdin, http/https, the local filesystem, AWS S3, and HDFS.

go-simple-serializer (GSS) supports bson, csv, tsv, hcl, hcl2, json, jsonl, properties, toml, and yaml. The hcl and hcl2 implementations are fragile and still in alpha.
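
To make one format pair concrete, here is a minimal JavaScript sketch of what converting jsonl to json amounts to: each non-empty line is parsed as one object and the results are collected into a single array. This is not GSS code; the function name jsonlToArray and the sample records are hypothetical.

```javascript
// Hypothetical helper, not part of GSS or railgun: parse JSON Lines text
// into an array of objects, skipping blank lines.
function jsonlToArray(text) {
  return text
    .split(/\r?\n/)
    .filter(line => line.trim() !== "")
    .map(line => JSON.parse(line));
}

// Made-up sample input: one JSON object per line.
const sample = '{"name":"a","amenity":"cafe"}\n{"name":"b","amenity":"restaurant"}\n';

// Print the records as a single JSON array.
console.log(JSON.stringify(jsonlToArray(sample), null, 2));
```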

For an interactive demo, see the railgun notebook on ObservableHQ. The notebook is very heavy, so load it only over Wi-Fi.

Usage

CLI

You can use the command line tool to process data.

Usage: railgun -input_format INPUT_FORMAT -output_format OUTPUT_FORMAT [-input_uri INPUT_URI] [-input_compression [bzip2|gzip|snappy]] [-h HEADER] [-c COMMENT] [-object_path PATH] [-dfl_exp DFL_EXPRESSION] [-dfl_file DFL_FILE] [-output_uri OUTPUT_URI] [-max MAX_COUNT]
Options:
  -aws_access_key_id string
    	Defaults to value of environment variable AWS_ACCESS_KEY_ID
  -aws_default_region string
    	Defaults to value of environment variable AWS_DEFAULT_REGION.
  -aws_secret_access_key string
    	Defaults to value of environment variable AWS_SECRET_ACCESS_KEY.
  -aws_session_token string
    	Defaults to value of environment variable AWS_SESSION_TOKEN.
  -c string
    	The input comment character, e.g., #.  Commented lines are not sent to output.
  -dfl_exp string
    	Process using dfl expression
  -dfl_file string
    	Process using dfl file.
  -h string
    	The input header if the stdin input has no header.
  -hdfs_name_node string
    	Defaults to value of environment variable HDFS_DEFAULT_NAME_NODE.
  -help
    	Print help.
  -input_compression string
    	The input compression: none, bzip2, gzip, snappy (default "none")
  -input_format string
    	The input format: bson, csv, tsv, hcl, hcl2, json, jsonl, properties, toml, yaml
  -input_reader_buffer_size int
    	The input reader buffer size (default 4096)
  -input_uri string
    	The input uri (default "stdin")
  -max int
    	The maximum number of objects to output (default -1)
  -output_format string
    	The output format: bson, csv, tsv, hcl, hcl2, json, jsonl, properties, toml, yaml
  -output_uri string
    	The output uri (default "stdout")
  -version
    	Prints version to stdout.
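
As a sketch of how the flags above combine, the following JavaScript assembles an argument list for a csv-to-json conversion that stops after five objects. The helper name buildRailgunArgs and the file name data.csv are hypothetical; only flags from the option list above are used, and the code builds the arguments without running railgun.

```javascript
// Hypothetical helper: turn an options object into railgun CLI arguments.
// Keys are flag names from the option list above (without the leading dash).
function buildRailgunArgs(options) {
  const args = [];
  for (const [flag, value] of Object.entries(options)) {
    if (value === undefined || value === null) continue; // skip unset options
    args.push(`-${flag}`, String(value));
  }
  return args;
}

const args = buildRailgunArgs({
  input_uri: "data.csv", // hypothetical local file
  input_format: "csv",
  output_format: "json",
  max: 5 // output at most five objects
});

console.log(["railgun", ...args].join(" "));
// prints: railgun -input_uri data.csv -input_format csv -output_format json -max 5
```

If railgun is on the PATH, the same array could be handed to Node's child_process.execFile, which avoids shell quoting problems.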

Releases

Railgun is currently in alpha. See releases at https://github.com/spatialcurrent/railgun/releases.

Examples

Search for Cuisine

~/go/src/github.com/spatialcurrent/go-osm/bin/osm_linux_amd64 -input_uri 'http://download.geofabrik.de/north-america/us/district-of-columbia-latest.osm.bz2' -ways_to_nodes -output_format geojsonl -filter_keys_keep amenity -output_uri stdout |
  railgun -input_format jsonl -output_format json -dfl_file ~/go/src/github.com/spatialcurrent/railgun/examples/mexican.dfl -output_uri mexican.json

Tsunami Feed

const pipeline = [
  "filter(@features, '(@properties?.tsunami != null) and (@properties.tsunami == 1)')",
  "sort(@, '@properties?.mag', true)",
  "map(@, '@properties?.place ?: \"\"')",
  "limit(@, 10)"
];
// Fetch the USGS feed, run the DFL pipeline over it, and print the result as YAML.
const earthquakes = await (await fetch("https://earthquake.usgs.gov/earthquakes/feed/v1.0/summary/2.5_month.geojson")).json();
const result = railgun.process(earthquakes, {"dfl": pipeline, "output_format": "yaml"});
console.log(result);
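
For readers unfamiliar with DFL, the sketch below mirrors the four pipeline steps (filter, sort, map, limit) in plain JavaScript on a small made-up feature collection. It is an interpretation, not railgun's implementation: it assumes that the third argument true to sort means descending order, that ?. is a null-safe property access, and that ?: supplies a default value.

```javascript
// Made-up sample shaped like the USGS GeoJSON feed.
const earthquakes = {
  features: [
    { properties: { mag: 5.1, place: "A", tsunami: 0 } },
    { properties: { mag: 6.3, place: "B", tsunami: 1 } },
    { properties: { mag: 7.0, place: null, tsunami: 1 } },
    { properties: { mag: 4.8, place: "C", tsunami: 1 } }
  ]
};

const places = earthquakes.features
  // filter: keep features whose tsunami flag is present and equal to 1
  .filter(f => f.properties?.tsunami != null && f.properties.tsunami === 1)
  // sort: by magnitude, descending (assumed meaning of `true`)
  .sort((a, b) => (b.properties?.mag ?? 0) - (a.properties?.mag ?? 0))
  // map: take the place name, defaulting to an empty string
  .map(f => f.properties?.place ?? "")
  // limit: keep at most 10 results
  .slice(0, 10);

console.log(places);
```

With this sample, the feature without a tsunami flag is dropped and the missing place name becomes an empty string.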

Encrypt as YAML / Decrypt as JSON

# Encrypt secrets.yml and output to secrets.yml.enc
read -r -s -p 'Password: ' password && echo && railgun_linux_amd64 -input_uri secrets.yml -output_uri secrets.yml.enc -output_passphrase "$password"
...
# Decrypt secrets.yml.enc and output to stdout
read -r -s -p 'Password: ' password && echo && railgun_linux_amd64 -input_uri secrets.yml.enc -input_passphrase "$password" -output_format json

Building

CLI

The scripts/build_cli.sh script builds executables for Linux and Windows.

JavaScript

You can compile Railgun to pure JavaScript with the scripts/build_javascript.sh script.

Changing Destination

Build artifacts are written to railgun/bin by default. To change the destination, pass a directory as an argument to the build script. When building on a Chromebook, consider saving the artifacts in /usr/local/go/bin, e.g., bash scripts/build_cli.sh /usr/local/go/bin

Deploying

mkdir -p /usr/local/terraform
aws-vault exec default -- terraform init # to download aws provider
cp -R .terraform/plugins/linux_amd64/terraform-provider-aws_v1.43.2_x4 /usr/local/terraform
aws-vault exec default -- terraform init -plugin-dir=/usr/local/terraform
aws-vault exec default -- terraform plan

Contributing

Spatial Current, Inc. is currently accepting pull requests for this repository. We'd love to have your contributions! Please see CONTRIBUTING.md for how to get started.

License

This work is distributed under the MIT License. See the LICENSE file.
