target_gcs

Read from stdin and write to Google Cloud Storage.

Example usage

Install

python3 -m venv ./venv
source ./venv/bin/activate

Then

pip install https://github.com/anelendata/target_gcs/tarball/master

Or

git clone git@github.com:anelendata/target_gcs.git
pip install -e target_gcs

Configure

Sample configuration file

Note: As in the sample, you can use the following parameters in the blob name:

  • etl_datetime (ISO 8601 format)
  • etl_tstamp (Unix timestamp)
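
To illustrate what these two parameters could expand to, here is a minimal sketch. It assumes the blob name is a template with `{etl_datetime}` and `{etl_tstamp}` placeholders filled in by Python's `str.format`; the helper `render_blob_name` and the template string are hypothetical and not taken from target_gcs itself, so check the sample configuration file for the exact syntax.

```python
from datetime import datetime, timezone


def render_blob_name(template: str, now: datetime) -> str:
    """Fill the two placeholders in a blob name template.

    Hypothetical helper: target_gcs may substitute these values differently.
    """
    return template.format(
        etl_datetime=now.isoformat(),      # ISO 8601 date and time
        etl_tstamp=int(now.timestamp()),   # Unix timestamp in whole seconds
    )


# A fixed time is used here so the result is reproducible.
now = datetime(2020, 6, 24, 12, 0, 0, tzinfo=timezone.utc)
name = render_blob_name("events/{etl_datetime}/data-{etl_tstamp}.jsonl", now)
print(name)
```

Including one of these values in the blob name gives each run its own object instead of overwriting the previous one.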

Set the path to Google Cloud API's application credential JSON file:

export GOOGLE_APPLICATION_CREDENTIALS=./path_to/your_cred_file.json
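
Before running the target, it can help to confirm that the variable is set and points at a readable JSON file. The following is a small sketch using only the standard library; `credentials_problem` is a hypothetical helper, and it checks only that the file exists and parses as JSON, not that the credentials are valid or have the required permissions.

```python
import json
import os
from typing import Mapping, Optional


def credentials_problem(env: Mapping[str, str]) -> Optional[str]:
    """Return a description of the problem, or None if the setting looks usable."""
    path = env.get("GOOGLE_APPLICATION_CREDENTIALS")
    if not path:
        return "GOOGLE_APPLICATION_CREDENTIALS is not set"
    if not os.path.isfile(path):
        return f"credential file not found: {path}"
    try:
        with open(path, encoding="utf-8") as f:
            json.load(f)  # only checks that the file parses as JSON
    except ValueError:
        return f"credential file is not valid JSON: {path}"
    return None


if __name__ == "__main__":
    print(credentials_problem(os.environ) or "credentials look usable")
```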

Test

Make sure the service account associated with the credential file has sufficient GCS permissions. If the bucket specified in the config does not exist, target_gcs tries to create it; in that case, the account needs the Storage Admin role. Otherwise, it needs the Storage Object Creator role at minimum.

echo -e '{"line": 1, "value": "hello"}\n{"line": 2, "value": "world"}' | target_gcs -c ./your-config.json
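
The input in the command above is newline-delimited JSON: one object per line. As a rough sketch, the same input could be generated from Python records; `to_json_lines` is a hypothetical helper name, not part of target_gcs.

```python
import json
import sys
from typing import Iterable, Mapping


def to_json_lines(records: Iterable[Mapping]) -> str:
    """Serialize records as newline-delimited JSON, one object per line."""
    return "".join(json.dumps(record) + "\n" for record in records)


records = [
    {"line": 1, "value": "hello"},
    {"line": 2, "value": "world"},
]
payload = to_json_lines(records)

if __name__ == "__main__":
    # Write to stdout so the output can be piped into target_gcs.
    sys.stdout.write(payload)
```

Saved as, say, make_records.py (an assumed file name), it can be piped the same way: python make_records.py | target_gcs -c ./your-config.json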

Here is an example that loads USGS earthquake event data:

curl "https://earthquake.usgs.gov/fdsnws/event/1/query?format=geojson&starttime=2020-06-24&endtime=2020-06-25" | target_gcs -c ./your-config.json
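
To load a different date range, only the starttime and endtime query parameters need to change. The sketch below builds the same URL with the standard library and fetches it; `build_query_url` and `fetch` are hypothetical helper names, and the only query parameters used are the three that appear in the curl command above.

```python
import sys
import urllib.request
from urllib.parse import urlencode

BASE_URL = "https://earthquake.usgs.gov/fdsnws/event/1/query"


def build_query_url(starttime: str, endtime: str) -> str:
    """Build the GeoJSON query URL for a date range given as YYYY-MM-DD."""
    params = {"format": "geojson", "starttime": starttime, "endtime": endtime}
    return BASE_URL + "?" + urlencode(params)


def fetch(url: str) -> bytes:
    """Download the response body (performs a network request)."""
    with urllib.request.urlopen(url, timeout=30) as response:
        return response.read()


url = build_query_url("2020-06-24", "2020-06-25")

if __name__ == "__main__":
    # Write the raw response to stdout so it can be piped into target_gcs.
    sys.stdout.buffer.write(fetch(url))
```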

Extra: Creating a schemaless, externally partitioned BigQuery table from GCS files

git clone git@github.com:anelendata/target_gcs.git
cd target_gcs
pip install google-cloud-bigquery
python create_schemaless_table.py -p your-project-id -g gs://your-bucket/your-dataset -d your-dataset-name -t your-table-name

Note: The BigQuery dataset must already exist.

About this project

This project is developed by ANELEN and friends. Please check out ANELEN's open innovation philosophy and other projects.

ANELEN

Copyright © 2020~ Anelen Co., LLC
