Skip to content
master
Switch branches/tags
Code

Latest commit

 

Git stats

Files

Permalink
Failed to load latest commit information.
Type
Name
Latest commit message
Commit time
 
 
src
 
 
 
 
 
 
 
 
 
 
 
 
 
 

S3 Per Record output plugin for Embulk

This plugin uploads a column's value to S3 as one S3 object per row. S3 object key can be composed of another column.

Breaking Changes from 0.3.x

This plugin serialize data columns by itself. at present, Supported formats are msgpack and json. Because of it, config parameters are changed.

  • Rename data_column to data_columns, and change type from string to array
  • Add mode
  • Add serializer
  • Add column_options.

Please update your configs.

Overview

  • Plugin type: output
  • Load all or nothing: no
  • Resume supported: no

Configuration

  • bucket: S3 bucket name.
  • key: S3 object key. ${column} is replaced by the column's value.
  • data_columns: Columns for object's body.
  • serializer: Serializer format. Supported formats are msgpack and json. default is msgpack.
  • mode: Set mode. Supported modes are multi_column and single_column. default is multi_column.
  • aws_access_key_id: (optional) AWS access key id. If not given, DefaultAWSCredentialsProviderChain is used to get credentials.
  • aws_secret_access_key: (optional) AWS secret access key. Required if aws_access_key_id is given.
  • retry_limit: (default 2) On connection errors,this plugin automatically retry up to this times.
  • column_options: Timestamp formatting option for columns.

Example

multi_column mode

out:
  type: s3_per_record
  bucket: your-bucket-name
  key: "sample/${id}.txt"
  mode: multi_column
  serializer: msgpack
  data_columns: [id, payload]
id | payload (json type)
------------
 1 | hello
 5 | world
12 | embulk

This generates s3://your-bucket-name/sample/1.txt with its content {"id": 1, "payload": "hello"} that is formatted with msgpack format, s3://your-bucket-name/sample/5.txt with its content {"id": 5, "payload": "world"}, and so on.

single_column mode

out:
  type: s3_per_record
  bucket: your-bucket-name
  key: "sample/${id}.txt"
  mode: single_column
  serializer: msgpack
  data_columns: [payload]
id | payload (json type)
------------
 1 | hello
 5 | world
12 | embulk

This generates s3://your-bucket-name/sample/1.txt with its content "hello" that is formatted with msgpack format, s3://your-bucket-name/sample/5.txt with its content "world", and so on.

Build

$ ./gradlew gem  # -t to watch change of files and rebuild continuously

About

This plugin uploads a column's value to S3 as one S3 object per row.

Resources

License

Releases

No releases published

Packages

No packages published