Embulk CSV parser plugin that reads its schema from a file (no column definitions needed in the config).

CSV Parser with schema config file plugin for Embulk

Parses CSV files read by other file input plugins, taking the column definitions from a separate schema file.

Overview

  • Plugin type: parser
  • Guess supported: no

Usage

Install plugin

$ embulk gem install embulk-parser-csv_with_schema_file

Configuration

Example

in:
  type: file
  path_prefix: /tmp/csv/
  parser:
    type: csv_with_schema_file
    schema_path: /tmp/csv_schema.json
    default_timestamp_format: '%Y-%m-%d %H:%M:%S %z'
out: 
  type: stdout
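
As a rough illustration of what `default_timestamp_format` expects, the Python sketch below parses a sample value with the same directives. Python's `datetime.strptime` is only a stand-in here, on the assumption that these particular directives (`%Y`, `%m`, `%d`, `%H`, `%M`, `%S`, `%z`) mean the same thing in Embulk's timestamp parser; the sample value is made up.

```python
from datetime import datetime, timedelta

# Same format string as default_timestamp_format in the config above.
FORMAT = "%Y-%m-%d %H:%M:%S %z"

# Hypothetical cell value from a timestamp column (not taken from real data).
sample = "2017-03-01 12:34:56 +0900"

# Parse the value; %z picks up the "+0900" UTC offset.
parsed = datetime.strptime(sample, FORMAT)
print(parsed.isoformat())
```

A value without the trailing offset would not match this format, so the CSV data and the format string need to agree on whether a time zone is present.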

Schema file example (csv_schema.json)

[
   {
      "index": 0,
      "name": "Name",
      "type": "string"
   },
   {
      "index": 1,
      "name": "Cnt",
      "type": "long"
   },
   {
      "index": 2,
      "name": "RegDate",
      "type": "timestamp"
   }
]
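
For files with many columns, a schema file in this layout could be generated rather than written by hand. The following is a minimal Python sketch, not part of the plugin: `build_schema` is a hypothetical helper, and the type check assumes the plugin accepts the standard Embulk column types.

```python
import json

# Standard Embulk column types (assumed to be what the plugin accepts).
EMBULK_TYPES = {"boolean", "long", "double", "string", "timestamp", "json"}


def build_schema(columns):
    """Turn (name, type) pairs into the list-of-objects layout shown above."""
    schema = []
    for index, (name, type_) in enumerate(columns):
        if type_ not in EMBULK_TYPES:
            raise ValueError(f"unknown column type: {type_!r}")
        # The index is the zero-based position of the column in the CSV row.
        schema.append({"index": index, "name": name, "type": type_})
    return schema


schema = build_schema([
    ("Name", "string"),
    ("Cnt", "long"),
    ("RegDate", "timestamp"),
])
print(json.dumps(schema, indent=3))
```

Saving the printed JSON to the path given in `schema_path` reproduces the example schema file above.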

Custom column option example

in:
  type: file
  path_prefix: /tmp/csv/
  parser:
    type: csv_with_schema_file
    default_timestamp_format: '%Y-%m-%d %H:%M:%S'
    schema_path: /tmp/csv_schema.json
    columns:
      - {name: Date2, type: timestamp, format: '%Y-%m-%d %H:%M:%S.%N %z'}
out: 
  type: stdout
  

Build

$ ./gradlew gem  # -t to watch for file changes and rebuild continuously