Skip to content

arenahito/embulk-filter-timestamp_hs

master
Switch branches/tags

Name already in use

A tag already exists with the provided branch name. Many Git commands accept both tag and branch names, so creating this branch may cause unexpected behavior. Are you sure you want to create this branch?
Code

Latest commit

 

Git stats

Files

Permalink
Failed to load latest commit information.
Type
Name
Latest commit message
Commit time
 
 
src
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Timestamp Hs filter plugin for Embulk

Gem Version Build Status

Convert string to timestamp at high speed.

Overview

  • Plugin type: filter

Configuration

  • default_timezone: Default timezone (string, default: UTC)
  • default_timestamp_format: Default timezone format by SimpleDateFormat style (string, default: yyyy-MM-dd hh:mm:ss)
  • column_options: Timestamp column options (hash, required)
    • timezone: Timezone (hash, default: default_timezone)
    • format: Timestamp format (hash, default: default_timestamp_format)

Example

2016-01-01 10:02:30.100,2016-01-01 10:02:30.111
2016-01-02 10:02:30.200,2016-01-02 10:02:30.211
2016-01-03 10:02:30.300,2016-01-03 10:02:30.311
2016-01-04 10:02:30.400,2016-01-04 10:02:30.411
2016-01-05 10:02:30.500,2016-01-05 10:02:30.511
in:
  type: file
  path_prefix: applog
  parser:
    type: csv
    delimiter: ","
    columns:
    - {name: standardTimestamp, type: timestamp, format: '%Y-%m-%d %H:%M:%S.%L'}
    - {name: highSpeedTimestamp, type: string}

filters:
  - type: timestamp_hs
    default_timezone: 'UTC'
    column_options:
      highSpeedTimestamp: {format: 'yyyy-MM-dd hh:mm:ss.SSS'}
+-----------------------------+------------------------------+
| standardTimestamp:timestamp | highSpeedTimestamp:timestamp |
+-----------------------------+------------------------------+
| 2016-01-01 10:02:30.100 UTC |  2016-01-01 10:02:30.111 UTC |
| 2016-01-02 10:02:30.200 UTC |  2016-01-02 10:02:30.211 UTC |
| 2016-01-03 10:02:30.300 UTC |  2016-01-03 10:02:30.311 UTC |
| 2016-01-04 10:02:30.400 UTC |  2016-01-04 10:02:30.411 UTC |
| 2016-01-05 10:02:30.500 UTC |  2016-01-05 10:02:30.511 UTC |
+-----------------------------+------------------------------+

Performance

Input file is 1,000,000 lines with one timestamp column.

  • OS: Windows 10
  • CPU: Core i5 2.67GHz
  • Embulk: 0.8.6

Embulk standard timestamp: 240.910s

With timestamp_hs filter: 1.902s

Install

$ embulk gem install embulk-filter-timestamp_hs

About

Convert string to timestamp at high speed.

Resources

License

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published