This repository has been archived by the owner on Mar 31, 2020. It is now read-only.

sink parameter

tianliplus edited this page Feb 26, 2016 · 6 revisions

Sink Configuration Parameters

Name | Default | Description
---|---|---
channel | - |
type | - | The component type name; must be `com.aliyun.odps.flume.sink.OdpsSink`
accessID | - | Aliyun account accessID
accessKey | - | Aliyun account accessKey
odps.endPoint | http://service.odps.aliyun.com/api | ODPS endpoint. For applications running on Aliyun ECS, use http://odps-ext.aiyun-inc.com/api; otherwise, use http://service.odps.aliyun.com/api
odps.datahub.endPoint | http://dh.odps.aliyun.com | Datahub upload endpoint. For applications running on Aliyun ECS, use http://dh-ext.odps.aliyun-inc.com; otherwise, use http://dh.odps.aliyun.com. (Note: if a DNS error occurs, the corresponding IP address can be used instead.)
odps.project | - | ODPS project name
odps.table | - | ODPS table name
odps.partition | - | Comma-separated list of partition values identifying the partition to write to. May contain escape sequences. E.g., if the table is partitioned by (continent: string, country: string, time: string), then `Asia,India,2014-02-26-01-21` indicates continent=Asia,country=India,time=2014-02-26-01-21
batchSize | 100 | Number of events to send in one batch
serializer | - | The serializer is responsible for parsing fields out of the event and mapping them to columns in the ODPS table. The choice of serializer depends on the format of the data in the event. Supported serializers: DELIMITED
dateFormat | yyyy-MM-dd HH:mm:ss | Format used to parse datetime fields in the ODPS table
shard.number | 1 | Number of shards to load
shard.maxTimeOut | 60 | Timeout for loading shards, in seconds
autoCreatePartition | true | Whether Flume automatically creates the ODPS partitions it needs to stream to
timeZone | Local Time | Name of the time zone used to resolve the escape sequences in the partition, e.g. America/Los_Angeles
round | false | Whether the timestamp should be rounded down (if true, affects all time-based escape sequences except %t)
roundUnit | minute | The unit of the round-down value: second, minute or hour
roundValue | 1 | The timestamp is rounded down to the highest multiple of this value (in the unit configured using roundUnit) that is less than the current time
useLocalTimeStamp | false | Use the local time (instead of the timestamp from the event header) when replacing the escape sequences
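
As an illustration of two of the parameters above, the following Python sketch shows how the values in odps.partition pair up with the partition columns, and what rounding down with round, roundUnit and roundValue means arithmetically. The helper names `partition_spec` and `round_down` are hypothetical and not part of the plugin, and the rounding here is a simplification that works on epoch milliseconds in UTC; the sink's actual implementation may differ.

```python
from datetime import datetime, timezone

# Seconds per supported roundUnit value (second, minute or hour).
_UNIT_SECONDS = {"second": 1, "minute": 60, "hour": 3600}


def partition_spec(columns, values):
    """Pair partition columns with the comma-separated partition values.

    `columns` is the ordered list of partition column names of the table;
    `values` is a string such as the one given in odps.partition.
    """
    parts = values.split(",")
    if len(parts) != len(columns):
        raise ValueError("expected %d partition values, got %d"
                         % (len(columns), len(parts)))
    return ",".join("%s=%s" % (col, val) for col, val in zip(columns, parts))


def round_down(ts_ms, round_unit="minute", round_value=1):
    """Round a millisecond timestamp down to a multiple of round_value units.

    Assumption: rounding is done on the epoch value, which matches
    wall-clock rounding in UTC when round_value divides the unit evenly.
    """
    step_ms = _UNIT_SECONDS[round_unit] * round_value * 1000
    return ts_ms - ts_ms % step_ms
```

For example, `partition_spec(["continent", "country", "time"], "Asia,India,2014-02-26-01-21")` yields `continent=Asia,country=India,time=2014-02-26-01-21`, and rounding 01:21:37 UTC with roundUnit=minute and roundValue=10 gives 01:20:00 UTC.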

The DELIMITED serializer handles simple delimited textual events.

Name | Default | Description
---|---|---
serializer.delimiter | , | The field delimiter in the incoming data. To use special characters, surround them with double quotes, like `"\t"`
serializer.fieldnames | - | The mapping from input fields to columns in the ODPS table. Specified as a comma-separated list (no spaces) of ODPS table column names, identifying the input fields in order of their occurrence. To skip a field, leave its column name unspecified. E.g., `time,,ip,message` indicates that the 1st, 3rd and 4th input fields map to the time, ip and message columns in the ODPS table.
serializer.charset | UTF-8 | The charset of the event body. Assumed by default to be UTF-8
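
The field-name mapping described above can be sketched in a few lines of Python. This is only an illustration of the documented rule (empty names skip a field), not the plugin's code; the function name `map_fields` is hypothetical, and how the real serializer treats an event with more or fewer fields than names is not specified here.

```python
def map_fields(body, fieldnames, delimiter=",", charset="UTF-8"):
    """Map the delimited fields of an event body to ODPS column names.

    `fieldnames` is a comma-separated list of column names in input order;
    an empty name means the field at that position is skipped.
    """
    columns = fieldnames.split(",")
    values = body.decode(charset).split(delimiter)
    # zip() stops at the shorter sequence; a real serializer may instead
    # reject events whose field count does not match (assumption).
    return {col: val for col, val in zip(columns, values) if col}
```

With the field names `time,,ip,message`, an event body of `b"2014-02-26 01:21:00,unused,10.0.0.1,hello"` maps to the columns time, ip and message, and the second field is dropped.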

The following are the escape sequences supported:

Alias | Description
---|---
%{host} | Substitute value of event header named “host”. Arbitrary header names are supported.
%t | Unix time in milliseconds
%a | locale’s short weekday name (Mon, Tue, ...)
%A | locale’s full weekday name (Monday, Tuesday, ...)
%b | locale’s short month name (Jan, Feb, ...)
%B | locale’s long month name (January, February, ...)
%c | locale’s date and time (Thu Mar 3 23:05:25 2005)
%d | day of month (01)
%D | date; same as %m/%d/%y
%H | hour (00..23)
%I | hour (01..12)
%j | day of year (001..366)
%k | hour ( 0..23)
%m | month (01..12)
%M | minute (00..59)
%p | locale’s equivalent of am or pm
%s | seconds since 1970-01-01 00:00:00 UTC
%S | second (00..59)
%y | last two digits of year (00..99)
%Y | year (2010)
%z | +hhmm numeric timezone (for example, -0400)

Note: For all of the time related escape sequences, a header with the key “timestamp” must exist among the headers of the event (unless useLocalTimeStamp is set to true).
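
To make the substitution concrete, here is a minimal Python sketch that expands a small subset of the escape sequences (%{header}, %t, %Y, %m, %d, %H, %M, %S) in a partition value. It is not the sink's implementation: the function name `expand` is hypothetical, the “timestamp” header is assumed to hold Unix time in milliseconds as a string, and UTC is used where the sink would use the configured timeZone.

```python
import re
from datetime import datetime, timezone

# Subset of time escapes handled by this sketch, mapped to strftime codes.
_TIME_CODES = {"Y": "%Y", "m": "%m", "d": "%d", "H": "%H", "M": "%M", "S": "%S"}

# Matches either %{headerName} or a single-letter escape such as %Y.
_ESCAPE = re.compile(r"%\{([^}]+)\}|%([A-Za-z])")


def expand(template, headers, tz=timezone.utc):
    """Replace escape sequences in `template` using the event `headers`."""

    def replace(match):
        header_name, code = match.groups()
        if header_name is not None:
            # %{name}: substitute the value of the named event header.
            return headers[header_name]
        # Time-based escapes need the "timestamp" header
        # (assumed here to be Unix time in milliseconds).
        ts_ms = int(headers["timestamp"])
        if code == "t":
            return str(ts_ms)
        if code in _TIME_CODES:
            moment = datetime.fromtimestamp(ts_ms / 1000, tz=tz)
            return moment.strftime(_TIME_CODES[code])
        raise ValueError("escape sequence not covered by this sketch: %" + code)

    return _ESCAPE.sub(replace, template)
```

For an event with headers `{"timestamp": "1393377660000", "country": "India"}`, the template `Asia,%{country},%Y-%m-%d-%H-%M` expands to `Asia,India,2014-02-26-01-21` in UTC. A time-based escape on an event without a “timestamp” header raises `KeyError` in this sketch, which mirrors the requirement stated in the note above.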
