Sink parameters
Name | Default | Description |
---|---|---|
channel | - | |
type | - | The component type name; must be com.aliyun.odps.flume.sink.OdpsSink |
accessID | - | Aliyun account accessID |
accessKey | - | Aliyun account accessKey |
odps.endPoint | http://service.odps.aliyun.com/api | ODPS endpoint. For applications running on Aliyun ECS, use http://odps-ext.aiyun-inc.com/api; otherwise, use http://service.odps.aliyun.com/api |
odps.datahub.endPoint | http://dh.odps.aliyun.com | Datahub upload endpoint. For applications running on Aliyun ECS, use http://dh-ext.odps.aliyun-inc.com; otherwise, use http://dh.odps.aliyun.com. (Note: if a DNS error occurs, the corresponding IP address can be used instead) |
odps.project | - | ODPS project name |
odps.table | - | ODPS table name |
odps.partition | - | Comma-separated list of partition values identifying the partition to write to. May contain escape sequences. E.g., if the table is partitioned by (continent: string, country: string, time: string), then ‘Asia,India,2014-02-26-01-21’ indicates continent=Asia,country=India,time=2014-02-26-01-21 |
batchSize | 100 | Number of events to be sent in one batch |
serializer | - | The serializer is responsible for parsing fields out of the event and mapping them to columns in the ODPS table. The choice of serializer depends on the format of the data in the event. Supported serializers: DELIMITED |
dateFormat | yyyy-MM-dd HH:mm:ss | Format used to parse datetime fields of the ODPS table |
shard.number | 1 | Number of shards to be loaded |
shard.maxTimeOut | 60 | Timeout for loading shards, in seconds |
autoCreatePartition | true | Flume will automatically create the necessary ODPS partitions to stream to |
timeZone | Local Time | Name of the timezone that should be used for resolving the escape sequences in partition, e.g. America/Los_Angeles |
round | false | Whether the timestamp should be rounded down (if true, affects all time-based escape sequences except %t) |
roundUnit | minute | The unit of the round-down value: second, minute or hour |
roundValue | 1 | The timestamp is rounded down to the highest multiple of this value (in the unit configured using roundUnit) that is less than the current time |
useLocalTimeStamp | false | Use the local time (instead of the timestamp from the event header) while replacing the escape sequences |
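
The interaction of round, roundUnit and roundValue can be illustrated with a small Python sketch. This is not the sink's implementation; the function name `round_down` and the choice of UTC as the time zone are assumptions made for illustration only. It rounds the unit's calendar field down to a multiple of roundValue and zeroes every smaller field.

```python
from datetime import datetime, timezone


def round_down(ts_millis, round_unit="minute", round_value=1, tz=timezone.utc):
    """Round an epoch timestamp (milliseconds) down, as a sketch of round=true.

    Hypothetical helper: the real sink may differ in details such as
    time zone handling.
    """
    # Sub-second precision is dropped: every supported unit zeroes it anyway.
    dt = datetime.fromtimestamp(ts_millis // 1000, tz=tz)
    if round_unit == "second":
        dt = dt.replace(second=dt.second - dt.second % round_value)
    elif round_unit == "minute":
        dt = dt.replace(minute=dt.minute - dt.minute % round_value, second=0)
    elif round_unit == "hour":
        dt = dt.replace(hour=dt.hour - dt.hour % round_value, minute=0, second=0)
    else:
        raise ValueError("round_unit must be second, minute or hour")
    return int(dt.timestamp()) * 1000
```

Under these assumptions, with roundUnit set to minute and roundValue set to 10, an event stamped 01:21:23 is treated as 01:20:00 when the time-based escape sequences are resolved.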
The DELIMITED serializer handles simple delimited textual events.
Name | Default | Description |
---|---|---|
serializer.delimiter | , | The field delimiter in the incoming data. To use special characters, surround them with double quotes like “\t” |
serializer.fieldnames | - | The mapping from input fields to columns in the ODPS table. Specified as a comma-separated list (no spaces) of ODPS table column names, identifying the input fields in order of their occurrence. To skip a field, leave its column name unspecified. E.g., ‘time,,ip,message’ indicates that the 1st, 3rd and 4th input fields map to the time, ip and message columns in the ODPS table. |
serializer.charset | UTF-8 | The charset of the event body. Assumed by default to be UTF-8 |
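
As an illustration of how the fieldnames mapping reads, the following Python sketch splits an event body and pairs each field with its column, skipping fields whose column name is empty. The function `map_fields` is hypothetical and only mirrors the behaviour described in the table; it is not the serializer's actual code.

```python
def map_fields(body, fieldnames, delimiter=",", charset="UTF-8"):
    """Map a delimited event body to {column: value}, skipping unnamed fields.

    Hypothetical helper; parameter names follow the serializer options above.
    """
    columns = fieldnames.split(",")
    values = body.decode(charset).split(delimiter)
    # zip() pairs fields with columns in order of occurrence; an empty
    # column name means the corresponding input field is skipped.
    return {column: value for column, value in zip(columns, values) if column}


row = map_fields(b"2014-02-26 01:21:23,unused,10.0.0.1,hello", "time,,ip,message")
print(row)
```

Here the second input field is dropped, and the remaining three fields land in the time, ip and message columns. How the real serializer treats events with more or fewer fields than column names is not covered by this sketch.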
The following escape sequences are supported:
Alias | Description |
---|---|
%{host} | Substitute value of event header named “host”. Arbitrary header names are supported. |
%t | Unix time in milliseconds |
%a | locale’s short weekday name (Mon, Tue, ...) |
%A | locale’s full weekday name (Monday, Tuesday, ...) |
%b | locale’s short month name (Jan, Feb, ...) |
%B | locale’s long month name (January, February, ...) |
%c | locale’s date and time (Thu Mar 3 23:05:25 2005) |
%d | day of month (01) |
%D | date; same as %m/%d/%y |
%H | hour (00..23) |
%I | hour (01..12) |
%j | day of year (001..366) |
%k | hour ( 0..23) |
%m | month (01..12) |
%M | minute (00..59) |
%p | locale’s equivalent of am or pm |
%s | seconds since 1970-01-01 00:00:00 UTC |
%S | second (00..59) |
%y | last two digits of year (00..99) |
%Y | year (2010) |
%z | +hhmm numeric timezone (for example, -0400) |
Note: For all of the time related escape sequences, a header with the key “timestamp” must exist among the headers of the event (unless useLocalTimeStamp is set to true).
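
To make the role of the “timestamp” header concrete, here is a minimal Python sketch of how a partition value containing escape sequences might be resolved. It is an illustration only: `resolve_partition` is a hypothetical name, only a subset of the sequences is handled, the header is assumed to hold epoch milliseconds, and UTC is assumed as the time zone.

```python
import re
from datetime import datetime, timezone

# Subset of the time-based sequences, mapped to equivalent strftime codes.
_TIME_CODES = {"Y": "%Y", "y": "%y", "m": "%m", "d": "%d",
               "H": "%H", "M": "%M", "S": "%S"}
_PATTERN = re.compile(r"%\{([^}]+)\}|%([A-Za-z])")


def resolve_partition(spec, headers, tz=timezone.utc):
    """Replace %{header} and a few time-based sequences in a partition spec."""
    def substitute(match):
        header_name, code = match.groups()
        if header_name is not None:
            return headers[header_name]
        # Time-based sequences need the "timestamp" header (KeyError if absent).
        ts_millis = int(headers["timestamp"])
        if code == "t":
            return str(ts_millis)
        if code == "s":
            return str(ts_millis // 1000)
        if code not in _TIME_CODES:
            raise ValueError("sequence %" + code + " is not handled by this sketch")
        moment = datetime.fromtimestamp(ts_millis // 1000, tz=tz)
        return moment.strftime(_TIME_CODES[code])
    return _PATTERN.sub(substitute, spec)


headers = {"country": "India", "timestamp": "1393377683000"}
print(resolve_partition("Asia,%{country},%Y-%m-%d-%H-%M", headers))
```

A spec that uses only header substitutions resolves without a “timestamp” header, while any time-based sequence raises an error when the header is missing, which mirrors the requirement stated in the note.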