[Doc] Add *HDFS2 sink connector guide* (#5226)
* Add *HDFS2 sink connector guide*

* Update

* Update
Anonymitaet authored and Jennifer88huang-zz committed Oct 10, 2019
1 parent 5df6488 commit 1611470
Showing 3 changed files with 55 additions and 28 deletions.
4 changes: 2 additions & 2 deletions site2/docs/io-connectors.md
@@ -50,9 +50,9 @@ Pulsar has various sink connectors, which are sorted alphabetically as below.

 - [HBase sink connector](io-hbase.md)
 
-- [HDFS2 sink connector](io-hdfs2.md)
+- [HDFS2 sink connector](io-hdfs2-sink.md)
 
-- [HDFS3 sink connector](io-hdfs3.md)
+- [HDFS3 sink connector](io-hdfs3-sink.md)
 
 - [InfluxDB sink connector](io-influxdb-sink.md)

26 changes: 0 additions & 26 deletions site2/docs/io-hdfs.md

This file was deleted.

53 changes: 53 additions & 0 deletions site2/docs/io-hdfs2-sink.md
@@ -0,0 +1,53 @@
---
id: io-hdfs2-sink
title: HDFS2 sink connector
sidebar_label: HDFS2 sink connector
---

The HDFS2 sink connector pulls messages from Pulsar topics and persists them to HDFS files.

## Configuration

The configuration of the HDFS2 sink connector has the following properties.

### Property

| Name | Type | Required | Default | Description |
|------|------|----------|---------|-------------|
| `hdfsConfigResources` | String | true | None | A file or a comma-separated list of files containing the Hadoop file system configuration.<br/><br/>**Example**<br/>'core-site.xml'<br/>'hdfs-site.xml' |
| `directory` | String | true | None | The HDFS directory from which files are read or to which files are written. |
| `encoding` | String | false | None | The character encoding for the files.<br/><br/>**Example**<br/>UTF-8<br/>ASCII |
| `compression` | Compression | false | None | The compression codec used to compress or decompress the files on HDFS. <br/><br/>Below are the available options:<br/><li>BZIP2<br/><li>DEFLATE<br/><li>GZIP<br/><li>LZ4<br/><li>SNAPPY |
| `kerberosUserPrincipal` | String | false | None | The principal account of the Kerberos user used for authentication. |
| `keytab` | String | false | None | The full pathname of the Kerberos keytab file used for authentication. |
| `filenamePrefix` | String | false | None | The prefix of the files created inside the HDFS directory.<br/><br/>**Example**<br/>The value topicA results in files named topicA-. |
| `fileExtension` | String | false | None | The extension added to the files written to HDFS.<br/><br/>**Example**<br/>'.txt'<br/>'.seq' |
| `separator` | char | false | None | The character used to separate records in a text file. <br/><br/>If no value is provided, the contents of all records are concatenated into one continuous byte array. |
| `syncInterval` | long | false | 0 | The interval, in milliseconds, between calls to flush data to HDFS disk. |
| `maxPendingRecords` | int | false | Integer.MAX_VALUE | The maximum number of records held in memory before acking. <br/><br/>Setting this property to 1 makes every record flush to disk before it is acked.<br/><br/>Setting this property to a higher value allows buffering records before flushing them to disk. |

### Example

Before using the HDFS2 sink connector, you need to create a configuration file in one of the following formats.

* JSON

```json
{
    "hdfsConfigResources": "core-site.xml",
    "directory": "/foo/bar",
    "filenamePrefix": "prefix",
    "compression": "SNAPPY"
}
```

* YAML

```yaml
configs:
    hdfsConfigResources: "core-site.xml"
    directory: "/foo/bar"
    filenamePrefix: "prefix"
    compression: "SNAPPY"
```
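
Once the configuration file is ready, you can create the connector with the `pulsar-admin sinks create` command. The following is a minimal sketch, assuming a standalone Pulsar deployment, the HDFS2 connector packaged as `connectors/pulsar-io-hdfs2-2.4.1.nar`, an input topic named `test-hdfs-topic`, and the YAML above saved as `hdfs2-sink-config.yaml`; these paths and names are illustrative, so replace them with your own values.

```bash
# Create the HDFS2 sink from the packaged NAR and the YAML config above.
# The archive path, config file, topic, and sink name are placeholders.
bin/pulsar-admin sinks create \
    --archive connectors/pulsar-io-hdfs2-2.4.1.nar \
    --sink-config-file hdfs2-sink-config.yaml \
    --inputs test-hdfs-topic \
    --name hdfs2-sink

# Publish a test message and check that a file shows up under the configured
# HDFS directory (flushing is governed by syncInterval and maxPendingRecords,
# so the data may not appear immediately).
bin/pulsar-client produce test-hdfs-topic --messages "hello-hdfs"
hdfs dfs -ls /foo/bar
```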
