Kinesis Producer Library

Introduction

The Amazon Kinesis Producer Library (KPL) performs many tasks common to creating efficient and reliable producers for Amazon Kinesis. By using the KPL, customers do not need to develop the same logic every time they create a new application for data ingestion.

For detailed information and installation instructions, see the article Developing Producer Applications for Amazon Kinesis Using the Amazon Kinesis Producer Library in the Amazon Kinesis Developer Guide.

Back-pressure

Please see this blog post for details about writing efficient and reliable producers using the KPL. The post covers the overhead you can expect in the various situations in which you might use the KPL, including back-pressure considerations.

The KPL can consume enough memory to crash itself if it is pushed too many records without time to process them. To guard against this, we ask that every customer implement back-pressure to protect the KPL process. Once the KPL has too many records in its buffer, it spends most of its CPU cycles on record management rather than record processing, which makes the problem worse. The point at which this happens depends heavily on record sizes, rates, configuration, and host CPU and memory limits.

When deciding the limits of your KPL instance, please consider your maximum record size, maximum request rate during spikes, host memory availability, and TTL. If you buffer requests before they reach the KPL, take that buffer into account as well, since it still puts memory pressure on the host system. If the KPL buffer grows too large, the process may be forcibly terminated due to memory exhaustion.
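As a rough illustration of how these factors combine, the sketch below estimates the worst-case payload held in memory and derives an in-flight record limit from a memory budget. The class name, method names, and example figures are hypothetical, and the estimate assumes records can accumulate at the peak rate for one full TTL window while ignoring per-record bookkeeping overhead, so treat the result as a starting point for testing rather than a guarantee.

```java
/** Back-of-the-envelope sizing helper. Hypothetical names; not part of the KPL. */
final class KplBufferEstimate {
    private KplBufferEstimate() {}

    /**
     * Worst-case payload bytes buffered if records arrive at the peak rate
     * for one full TTL window and none are delivered (assumption).
     */
    static long worstCaseBytes(long peakRecordsPerSecond, long maxRecordBytes, long ttlMillis) {
        long records = Math.multiplyExact(peakRecordsPerSecond, ttlMillis) / 1000L;
        return Math.multiplyExact(records, maxRecordBytes);
    }

    /** Largest in-flight record count whose payloads fit within the memory budget. */
    static long maxRecordsInFlight(long memoryBudgetBytes, long maxRecordBytes) {
        if (maxRecordBytes <= 0) {
            throw new IllegalArgumentException("maxRecordBytes must be positive");
        }
        return memoryBudgetBytes / maxRecordBytes;
    }
}
```

For example, with a hypothetical peak of 5,000 records per second, 10 KiB records, and a 30,000 ms TTL, `worstCaseBytes(5000, 10240, 30000)` gives 1,536,000,000 bytes, while a 512 MiB budget allows `maxRecordsInFlight(536870912L, 10240)` = 52,428 records in flight.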

Sample Back-pressure implementation:

        ClickEvent event = inputQueue.take();
        String partitionKey = event.getSessionId();
        String payload = event.getPayload();
        ByteBuffer data = ByteBuffer.wrap(payload.getBytes("UTF-8"));
        // Back-pressure: wait until the KPL has drained below the in-flight limit.
        while (kpl.getOutstandingRecordsCount() > MAX_RECORDS_IN_FLIGHT) {
            Thread.sleep(SLEEP_BACKOFF_IN_MS);
        }
        recordsPut.getAndIncrement();

        ListenableFuture<UserRecordResult> f =
                kpl.addUserRecord(STREAM_NAME, partitionKey, data);
        Futures.addCallback(f, new FutureCallback<UserRecordResult>() {
          ...
          ...

The sample above is provided as an example implementation. Please take your application and use cases into consideration before applying this logic.
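The same waiting logic can be packaged as a small reusable helper. The sketch below is not part of the KPL: `BackPressureGate` and its parameters are hypothetical names, and the outstanding-record count is supplied through an `IntSupplier` (for example `kpl::getOutstandingRecordsCount`) so that the class compiles without the KPL on the classpath. Unlike an unbounded wait, it gives up after a timeout so the caller can decide what to do with the record.

```java
import java.util.concurrent.TimeUnit;
import java.util.function.IntSupplier;

/** Minimal back-pressure gate. Hypothetical helper; not part of the KPL. */
final class BackPressureGate {
    private final IntSupplier outstanding; // e.g. kpl::getOutstandingRecordsCount
    private final int maxInFlight;
    private final long sleepMillis;

    BackPressureGate(IntSupplier outstanding, int maxInFlight, long sleepMillis) {
        if (maxInFlight < 0 || sleepMillis < 0) {
            throw new IllegalArgumentException("limits must be non-negative");
        }
        this.outstanding = outstanding;
        this.maxInFlight = maxInFlight;
        this.sleepMillis = sleepMillis;
    }

    /**
     * Waits until the outstanding count is at or below the limit.
     * Returns true when there is capacity, false on timeout or interruption.
     */
    boolean awaitCapacity(long timeoutMillis) {
        long deadline = System.nanoTime() + TimeUnit.MILLISECONDS.toNanos(timeoutMillis);
        while (outstanding.getAsInt() > maxInFlight) {
            if (System.nanoTime() - deadline >= 0) {
                return false; // still over the limit when the timeout expired
            }
            try {
                Thread.sleep(sleepMillis);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // preserve the interrupt for the caller
                return false;
            }
        }
        return true;
    }
}
```

A producer loop would call `awaitCapacity(...)` before each `addUserRecord` call and, when it returns false, apply whatever policy suits the application (retry, spill to disk, or drop). Which policy is appropriate is an application decision that this sketch leaves open.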

Recommended Settings for Streams larger than 800 shards

The KPL is an application for ingesting data into your Kinesis Data Streams (KDS). As your streams grow, you may need to tune the KPL so that it can accommodate the growing needs of your applications. Without optimized configurations, your KPL processes will see inefficient CPU usage and delays in writing records into KDS. For streams larger than 800 shards, we recommend the following settings:

  • ThreadingModel = "POOLED"
  • MetricsGranularity = "stream"
  • ThreadPoolSize = 128

We recommend performing sufficient testing before applying these changes to production, as every customer has different usage patterns.
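One way to keep these three values together is a properties file. The sketch below builds one with plain `java.util.Properties`. It assumes that the setting names listed above are accepted as property keys by the KPL's Java configuration, so please verify this against the `KinesisProducerConfiguration` documentation for your version. The class name `LargeStreamSettings` is hypothetical.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.io.Writer;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Properties;

/** Collects the settings suggested for large streams. Hypothetical helper class. */
final class LargeStreamSettings {
    private LargeStreamSettings() {}

    /** Returns the three settings as properties (key names are an assumption). */
    static Properties recommended() {
        Properties p = new Properties();
        p.setProperty("ThreadingModel", "POOLED");
        p.setProperty("MetricsGranularity", "stream");
        p.setProperty("ThreadPoolSize", "128");
        return p;
    }

    /** Writes the settings to a file that a configuration loader could read later. */
    static void writeTo(Path file) {
        try (Writer out = Files.newBufferedWriter(file)) {
            recommended().store(out, "KPL settings for streams larger than 800 shards");
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

Keeping the values in a file rather than in code makes it easier to test them in a staging environment before rolling them out to production.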

Required KPL Update – v0.14.0

KPL 0.14.0 now uses the ListShards API, making it easier for your Kinesis producer applications to scale. Kinesis Data Streams (KDS) enables you to scale your stream capacity without any changes to producers and consumers. After a scaling event, producer applications need to discover the new shard map. Version 0.14.0 replaces the DescribeStream API with the ListShards API for shard discovery. ListShards supports 100 TPS per stream, compared to DescribeStream, which supports 10 TPS per account. For an account with 10 streams, KPL v0.14.0 therefore provides a 100X higher call rate for shard discovery (10 streams × 100 TPS = 1,000 TPS, versus 10 TPS shared across the account), eliminating the need for a DescribeStream API limit increase for scaling. You can find more information on the ListShards API in the Kinesis Data Streams documentation.

Required Upgrade

Starting on February 9, 2018, Amazon Kinesis Data Streams will begin transitioning to certificates issued by Amazon Trust Services (ATS). To continue using the Kinesis Producer Library (KPL), you must upgrade the KPL to version 0.12.6 or later.

If you have further questions, please open a GitHub Issue or create a case with the AWS Support Center.

This is a restatement of the notice published in the Amazon Kinesis Data Streams Developer Guide.

Release Notes

0.14.0

  • Note: Windows platform will be unsupported going forward for this library.
  • [PR #280] When aggregation is enabled and all the buffer time is consumed for aggregating User records into Kinesis records, allow some additional buffer time for aggregating Kinesis Records into PutRecords calls.
  • [PR #260] Added endpoint for China Ningxia region (cn-northwest-1).
  • [PR #277] Changed mechanism to update the shard map
    • Switched to using ListShards instead of DescribeStream, as this is a more scalable API
    • Reduced the number of unnecessary shard map invalidations
    • Reduced the number of unnecessary update shard map calls
    • Reduced logging noise for aggregated records landing on an unexpected shard
  • [PR #276] Updated AWS SDK from 1.0.5 to 1.7.180
  • [PR #275] Improved the sample code to avoid need to edit code to run.
  • [PR #274] Updated bootstrap.sh to build all dependencies and pack binaries into the jar.
  • [PR #273] Added compile flags to enable compiling aws-sdk-cpp with Gcc7.
  • [PR #229] Fixed bootstrap.sh to download dependent libraries directly from source.
  • [PR #246] [PR #264] Various Typos

0.13.1

  • Including Windows binary for Apache 2.0 release.

0.13.0

  • [PR #256] Update KPL to Apache 2.0

0.12.11

Java

  • Bump up the version to 0.12.11.

Older release notes moved to CHANGELOG.md

Supported Platforms and Languages

The KPL is written in C++ and runs as a child process to the main user process. Precompiled native binaries are bundled with the Java release and are managed by the Java wrapper.

The Java package should run without the need to install any additional native libraries on the following operating systems:

  • Linux distributions with glibc 2.9 or later
  • Apple OS X 10.9 and later

Note that the release is 64-bit only.

Sample Code

A sample Java project is available in java/amazon-kinesis-sample.

Compiling the Native Code

Rather than compiling from source, Java developers are encouraged to use the KPL release in Maven, which includes pre-compiled native binaries for Linux and macOS.

To build the native components and bundle them into the jar, run ./bootstrap.sh. The script downloads the dependencies, builds them, builds the native binaries, bundles them into the Java resources folder, and then builds the Java packages. This must be done on the platform on which you plan to run the jars.

Using the Java Wrapper with the Compiled Native Binaries

There are two options. You can either pack the binaries into the jar, as we did for the official release, or deploy the native binaries separately and point the Java code at them.

Pointing the Java wrapper at a Custom Binary

The KinesisProducerConfiguration class provides an option setNativeExecutable(String val). You can use this to provide a path to the kinesis_producer[.exe] executable you have built. On Windows, paths are delimited with backslashes; if you give the path as a Java string literal, each backslash must be escaped as \\.
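The sketch below shows two ways to produce the path string. The directory names are hypothetical, and the call into the KPL appears only as a comment so that the snippet compiles without the KPL on the classpath.

```java
import java.nio.file.Paths;

/** Builds path strings for a custom native binary. Hypothetical helper class. */
final class NativeExecutablePaths {
    private NativeExecutablePaths() {}

    // Hypothetical Windows location. Each "\\" in the literal is one backslash at runtime.
    static final String WINDOWS_LITERAL = "C:\\kpl\\bin\\kinesis_producer.exe";

    /** Joins path segments with the separator of the host platform. */
    static String forHost(String first, String... more) {
        return Paths.get(first, more).toAbsolutePath().toString();
    }

    // Intended use, assuming a KinesisProducerConfiguration instance named config:
    //   config.setNativeExecutable(NativeExecutablePaths.forHost("build", "kinesis_producer"));
}
```

Building the path from segments avoids hand-written separators altogether, which may be preferable when the same code runs on more than one platform.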
