Record size limit exceeded in 1586 KB #210

Closed
pradeepbhadani opened this issue Nov 9, 2020 · 9 comments

@pradeepbhadani

I am using the kinesis_streams plugin to send data from a file to Kinesis.
The AWS docs say there is a 1 MB limit per record, but is there a better way to handle these errors, such as filtering out records larger than 1 MB?

Config

<match **>
  @type kinesis_streams
  stream_name test-stream
  aws_key_id AWS_KEY_ID
  aws_sec_key AWS_SECURITY_KEY
  region eu-west-2
  random_partition_key true

  <buffer>
    flush_interval 5s
    flush_thread_count 10
  </buffer>
</match>

Error

2020-11-06 10:40:45 +0000 [error]: #0 Record size limit exceeded in 1586 KB: ***********
2020-11-06 10:41:12 +0000 [warn]: #0 Retrying to request batch. Retry count:   1, Retry records: 161, Wait seconds 0.34
2020-11-06 10:41:12 +0000 [warn]: #0 Retrying to request batch. Retry count:   2, Retry records:  81, Wait seconds 0.26
2020-11-06 10:41:12 +0000 [warn]: #0 Retrying to request batch. Retry count:   3, Retry records:  41, Wait seconds 0.34
2020-11-06 10:41:13 +0000 [warn]: #0 Retrying to request batch. Retry count:   1, Retry records: 173, Wait seconds 0.26
2020-11-06 10:41:13 +0000 [warn]: #0 Retrying to request batch. Retry count:   2, Retry records:  87, Wait seconds 0.25
2020-11-06 10:41:14 +0000 [warn]: #0 Retrying to request batch. Retry count:   3, Retry records:  44, Wait seconds 0.25
2020-11-06 10:41:14 +0000 [warn]: #0 Retrying to request batch. Retry count:   4, Retry records:  22, Wait seconds 0.33
2020-11-06 10:41:15 +0000 [warn]: #0 Retrying to request batch. Retry count:   1, Retry records:  74, Wait seconds 0.35
2020-11-06 10:41:15 +0000 [warn]: #0 Retrying to request batch. Retry count:   2, Retry records:  37, Wait seconds 0.33
@singhajit89

I am facing a similar issue and haven't found a solution yet.

I found another issue reporting the same problem, which was closed without resolution:
#201

@simukappu
Contributor

Hi @pradeepbhadani, @singhajit89

Thank you for your feedback!
Do you have any enhancement requests for this plugin? The maximum record payload size of 1 MB is a limit of Kinesis Data Streams itself. Even if this plugin could split records larger than 1 MB, the consumer of Kinesis Data Streams would receive those partial records as separate, independent records.
I would appreciate any good ideas you may have. Thank you!
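
One idea in that direction is to drop oversized records before they ever reach the output plugin. Below is a minimal sketch of a custom Fluentd filter that does this, assuming the Fluentd v1 filter plugin API; the plugin name oversize_drop, the class name, and the JSON-based size estimate are all illustrative, not part of fluent-plugin-kinesis:

require 'json'
require 'fluent/plugin/filter'

module Fluent
  module Plugin
    # Hypothetical filter: drop records whose serialized size exceeds a
    # configurable limit, so they never reach the kinesis output plugin.
    class OversizeDropFilter < Filter
      Fluent::Plugin.register_filter('oversize_drop', self)

      # Kinesis Data Streams accepts up to 1 MB per record payload.
      config_param :max_record_size, :size, default: 1024 * 1024

      def filter(tag, time, record)
        # Approximate the payload size via the JSON-serialized record.
        payload = record.to_json
        if payload.bytesize > @max_record_size
          log.warn "dropping oversized record", tag: tag, size: payload.bytesize
          nil # returning nil drops the record
        else
          record
        end
      end
    end
  end
end

It would then be enabled ahead of the match block:

<filter **>
  @type oversize_drop
  max_record_size 1m
</filter>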

@singhajit89

Hi @simukappu,

I haven't raised an enhancement request for this plugin yet, but I will definitely raise one based on the feedback on the recommended workaround/solution below.

I am using this plugin to send records to Kinesis Firehose, which has exactly the same 1 MB limitation; when the record size limit is exceeded, the same exception appears in the td-agent.log file, similar to the issue reported in #201.

Fact:
You are absolutely right that if the plugin could split records larger than 1 MB, the consumer would get those partial records as separate, independent records.

My use case:
The application in our environment produces records larger than 1 MB on several occasions. Due to the Kinesis 1 MB limitation, the td-agent.log file keeps printing the exception, with further adverse impact on the application server: the log file explodes to several GB and the server runs out of disk space.

To avoid this log explosion, I haven't configured my application to send logs to Kinesis Firehose.

Recommended solution:

  1. Make printing the exception in td-agent.log configurable for the class "ExceedMaxRecordSizeError" in the kinesis.rb file (see the sketch after this list).
  • This would allow us to use the plugin to send the application's logs to Kinesis; if any record is larger than 1 MB, at least it would not print the exception and fill up td-agent.log.
  • Alternatively, print just "ExceedMaxRecordSizeError" in the td-agent.log file instead of the message and the actual record as the exception.

The above would let my application send all its logs to Kinesis, excluding records larger than 1 MB, without filling up the td-agent.log file.
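
As a rough illustration of suggestion 1, the gating could look something like the sketch below. This is only a sketch against a simplified SkipRecordError; the suppress_record_contents parameter is hypothetical and does not exist in the plugin today:

# Sketch only: suppress_record_contents is a hypothetical option. When
# true, the record payload is left out of the log line entirely.
class SkipRecordError < ::StandardError
  def initialize(message, record, suppress_record_contents: false)
    @record_message = suppress_record_contents ? '' : record.to_s
    super(message)
  end

  def to_s
    super + ": " + @record_message
  end
end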

Thank you!

@simukappu
Contributor

simukappu commented Jan 4, 2021

Thank you for the details. I understand now what the critical issue is.
We will consider enhancing this plugin to handle ExceedMaxRecordSizeError without printing the contents of the records. The cause of this issue seems to be here.

@simukappu
Contributor

Hi @singhajit89,

Could the log_truncate_max_size configuration parameter be a solution for your issue? When you set this parameter, the error log printed by "SkipRecordError", including "ExceedMaxRecordSizeError", will be truncated to the length of log_truncate_max_size here.
Would this address the concerns behind your recommended solution?
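
For reference, setting it would look something like this in the match block from the original report (the value 256 is just an example):

<match **>
  @type kinesis_streams
  stream_name test-stream
  region eu-west-2
  random_partition_key true
  log_truncate_max_size 256
</match>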

@singhajit89

Hi @simukappu,

I had already tried the log_truncate_max_size configuration above a month back, but that didn't work for me.

But recently I made a change in the kinesis.rb file which is now working as expected. For records larger than 1 MB it now prints only the exception name and size, and skips printing the record_message content in the td-agent.log file, which helps keep the log file size low and still lets us see the application logs in Elasticsearch (excluding records larger than 1 MB).

Original block:

  def to_s
    super + ": " + @record_message
  end
end

class KeyNotFoundError < SkipRecordError
  def initialize(key, record)
    super "Key '#{key}' doesn't exist", record
  end
end

class ExceedMaxRecordSizeError < SkipRecordError
  def initialize(size, record)
    super "Record size limit exceeded in #{size/1024} KB", record
  end
end

Changes that are working:

  def to_s
    super + ": "
  end
end

class KeyNotFoundError < SkipRecordError
  def initialize(key, record)
    super "Key '#{key}' doesn't exist", record
  end
end

class ExceedMaxRecordSizeError < SkipRecordError
  def initialize(size, record)
    super "ExceedMaxRecordSizeError in #{size/1024} KB", record
  end
end

For example: with the above change, if the record size exceeds the Kinesis Firehose limit, the following is printed in the td-agent.log file:

ExceedMaxRecordSizeError in 1157 KB:

@singhajit89

Hi @simukappu,

How can I get the above changes committed to the actual repo, or what is the procedure for doing so?

@simukappu
Contributor

If we remove @record_message from the to_s method, there will be no way to know the content of the error record from the standard logs. We provide the log_truncate_max_size parameter to shorten the log size. We would appreciate feedback on why the log_truncate_max_size parameter does not work for you and why you would like to remove @record_message from the standard log format. Thank you for your feedback!

@simukappu
Contributor

Closing this issue for now. Please reopen if required.
