Kinesis Agent has high CPU for cases with many files #59

Open
frankfarrell opened this issue Dec 7, 2016 · 2 comments

frankfarrell commented Dec 7, 2016

Our use case: a file is SFTPed to our server every minute, and Kinesis Agent is configured with a file pattern that matches these files. The files are not modified after they are written, and each one ends with a footer line, e.g. "File Administratively Closed..."

Using jstack, we identified the thread with high CPU usage as:

"FileTailer[<filePattern>]" #13 prio=5 os_prio=0 tid=0x00007f2bf0574000 nid=0xdadb runnable [0x00007f2bcc490000]
   java.lang.Thread.State: RUNNABLE
    at java.lang.Object.hashCode(Native Method)
    at java.util.HashMap.hash(HashMap.java:338)
    at java.util.HashMap.put(HashMap.java:611)
    at com.amazon.kinesis.streaming.agent.tailing.TrackedFileRotationAnalyzer.syncCounterpartsByFileId(Unknown Source)
    at com.amazon.kinesis.streaming.agent.tailing.TrackedFileRotationAnalyzer.<init>(Unknown Source)
    at com.amazon.kinesis.streaming.agent.tailing.SourceFileTracker.updateCurrentFile(Unknown Source)
    at com.amazon.kinesis.streaming.agent.tailing.SourceFileTracker.refresh(Unknown Source)
    at com.amazon.kinesis.streaming.agent.tailing.FileTailer.updateRecordParser(Unknown Source)
    - locked <0x00000000eac78af0> (a com.amazon.kinesis.streaming.agent.tailing.FileTailer)
    at com.amazon.kinesis.streaming.agent.tailing.FileTailer.processRecords(Unknown Source)
    - locked <0x00000000eac78af0> (a com.amazon.kinesis.streaming.agent.tailing.FileTailer)
    at com.amazon.kinesis.streaming.agent.tailing.FileTailer.runOnce(Unknown Source)
    at com.amazon.kinesis.streaming.agent.tailing.FileTailer.run(Unknown Source)
    at com.google.common.util.concurrent.AbstractExecutionThreadService$1$2.run(AbstractExecutionThreadService.java:60)
    at com.google.common.util.concurrent.Callables$3.run(Callables.java:95)
    at java.lang.Thread.run(Thread.java:745)

   Locked ownable synchronizers:
    - None 

Possible Solution:
One solution would be a fileFooterPattern option that stops tailing a file once a line matches it. This has the advantage of not modifying the file (unlike, e.g., adding .CLOSED to the file name).
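
A rough sketch of how such a footer check could behave, purely for illustration. The class `FooterAwareReader`, its methods, and the choice to exclude the footer line from the emitted records are all hypothetical and not part of the Kinesis Agent code base; the regex only matches the visible prefix of the footer quoted above.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

/** Hypothetical helper illustrating the proposed fileFooterPattern idea. */
final class FooterAwareReader {
    private final Pattern footerPattern;
    private boolean closed = false;

    FooterAwareReader(String footerRegex) {
        this.footerPattern = Pattern.compile(footerRegex);
    }

    /** True once a footer line has been seen; a tailer could then stop tracking the file. */
    boolean isClosed() {
        return closed;
    }

    /**
     * Consumes newly read lines and returns the ones that should become records.
     * Everything from the footer line onward is ignored (an assumption of this sketch).
     */
    List<String> accept(Iterable<String> lines) {
        List<String> records = new ArrayList<>();
        for (String line : lines) {
            if (closed) {
                break; // file already marked as finished
            }
            if (footerPattern.matcher(line).find()) {
                closed = true; // footer reached: no further polling needed
            } else {
                records.add(line);
            }
        }
        return records;
    }
}
```

With something like this, a tailer could drop a file from its tracked set as soon as `isClosed()` returns true, so completed files would no longer add to the per-poll work.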

johnou commented Sep 28, 2017

@frankfarrell try increasing the value of minTimeBetweenFilePollsMillis to 1000 (default is 100).
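
For reference, a sketch of what that might look like in the agent configuration file, assuming the setting is applied per flow. The `filePattern` path and `kinesisStream` name below are placeholders, not values from this issue.

```json
{
  "flows": [
    {
      "filePattern": "/path/to/incoming/*.log",
      "kinesisStream": "example-stream",
      "minTimeBetweenFilePollsMillis": 1000
    }
  ]
}
```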

johnou commented Sep 28, 2017

@chaochenq please consider replacing the thread-per-file busy-loop implementation with WatchService (which supports registering multiple paths, etc.).

http://docs.oracle.com/javase/tutorial/essential/io/examples/WatchDir.java
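
A minimal sketch of the idea, assuming a single shared `java.nio.file.WatchService` with several directories registered on it. The class `MultiDirWatcher` and its method names are hypothetical and are not the agent's actual tailing code. The caller blocks until something changes instead of polling every file in a loop.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.FileSystems;
import java.nio.file.Path;
import java.nio.file.StandardWatchEventKinds;
import java.nio.file.WatchEvent;
import java.nio.file.WatchKey;
import java.nio.file.WatchService;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.TimeUnit;

/** Hypothetical sketch: one WatchService shared by many directories. */
final class MultiDirWatcher implements AutoCloseable {
    private final WatchService watcher;
    private final Map<WatchKey, Path> dirsByKey = new HashMap<>();

    MultiDirWatcher() {
        try {
            watcher = FileSystems.getDefault().newWatchService();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Registers one more directory for create/modify events. */
    void watch(Path dir) {
        try {
            WatchKey key = dir.register(watcher,
                    StandardWatchEventKinds.ENTRY_CREATE,
                    StandardWatchEventKinds.ENTRY_MODIFY);
            dirsByKey.put(key, dir);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Blocks for up to timeoutMillis; returns changed paths, or an empty list on timeout. */
    List<Path> awaitChanges(long timeoutMillis) {
        List<Path> changed = new ArrayList<>();
        WatchKey key;
        try {
            key = watcher.poll(timeoutMillis, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return changed;
        }
        if (key == null) {
            return changed; // timed out with no events
        }
        Path dir = dirsByKey.get(key);
        for (WatchEvent<?> event : key.pollEvents()) {
            if (event.kind() == StandardWatchEventKinds.OVERFLOW) {
                continue; // events were lost; a real implementation would rescan the directory
            }
            changed.add(dir.resolve((Path) event.context()));
        }
        key.reset(); // re-arm the key so later events are delivered
        return changed;
    }

    @Override
    public void close() {
        try {
            watcher.close();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

Event latency depends on the platform's WatchService implementation, so this is a starting point rather than a drop-in replacement for the current polling loop.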
