New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
FLUME-3149 reduce cpu cost for taildir file source while still maintaining reliability by using posFile in memory channel #159
base: trunk
Are you sure you want to change the base?
Conversation
…ining reliability by using posFile in memory channel
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Overall, the changes seem to be fine. Can you please provide a way of testing this change? Ie. "run this command to see the performance gain".
@@ -1182,7 +1182,6 @@ headers.<filegroupName>.<headerKey> -- Header value | |||
byteOffsetHeader false Whether to add the byte offset of a tailed line to a header called 'byteoffset'. | |||
skipToEnd false Whether to skip the position to EOF in the case of files not written on the position file. | |||
idleTimeout 120000 Time (ms) to close inactive files. If the closed file is appended new lines to, this source will automatically re-open it. | |||
writePosInterval 3000 Interval time (ms) to write the last position of each file on the position file. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we have to remove this?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Perhaps it's better to keep it to be compatible with other channels. I'll add it back and modify the user guide.
@@ -64,4 +64,7 @@ public String toString() { | |||
return "[Event headers = " + headers + ", body.length = " + bodyLen + " ]"; | |||
} | |||
|
|||
public void notifySource() {} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you please add javadoc to clarify usage?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
sure.
@@ -64,4 +64,7 @@ public String toString() { | |||
return "[Event headers = " + headers + ", body.length = " + bodyLen + " ]"; | |||
} | |||
|
|||
public void notifySource() {} | |||
|
|||
public void commit() {} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you please add javadoc to clarify usage?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
sure.
@@ -104,16 +107,15 @@ public synchronized void start() { | |||
} catch (IOException e) { | |||
throw new FlumeException("Error instantiating ReliableTaildirEventReader", e); | |||
} | |||
for (Entry<Long, TailFile> entry : reader.getTailFiles().entrySet()) { | |||
inodePositionMap.put(entry.getKey() + "", entry.getValue().getPos() + ""); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Are the + ""
-s needed? As per https://stackoverflow.com/a/1572724/5323166 , String.valueOf
is recommended. Please, let me know if the intent was different.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for pointing out. I'll fix it.
I think this change is essentially reaching reliability in MemoryChannel when using taildirSource. So the performance diff is essentially the diff of MemoryChannel and FileChannel. In my tests it saves more than 90% percent of CPU but it's no easy way of comparing it by a simple command. |
Can one of the admins verify this patch? |
stream-diagnostics:stable v2.3.4 for fix streamQos startTime
File channel tracks transferred events and use transnational mechanism to make transfer recoverable. However, it increases CPU cost due to frequent system calls like write, read, etc. The Cpu cost could be very high if the transfer rate is high.
In contrast, Memory channel has no such issue which requires only about 10% of CPU cost in the same environment but it's not recovered if the system is down accidentally.
For sources like taildir, I propose we could write position file in memory channel to achieve reliability and reduce CPU cost.
After testing on my own production environment, CPU usage dropped from 13% to 3% and still maintain reliability. (Transfer rate: 1Mb/s , kafka sink, file channel -> memory channel with pos file)