Create README.md of fluid/recordio #10145
LGTM!
Otherwise LGTM
paddle/fluid/recordio/README.md
Outdated
## Fault-tolerant Writing

For the initial design purpose of ReocrdIO within Google, which was logging, RecordIO groups record into *chunks*, whose header contains an MD5 hash of the chunk. A process that writes logs is supposed to call the Writer interface to add records. Once the writer accumulates a handful of them, it groups a chunk, put the MD5 into the chunk header, and appends the chunk to the file. In the case that the process crashes unexpectedly, the leftover could be that the last chunk in the file was half-written. This doesn't prevent the process, after restarted, continue writing to the same RecordIO file, because the reader will be able to identify incomplete chunks and skip them.
ReocrdIO misspelled
Done
paddle/fluid/recordio/README.md
Outdated
@@ -0,0 +1,13 @@
## Background

RecordIO is a file format as a container of records. This package is a C++ implementation of https://github.com/paddlepaddle/recordio, which originates from https://github.com/wangkuiyi/recordio.
Maybe reword to "The RecordIO file format is a container for records."
Done
paddle/fluid/recordio/README.md
Outdated
## Fault-tolerant Writing

For the initial design purpose of ReocrdIO within Google, which was logging, RecordIO groups record into *chunks*, whose header contains an MD5 hash of the chunk. A process that writes logs is supposed to call the Writer interface to add records. Once the writer accumulates a handful of them, it groups a chunk, put the MD5 into the chunk header, and appends the chunk to the file. In the case that the process crashes unexpectedly, the leftover could be that the last chunk in the file was half-written. This doesn't prevent the process, after restarted, continue writing to the same RecordIO file, because the reader will be able to identify incomplete chunks and skip them.
"In the case that the process crashes unexpectedly, the leftover could be that the last chunk in the file was half-written. This doesn't prevent the process, after restarted, continue writing to the same RecordIO file, because the reader will be able to identify incomplete chunks and skip them."
Could be reworded to something like:
"In the event the process crashes unexpectedly, the last chunk in the RecordIO file could be incomplete/corrupt. The RecordIO reader is able to recover from these errors when the process restarts by identifying incomplete chunks and skipping over them."
Done
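For readers following this thread, here is a minimal Python sketch of the behavior the "Fault-tolerant Writing" paragraph describes: a writer that groups records into chunks with an MD5 in the header, and a reader that skips chunks that are incomplete or fail the hash check. It is not the Paddle C++ implementation. The byte layout (`MAGIC`, `HEADER`), the names `encode_chunk`, `Writer`, and `read_records`, and the `max_records` threshold are all hypothetical and chosen only for illustration. The README does not say how the reader locates the next chunk after a damaged one, so scanning for a marker is also an assumption.

```python
import hashlib
import struct

# Hypothetical on-disk layout, not the real RecordIO format:
# 4-byte marker, 16-byte MD5 of the payload, 4-byte payload length.
MAGIC = b"RIO1"
HEADER = struct.Struct("<4s16sI")


def encode_chunk(records):
    """Pack a list of bytes records into one chunk (header + payload)."""
    payload = b"".join(struct.pack("<I", len(r)) + r for r in records)
    digest = hashlib.md5(payload).digest()
    return HEADER.pack(MAGIC, digest, len(payload)) + payload


def decode_payload(payload):
    """Yield the length-prefixed records stored in a verified payload."""
    offset = 0
    while offset < len(payload):
        (size,) = struct.unpack_from("<I", payload, offset)
        offset += 4
        yield payload[offset:offset + size]
        offset += size


class Writer:
    """Buffers records and appends a whole chunk once enough accumulate."""

    def __init__(self, path, max_records=2):
        self._file = open(path, "ab")  # append, so a restart reuses the file
        self._max = max_records
        self._pending = []

    def write(self, record):
        self._pending.append(record)
        if len(self._pending) >= self._max:
            self.flush()

    def flush(self):
        if self._pending:
            self._file.write(encode_chunk(self._pending))
            self._file.flush()
            self._pending = []

    def close(self):
        self.flush()
        self._file.close()


def read_records(path):
    """Yield records from every intact chunk, skipping damaged ones."""
    with open(path, "rb") as f:
        data = f.read()  # whole file in memory: acceptable for a sketch only
    pos = data.find(MAGIC)
    while pos != -1:
        body = pos + HEADER.size
        if body <= len(data):
            _, digest, length = HEADER.unpack_from(data, pos)
            payload = data[body:body + length]
            if len(payload) == length and hashlib.md5(payload).digest() == digest:
                yield from decode_payload(payload)
                pos = data.find(MAGIC, body + length)
                continue
        # Incomplete or corrupt chunk: resume the search one byte further on.
        pos = data.find(MAGIC, pos + 1)
```

To exercise the crash scenario, write two records, append only the first part of another encoded chunk to simulate a half-written chunk, then open a new `Writer` on the same path and keep writing. Under the assumptions above, `read_records` returns the records from the intact chunks on both sides of the damaged one.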
Thanks to @abhinavarora and @cs2be! I followed your comments.