MMAP streaming of flatbuffer to avoid loading complete buffer in memory #4682

Closed
shivendra14 opened this issue Mar 25, 2018 · 4 comments

@shivendra14 (Contributor)

The FlatBuffers documentation mentions that a FlatBuffer can be read in a streamed manner for memory efficiency:

"FlatBuffers is also very suitable for use with mmap (or streaming), requiring only part of the buffer to be in memory"

  1. Could you provide a C++ example of how this can be done? I didn't find any example code anywhere.
  2. Is memory-mapped reading slower than reading the whole buffer in a single go? If so, are there any performance numbers comparing the two approaches?
  3. Can a FlatBuffer also be written in a memory-mapped way? If not, what strategy can be used to avoid holding large FlatBuffers in memory during a write?
@aardappel (Collaborator)

  1. FlatBuffers just provides information in a sequence of bytes; how that sequence is produced or consumed (IO libraries, networking, mmap-ing) is very platform dependent and outside the scope of the library.
  2. When read into memory, you pay the cost of any paging all at once, and you fill (and possibly flush) the cache. When using mmap, you typically pay the paging cost on first access. With really big data sets, you may also get pages paged out over time. Even loading it all at once may incur paging, of course. The effects of this are very system dependent (not just the OS, hardware, amount of RAM, and data size, but also current system load). Typically I would expect an mmap-ed FlatBuffer to give about the same performance as reading it all at once, with the added benefit of lower start-up time.
  3. This is a little bit harder, but it can be done. Essentially, you can mmap a large buffer, then pass a custom allocator to FlatBufferBuilder that only ever returns that block of memory.
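
As an illustration of point 3 (not code from this thread): a minimal sketch of such an allocator, assuming a FlatBuffers version whose `FlatBufferBuilder` constructor accepts a `flatbuffers::Allocator *`, and a region already mapped read/write with mmap. The names `mapped_ptr` and `map_size` are placeholders.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

#include "flatbuffers/flatbuffers.h"

// Fixed-block allocator over a pre-mapped region: every request is served
// from the same mmap-ed block, so the builder never touches the heap.
class MappedAllocator : public flatbuffers::Allocator {
 public:
  MappedAllocator(uint8_t *block, size_t size) : block_(block), size_(size) {}

  uint8_t *allocate(size_t size) override {
    // Hand out the pre-mapped block. Passing the mapped size as initial_size
    // below means the builder never asks for more than this and never needs
    // to grow (reallocate) the buffer.
    assert(size <= size_);
    return block_;
  }

  void deallocate(uint8_t *, size_t) override {
    // The owner of the mapping munmap()s it; nothing to do here.
  }

 private:
  uint8_t *block_;
  size_t size_;
};

// Usage sketch (mapped_ptr / map_size come from an earlier mmap of a file
// opened for read/write):
//
//   MappedAllocator alloc(mapped_ptr, map_size);
//   flatbuffers::FlatBufferBuilder fbb(map_size, &alloc);
//   ... build the object graph as usual, then fbb.Finish(root) ...
//   // The finished buffer is at fbb.GetBufferPointer() / fbb.GetSize(),
//   // inside the mapped region (the builder writes back-to-front), so an
//   // msync() of the region persists it without a large heap allocation.
```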

@shivendra14 (Contributor, issue author) commented May 31, 2018

I tried using boost::memoryMap and it worked fine for reading the buffer.
For writing I am planning to try a custom allocator as you suggested, but the problem is that I don't know how much to reserve for the mmap, as files created by clients can range from a few bytes to hundreds of MBs.
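
For reference, a minimal plain-POSIX sketch of that read path (an illustration, not code from this thread): `monster_generated.h` and `MyGame::Sample::Monster` stand in for whatever schema-generated root type is actually used, and error handling is trimmed.

```cpp
#include <cstdio>
#include <fcntl.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

#include "flatbuffers/flatbuffers.h"
#include "monster_generated.h"  // hypothetical generated header for the schema in use

// Map a finished FlatBuffer file and read it in place; the OS faults pages in
// lazily as fields are accessed, so only the parts actually touched end up in
// physical memory.
bool ReadMapped(const char *path) {
  int fd = open(path, O_RDONLY);
  if (fd < 0) return false;
  struct stat st;
  if (fstat(fd, &st) != 0) { close(fd); return false; }
  size_t len = static_cast<size_t>(st.st_size);
  void *buf = mmap(nullptr, len, PROT_READ, MAP_PRIVATE, fd, 0);
  close(fd);  // the mapping stays valid after the descriptor is closed
  if (buf == MAP_FAILED) return false;

  // No parsing or copying: GetRoot just reinterprets the mapped bytes.
  auto monster = flatbuffers::GetRoot<MyGame::Sample::Monster>(buf);
  std::printf("hp = %d\n", monster->hp());

  munmap(buf, len);
  return true;
}
```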

I don't think I can grow the memory pool contiguously as the buffer's size requirement grows. Is there any suggestion, or heuristic, for how much the allocator should reserve for the mmap?

I am also doubtful about the use of mmap in general on iOS. iOS doesn't purge unused pages, and in a full-file-read scenario I think mmap will cause a crash rather than providing the benefit of incremental reads. Any suggestions on using mmap to read a FlatBuffer on iOS?

@aardappel (Collaborator)

No, how much to allocate depends entirely on your use case. You'll need to pick something that is always bigger than any data you expect to write.

I have no experience with mmap on iOS.
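
One way to act on "pick something that is always bigger" without paying for it in RAM (an addition for illustration, not advice given in the thread): on 64-bit POSIX systems, an anonymous private mapping only consumes physical memory for the pages that are actually written, so the reservation can be generously oversized and the finished buffer written out afterwards.

```cpp
#include <cstddef>
#include <cstdint>
#include <sys/mman.h>

// Reserve far more virtual address space than any buffer is expected to need;
// untouched pages of an anonymous private mapping consume no physical memory.
uint8_t *ReserveBuildSpace(size_t reserve_bytes) {
  void *p = mmap(nullptr, reserve_bytes, PROT_READ | PROT_WRITE,
                 MAP_PRIVATE | MAP_ANONYMOUS /* MAP_ANON on some platforms */,
                 -1, 0);
  return p == MAP_FAILED ? nullptr : static_cast<uint8_t *>(p);
}

// Usage sketch: feed the result (with a deliberately generous upper bound,
// e.g. 1 GiB) to the MappedAllocator sketched earlier, build, write out only
// GetBufferPointer()/GetSize(), then munmap(reservation, reserve_bytes).
```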

@stale (bot) commented Jul 27, 2019

This issue has been automatically marked as stale because it has not had activity for 1 year. It will be automatically closed if no further activity occurs. To keep it open, simply post a new comment. Maintainers will re-open on new activity. Thank you for your contributions.
