When should I use memmap2? #8

bzm3r · 2021-02-07T01:43:01Z

I am hoping the discussion we have here could make it into the project's README eventually, so I'll try to keep it general rather than specific to my use case.

The problem: I keep returning to consider memmap2 for my use case, but remain unsure.

The current situation is as follows:

Problem 1: There is a "primary process" which generates information that we want to keep, but keeping it all in RAM is not feasible.

Question 1: does Problem 1 have the right shape for memmap2 to be considered? If not, what is the right shape of problem for which memmap should be considered? After all, here's a non-memmap solution:

Solution 1 (file):

keep a buffer of the generated data
and when the buffer is full, flush the data via mpsc channels to a writer process. The writer process has a handle to an open file, into which it writes received data to the hard disk using <some binary format> + serde, while the primary process keeps generating data.
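For concreteness, here is a minimal sketch of what Solution 1 might look like using only the standard library. The function name `generate_to_file`, the `u64` record type, the batch size, and the channel capacity of 4 are all assumptions made for illustration, and fixed-width little-endian bytes stand in for "<some binary format> + serde". The "writer process" is modelled as a thread, since mpsc channels connect threads.

```rust
use std::fs::File;
use std::io::{self, BufWriter, Write};
use std::path::{Path, PathBuf};
use std::sync::mpsc;
use std::thread;

/// Hypothetical helper: generates `total` records on the calling thread,
/// buffers them `batch` at a time, and hands full buffers to a writer thread.
fn generate_to_file(path: &Path, total: u64, batch: usize) -> io::Result<()> {
    // Bounded channel (capacity 4 is an arbitrary choice): if the disk falls
    // behind, `send` blocks the generator instead of letting memory grow.
    let (tx, rx) = mpsc::sync_channel::<Vec<u64>>(4);
    let path: PathBuf = path.to_path_buf();

    let writer = thread::spawn(move || -> io::Result<()> {
        let mut out = BufWriter::new(File::create(path)?);
        // The loop ends once every sender has been dropped.
        for chunk in rx {
            for value in chunk {
                // Stand-in encoding: each record is 8 little-endian bytes.
                out.write_all(&value.to_le_bytes())?;
            }
        }
        out.flush()
    });

    let mut buf = Vec::with_capacity(batch);
    for i in 0..total {
        buf.push(i); // placeholder for the real generated data
        if buf.len() == batch {
            let full = std::mem::replace(&mut buf, Vec::with_capacity(batch));
            if tx.send(full).is_err() {
                break; // writer exited early; its error surfaces at join below
            }
        }
    }
    if !buf.is_empty() {
        let _ = tx.send(buf); // send the final, partially filled buffer
    }
    drop(tx); // close the channel so the writer can finish

    writer.join().expect("writer thread panicked")
}
```

Calling `generate_to_file(Path::new("out.bin"), 1_000_000, 4096)` would produce an 8,000,000-byte file. With the bounded channel, the primary thread only waits when the writer is several batches behind; an unbounded `mpsc::channel()` would never block the generator but would queue batches in RAM when the disk is slower.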

On the other hand, here is a memmap2 solution:

Solution 2 (memmap):

keep generated information in a memory mapped structure.

Question 2: Is the following true? "The benefit of Solution 2 (memmap) over Solution 1 (file) is that we do not have to deal with the overhead of inter-thread communication. Put differently, the primary process does not have to wait for a buffer flush + send to complete before continuing to generate data."

Question 3: Does Solution 2 make sense if you have a hard disk with slower write speed than the rate at which the primary process generates data?

One could also imagine the following solution:

Solution 3 (memmap, parallel):

keep a buffer of the generated data
and have a main, memory mapped structure which will hold all the generated data
this main memory mapped structure is kept by a writer process, which is sent information by the primary process using mpsc channels, which it then "appends" to the data in the memory mapped structure it is holding

Question 4: Is the following statement true? "An advantage of Solution 3 is that if we have a hard disk with slower write speed than the rate at which the primary process is generating data, then Solution 3 essentially covers up this issue and replaces the cost instead with that of waiting for a buffer flush + send to complete."

Question 5: Is the following statement true? "The main benefit of memory mapping is to avoid the cost of <binary format> encoding/decoding."

(Thank you for your time.)

RazrFalcon · 2021-02-07T14:57:09Z

Sorry, but I can't help you here. I'm using memmap in a very simple manner and don't really care about edge cases. And there is nothing special about this implementation. So I don't see a point in improving the docs.

bzm3r · 2021-02-07T18:10:17Z

@RazrFalcon I am not asking about edge cases, but the main use case. Although there is nothing special about the implementation, it might still be new to those who are encountering memory mapping ideas for the first time (e.g. me). Rather than just asking my own question and moving away, I tried to be general so that it might help others too. Perhaps this is a bad habit learned from StackOverflow, where generality is mandated.

Anyway, I totally understand that you don't have the time.

RazrFalcon · 2021-02-07T18:31:01Z

It's not about time. I honestly have no idea. I just forked memmap because the original was abandoned. I'm not a memmap expert. And there are a lot of blogs on the Internet that teach about it.

RazrFalcon closed this as completed Feb 7, 2021
