Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add MemMapFileStream. Fixes in memFiles. #7944

Merged
merged 19 commits into from
Jun 14, 2018
Merged

Add MemMapFileStream. Fixes in memFiles. #7944

merged 19 commits into from
Jun 14, 2018

Conversation

data-man
Copy link
Contributor

@data-man data-man commented Jun 2, 2018

No description provided.

result = flushViewOfFile(f.mem, f.size.int32) == 0
if not result:
let lastErr = osLastError()
if lastErr != ERROR_LOCK_VIOLATION.OSErrorCode:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is this particular error ignored?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

From MSDN:

The long remarks Remarks

Flushing a range of a mapped view initiates writing of dirty pages within that range to the disk. Dirty pages are those whose contents have changed since the file view was mapped. The FlushViewOfFile function does not flush the file metadata, and it does not wait to return until the changes are flushed from the underlying hardware disk cache and physically written to disk. To flush all the dirty pages plus the metadata for the file and ensure that they are physically written to disk, call FlushViewOfFile and then call the FlushFileBuffers function.

When flushing a memory-mapped file over a network, FlushViewOfFile guarantees that the data has been written from the local computer, but not that the data resides on the remote computer. The server can cache the data on the remote side. Therefore, FlushViewOfFile can return before the data has been physically written to disk.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh!
I found this code: MappedByteBuffer.c#L66

 retry = 0;
    do {
        result = FlushViewOfFile(a, (DWORD)len);
        if ((result != 0) || (GetLastError() != ERROR_LOCK_VIOLATION))
            break;
        retry++;
    } while (retry < 3);
``

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That code retries the flushing operation a maximum of 3 times. After 3 times, if the lock error occurs, it is still raised.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see.
Therefore I ignore ERROR_LOCK_VIOLATION.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe add a similar code both for Windows and for Posix?

@Varriount
Copy link
Contributor

Please fix the build errors. It appears the code breaks when string tainting mode is turned on.

@data-man data-man changed the title Add MemMapFileStream Add MemMapFileStream. Fixes in memFiles. Jun 2, 2018
@data-man
Copy link
Contributor Author

data-man commented Jun 2, 2018

Why bugs in memfiles didn't appear in the tests before?
Caching in CIs?

@data-man
Copy link
Contributor Author

data-man commented Jun 3, 2018

Ready for something.

import os, streams
var
mms: MemMapFileStream
fn = "test.mmapstream"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Use a const for fn.

result = flushViewOfFile(f.mem, 0) != 0
if not result:
let lastErr = osLastError()
if lastErr != ERROR_LOCK_VIOLATION.OSErrorCode:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This shouldn't be ignored. The C code this was based on tries a maximum of 3 times iff ERROR_LOCK_VIOLATION is raised each time. The 4th time, ERROR_LOCK_VIOLATION is raised.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So I propose proc flush*(f: var MemFile, triesNumber: int = 3): bool both for Windows and Posix.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why would that return bool? Nim's stdlib uses exceptions.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You can call flush and if result == false then repeat the call. Can be useful, IMHO.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You might as well catch the exception then. And why would I repeat the call myself when it didn't succeed after triesNumber?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I followed the logic proc open*(f: var File, ...): bool.

So

  • add triesNumber (or numTries)?
  • remove bool result?
  • no raise an exception inside flush?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  • raise an exception for MemMapFileStream.flush?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

proc flush(f: var MemFile; attempts = 3) {.raises: [IOError].}

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see inconsistency here: flush for FileStream don't raises exception.
flushFile has WriteIOEffect tag only.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

flushFile is allowed to raise that and should be annotated too.

@data-man data-man dismissed Varriount’s stale review June 6, 2018 16:26

Added the attempts parameter

@@ -245,6 +249,30 @@ proc open*(filename: string, mode: FileMode = fmRead,
if close(result.handle) == 0:
result.handle = -1

proc flush*(f: var MemFile; attempts: int = 3) =
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why would flushing require multiple attempts?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ahh, the hidden reviews contain the answers. The original code also contains this really useful comment which should be copied into our code IMO.

@@ -437,6 +438,71 @@ when not defined(js):
else:
raise newEIO("cannot open file")

type
MemMapFileStream* = ref MemMapFileStreamObj ## a stream that encapsulates a `MemFile`
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should be in the memfiles module.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

And StringStream in the strutils?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No because string is in system.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So, move to memfiles?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, please.

var res = false
while retry <= attempts:
res = flushViewOfFile(f.mem, 0) != 0
if res:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also, you're retrying no matter what the error is, that's bad. Only retry on the ERROR_LOCK_VIOLATION error!

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't see why replicating this to POSIX makes sense. In fact, I'm skeptical about whether this should be configurable via a param at all.

## Flushes `f`'s buffer for the number of attempts equal to `attempts`.
## If were errors an exception `OSError` will be raised.
when defined(windows):
var retry = 0
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are for loops outdated now?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oops, С-thinking. :)

@data-man
Copy link
Contributor Author

data-man commented Jun 6, 2018

All a bit complicated. :)
Linux has int sync_file_range(int fd, off64_t offset, off64_t nbytes, unsigned int flags).
It's can be more efficient for Linux only (we haven't it in linux.nim).

@dom96
Copy link
Contributor

dom96 commented Jun 7, 2018

You're still not checking the error code on Windows and are performing the retries on Linux.

#7944 (comment)

@data-man
Copy link
Contributor Author

data-man commented Jun 7, 2018

Yes, and now it's correct, IMO.

are performing the retries on Linux.

Why not?

when defined(windows):
var res = false
for i in 1..attempts:
res = flushViewOfFile(f.mem, 0) != 0
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Again, what dom said, why not base the retry on the return value of flushViewOfFile?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Currently, an exception will be raised due to any OS error.
The developer has to decide what to do next.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ignoring some specific errors, it's a bad design.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, it's retrying only on specific errors, it's good design. Of course you should raise an exception for any other error.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well, then I need to investigate/experiments with flags and error codes of msync, too. :)

@@ -245,6 +249,26 @@ proc open*(filename: string, mode: FileMode = fmRead,
if close(result.handle) == 0:
result.handle = -1

proc flush*(f: var MemFile; attempts: uint = 3) =
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Use Natural instead of uint here.

@dom96
Copy link
Contributor

dom96 commented Jun 8, 2018

are performing the retries on Linux.

Why not?

These retries are only for a specific Windows-only error. They do not apply to Linux.

@data-man
Copy link
Contributor Author

data-man commented Jun 8, 2018

They do not apply to Linux.

To be more precise, msync can returns EBUSY if MS_INVALIDATE was specified in the flags.

(Since Linux 2.6.19, MS_ASYNC is in fact a no-op, since the kernel properly tracks
dirty pages and flushes them to storage as necessary.)

@dom96
Copy link
Contributor

dom96 commented Jun 8, 2018

Okay, so if it's appropriate to retry on that error code then do the retry, but don't retry on any error.

@data-man
Copy link
Contributor Author

data-man commented Jun 8, 2018

Maybe to do a pause between attempts?
I tried use sleep from os module, but it has TimeEffect tag.
What if to use getTicks from system/timers?

@Araq
Copy link
Member

Araq commented Jun 14, 2018

Maybe to do a pause between attempts?

No.

@data-man
Copy link
Contributor Author

Ready for merge, I hope.

@dom96 dom96 merged commit bf5d619 into nim-lang:devel Jun 14, 2018
@dom96
Copy link
Contributor

dom96 commented Jun 14, 2018

Good enough I suppose :)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants