What about: Virtual memory effects #28

Open
RalfJung opened this issue Sep 18, 2018 · 10 comments
Labels
A-memory (Topic: Related to memory accesses)
S-pending-design (Status: Resolving this issue requires addressing some open design questions)

Comments

@RalfJung
Member

During the pointer docs update, a discussion spawned off about effects of virtual memory: What do we still guarantee when different virtual addresses map to the same physical address? Some of libstd, e.g. copy_nonoverlapping, will start misbehaving in that situation. FWIW, C seems to just not care: memmove has the same problem.
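
To illustrate why address-based reasoning breaks down here, the following is a minimal sketch of a disjointness check that looks only at virtual addresses. The helper name `ranges_disjoint_by_address` is hypothetical and is not a libstd API. It is only meant to show that any check of this shape would report two mappings of the same physical page as disjoint, because their virtual address ranges differ.

```rust
/// Hypothetical helper (not part of libstd): reports whether two byte ranges
/// of length `len` are disjoint, judging only by their virtual addresses.
fn ranges_disjoint_by_address(a: *const u8, b: *const u8, len: usize) -> bool {
    let (a, b) = (a as usize, b as usize);
    // Distance between the two start addresses, regardless of order.
    let distance = a.abs_diff(b);
    distance >= len
}

fn main() {
    let buf = [0u8; 16];
    let base = buf.as_ptr();

    // Overlapping sub-ranges of one buffer are detected as expected.
    let shifted = base.wrapping_add(4);
    println!("{}", ranges_disjoint_by_address(base, shifted, 8)); // false

    // Ranges that are far enough apart are reported as disjoint.
    let far = base.wrapping_add(8);
    println!("{}", ranges_disjoint_by_address(base, far, 8)); // true

    // If two *different* virtual pages were mapped to the same physical page,
    // this check would also report them as disjoint, even though a write
    // through one pointer would be visible through the other.
}
```

The sketch does not create a double mapping itself; doing so needs platform-specific calls such as `mmap`, which are outside what the standard library exposes.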

@gnzlbg
Contributor

gnzlbg commented Sep 18, 2018

Is it undefined behavior to have two pointers with different values referring to the same object?

TL;DR:

> Every byte has a unique address. [...] Undefined behavior does not even enter the equation as you simply cannot have two pointers to the same object that have different values.

and

> The C++ standard only concerns about one way to view the memory. If the system uses virtual memory, then the standard is only concerned about virtual memory. You can never have two pointers with different value referring to the same object.

The recommendation there is pretty much that any code doing something like this should use volatile loads and stores.

@RalfJung
Member Author

> Undefined behavior does not even enter the equation as you simply cannot have two pointers to the same object that have different values.

Oh but you can (but for unrelated reasons.^^)

@RalfJung
Member Author

More on topic, AFAIK LLVM does not use aliasing information to inform ptr equality tests, and vice versa. So with a refined view of aliasing, I see no reason why LLVM's memory model would not work with overlapping virtual memory.

@gnzlbg
Contributor

gnzlbg commented Sep 18, 2018

> Oh but you can (but for unrelated reasons.^^)

Could you point me to the part of the blog post that allows you to construct two pointers with different addresses that refer to the same object in C or C++?

The only relevant example I can find there is the one with the out-of-bounds pointers, but that is the opposite case: two pointers that point to the same address but refer to two different objects. Technically, one refers to an object, while the other does not and is therefore illegal to dereference.

@RalfJung
Member Author

You said "different values". The value of a pointer includes its provenance information. But I am just being picky here, that's off-topic -- sorry.

@alercah

alercah commented Sep 18, 2018

This should definitely be a topic of discussion, especially because of the possibility that a filesystem process ends up mmapping its own data. Thar be dragons here, but hopefully we can come up with sensible guarantees at least.

@gnzlbg
Contributor

gnzlbg commented Sep 18, 2018

> You said "different values". The value of a pointer includes its provenance information. But I am just being picky here, that's off-topic -- sorry.

@RalfJung No offense taken, that's an important difference! With mmap one can create two pointers with different values (different bit patterns and different provenance) that refer to the same memory on purpose. In the out-of-bounds example, obtaining two pointers with the same bit pattern and different provenance (e.g. from two different malloc calls) feels incidental to me.

EDIT: What I mean by "on purpose" and "incidental" is that in the mmap case, it is the user's intent to manipulate the same object/memory/... via two pointers with different values (different bit patterns and different provenance). In the out-of-bounds case, that's less clear to me. We don't necessarily have to treat both cases equally.

@RalfJung added the C-open-question (Category: An open question that we should revisit) and A-memory (Topic: Related to memory accesses) labels on Aug 14, 2019
@JakobDegen
Contributor

It seems unlikely to me that we'll want to adjust the AM (Abstract Machine) memory model specifically to support this, which leaves us very little wiggle room. Option 1 for users is to use virtual memory shenanigans only in ways that make sense to the AM. This probably means not double-mapping pages. Option 2 is to use volatile operations, which are the standard tool for "I have some concept of memory that the AM doesn't understand."

Is there anything I'm missing here?
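
As a rough illustration of Option 2, here is a minimal sketch of a byte-wise copy built on `std::ptr::read_volatile` and `std::ptr::write_volatile`. The function name `volatile_copy` is hypothetical, and the example runs on ordinary stack buffers; in the double-mapping scenario the pointers would instead come from the mapping calls, which are not shown here.

```rust
use std::ptr;

/// Hypothetical helper: copies `len` bytes from `src` to `dst` one byte at a
/// time using volatile accesses, which the compiler will not elide or merge.
///
/// # Safety
/// `src` must be valid for reads and `dst` valid for writes of `len` bytes.
unsafe fn volatile_copy(src: *const u8, dst: *mut u8, len: usize) {
    for i in 0..len {
        // Each iteration performs exactly one volatile load and one volatile store.
        unsafe {
            let byte = ptr::read_volatile(src.add(i));
            ptr::write_volatile(dst.add(i), byte);
        }
    }
}

fn main() {
    let src = [1u8, 2, 3, 4];
    let mut dst = [0u8; 4];
    // Safety: both buffers are live and exactly `src.len()` bytes long.
    unsafe { volatile_copy(src.as_ptr(), dst.as_mut_ptr(), src.len()) };
    println!("{:?}", dst); // [1, 2, 3, 4]
}
```

Volatile accesses are not atomic and do not provide synchronization, so this sketch says nothing about concurrent access from other threads or processes.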

@JakobDegen added the S-pending-design (Status: Resolving this issue requires addressing some open design questions) label and removed the C-open-question (Category: An open question that we should revisit) label on May 29, 2023
@comex

comex commented May 30, 2023

A one-mapping-per-process limit would be bad for composability. It would mean that libraries essentially could not make non-volatile accesses to writable mappings at all – even if they avoided data races using some external synchronization mechanism or using atomics – since they wouldn't know if some other library in the same process was using the same file.

(That said, for the case of atomics, aren't you technically supposed to use volatile atomics for cross-process synchronization anyway? But the standard library has no support for those.)

Is there any actual problem here other than copy_nonoverlapping? The Python example doesn't count; it's not specific to having multiple mappings. (If the Python code fails to preserve the expected behavior of memory reads and writes, then the memory mapping becomes something like MMIO, 'special' memory where reads and writes have arbitrary effects. That case clearly requires volatile accesses. If the Python code does preserve the expected behavior of reads and writes, then it doesn't matter.)
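
For the atomics case mentioned above, a library might view a word inside a shared mapping through `AtomicU32`. The sketch below is only an assumption about how that could look: the helper `as_atomic` is hypothetical, and the example uses a local variable in place of a real mapping. These are ordinary (non-volatile) atomics, which is exactly the kind of access whose status is in question here.

```rust
use std::sync::atomic::{AtomicU32, Ordering};

/// Hypothetical helper: reinterprets a raw `u32` location as an `AtomicU32`.
///
/// # Safety
/// `ptr` must be non-null, aligned for `u32`, valid for reads and writes for
/// the whole lifetime `'a`, and only accessed atomically while the returned
/// reference is in use.
unsafe fn as_atomic<'a>(ptr: *mut u32) -> &'a AtomicU32 {
    // `AtomicU32` has the same size and alignment as `u32`.
    unsafe { &*(ptr as *const AtomicU32) }
}

fn main() {
    // Stand-in for a word that would live inside a shared mapping.
    let mut word: u32 = 0;
    // Safety: `word` outlives `counter` and is not accessed directly meanwhile.
    let counter = unsafe { as_atomic(&mut word) };
    counter.fetch_add(1, Ordering::SeqCst);
    println!("{}", counter.load(Ordering::SeqCst)); // 1
}
```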

6 participants