Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Mark most functions as unsafe #2453

Closed
grovesNL opened this issue Oct 29, 2018 · 30 comments
Closed

Mark most functions as unsafe #2453

grovesNL opened this issue Oct 29, 2018 · 30 comments

Comments

@grovesNL
Copy link
Contributor

Because we follow Vulkan closely for the HAL API, we don't have many safety guarantees for most of the API usage. A few users have asked to have the explicit unsafe annotation to make it more obvious that no guarantees are provided here.

@kvark
Copy link
Member

kvark commented Oct 29, 2018

Thank you for filing the issue! It did raise up quite a few times, and it's super important to discuss in detail.

The initial approach of not having unsafe was taken for the following reasons:

  1. gfx-rs wanted to be safe, but actually, it just steered too low for that
  2. there is a precedent of gleam wrapping around a graphics API (a thin wrapper) and doesn't mark the functions as unsafe
  3. while the driver can technically collapse the universe, Rust code itself doesn't do any exposed unsafety. In that sense, why is, say, division operator on CPU is not unsafe? There is a line to draw between what we consider safe and what the driver/firmware/hardware considers safe.

@nical
Copy link
Contributor

nical commented Oct 29, 2018

FWIW, gleam lying about unsafety has been criticized by the rust safety brigade some people in the past.
I believe the question is: can you pick parameters that cause safety hazards per specification (as in not relying on a driver bug)? if so, something should be unsafe.

@nox
Copy link

nox commented Oct 29, 2018

  • there is a precedent of gleam wrapping around a graphics API (a thin wrapper) and doesn't mark the functions as unsafe

Many of those methods should actually have been marked unsafe, I'm fixing them when I stumble upon such methods.

I believe the question is: can you pick parameters that cause safety hazards per specification (as in not relying on a driver bug)? if so, something should be unsafe.

It's a bit more complicated than that, for example new unpack parameters could be introduced in such a way that tweaking them makes your safe glTexImage2D wrapper unsound, and we can't predict the future.

@fintelia
Copy link
Contributor

fintelia commented Nov 1, 2018

One approach that may work for gfx-hal is to use the same strategy as the memmap crate and mark one of the early initialization functions unsafe (say the function to create an Instance) and then leave everything else as safe. This way users have to acknowledge that misuse could technically result in unsafety without severely degrading ergonomics across the entire crate.

@kvark
Copy link
Member

kvark commented Nov 1, 2018

@fintelia this is an interesting idea, but I'm not sure if it's valid. The point of unsafe in general is that you can build safe abstractions on top. If you use it in a way of saying: "here you opt into unsafety, so don't expect any further work to be safe", then this would be against the spirit of the word.

summoning @gankro as our unsafety expert :)

@Gankra
Copy link

Gankra commented Nov 1, 2018

There's definitely some precedent for rules-lawyering unsafe boundaries. The key invariant that the stdlib team has generally focused on is: if you don't write unsafe anywhere in your application, then we can guarantee memory safety.

This gives us a fair amount of leeway to only mark key operations as unsafe, and mark many others as safe, with the understanding that you need to use one of those key unsafe operations to trigger unsafety.

For instance casting pointers to different types, or making them from integers is considered safe, because all the other operations like offset and read are marked as unsafe, and clearly document that they assume that the pointer is well-aligned, valid to access, and so on. (edit: you can think of this philosophy as "just in time" or "late" [in the sense of C++ templates being "late"] unsafety declaration)

With that said, I can't think of an example where we made this kind of "blood pact" design, where you sign away your rights by calling unsafe once upfront, and then never need to state that you understand things are unsafe again (at least for public APIs -- we lie about unsafety internally in the stdlib all the time since it doesn't matter).

I think the reason we don't tend to do that is it makes it easy for someone to "sign the pact for you" -- imagine a version of the *const API where we flip it on its head: creating pointers in unsafe, but every other operation is marked as safe. In this way we could technically still make the "you wrote unsafe so everything in unsafe" argument. However, any API that claims to be safe but hands you a raw pointer is messing up the whole concept of unsafe APIs, because it has signed the pact on your behalf. Not dissimilar to writing fn totally_safe_transmute<T, U>(t: T) -> U { unsafe { transmute(t) }, but much less obvious to notice.

@zakarumych
Copy link

zakarumych commented Nov 1, 2018

As a rule of thumb I think that any function that can be used to violate Rust safety guarantees must be marked unsafe. E.g. if you can provide a set of valid usage rules for the function - mark function unsafe and specify rules in doc comment. Only if the is no rules to follow the function can be considered safe.
This rule of thumb won't work with unsafe init and safe everything because you'll fail to write exhaustive valid usage rules for such init function.

@fintelia
Copy link
Contributor

fintelia commented Nov 1, 2018

This crate provides low level access to an extremely powerful, programmable co-processor with direct DMA access to main memory. It relies on underlying APIs which deliberately leave certain misuses undefined because detecting them would incur too large of a runtime cost. I agree that the ideal option would be to identify a small subset of the gfx-hal functions that are unsafe and mark everything else safe, but I'm not convinced that there is a small subset that would qualify. As I see it there are a couple options:

  1. Just lie about safety and don't mark anything unsafe.
  2. Mark some init function unsafe (satisfying any rules lawyers, but in practice the same as 1)
  3. Mark only some subset of functions (but hard to figure out which, might end up being almost everything)
  4. Mark everything unsafe (bad ergonomics but minimal effort)

In cases 2-4 the "set of valid usage rules" will probably just be a pointer to the vulkan spec and a disclaimer not to do anything that would cause problems, perhaps with a specific warnings about avoiding use after frees, etc.

@kvark
Copy link
Member

kvark commented Nov 1, 2018

This crate provides low level access to an extremely powerful, programmable co-processor with direct DMA access to main memory.

This quote actually got me thinking... What is the real unsafety limits of command buffer recording? All the work during recording is done on CPU, and the basic constraints (say, resources are alive at least at the point of recording) are easy to check in the backends.

What if we proceed with the following guidelines:

  • all command buffer recording operations are safe. Whatever unsafety constraints are imposed by the backends, they will simply assert! on, causing a panic instead of UB
  • queue submissions are totally unsafe, since this is what touches GPU/DMA
  • for the device operations, we go case by case basis: if we can enforce valid usage with assertions, we mark it safe in the API, otherwise we keep it unsafe.

@zakarumych
Copy link

zakarumych commented Nov 2, 2018

@kvark I'm not sure how you will assert valid usage of vkCmdDraw

@nox
Copy link

nox commented Nov 2, 2018

This one looks similar to glDrawElementsInstanced and friends, and that is safely exposed to the entire Web in WebGL, either by extensive checks or by relying on the KHR_robust_access extension.

One could make the crate panic if the robustBufferAccess feature is missing or if the unsafe omit_robust_buffer_access method is not called.

Disclaimer: I know nothing about Vulkan.

@zakarumych
Copy link

zakarumych commented Nov 2, 2018

@nox if we are willing to do extensive checks sacrificing performance like OpenGL does - we would use OpenGL 😄

Take note that robust buffer access doesn't guard you against other usage violation.

@nox
Copy link

nox commented Nov 2, 2018

How is that related? Whichever group will expose Vulkan to the Web will have to implement those extensive checks too. It's not about the tech but about where it's exposed. The Web runs third-party code so the API to access OpenGL needed safety checks, that's all.

My solution didn't say anything about implementing extensive checks in this crate.

@zakarumych
Copy link

zakarumych commented Nov 2, 2018

@nox So. Should be CommandBuffer::draw method safe or unsafe?
This discussion is about making some functions safe and some unsafe. I was asking how one can check valid usage of CommandBuffer::draw (inside it) to mark it safe.

@nox
Copy link

nox commented Nov 2, 2018

If there is something that leads to out-of-bounds buffer access, yes, that thing should definitely be unsafe. My point was that the robustBufferAccess feature seems to make those go away, so the crate could panic if it's missing, and provide an unsafe method to say "yeah even if that feature is missing, I swear with my blood that I know what I'm doing". Everything else about the Web was to explain where I'm coming from.

@nox
Copy link

nox commented Nov 2, 2018

Take note that robust buffer access doesn't guard you against other usage violation.

Don't edit your comments, I almost missed that. I am aware this doesn't guard against other usage violations, this kind of safety study is on a per-basis thing.

@zakarumych
Copy link

Also robustBufferAccess takes buffer range from descriptor or buffer view which can be set to wrong value.

@zakarumych
Copy link

Don't edit your comments

My worst habit.

@zakarumych
Copy link

My point is that command buffer recording methods has complex rules for valid usage. The idea to mark them all safe seems weird to me.

@kvark
Copy link
Member

kvark commented Nov 2, 2018 via email

@zakarumych
Copy link

zakarumych commented Nov 2, 2018

@kvark can you be sure about that?

@zakarumych
Copy link

I expect vkCmdDraw to prepare draw command data for gpu, read from pipeline, descriptors, framebuffer and renderpass objects. And I would expect no checks during those reads.

@nical
Copy link
Contributor

nical commented Nov 2, 2018

I expect vkCmdDraw to prepare draw command data for gpu, read from pipeline, descriptors, framebuffer and renderpass objects. And I would expect no checks during those reads.

I don't expect so, or the specification would have to mention that these resources (at least the mutable ones) be externally synchronized in the list at section 2.6 of this page: https://www.khronos.org/registry/vulkan/specs/1.0/html/vkspec.html.

@zakarumych
Copy link

@nical Those resources are immutable. You don't need to externally synchronize them if there is no way to mutate them.

@zakarumych
Copy link

zakarumych commented Nov 2, 2018

It may be don't read them until submission. But it would make sense to do CPU side job while recording since recording can be done in parallel. Deferring computations until submission will reduce CPU utilization.

@kvark
Copy link
Member

kvark commented Nov 2, 2018

@omni-viral I checked the AMD's Vulkan implementation, and this indeed appears to be the case: https://github.com/mesa3d/mesa/blob/677b496b6bd07cbe05dd429344ba525619cdd08c/src/amd/vulkan/radv_cmd_buffer.c#L1843

The driver checks the pipeline, active descriptor sets, vertex buffers, and more - all on the draw().

@kvark
Copy link
Member

kvark commented Nov 2, 2018

Perhaps we should make all the draws, dispatches, and copies unsafe then. It's still a big win if all the state setting is "safe" prior to the draws.

@zakarumych
Copy link

@kvark I'd stick with Vulkan valid usage rules. Without making assumptions about how spec is implemented. If there is enough data to assert valid usage (without compromising speed) then do that and mark function safe.

@kvark
Copy link
Member

kvark commented Nov 2, 2018

@omni-viral there is a certain trade-off we can afford by adding (limited!) run-time checks, e.g. vkCmdBindPipeline valid usage is trivial to check with an assert!.

@zakarumych
Copy link

Unfortunately there aren't many you can check that easily.

@kvark kvark added this to the HAL 0.1 release milestone Dec 19, 2018
@zakarumych zakarumych mentioned this issue Dec 24, 2018
4 tasks
bors bot added a commit that referenced this issue Dec 26, 2018
2526: Add few unsafe markers r=kvark a=omni-viral

Fixes #2453
PR checklist:
- [x] `make` succeeds (on *nix)
- [x] `make reftests` succeeds
- [x] tested examples with the following backends:
  metal
- [ ] `rustfmt` run on changed code


Co-authored-by: Zakarum <scareaangel@gmail.com>
@bors bors bot closed this as completed in #2526 Dec 26, 2018
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

7 participants