New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Crazy idea: wedding with Ash #2206

Closed
kvark opened this Issue Jul 4, 2018 · 26 comments

Comments

Projects
None yet
9 participants
@kvark
Copy link
Member

kvark commented Jul 4, 2018

Ash is the low-level Vulkan binding that we use in the backend: https://github.com/MaikKlein/ash
gfx-rs has seen a lot of evolutionary development, a bit too much even. It went from a D3D11-like pre-ll to Vulkan-like HAL, and it did so gradually. Bit by bit we are rewiring HAL to serve the needs of portability, which makes it less safe and looking more like Ash. As we do this, the value of our type sugar drops as well, since it's not backed by the safety guarantees much. Vulkan backend becomes more and more just a straight translation to Ash.

It also becomes fairly clear that HAL is not usable by small teams who don't focus on performance. This is something that wasn't obvious until we made a full turn to VkPortability (and learned about how complex Vulkan really is). Thus, the type sugar we have is going to be hidden/wrapped by a higher-level library, such as Amethyst, GGEZ, or WebGPU in the future.

With that in mind, I'd like to propose a crazy idea of making HAL a drop-in replacement for Ash. Here are the expected results from this grand move:

  • removal the Vulkan backend
  • removal of the type sugar layer
  • Rust ecosystem benefit for one less Vulkan API/library to target
  • simpler API signatures (slices instead of iterator generics)
  • potentially faster Vulkan for not introducing mid layer and not collecting iterators into temporary vectors

Concerns required for resolving:

  • how would the portability layer look? Can't afford any overhead there
  • what do we do about generic support for backends (current Backend trait)? Ash doesn't have generic-based API

Thoughts? Complaints? This is just an idea on the table so far ;)
cc @MaikKlein @msiglreith , and everyone else is welcome to join the discussion!

@MaikKlein

This comment has been minimized.

Copy link
Contributor

MaikKlein commented Jul 4, 2018

what do we do about generic support for backends (current Backend trait)? Ash doesn't have generic-based API

You probably could implement the DeviceV1_X traits for hal. But I haven't designed it to be a generic backend, but I am open to changes. I guess a more static interface would make more sense here.

I haven't really used hal that much but I like what I saw. It feels like vulkan but with bit more "rusty" api than ash.

How would hal differ from the portability lib? How would vulkan extensions work in hal?

I am also willing to move ash into the gfx-rs organization.

@msiglreith

This comment has been minimized.

Copy link
Member

msiglreith commented Jul 4, 2018

I tend to agree that our layer become very thin mostly because it's pretty clear now that we can implement ~90% of Vulkan without major obstacles and can stick to the API pretty closely. Ash is pretty close to Vulkan. On the other hand I'm not sure if we just expose the raw C API in the backends:

  • Basically removing merging portability into this repository (no fiddling with Handle and Boxes yay).
  • Easy to add new extensions and might be more attractive for non-Rust folks.
  • Easier to target for other libraries.

At the end we can still keep the current gfx-hal API with the generic backend interface (using macros). We basically would invert the current hierarchy: gfx-hal implemented on top of portability.

@kvark

This comment has been minimized.

Copy link
Member

kvark commented Jul 4, 2018

I see lots of great input, thanks for prompt responses!

@MaikKlein

You probably could implement the DeviceV1_X traits for hal

That's a cool idea, we should try it out.

How would hal differ from the portability lib?

The only difference, I suppose, would be that gfx-rs is still a Rust API.

How would vulkan extensions work in hal?

The same way they work in Ash: exposed to the user and then requested explicitly. I suppose the "requested" part is actually at the type level in Ash, right? In this case, the portability layer would just have a bunch of Option<SomeExtension> populated when it's requested by the user.

I am also willing to move ash into the gfx-rs organization.

Awesome! That would be highly appropriate if we go this (wedding) path.

@msiglreith

On the other hand I'm not sure if we just expose the raw C API in the backends

That's another crazy idea, right here :) Wouldn't it imply that all the Rust apps then have to cross the C boundary twice when basing on gfx-rs? First, between Ash and the backend. Second, between the backend and the native C API. This would be a regression from what we have now. Another negative side is that each backend would have to deal with unsafe inputs all the time, which means more boilerplate/redundancy and less safety (obviously).
As for the listed benefits, the former is largely negated having by gfx-hal on top of the portability (roughly the same code as gfx-hal below the portability, it seems), if we keep it. And the last two benefits seem to be equally true for the current portability library, don't they?

We basically would invert the current hierarchy: gfx-hal implemented on top of portability.

What I was trying to say in the issue is that I don't see much value in whatever HAL provides on top of raw backends at the moment. It's another "gfx-render" story, just lower level. Better shoot straight for the usable higher level APIs from here.


The big architectural concern I missed originally is that Ash types are copyable, and HAL takes everything by reference. This allows us to have bigger structs with custom data used by the backends, but it also gives us a bit of a headache (e.g. we still require CommandBuffer: Clone, which is ugly and annoying). If we are to make those types all be basically usize (or *mut _), the consequences are going to be:

  • nice fit for Ash
  • probably easier to work with? copyable handles allow more freedom for the users
  • no ability to enforce Send/Sync? (I guess this applies to Ash wedding in general)
  • straightforward portability implementation (no Handle/Box like @msiglreith mentioned)
  • the (non-Vulkan) backends would then have to do the Handle/Box wrapping internally... which is a clear regression from the current use of gfx-hal (from Rust).

The last point makes me especially worried... One of those clients is WebRender, and we simply can't afford to drop the performance here without a strong reason to do so.

@AIOOB

This comment has been minimized.

Copy link
Contributor

AIOOB commented Jul 4, 2018

I like this idea, in general, HAL does seem to me to be a bit confused at the moment. As far as I see it, there are two ways of doing it, at least that I can think of:

HAL is the Ash traits (which interestingly could probably be generated), a very thin portability layer which essentially just converts the C ABI to the Rust ABI, and the backends implement the Ash API (the Vulkan backend would hopefully get optimized away).

HAL essentially being most of Ash with the backends implementing the Vulkan C API. So things wanting to use VkPortability can just use the backends without HAL at all. This may be harder to manage as their can't be any guiding traits.

The first way seems to be the more sensible option as it uses more of Rust's strengths and involves fewer ABI changes. However, the second method is less reliant on the needed optimisations being performed as it involves fewer steps in the VkPortability case.

I'm confused as to why handles and boxing is going to go away. They are part of the Vulkan spec and are going to have to exist in some form. I'm thinking this change is going to move more control into the backends which I would assume can only decrease overhead.

@msiglreith

This comment has been minimized.

Copy link
Member

msiglreith commented Jul 4, 2018

I like this idea, in general, HAL does seem to me to be a bit confused at the moment.

yes due to its history, originating from creating a middleground between those APIs similar to the Forge project. The VkPortability stuff wasn't on the table back then.

I'm confused as to why handles and boxing is going to go away.

Handle stuff is in the portability branch only, HAL basically returns the pure objects.


Wouldn't it imply that all the Rust apps then have to cross the C boundary twice when basing on gfx-rs?

App -> Ash -> HAL, no C-boundaries involved considering changing the trait interfaces/implementations of Ash a bit.

Another negative side is that each backend would have to deal with unsafe inputs all the time, which means more boilerplate/redundancy and less safety (obviously).

Yes, kind of. We would need to due to ptr->slice conversions in the backends. Based on my experience with rostkatze I would say it's marginal. My C++ coding style became quite rusty and I didn't feel like a major concern to me.
Another regression in general will be lack of iterator based interfaces but I don't think we can do much about it.

As for the listed benefits, the former is largely negated having by gfx-hal on top of the portability (roughly the same code as gfx-hal below the portability, it seems), if we keep it.

I tend to agree that keeping the current HAL implementation would be a bit more work. But compared to the current situation the bottom layer API would be fixed, requiring only version updates.

And the last two benefits seem to be equally true for the current portability library, don't they

mmh yes, but having a wrapper of a wrapper is a bit more scary :D

What I was trying to say in the issue is that I don't see much value in whatever HAL provides on top of raw backends at the moment.

I fully agree on that point. IMO the maintenance costs grew over the time: Changing bits of the current API is quite costly in terms of updating depending crates additionally to trying coming up with some sort of rusty API. My hypothesis is that Ash over raw C doesn't provide enough benefit:

  • Not fully complete
  • Only provides a few features over the C api (slices and return types)
  • Need to write an crate for Ash -> C API

the (non-Vulkan) backends would then have to do the Handle/Box wrapping internally... which is a clear regression from the current use of gfx-hal

Via the host allocator interface it should be possible to mostly negate the costs.

@kvark

This comment has been minimized.

Copy link
Member

kvark commented Jul 5, 2018

@msiglreith thank you for clarification! I think I understand now the idea - you are basically saying that each backed will have a different implementation for gfxXXX functions (we currently have in portability) and the basic types, correct?

Via the host allocator interface it should be possible to mostly negate the costs.

This is a particularly interesting (and promising) point!

Ways to go

As @AIOOB correctly identified, we have multiple options now:

  • HAL: the current approach
  • Ash: implement Ash traits
  • Portability: just expose a rough C API, which one still can depend on with Cargo in Rust

Trying to bring more structure to the arguments voiced in the discussion, I came up with this table:

HAL Ash Portability
User base us us + Ash us + Ash + Vulkano
Rustyness good to have iterators and borrowed objects slices are good, but objects have to be passed by raw pointers zero
Safety pretends to provide some with type sugar, but still horribly unsafe not safe totally not safe
Overhead for Portability all types are boxed zero N/A
Overhead for Vulkan need to collect iterators into temporary vectors N/A N/A
Overhead for others minimal all types are boxed all types are boxed
Maintain HAL and Warden yes possible no

Analysis

First of all, the current HAL approach is not very sound. It has all the bits and nits of Vulkan in it, with the associated unsafety, which is not consistent with the type semantics/constraints we are trying to preserve... For example, the clear color iterator doesn't have to be of the same length as the number of color attachments, yet some of the elements are ignored. Or the fact our CommandBuffer: Clone while other resources are not, and there should no be reason for this one to be cloned.

Secondly, with Ash having the traits already (and some user base), it's tempting to try implementing those. Dealing with slices are a pleasure, and Ash has nice object-oriented API that is nice to work with. However, it does require us to box all the resources like we do in Portability, turning them into bare mutable pointers that have to be unsafely(!) dereferenced at every spot... so it's not like we can have even a minimum type safety in the backend implementations.

That last point is tipping my scales towards the last solution, ever slightly. I'll keep chewing through those in the meantime and appreciate more input. Overall, I'm glad we haven't rushed with HAL after all :)

@termhn

This comment has been minimized.

Copy link
Contributor

termhn commented Jul 5, 2018

After turning it over for a while I do think the “portability” option seems like a good one... in addition, Vulkano and/or Ash could probably be written on top of this “new HAL”/portability lib and get the benefits of the cross-platform support while paying little to zero overhead when actual Vulkan is available. Also, documentation and knowledge base/user base would be able to be shared with Vulkan basically directly so that it’s not just “if you know Vulkan you can pick it up pretty easily,” but rather “if you know Vulkan then you know this.”

@anderejd

This comment has been minimized.

Copy link
Contributor

anderejd commented Jul 6, 2018

Related github issue in Vulkano: vulkano-rs/vulkano#525

@kvark

This comment has been minimized.

Copy link
Member

kvark commented Jul 6, 2018

I think I got the "User base" metric wrong. Since the portability layer is always going to be there, Vulkano and others can just use that (either the Rust library or straight as a dynamic system lib) anyway (and they aren't really blocked to do so even now). So having us to use it as a basis for backend implementations doesn't change anything for those users. Please correct me if I'm missing something... With this out of the way, let's focus on the remaining differences between Ash and Portability paths.

Ash:

  • (+) has nice traits already, uses slices and options
  • (-) a bit of work needs to be done to it:
    • cleanly separate the native vulkan implementation (using function pointers) from the traits. Traits shouldn't know about the function pointers.
    • implement those for the native vulkan
    • move the extensions on to their own traits as well
  • (+) bonus of tighter cooperation with our community (including the move of Ash under gfx-rs organization)
  • (+) bonus for existing Ash users, who would get the thinnest portability path (cc @Ralith)
  • the unsafety of accessing the objects via u64 pointer/id can hopefully be enclosed in a Handle-like structure (currently used in Portability)

Portability:

  • (+) no need to maintain a separate portability repository/project
  • Ash could still do the steps above if they want to call the portability methods directly (static dispatch instead of dynamic function pointers)
@kvark

This comment has been minimized.

Copy link
Member

kvark commented Jul 8, 2018

Here is a more detailed transition plan for Ash that I think is reasonable:

  1. Move the traits out into a separate crate. Would be convenient to have it inside gfx-rs repository (e.g. just replace HAL), so that we can update it together with the backends (and possibly, with portability).
  2. Strip traits of anything related to the function pointers (including the default implementations, which I wonder if it makes sense to just do unimplemented!() - to be discussed).
  3. Move ash crate into gfx-rs organization, have it defining those function pointers and implementing the traits through them. Perhaps, it should be a part of the same repository?..
  4. Eventually update all the backends to use the new API. This may take some time, but ultimately should be mechanical work, nothing new to invent here.
  5. Some parts of the portability (e.g. Handle) need to be moved over (into some common/auxil module). I'm also wondering... given what's going to be left of Ash and Portability (if following this plan) is mostly dealing with function pointers of Vulkan, maybe those two at least need to be together in a repo?

Ideas, feedback, and alternatives plans of actions are welcome!

@msiglreith

This comment has been minimized.

Copy link
Member

msiglreith commented Jul 8, 2018

It seems like the current discussion is about ash vs raw-c for the implementation of the backends.
Independent of the chosen way we will provide an Ash and a portability interface to the users. So for me the main points are the ease of implementation for backends + layers and maintenance in form of Vulkan 1.1+ support.

Therefore the 2 paths would be backend -> Ash -> RawC and backend -> RawC -> Ash:

backend->Ash vs backend->RawC

  • Slice in Ash avoid ptr->slice conversions in all backend implemtation
  • RawC is a fixed API, how for is Ash away from stabilizing (error handling, extensions, completeness)?
  • Ash device trait requires us to have one large implementation file?

Ash->RawC vs RawC->Ash

Mostly a matter of how good we can autogenerate these layers. Not aware of the Ash internals here. cc @MaikKlein

I'm not really favoring on approach over the other atm as it's hard for me to estimate the work required at the ash side and autogeneration. Both interface implementations (Ash and RawC) should at the live in the same repository at the end.

@MaikKlein

This comment has been minimized.

Copy link
Contributor

MaikKlein commented Jul 9, 2018

The traits in Ash all have a default implementation based on the function pointers. If you go the RawC way I think I could abstract over the "function pointers".

trait InstanceRawC {..}
trait DeviceRawC {..}
//...
impl InstanceRawC for FunctionPointers  {..} // This are the function pointer in ash
impl DeviceRawC for FunctionPointers  {..}

pub struct Gfx { .. }
impl InstanceRawC for Gfx  {..} 
impl DeviceRawC for Gfx  {..} 

Where it sort of could look like this

// This is the function pointer struct in ash
pub struct InstanceFnV1_0 {
    destroy_instance:
        extern "system" fn(instance: Instance, p_allocator: *const AllocationCallbacks) -> c_void,
     //...
}
impl InstanceRawC for InstanceFnV1_0 {
    unsafe fn destroy_instance(
        &self,
        instance: Instance,
        p_allocator: *const AllocationCallbacks,
    ) -> c_void {
        (self.destroy_instance)(instance, p_allocator)
    }
    //...
}
pub struct Gfx { 
    // Potentially empty
}
impl InstanceRawC for Gfx {
    unsafe fn destroy_instance(
        &self,
        instance: Instance,
        p_allocator: *const AllocationCallbacks,
    ) -> c_void {
        gfx::destroy_instance(instance, p_allocator)
    }
    //...
}

I possibly have to create multple traits for each Vulkan version but that shouldn't be a problem. I think generating something like this should be trivial.

The arguments possibly have to be transmuted because the portability lib currently defines their own types.

I haven't put too much thought into this, but I think something like this should work. I would need to see how you would implement the other way, maybe with some pseudo code, then I can judge how easy/hard it is to generate the implementation.

I am also not too familiar with gfx to understand what exactly you want to do, if I got the RawC -> Ash implementation wrong, just tell me.

RawC is a fixed API, how for is Ash away from stabilizing (error handling, extensions, completeness)?

Now that I have the generator branch almost done, I also wanted to write a small script that checks which parts of vk hasn't been exposed in the higher level layer. I am also thinking of generating the high level layer, instead of manually wrapping all the functions. I am just not sure yet how much work that would be, especially with all the corner cases.

I've just redone bitflags and enums, you can also follow my process of transitioning to 1.1 here MaikKlein/ash#52 (comment). I am fairly open about all the changes.

As for extensions, the generator branch exposes everything, but many higher level extensions haven't been written yet.

As for the error handling rework, I assume you are referring to MaikKlein/ash#41. I haven't had the time to work on it, but any help is appreciated. I am also toying with the idea of custom errors for every function.

Also the dynamic library loading needs to be reworked. Ash currently uses a yanked libloading library and this is one of the problems why the Vulkan backend of the portability lib stack overflows.

Btw if you would like to see any bigger changes to Ash, now is the time to voice them.

@msiglreith

This comment has been minimized.

Copy link
Member

msiglreith commented Jul 9, 2018

Thanks for the detailed response! Looks promising and nice to see that you are working on generating things from xml already.

It seems for the RawC path the best way would be something like:

// dx12/lib.rs
pub fn create_instance(..) { .. }

// ash_wrapper.rs - autogenerated
impl InstanceRawC for Gfx<BackendDx12>  {
     unsafe fn create_instance(&self, ...) { dx12::create_instance(..) }
} 

// portability.rs
// export the required 3(?) functions

// portability_static.rs - autogenerated, optional static library for best performance
#[no_mangle]
pub extern "C" fn vkCreateInstance(..) {
    back::create_instance(..)
}

If I understand it correctly the backend->Ash route would instead target the current Instance traits on the higher level in contrast the the backend->RawC route implementing InstanceRawC?

@kvark

This comment has been minimized.

Copy link
Member

kvark commented Jul 9, 2018

@MaikKlein is InstanceRawC only needed because you can't auto-generate the implementations of the actual InstanceV1_0 and such?

I feel like the RawC (what I called Portability) and Ash paths aren't really that different. If we consider the total work that needs to be done by {gfx-rs, gfx-portability, ash} projects, than the chosen path only re-shuffles the work by doesn't add or subtract any. And we'd like to piggy back on the same generator infrastructure that @MaikKlein is rolling out, so taking Ash work into consideration makes sense to me (regardless of the chosen path).

If we go RawC, then:

  • portability -> gfx paths (both static and dynamic) are auto-generated
  • all handle resolves happen in the backend code
  • Ash's XxxRawC -> gfx path is auto-generated
  • Ash's Xxx -> XxxRawC still needs to be written by hand

If we go Ash, then:

  • portability -> gfx needs to be written by hand. Presumably, via gfxSomeFunction things (like we have now), and then static/dynamic VK bindings auto-generated (like in RawC path).
  • device and instance handles are resolved by the portability, others - by the backends
  • Ash's XxxRawC are not needed

So the amount of work seems to be roughly the same. The Ash path doesn't need XxxRawC traits, which is a win, not to mention nicer backend implementations for the slices in the API.

@kvark

This comment has been minimized.

Copy link
Member

kvark commented Jul 11, 2018

Axis of rustiness

After wondering about why Ash specifically only has traits for the instance and device, I figured we need to place our choices on a linear scale according to the amount of type sugar they add over raw Vulkan:

  1. standalone functions with raw pointers, e.g. pub extern "C" fn gfxCreateRenderPass(VkDevice, *mut VkAttachment, u32). We already have those in portability, and this is precisely what "RawC" path (discussed above) is about.
  2. standalone functions with idiomatic names and use of slices, options and references (for structures), e.g. pub fn create_render_pass(VkDevice, &[VkAttachment]). We are unlikely going to be able to generate RawC -> this code, so from here and below we'd need some hand-written wrapping to/from RawC.
  3. traits for dispatch-able objects (defined by Vulkan spec as unique opaque pointers), namely Instance, PhysicalDevice, Device, Queue, and CommandBuffer, e.g. fn create_render_pass(&self, &[VkAttachment]). All objects (including the dispatch-able ones) are still passed around by their (copyable) handles. Dispatch-able objects can enforce the Send/Sync bounds.
  4. all objects are abstracted away as associated types of a Backend trait, passed by reference (or Borrow) everywhere. All objects enforce Send/Sync.

Classification

The current HAL (or rather, it's raw layer that backends care about) is something akin 4.5: it goes an extra mile by introducing more traits (for things like a swapchain or descriptor pool) and accepting generic iterators instead of slices. Naturally, we could consider going strictly to 4 as one of the ways to proceed. It's probably the least disrupting one, which is a small plus. The problem with it is some overhead due to the handle wrapping happening on the portability side: it needs to collect the reference lists somewhere before passing to HAL.

Interestingly, Ash is around 2.5: it has only two (out of 5 specified) dispatch-able objects. I find this a bit unsound (for our needs), and the reason for this design choice seems to come from the fact Vulkan function pointers are obtained for/by the device and instance. This is a fine argument for Ash, but it's not a convincing argument for us in terms of Ash traits since we specifically want to disconnect the traits from function pointer logic.

With this in mind, the only really sound approach, involving no overhead on Vulkan or portability by itself, is 1 (RawC). I am wondering though, could it be possible to auto-generate the traits for 3? In this case, we'd get the benefits of direct mapping as well as nicer types.

@ZeGentzy

This comment has been minimized.

Copy link
Contributor

ZeGentzy commented Jul 11, 2018

How will the proposed changes affect the future of the opengl backend? Will the proposed changes require opengl backend to map better to ash stuff? Will window/surface/instance creation HAVE to be merged between the opengl backend and non-opengl, or will it stay as a nice to have (and a requirement for using it in portability, of course)?

@omni-viral

This comment has been minimized.

Copy link
Contributor

omni-viral commented Jul 11, 2018

@kvark You can also use Ash as 1. As raw C functions are exposed.

@kvark

This comment has been minimized.

Copy link
Member

kvark commented Jul 11, 2018

@ZeGentzy I think it will still be fine for the GL backend to provide specific initialization routines, if we don't find a way to fit it into Vulkan API completely.

@omni-viral
In this case, Ash provides nothing to us - we already have those functions in portability.

@grovesNL

This comment has been minimized.

Copy link
Member

grovesNL commented Jul 11, 2018

With this in mind, the only really sound approach, involving no overhead on Vulkan or portability by itself, is 1 (RawC). I am wondering though, could it be possible to auto-generate the traits for 3? In this case, we'd get the benefits of direct mapping as well as nicer types.

Is it possible to code-gen both the traits and C externs (and implementations of these that call our traits)? Then we just have to be concerned with aligning HAL and Vulkan, and most of the existing portability is simply code-gen'd.

@MaikKlein

This comment has been minimized.

Copy link
Contributor

MaikKlein commented Jul 12, 2018

This is a fine argument for Ash, but it's not a convincing argument for us in terms of Ash traits since we specifically want to disconnect the traits from function pointer logic.

I am fine with disconnecting it. I just implemented it that way because it was easy. I would just do a generic impl for Ashs function pointers.

Interestingly, Ash is around 2.5: it has only two (out of 5 specified) dispatch-able objects. I find this a bit unsound (for our needs), and the reason for this design choice seems to come from the fact Vulkan function pointers are obtained for/by the device and instance.

I am open for ideas but I am not sure how nicely this would map to Ash. I'll have a look at the HAL traits to get an idea how you are using these dispatchable objects. I assume this is not a problem for the other backends because all the function pointers are loaded globally? We could do something similar in Ash, if we get function pointers with "runtime dispatch". (Retrieve device level fp from the instance). Otherwise we would need something like an Arc in those objects. Currently instance and device own their function pointers.

I find this a bit unsound (for our needs),

Could you explain this statement? What exactly would make it unsound? I assume you might need to abstract over those "dispatchable" objects? Like impl CommandBuffer for MetalCommandBuffer?

With this in mind, the only really sound approach, involving no overhead on Vulkan or portability by itself, is 1 (RawC). I am wondering though, could it be possible to auto-generate the traits for 3? In this case, we'd get the benefits of direct mapping as well as nicer types.

As you mentioned 2/5 traits are already in Ash, but they are all written manually. I think it should be possible to generate them but it is a bit painful as you don't really have any type information. I am also not sure how many edge cases there would be.

@kvark

This comment has been minimized.

Copy link
Member

kvark commented Jul 12, 2018

We had a call today, discussing the positions on this issue. I was about to write up about our conclusions and close the issue, but then something unsettling got in - the feeling we haven't captured all the right arguments for the chosen paths yet.

The first misunderstanding that I got is about Ash rustiness: "slices are good". Ash only has slices at the top level, and anything deeper (e.g. structure fields) are still raw pointers. So Ash doesn't get us much further on the safety/convenience scale than RawC.

The next misunderstanding, and it may be a crucial one, is that making the backends operate on raw handles would be just shuffling the work around and not have any impact on the end users in terms of performance. This is true for the portability - either the handle dispatches happen before the backend boundary or after - doesn't matter. But this is certainly not true for any Rust user code. Ability to move the actual structs around and wrap them into bigger logical constructs (e.g. WebGPU implementation) can be saving quite a bit of indirection. I don't want to penalize Rust applications running on Metal/D3D12, and I think the existing move semantics in HAL is both convenient and useful. Perhaps, the unsoundness of some of the pieces could be patched instead of throwing out the whole thing?

So here is an updated proposal for the (current) HAL way:

  • keep the move semantics and associated Backend types
  • keep the iterators. Make Vulkan backend to mitigate those by re-using the local storage vectors to collect the iterators into.
  • keep the portability taking care of the Handle wrapping. It may do so more efficiently, and it make take full responsibility to use the allocator callbacks to place those concrete types in user-defined memory.
  • revise the unsound parts of HAL and try to fix them
  • mark a large part of HAL API as unsafe

I think the decision largely depends on one metric - the ratio of users using our Vulkan backend, and their performance expectations. Note that this only applies to Rust users - portability layer doesn't care about the Vulkan backend as much. If our users want to target native Vulkan and consider everything else a fallback on those walled garden ecosystems - then going with RawC makes sense. If the users want to choose the popular platforms and run their preferred APIs, then the current HAL makes more sense.

@anderejd

This comment has been minimized.

Copy link
Contributor

anderejd commented Jul 12, 2018

If our users want to target native Vulkan and consider everything else a fallback on those walled garden ecosystems

This match my perspective pretty well, but I'm just a single user and have yet to start using gfx-rs in production.

@kvark

This comment has been minimized.

Copy link
Member

kvark commented Jul 12, 2018

@anderejd my main case of consideration is the browser tech.

WebRender is written in Rust and is much higher level than gfx-hal. It will enjoy the fewer indirections when accessing any resources, and the performance difference here could be very important. The main target platform of WR today is Windows, and I expect our D3D12 and D3D11 backends to be preferred there. Vulkan doesn't have as wide of a support, and it's worse in terms of interoperation with other Windows-specific systems, e.g. hard-ware media decoding (video playback). So for WR the Vulkan backend would be a solution on Linux and an alternative on some Windows systems.

WebGPU is a higher level API that is tempting to implement and validate in Rust. It would have bigger/fatter structures, so it would benefit from fewer indirections. Ultimately, it would want to use the same native API backend as the browser, which means D3D again on Windows.

@anderejd

This comment has been minimized.

Copy link
Contributor

anderejd commented Jul 12, 2018

Thanks for the detailed reply, I don't have enough experience in this domain to have an opinion on the way forward, just adding my perspective.

Regarding WebRender though, what about Android and iOS? Optimizing for D3D seems illogical to me, in that context.

@termhn

This comment has been minimized.

Copy link
Contributor

termhn commented Jul 12, 2018

@kvark

This comment has been minimized.

Copy link
Member

kvark commented Jul 18, 2018

Here is some news: HAL is here to stay. The justification has been mostly provided above ^, plus new ideas by @msiglreith to be expressed and discussed in a separate RFC (codename "BAL". Edit: #2249). Whether or not HAL can some day become stable is also still on the table, but largely outside of the scope of this issue.
As for my current complaints about unsound API of HAL, I filed issues #2247 + #2248 and I hope we can fix them.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment