Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

GPUQueue.writeBuffer/writeTexture #509

Closed
wants to merge 7 commits into from
Closed

Conversation

kvark
Copy link
Contributor

@kvark kvark commented Dec 3, 2019

This PR brings setSubData calls on the queue for buffers and textures.

This PR can be seen as either:


Preview | Diff

Copy link
Contributor

@Kangz Kangz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM!

spec/index.bs Outdated Show resolved Hide resolved
spec/index.bs Outdated Show resolved Hide resolved
spec/index.bs Outdated Show resolved Hide resolved
spec/index.bs Outdated Show resolved Hide resolved
@kvark
Copy link
Contributor Author

kvark commented Dec 3, 2019

Thanks for the quick review!
I'm open to bikeshed the naming. WriteXxx doesn't sound bad to me, applied now.

spec/index.bs Outdated Show resolved Hide resolved
spec/index.bs Outdated Show resolved Hide resolved
Copy link
Contributor

@kainino0x kainino0x left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

note: let's make sure to merge this correctly with #498.

re: naming: These functionally are related to copyBufferToX. How about copyTo{Buffer,Texture} or copyDataTo{Buffer,Texture}?

spec/index.bs Outdated Show resolved Hide resolved
spec/index.bs Outdated Show resolved Hide resolved
spec/index.bs Outdated Show resolved Hide resolved
spec/index.bs Outdated Show resolved Hide resolved
spec/index.bs Outdated Show resolved Hide resolved
spec/index.bs Outdated Show resolved Hide resolved
spec/index.bs Outdated Show resolved Hide resolved
spec/index.bs Outdated Show resolved Hide resolved
@kvark
Copy link
Contributor Author

kvark commented Dec 4, 2019

@kainino0x thanks for the detailed review! An update is to follow.

note: let's make sure to merge this correctly with #498.

It looks like #498 could easily land first, and I'll rebase

re: naming: These functionally are related to copyBufferToX. How about copyTo{Buffer,Texture} or copyDataTo{Buffer,Texture}?

Given that we already have copy* commands on the the command encoder, I think we should avoid confusion by using a different word here (i.e. "set" or "write").

Copy link
Contributor

@kainino0x kainino0x left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm! Would like to hear opinions from others on this though.

@kainino0x
Copy link
Contributor

Given that we already have copy* commands on the the command encoder, I think we should avoid confusion by using a different word here (i.e. "set" or "write").

OK, my preference is not that strong.

Copy link
Contributor

@kdashg kdashg left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would like to see details about when and why this might be useful as opposed to mapping.

@Kangz
Copy link
Contributor

Kangz commented Dec 5, 2019

I would like to see details about when and why this might be useful as opposed to mapping.

The main arguments mentioned previously is that this more ergonomic than mapping and with decent performance.

spec/index.bs Outdated
GPUTextureCopyView destination,
GPUExtent3D copySize);

void copyTextureToBuffer(
GPUTextureCopyView source,
GPUBufferCopyView destination,
GPUBuffer destination,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: I kinda like the symmetry between GPUTextureCopyView and GPUBufferCopyView, maybe GPUBufferCopyView could extend GPUTextureDataLayout?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Interesting idea I haven't considered, let me try it


void writeTexture(
ArrayBuffer sourceData,
GPUTextureDataLayout sourceLayout,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does this have the same 256 limitation as for copies between buffers and textures?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good question!

I don't think it should have the limitation, since the data is coming from the CPU side, via a shortcut method, so we have both a room and a permit to manage the alignment automatically.

This could also be one more thing towards the list of why an API like this is useful. cc @jdashg

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I thought about this too, I think it should not have the limitation.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If it doesn't have the limitation, it must copy bloat when this happens, which chips away at the "good enough performance" claims.

Copy link
Contributor

@litherum litherum Feb 24, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I assume when @Kangz says "256 limitation" he's referring to buffer copies having to have strides that are a multiple of 256? (I ⌘f-ed for "256" in the spec and didn't find anything...)

If the source data just happens to have stride that is a multiple of 256, just by coincidence, then no copy bloat, right? Just like how, if writeBuffer() is called when the buffer just happens to be ready, just by coincidence, then no copy bloat either, right? It doesn't make a lot of sense to mandate 256 here when we aren't mandating an internal copy.

Perhaps this can be alleviated with additional spec text describing a set of steps which, if applications take, they are guaranteed to be on the fast path. These steps would say "use fences to know when your buffer is ready" and "make the stride a multiple of 256" (possibly among other things).

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My comment wasn't super clear: the 256 limitation is around the rowPitchInBytes in GPUTextureDataLayout. copyBufferToTexture and copyTextureToBuffer require it to be a multiple of 256 and I was wondering if the same applied for writeTexture.

Just like how, if writeBuffer() is called when the buffer just happens to be ready, just by coincidence, then no copy bloat either, right?

That's an implementation detail, and at least in Chromium/Dawn we'd like to avoid doing it if we can (it requires knowing for each resource when's the last time they were used by the GPU, which we'd like to avoid tracking).

@kvark
Copy link
Contributor Author

kvark commented Dec 5, 2019

@Kangz deriving seems to be working well, thanks for the idea!

@jdashg see Myles concerns in #418 plus #509 (comment) . An alternative is:

  • requiring the users to use a large chunk of boilerplate code that isn't trivial to write and debug (In general, I'm all for low level APIs that depend on abstractions like this, but the ship for WebGPU to be one has sailed a while ago. Now it's an API that needs to be usable directly).
  • requiring the users to always adhere to the 256 alignment requirement
  • wasting space in allocated buffers because CPU can't use the unused chunk space while the GPU copies from the rest of it
  • unable to write into the resource directly

@kdashg
Copy link
Contributor

kdashg commented Dec 7, 2019

  • It's not a large chunk
  • Shadow copies on misalignment is a disadvantage
  • Use different buffers
  • Other designs can write into resources directly

I'm not compelled by this reasoning.

@kdashg
Copy link
Contributor

kdashg commented Dec 7, 2019

We are currently sailing this ship, and I haven't given up on trying to keep in pointed in the direction I see as ideal. I'm not sure why you see these things as foregone conclusions, but the path of least resistance is not what gets us to the optimal spec.

@kvark
Copy link
Contributor Author

kvark commented Dec 9, 2019

@jdashg

It's not a large chunk

Could you (re-)point me to the full code please? It doesn't appear to be linked from your hack.md, and stuff I can find by a cursory look isn't the code we are talking about (e.g. #418 (comment)).

Shadow copies on misalignment is a disadvantage

There isn't an extra copy. The texture data provided by the user in this PR has to go to a driver-managed staging buffer first, anyway, which is the CPU copy, and we can just resolve the alignment there.

Use different buffers

This means doing something custom that the polyfill you are describing doesn't do. So the polyfill (alone) can't replace setSubData in general. That is to say, having a more powerful (and harder to use, as a consequence) way to upload data, in addition to setSubData, is a good feat to have. Users would opt into complexity, just like they do in Metal today, when, say, they start using MTLHeap objects.

Other designs can write into resources directly

Do you mean the variants where the user provides the range, and we can detect if the range is not used by the GPU? This behavior is yet to even be attempted to be specified, since the current usage rules can allow parts of the buffer to be used differently.

We are currently sailing this ship

We've been sailing this ship for 3 years by now, and we are on the verge of settling on the MVP. We can, indeed, still change the course, but it's getting more and more costy to do so.


- `destination.texture` texture wasn't created with {{GPUTextureUsage/COPY_DST}} flag in {{GPUTextureDescriptor/usage}}.
- `destination.texture` must not be destroyed
- `destination.origin + size` exceeds the size of texture dimensions of the mipmap `destination.mipLevel`.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I presume that size is specified to be in bytes and origin in pixels? If so, then the operation destination.origin + size is using mismatched units and will not work, especially if there are padding pixels at the end of each scanline.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Size here is a GPUExtent3D in pixels. Unfortunately the types are not locally visible (you have to check the IDL)...

spec/index.bs Outdated
- `destination.origin + size` exceeds the size of texture dimensions of the mipmap `destination.mipLevel`.
- any of the `size` extents are zero
- `sourceOffset + requiredSize` exceeds `sourceData.byteLength`, where `requiredSize` can be computed as
`(sourceLayout.imageHeight * sourceLayout.rowPitch * (size.depth - 1) + sourceLayout.rowPitch * (size.height - 1) + size.width) * bpp`.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I presume that rowPitch is specified as being bytes per scanline, regardless of the pixel format. If so, we shouldn't also be multiplying everything by bpp

spec/index.bs Outdated
Comment on lines 1836 to 2486
dictionary GPUTextureDataLayout {
GPUBufferSize offset = 0;
required unsigned long rowPitch;
required unsigned long imageHeight;
};
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: Please spec the units of these fields. I presume that offset, rowPitch and imageHeight are in bytes?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

offset and rowPitch are in bytes but imageHeight is in texels. We discussed this a while ago in . #148

spec/index.bs Outdated
Comment on lines 1874 to 2524
Takes the `sourceData` contents starting from the offset `sourceOffset` and
schedules a write operation of these contents to the `destination` buffer in the queue.
The operation does nothing and produces an error if any of the following is true:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: Would be good to spec that the contents of sourceData are read during the execution of the writeXXX function call. Any subsequent modifications to sourceData do not affect what is written at the time that the scheduled operation runs.

spec/index.bs Outdated
The operation does nothing and produces an error if any of the following is true:

- `destination` buffer wasn't created with {{GPUBufferUsage/COPY_DST}} flag in {{GPUBufferDescriptor/usage}}.
- `destination` buffer must not be destroyed
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
- `destination` buffer must not be destroyed
- `destination` buffer is destroyed

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually, this is implied by ["destination" buffer isn't in the "unmapped" state.], so you could also remove this.

spec/index.bs Outdated
The operation does nothing and produces an error if any of the following is true:

- `destination.texture` texture wasn't created with {{GPUTextureUsage/COPY_DST}} flag in {{GPUTextureDescriptor/usage}}.
- `destination.texture` must not be destroyed
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
- `destination.texture` must not be destroyed
- `destination.texture` is destroyed

@kvark
Copy link
Contributor Author

kvark commented Feb 24, 2020

Thanks for the reviews! Addressed the notes and rebased now.

@kvark
Copy link
Contributor Author

kvark commented Mar 30, 2020

On client-side tracking and direct writes to the buffer if it's not in use: I think this ability should be left for a follow-up extension, since in a general case the original buffer is not even mappable, so it's impossible to write any data in directly without a copy. So for now we can say this API is only meant to stage and queue uploads.

@kainino0x kainino0x changed the title GPUQueue.setSubData methods GPUQueue.writeBuffer/writeTexture Apr 13, 2020
@kvark
Copy link
Contributor Author

kvark commented Apr 22, 2020

Note that at least until we are able to map memory in WASM space, ArrayBuffer is how WASM application are going to communicate with the API. So having writeBuffer is minimal copies (and optimal) for the near future, unlike any of the mapping proposals.

@kvark
Copy link
Contributor Author

kvark commented May 7, 2020

Closing in favour of #749 with #761

@kvark kvark closed this May 7, 2020
ben-clayton pushed a commit to ben-clayton/gpuweb that referenced this pull request Sep 6, 2022
…gpuweb#509)

* Support passing skip/fail test expectations in wpt or by command line

On the command line, test expectations are accepted by a command line
flag --expectations path/to/expectations.ts, where expectations.ts is
a module that exports RawTestQueryStringWithExpectation[] as
"expectations". In WPT, the page should define a global variable
|loadWebGPUExpectations| which is a
Promise<RawTestQueryStringWithExpectation[]>.

The test expectations are filtered down to only what is relevant for
the running test query, and applied to the testcases before they run.
A 'skip' expectation will entirely skip a testcase, and a 'fail'
expectation will surface failures as passes. If a 'fail' expectation
is found, but a test actually passes, the harness will log an warning
for the unexpected pass.

* Update tools/deno

Co-authored-by: Luca Casonato <lucacasonato@yahoo.com>

* Address comments from code review

* testcaseQuery -> filterQuery

Co-authored-by: Luca Casonato <lucacasonato@yahoo.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

8 participants