[question] [Fermi] Is there a way to accumulate buffer offset after transform feedback (aka streamout) #15

Closed
imirkin opened this issue Apr 2, 2016 · 3 comments

Comments

@imirkin
Contributor

imirkin commented Apr 2, 2016

GL supports the following sequence:

enable TF
draw 1
pause TF
draw 2
resume TF
draw 3

The idea is that output from draw 1 and draw 3 gets accumulated into the TF buffer, but output from draw 2 does not. In hardware, this is implemented with a query: after draw 1 happens, you write the following value to 3d method 0x1b0c:

0x0d005002 | (tfb buffer index << 5)
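As a small sketch of how that query word is put together, the helper below simply applies the expression above. The function and constant names are made up for illustration; only the values 0x1b0c and 0x0d005002 and the shift by 5 come from the description, and the loop over four buffer indices is an arbitrary choice for the demo.

```python
TFB_QUERY_METHOD = 0x1B0C           # 3d method named in the text above
TFB_OFFSET_QUERY_BASE = 0x0D005002  # base query word from the text above


def tfb_offset_query_word(buffer_index: int) -> int:
    """Hypothetical helper: build the query word for one TF buffer index."""
    if buffer_index < 0:
        raise ValueError("buffer index must be non-negative")
    # The buffer index is placed starting at bit 5 of the query word.
    return TFB_OFFSET_QUERY_BASE | (buffer_index << 5)


if __name__ == "__main__":
    # Print the word that would be written to the method for a few indices.
    for index in range(4):
        word = tfb_offset_query_word(index)
        print(f"method {TFB_QUERY_METHOD:#06x} <- {word:#010x} (tfb buffer {index})")
```

For buffer index 1 this gives 0x0d005022, since the index only touches bits 5 and up.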

On Kepler, this does what one might hope: it returns the full offset, i.e. the amount of buffer written plus the tfb buffer offset (written to 3d method 0x290). On Fermi, however, it just overwrites that value, which means that the offset retrieved from the query after draw 3 completes counts only the bytes written by that draw alone, not including the offset.
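To make the difference concrete, here is a toy model of the two behaviors as described above. It is not a hardware simulation: the function names, the `accumulates` flag, and the byte counts are all invented for illustration, and the model assumes each captured draw starts at whatever offset the previous query returned.

```python
def query_after_draw(base_offset: int, bytes_written: int, accumulates: bool) -> int:
    """Model the offset query result (illustrative only, not real hardware).

    accumulates=True  -> result includes the buffer offset (Kepler-like).
    accumulates=False -> result is only what this draw wrote (Fermi-like).
    """
    return (base_offset if accumulates else 0) + bytes_written


def run_sequence(draw_sizes, paused, accumulates, start_offset=0):
    """Replay a list of draws and return the final queried offset."""
    offset = start_offset
    for size, is_paused in zip(draw_sizes, paused):
        if is_paused:
            continue  # TF is paused: nothing is captured, offset is unchanged
        offset = query_after_draw(offset, size, accumulates)
    return offset


if __name__ == "__main__":
    sizes = [64, 32, 48]            # made-up byte counts for draws 1, 2, 3
    paused = [False, True, False]   # draw 2 happens while TF is paused
    print("accumulating query:", run_sequence(sizes, paused, accumulates=True))
    print("overwriting query: ", run_sequence(sizes, paused, accumulates=False))
```

With these made-up sizes, the accumulating variant ends at 112 (64 + 48), while the overwriting variant ends at 48, losing the 64 bytes from draw 1.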

Is there some bit of cleverness I'm missing to make this work on Fermi in a way that doesn't involve me waiting for draw 1 to complete before I configure the parameters for draw 3?

By the way, the problematic situation is triggered by the later cases of this dEQP test:

dEQP-GLES3.functional.transform_feedback.basic_types.interleaved.points.lowp_float

@imirkin
Contributor Author

imirkin commented Apr 17, 2016

After staring at the blob driver's traces, I think it is doing exactly the same things as nouveau, yet it gets the correct results (or at least the test passes). That leaves two possibilities: either it does something slightly differently that makes the hardware behave properly, or some grctx setting controls this. On the first point, I notice that it turns TF on and off left and right, uses a short query rather than a long one, and has slightly different synchronization behavior, but attempting to do the same in nouveau did not improve things.

So... is there some GR bit which makes that query accumulate on top of the existing buffer offset? Or any other advice on making this work properly?

@imirkin
Contributor Author

imirkin commented Apr 21, 2016

Based on Ben's suggestion, I set bit 0 of 0x50405c and it all magically started working. Looks like in rnndb, this was previously documented as

$ lookup -a gf100 0x50405c
PGRAPH.GPC[0].TPC[0].POLY.TFB_UNFUCKUP_OFFSET_QUERIES => 0

So I guess someone knew at some point :) But then it was forgotten.
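For illustration, setting that bit amounts to a read-modify-write on the register. The sketch below uses a fake in-memory register file; `FakeMmio`, its `rd32`/`wr32`/`mask` methods, and `enable_tfb_offset_accumulation` are hypothetical names, not nouveau's actual accessors. Only the address 0x50405c and bit 0 come from the comment above, and where a real driver would apply this (for example in its grctx setup) is not covered here.

```python
TFB_OFFSET_QUERY_FIX_REG = 0x50405C  # register address from the comment above
TFB_OFFSET_QUERY_FIX_BIT = 1 << 0    # bit 0, per the comment above


class FakeMmio:
    """Stand-in register file; a real driver would use its own MMIO accessors."""

    def __init__(self):
        self.regs = {}

    def rd32(self, addr: int) -> int:
        return self.regs.get(addr, 0)

    def wr32(self, addr: int, value: int) -> None:
        self.regs[addr] = value & 0xFFFFFFFF

    def mask(self, addr: int, mask: int, value: int) -> int:
        """Read-modify-write: replace only the bits selected by mask."""
        old = self.rd32(addr)
        self.wr32(addr, (old & ~mask) | (value & mask))
        return old


def enable_tfb_offset_accumulation(mmio: FakeMmio) -> None:
    """Set bit 0 of 0x50405c while leaving the other bits untouched."""
    mmio.mask(TFB_OFFSET_QUERY_FIX_REG,
              TFB_OFFSET_QUERY_FIX_BIT,
              TFB_OFFSET_QUERY_FIX_BIT)


if __name__ == "__main__":
    mmio = FakeMmio()
    enable_tfb_offset_accumulation(mmio)
    print(f"{TFB_OFFSET_QUERY_FIX_REG:#x} = {mmio.rd32(TFB_OFFSET_QUERY_FIX_REG):#010x}")
```

Using a masked write rather than a plain store keeps any other bits in the register as they were, which seems the safer default when the rest of the register is undocumented.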

@Gnurou
Owner

Gnurou commented Apr 22, 2016

Sorry for not having come up with this answer before you found it yourself. It wasn't trivial, though, so I'm not sure we would have thought of it. Closing this issue.

@Gnurou Gnurou closed this as completed Apr 22, 2016