Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ImGui: Avoid frame count display race condition for input recording and display correct value #10793

Open
wants to merge 1 commit into
base: master
Choose a base branch
from

Conversation

TheTechnician27
Copy link
Contributor

@TheTechnician27 TheTechnician27 commented Feb 8, 2024

Description of Changes

Remove the "+ 1" for the frame count to be displayed during input recording/playback by ImGui.

Rationale behind Changes

Previously, we displayed the incorrect value for the current frame while creating an input recording. Namely, if the internal counter was n, we displayed n+1. This meant that, for example, the first frame of a fresh recording reads "1/0 (0)", where '1' is the current frame, and '0' is the total frames for the recording. Essentially, we were always displaying the wrong value for what the current frame was. Fixes part of (but not all) #8128.

Suggested Testing Steps

This does not change the actual internal logic of how input recording works, just what's displayed on the user's screen. Basically, create new recordings, play back current ones, etc., and see if the current frame counter (the one on the left of the slash) is ever incorrect.

Screenshots

Incorrect (current)

Frames

Fixed:

Frames_fixed

For both of these, please look at the frame counter in the top right.

@stenzek
Copy link
Member

stenzek commented Feb 9, 2024

This is not the correct fix. The input recording state is owned by the EE thread, the overlay is drawn on the GS thread. It is a race.

The EE thread should push status to the GS thread periodically instead. You can use MTGS::RunOnGSThread() for that.

@TheTechnician27
Copy link
Contributor Author

This is not the correct fix. The input recording state is owned by the EE thread, the overlay is drawn on the GS thread. It is a race.

The EE thread should push status to the GS thread periodically instead. You can use MTGS::RunOnGSThread() for that.

I'll have to look into that when I get the chance. Essentially then we would be calling RunOnGSThread on the EE thread in order to pass a message to the GS one? I will say that for now, in defense of the current PR, this would fix the issue where the displayed framecount is unequivocally wrong 100% of the time. It seems like the '+ 1' might've been put there as a bodge, but now it means that every rendered frame has n+1 as the current where n is the total frames.

@stenzek
Copy link
Member

stenzek commented Feb 9, 2024

It seems like the '+ 1' might've been put there as a bodge, but now it means that every rendered frame has n+1 as the current where n is the total frames.

Could've been. Whole thing's a race regardless; I probably either missed it during review, or didn't care because few users use input recording :P

I've briefly discussed it with Kam, the most straightforward way would be to have a struct, or string with the relevant information you want displayed, keep a copy of it as a static/local in the GS thread, and then have a function that's invoked via RunOnGSThread() from the EE thread when it needs to be updated.

@TheTechnician27
Copy link
Contributor Author

TheTechnician27 commented Feb 11, 2024

I've briefly discussed it with Kam, the most straightforward way would be to have a struct, or string with the relevant information you want displayed, keep a copy of it as a static/local in the GS thread, and then have a function that's invoked via RunOnGSThread() from the EE thread when it needs to be updated.

So if I'm understanding the race condition correctly: in ImGui (GS), we're polling five pieces of information from InputRecordingFile (EE). Because of that, we would like a struct which contains all this information so that we can avoid the EE updating one of these pieces of information out of sync with the others (for example, we query information 1, then the EE updates information 2, and then we query information 2). The five* pieces of information ImgGuiOverlays polls from InputRecordingFile are: a std::string filename and u64s framecounter (which I'll call currentframe in the struct for clarity), totalframes, framecount, and undocount. *(Edit: I didn't realize that g_FrameCount was part of Counters.h, not part of the InputRecording, so it'll be four pieces of information from InputRecording).

I think I have this correct so far, but from here, please correct me on any points that seem iffy/are wrong.

In MTGS (.cpp or .h?), I will create a static struct called InputRecordingData which has all five pieces of Exodia information. My plan is to have a mutex (fair would probably be best?) shared between on the GS thread and the EE thread which is locked for reads and writes. If InputRecording is trying to update, it will try to acquire the mutex, and if ImGui is trying to poll the struct, it will try to acquire it.

For the write, I would have a RunOnGSThread() call at the end of the InputRecording.cpp method 'incFrameCounter()', and this call would tell MTGS to run its new method which I'll call UpdateInputRecordingData.

For the read, I would create a new method in ImGui called DrawInputRecordingData. This would also attempt to acquire the mutex at the start and release it at the end, and it would have effectively the same code as seen in my PR, except the function calls to g_inputrecording would be replaced by accesses to the struct.

As for the struct itself, I don't know where it should be instantiated. I see that MTGS has a ThreadEntryPoint, but I'm completely lost on what it's doing and would probably need to just be explicitly told "this is where this should go."

I'm very much not used to programming in a real-world, collaborative scenario, so any handholding you feel like affording is deeply appreciated.

EDIT: I just realized the struct should go in ImGui, I think, not MTGS. My mistake. I wasn't realizing correctly how RunOnGSThread() worked.

@TheTechnician27
Copy link
Contributor Author

TASScreenshot

I'm just about done; I just need to test to see a) if this works consistently, and b) if I can get that global frame counter in parentheses under control.

@TheTechnician27 TheTechnician27 changed the title ImGui: Display correct value for input recording framecounter ImGui: Avoid frame count display race condition for input recording and display correct value Feb 12, 2024
@TheTechnician27
Copy link
Contributor Author

TheTechnician27 commented Feb 12, 2024

So whereas is_recording, filename, current_frame, total_frames, and undo_count will absolutely not race here, the g_FrameCount is still a race because it's decoupled from g_InputRecording. So the value in parentheses will sometimes be off by a frame, but everything else is correct, and realistically, there may not even be a need to display the g_FrameCount.

Incidentally, because the global is one frame behind the current and total most of the time, I could just bodge it and add a + 1 when the RunOnGSThread lambda takes in g_FrameCount.

@TheTechnician27
Copy link
Contributor Author

TheTechnician27 commented Feb 12, 2024

I couldn't take it; I cracked; I fixed everything. I wanted to play in the sandbox and fell into a sinkhole. This fully resolves #8128.

  1. The current frame count is now correct (unlike when it was being + 1'd).
  2. The total frame count now properly decrements to where it should be if a save state is loaded during recording (however, it correctly stays at the cap should you play back a recording).
  3. The global count in parentheses syncs correctly with the current and total frame counters (and accurately reflects total frames elapsed with save and load states).
  4. There is no longer a data race here between the GS and EE threads.

@TheTechnician27
Copy link
Contributor Author

Actually, sorry, this PR gets just about everything right except that save state recordings don't load (power-on recordings load just fine). I don't have time to investigate tonight, so I'll look tomorrow.

@TheTechnician27
Copy link
Contributor Author

TheTechnician27 commented Feb 12, 2024

I was actually mistaken before: save state recordings load just fine, and it was just a bizarre fluke (I've loaded about 10 since then with no issue). On top of that, I've just fixed an issue present in the old build which must have been why we originally bumped the current counter + 1: we were actually stopping the replay a frame early, and so if we had had, say, 504 frames total, the replay would've stopped at 503/504. Finally, by sheer and complete accident, I have actually done what I believe is create a useful feature: loading a recording now starts it paused, meaning you now press space to start playing it back instead of it just going on its own. This may be considerably more useful, as it makes it easier to watch the recording in full without missing anything while you're shifting your focus from the menus back to the screen. Basically, it gives you more control over when you start the playback. 😄

@stenzek
Copy link
Member

stenzek commented Feb 14, 2024

A lock isn't needed; the data gets captured/copied via the lambda, and then executes in the context of the GS thread.

Personally, I wouldn't bother storing all this data. Just construct the string(s) that you want to display on the EE thread, then shove it across.

@TheTechnician27
Copy link
Contributor Author

A lock isn't needed; the data gets captured/copied via the lambda, and then executes in the context of the GS thread.

Personally, I wouldn't bother storing all this data. Just construct the string(s) that you want to display on the EE thread, then shove it across.

Oh, I didn't know based on the name "MTGS" if there would ever be more than one GS thread running concurrently. If there will just be one, then you're right that the lock is extraneous. In this case, the only information I store in the struct is whether we're recording (which is how we distinguish whether to print "Recording" or "Replaying") as well as the data which will be interpolated into the strings that DRAW_LINE prints. (I'm ignoring the mutex in the struct here since I'll be getting rid of it.)

Although I could hardcode it to send over the complete, interpolated string, if that's what you're referring to, I elected to send the raw data to avoid spaghetti where anyone wanting to change the output string in ImGuiOverlays has to go track it down in InputRecording.

@stenzek
Copy link
Member

stenzek commented Feb 15, 2024

There's only a single GS thread. It's just "multithreaded" in that the GS emulation doesn't happen on the EE thread.

You could construct the string in ImGuiOverlays, it's just a namespace (e.g. UpdateInputRecordingInfo()). Just put a note in the header/declaration for the function, that it should only be called on the EE thread.

(and then, pass it to MTGS::RunOnGSThread(), update the static/local var with the move'd string)

@TheTechnician27
Copy link
Contributor Author

You could construct the string in ImGuiOverlays, it's just a namespace (e.g. UpdateInputRecordingInfo()). Just put a note in the header/declaration for the function, that it should only be called on the EE thread.

(and then, pass it to MTGS::RunOnGSThread(), update the static/local var with the move'd string)

Gotcha. My first attempt tried to make use of that InputRecordingUI::UpdateInputRecordingData() method, but for some reason it wasn't working (that's why it's still there but currently unused). In all likelihood, it was some small issue I could've resolved had I kept at it.

I'm also not entirely sure how to make this idea of string passing instead of value passing work within the framework of translations either, because for each line, we do a TRANSLATE_FS call on the not-yet-composed string as seen here:

DRAW_LINE(fixed_font, TinyString::from_fmt(TRANSLATE_FS("ImGuiOverlays","Frame: {}/{} ({})"), g_InputRecordingData.current_frame, g_InputRecordingData.total_frames, g_InputRecordingData.frame_count).c_str(), IM_COL32(117, 255, 241, 255));

Consequently, it seems like I would have to not just store the precomposed string in the struct, but rather the entire result of TinyString::from_fmt(TRANSLATE_FS("ImGuiOverlays","Input Recording Active: {}"), g_InputRecording.getData().getFilename()).c_str(). That, or I would store only the translingual portion of the interpolated string (e.g. "59/61 (127)") in the struct and then interpolate that already interpolated data into the full string when this TRANSLATE_FS is run.

If you feel strongly that string passing should be done instead of value passing, I can implement it, but I'm not sure how I (me, specifically, due to lack of experience) could construct it in such a way that it doesn't end up being slower, less flexible, and harder to read than what I have currently.

@stenzek
Copy link
Member

stenzek commented Feb 21, 2024

You can construct the full translated string(s) on the EE thread.

Speed-wise, it's not really any different, since both threads are each doing it once per frame; it just avoids a second source of truth for the recording data.

@TheTechnician27
Copy link
Contributor Author

You can construct the full translated string(s) on the EE thread.

Speed-wise, it's not really any different, since both threads are each doing it once per frame; it just avoids a second source of truth for the recording data.

Sorry for the long delay, stenz. The PR now constructs the strings in the EE thread within InputRecording.cpp and then sends them off to the GS. I've retested it, and to my eye, it looks like it still works as intended.


namespace InputRecordingUI
{
struct InputRecordingData
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Simply having a function that passes strings across would make more sense than exposing private details in a structure.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Simply having a function that passes strings across would make more sense than exposing private details in a structure.

I'm really not sure how to go about this. Ideally what I would want do to make things perfectly atomic is to have the GS, when it needs to print those strings, stop and request them from the EE, waiting until it can provide them in full. I believe right now my code technically has the issue that the GS thread can be trying to read from the struct while the EE thread is trying to write to it. We of course only have one writer, but I still think we nonetheless run into the readers-writers problem. Basically, what I would like to do ideally is:

  • GS notices it needs these three strings.
  • GS sends a request off to the EE to give it these three strings.
  • GS blocks while it waits for the EE to get to a point where it can compose the strings.
  • EE composes and sends the strings to the GS, resuming its own work.
  • GS unblocks and prints the strings to the ImGui overlay, continuing on from there.

Does this seem feasible?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just replace the struct with a function that takes the 4 arguments (e.g. `InputRecordingUI::SetRecordingData()), then store it to static variables in the function body.

Since you're using RunOnGSThread(), the update will be atomic.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants