OBS Studio (various versions) fail to record using nvenc (various versions) randomly on Windows 10 #8009
In the future, please provide at least one log file that isn't hosted as a third-party archive. Also, be mindful when sharing dump files, as they contain all information that OBS might have in memory, including potentially sensitive information such as stream keys. |
Alright, I will see to it that a new log file is generated. ...one moment... https://obsproject.com/logs/t6GgAX64K2SLwi91 - Sixth recording. File has 1K. No idea whether this is helpful or not; this time the recording stopped after a few seconds, but apart from that the symptoms were the same: "Total Data Output" and "Bitrate" being zero, and "Disk full in (approx.)" showing a large negative number. Edit: The recording stopped when I Alt-Tabbed out of the full screen game to get hold of the OBS GUI. Edit 2: Interestingly, in Mass Effect 3 Legendary I had to use the Task Manager to kill obs+ffmpeg again. Only ME2:LE seems to cause it to stop by itself when Alt-Tabbing out. |
Just tested the new OBS-29.0.0. First recording went through just fine, second froze. Here is the log. Edit: Let's go through the log snippet of the failed recording
Here a whole bunch of stuff is simply missing. In the log you can see that after the successful recording logged above, many entries labeled
Now
This is after I Alt-Tabbed out of the game, closed it, and closed OBS (which was still "recording" with "Stopping..." button in blue background) window.
The rest is the result of killing the obs64.exe from task manager detail view. I hope the lines that did not show up might tell someone what the heck is going on. |
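For anyone who wants to repeat the comparison between the successful and the failed recording, here is a minimal Python sketch that lists messages present in one log excerpt but missing from the other. It assumes every log line starts with an `HH:MM:SS.mmm: ` timestamp, and the function name `missing_messages` is made up for this illustration; it is not part of OBS.

```python
import re

# Assumption: each log line begins with a timestamp such as "07:59:17.123: ".
TIMESTAMP = re.compile(r"^\d{2}:\d{2}:\d{2}\.\d{3}: ")


def strip_timestamp(line):
    """Remove the leading timestamp so identical messages compare equal."""
    return TIMESTAMP.sub("", line.strip())


def missing_messages(good_log, bad_log):
    """Return messages found in `good_log` but not in `bad_log`, in original order."""
    seen_in_bad = {strip_timestamp(line) for line in bad_log.splitlines()}
    result = []
    for line in good_log.splitlines():
        message = strip_timestamp(line)
        if message and message not in seen_in_bad:
            result.append(message)
    return result
```

Paste the block of a working recording as `good_log` and the block of a frozen one as `bad_log`; whatever comes back is the part of the start-up sequence that never got logged.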
I found some sort of "workaround", at least in Mass Effect 1 Legendary Edition; I haven't tested this with ME2:LE or ME3:LE yet. When the recording does not start (Data written and Bitrate stay at 0), I press my hotkey to stop the recording, and the button in the OBS window stays on "Stopping..." with blue background. I then Quicksave the game and Quickload. During the loading, when the scene is rebuilt, the "Stopping..." disappears and OBS is no longer frozen. This could have been a coincidence, so I made some tests, and I can reproduce this reliably. OBS stays frozen until I load a save game, which would of course reset the engine and rebuild the scene. I have tested waiting for between 10 seconds and over 5 minutes to make sure this was no coincidence, but OBS always came back to life at the same point in loading the savegame. I hope this gives some more hints about where to look... |
Unfortunately the "workaround" does not always work. Today I had one occasion where the recording did not stop on game save reload. When I Alt-Tabbed out, the "Stopping..." button stayed frozen. When I closed OBS Studio, it said this would stop all recordings, but obs64.exe and obs-ffmpeg-mux.exe were still in the task manager and had to be killed. Unfortunately I did not remember to take a dump of the process first. If this happens again, I'll dump it and make it available for you. |
I could reproduce the "workaround not working" full freeze. The recording would not stop, 0 bytes output, and I tried the "reload save game fix" from above. But the recording was stuck like so often now. So I tried to Alt-Tab out; this "fixed" the freeze a few times, too, but not this time. Eventually I closed ME1:LE, closed OBS Studio and went into Task Manager to find obs64.exe there still running with 86% GPU usage. (For what?) Here is the log: 2023-01-18 07-59-17.txt In the log you can see:
|
This issue has cost me so much time already that I resorted to something desperate: a full reset of all drivers and going back to OBS 27.1.3. So I once again deleted all graphics drivers with DDU, but this time I re-installed the first driver packages for both Intel and nvidia graphics that Dell had, so I went back to nvidia drivers 451.67. I then completely removed OBS, installed OBS 27.1.3 and deactivated "HAGS". The OBS Studio shortcut is configured to start as Administrator. As the nvidia drivers had to be fully wiped, I had to re-do my configuration: As those nvidia drivers are ancient in the driver world, I updated to the latest available pre-500 drivers: 474.14. So far I was able to record over 50 clips without any issues. Next step is to update to the latest Dell-approved nvidia drivers, which will be 512.36. |
I've been following this, but haven't been able to reproduce any of the issues described, and it's sounding more and more like an environment or driver issue. I'm going to go ahead and close this for now, but if this starts happening again and can be isolated to specific reproduction steps, feel free to comment and we can reopen. |
I am not convinced yet, because I was able to record dozens of clips with OBS Studio 27.1.3 and nvidia drivers 517.66 (latest official drivers from Dell). However, maybe you are right. I will update to the latest OBS and nvidia drivers today, and if the issue is gone, it means that something went awry somewhere and was fixed by me completely removing everything and starting from scratch with the drivers. I truly hope it is an environmental thing that got fixed with all the cleaning and reinstalls. But if the freezing comes back after the updates, then I am afraid I have to reopen this. Edit: Sorry, but the first attempt to record something with OBS Studio 29.0.1 went into a hard freeze, where I had to kill the process in task manager. |
I have rolled back and was able to do plenty recordings using OBS Studio 27.1.3 with latest nvidia drivers 528.24 I am very sorry, I wanted this to be an issue with some environmental things, but this is clearly a bug introduced in OBS 27.2 as that's the earliest version I got the freezes with. Please re-open and investigate. I have provided dumps and logs. If there is anything else I could provide, just give the word. For the time being I am nailed on OBS 27.1.3. |
@Fenrirthviti : I have proven that this is an actual bug. Could this please be reopened? Bugs do not go away by closing their reports. To clarify: I am super sorry that I was not able to pin this on some driver or settings issue on my machine. But I have also reported plenty of clues where to look (like the log comparison above and dumps), so I do hope this can be fixed some day. OBS 27.1.3 with nvidia drivers 528.24 has been super stable. No issue for over 150 clips recorded so far. Upgrade to OBS 29.0.1: full freeze on any of the first 10 attempts to record something. Totally random, like before. If I should attempt a shot in the dark, I'd say this is some deadlock in a thread race. |
You have proven there is an issue on your system, but there is still no evidence I can find that this is a specific bug in OBS. This is most likely some kind of environment issues specific to your system. I can reopen this, but I've tested everything exactly as presented, with the driver versions given, and have not been able to replicate. If anyone else is able to test and confirm, that would be helpful for narrowing down what is going on. As a note, I don't have the experience to dive in to crash dumps, so if anyone who would like to take a look at those and provide any insight that would also be welcome. |
Thanks. Now, what I do know, is, that the shipped ffmpeg version has been upgraded with OBS 27.2, which is one of the reasons why I am so desperate to get the higher versions working. So here are my video settings: (*)
Here are my encoder settings for OBS 27.1.3:
And finally those settings adapted to OBS 27.2+
(*) Just to make sure this is not becoming an issue again, here another quote from the old issue about the settings mapping:
I do not want to sound rude, I just want to avoid wasting time with discussions about hardware and settings that worked fine for a 4-digit amount of recordings. I have reasons why I am recording the way I do record. Edit: Of which 2,650 MKV files are still on my Backup drive. |
did you make sure that HAGS is off each time you tested with 29 ? with reinstalls it gets sometimes surreptitiously re-enabled. HAGS has been causing a lot of issues. You do note that hags was off with 27, just want to make sure it was off too with 29 |
It was always on and never caused any troubles. As you can see in #6062 I tried turning it off and it caused my FPS to drop tremendously. But the last time I cleaned everything with DDU and started from scratch, I turned HAGS off first, and this time I had no ill effects. So I have tested both OBS-29.0.0 and OBS-29.0.1 with both HAGS off and on. |
The short answer is: no. The long answer is: This may be possible, but perhaps not in the way you think. We do a one-time NVENC settings migration in 28.1.0+ (which was improved in 28.1.1) using recommendations from NVIDIA's NVENC Preset Migration Guide to migrate things like "Performance", "Quality", and "Max Quality" to "P3", "P5", and "P5" respectively, with appropriate Tuning and Multipass parameters based on a best guess of common GPU generations and selected output resolutions in the wild. Once this migration has been performed, the user is free to fine-tune and select any parameters they want, and the user's chosen parameters will be used instead. This one-time migration does not occur again once a "preset2" value is present in your encoder settings. It is not a live translation that occurs on each launch of the app or encoder initialization. Separately, there is code in our NVENC implementation that checks during encoder initialization if "preset2" has been set, and if it has not been set, then it will translate the settings in place to appropriate combinations of Preset, Tuning, and Multipass. However, once again, this does not occur if "preset2" is set. Further separately, FFmpeg itself will attempt to translate the old pre-SDK10 preset values to SDK10+ preset values if an older value is presented as the preset. Since we are passing new preset values as of OBS Studio 28.1, this translation does not occur. FFmpeg itself also uses the new preset values since FFmpeg 4.4. At this point, please note that NVIDIA's NVENC Preset Migration Guide and various encoder throughput figures most likely assume YUV420/I420 or NV12 (see: https://developer.nvidia.com/blog/introducing-video-codec-sdk-10-presets/), not YUV444/I444. I444 is more data, and requires more work in OBS due to data conversion, and thus may result in higher load.
Additionally, because you are using I444 instead of NV12 or P010, OBS is falling back to the FFmpeg NVENC implementation, which I do not believe uses texture sharing, so there will be additional system resource load. You can see that your system has some mild overload in the log posted on Jan 7:
All of that said and set aside, while "the SDK10+ presets are not 1:1 when compared to the pre-SDK10 presets" may be a contributing factor, it is likely not the root cause of either, "Clicking the 'Stop Recording' button sometimes does not stop my recording immediately," or, "My MKV files sometimes end up with 0 bytes in them." Nothing stood out in the dumps. As there is already an abundance of information in this Issue, please answer the following questions concisely to confirm some details:
I have a vague idea of part of the problem, but I don't know enough about how it all pieces together to voice my thoughts at this time. We are looking into it. |
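To make the "one-time migration" point above easier to picture, here is a rough Python sketch of the guard logic only. It is not the OBS implementation: the key name `preset` for the old value and the use of display names as its values are assumptions for illustration, and the Tuning and Multipass choices are left out because the exact per-GPU values are not listed in this thread.

```python
# Preset mapping as stated in the comment above.
# Tuning and Multipass are deliberately omitted (exact values not given here).
OLD_TO_NEW_PRESET = {
    "Performance": "P3",
    "Quality": "P5",
    "Max Quality": "P5",
}


def migrate_once(settings):
    """Return a copy of `settings` with "preset2" filled in, unless it already exists."""
    migrated = dict(settings)
    if "preset2" in migrated:
        # Already migrated, or tuned by the user: never translate again.
        return migrated
    # Assumption: the pre-migration value lives under the key "preset".
    old_value = migrated.get("preset")
    if old_value in OLD_TO_NEW_PRESET:
        migrated["preset2"] = OLD_TO_NEW_PRESET[old_value]
    return migrated
```

The point of the guard is that a value the user picked after the migration is never overwritten on a later launch.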
256GiB
197GiB at the moment
Internal KXG60ZNV1T02 NVMe KIOXIA 1024GB (6 Gbps) I made several tests using fio on Gentoo Linux under various workloads. The worst result was:
and the best result was:
27.2.4 There were some failed recordings with 1K files as far as I remember. The problem got out of hand with full freezes in November after updating to 28.1. (*) Mentioned in the other issue, the ones about my settings and the new translation after the ffmpeg upgrade.
Yeah, I went for breakfast and came back half an hour later to a still "Stopping Recording..." Interestingly enough tabbing out of the game I am recording sometimes, not always, ends the stopping. The other times I have to kill obs and ffmpeg using task manager.
NVIDIA Shadowplay is deactivated. Windows Game Bar is not installed. Origin/EA overlays are deactivated.
I never tried that, but will try this out tomorrow.
I have it currently enabled. One odd thing: Now that I have rolled back to 27.1.3 I did not bother to turn it off, as you can see above, and have the first few seconds of encoding lag with it turned on... I wonder whether they go away if I turn lookahead off again...
Thank you very very much! |
I think this is the first I've heard of this specific Issue occurring in 27.2.x. Do you have any logs of this Issue occurring in 27.2.x? The other issue (presumably #6062) was about FFmpeg's NVENC preset translation. It is not about encoder failures. This Issue, as far as I can tell, is about an encoder failure of some kind. Let us please keep this Issue scoped to its specific problem. |
No. I would have to install that version first. I can try tomorrow. Here are today's experiments: recordEncoder.json after upgrading to and starting obs-29.0.2:
13 recordings so far in Mass Effect 2 Legendary Edition without any issues. 😄 Updated the settings via File->Settings->Output : Switched Multipass Mode from "Two Passes (Quarter Resolution)" to "Single Pass" resulting recordEncoder.json:
4 recordings with these settings, then the next full "stopping..." freeze came. (Log: https://obsproject.com/logs/EpEQzhN4pHkB5Sup)
Direct Copy&Paste, I did not delete anything between line 3 and 4, there simply was nothing. Went back to top configuration to see whether the 13 were just luck... Only one recording this time, froze on second recording. (Log: https://obsproject.com/logs/TOOGfOiZAunZDzZu) Maybe something is stuck or jammed when ending the first session? I will reboot and try a complete fresh login. ... unfortunately I got a call then and my time was up. I will try again tomorrow, and will then generate the 27.2.4 log you asked for. |
The most recent logs, notably, lack this line:
This is perplexing, as all other cases I've observed had this. If you're getting encoder freezes/failures and that line is not present, then I'm afraid my lead is gone. |
No, the 13 were just luck. Second clip froze this morning. This time I killed the ffmpeg process (after closing OBS via close icon). My idea was that ffmpeg might hang and simply block OBS, but it did not. obs64.exe is still there.
Maybe it's a new symptom in OBS 29.0.2? |
Started 27.2.4, removed lookahead and b-frames again and deactivated multipass in the settings.json:
(Although I am pretty sure 27.2.4 did not have a "multipass" setting.) First attempt froze: https://obsproject.com/logs/zimeg5x27HHGulU7 |
I can not reproduce the issue on Gentoo Linux with obs-studio-29.0.2 Configuration printed: What version does OBS Studio for Windows ship exactly? All I could find was:
That is either ffmpeg 5.0 or ffmpeg 5.1 ... which one is it? I only found that I could update to ffmpeg 5.1 on my Gentoo and see whether the issue can be reproduced there. But I would have to rebuild some really heavy packages like QtWebEngine, so I'd rather not do it if OBS is built against ffmpeg 5.0. |
I don't know what this means. Could you please clarify what version froze?
Unlikely. The specific code for that log line was added in OBS Studio 28.
FFmpeg 4.4.1 (plus some cherry-picked commits and patches) for OBS Studio 27.2. FFmpeg 5.0.1 for OBS Studio 28.0 and 28.1. FFmpeg 5.1.2 for OBS Studio 29.0. The DLL properties would give not only the FFmpeg versions but also the exact individual library versions. I will add that there is a similar report of this occurring on Fedora with OBS Studio 28 (#7534), though I would ask that you please stick to this GitHub Issue as we consider how or whether they are the same or just related. My previous theory was that perhaps commit 898256d played a part in this, or that some other 28-specific changes may be involved (the FFmpeg encoders were refactored). I am still not 100% convinced that the encoder failure you're seeing in 27.2.x is the same encoder failure in 28.x+ - you may simply be running into multiple different types of encoder failure. I will also add that so far, I have only seen these issues with FFmpeg NVENC, which in your case, is being used because of the specific settings you have selected. If you would like to provide additional debugging information, you could build OBS with these lines enabled (using |
Version 29.0.2 - the version with which I did 13 recordings that went just fine.
Ah, okay. I was on Linux and only did some inspection of the content of the full package zip and the source archive. Before I answer the rest, here are today's experiment results. (I am still hoping that this is "just" some weird accumulation of issues on my system, and that the "freezing" in obs is just a symptom...) Went back to 27.1.3 and after a reboot I recorded over 50 clips without any problems. So there must be something happening with all the updates, downgrades, uninstalls and reinstalls on Windows. So let's be more thorough this time:
Erm.. the default is to disable look-ahead, and to enable Psycho Visual Tuning? Really? The resulting recordEncoder.json astonishes me:
Looks like it only stores what is not set to default. Which leads to the question what else is hard-coded that I couldn't've taken into account, yet? First Tryout
This is weird. The settings when I had absolutely no encoding lag in 29.0.2 versus now:
Note: DXGI_SWAP_CHAIN_DESC is also the same. Alright... I am desperate, so I now set bf to 2, disabled multipass and enabled both lookahead and psycho_aq.
Second Tryout That did the trick, 0.0% encoding lag again. Unfortunately, after 11 recordings went smooth and fine, the 12th froze again. But the recording eventually stopped by itself when I exited the game, so it left a 1K mkv behind, and here is the log snippet, commented by me:
At least I did not have to kill it and there is something written to the log this time.
The more I think about it, the more I agree.
Yes, jim-nvenc is limited to NV12 color format, and 4:2:0 with 2 planes do not cut it for me. It may be fine for streaming, but not for the post processing I do.
Thank you very much for the Link, I will try to get it built on my dev VM. (Gaming dual-boot has no dev tools, but I have a VM for that.)
Oh, don't worry. Some of my tools I built at work spill out tons of lines, like one for every *alloc/free, when run in debug mode. As long as I can grep or regex search, I'll be fine. 😉 |
After looking over the commit briefly, several things came to my mind... Maybe I should try to make a tsan build on Linux first. That's how I check my programs which multi-thread. Also, with such bufferings, an lsan build might be worth the hassle... |
I finally had the freeze on Linux. OBS is just sitting there with "Stopping Recording..." I have compiled the whole thing with Anyway, I hope I can hook in a gdb session to exactly see where we are. |
Some first impression information: (Without any in-depth knowledge about what they mean, yet)
It does not look like the Hotkey even resulted in an attempt to stop the recording. But ThreadSanitizer printed out a lot of warnings about data races.
The other might just be Qt related and more or less harmless, but I'll go through them nevertheless. |
I am learning my way around the sources while fixing various issues. Most are harmless but still bugs, so might be worth it. Some are more serious, like heap use after free. (My work can be seen here: https://github.com/Yamakuzure/obs-studio/commits/fix_multithreading) (*) However, I got a new clue today. OBS Studio 27.1.3 also got messed up lately, since EA forced all Origin users, including me, to switch to the abysmally buggy "EA Desktop App". This app is notorious for hogging CPU and GPU, making recording almost impossible whenever it does anything in the background. (**) The behavior is different, like, for example, the recording starts and runs for a few seconds, and then halts. When I stop it, I get the "Stopping..." button that never goes away on its own, but it is normally enough to Quickload the last Save, which would re-init the game engine, to make OBS finish the stopping. With HAGS turned off I not only get the freezes I have never seen on that version before, but also a very odd thing that I can not really explain: Sometimes the recording does start, but the FPS in the stats window get displayed in red and are somewhere between 50 and 90. The game itself is more like 0.5 FPS. Ending the recording takes a while, but the moment OBS stops recording, FPS go back up to solid 145 ingame and fixed 120 in the stats window. My best bet right now is that the EA Desktop App is "stealing" GPU memory in the background, and the little that OBS adds is then too much for my T2000. And to make things worse, Windows started greeting me with full screen "ads" about upgrading to Windows 11. (*) Working in the sources feels odd, like a trip back to the 1990s. Hard break on 80 characters? Really? My tty on my laptop has a width of 236 characters already. (**) Although meant to fix problems with Shadowplay, the pinned comment here: https://www.youtube.com/watch?v=DwxdASZz5is has a workaround to make the EA Desktop App play nicely with recording.
Might be helpful with OBS, too? I'll test it this weekend. (*) Also I found this post: https://www.reddit.com/r/linux_gaming/comments/q98x0u/disable_origin_client_hardware_acceleration/ So it looks like I never had problems with HAGS enabled, because it caused origin.exe to run on the Intel HD GPU. And when I deactivated HAGS the first time, nvidia drivers scheduled it onto the nvidia card? (***) I was so curious that I tried it out. Re-enabled HAGS and made sure that EA stuff is kept away from my Nvidia Quadro. But time is up for today, so tests will come tomorrow. |
There's a lot here. I'm going to respond to the items that seem relevant to this Issue.
I'm not inclined to agree with that assessment at this time.
To me, this is starting to sound like nothing in one particular version in OBS causes one distinct issue.
This almost sounds like the GPU is freeing up resources during a quick load, which causes the encoder to finally dump its backlogged data. This is how things are supposed to work, as far as I understand.
With what information I have available, I'd be inclined to agree that you're hitting some kind of resource limit, causing the encoder to get behind and then get stuck or fail. What does GPU-Z say about the GPU's total memory? What does it say about load and memory usage when you encounter one of these problem sessions? It should be easy to verify the GPU memory usage of different versions of OBS.
I don't see anything in this video that is something that affects this scenario. They are talking about Shadowplay hooking EA App as a game. OBS would not do that, unless it's a Vulkan app. Even then, I do not believe it would contribute to a potential encoder failure.
Sure, disabling hardware acceleration for other apps may free up GPU resources (in the specific example in the post, they seem to get back about 300MB of GPU RAM). You may be forcing the app to do software rendering, which may be less performant, or maybe your CPU has enough headroom to accommodate this, or maybe the app changes its behavior if it does not have access to hardware acceleration. All of this is starting to sound like a system resource usage issue causing an encoder to fail or get behind (the frames are coming out of the encoder slower than we're putting them in).
Intel GPUs do not support HAGS. HAGS should not necessarily decide which GPU an app runs on. It just makes the GPU-based scheduling processor handle GPU task scheduling instead of the CPU. That origin.exe switched GPUs is odd, but should not be based on the HAGS setting. The one point of interest here is that perhaps, because origin.exe was not on your NVIDIA GPU, the NVIDIA GPU had more headroom (either in GPU Load or in Memory Used/Available), which would still point to a resource constraint issue. Again, let us please focus on the important points of this Issue and not get lost in the weeds on what workarounds people recommend for avoiding conflicts between EA App and other software, or the OBS coding style, or Windows 11 ads. This Issue, while important to us, is already extremely long, and I find that such Issues are less approachable and off-putting to people who would otherwise be interested in resolving the problem. Thank you for your understanding. |
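As a back-of-the-envelope illustration of the "frames are coming out of the encoder slower than we're putting them in" scenario, the backlog grows linearly with the rate difference. The function below is a hypothetical helper, and the output rate in the example is an invented number; only the 120 fps input figure echoes the stats window value mentioned earlier.

```python
def backlog_frames(fps_in, fps_out, seconds):
    """Frames still queued in front of the encoder after `seconds`.

    The backlog cannot go negative: an encoder that is faster than its
    input simply keeps the queue empty.
    """
    return max(0, (fps_in - fps_out) * seconds)


# Hypothetical example: 120 fps submitted, only 100 fps encoded, for 10 seconds.
queued = backlog_frames(120, 100, 10)  # 200 frames waiting
```

With any finite buffer, such a backlog eventually has to stall or drop frames, which would fit the resource constraint theory, though it does not prove it.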
nvidia-smi stats on idle before starting anything:
I have logged with GPU-Z, and the values say, that the maximum memory consumption was 3,913 MB. While this happened, I recorded a video with obs-27.1.3 with up to 368,000 kbit/s. But I also went through some scenes I had this stuttering, and they now worked fine after I set the wretched EA desktop App to be locked on the Intel UHD. So this was completely unrelated and not a clue at all.
I got carried away a bit, sorry. Meanwhile I have finished with the Sanitizers. If I can no longer reproduce the freezing on Linux, I will build it for Windows and see what happens there. |
Finally a breakthrough! I caught obs-ffmpeg-mux here:
The source in question is:
So fread() basically hangs on reading stdin endlessly, because the header wanted is not sent. It looks like both I'll test a possible fix for that one tonight. Edit : It looks like there is more to it. Still investigating why the muxer does not get its headers in certain circumstances. |
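To illustrate why a blocking read on a pipe never returns when the writer does not send the expected header, here is a small Python sketch that waits with a timeout instead. This is not the obs-ffmpeg-mux code and not the fix I am testing; `read_exact` is a made-up name, and `select` on pipe file descriptors only works on POSIX systems, not on Windows.

```python
import os
import select


def read_exact(fd, size, timeout):
    """Read exactly `size` bytes from `fd`.

    Returns None if no data arrives within `timeout` seconds (applied to
    each wait, not to the whole call) or if the writer closes the pipe
    before `size` bytes were sent.
    """
    chunks = []
    remaining = size
    while remaining > 0:
        ready, _, _ = select.select([fd], [], [], timeout)
        if not ready:
            # Nothing was sent: a plain blocking read would hang right here.
            return None
        chunk = os.read(fd, remaining)
        if not chunk:
            # End of file: the writer went away before the header was complete.
            return None
        chunks.append(chunk)
        remaining -= len(chunk)
    return b"".join(chunks)
```

A reader built like this can at least log "header never arrived" and exit, instead of sitting in the task manager until it gets killed.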
When switching darray::num to atomic_size_t, this also made the struct a non-trivial type in C++ mode requiring a non-trivial constructor. Embedding such into an anonymous union is invalid in C++17 when compiled with -Wpedantic using gcc, or with /std:c++17 using msvc. See: C5208/C7626 (learn.microsoft.com/en-us/cpp/error-messages/compiler-warnings/C5208?view=msvc-170) There are two possible solutions: A) change `num` to `volatile size_t` like `capacity` or B) overhaul both darray/DARRAY to comply with C++17 This is solution B) - marked EXPERIMENTAL - to test whether certain observed data races are (at least partly) responsible for obsproject#8009. If this overhaul indeed helps solving the mentioned issue, the more lightweight solution A) can be tested. Otherwise it must be discussed which solution to prefer. Signed-off-by: Sven Eden <sven@eden-worx.com>
This commit has two updates. 1) Modernise circlebuf_free() and circlebuf_init(): As circlebuf::size is now atomic, have circlebuf_init() call atomic_init() on it if memset() is not safe to use. Also do not call circlebuf_init() after circlebuf_free(); that made no sense with both calling memset() anyway. 2) Change obs_output_actual_start() so that it initializes the capture circlebufs in output _before_ they are used. This change dramatically reduced the occurrences of the freezes as reported in obsproject#8009, although they could still be observed. However, this at least seems to point at synchronization problems being the root cause of those freezes. Signed-off-by: Sven Eden <sven@eden-worx.com>
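The ordering idea behind the second change can be shown with a generic Python sketch: set up every shared buffer first, and only then start the thread that reads from it. The class and method names are invented for this illustration and do not mirror the libobs code, which is C and uses circlebufs rather than queues.

```python
import queue
import threading


class CaptureOutput:
    """Toy output with one buffer per track and a single draining thread."""

    def __init__(self, tracks):
        self.tracks = tracks
        self.buffers = None  # not usable until start() has run
        self.worker = None
        self.seen = []

    def start(self):
        # Step 1: create every per-track buffer first ...
        self.buffers = [queue.Queue() for _ in range(self.tracks)]
        # Step 2: ... and only then start the thread that reads them.
        self.worker = threading.Thread(target=self._drain)
        self.worker.start()

    def push(self, track, packet):
        self.buffers[track].put(packet)

    def stop(self):
        for buf in self.buffers:
            buf.put(None)  # sentinel: no more packets on this track
        self.worker.join()

    def _drain(self):
        # Drain track by track until each track's sentinel is reached.
        for buf in self.buffers:
            while True:
                packet = buf.get()
                if packet is None:
                    break
                self.seen.append(packet)
```

If the two steps in `start()` were swapped, the worker could touch `self.buffers` while it is still `None`, which is the same class of problem as using a buffer before it has been initialized.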
UPDATE: Rebased my branch to 29.1.0-beta4 (Yamakuzure@194424d). Here is an update from my side: I have just recorded 68 clips of varying lengths without any freeze. I finally fixed the issue. Or so it would seem, as my previous "best" was 12 clips before OBS froze. (Update: After the rebase on 29.1.0-beta4 I successfully recorded 46 clips with OBS built in release mode.) I believe the two relevant commits to look at were:
I changed a lot, and I also had to fork https://github.com/Yamakuzure/libdshowcapture.git, https://github.com/Yamakuzure/obs-browser.git, https://github.com/Yamakuzure/ftl-sdk.git, and https://github.com/Yamakuzure/obs-websocket.git for everything to work.
(Note: Could we please get rid of that 1980s line length limit? It was really odd to work with an IDE that was two-thirds empty. Also, the 80-character limit causes really nasty line breaks which make stuff hard to read. Thanks.)
(*) Many commits are needed, so these cannot be "cherry-picked", sorry.
I know these are many, many changes, but at least the whole suite now compiles fine with -Wall + -Wextra + -Werror on GNU GCC and with /W4 + /WX on Visual Studio 2022.
Also, there is stuff I could not test. I have no DeckLink card, and I could not get AMF to compile. I do not have an AMD card, so I could not have tested it anyway.
I have rebased on 29.1.0-beta4, but everything together is 95 commits now.
Thank you very much for your patience and support! |
I wanted to make sure that this is not a dud, so I cloned the repo again and built 29.1.0-beta4 without my patches. What can I say, 56 clips later it is clear that somewhere between beta3 and beta4 the issue got fixed. I then went back to 29.0.2, where the issue still existed, applied my patches, and the issue was gone. So, to cut this short, it is better to go with the official fix, whatever that was, and forget about my fork. ;-) (Unless you find some of my commits helpful, like the massive improvement to the logging. 😁 )
And no, I am not salty that someone "beat me to it", although I invested several hundred hours into this. I am just happy that I can film again without any problems. For the last 6 months filming was a massive pain, and the random freezes cost me so much time and nerves that I am so, so happy this is over.
Thanks again for your patience and a huge "Thank You!!!" to whoever fixed it! |
Operating System Info
Windows 10
Other OS
No response
OBS Studio Version
Other
OBS Studio Version (Other)
28.0.3, 28.1.2, 29.0.0_beta1, 29.0.0_beta3
OBS Studio Log URL
Full log in dump archive, see below
OBS Studio Crash Log URL
No response
Expected Behavior
Press hotkey, do recording, press hotkey, recording stops and video is fine.
Current Behavior
Please see below, this can not be described in a few simple sentences.
Steps to Reproduce
Anything else we should know?
(Renewal of #7946, which was about a different issue.)
Symptoms:
(The negative number display will be suppressed by Prevent negative "disk full in" calculation when there's no output #7999.)
OBS Studio (2)
+> OBS Studio (8,3% CPU, 677MB RAM, 87,4% GPU (GPU 1 - Video Encode))
+> obs-ffmpeg-mux.exe (0% CPU, 1,1MB RAM, 0% GPU)
What I tried so far:
Rolled back and tested combinations of:
Nvidia drivers
with OBS Studio:
(The first occurrence was with R525 U2 + OBS 29.0.0_beta1, that's why I listed that version above, too.)
Also, to make sure this is not some orphaned file from some borked driver update interfering, I have booted into safe mode and used DDU to fully remove Intel UHD and nvidia drivers.
After another reboot into safe mode I used tweaks.com windows repair and let the full repair package run.
Hardware:
Dell Precision 7550 with custom upgrades.
CPU: Intel(R) Core(TM) i7-10875H
RAM: 2x 32GB Dual Channel 3GHz (Plus 2 empty slots. Nothing soldered or single channel)
GFX: Quadro T2000, 4GB RAM
Storage: 2x 1TB Class 50 NVME
OS:
Windows 10
Dump files and log, made with nvidia drivers 527.27 (11/28/2022) and OBS Studio 29.0.0_beta3:
https://mega.nz/file/iFgnUL4C#j2wtv65vUaChG08M7-rGM50PioruOQb6bLbdZJ5KueQ
The contents of the archive are:
Full log of the last recording session.
Dump where, after closing the window, the obs64 process went away on its own and the resulting video was metadata only with 1K size.
Dump taken while the "Stopping recording..." button was showing.
Dump taken after the window was closed, but the obs64 process and this one stayed active.
Dump taken while the "Stopping recording..." button was showing.
I have recorded several thousand clips without any problems except #6062, and after that was fixed I never had any issue until sometime in November. This also means that I have recorded dozens of clips with the combination of OBS Studio 28.0.3 and Nvidia drivers R515 U4 (517.40).
The really mean and annoying detail is that I cannot guarantee that said combination never had any issue. This happens so randomly that I might just have been lucky in October.
So my next step will be to roll back to
OBS Studio 27.2.4 with Nvidia drivers R515 U2 (516.59) from June 29, 2022, because that is the last combination I know works for sure.
I hope my dumps can help you guys! Thank you a lot for all of your hard work!