SD 22 and SD 23 occurrences have increased drastically lately #1340
Comments
I have noticed that when I try to delete the affected resource from the cache, it doesn't allow me to delete it "because it's in use", even if I am not connected to any server. Additionally, when I tried to copy the affected resource (ipb in my newer case), it said that it cannot do that because the file is open in Multi Theft Auto WOW64 Helper.
What happens if you try deleting the individual files in the folder? Do you get the same error for every file?
WOW64 helper should not be using any files.
MTA\wow64_helper.exe: I should note that I accidentally (brain on autopilot) closed MTA, so I won't be getting the errors for a while, but I'm sure they'll reappear like they always do lately. (I'm saying this in case there is another test involving cache files: I'll need to get the error again before I can run it.)
If it happens again, try terminating
This issue is almost guaranteed to happen when you join a server with a decent total download size for the first time. Near completion, the client read/write locks a random file, which then remains stuck (and in use) until you restart MTA, even if you stay disconnected from everything for a while. With a decently sized download, it's probably going to happen to some file. I can tell the majority of occurrences happen this way, after which the player is forced to restart MTA and can then finish downloading the server's resources, but over multiple runs. I don't know if the file being stuck is the result of a bad implementation (e.g. download error > file remains in use and locked) or if the download error is the result of the file becoming stuck for some other reason. Either way, servers using both internal and external HTTP are affected, so we're probably looking for a client downloader bug. A null MD5 string may indicate that the client opens a handle to the file that's about to be downloaded but doesn't write the contents to it (because it is read/write locked; other such cases show this too), and then never releases the handle, so the file remains stuck and in use.
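To illustrate the "handle is never released" theory, here is a minimal sketch of how a write path can make a leaked handle impossible by tying the close to scope exit. `FileCloser`, `UniqueFile` and `WriteDownloadedFile` are hypothetical names for this example only and are not taken from the MTA downloader.

```cpp
#include <cstdio>
#include <memory>
#include <string>

// Deleter that closes a C stdio stream when the owning pointer goes away.
struct FileCloser
{
    void operator()(std::FILE* file) const noexcept
    {
        if (file)
            std::fclose(file);
    }
};

using UniqueFile = std::unique_ptr<std::FILE, FileCloser>;

// Hypothetical helper: writes `data` to `path` and reports success.
// The stream is closed on every return path, including the failure ones.
bool WriteDownloadedFile(const std::string& path, const std::string& data)
{
    UniqueFile file(std::fopen(path.c_str(), "wb"));
    if (!file)
        return false;  // could not open: there is no handle to release

    const std::size_t written = std::fwrite(data.data(), 1, data.size(), file.get());
    if (written != data.size())
        return false;  // short write: UniqueFile still closes the stream

    return std::fflush(file.get()) == 0;
}
```

If the real code has an early return between opening and closing a file, a wrapper like this would be one way to rule that path out as the source of the stuck file.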
Sounds like a potential race condition between downloader and checker. (I don't know how distinct those two things are in our code.)
In my case I just tried to reconnect and it was fine. And yeah I agree that maybe some file handle is being left open somewhere. One obvious issue is that this code just returns null if mtasa-blue/vendor/bochs/bochs_internal/crc32.cpp Lines 31 to 60 in a4267de
And we just use that result without doing any error checking: mtasa-blue/Shared/sdk/CChecksum.h Lines 32 to 39 in a4267de
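A rough sketch of what checking that result could look like: instead of returning a bare value that is indistinguishable from "could not open", the function returns a status alongside the CRC. `ChecksumResult`, `UpdateCrc32` and `ChecksumFile` are made-up names for this example and the CRC loop is a plain bitwise CRC-32, not the code from the referenced files.

```cpp
#include <cerrno>
#include <cstdint>
#include <cstdio>
#include <cstring>
#include <string>

// Result of a checksum attempt: either a CRC value or an error description.
struct ChecksumResult
{
    bool          ok = false;
    std::uint32_t crc = 0;
    std::string   error;
};

// Bitwise CRC-32 (reflected polynomial 0xEDB88320); simple rather than fast.
// Pass the previous return value as `crc` to continue over multiple chunks.
std::uint32_t UpdateCrc32(std::uint32_t crc, const unsigned char* data, std::size_t size)
{
    crc = ~crc;
    for (std::size_t i = 0; i < size; ++i)
    {
        crc ^= data[i];
        for (int bit = 0; bit < 8; ++bit)
            crc = (crc >> 1) ^ (0xEDB88320u & (0u - (crc & 1u)));
    }
    return ~crc;
}

// Hypothetical helper: checksums a file and says why it failed, if it did.
ChecksumResult ChecksumFile(const std::string& path)
{
    ChecksumResult result;
    std::FILE*     file = std::fopen(path.c_str(), "rb");
    if (!file)
    {
        // errno is set by fopen on POSIX and MSVC; the C standard does not require it.
        result.error = std::string("could not open file: ") + std::strerror(errno);
        return result;
    }

    unsigned char buffer[4096];
    std::size_t   bytesRead = 0;
    while ((bytesRead = std::fread(buffer, 1, sizeof(buffer), file)) > 0)
        result.crc = UpdateCrc32(result.crc, buffer, bytesRead);

    if (std::ferror(file))
        result.error = "could not read file";
    else
        result.ok = true;

    std::fclose(file);
    return result;
}
```

A caller would then test `result.ok` before comparing `result.crc` with the expected checksum, so an unreadable file is reported as unreadable instead of as a mismatch.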
I can say it also depends on HTTP server performance.
This has been happening quite regularly on our server, using the internal HTTP server. I have confirmed the checksum is correct in both the server's resource and cache directories, but it is corrupted on the client's PC. Unfortunately, I am unable to do more tests since I have not found a way to reproduce it.
One easy way to trigger the 00000 error message is by following these instructions:
You can also simulate an invalid checksum by simply changing "None" to "Read", which only allows other processes (like MTA) to read the file, but not write to it. (Note that if you do this, you should probably change the content of
I've got a pull request ready that will improve error messages (for now, only for internal HTTP servers), which you'll probably see in the timeline here in the next couple of minutes.
So some of the errors will come from https://docs.microsoft.com/en-us/cpp/c-runtime-library/errno-doserrno-sys-errlist-and-sys-nerr?view=msvc-160 and any of these error messages could describe a user's exact issue, e.g. memory problems, too many files open (unlikely since this is apparently done sequentially, see #1511), etc. When reading files for checksumming, we open them using mtasa-blue/Shared/sdk/SharedUtil.File.hpp Lines 959 to 966 in a4267de
The |
dassert is a debug-mode-only assert, so it does nothing in release.
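For anyone unfamiliar with the pattern, a debug-only assert usually looks something like the sketch below. `SKETCH_DASSERT` and `OpenForReading` are hypothetical and are not MTA's actual dassert definition; the point is only that the check disappears when `NDEBUG` is defined, so release builds need a real error path as well.

```cpp
#include <cstdio>
#include <cstdlib>

// Hypothetical macro for illustration; not MTA's actual dassert definition.
#ifdef NDEBUG
// Release build: the condition is not evaluated at all.
#define SKETCH_DASSERT(cond) ((void)0)
#else
// Debug build: report the failed condition and stop.
#define SKETCH_DASSERT(cond) do { if (!(cond)) { std::fprintf(stderr, "Assertion failed: %s\n", #cond); std::abort(); } } while (0)
#endif

// Because the assert can vanish, the failure is also reported to the caller.
bool OpenForReading(const char* path, std::FILE** outFile)
{
    *outFile = std::fopen(path, "rb");
    SKETCH_DASSERT(*outFile != nullptr);  // only trips in debug builds
    return *outFile != nullptr;           // release builds rely on this check
}
```

In a release build a failed open passes silently through the macro, which is why the return value has to carry the error.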
Basically, what I think is happening is that a resource include tries to start this resource while its files are still being downloaded. Then this line mtasa-blue/Client/mods/deathmatch/logic/CResource.cpp Lines 325 to 331 in a4267de
The solution might be to somehow delay the resource loading until
Edit: Looking through the comments has confirmed that it really is the HTTP server: the checksum matches afterwards and the file can be deleted, which means that it couldn't be opened before because the HTTP server was writing to it.
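If the file really is just briefly held by a writer, one generic way to tolerate that is to retry the open a few times with a short pause in between. This is only a sketch of that general idea and not the fix proposed above; `OpenWithRetry`, the attempt count and the delay are all assumptions made for the example.

```cpp
#include <chrono>
#include <cstdio>
#include <thread>

// Hypothetical helper: try to open `path` up to `maxAttempts` times,
// sleeping `delay` between attempts. Returns nullptr if every attempt fails.
std::FILE* OpenWithRetry(const char* path, int maxAttempts, std::chrono::milliseconds delay)
{
    for (int attempt = 1; attempt <= maxAttempts; ++attempt)
    {
        if (std::FILE* file = std::fopen(path, "rb"))
            return file;  // caller owns the stream and must fclose it

        if (attempt < maxAttempts)
            std::this_thread::sleep_for(delay);  // give the writer time to finish
    }
    return nullptr;
}
```

A retry only hides the race, so waiting for an explicit "download finished" signal would be the cleaner option if the code has one.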
That's issue #1466.
Anyway, here's an update. After 11f94de (PR #1778), the error messages for the most common download failure, the one that fits my story in #1340 (comment) and occurs when you first download a server's resources or need to download a significant update, have changed to something more descriptive:
So does "CRC could not open file: Permission denied" mean something, or is the logging that we added with PR #1778 still not sufficient to find out? It still sounds like the affected files are in use (occupied) by MTA while the HTTP server wants to modify/overwrite them into their final form. Can that be the meaning of "Permission denied", though?
We could make the error more specific by using Win32 API stuff to open the file instead of C stdio (So like using CreateFile instead of
Could also use
Edit: Took a look at it. It uses
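A minimal sketch of the CreateFile idea, assuming the goal is to tell "file is in use" apart from a genuine permission problem. As far as I know, the MSVC C runtime reports a sharing violation from `fopen` as `EACCES`, which would explain "Permission denied" showing up for a locked file. `DescribeOpenFailure` is a hypothetical helper, and the non-Windows branch exists only so the sketch compiles elsewhere.

```cpp
#include <cerrno>
#include <cstdio>
#include <string>

#ifdef _WIN32
#include <windows.h>
#endif

// Hypothetical helper: returns an empty string when `path` can be opened for
// reading, otherwise a description of why it could not.
std::string DescribeOpenFailure(const std::string& path)
{
#ifdef _WIN32
    HANDLE handle = CreateFileA(path.c_str(), GENERIC_READ, FILE_SHARE_READ, nullptr,
                                OPEN_EXISTING, FILE_ATTRIBUTE_NORMAL, nullptr);
    if (handle != INVALID_HANDLE_VALUE)
    {
        CloseHandle(handle);
        return "";
    }

    const DWORD code = GetLastError();
    if (code == ERROR_SHARING_VIOLATION)
        return "file is in use by another process (error 32)";
    if (code == ERROR_ACCESS_DENIED)
        return "access denied (error 5)";
    return "Win32 error " + std::to_string(code);
#else
    // Fallback for other platforms: plain stdio plus errno.
    if (std::FILE* file = std::fopen(path.c_str(), "rb"))
    {
        std::fclose(file);
        return "";
    }
    return "errno " + std::to_string(errno);
#endif
}
```

With something like this in the checksum path, the log could say "in use by another process" instead of the more ambiguous "Permission denied".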
Describe the bug
I am not really sure how to describe this other than with the title. While administering another server, I noticed that SD 22 and SD 23 are shown for about 15 players out of 150. This is not normal.
SD 22 seems to happen the most. I have now received the error myself:
The last time I encountered HTTP errors like this was in 2015.
Some of my friends have also encountered this MD5 0000 thing within the last month.
I won't say with absolute certainty that this is not normal, but I have a good feeling that its frequency has increased within the last week or month. That is the only reason I'm reporting it: so that it can get some attention or, even better, be solved or explained.
To reproduce
I don't know
Expected behaviour
To not happen
Version
I don't know
Additional Information
Restarting the affected resource doesn't solve the issue; it still spits out errors:
This leaves the admin panel bugged, with no fix other than restarting the server, which can become a problem on a production server.