HTTP requests are causing the server to freeze #5515

Closed
wasied opened this issue Jul 9, 2023 · 15 comments

Comments

@wasied

wasied commented Jul 9, 2023

Details

I'm opening this issue because, since the last update, all the servers on my machine have been freezing irregularly for a few seconds at a time.

After several tests and attempts to understand the problem with some members of the Garry's Mod Discord, we concluded that this isn't caused by any particular script, but by the HTTP function itself. That doesn't seem unreasonable, considering a recent update changed DNS lookup.

My machine is hosted at OVHCloud. The firewall restricts TCP traffic to a bare minimum (we do not use TCP port 27015), and UDP is completely open but goes through OVHCloud mitigation, only on the ports of the Garry's Mod servers. We are using Pterodactyl (so Docker).

I'm not the only one with this problem: several colleagues had the same issue. All were at OVHCloud, but that doesn't necessarily make it the cause.

UPDATE: People who weren't with OVHCloud have had the same problem. However, they were all using Pterodactyl.

Steps to reproduce

  • Simply enter this in the server console:
    lua_run for i = 1, 20 do http.Fetch("https://google.com", function() end, function() end) end (you can try with any URL)

Your server traffic should be cut off for a few seconds, as the server simply stops responding. Some servers are affected, some aren't.

  • With 2 requests made at the same time, it's quite rare that it happens, about 1 time out of 10.
  • With 5 requests, it crashes about half the time.
  • With 20 requests, it crashes every time.

However, I don't think it's specifically related to the number of requests made at the same time, as the crashes happened even when only a single request was made on my server. I haven't been able to identify the exact source of the problem, but reproducing it is probably a good start to understanding the issue!

@FlorianLeChat

This issue can be reproduced inconsistently in singleplayer or on P2P servers on Windows (main or x86-64 branches). I've also tested on a dedicated OVH server running Debian 12: the issue really only affects servers running under Pterodactyl, while on a traditional server everything works fine. Perhaps you could use CHTTP or Reqwest to work around the issue for now.

@timegivenzero

The issue can be reproduced consistently on every machine I have tried, as long as it uses Pterodactyl as its panel. Using CHTTP or Reqwest doesn't work around the issue at all. This issue has only appeared since the latest gmod update, where DNS resolving was changed. It appears that the resolving is being done on the main thread and causing CPU usage to spike.

@TIMONz1535

TIMONz1535 commented Jul 10, 2023

I can confirm that there is a freeze, possibly due to DNS resolution, on the main branch.
I make a request to a dead site and the game freezes for 11 seconds:

SysTime 652.5015046    http.Fetch before function call    https://c.radikal.ru/c11/2009/f1/18e46abe21cd.png
-- freeze here when function call
SysTime 663.6396301    http.Fetch failed callback exec    https://c.radikal.ru/c11/2009/f1/18e46abe21cd.png    failed to resolve domain
SysTime 663.6399388    http.Fetch after function call
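
A timing like the one in the log above can be collected with a small helper. This is only a minimal sketch: `measureFetchBlock` is a hypothetical name, and the fetch and clock functions are passed in as parameters (assumed to be `http.Fetch` and `SysTime` in game) so the helper can also be run outside Garry's Mod.

```lua
-- Measures how long a single fetch call blocks the caller.
-- fetch: a function with http.Fetch's (url, onSuccess, onFailure) shape.
-- clock: a function returning the current time in seconds (SysTime in game).
-- measureFetchBlock is a hypothetical helper name, not part of the game API.
local function measureFetchBlock(fetch, clock, url)
    local result = { url = url }
    local before = clock()

    fetch(url,
        function(body, size, headers, code)
            result.callbackAt = clock()
            result.code = code
        end,
        function(err)
            result.callbackAt = clock()
            result.err = err
        end
    )

    -- Time spent inside the fetch call itself. If the call does not block,
    -- this stays close to zero and the callbacks fill in the rest later.
    result.blockedFor = clock() - before
    return result
end

-- In game (assumed usage):
-- local r = measureFetchBlock(http.Fetch, SysTime, "https://google.com")
-- print("blocked for", r.blockedFor)
```

A `blockedFor` value of several seconds would match the freeze reported here, while a value near zero would suggest the call returned without blocking.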

@wasied
Author

wasied commented Jul 10, 2023

I haven't found a workaround yet. This makes my servers difficult to play. If anyone has found a solution, please share it!

@Kobralost

Same thing on my server. We tried CHTTP but it doesn't work properly.

@robotboy655
Contributor

I have updated the implementation on dev beta, please let me know if that improves it at all. I am assuming when you say "crash" you mean "freeze".

@wasied
Author

wasied commented Jul 10, 2023

Indeed yes, I mean a server freeze for about 5 seconds, after which it unfreezes. I'll test the modification on my side, thanks.

@wasied
Author

wasied commented Jul 10, 2023

@robotboy655 I was not able to reproduce the bug with the dev branch. It seems like it's fixed!

@Kobralost

Looks good for me too @robotboy655

@PossiblyCharles

PossiblyCharles commented Jul 11, 2023

As far as I can see, it's still reproducible on the dev branch. The problem is that something blocks if the URL is invalid. If the server has any addons that let users provide a URL that gets passed to http.Fetch, and they provide an invalid URL or IP... lag spikes for days.

For instance,

Three GETs that don't block:
lua_run for i=1, 3 do http.Fetch("https://google.com/", function() print(1) end, function() print(2) end) end

Three GETs that block fully (image URL from a post above):
lua_run for i=1, 3 do http.Fetch("https://c.radikal.ru/c11/2009/f1/18e46abe21cd.png", function() print(1) end, function() print(2) end) end

For now, I personally have a cheap VPS that verifies URLs by running a GET request from the VPS before any direct GET is made. Keep in mind that this adds an extra delay to every http.Fetch, but you'll never get the freezing that is happening at the moment.

This is using my cheapo VPS. I'm leaving the IP in so you can use it too if you don't want to set up your own Flask app or other API.

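local api_url = "http://72.14.182.169/check_url?url="

-- Only install the wrapper once the check API has answered a test request.
http.Fetch(api_url .. "https://google.com",
    function()
        -- Global so this can be undone live: http.Fetch = globalOldFetch
        -- The "or" keeps the original if this script is run more than once.
        globalOldFetch = globalOldFetch or http.Fetch

        http.Fetch = function(url, onSuccess, onFailure, headers)
            print("Fetch: Checking")
            -- Ask the API to check the URL first.
            globalOldFetch(api_url .. url,
                function(body, len, responseHeaders, code)
                    if code == 200 then
                        print("Fetch: Valid")
                        globalOldFetch(url, onSuccess, onFailure, headers)
                    else
                        print("Fetch: Invalid")
                        -- onFailure is optional, so guard the call.
                        if onFailure then onFailure("invalid url") end
                    end
                end,
                onFailure
            )
        end
    end,
    function()
        print("http.Fetch temp fix url check api is down")
    end
)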
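
For addons that accept URLs from players, as described above, another option that avoids the extra round trip is to accept only hosts you already trust before calling http.Fetch. The following is a minimal sketch under that assumption: `hostOf` and `isAllowedUrl` are hypothetical helper names, the allow-list contents are up to you, and IPv6 literals are not handled.

```lua
-- Extracts the lowercased host from an http(s) URL, or returns nil.
-- URLs with userinfo ("user@host") are rejected outright, since they are
-- an easy way to disguise the real host.
local function hostOf(url)
    if type(url) ~= "string" then return nil end
    local authority = url:match("^[Hh][Tt][Tt][Pp][Ss]?://([^/?#]*)")
    if not authority or authority:find("@", 1, true) then return nil end
    local host = authority:match("^([^:]+)") -- drop an optional :port
    return host and host:lower() or nil
end

-- allowedHosts is a set, e.g. { ["google.com"] = true }.
local function isAllowedUrl(url, allowedHosts)
    local host = hostOf(url)
    return host ~= nil and allowedHosts[host] == true
end

-- In game (assumed usage):
-- if isAllowedUrl(url, allowed) then
--     http.Fetch(url, onSuccess, onFailure)
-- end
```

This only narrows which domains can reach the resolver. A trusted host can still fail to resolve, for example during a network outage, so it is not a fix for the freeze itself.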

@robotboy655
Contributor

I have updated dev beta to run the DNS lookup in a separate thread, so it should no longer block the main thread. I myself cannot reproduce it being that slow even at 20 requests, so please do test this change and let me know if it solves it for you.

@hugoheml

hugoheml commented Jul 12, 2023

Hey,
I can't reproduce the bug with 20 requests on a Pterodactyl server (on Debian 11). Thanks for your update.

Edit: Just for information, when I run a loop of 500 requests with invalid domains, I have a CPU spike to 0% for 2 seconds.

@PossiblyCharles

To clarify for anyone who sees this at the moment: the issue I was finding, where any URL that wouldn't resolve to a valid IP caused a freeze, seems to require having a VPN on. Or at least it does for me... I have NordVPN on 24/7 and didn't realize it was related. So it might only affect VPN users.

@TIMONz1535

TIMONz1535 commented Jul 13, 2023

It seems to be fixed.

But at some point I had a weird bug. I was sending 243 http.Fetch requests, 38 of them to dead sites. At some point my internet died (my internet provider sneezed and changed my dynamic IP address), and all 243 requests failed with "failed to resolve domain". I got 4 freezes of 11 seconds each: the first three each printed 3 domains, and the fourth printed the remaining 234 domains. This is probably because there were too many threads, since in my test I create them all in a single tick (so don't do that).

upd: edited to add that my internet had died
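
One way to avoid starting every request in a single tick is to queue them and start only a few each time a timer fires. This is a minimal sketch, not something tested against the build discussed here: `newFetchThrottle` is a hypothetical helper, the per-tick limit of 5 and the 0.1 second interval are arbitrary assumptions, and the fetch function is passed in (assumed to be `http.Fetch` in game).

```lua
-- Queues fetch requests and starts at most perTick of them per tick() call.
-- fetch: a function with http.Fetch's (url, onSuccess, onFailure, headers) shape.
local function newFetchThrottle(fetch, perTick)
    local pending = {}
    local throttle = {}

    -- Same arguments as http.Fetch, but the request is only queued here.
    function throttle.add(url, onSuccess, onFailure, headers)
        pending[#pending + 1] = { url, onSuccess, onFailure, headers }
    end

    -- Starts up to perTick queued requests, oldest first.
    -- Returns how many were started.
    function throttle.tick()
        local started = 0
        while started < perTick and #pending > 0 do
            local job = table.remove(pending, 1)
            fetch(job[1], job[2], job[3], job[4])
            started = started + 1
        end
        return started
    end

    -- Number of requests still waiting to be started.
    function throttle.size()
        return #pending
    end

    return throttle
end

-- In game (assumed wiring):
-- local throttle = newFetchThrottle(http.Fetch, 5)
-- timer.Create("FetchThrottle", 0.1, 0, throttle.tick)
-- throttle.add("https://google.com", function() end, function() end)
```

Whether spreading requests out like this avoids the freezes described above is not confirmed in this thread. It only addresses the "too many in a single tick" case.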

@robotboy655
Contributor

I am rolling back the changes as it just refuses to work correctly even in a thread. Might try again later.

@robotboy655 closed this as not planned on Jul 13, 2023

8 participants