HTTPS requests to invaluement.com are performed every 5-6 seconds on idle server #3929
Comments
This issue was mentioned in the now-locked bug #3877. |
ivm-sg.lua is sourced from the https://github.com/fatalbanana/ivm-rspamd repository. That repository has been archived, and it recommends using rspamd selectors instead: https://rspamd.com/doc/configuration/selectors.html |
Same with selectors. I switched to fatalbanana's implementation because it has some slight advantages. |
If you don't like it, remove it. ;) |
I don't want to remove it. The issue is that the lists are updated much more frequently than they should be, which creates unnecessary load on both the Mailcow server and the invaluement.com service. The interval should be increased to several minutes, not seconds. Please reopen the issue. |
It checks for modifications via header and quits if nothing changes.
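The header-based check mentioned above is the standard HTTP conditional-request pattern (If-Modified-Since / If-None-Match). A minimal, network-free sketch of the decision logic - illustrative only, not mailcow's or rspamd's actual code:

```python
# Sketch of HTTP conditional polling: remember validators from the last
# response and send them back, so an unchanged file costs only a 304
# reply with no body. Illustration of the pattern, not mailcow/rspamd code.

def conditional_headers(last_modified=None, etag=None):
    """Build request headers that let the server answer 304 Not Modified."""
    headers = {}
    if last_modified:
        headers["If-Modified-Since"] = last_modified
    if etag:
        headers["If-None-Match"] = etag
    return headers

def should_redownload(status_code):
    """304 means the cached copy is still current; only 200 carries new data."""
    return status_code == 200
```

A client that keeps the `Last-Modified` value from its previous download and passes it through `conditional_headers` avoids re-transferring the list body when nothing changed.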
|
Even so, this is still a bug. It shouldn't check for updates more frequently than once a minute. This is undesired behavior for both the server and the remote side. Links to other lists are checked much less frequently. |
It's not a bug; it's the default map refresh interval. We don't pull the data again, we only check for changes.
That's simply not a bug.
|
As far as I can see, map_watch_interval is set to 30 seconds, not 5. If you don't want to fix this bug, don't know how to fix it, or it's much more complex to fix than it seems, please say so. Don't pretend it's totally normal for everybody to refresh almost-static files at such an interval, making a full HTTPS connection every time without keep-alive. |
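For a sense of scale, rough arithmetic (assuming one request per list URL per check, and the two list URLs named in this issue) shows what the interval choice costs per mailcow host:

```python
# Rough estimate of daily request volume generated by one mailcow host
# polling the two invaluement list URLs at a given interval.
# Assumption: one request per URL per check, no failures or retries.
SECONDS_PER_DAY = 24 * 60 * 60
URLS = 2  # sendgrid-id-dnsbl.txt and sendgrid-envelopefromdomain-dnsbl.txt

def requests_per_day(interval_seconds, urls=URLS):
    return SECONDS_PER_DAY // interval_seconds * urls

# 5 s polling: 34560 requests/day; 60 s polling: 2880 requests/day
```

Multiplied across every mailcow installation, moving from 5-second to 60-second checks cuts the traffic hitting the list server by a factor of twelve.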
André,

This is Rob McEwen, CEO of invaluement.com. On our website, where we provide this free service, we recommend:

(1) once-per-minute updates

AND

(2) caching the last downloaded data, then checking whether the server version is newer than the stored version, and only downloading the data if it is. (The data updates are VERY unpredictable - the list can often go several hours without an update, then have many updates minutes apart. This has more to do with if/when new spammers start using Sendgrid than with anything I'm doing.)

It looks like you're doing (2) - and if so, thanks for that. This is critical. But I need your help with (1): please change your update-checking interval to 60 seconds. I've actually had about a $1K/year increase in my hosting costs recently just due to access to these files alone, and that is only going to increase as more servers use this and as more types of ESP data files are added in the near future. By the way, I'm using Cloudflare for best performance, and I've greatly optimized things by configuring Cloudflare NOT to check for updates for these files and to keep DDOS protection to a minimum on files in this folder. (Having these at Cloudflare as 100% cached static files eliminates the need for DDOS protection anyway, for these particular files.) Then, whenever a file updates, I alert their API of the change, so that they only fetch the new copy THEN. This is amazingly efficient! Without this, they would check my server for updates OFTEN, in the middle of client updates, slowing down many client updates. Even so, I'm using their "Argo" feature, which improves network efficiency, and that is incurring extra charges from all this extra traffic. (I guess I could turn Argo off? Maybe it wouldn't make much of a difference either way?)

When I saw that increase, I went to Cloudflare support and got lists of the IPs that were causing the most traffic/connections - and MANY of them had PTR records OR were mail servers with SMTP banners that had the word "mailcow" in them. So mailcow is a large cause of this extra traffic. So - again - please do me a favor and change your update interval to 60 seconds.

I recognize that slightly less frequent checks will cause some amount of "false negatives", but I think that amount will be extremely tiny compared to the amount of spam this data blocks. Unlike spammers who burn through IPs and domains when they self-host, most of the Sendgrid spammers don't do "hit and run" burst sends and then disappear. Most of the ESPs rate-limit their sending, especially for their less trusted customers. So getting the data up to 60 seconds later (probably averaging closer to 30 seconds later) shouldn't make a big difference, but it will go a long way towards not overusing our free data.

Thanks again for your help with this!

Rob McEwen, CEO of invaluement.com
rob AT invaluement DOT com
+1 478-475-9032 |
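Taken together, recommendations (1) and (2) amount to a poller with a 60-second floor plus conditional re-download. A hypothetical sketch (the `CachedList` class and the injected `fetch` callback are invented for illustration; this is not the ivm-sg.lua implementation):

```python
import time

# Sketch of a polite list poller: never check more often than min_interval,
# and only replace the cached body when the server actually reports a change.
# fetch is injected so the sketch stays network-free; a real client would
# send If-Modified-Since and return (status, body, last_modified).

class CachedList:
    def __init__(self, min_interval=60.0, clock=time.monotonic):
        self.min_interval = min_interval
        self.clock = clock
        self.body = None
        self.last_modified = None
        self.last_check = None

    def poll(self, fetch):
        now = self.clock()
        if self.last_check is not None and now - self.last_check < self.min_interval:
            return self.body  # too soon: serve the cached copy, no request made
        self.last_check = now
        status, body, last_modified = fetch(self.last_modified)
        if status == 200:  # changed on the server: replace the cache
            self.body, self.last_modified = body, last_modified
        return self.body  # on 304, keep what we have
```

With this shape, a burst of lookups on a busy server still produces at most one upstream request per minute, and an unchanged list costs only a 304.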
Oh, Rob, thanks for your response. I reply by mail and didn't read your text until now.
I tried to get in contact with you about that a while ago and also tried to offer a regular payment. :)
Thanks for that service, it really helps a lot.
I will check the map interval for your service and reduce it if it's too much, no problem. And yes, we already check for modifications via header. :)
|
Excellent. Thanks! And sorry I had overlooked your previous email. In the next couple of days, I'll look for it and respond. Thanks again! |
No worries at all. |
@andryyy has it been fixed? |
Prior to placing the issue, please check the following (fill out each checkbox with an X once done).

Summary

Mailcow performs HTTPS queries to the www.invaluement.com domain (to the https://www.invaluement.com/spdata/sendgrid-id-dnsbl.txt and https://www.invaluement.com/spdata/sendgrid-envelopefromdomain-dnsbl.txt URLs) every 5-6 seconds. A full TCP connection is established and closed for each query (from SYN to FIN); this is not a keep-alive ping. This creates unnecessary load on the invaluement.com server. The queries are performed by the ivm-sg.lua script.
Logs

I found nothing in the logs regarding these requests. A tcpdump log with timestamps is attached: tcpdump-log-invaluement-com.txt
Reproduction

Run tcpdump for the www.invaluement.com IPv4 and IPv6 addresses. Command for the current address set:

tcpdump host 104.22.15.144 or host 172.67.14.207 or host 104.22.14.144 or host 2606:4700:10::6816:f90 or host 2606:4700:10::6816:e90 or host 2606:4700:10::ac43:ecf

I've tried to intercept the data by replacing the URL with that of my HTTP mocking server, which just returned HTTP 200 OK, but in that case no repetitive requests are performed.
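The mock-server approach described above can be reproduced with the Python standard library alone. This sketch (the endpoint path, list contents, and timestamp are invented) also honors conditional requests with a 304, mimicking the behaviour of an unchanged list:

```python
import threading
import urllib.error
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer

# Minimal local mock of a list endpoint for observing polling behaviour:
# a plain GET returns 200 with a Last-Modified header, and any request
# that echoes it back via If-Modified-Since gets 304 Not Modified.
# (Stand-in for the real invaluement.com endpoint; values are arbitrary.)

LAST_MODIFIED = "Sun, 10 Jan 2021 21:57:00 GMT"

class MockListHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        if self.headers.get("If-Modified-Since") == LAST_MODIFIED:
            self.send_response(304)
            self.end_headers()
        else:
            body = b"spammer-id-1\nspammer-id-2\n"
            self.send_response(200)
            self.send_header("Last-Modified", LAST_MODIFIED)
            self.send_header("Content-Length", str(len(body)))
            self.end_headers()
            self.wfile.write(body)

    def log_message(self, *args):  # keep test output quiet
        pass

def probe(port, since=None):
    """Return the HTTP status code for one (optionally conditional) request."""
    req = urllib.request.Request(f"http://127.0.0.1:{port}/sendgrid-id-dnsbl.txt")
    if since:
        req.add_header("If-Modified-Since", since)
    try:
        with urllib.request.urlopen(req) as resp:
            return resp.status
    except urllib.error.HTTPError as err:
        return err.code  # urllib raises HTTPError for 304

def run_mock():
    server = HTTPServer(("127.0.0.1", 0), MockListHandler)  # port 0 = any free port
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server
```

Pointing a client at this mock and watching its request pattern makes it easy to see whether it sends conditional requests and whether it respects its configured interval.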
System information

- Output of docker version
- Output of docker-compose version
- git diff origin/master, any other changes to the code? No.
- Output of iptables -L -vn, ip6tables -L -vn, iptables -L -vn -t nat and ip6tables -L -vn -t nat
- Output of docker exec -it $(docker ps -qf name=acme-mailcow) dig +short stackoverflow.com @172.22.1.254 (set the IP accordingly, if you changed the internal mailcow network)