Recurrent "503 Sender Already Specified" Issue #1192

Closed
amargovin opened this issue Feb 18, 2023 · 8 comments
Labels
needs-investigation Potential bug. Needs investigation

Comments

@amargovin

Hi,

First, thanks for this great product :)

And now, the issue. We've been facing a slew of error messages - details below

Error:
503 Sender Already Specified

Versions:
2.3.0, 2.2.0 and also tried the commit to fix a smtppool-lib issue.

Provider:
AWS SES. (Our limits are 1m emails/day, send rate is 500 emails/second)

We've been facing this issue for the last few days (perhaps longer, but we've only now realised it was happening).

  1. When a campaign is sent, the first 60-70% of the list gets sent without an issue. However, towards the last quarter we see the logs filled up with "503 Sender already specified" (thousands of these)

  2. The indicator on the campaign page shows the campaign was fully sent, but when we compared it with the AWS SES stats we found that about 10-20% of the emails were not sent.

  3. We've tried various combinations of Connections - Concurrency - Message Rates - Batch Sizes but we've not been able to get a handle on this. We've tried high throughput, slow throughput, even tried pausing campaigns midway and restarting them. The issue continues.

  4. We don't have a problem with smaller lists (say, 15-20k subscribers) but noticed this happening with a list of about 100k subscribers. We then tried splitting that big list into two 50k lists, which did not help. The issue crops up after 30k sends or so, sometimes earlier.

  5. Are we inadvertently triggering some sort of race condition? We need to send out about 100-150k emails in 60 minutes. Our SES send rate ceiling is 500 emails/second, far higher than we need.

  6. Range of settings we've tried below

SMTP Connections 10-100
Concurrency 10-100
Message rate 1-20

SMTP idle timeout - 30s (We tried 3-5s, didn't help with the issue)
wait timeout - 20s (We tried 3-5s, didn't help with the issue)

  7. For the last two days we've also been seeing a lot of 'error sending message in campaign <campaign_name> - <campaign_subject>: subscriber : EOF'

  8. We're running the listmonk app on the Digital Ocean Apps platform with 512 MB RAM and 1 vCPU. Could it be a memory issue? We process up to 50,000 emails, each about 30 KB in size. Just queuing these up might need more than 512 MB? Not sure.

  9. What would you recommend our throttling be? (SMTP-Concurrency-Send Rate)

  10. Appreciate any help. Thanks again for the great product!
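
The throughput figures in this report can be sanity-checked with a little arithmetic. The sketch below computes the send rate needed for 100-150k emails in 60 minutes and compares it with a few settings combinations. It assumes the effective ceiling is roughly concurrency × message rate (each worker sending at most "message rate" messages per second); that is an assumption about how the settings combine, not something confirmed in this thread, and the function names are made up for illustration.

```python
def required_rate(total_messages: int, window_minutes: float) -> float:
    """Messages per second needed to finish within the window."""
    return total_messages / (window_minutes * 60)


def max_rate(concurrency: int, message_rate: int) -> int:
    """Assumed upper bound: each of `concurrency` workers sends at most
    `message_rate` messages per second."""
    return concurrency * message_rate


SES_CEILING = 500  # emails/second, as stated in the report

low = required_rate(100_000, 60)
high = required_rate(150_000, 60)
print(f"required: {low:.1f} to {high:.1f} msgs/sec")
print(f"share of SES ceiling at the high end: {high / SES_CEILING:.1%}")

# A few combinations from the ranges that were tried (10-100 workers, rate 1-20).
for concurrency, message_rate in [(10, 1), (10, 5), (20, 5)]:
    ceiling = max_rate(concurrency, message_rate)
    verdict = "enough" if ceiling >= high else "too slow"
    print(f"concurrency={concurrency} rate={message_rate}: {ceiling}/sec ({verdict})")
```

Under that assumption, the target needs roughly 28 to 42 messages per second, so modest settings would already cover it and the SES ceiling is unlikely to be the limiting factor.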

@amargovin
Author

amargovin commented Feb 18, 2023

Hi,

Ver:
2.3.0

What we've noticed since this morning:

The campaign stops sending at an arbitrary point and is marked as completed. In one instance it ran for 31k subscribers out of 45k, and in another it ran for approximately 51k out of 75k.

Is the campaign terminating after enqueuing all messages, rather than after actually sending them? And why would we suddenly see both "503 Sender already specified" and this?

@smadbe

smadbe commented Feb 20, 2023

I have exactly the same issue with a much smaller campaign (2600 subscribers), I use SES as well.

I suspect a single email is the source of the problem and causes all the subsequent ones to fail.

@knadh knadh added the needs-investigation Potential bug. Needs investigation label Feb 20, 2023
@knadh
Owner

knadh commented Feb 20, 2023

The smtp-pool fix should've addressed the issue of incorrect connection re-use, if that's the root cause here @amargovin. The last time this was reported was a couple of years ago and it had to do with the "from" address (#300). @smadbe are you in a position to compile and use the master branch that has this fix to see if that addresses the issue in your setup? The fix is going to be available in the upcoming release next week.

@amargovin CPU/RAM here shouldn't be a constraint. EOF is thrown when the upstream abruptly/uncleanly terminates the connection (the SMTP server itself, or a firewall in front that is rate limiting or something).

@smadbe

smadbe commented Feb 20, 2023

Hmm, no. I am running it on Pikapods, so I have no control over the app code :-/
I need this mailing to be sent very soon. Is there any way to get around it, even if it is very slow? ("Batch size" set to 1? Would the connection not be reused after a batch?)
I could still export the DB and run it in Docker myself, but that's an option I would like to avoid.

@knadh
Owner

knadh commented Feb 21, 2023

You can lower the concurrency to say 2 and "Max connections" in SMTP settings to also 2. Worst case, 1 should help. This issue is fixed and will be available in the upcoming release shortly, so the need to hack around this will go away soon.

@knadh knadh closed this as completed Feb 27, 2023
@amargovin
Author

Hi @knadh - just checking on the next release. Some 2-3 weeks ago you mentioned that it would be out the following week. Just wondering if it's in the pipeline. We continue to have the 503 issue.

Thank you for the help!

@zwolf

zwolf commented Dec 14, 2023

I am also seeing this on my recently deployed instance of Listmonk. Using Listmonk 2.5.1, I am attempting to send to several campaigns with 75,000 subscribers each through Amazon SES (1M/day cap, ~5k/sec max).

A campaign will begin and send several tens of thousands of emails successfully. Occasionally, errors related to malformed email addresses or similar are logged; this is expected. However, at some point in the sending process, a "501 Invalid RCPT TO address provided" error is logged. Immediately following this error is a large group of "503 Sender already specified" errors, all with different subscriber IDs. My LM DB is populated via an overnight sync, so I'll have to wait until I need to send to another list to investigate the addresses that cause these, but they're not uncommon in our list of user-provided addresses.

When the campaign hit a total of 1000 errors, it paused sending. Unpausing the campaign continued the send for another few tens of thousands before the same process repeated. I reduced the concurrency and max connections settings and retried the campaign, but this slower sending rate eventually also hit the 501, then a bunch of 503s, and paused the campaign.

Eventually, I was able to run my last 75k subscriber campaign in one go by turning off the option to pause the campaign after 1000 errors. The 50x errors still filled the logs, but the send numbers eventually got to 75,000/75,000 and deliverability was as expected (incl errors for the malformed/incorrect addresses).

I’d be happy to open a new issue if you prefer, this seems like the same problem on the newest version. Thanks very much!
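
The 501-then-503 pattern described above is consistent with the connection re-use explanation given earlier in the thread. In SMTP, a MAIL command is only valid when no mail transaction is open, and a rejected RCPT leaves the transaction open until the client sends RSET (or completes DATA). The toy model below illustrates that mechanism. It is a simplified sketch, not SES's actual behaviour or listmonk's pool code; `ToySMTPSession`, `send`, `run`, and the address validity rule are all hypothetical.

```python
class ToySMTPSession:
    """Simplified server-side state for one SMTP connection."""

    def __init__(self):
        self.sender = None  # non-None while a mail transaction is open

    def mail_from(self, addr):
        if self.sender is not None:
            return "503 Sender already specified"
        self.sender = addr
        return "250 OK"

    def rcpt_to(self, addr):
        if self.sender is None:
            return "503 Need MAIL command"
        # Hypothetical validity rule, for illustration only.
        if not addr.isascii() or " " in addr or addr.count("@") != 1:
            return "501 Invalid RCPT TO address provided"
        return "250 OK"

    def data(self, body):
        self.sender = None  # a completed DATA closes the transaction
        return "250 OK"

    def rset(self):
        self.sender = None  # RSET aborts any open transaction
        return "250 OK"


def send(session, sender, recipient, reset_on_error):
    """Send one message; return the first non-250 reply, or '250 OK'."""
    reply = session.mail_from(sender)
    if not reply.startswith("250"):
        return reply
    reply = session.rcpt_to(recipient)
    if not reply.startswith("250"):
        if reset_on_error:
            session.rset()  # clean up before the connection is re-used
        return reply
    return session.data("message body")


def run(reset_on_error):
    """Push four messages through one re-used connection."""
    session = ToySMTPSession()
    recipients = ["a@example.com", "bad addr@example.com",
                  "b@example.com", "c@example.com"]
    return [send(session, "news@example.com", r, reset_on_error)
            for r in recipients]


print(run(reset_on_error=False))
print(run(reset_on_error=True))
```

Without the reset, every message after the bad address fails with 503 on that connection; with it, only the bad address fails. In this model, a setup with several pooled connections would see the 503s continue on whichever connection was left in that state.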

@MaximilianKohler
Contributor

MaximilianKohler commented Jan 8, 2024

Check #1629 (comment), it gets triggered by an invalid character in an email.
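
If an invalid character in an address is the trigger, one possible stopgap is to scan the subscriber list for suspicious addresses before sending. The sketch below is a rough heuristic, not a full address validator, and it does not reflect listmonk's or SES's own validation rules; `suspect_reason` and its checks are assumptions chosen for illustration.

```python
from typing import Optional


def suspect_reason(address: str) -> Optional[str]:
    """Return a short reason if the address looks suspicious, else None."""
    if not address.isascii():
        return "non-ASCII character"
    if any(ch.isspace() or ord(ch) < 32 or ord(ch) == 127 for ch in address):
        return "whitespace or control character"
    local, sep, domain = address.rpartition("@")
    if not sep or not local or not domain or "@" in local:
        return "malformed local@domain structure"
    return None


# Example addresses, made up for illustration.
addresses = [
    "user@example.com",
    "us\u00e9r@example.com",     # accented character
    "user @example.com",         # embedded space
    "user\u200b@example.com",    # zero-width space
    "userexample.com",           # missing @
]

for addr in addresses:
    reason = suspect_reason(addr)
    if reason is not None:
        print(f"{addr!r}: {reason}")
```

Flagged addresses could then be reviewed or excluded before the campaign starts. Legitimate internationalized addresses would also be flagged by the non-ASCII check, so the output is a list to inspect rather than a list to delete.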
