Feedback to new filter API + questions regarding tx-data/tx-commit/tx-rollback/tx-reset #1022
I poked @poolpOrg about documenting the OpenSMTPD threading and scaling models a while back for that reason. It's hard to understand how many instances of your filter might receive parallel submissions of sessions. It appears that you have one instance of a proc-exec filter, which means it must handle parallel submissions across multiple sessions, while session ids will not overlap. At least that's how I understand it right now :)
This doesn't really sound useful to me; what problem would it solve?
I'm not sure that trying to compute such values is a good idea, but you can already do it. The handshake will send you the timeout delay for a session:
The timeout event will have the timestamp of when the event was generated, so:
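Putting those two pieces together, a filter could estimate how much of the session timeout remains. This is a minimal sketch; the function name and argument layout are my own, not part of the smtpd-filters protocol — it only illustrates the arithmetic described above.

```python
import time

def remaining_timeout(timeout_secs, event_timestamp, now=None):
    """Estimate seconds left in the session timeout, given the delay
    announced in the handshake and the timestamp carried by the event."""
    if now is None:
        now = time.time()
    elapsed = now - event_timestamp
    # Never report a negative remainder; the timeout has simply expired.
    return max(0.0, timeout_secs - elapsed)
```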
Yes,
Yes, you can have multiple sessions in parallel, but only one transaction (message) can happen for a session at a given time.
I don't understand what you want me to document, actually. OpenSMTPD forks your filter as a standalone process ... and that's all. There's no such thing as a threading/scaling model or "how many instances": if you have a proc-exec line in your config, then you have a standalone process and OpenSMTPD will communicate with it through stdio (stdin, stdout, stderr). If you worry that a single process can't handle the load, then your filter is free to use its own concurrency model, be it threading, multi-processing, coroutines, etc., but this is beyond the scope of OpenSMTPD and in most cases it will be overkill. A filter is a simple process running an infinite loop, reading commands on stdin and possibly replying on stdout; it doesn't get any simpler than that.
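The loop described above can be sketched in a few lines. The exact wire format below (handshake ending in `config|ready`, registrations like `register|report|smtp-in|link-connect`) follows my reading of smtpd-filters(7) and may differ between protocol versions; treat it as illustrative rather than authoritative.

```python
import sys

def handle_line(line):
    """Return the response line(s) to write for one protocol line,
    or an empty list if nothing needs to be written back."""
    if line == "config|ready":
        # Handshake done: register the events we care about.
        return ["register|report|smtp-in|link-connect", "register|ready"]
    if line.startswith("report|"):
        # Reports are informational; nothing is written back.
        return []
    return []

def main(stdin=sys.stdin, stdout=sys.stdout):
    # The whole filter: an infinite loop over stdin, replies on stdout.
    for raw in stdin:
        for out in handle_line(raw.rstrip("\n")):
            print(out, file=stdout, flush=True)
```

Run under a `filter` / proc-exec line in smtpd.conf, this is the entire process model: one process, one loop.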
As mentioned, it's completely useless to echo the whole mail back and waste CPU time parsing the whole thing twice. I think K.I.S.S. is a useful principle. In a real-world scenario, if one is telling a listener a story, you don't want the listener to repeat every single word back to the storyteller; it just doesn't make sense. IMHO it also doesn't make sense for the data-line event. If I never want to change a line in my filter (just listen to a report), why should I waste resources echoing everything back? You mentioned that behaviour as a tip for my filter-archive, where I first used the "data" filter event instead of the "tx-data" report event. In that case it's OK to have the tx-data report event and use it instead of the "data" filter event. Why then isn't it OK to also have a "tx-data-line" report event in addition to the "data-line" event which needs a reply? In general, for a better (logical) understanding of the filter/report API, it would feel more natural to have every event available in just two states:
Meaning: having all events under the same name, which can simply be registered either as "report" or as "filter". Otherwise one has to look up which events are available as filters or as reports every time something needs to be changed/added. I ran into such a "bug" with filter-archive. timeout:
Also, thank you for the clarification of the tx-* events. That makes the situation clearer for me; it seems that I understood it correctly.
Even though that question was from @jdelic, I also wasn't aware of that. It is not obviously clear that OpenSMTPD starts just one instance of a filter. A reader might assume that OpenSMTPD starts more than one instance of a filter for better performance and parallel processing (which wouldn't collide with a simple filter loop via stdin/stdout). The ticket can be kept closed.
I'm sorry but pretty much everything regarding your first point is wrong. I'll try explaining as simply as possible but it's hard to do so at the right level without assuming that you have at least some understanding of how the API works and why it works that way.
For a reason I'll explain shortly, lines have nothing to do in reports, and furthermore your assumption that this would reduce CPU usage is wrong. Generating lines for both reports and filters would not only waste far more CPU than you think but would also make the code CONSIDERABLY trickier.

First of all, reports and filters have specific use-cases and HAVE to be issued in a specific order, because all filters MUST get the same reporting events regardless of what filters want to do with their input. This means that to do what you want, OpenSMTPD would have to send each line to filters twice... Not only would this at least double the CPU cost per line, it would also be far from K.I.S.S.: writing the lines twice has a very big impact on the code. Today, OpenSMTPD streams raw client lines to a pipe and expects to read back a stream, possibly altered. The model looks very close to a simple unix pipe.

Regardless of what filters do, this would kill performance, as clients would have to be paused at each line while OpenSMTPD ensures all filters have acknowledged receiving the last line before generating a report for the next one. Today, client lines are just written as a stream to the pipe using stdio buffering (which is more CPU-friendly, by the way); we would no longer be allowed to do that, but would need to explicitly pause at each line for the double-write to happen safely.

There are other shortcomings due to the fact that DATA content is not a regular SMTP phase, that there's no command/response, that there's no guarantee that a line of input will generate a line of output, and that these make it impossible to guarantee that all filters receive relevant reporting lines. This is not a bug or a problem that's hard to overcome; it's a side effect of why lines have nothing to do in reporting events. I'll get to that.
The problem with your analogy is that it's NOT identical to what you're trying to abstract. A more valid analogy would be the following: if A tells B a story while C is not listening, B must repeat the story to C. In this case, A maps to a client, B to a filter and C to a message file. The filter's input is the client and the message's input is the filter: A -> B -> C. If B doesn't echo back, C will have no input, plain and simple.
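The A -> B -> C point is visible in what a pass-through data-line filter has to do: every line must be written back or the message file gets no input. This is a sketch only; the field positions in the incoming event and the `filter-dataline` reply are assumptions based on one version of the smtpd-filters protocol and may not match yours.

```python
def echo_dataline(event_line):
    """Given a raw 'filter|...|data-line|...' event, build the reply
    that hands the (unmodified) line on to the message file."""
    # Assumed layout: filter|proto|timestamp|subsystem|phase|session|token|line
    # maxsplit=7 keeps any '|' inside the mail line itself intact.
    fields = event_line.split("|", 7)
    session, token, payload = fields[5], fields[6], fields[7]
    return "filter-dataline|%s|%s|%s" % (session, token, payload)
```

A filter that "just listens" still has to emit this reply for every line, which is exactly why a data-line filter is not a free report.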
Because what you miss here is that the events that fall under the reporting API are those that relate to an SMTP session state change. In simpler words, a filter should be able to see a session the same way that OpenSMTPD sees it, and session events that cause a state change in the daemon also generate a report event so that they can cause the same state change in a filter. In your case, you had All transaction events, including DATA, cause a state change. The data lines don't; they serve no purpose in terms of session state: whether you send one line or one thousand lines, whether they contain a particular keyword or not, this doesn't alter the session. What alters the session though is the In light of this, the answer to your question is simple:
A man page, smtpd-filters(7), is in progress to document how the API works.
I added the word "unique" a few minutes ago to make it clearer that there's only one instance of each filter.
That sentence right there, in my opinion, very much defines a threading/scaling/execution model for filters, however you want to call it. It's a bit orthogonal to the other ongoing discussion, but here is what I believe the filter documentation should make clear about filters, i.e. what the ideal documentation would contain from my point of view. Some of this is already documented, of course:
And some context for my original question:
Regarding the filter API as you described it in the last comment - yes. But without documentation one has to make assumptions, which may lead to wrong beliefs about a topic.
There's no need to explain things as simply as possible; it's just easier to understand when there is something to understand that is explained somewhere. You did that perfectly with the last comment, and you can take most of that information and add it to smtpd-filters(7). Just to mention it as a suggestion: I also come from qmail, and it was always helpful for me to have some sort of "big picture" image/PDF/ASCII art, which exists for qmail in different architecture views and is also available for Postfix. It doesn't need to be perfect or fully complete, but it will answer most of the questions that come up. Having a big picture of OpenSMTPD and the message flow will maybe minimize upcoming questions in this issue tracker (but that's purely a guess).
Maybe the "most important" piece of information.
As mentioned above, important information: C (OpenSMTPD) NEEDS the information back because it doesn't buffer anything. Without echoing it back, the message/mail will be completely lost. Again, thank you for the explanation/clarification.
The last part (a filter error will take down the whole server) also makes me nervous, but in fact that's ridiculous:
I don't follow that argument at all. We're talking about mail servers, so there is ample prior art in this space. qmail and postfix both issue temporary failures to the upstream in this case, which is exactly what the SMTP spec provides for this situation.

An example off the top of my head is easy, because it happened to me with qmail and amavisd about a decade ago: my filter which inspects MIME attachments runs into trouble because of a zip bomb, and it dies. Say this happens on 1 out of 5000 emails currently in parallel delivery in the queue, because of a bug in the filter's zip file handling. What is the expected behavior here? You want to tell me that "not delivering 4999 emails" is the only safe choice? That means taking the overhead of restarting the whole server (gracefully shutting down in-flight deliveries, safely writing to disk, then reinitializing the whole queue envelope state from disk when systemd on-failure handling restarts it, at which point the email might take down the filter, and by extension the whole server, again) over restarting the filter process, not to mention 4999 emails not being delivered in the meantime. My point of view is that you should decline to deliver that one email and continue with the others. Incidentally, this is how qmail works and also how postfix handles the situation depending on

Finally, it should be mentioned that writing code that is resilient against abnormal termination in the face of unpredictable input is impossibly hard. Making that design choice here seems wholly unnecessary to me, as SMTP has transactional retries built in, including exponential back-off on the client side. And alerting the sysadmin by taking down the service... that's bad design. That's what logs are for. Sysadmins worth their salt have log watchers in place. I don't believe that you as a developer should make your program worse to address these things.
Instead, write documentation, point to best practices and support commonly used software that solves the problem (for example, ship a default rsyslog config that alerts the admin on filter errors).
@jdelic Personally, we seem to share nearly equal opinions when I think about where I would like to use OpenSMTPD. I don't want to be rude to anybody (because OpenSMTPD feels so natural and great as a qmail replacement to me - postfix and I never seem to have become friends), so nobody should misunderstand me, but: when designing software which calls external "unknown" programs, I would never trust the external software and would never let it crash "me" (meaning my software). OK - returning temporary errors to the SMTP session and restarting the filter may also be possible, but I think a complete OpenSMTPD service restart from the outside, instead of a plain filter restart, isn't that different or critical; it's a matter of taste. And also, handling a temporarily unavailable SMTP server (even for some seconds) is part of the SMTP RFC. Sort of... in summary, it's still just a design decision and a matter of taste (and some different steps for restarting). @poolpOrg made it clear to me that these things are expected behavior for him.
You can certainly write a filter that returns a temporary failure if amavis is swamped, if Rspamd is unreachable or if some random errors occur during runtime. Try installing What you're not allowed to do is have your filter itself crash, and there are multiple reasons for that. The main one is that a piece of code that communicates with a daemon as sensitive as an MTA simply isn't allowed to be so buggy that it can't keep up a simple contract of not exiting an infinite loop. OpenSMTPD requires a minimal level of quality from filters, and that level is "don't violate the protocol / don't exit". That doesn't mean that no crashes will happen, but it means that developers should do their share of testing upfront and be responsive in fixing issues reported by users. Filters that are unstable and don't get fixed will naturally disappear as people stop using them; filters that get proper testing and fixing will become increasingly reliable. If you absolutely HAVE to work with buggy software or a buggy library, you can still write a filter which forks a child process to hold the buggy code and restarts it when needed, but the filter process itself must be an infinite loop that never exits. That's our contract.
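The fork-a-child pattern described above could look roughly like this. It's a sketch under stated assumptions: the helper command, retry count and timeout are placeholders, and the surrounding filter loop (which never exits) is omitted.

```python
import subprocess

def query_child(argv, payload, retries=3):
    """Send one request to a helper process holding the risky code and
    return its answer; restart the helper if it dies or hangs.
    Returns None when the helper keeps failing, so the caller can
    answer with a temporary failure instead of crashing itself."""
    for _ in range(retries):
        try:
            done = subprocess.run(argv, input=payload, capture_output=True,
                                  text=True, timeout=30)
            if done.returncode == 0:
                return done.stdout
            # Helper crashed or errored out; loop spawns a fresh one.
        except subprocess.TimeoutExpired:
            pass  # Helper hung (e.g. on a zip bomb); retry with a new one.
    return None
```

The filter process itself stays a trivial, crash-free loop; only the child carries the code that might blow up on hostile input.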
Sorry, but I don't buy that. First of all, OpenSMTPD itself has the exact same constraint: processes are not allowed to exit. It is considerably more complex than most filters, and yet you won't see it aborting abnormally every other morning. When someone reports an abnormal abort, we track it, fix it, and the issue is gone forever. This is what we want for filters. Then, filters get far more predictable input than OpenSMTPD, since it sanitizes input before passing it to them. Filters are in a much friendlier environment; there's not much excuse to crash when the input is sanitized upfront. Finally, I wrote four or five filters in a row that handle various use-cases at different phases, most in a language I was unfamiliar with, and I haven't experienced or had anyone report a single crash since I released them. I'm not the only one: people have been playing with filters from day one in various languages such as C, Go, Rust, awk, shell or Python, and they are rock solid. If we allow broken filters, then when one starts misbehaving we won't know whether it's because of a memory corruption that could cause mail loss, because of an attacker trying to exploit a vulnerability, or because of low code quality. Requiring a minimal level of quality benefits us all.
That is your opinion, not mine. Myself, I consider strictness to be good design, and this level of strictness is the one we impose on ourselves in OpenSMTPD. It is far more dangerous to let a daemon run with misbehaving code: if it is misbehaving, either the code was unpredictable in the first place or it's being tricked into doing something fishy, and in both situations we shouldn't let it run. I don't want to make it easy for people to write bad filters; I want to make it easy for developers to spot that a filter is misbehaving during development, and I want the environment to be harsh when filters misbehave, so that releasing a half-baked filter is not acceptable. I want to make buggy filters unusable, so that if a bug is found it HAS to be fixed. I don't want to see an advisory pop up later because someone exploited a filter that was buggy and that we let run out of convenience. Your point is that because filters can be buggy, the daemon should be friendly and let them fail gracefully; my point is that _because we provide a strict environment_, we will make the buggy ones unusable and raise the overall quality. This is not just a wild guess; this is an approach that has proved to work multiple times in the past. Time will tell if I'm right, but from experience I know that this is the right approach.
You know I already write documentation, code and articles, right? :-)
Some new questions and feedback: