pymta Documentation
*******************
pymta is a library to build a custom SMTP server in Python. This is useful if
you want to...
* test mail-sending code against a real SMTP server even in your unit tests.
* build a custom SMTP server with non-standard behavior without reimplementing
the whole SMTP protocol.
* have a low-volume SMTP server which can be easily extended using Python.
.. toctree::
:maxdepth: 2
Goals of pymta
==============
The main goal of pymta is to provide a basic SMTP server for unit tests. It must
be easy to inject custom behavior (policy checks) for every SMTP command.
Furthermore the library should come with an extensive set of tests to ensure that
it does the right thing(tm).
Eventually I plan to build a highly customizable SMTP server which can be easily
hacked (just for the fun of it).
Development Status
==================
Currently (06/2009, version 0.4) the library only implements basic SMTP with
very few extensions (e.g. PLAIN authentication). However, as far as I know, it
is the only MTA written in Python that implements a process-based strategy for
connection handling. This is an advantage for two reasons: many libraries
(including most Python DB API implementations) cannot be used in an asynchronous
environment, and separate processes let you use your CPUs to their fullest
extent. Last but not least, pymta comes with many unit tests and good,
comprehensive documentation.
'Advanced' features which are necessary for any decent MTA like TLS and
pipelining are not yet implemented. Currently pymta is used only in the unit
tests for `TurboMail <http://www.python-turbomail.org>`_. Therefore it should
be considered as beta software.
Related Projects
================
There are some other SMTP server implementations in Python available which you
may want to use if you need a proven implementation right now:
* `Python's smtpd <http://docs.python.org/library/smtpd.html>`_
* `Twisted Mail <http://twistedmatrix.com/trac/wiki/TwistedMail>`_
* `tmda-ofmipd <http://tmda.svn.sourceforge.net/viewvc/tmda/trunk/tmda/bin/tmda-ofmipd?revision=2194&view=markup>`_
* `Son Of Sam Email Server <http://www.zedshaw.com/projects/sos/>`_
* `smtps.py <http://www.hare.demon.co.uk/pysmtp.html>`_
Python's **smtpd** is a module which has been included in the standard
distribution of Python for a long time. Though it implements only a *very* basic
feature set, this module is used as a basis for many smaller SMTP server
implementations.
In the beginning I used this module for my unit tests too but quite soon I had
to realize that the code is old and messy and it is nearly impossible to
implement a custom behavior (e.g. reject certain recipients). pymta evolved out
of smtpd after multiple refactorings based on the idea to use a simple finite
state machine (initially repoze.workflow, now including a custom one).
**Twisted Mail** is probably the most featureful SMTP server implementation
in Python available right now. It uses the twisted framework which can be
either a huge advantage or disadvantage, depending on your point of view. It can
use TLS via OpenSSL (using the Twisted infrastructure). When I started out with
the naïve idea of just extending Python's smtpd a bit, I dismissed Twisted Mail
because it seemed to be quite hard to implement some custom behavior without
writing too much code.
**tmda-ofmipd** is another implementation which is based on Python's smtpd. It is
distributed only as part of a larger Python application which makes it harder
to use if you just need a plain Python SMTP server. Furthermore the code was not
cleaned up so it may be a bit hard to understand but it supports TLS (using
`tlslite <http://sourceforge.net/projects/tlslite/>`_).
**Son Of Sam Email Server** implements an SMTP server (based on Python's smtpd too)
but focuses on delivery and user info lookup. There are no changes to Python's
smtpd so the server does not support any kind of recipient verification.
**smtps.py** is a really simple, single-threaded SMTP server rewritten from
scratch with a quite clean design (compared to Python's smtpd) although it only
implements the absolute minimum of SMTP and many things like the command parsing
are just hard-coded. On the other hand, the server's behavior can be changed by
implementing a custom strategy class. `Trac included an extended version <http://trac.edgewall.org/browser/trunk/trac/tests/notification.py>`_
of smtps in its test suite.
Installation and Setup
======================
pymta is just a Python library which uses setuptools so it does not require
a special setup. To serve multiple connections in parallel, pymta uses the
`multiprocessing <http://docs.python.org/library/multiprocessing.html>`_ module
which was added to the standard library in Python 2.6 (there are backports for
Python 2.4 and 2.5). Furthermore you need to install
`pycerberus <http://www.schwarz.eu/opensource/projects/pycerberus>`_.
pymta supports Python 2.7 and Python 3.4+.
multiprocessing
---------------
The `multiprocessing <http://docs.python.org/library/multiprocessing.html>`_
module hides most of the operating system differences when it comes to multiple
processes. The module is included in Python 2.6 and later, and it is also
available standalone via PyPI for older versions::

    easy_install multiprocessing

If multiprocessing is not installed, pymta will fall back to single-threaded
execution automatically (therefore multiprocessing is no hard requirement in the
egg file).
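The fallback described above follows the usual optional-import pattern. The following is only a sketch of that pattern; ``connection_strategy`` is a hypothetical helper for illustration and not part of pymta, and pymta's actual fallback code may be organised differently.

```python
# Sketch only: pymta's real fallback logic may be organised differently.
try:
    import multiprocessing
except ImportError:
    # e.g. an old Python 2.4/2.5 installation without the backport
    multiprocessing = None


def connection_strategy():
    """Return a label describing how connections would be handled.

    Hypothetical helper, used here only to make the fallback visible.
    """
    if multiprocessing is None:
        return 'single-threaded'
    return 'one process per connection'
```

Code that needs a worker process can check ``multiprocessing is None`` once at import time and pick the serial code path if the module is missing.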
Architectural Overview
**********************
pymta uses multiple processes to handle more than one connection at the same
time. In order to do this in a platform-independent manner, it utilizes the
multiprocessing module.
The basic SMTP program flow is determined by two state machines: One for
the SMTP command parsing mode (single-line commands or data) in the
SMTPCommandParser and another much bigger state machine in the SMTPSession to
control the correct order of commands sent by the SMTP client.
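To illustrate how a state machine can enforce the order of SMTP commands, here is a minimal, self-contained sketch. The class name, the state names and the reduced command set are assumptions made for this example; the real SMTPSession covers far more commands and is implemented differently.

```python
class CommandOrderError(Exception):
    """Raised when a command is not allowed in the current state."""


class TinySMTPStateMachine(object):
    # state -> {command: next state}; a heavily simplified subset of SMTP
    TRANSITIONS = {
        'new': {'HELO': 'greeted'},
        'greeted': {'MAIL FROM': 'sender_known', 'QUIT': 'finished'},
        'sender_known': {'RCPT TO': 'recipient_known'},
        'recipient_known': {'RCPT TO': 'recipient_known',
                            'DATA': 'receiving_message'},
        # hypothetical pseudo-command for "message body received"
        'receiving_message': {'MSGDATA': 'greeted'},
    }

    def __init__(self):
        self.state = 'new'

    def execute(self, command):
        """Switch to the next state or raise if the command is out of order."""
        allowed = self.TRANSITIONS.get(self.state, {})
        if command not in allowed:
            raise CommandOrderError(
                '%s is not allowed in state %r' % (command, self.state))
        self.state = allowed[command]
        return self.state
```

A client that sends ``DATA`` right after connecting triggers a ``CommandOrderError``, while the sequence HELO, MAIL FROM, RCPT TO, DATA walks through the states one by one.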
The main idea of pymta was to make it easy to add custom behavior which is
considered configuration for 'real' SMTP servers like `Exim <http://www.exim.org>`_.
The 'pymta.api' module contains classes which define interfaces for
customizations. These interfaces are part of the public API so I try to keep
them stable in future releases. Use IMTAPolicy to add restrictions on certain
SMTP commands (check recipient addresses, scan the message's content for spam before
accepting it) and IAuthenticator to authenticate SMTP clients (check username
and password). With an IMessageDeliverer you can specify what to do with
received messages.
Problems with asynchronous architectures
========================================
The two most important SMTP implementations in Python (smtpd and Twisted Mail)
both use an asynchronous architecture so they can serve multiple connections at
the same time without the need to start multiple processes or threads. Because
of this they can avoid the increased overall complexity due to locking issues
and can save some resources (creating a process may be costly).
However there are some drawbacks with the asynchronous approach:
* SMTP servers are not necessarily I/O bound. Some operations like spam scanning
or other message checks may eat quite a lot of CPU. With Python you need to
use multiple processes if you really want to utilize multiple CPUs due to the
`Global Interpreter Lock <http://en.wikipedia.org/wiki/Global_Interpreter_Lock>`_.
* All libraries must be able to deal with the asynchronous pattern, otherwise you
  risk blocking all connections at the same time. Many programmers are not
  familiar with this pattern so most libraries do not support this. This is
  especially true for most of Python's DB API implementations which is why
`Twisted implemented its own asynchronous DB layer <http://twistedmatrix.com/projects/core/documentation/howto/rdbms.html>`_.
Unfortunately by using this layer you have to use plain SQL, because the most
popular ORMs like `SQLAlchemy <http://www.sqlalchemy.org/>`_ do not support
their layer.
Given these conditions IMHO it looks like a bad design choice to use an
asynchronous architecture for an SMTP server library which should be easily
hackable to handle even uncommon cases.
Components
***********
pymta consists of several main components (classes) which may be important to
know.
PythonMTA
=========
The PythonMTA is the main server component which listens on a certain port for
new connections. There should be only one instance of this object. When a new
connection is received, the PythonMTA spawns a WorkerProcess (if you have the
multiprocessing module installed) which triggers an SMTPCommandParser that
handles all the SMTP communication. When a message was submitted successfully,
the new_message_accepted() method of your IMessageDeliverer will be called so it
is in charge of actually doing something with the message.
You can instantiate a new server like this::

    from pymta import PythonMTA, BlackholeDeliverer

    if __name__ == '__main__':
        # SMTP server will listen on localhost/port 8025
        server = PythonMTA('localhost', 8025, BlackholeDeliverer())
        server.serve_forever()

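If you want to do something with accepted messages, the only method named in this section is ``new_message_accepted()``. The sketch below shows a deliverer that just collects messages. ``CollectingDeliverer`` is a hypothetical name; it is written as a plain class so that it runs without pymta installed, whereas a real deliverer would presumably derive from pymta.api.IMessageDeliverer.

```python
class CollectingDeliverer(object):
    """Hypothetical deliverer that keeps accepted messages in memory."""

    def __init__(self):
        self.received_messages = []

    def new_message_accepted(self, msg):
        # Called once for every message that was submitted successfully.
        self.received_messages.append(msg)
```

Keep in mind that a plain list only works if the deliverer runs in the same process as the code that inspects it. When messages are handled in a separate worker process, you need some inter-process channel (for example a ``multiprocessing.Queue``) instead.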
**Interface**
.. autoclass:: pymta.PythonMTA
:members:
Policies
========
.. autoclass:: pymta.api.IMTAPolicy
:members:
Here is a short example of how you can implement custom behavior that checks the
HELO command given by the client. The three return statements show alternative
ways to reject the command; a real policy would use only one of them::

    def accept_helo(self, helo_string, message):
        # pymta will return the default error message for the given command if
        # you just return False
        return False

        # This will send out a '553 Bad helo string' and the command is
        # rejected. pymta won't send any additional reply because you did that
        # already.
        return (False, (553, 'Bad helo string'))

        # This is basically the same as above but now it will trigger a
        # multi-line SMTP response:
        # 553-Bad helo string
        # 553 Evil IP
        return (False, (553, ('Bad helo string', 'Evil IP')))

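The difference between the single-line and the multi-line reply is only the separator after the status code: SMTP uses a hyphen on every line except the last one. The helper below is a small sketch of that rule. ``format_reply`` is a hypothetical function written for this example and is not pymta's own reply code.

```python
def format_reply(code, text):
    """Render an SMTP reply; text is a string or a tuple/list of strings."""
    lines = [text] if isinstance(text, str) else list(text)
    rendered = []
    for index, line in enumerate(lines):
        is_last = (index == len(lines) - 1)
        # '-' marks a continuation line, ' ' marks the final line
        separator = ' ' if is_last else '-'
        rendered.append('%d%s%s' % (code, separator, line))
    return '\r\n'.join(rendered)
```

With this helper ``format_reply(553, 'Bad helo string')`` gives a single line, and passing the tuple ``('Bad helo string', 'Evil IP')`` gives the two lines shown in the comment above, joined by CRLF.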
Authenticators
==============
.. autoclass:: pymta.api.IAuthenticator
:members:
Deliverers
==========
.. autoclass:: pymta.api.IMessageDeliverer
:members:
Message
=======
The Message is a data object that contains all information about a message sent
by a client. This includes not only the actual RFC822 message contents but also
information about the SMTP envelope, the peer and the helo string used. The
information is filled in as the client sends commands so not all information
may be available at any time (e.g. msg_data is not available before the client
has actually sent the RFC822 message).
Peer
====
The Peer is another data object which contains the remote host ip address and
the remote port.
SMTPSession
===========
This class actually implements the most complicated part of the SMTP state
machine and is responsible for calling the policy. If you want to extend the
functionality or need to implement some custom behavior which is beyond what you
can do using Policies, check this class.
The SMTP state machine is quite strict currently but I consider this a feature
and not something I'll try to improve in the near future.
Unit Test Utility Classes
=========================
pymta was created to ease testing SMTP communication without the need to set up
an external SMTP server. While writing tests for other applications I created
some utility classes which are probably helpful in your tests as well...
.. autoclass:: pymta.test_util.BlackholeDeliverer
:members:
.. autoclass:: pymta.test_util.DebuggingMTA
:members:
.. autoclass:: pymta.test_util.MTAThread
:members:
.. autoclass:: pymta.test_util.SMTPTestCase
:members:
Example SMTP server application
===============================
In the examples directory you find a pymta-based implementation of a debugging
server that behaves like `Python's DebuggingServer <http://docs.python.org/library/smtpd.html#debuggingserver-objects>`_:
All received messages will be printed to STDOUT. Hopefully it can serve as a
short reference on how to write very simple pymta-based servers too.
Speed
=====
If you want to use pymta for a real SMTP server, you should not be concerned too
much about speed. If you go really for a high-volume setup with several million
messages per day and hundreds of simultaneous connections, you should tune one
of the well-known SMTP servers like Exim, Postfix or sendmail to get the maximum
performance. However, I measured theoretical peak performance using
`Postal 0.70 <http://doc.coker.com.au/projects/postal/>`_ to give you some
theoretical figures.
Environment and benchmark settings:
* System: Fedora 10 with an AMD x2 4200 (2.2 GHz), Python 2.5
* pymta: version 0.3, DebuggingServer with NullDeliverer and no policy.
* postal: 4 threads, no SSL connections, one message per connection (defaults)
With that configuration I got something between 1540-2270 messages per minute
(median 1879 messages) which is actually quite low. Many real SMTP servers would
deliver something between 5,000-10,000 messages per minute in a comparable
setting [#]_. During my measurements the system load was barely noticeable (below
5%) so I guess most of the time is lost waiting for locks. Using a really fast
IPC mechanism or a custom PythonMTA implementation that uses os.fork would
probably increase the throughput quite easily.
.. [#] However, as soon as you add some more complicated database queries or spam
   and virus checks to that, the real throughput will decrease dramatically
   (even if the scanning takes only 0.1 seconds per message you won't
   exceed 600 messages per minute). In real setups the bare SMTP speed does
   not matter that much.
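The 600 messages per minute in the footnote is simple arithmetic: one minute divided by the time spent per message. The function below is a sketch of that calculation. The ``parallel_workers`` parameter is an assumption added for illustration; the footnote's figure corresponds to handling one message at a time.

```python
def max_messages_per_minute(scan_time_ms, parallel_workers=1):
    """Upper bound on throughput if every message needs scan_time_ms of work.

    Uses integer milliseconds to avoid floating point rounding.
    """
    return (60 * 1000 * parallel_workers) // scan_time_ms
```

For a scan time of 100 ms (0.1 seconds) this returns 600, which matches the limit mentioned in the footnote.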
License Overview
================
pymta itself is licensed under the very liberal `MIT license <http://creativecommons.org/licenses/MIT/>`_
(see COPYING.txt in the source archive) so there are virtually no restrictions
where you can integrate the code.
However, pymta depends on some (few) other packages which come with different
licenses. In order to ease license auditing, I'll list the other licenses here
(no guarantees though, check yourself before you trust):
* `Python <http://www.python.org>`_ uses the
`Python Software Foundation License 2 <http://www.python.org/download/releases/2.4.2/license/>`_
which is a BSD-style license.
* The `multiprocessing <http://docs.python.org/library/multiprocessing.html>`_
uses a `3-clause BSD license <http://creativecommons.org/licenses/BSD/>`_.
* `pycerberus <http://www.schwarz.eu/opensource/projects/pycerberus>`_ uses
the MIT license, just like pymta.
I believe that all licenses are GPL compatible and do not require you to publish
your code if you don't like to.