
Guerrilla SMTPd is minimalist, event-driven I/O, non-blocking SMTP server. The design goal and purpose of Guerrilla SMTPd is to receive large volumes of email and nothing much else. It happily gobbles down millions of emails every week on a single processor system. (Guerrilla SMTPd is running on GuerrillaMail.com - a temporary email address serv…


myselfhimself/Guerrilla-SMTPd

 
 


Guerrilla SMTPd

A minimalist, event-driven I/O, non-blocking SMTP server in PHP

Written for GuerrillaMail.com which processes tens of thousands of emails
every hour.

Copyright 2011
Author: Flashmob, GuerrillaMail.com
Contact: flashmob@gmail.com
License: MIT

Why do we need this?

Originally, Guerrilla Mail was running the Exim mail server, and the emails
were fetched using POP.

This proved to be inefficient and unreliable, so the system was replaced in
favour of piping the emails directly into a PHP script.

Soon, the piping solution became a problem too; it required a new process to
be started for each arriving email, and it also required a new database
connection every time.

So, how did we eliminate this bottleneck? Conveniently, PHP has a socket 
library which means we can use PHP to write a simple and efficient SMTP server.
If the server runs as a daemon, then the system doesn't need to launch a new 
process for each incoming email. It also doesn't need to run an insane amount
of checks for each connection (e.g. NS lookups, white-lists, black-lists, SPF,
DomainKeys, SpamAssassin, etc.).

We only need to open a single database connection and a single process can be 
re-used indefinitely. The PHP server is able to multiplex simultaneous 
connections, and it can pass the email directly to the MySQL database as soon 
as it is received. (Version 2.0 also uses Memcached if available.)

So now we can receive, process and store email all in one process.
The performance improvement has been dramatic. 

However, a few months later, the volume of email increased again and
this time our server's CPU was under pressure. You see, the problem was that
the SMTP server was checking the sockets in a loop all the time. This is fine
if you have a few sockets, but horrible if you have 1000! A common way to solve
this is to have a connection per thread - and a thread can sleep while it is
waiting for a socket, not polling all the time.

Threads are great if you have many CPU cores, but our server only had two.
Besides, our process was I/O bound - so the bottleneck was not the CPU.
So how to solve this one? 

Instead of polling in an infinite loop to see if there are any sockets to be
operated on, it is much more efficient to be notified when the sockets are ready.
This Wikipedia entry explains it best:
http://en.wikipedia.org/wiki/Asynchronous_I/O (see "Select(/poll) loops").
It would be nice to take advantage of epoll or kqueue, and as luck has it,
an extension is available for PHP! One night later, version 2 was made
to use libevent: http://pecl.php.net/package/libevent
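
The difference between polling and being notified can be sketched with PHP's
built-in stream_select(), which puts the process to sleep until the operating
system reports that a stream is ready. This is only a minimal sketch and not
the server's actual code: the server uses the libevent extension, whereas
stream_select() wraps the older select() call. wait_for_readable() is a
hypothetical helper, and a local socket pair stands in for a client connection.

```php
<?php
// Hypothetical helper: sleep until at least one stream is readable (or the
// timeout expires) instead of checking every socket in a busy loop.
function wait_for_readable(array $streams, int $timeoutSec = 1): array
{
    if ($streams === []) {
        return [];
    }
    $read = $streams;   // stream_select() modifies this array in place
    $write = null;
    $except = null;
    $count = stream_select($read, $write, $except, $timeoutSec);
    return ($count === false || $count === 0) ? [] : $read;
}

// Demo: a local socket pair stands in for a connected SMTP client.
[$client, $server] = stream_socket_pair(STREAM_PF_UNIX, STREAM_SOCK_STREAM, STREAM_IPPROTO_IP);
fwrite($client, "HELO example.test\r\n");

foreach (wait_for_readable([$server]) as $stream) {
    echo fgets($stream);   // only streams reported as ready are read
}
```

The process uses no CPU while it waits inside stream_select(); epoll and kqueue
(via libevent) follow the same idea but scale better to thousands of sockets.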

The improvement has been superb. The load average has decreased
substantially, freeing the CPU for other tasks. Both the disk and network
card have quite a bit of unused capacity, and we're ready for the next bump!

The purpose of this daemon is to grab the email, save it to the database
and disconnect as quickly as possible.

For version 2.0, MIME processing was removed and pushed further up the stack
since not all emails need to be processed.

This server does not attempt to filter HTML, check for spam or do any sender 
verification. These steps should be performed by other programs.
The server does NOT send any email including bounces. Again, this should
be performed by a separate program.
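
Because the daemon only receives mail, the SMTP dialogue it has to handle is
small. The following is a minimal sketch of that dialogue, assuming the
standard SMTP reply codes; smtp_reply() is a hypothetical helper for
illustration and is not taken from smtpd.php.

```php
<?php
// Hypothetical helper: map one SMTP command line to the reply a
// receive-only server could send back.
function smtp_reply(string $line): string
{
    // The verb is the leading run of letters, e.g. "MAIL" in "MAIL FROM:<...>"
    $verb = preg_match('/^\s*([A-Za-z]+)/', $line, $m) ? strtoupper($m[1]) : '';

    switch ($verb) {
        case 'HELO':
        case 'EHLO':
        case 'MAIL':
        case 'RCPT':
        case 'RSET':
        case 'NOOP':
            return "250 OK\r\n";
        case 'DATA':
            // The client now sends the message, terminated by a line holding a single dot
            return "354 End data with <CR><LF>.<CR><LF>\r\n";
        case 'QUIT':
            return "221 Bye\r\n";
        default:
            return "500 Unrecognized command\r\n";
    }
}

// Demo: replies for a typical session
foreach (["HELO example.test", "MAIL FROM:<a@example.test>", "RCPT TO:<b@example.test>", "DATA", "QUIT"] as $cmd) {
    echo $cmd, ' -> ', smtp_reply($cmd);
}
```

A real server also has to track state, for example reading the message body
after DATA until the terminating dot, before it stores the email and disconnects.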



HOW TO USE / Installation:

- Since the server needs to use port 25, make sure to start it as root/admin
- Ensure that your PHP has the mbstring and mailparse extensions enabled.
It also requires the PHP sockets, iconv and libevent extensions:
http://php.net/manual/en/book.libevent.php
http://php.net/manual/en/book.mailparse.php
http://www.php.net/manual/en/book.mbstring.php
http://www.php.net/manual/en/book.iconv.php
- Modify the 'Configuration' section in smtpd.php
- Ensure that no other server is listening on port 25
- Set up your MySQL database (schema below)

- Start on the command line like this:
root@server# php smtpd.php -l log.txt
Arguments:
-p Specify the port number
-v Verbose output to the console
-l log to this file. If no file specified, will log to ./log.txt
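
A minimal sketch of how these three arguments could be read with PHP's
getopt(). This is not necessarily how smtpd.php parses them:
resolve_options() is a hypothetical helper, the default port of 25 is an
assumption, and getopt() only accepts an optional value when it is attached
to the flag (-llog.txt), which may differ from the real script.

```php
<?php
// Hypothetical helper: turn getopt() output into settings with defaults.
function resolve_options(array $opts): array
{
    return [
        // -p: port number; 25 is assumed as the default here
        'port' => isset($opts['p']) ? (int) $opts['p'] : 25,
        // -v: getopt() reports a value-less flag as false, so test for the key
        'verbose' => array_key_exists('v', $opts),
        // -l: the given file name, ./log.txt if -l was passed bare, null if absent
        'log' => array_key_exists('l', $opts)
            ? (is_string($opts['l']) && $opts['l'] !== '' ? $opts['l'] : './log.txt')
            : null,
    ];
}

// "p:" requires a value, "v" takes none, "l::" takes an optional value
$settings = resolve_options(getopt('p:vl::') ?: []);
```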

Finally, the server may fail from time to time, so it should be
checked periodically and restarted if required.

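// Here is a simple script which can be placed on a cron job:
// (modify for your purposes)
/*
// List all processes and look for the daemon. The filtering happens in PHP,
// so no grep process can match by accident.
$running = false;
$output = array();
@exec ('ps aux', $output, $ret_var);
foreach ($output as $line) {
        if (strpos($line, 'smtpd.php')!==false) {
                $running = true;
                break;
        }
}
// Restart the daemon if it was not found
if (!$running) {
        @exec ('sh /home/user/startsmtpd');
}
die();
*/
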

Here is an example of /home/user/startsmtpd which will start the smtpd
in the background:

#!/bin/bash
/usr/bin/nohup php -d memory_limit=1024M /home/gmail/smtpd.php >> /home/gmail/smtpd_out 2>&1 &

Save the 2 lines above into a file named startsmtpd
and then run:
$ chmod 755 startsmtpd

You may also place /home/user/startsmtpd in /etc/rc.local so that the
daemon starts when the machine boots up.

(Note: /usr/bin/nohup ensures that smtpd.php stays running in the background)

Database Schema:

CREATE TABLE IF NOT EXISTS `new_mail` (
  `mail_id` int(11) NOT NULL auto_increment,
  `date` datetime NOT NULL,
  `from` varchar(128) character set utf8 NOT NULL,
  `to` varchar(128) character set utf8 NOT NULL,
  `subject` varchar(255) character set utf8 NOT NULL,
  `body` text NOT NULL,
  `charset` varchar(32) character set utf8 NOT NULL,
  `mail` longblob NOT NULL,
  `spam_score` float NOT NULL,
  `hash` char(32) character set latin1 NOT NULL,
  `content_type` varchar(64) character set latin1 NOT NULL,
  `recipient` varchar(128) character set latin1 NOT NULL,
  `has_attach` int(11) NOT NULL,
  PRIMARY KEY  (`mail_id`),
  KEY `to` (`to`),
  KEY `hash` (`hash`),
  KEY `date` (`date`)
) ENGINE=InnoDB  DEFAULT CHARSET=utf8;
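
Storing an email in this table over one long-lived connection can be sketched
with PDO and a statement that is prepared once and executed for every message.
This is a minimal sketch, not the code from smtpd.php: MailStore is a
hypothetical class, only a subset of the columns is filled in, and the demo
uses an in-memory SQLite database with a reduced table purely so that it runs
without a MySQL server. Against the schema above, every NOT NULL column would
need a value.

```php
<?php
// Hypothetical wrapper: one connection and one prepared statement,
// reused for every incoming email.
final class MailStore
{
    /** @var PDO */
    private $db;
    /** @var PDOStatement */
    private $insert;

    public function __construct(PDO $db)
    {
        $this->db = $db;
        $this->insert = $db->prepare(
            'INSERT INTO new_mail (`date`, `from`, `to`, `subject`, `mail`)
             VALUES (:date, :from, :to, :subject, :mail)'
        );
    }

    // Stores one message and returns its mail_id
    public function save(string $from, string $to, string $subject, string $raw): int
    {
        $this->insert->execute([
            ':date' => date('Y-m-d H:i:s'),
            ':from' => $from,
            ':to' => $to,
            ':subject' => $subject,
            ':mail' => $raw,
        ]);
        return (int) $this->db->lastInsertId();
    }
}

// Demo only: in-memory SQLite with a reduced table (an assumption, see above)
$db = new PDO('sqlite::memory:');
$db->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
$db->exec('CREATE TABLE new_mail (mail_id INTEGER PRIMARY KEY AUTOINCREMENT,
    `date` TEXT NOT NULL, `from` TEXT NOT NULL, `to` TEXT NOT NULL,
    `subject` TEXT NOT NULL, `mail` BLOB NOT NULL)');

$store = new MailStore($db);
$firstId = $store->save('alice@example.test', 'bob@example.test', 'Hello', "Subject: Hello\r\n\r\nHi");
$secondId = $store->save('carol@example.test', 'bob@example.test', 'Again', "Subject: Again\r\n\r\nHi");
```

Preparing the statement once avoids re-parsing the SQL for each email, which
fits the goal of saving the message and disconnecting as quickly as possible.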


Special Thanks to the authors of these pages:
http://devzone.zend.com/article/1086
http://www.greenend.org.uk/rjk/2000/05/21/smtp-replies.html
http://pleac.sourceforge.net/pleac_php/processmanagementetc.html
http://www.tuxradar.com/practicalphp/16/1/6
http://www.freesoft.org/CIE/RFC/1123/92.htm
