GitHub - chrishulbert/MegaComet: A comet server that will (hopefully) be able to handle 1 million (eg mega) connections.

chrishulbert / MegaComet Public

Notifications You must be signed in to change notification settings
Fork 1
Star 20

A comet server that will (hopefully) be able to handle 1 million (eg mega) connections.

splinter.com.au/tag/comet

Notifications

Name		Name	Last commit message	Last commit date
Latest commit History 49 Commits
testing		testing
.gitignore		.gitignore
config.h		config.h
khash.h		khash.h
klib docs.txt		klib docs.txt
klist.h		klist.h
makefile		makefile
megacomet.c		megacomet.c
megahash.html		megahash.html
megamanager.c		megamanager.c
megastart.c		megastart.c
readme		readme
start		start
status		status
stop		stop
testfauxapp.rb		testfauxapp.rb
testfauxworker.rb		testfauxworker.rb

Repository files navigation

MegaComet
---------

"Mega- (symbol M) is an prefix in the metric system denoting a factor of million"

This is my (ambitious!) attempt to create a comet server that can handle 1 million concurrent connections on an Amazon EC2 'large' server.

It requires libev to compile. So far, only being developed on my Mac. When it gets good enough, i'll make it work on an EC2 box.

https://github.com/chrishulbert/MegaComet
http://software.schmorp.de/pkg/libev.html

Architecture
------------

My ideas are:
* Simple = scalable ("no code is faster than no code")
* Scale through processes, not threads (like node.js - although i have to admit this is largely for simplicity's sake too)
* Bend the rules of HTTP a bit, for the sake of speed (eg no 'date' header returned)
* Best-effort delivery of messages. Eg this is to be appropriate for real time chat and channel changing, not appropriate for financial transactions. Engineering is all about trade-offs.
* etc... (you get the picture: hacky solutions over correctness)

In order to do this, i picture the following on an EC2 'large' server:

* 8 instances of MegaComet workers running, each listening on a different port, each of which has ~100k connections active.
* TO TEST: At this stage, 8 is a guesstimate for the 'best' number of workers. Maybe make it configurable? Would 128 be best, so we could keep each client below the C10K barrier?
* One instance of MegaComet manager running. This is responsible for remembering which client is connected to which worker.
* Clients will randomly long-poll (JSONP) directly to one of the 8 workers. Upon connecting, the worker tells the manager it has that client. Upon non-normal (only) port closing, the worker tells the manager that client is gone.
* Manager queues messages. Whenever a worker says a client has connected, it sends it any messages queued. Whenever a message arrives, it checks to see if that client is connected already, and if so sends it straight through. If the client isn't connected, the message is queued for a configurable time (default 1 minute).
* Manager > Client to communicate using simple fixed-width-fields binary protocol
* Your app > Manager to communicate using JSON
* Manager not necessarily written in C ? Something simple eg C# or Java or Python or Ruby ?
* TO TEST: Will the single server become a bottleneck? What if we allowed >1 ?

* TODO idea: rather than have the megamanager keep track of which client is connected to which worker, should we have a reproducible hash of the client id which always determines the worker? Is this necessary?

Manager protocol
----------------

Workers are expected to connect to the manager and specify their worker number:
1 n
Where N is the worker number (single byte)
When the app wants to send a message to the manager to send to the appropriate worker:
2 c m
Where c is the client identifier. This is a null terminated ascii string (or utf8).
Cannot contain a '.' for simplicity of parsing the HTTP messages.
TODO will null termination work with utf8?
m is the message, as a null terminated ascii/utf8 string.