forked from nsqio/nsq

realtime distributed message processing at scale

lifengtian/nsq

 
 

Source: https://github.com/bitly/nsq

NSQ is a realtime message processing system designed to operate at bitly's scale, handling billions of messages per day.

It promotes distributed and decentralized topologies without single points of failure, enabling fault tolerance and high availability coupled with a reliable message delivery guarantee.

Operationally, NSQ is easy to configure and deploy (all parameters are specified on the command line and compiled binaries have no runtime dependencies). For maximum flexibility, it is agnostic to data format (messages can be JSON, MsgPack, Protocol Buffers, or anything else). Official Go and Python libraries are available out of the box and, if you're interested in building your own client, there's a protocol spec (see client libraries).

The latest stable release is 0.2.18. We publish binary releases for linux and darwin.

NOTE: master is our development branch and may not be stable at all times.

Why?

NSQ was built as a successor to simplequeue (part of simplehttp) and as such was designed to (in no particular order):

  • provide easy topology solutions that enable high-availability and eliminate SPOFs
  • address the need for stronger message delivery guarantees
  • bound the memory footprint of a single process (by persisting some messages to disk)
  • greatly simplify configuration requirements for producers and consumers
  • provide a straightforward upgrade path
  • improve efficiency
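To make the "bound the memory footprint" goal concrete, here is a minimal sketch of a queue that holds a fixed number of messages in memory and spills the rest to a file. It is a toy model, not nsqd's implementation: the class name `BoundedQueue`, the `mem_limit` parameter, and the length-prefixed file layout are all assumptions made for illustration.

```python
import os
import tempfile
from collections import deque


class BoundedQueue:
    """Toy queue: at most `mem_limit` messages in memory, the rest in a file."""

    def __init__(self, mem_limit, spill_path):
        self.mem_limit = mem_limit
        self.memory = deque()
        self.spill_path = spill_path
        self.read_offset = 0  # byte offset of the next unread spilled record
        self.spilled = 0      # spilled messages not yet read back

    def put(self, body):
        # Once anything is on disk, keep appending there so FIFO order holds.
        if self.spilled == 0 and len(self.memory) < self.mem_limit:
            self.memory.append(body)
            return
        with open(self.spill_path, "ab") as f:
            f.write(len(body).to_bytes(4, "big") + body)  # length-prefixed record
        self.spilled += 1

    def get(self):
        if self.memory:
            return self.memory.popleft()
        if self.spilled == 0:
            return None
        with open(self.spill_path, "rb") as f:
            f.seek(self.read_offset)
            size = int.from_bytes(f.read(4), "big")
            body = f.read(size)
            self.read_offset = f.tell()
        self.spilled -= 1
        return body


path = os.path.join(tempfile.mkdtemp(), "overflow.dat")
q = BoundedQueue(mem_limit=2, spill_path=path)
for i in range(5):
    q.put(f"msg {i}".encode())
in_memory, on_disk = len(q.memory), q.spilled
drained = [q.get() for _ in range(5)]
```

With a limit of two, the first two messages stay in memory and the remaining three go to the file, so memory use stays bounded however far the consumer falls behind.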

If you're interested in more of the design, history, and evolution, please read our design doc or blog post.

Features

  • no SPOF, designed for distributed environments
  • messages are guaranteed to be delivered at least once
  • low-latency push-based message delivery (performance)
  • combination of load-balanced and multicast-style message routing
  • configurable high-water mark after which messages are transparently kept on disk
  • few dependencies, easy to deploy, and sane, bounded, default configuration
  • runtime discovery service for consumers to find producers (nsqlookupd)
  • HTTP interface for stats, administrative actions, and producers (no client libraries needed!)
  • memcached-like TCP protocol for producers/consumers
  • integrates with statsd for realtime metrics instrumentation
  • robust cluster administration interface with graphite charts (nsqadmin)
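The "load-balanced and multicast" routing bullet can be sketched with a toy in-memory model. NSQ calls the two levels topics and channels: every channel on a topic receives a copy of each message (multicast), and each copy goes to one of the consumers subscribed to that channel (load balancing). The class names, the consumer names, and the round-robin choice below are assumptions for illustration only; this is not how nsqd picks a consumer.

```python
class Channel:
    """One logical subscriber group; each message goes to one consumer."""

    def __init__(self, name):
        self.name = name
        self.consumers = []
        self.delivered = {}   # consumer name -> messages it received
        self._next = 0

    def subscribe(self, consumer):
        self.consumers.append(consumer)
        self.delivered.setdefault(consumer, [])

    def deliver(self, msg):
        # Round-robin is a stand-in for "whichever consumer is ready".
        consumer = self.consumers[self._next % len(self.consumers)]
        self._next += 1
        self.delivered[consumer].append(msg)


class Topic:
    """A stream of messages; every channel gets its own copy of each one."""

    def __init__(self, name):
        self.name = name
        self.channels = {}

    def channel(self, name):
        if name not in self.channels:
            self.channels[name] = Channel(name)
        return self.channels[name]

    def publish(self, msg):
        for ch in self.channels.values():
            ch.deliver(msg)


topic = Topic("clicks")                 # hypothetical topic name
metrics = topic.channel("metrics")      # two consumers share this channel
metrics.subscribe("m1")
metrics.subscribe("m2")
archive = topic.channel("archive")      # a single consumer sees everything
archive.subscribe("a1")

for i in range(4):
    topic.publish(i)
```

Here `metrics` splits the four messages between its two consumers, while `archive` independently receives all four.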

Client Libraries

Additional Documentation

NSQ is composed of the following individual components, each with its own README:

  • nsqd is the daemon that receives, buffers, and delivers messages to clients
  • nsqlookupd is the daemon that manages topology information
  • nsqadmin is the web UI for viewing message statistics and performing administrative tasks
  • nsq is a Go package for writing nsqd clients

For more information see the docs directory.

Performance

DISCLAIMER: Please keep in mind that NSQ is designed to be used in a distributed fashion. Single node performance is important, but not the end-all-be-all of what we're looking to achieve. Also, benchmarks are stupid, but here are a few anyway to ignite the flame:

On a 2012 MacBook Air i7 2 GHz (GOMAXPROCS=1, Go tip 8bbc0bdf832e) with a single publisher and a single consumer:

$ ./nsqd --mem-queue-size=1000000

$ ./bench_writer
2013/01/29 10:24:24 duration: 2.60766631s - 73.144mb/s - 383484.649ops/s - 2.608us/op

$ ./bench_reader
2013/01/29 10:25:43 duration: 6.665561082s - 28.615mb/s - 150024.880ops/s - 6.666us/op
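The figures in those two log lines can be cross-checked with a little arithmetic. The sketch below derives the message count, the message size, and the per-operation latency from the reported duration and rates. It assumes that "mb" in the output means 2**20 bytes, which the README does not state; the helper name `derive` is also made up for this example.

```python
MIB = 1024 * 1024  # assumption: "mb" in the log lines means 2**20 bytes


def derive(duration_s, mb_per_s, ops_per_s):
    """Recover implied quantities from one benchmark log line."""
    return {
        "messages": ops_per_s * duration_s,
        "bytes_per_message": mb_per_s * MIB / ops_per_s,
        "us_per_op": 1e6 / ops_per_s,
    }


# Numbers copied from the bench_writer and bench_reader output above.
writer = derive(2.60766631, 73.144, 383484.649)
reader = derive(6.665561082, 28.615, 150024.880)

for name, d in (("writer", writer), ("reader", reader)):
    print(f"{name}: {d['messages']:.0f} msgs, "
          f"{d['bytes_per_message']:.1f} B/msg, {d['us_per_op']:.3f} us/op")
```

Under that assumption, both runs work out to roughly one million messages of about 200 bytes each, and the derived us/op values agree with the ones printed by the tools.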

Getting Started

The following steps will run NSQ on your local machine and walk through publishing, consuming, and archiving messages to disk.

  1. follow the instructions in the INSTALLING doc (or download a binary release).

  2. in one shell, start nsqlookupd:

    $ nsqlookupd
    
  3. in another shell, start nsqd:

    $ nsqd --lookupd-tcp-address=127.0.0.1:4160
    
  4. in another shell, start nsqadmin:

    $ nsqadmin --lookupd-http-address=127.0.0.1:4161
    
  5. publish an initial message (creates the topic in the cluster, too):

    $ curl -d 'hello world 1' 'http://127.0.0.1:4151/put?topic=test'
    
  6. finally, in another shell, start nsq_to_file:

    $ nsq_to_file --topic=test --output-dir=/tmp --lookupd-http-address=127.0.0.1:4161
    
  7. publish more messages to nsqd:

    $ curl -d 'hello world 2' 'http://127.0.0.1:4151/put?topic=test'
    $ curl -d 'hello world 3' 'http://127.0.0.1:4151/put?topic=test'
    
  8. to verify things worked as expected, in a web browser open http://127.0.0.1:4171/ to view the nsqadmin UI and see statistics. Also, check the contents of the log files (test.*.log) written to /tmp.
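The curl commands in steps 5 and 7 are plain HTTP POSTs, so any HTTP client can publish. As a rough Python equivalent using only the standard library, the sketch below builds the same request against the `/put` endpoint and address shown above. The function names `build_put_request` and `publish` are made up for this example, and `publish` is defined but not called because it needs a running nsqd.

```python
from urllib.parse import urlencode
from urllib.request import Request, urlopen


def build_put_request(topic, body, nsqd_http="http://127.0.0.1:4151"):
    """Build a POST equivalent to: curl -d BODY '<nsqd_http>/put?topic=TOPIC'."""
    url = f"{nsqd_http}/put?{urlencode({'topic': topic})}"
    # Supplying `data` makes urllib issue a POST with the body as-is.
    return Request(url, data=body.encode("utf-8"))


def publish(topic, body):
    """Send the request; requires nsqd to be listening (not called here)."""
    with urlopen(build_put_request(topic, body), timeout=5) as resp:
        return resp.status, resp.read()


req = build_put_request("test", "hello world 2")
print(req.get_method(), req.full_url)
```

Separating request construction from sending keeps the example inspectable without a daemon; with the walkthrough running, `publish("test", "hello world 2")` would stand in for the first curl call in step 7.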

The important lesson here is that nsq_to_file (the client) is not explicitly told where the test topic is produced. It retrieves this information from nsqlookupd and, despite the timing of the connection, no messages are lost.
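That lesson can be illustrated with a toy in-memory model of the walkthrough. It is not the nsqlookupd or nsqd API: the classes `Lookupd` and `Nsqd`, their methods, and the node name `nsqd-1` are hypothetical. The model only captures two ideas: the reader asks a directory where a topic lives, and messages published before the reader connects wait for it.

```python
class Lookupd:
    """Toy directory: topic name -> addresses of daemons producing it."""

    def __init__(self):
        self.producers = {}

    def register(self, topic, address):
        self.producers.setdefault(topic, set()).add(address)

    def lookup(self, topic):
        return sorted(self.producers.get(topic, ()))


class Nsqd:
    """Toy daemon: holds messages per topic until a reader drains them."""

    def __init__(self, address, lookupd):
        self.address = address
        self.lookupd = lookupd
        self.backlog = {}

    def put(self, topic, body):
        if topic not in self.backlog:
            self.backlog[topic] = []
            self.lookupd.register(topic, self.address)  # first publish announces the topic
        self.backlog[topic].append(body)

    def drain(self, topic):
        msgs, self.backlog[topic] = self.backlog.get(topic, []), []
        return msgs


lookupd = Lookupd()
nsqd = Nsqd("nsqd-1", lookupd)          # hypothetical node name
nodes = {nsqd.address: nsqd}

nsqd.put("test", "hello world 1")       # published before any reader exists

# The reader knows only the directory, not the daemon.
addresses = lookupd.lookup("test")
received = []
for addr in addresses:
    received += nodes[addr].drain("test")

nsqd.put("test", "hello world 2")
nsqd.put("test", "hello world 3")
for addr in addresses:
    received += nodes[addr].drain("test")
```

The reader ends up with all three messages, including the one published before it started, without ever being configured with the daemon's address.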

Authors

NSQ was designed and developed by Matt Reiferson (@imsnakes) and Jehiah Czebotar (@jehiah) but wouldn't have been possible without the support of bitly.

Contributors
