A distributed cuckoo service for CRITs
Go Python HTML
Switch branches/tags
Nothing to show
Clone or download
Fetching latest commit…
Cannot retrieve the latest commit at this time.
Permalink
Failed to load latest commit information.
check_results
etc/cuckoo_distributed_service
feed_cuckoo
lib
overseer
parse_and_submit
.gitignore
LICENSE
README.md

README.md

A distributed cuckoo service for CRITs

A common problem with CRITs is handling multiple long running services since CRITs is designed to keep connections to the service open until a response is sent. This can lead to a quick saturation of available resources which slows down the whole ecosystem significantly.

Since we hit this problem especially with cuckoo we decided to move the whole service to a distributed and better suited model.

Concept

The whole service consists of four microservices which communicate using AMQP. The services are connected to cuckoo and CRITs using their API over HTTP.

cuckoo_distributed

The cycle of a new sample looks as follows:

  1. CRITs runs the service
  2. The service sends an AMQP message
  3. The message is received by feed_cuckoo
  4. feed_cukoo schedules and performs the upload to cuckoo
  5. On success feed_cuckoo sends another AMQP message to check_results
  6. check_results periodically checks the status of the analysis
  7. When it's done check_results sends another AMQP message to parse_and_submit
  8. parse_and_submit evaluates the results and sends the results back to CRITs

As you can see there are three microservices mentioned here. The last one is not mentioned because it's the only optional microservice. It's called overseer and does exactly that. Whenever a microservice fails it relays the failed message to the overseer, he'll then resubmit the failed message three times, if it fails more than that it will dump the failed message into a text file for further analysis what went wrong. This makes the whole system very failsafe since none of your messages will actually get lost at any point in time.

Setup

Setting up the services is pretty easy. The only thing you need besides a working Cuckoo and CRITs instance is a AMQP server, we recommend RabbitMQ which is really easy to install and setup.

1 Install dependencies

RabbitMQ

Go

That's all. If you don't plan to use RabbitMQ for anything else you can safely use the default guest user. If this is not the case it is advised to create a new user.

2 Build the microservices

You'll need a working Go compiler and environment (v1.5) to compile the code.

go get github.com/cynexit/cuckoo_distributed
go get github.com/streadway/amqp

After this you can simply get into the directory of each service and build it using

go build

Since this results in a statically linked binary you can distribute these to other servers if you so desire.

Configurations

Each service has it's own config file. You may pass the path to the config file to each service via the config flag. If the flag is not set the service will look in it's current directory for a file called SERVICENAME.conf.json.

Common options

Lines found in most config files are

Amqp
AMQP connection string in the form of `amqp://USER:PASSWORD@HOST:PORT`
ConsumerQueue
The name of the queue to consume from
ProducerQueue
The name of the queue to produce to
FailedQueue
The name of the queue to send failed messages to
VerifySSL
Check HTTPS certificates
LogFile
Full path to the log file OR empty to use only stdout
LogLevel
Choose between `debug`, `info`, and `warning`

feed_cuckoo.conf

CheckFreeSpace
Only send new samples to Cuckoo if there is enough free space
CuckooURL
The url of your Cuckoo instance in the form of "https://cuckoo.your.network:PORT",
PrefetchCount
How many files should be handled simultaneously (Recommended: 1)
MaxPending
Only send new samples to Cuckoo if there are less "pending samples" than this value

Caution: To use CheckFreeSpace Cuckoo needs to be of version 1.3 or higher! If you still use 1.2 (like most people do) please patch utils/api.py (configure reports and samples):

def cuckoo_status():
    paths = dict(
        reports="/data/cuckoo/storage/analyses",
        samples="/data/cuckoo/storage/binaries",
    )

    diskspace = {}
    for key, path in paths.items():
        if hasattr(os, "statvfs"):
            stats = os.statvfs(path)
            diskspace[key] = dict(
                free=stats.f_bavail * stats.f_frsize,
                total=stats.f_blocks * stats.f_frsize,
                used=(stats.f_blocks - stats.f_bavail) * stats.f_frsize,
            )

    response = dict(
        version=CUCKOO_VERSION,
        hostname=socket.gethostname(),
        machines=dict(
            total=len(db.list_machines()),
            available=db.count_machines_available()
        ),
        diskspace=diskspace,
        tasks=dict(
            total=db.count_tasks(),
            pending=db.count_tasks("pending"),
            running=db.count_tasks("running"),
            completed=db.count_tasks("completed"),
            reported=db.count_tasks("reported")
        ),
    )

    return jsonize(response)

check_results.conf

PrefetchCount
How many samples should be checked for completion in the main loop? (Recommended: 100)
WaitBetweenRequests
Seconds to wait between each request to a Cuckoo instance? (Recommended: 5)

parse_and_submit.conf

PrefetchCount
How many samples should be handled at the same time? (Recommended: 10)
PushApiCallsMax
How many of the found API calls should be send to CRITs?
CuckooCleanup
Delete the sample and results from Cuckoo on finish (frees space on disks)
EnabledParsers
Which information should be parsed? `info`, `signatures`, `behavior`, `dropped`

ConsumerQueue and ProducerQueue are different when it comes to this service since you can actually "chain" multiple instances of this service. This is useful if you don't want one service to parse all the information at once but just a small and fast subset.

Example: Parsing info, signatures, and behavior is rather fast, parsing dropped files is rather slow (since each dropped file needs to be uploaded to CRITs). So you probably want to have two instances of parse_and_submit running: One which will receive the initial "sample is analyzed" message from check_results and then starts to parse everything except dropped files and then another instance which receives another message from the first instance and then parses only the dropped files and deletes the analysis results from Cuckoo on success.

The two config files for this example would look as follows:

For the first instance:

{
	"Amqp": "amqp://guest:guest@localhost:5672",
	"ConsumerQueue": "worker/parse_and_submit",
	"ProducerQueue": "worker/parse_and_submit_dropped",
	"FailedQueue": "worker/failed",
	"VerifySSL": true,
	"PrefetchCount": 10,
	"PushApiCallsMax": 5000,
	"CuckooCleanup": false,
	"EnabledParsers": ["info", "signatures", "behavior"],
	"LogFile": "/var/log/parse_and_submit.log",
	"LogLevel": "info"
}

For the second instance:

{
	"Amqp": "amqp://guest:guest@localhost:5672",
	"ConsumerQueue": "worker/parse_and_submit_dropped",
	"ProducerQueue": "worker/parse_and_submit",
	"FailedQueue": "worker/failed",
	"VerifySSL": true,
	"PrefetchCount": 10,
	"PushApiCallsMax": 5000,
	"CuckooCleanup": true,
	"EnabledParsers": ["dropped"],
	"LogFile": "/var/log/parse_and_submit_dropped.log",
	"LogLevel": "info"
}

As you can see the second instance consumes from the producer queue of the second instance. If ProducerQueue is empty the message will not be relayed further.

overseer.conf

ConsumerQueue
The queue you used as `FailedQueue` everywhere else
PrefetchCount
How many messages should be parsed simultaneously? (Recommended: 100)
DumpDir
The folder to dump messages into that failed three times.

Multiple instances

Running multiple instances of any microservice is very easy: just lunch them! If you see that one service is taking much more time just spawn another instance on another service and the AMQP broker will handle dividing the load between both. This also means that if you have more than one Cuckoo instance you can simply start another instance of feed_cuckoo and pass it a new config with a different Cuckoo URL. This way it is really easy to scale your analysis if needed.

So a more complex scenario would look like this:

cuckoo_distributed_complex

Known bottlenecks

There are two known bottlenecks, Cuckoo and CRITs.

Cuckoo is easy: Just create a new master and set up more workers and a new feed_cuckoo instance, done.

Now with CRITs it's another story: The API is kind of slow and you need to make sure that your mongo cluster is rock solid and your webserver is configured adequate to your expected workload. Setting up CRITs in daemon mode can help here but just be aware that if parse_and_submit is going slowly it's most likely a problem with CRITs. Set the log level to debug to get timing information.

Known issues

  • The samples are send over via AMQP not via CRITS API