Operational Functions

The following six functions make up the common, shareable portion of Meniscus. The goal is to produce underlying infrastructure suitable for multiple use cases. This section sets out the requirements for a general solution to the functional areas.

Collection

The collection portion of Meniscus is responsible for facilitating the entry of events into the system. As there will be many consumers of the eventing architecture, the collection function must support multiple schemes for moving events into the product. To achieve this goal, the following requirements should be met.

  1. The customer should not be required to install an agent on the host, but may choose to do so.
  2. The customer must be able to use the standard syslog system as well as its common implementations, rsyslog and syslog-ng.
  3. The customer should be able to integrate their application into the logging system directly (see the sketch after this list).
  4. Various flavors of Linux and Windows must be supported, either directly or through additional software.
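
To illustrate requirement 3, an application can emit events directly over the standard syslog protocol using nothing more than Python's standard logging module. This is a minimal sketch; the collector address, port, and application name are assumptions for illustration, not part of the Meniscus specification.

```python
import logging
import logging.handlers

# Send application events to a syslog-compatible collector over UDP.
# The collector address and port are assumptions for illustration; any
# syslog, rsyslog, or syslog-ng endpoint would be addressed the same way.
handler = logging.handlers.SysLogHandler(address=("collector.example.com", 514))
handler.setFormatter(logging.Formatter("myapp: %(levelname)s %(message)s"))

logger = logging.getLogger("myapp")
logger.addHandler(handler)
logger.setLevel(logging.INFO)

# Each call produces one syslog event that the collection tier can ingest.
logger.info("user login succeeded for account 42")
```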

Transport

The Transport function (or event grid) of the system is responsible for moving data collected from hosts to its final destination. The transport system has several functional and scalability requirements it must meet to satisfy expected needs.

  1. The grid should be resilient to multiple failures in the system.
  2. The grid should be capable of routing events to one or more sinks based on the source of the event, the customer, or the content of the message (see the routing sketch after this list).
  3. The grid should provide a platform on which event filtering and enhancement can be performed.
  4. The grid should auto-scale to meet predefined SLAs for transit speed and failure tolerance.
  5. The grid should support multiple service levels that offer choices in event durability, processing speed, and security.
  6. The grid must support both transit and storage encryption.
  7. The grid’s source nodes must support compression of the event stream.
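
As a rough illustration of requirement 2, the sketch below routes each event to one or more sinks based on its tenant, source host, or message content. The rule structure and sink names are assumptions made for this example and do not describe the actual Meniscus routing interface.

```python
# Minimal content-based routing sketch. The predicates and sink names are
# illustrative assumptions, not the Meniscus routing configuration format.
ROUTES = [
    # (predicate over an event dict, destination sink name)
    (lambda e: e.get("tenant") == "tenant-42",       "elasticsearch"),
    (lambda e: e.get("host", "").startswith("db-"),  "hdfs"),
    (lambda e: "ERROR" in e.get("message", ""),      "cep"),
]
DEFAULT_SINK = "cloudfiles"

def route(event):
    """Return every sink an event should be delivered to."""
    sinks = [sink for predicate, sink in ROUTES if predicate(event)]
    return sinks or [DEFAULT_SINK]

# An error from a database host is duplicated to both the HDFS and CEP sinks.
print(route({"tenant": "tenant-7", "host": "db-3", "message": "ERROR disk full"}))
```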

Storage

The logging system should support a selection of output sinks that provide both processing and storage of data. Expected data output sinks should include:

  • ElasticSearch
  • Hadoop / HDFS / HBase
  • Cloud Files
  • NoSQL (MongoDB)
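
As one example of a storage sink, an adapter could deliver events to ElasticSearch over its REST API. The sketch below indexes a single event; the endpoint, index name, and document layout are assumptions, and a production sink would batch events through the bulk API rather than issuing one request per event.

```python
import json
import urllib.request

def index_event(event, es_url="http://localhost:9200", index="events"):
    """Index one event document into ElasticSearch (illustrative sketch only)."""
    request = urllib.request.Request(
        f"{es_url}/{index}/_doc",          # assumed endpoint and index name
        data=json.dumps(event).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)

index_event({"host": "web-1", "severity": "info", "message": "request served"})
```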

Event Processing & Enhancement

The logging system should provide a way for a customer to perform some processing on individual events.

  1. The system should provide an interface to identify processable events by hostname or message contents.
  2. The system should allow events to be processed, dropped, or augmented with additional data.
  3. The system must allow all event processing to be performed in Python.
  4. The system must prevent event processing for a single tenant from impacting other tenants.

Note that this set of features is designed to allow processing of single events without dependencies between them, so that processing can be parallelized without damaging the integrity of the event stream.
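
A per-event processor could be as simple as the sketch below: a pure function that receives one event and returns it unchanged, returns an augmented copy, or returns None to drop it. The function signature and field names are assumptions for illustration rather than the Meniscus processing API; the important property is that each call depends on a single event, so processing can be parallelized freely.

```python
import copy

def process_event(event):
    """Drop, augment, or pass through a single event (illustrative sketch)."""
    # Drop noisy debug traffic entirely.
    if event.get("severity") == "debug":
        return None

    enriched = copy.deepcopy(event)

    # Augment the event using only data derived from its own contents.
    if "timeout" in enriched.get("message", "").lower():
        enriched["tags"] = enriched.get("tags", []) + ["timeout"]

    return enriched

print(process_event({"severity": "error", "message": "connection timeout"}))
print(process_event({"severity": "debug", "message": "cache hit"}))
```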

Complex Event Processing

Complex event processing is an extension of the event processing described above. While event processing is designed to operate on only a single event at a time, complex event processing is designed to allow processing of the stream as a whole.

  1. The system should support sending or duplicating an event stream (or a subset of an event stream) to a complex event processor.
  2. The system should allow the complex event processing infrastructure to filter events out of the event stream or place new events into the stream.
  3. The system must allow all complex event processing to be performed in Python.
  4. The system must prevent complex event processing for a single tenant from impacting other tenants.
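
By way of illustration, a complex event processor might keep a sliding window over the stream and inject new events when a pattern appears. The sketch below counts recent error events per host and emits a synthetic alert event once a threshold is crossed; the event shape, threshold, and window size are assumptions for this example.

```python
from collections import defaultdict, deque
import time

class ErrorBurstDetector:
    """Illustrative stream-level processor (not the Meniscus CEP API).

    Unlike single-event processing, this keeps state across the stream: a
    sliding window of error timestamps per host. When a host logs too many
    errors within the window, a synthetic alert event is placed into the
    stream alongside the original event.
    """

    def __init__(self, threshold=5, window_seconds=60):
        self.threshold = threshold
        self.window = window_seconds
        self.errors = defaultdict(deque)  # host -> recent error timestamps

    def process(self, event):
        """Return the list of events to place back into the stream."""
        out = [event]
        if event.get("severity") == "error":
            now = event.get("timestamp", time.time())
            recent = self.errors[event.get("host", "unknown")]
            recent.append(now)
            # Discard timestamps that have fallen out of the sliding window.
            while recent and now - recent[0] > self.window:
                recent.popleft()
            if len(recent) >= self.threshold:
                out.append({
                    "severity": "alert",
                    "host": event.get("host"),
                    "message": f"{len(recent)} errors in the last {self.window}s",
                })
        return out
```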

Analytics

Analytics is the final stage of event processing. While both event processing and complex event processing operate on an event stream, analytics operates on an arbitrary slice of the stored event data. The logging system will not provide analytics directly; instead, it will rely on external analytics or Hadoop products to enable this type of processing.

  1. The system should support delivering events to analytics sinks including HDFS and HBase.
  2. The system should help optimize analytics usage by manipulating file sizes or using internal buffering.
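
Requirement 2 could be satisfied with a simple buffer in front of the analytics sink: events accumulate in memory and are flushed as one larger file, since HDFS handles a small number of large files far better than many small ones. The flush callback and size threshold below are assumptions for illustration; a real sink would write each batch to HDFS or HBase.

```python
import json

class BufferedSinkWriter:
    """Accumulate events and flush them in large batches (illustrative sketch)."""

    def __init__(self, flush_fn, max_bytes=64 * 1024 * 1024):
        self.flush_fn = flush_fn      # assumed callback that writes one file/blob
        self.max_bytes = max_bytes
        self.buffer = []
        self.size = 0

    def write(self, event):
        line = json.dumps(event)
        self.buffer.append(line)
        self.size += len(line) + 1
        if self.size >= self.max_bytes:
            self.flush()

    def flush(self):
        if self.buffer:
            self.flush_fn("\n".join(self.buffer) + "\n")
            self.buffer = []
            self.size = 0

# Usage with a stand-in flush callback that just reports the batch size.
writer = BufferedSinkWriter(lambda blob: print(f"flushing {len(blob)} bytes"), max_bytes=1024)
for i in range(100):
    writer.write({"host": "web-1", "message": f"request {i} served"})
writer.flush()
```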