How to write your own collector

skovzhaw edited this page Nov 4, 2016 · 39 revisions

Design

A usage collection service that sends usage data into the RCB system is designed to communicate directly with the UDR micro service. It can communicate over AMQP or HTTP, sending JSON messages in a specific format.

Communication

It is highly recommended to push data to RCB Cyclops over RabbitMQ, but if necessary you can also use HTTP RESTful data ingestion. Your first point of contact will be the UDR micro service, which consumes available data frames from a predetermined queue.
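The RabbitMQ path can be sketched as follows in Python with the pika client. This is illustrative only: the queue name is a placeholder, and you should use whatever queue your UDR instance is actually configured to consume from.

```python
import json

# Placeholder queue name -- check your UDR configuration for the real one.
UDR_QUEUE = "cyclops.udr.consume"

# A minimal usage record in the format described below.
message = json.dumps({
    "_class": "OpenStackCeilometerCPU",
    "account": "martin",
    "usage": 20,
    "unit": "s",
})

def publish(body: str, queue: str = UDR_QUEUE) -> None:
    """Publish one JSON data frame to the UDR queue over AMQP."""
    import pika  # pip install pika
    connection = pika.BlockingConnection(pika.ConnectionParameters("localhost"))
    channel = connection.channel()
    channel.queue_declare(queue=queue, durable=True)
    channel.basic_publish(exchange="", routing_key=queue, body=body)
    connection.close()

# publish(message)  # requires a running RabbitMQ broker
```

The broker host, queue name, and durability flag are assumptions for the sketch; only the JSON body format is prescribed by this page.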

Format

To guarantee that data ingestion functions as expected, you are asked to follow the specified JSON structure:

  • mandatory _class field
  • optional time field
  • recommended account field
  • recommended usage field
  • recommended unit field
  • optional metadata nested object
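The field requirements above can be expressed as a small checker. This helper is purely illustrative (UDR itself persists any JSON, as noted below); it only mirrors the mandatory/recommended split described on this page.

```python
REQUIRED = {"_class"}
RECOMMENDED = {"account", "usage", "unit"}
OPTIONAL = {"time", "metadata"}

def check_usage_record(record: dict) -> list:
    """Raise if a mandatory field is missing; return warnings for
    recommended fields that are absent."""
    missing = REQUIRED - record.keys()
    if missing:
        raise ValueError(f"missing mandatory field(s): {sorted(missing)}")
    return [f"recommended field '{f}' not set"
            for f in sorted(RECOMMENDED - record.keys())]

warnings = check_usage_record({"_class": "OpenStackCeilometerCPU", "usage": 20})
# warnings flags the absent 'account' and 'unit' fields
```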

Compulsory field

The only field that is required at all times is _class, which identifies the class of the reported object and must always be specified. This field is later reused and passed through many layers of the overall Rating-Charging-Billing workflow, between all the micro services involved.

Timestamp

If you also decide to provide a time field, it needs to be a Long number, not a String. The field name is completely up to you, but you will need to declare it by implementing an interface in the UDR micro service.

If you omit the time field, a timestamp is generated automatically when the data frame is delivered.
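A minimal sketch of producing a valid time value in Python. The sample record later on this page uses seconds since the Unix epoch (1467642128); confirm the expected unit for your deployment.

```python
import json
import time

record = {
    "_class": "OpenStackCeilometerCPU",
    # A Long (JSON number), not a String such as "1467642128".
    "time": int(time.time()),
}
payload = json.dumps(record)
```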

Recommended fields

The UDR micro service is capable of persisting any JSON message, even nested ones. However, it is still good practice to include an account String field, as well as usage Double and unit String values. These fields are expected by the GenerateUDR command, which takes the individual UsageData streams and generates a UDR envelope for them.

Dashboard usage

If you intend to use the Cyclops Dashboard, it is highly recommended to include a source field inside the metadata object; it serves as the graph differentiator for individual usage records.

Example

Here is an example of an OpenStack Ceilometer CPU usage record:

{  
    "_class": "OpenStackCeilometerCPU",
    "time": 1467642128,
    "account": "martin",
    "usage": 20,
    "unit": "s",
    "metadata": {  
        "source": "VM1",
        "project_id": "123"
    }
}
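The record above could also be ingested over the HTTP RESTful alternative mentioned earlier. A minimal sketch using only the Python standard library; the endpoint URL is a placeholder, so substitute the host, port, and path your UDR deployment actually exposes.

```python
import json
import urllib.request

# Placeholder endpoint -- not the documented UDR URL; check your deployment.
UDR_ENDPOINT = "http://localhost:8888/data"

record = {
    "_class": "OpenStackCeilometerCPU",
    "time": 1467642128,
    "account": "martin",
    "usage": 20,
    "unit": "s",
    "metadata": {"source": "VM1", "project_id": "123"},
}

request = urllib.request.Request(
    UDR_ENDPOINT,
    data=json.dumps(record).encode("utf-8"),
    headers={"Content-Type": "application/json"},
    method="POST",
)
# urllib.request.urlopen(request)  # requires a reachable UDR instance
```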

Batch ingestion

It is highly recommended to push data in batches; the number of data points per batch is completely up to you. You don't have to configure anything, as all the micro services are capable of batch processing. Keep in mind that if you are running multiple micro service instances, each JSON array is processed by exactly one instance, which effectively lets you split the load: for example, 20'000 records sent as 5 batches of 4'000 would be consumed by 5 UDR service instances in parallel.

[
    {  
        "_class": "OpenStackCeilometerCPU",
        "time": 1467642128,
        "account": "martin",
        "usage": 20,
        "unit": "s",
        "metadata": {  
            "source": "VM1",
            "project_id": "123"
        }
    },
    {  
        "_class": "OpenStackCeilometerCPU",
        "time": 1467642129,
        "account": "martin",
        "usage": 50,
        "unit": "s",
        "metadata": {  
            "source": "VM2",
            "project_id": "456"
        }
    }
]
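The batching described above can be sketched as a small helper that splits a record list into JSON arrays. The batch size is arbitrary, as the page notes; 4'000 below simply reproduces the 20'000-records-in-5-batches example.

```python
import json

def to_batches(records, batch_size):
    """Yield JSON-encoded arrays of at most batch_size records each."""
    for start in range(0, len(records), batch_size):
        yield json.dumps(records[start:start + batch_size])

records = [{"_class": "OpenStackCeilometerCPU", "usage": i} for i in range(20000)]
batches = list(to_batches(records, 4000))
# 5 JSON arrays of 4000 records, which 5 UDR instances could consume in parallel
```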