The Collector is the stage that ingests events. It exposes a RESTful endpoint that accepts events in JSON format, either as a single JSON event or as a list of JSON events.
The Collector has a pluggable Validator. If validation fails, it returns an error message to the client. If validation passes, it performs geo enrichment and device classification via Esper EPL, then passes the events to the next stage, the Sessionizer.
Geo enrichment uses the MaxMind GeoLite2 geo database, and device classification uses UADetector (http://uadetector.sourceforge.net/).
This product includes GeoLite2 data created by MaxMind, available from http://www.maxmind.com.
The Collector ships with a sample event model for user behavior tracking; the model can be extended.
- An IPv4 or IPv6 address (ipv4/ipv6) is required for geo enrichment.
- ua is required for device classification.
- si is the stream ID; it is mandatory.
- ct is the capture time, used as the event timestamp; it is optional. If it is missing, the current system time is used.
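As an illustration of the rules above, here is a minimal Python sketch that assembles a raw event on the client side. The helper name `build_raw_event` is hypothetical and not part of the Collector; the choice of epoch milliseconds for ct is also an assumption, since the page only states that ct is a Long.

```python
import json
import time


def build_raw_event(si, ua=None, ipv4=None, ipv6=None, ct=None):
    """Build a raw event dict using the tag names listed above.

    Hypothetical client-side helper, not part of the Collector itself.
    """
    if not si:
        # si (stream id) is the only mandatory tag.
        raise ValueError("si (stream id) is mandatory")
    event = {"si": si}
    if ua is not None:
        event["ua"] = ua      # needed for device classification
    if ipv4 is not None:
        event["ipv4"] = ipv4  # needed for geo enrichment
    if ipv6 is not None:
        event["ipv6"] = ipv6  # alternative to ipv4 for geo enrichment
    if ct is not None:
        # ct is optional; when omitted, the Collector uses its own system time.
        event["ct"] = int(ct)
    return event


if __name__ == "__main__":
    # Assumption: ct is expressed in epoch milliseconds.
    now_ms = int(time.time() * 1000)
    event = build_raw_event(
        "example-stream",
        ua="Mozilla/5.0 (X11; Linux x86_64)",
        ipv4="203.0.113.7",
        ct=now_ms,
    )
    print(json.dumps(event))
```

Leaving out ct simply omits the tag from the payload, which matches the documented fallback to the system time on the Collector side.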
Raw Event tags
- Stream Id - si
- Tenant - tn
- Origin - or
- Capture time - ct
- User Agent - ua
- IPv4 address - ipv4
- IPv6 address - ipv6
- Referrer - rf
- Event type - et
Only capture time is a Long; all other raw event tags are Strings.
The geo enrichment and device classification steps add the following tags:
- City - _cty
- Continent - _con
- Country - _cn
- Region - _rgn
- Longitude - _lon
- Latitude - _lat
- Country ISO Code - _tlcn
- Device category - _dd_dc
- OS Family - _dd_os
- OS Version - _dd_osv
- User Agent Family - _dd_bf
- User Agent Type - _dd_d
- User Agent version - _dd_bv
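The tag names above suggest a naming pattern: tags added by enrichment start with an underscore, and the device classification tags additionally start with `_dd_`. The following sketch groups the tags of an event on that basis. The function name `split_tags` is hypothetical, the prefix rule is an inference from the lists on this page rather than a documented contract, and the sample values are made up.

```python
def split_tags(event):
    """Group event tags into raw, geo, and device tags by name prefix.

    Assumption: "_dd_" marks device classification tags, any other
    leading "_" marks geo enrichment tags, and everything else is raw.
    """
    groups = {"raw": {}, "geo": {}, "device": {}}
    for tag, value in event.items():
        if tag.startswith("_dd_"):
            groups["device"][tag] = value
        elif tag.startswith("_"):
            groups["geo"][tag] = value
        else:
            groups["raw"][tag] = value
    return groups


if __name__ == "__main__":
    # Made-up sample values, for illustration only.
    enriched = {
        "si": "example-stream",
        "ipv4": "203.0.113.7",
        "_cty": "Example City",
        "_cn": "Example Country",
        "_dd_os": "Linux",
        "_dd_bf": "Firefox",
    }
    for group, tags in split_tags(enriched).items():
        print(group, sorted(tags))
```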
A pluggable Validator validates the events. See Validator.
This is a Jetstream app that can run in Docker. It exposes the following ports:
- 9999 for monitoring
- 8080 for the REST endpoint
- 15590 for inbound replay messages
The REST endpoint paths:
- /pulsar/ingest/PulsarRawEvent - For a single event
- /pulsar/batchingest/PulsarRawEvent - For a batch of events
Both the request and response payloads are in JSON format. A batch is sent as a JSON array.
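A client call against these endpoints might look like the following sketch, which uses only the Python standard library. Several details are assumptions not confirmed by this page: the host `localhost`, the use of HTTP POST, and the `Content-Type: application/json` header. The helper names `build_ingest_request` and `send` are hypothetical.

```python
import json
import urllib.request

# Assumption: the Collector runs locally with the REST port 8080 exposed.
BASE_URL = "http://localhost:8080"

SINGLE_PATH = "/pulsar/ingest/PulsarRawEvent"
BATCH_PATH = "/pulsar/batchingest/PulsarRawEvent"


def build_ingest_request(events, base_url=BASE_URL):
    """Build a request for a single event (dict) or a batch (list of dicts)."""
    if isinstance(events, dict):
        path, payload = SINGLE_PATH, events
    else:
        # A batch is serialized as a JSON array.
        path, payload = BATCH_PATH, list(events)
    body = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(
        base_url + path,
        data=body,
        headers={"Content-Type": "application/json"},  # assumed header
        method="POST",  # assumed HTTP method
    )


def send(request, timeout=5):
    """Send the request and return (status code, response body as text)."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.status, response.read().decode("utf-8")


if __name__ == "__main__":
    single = build_ingest_request({"si": "example-stream", "ipv4": "203.0.113.7"})
    batch = build_ingest_request([{"si": "example-stream"}, {"si": "example-stream"}])
    print(single.full_url)
    print(batch.full_url)
    # send(single) would perform the actual call against a running Collector.
```

Building the request separately from sending it keeps the payload and path selection easy to inspect without a running Collector.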