Monitors the execution of the experiments. It receives the messages generated by the experiments through Kafka and stores them in a MongoDB database.
The monitor listens for messages on the topic defined in the MAIN_TOPIC environment variable. Through this channel it receives the ids of the topics that the different experiments use to communicate their metrics. For each id received, it creates a new thread that starts listening on the new topic. The messages in these new topics contain metrics information. When an experiment finishes, a special message is received on its topic.
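The flow described above can be sketched as follows. This is only an illustration of the one-thread-per-topic idea, not the monitor's actual code: `subscribe` and `handle` are hypothetical callables. `subscribe(topic)` is assumed to return an iterable of message payloads (in the real monitor this role would be played by a Kafka consumer), and `handle(topic, raw)` is assumed to return `True` once the end-of-experiment message has been seen.

```python
import threading


def listen_experiment(topic, subscribe, handle):
    """Consume one experiment topic until handle() reports the end."""
    for raw in subscribe(topic):
        if handle(topic, raw):  # True means "experiment finished"
            break


def run_monitor(main_topic, subscribe, handle):
    """Start one listener thread per topic id announced on the main topic."""
    threads = []
    for topic in subscribe(main_topic):
        worker = threading.Thread(
            target=listen_experiment,
            args=(topic, subscribe, handle),
            daemon=True,
        )
        worker.start()
        threads.append(worker)
    # Only reached if the main topic stream ends (for example in a test).
    return threads
```

Passing `subscribe` and `handle` as parameters keeps the sketch independent of any broker, so it can be exercised with in-memory lists.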
The process is started by executing `start.sh`.
- confluent-kafka
- pymongo
- MAIN_TOPIC: Name of the topic used by the scheduler to communicate the topics used by the experiments.
- KAFKA_SERVERS: Addresses of the Kafka servers.
- KAFKA_GROUP: Kafka consumer group.
- MONGO (self-explanatory): MONGO_URL, MONGODB_DATABASE, MONGODB_USER, MONGODB_PASSWORD, and MONGO_PORT.
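As a rough sketch, the variables listed above could be gathered into a single settings dictionary at startup. The function name `load_config` and the dictionary keys are hypothetical, and treating MONGO_PORT as an integer is an assumption; the other values are kept as plain strings.

```python
import os


def load_config(env=None):
    """Collect the monitor settings from the environment (or a dict, for testing)."""
    env = os.environ if env is None else env
    return {
        "main_topic": env["MAIN_TOPIC"],
        "kafka_servers": env["KAFKA_SERVERS"],
        "kafka_group": env["KAFKA_GROUP"],
        "mongo_url": env["MONGO_URL"],
        # Assumption: the port is a plain integer such as 27017.
        "mongo_port": int(env["MONGO_PORT"]),
        "mongo_database": env["MONGODB_DATABASE"],
        "mongo_user": env["MONGODB_USER"],
        "mongo_password": env["MONGODB_PASSWORD"],
    }
```

Indexing with `env[...]` raises `KeyError` for a missing variable, so a misconfigured deployment fails at startup rather than later.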
Currently the monitor can process messages in two formats.
The first is the original, TAB-separated format. There are two types of messages:
- Metrics message. Each message contains the following fields, separated by TABs:
  - Absolute time.
  - Relative time (from the beginning of the experiment).
  - Processed lines.
  - Cost.
  - Accuracy.

  Any other extra field is ignored.
- End of experiment. Any of the following messages indicates the end of the experiment:
  - `<TIMEOUT_REACHED>`
  - `<ACCURACY_REACHED>`
  - `<EXPERIMENT_COMPLETED>`
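A minimal sketch of how a message in this original format might be parsed. The function name, the returned dictionary keys, and the numeric conversions are assumptions made for illustration; the absolute time is left as a string because its exact representation is not specified here.

```python
END_MARKERS = {"<TIMEOUT_REACHED>", "<ACCURACY_REACHED>", "<EXPERIMENT_COMPLETED>"}


def parse_legacy_message(raw):
    """Parse one message in the original format into a dictionary."""
    text = raw.strip()
    if text in END_MARKERS:
        return {"kind": "end", "reason": text.strip("<>")}

    fields = text.split("\t")
    if len(fields) < 5:
        raise ValueError("expected at least 5 TAB-separated fields")

    # Only the first five fields are used; extra fields are ignored.
    absolute, relative, lines, cost, accuracy = fields[:5]
    return {
        "kind": "metrics",
        "absolute_time": absolute,        # kept as text (format unknown)
        "relative_time": float(relative),  # assumption: numeric
        "processed_lines": int(lines),     # assumption: integer count
        "cost": float(cost),
        "accuracy": float(accuracy),
    }
```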
A new JSON format is also accepted to allow further expansion. It supports a variable number of message types; so far two are accepted. All messages must have the following attributes: `type` and `msg`. The contents of `msg` depend on the `type`:
- Type `metrics`: This message contains a list of attributes, as pairs of keys and values. They are stored in the Mongo database with no further checks. To remain backwards compatible with the current version of the scheduler, they are expected to include at least the following keys:
  - cost.
  - precision.
  - experimentTime.
  - time.
  - epochs.
- Type `service`: This type of message is used to communicate the end of the experiment. So far the only accepted value is `experiment_completed`.
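The JSON format could be handled along these lines. This is a sketch under assumptions: `msg` is taken to be a JSON object for `metrics` messages and the plain string `experiment_completed` for `service` messages, and `store` is a hypothetical callable standing in for the database insert (for example a pymongo collection's `insert_one`).

```python
import json


def handle_json_message(raw, store):
    """Process one JSON message; return True when the experiment is finished."""
    message = json.loads(raw)
    msg_type = message["type"]
    body = message["msg"]

    if msg_type == "metrics":
        # Stored as received, with no further checks on the keys.
        store(dict(body))
        return False

    if msg_type == "service":
        if body == "experiment_completed":
            return True
        raise ValueError(f"unknown service message: {body!r}")

    raise ValueError(f"unknown message type: {msg_type!r}")
```

For example, `handle_json_message('{"type": "service", "msg": "experiment_completed"}', store)` returns `True` without calling `store`.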