Utilities for debugging intelmq bots. #973

Closed

e3rd wants to merge 123 commits into certtools:master from CZ-NIC:bot_debugger

Member

e3rd commented May 12, 2017 •

edited by ghost

BotDebugger is called via intelmqctl. It starts a live bot instance,
raises the logging level to DEBUG, and lets even an inexperienced programmer,
who may be puzzled by Python nuances and server deployment twists,
see what is happening in the bot and where the error is.

Depending on the subcommand received, the class either

starts the bot as is (default)
processes a single message, either injected or from the default pipeline (process subcommand)
reads a message from the input pipeline or sends a message to the output pipeline (message subcommand)

Further help was added to argparse help of intelmqctl:
Possible commands:

intelmqctl run bot-id (bot.start())
intelmqctl run bot-id message get (read the next message)
intelmqctl run bot-id message pop (read the next message and pop from queue)
intelmqctl run bot-id message send '{a:b}' (create message from string and send to output queue)
intelmqctl run bot-id process (process single message)
intelmqctl run bot-id process --msg '{a:b}' (process single message from string)
intelmqctl run bot-id process --dryrun (process single message from pipeline or --msg, but never really acknowledge nor send it to output pipeline)
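
The command layout above could be wired up with nested argparse subparsers roughly as follows. This is only a sketch: the function name `build_parser` is hypothetical, and the real parser in intelmqctl.py may be structured differently. The destination names (`run_subcommand`, `message_action_kind`, `dryrun`, `msg`) follow the argument names visible in this pull request.

```python
import argparse


def build_parser():
    """Hypothetical sketch of the `run` subcommand tree, not the actual intelmqctl parser."""
    parser = argparse.ArgumentParser(prog="intelmqctl")
    commands = parser.add_subparsers(dest="command")

    run = commands.add_parser("run", help="Run a bot in the foreground.")
    run.add_argument("bot_id")
    run_sub = run.add_subparsers(title="run-subcommands", dest="run_subcommand")

    # intelmqctl run bot-id message get|pop|send ['<json>']
    message = run_sub.add_parser("message")
    message.add_argument("message_action_kind", choices=["get", "pop", "send"])
    message.add_argument("msg", nargs="?", help="JSON string, only used by 'send'.")

    # intelmqctl run bot-id process [--msg '<json>'] [--dryrun]
    process = run_sub.add_parser("process")
    process.add_argument("--msg", "-m", help="Process this JSON message instead of one from the pipeline.")
    process.add_argument("--dryrun", action="store_true",
                         help="Never acknowledge the message nor send it to the output pipeline.")
    return parser


args = build_parser().parse_args(["run", "my-bot", "process", "--dryrun"])
print(args.bot_id, args.run_subcommand, args.dryrun)
```

With no subcommand given, `run_subcommand` stays `None`, which maps to the default behaviour of simply starting the bot.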

These are commands I always wanted to have; I missed them when creating or quickly debugging bots. If you find them useful too, I'd be very glad to contribute them to the main repository. I am open to any discussion concerning the new commands.

sebix and others added 30 commits

September 22, 2015 16:44


          ENH+TST: Squelcher bot added

71aff7a

Signed-off-by: Sebastian Wagner <sebix@sebix.at>


          ENH: squelcher config format changed

b669819

Signed-off-by: Sebastian Wagner <sebix@sebix.at>


          ENH: intelmqcli started

cd64e02

Signed-off-by: Sebastian Wagner <sebix@sebix.at>


          ENH: intelmqclt count by asn, show contact

190374f

Signed-off-by: Sebastian Wagner <sebix@sebix.at>


          ENH: cli: integrate rt, create tickets by as

aeb0837

add attachments
less psql queries

Signed-off-by: Sebastian Wagner <sebix@sebix.at>


          ENH: cli improve layout

0944b74

Signed-off-by: Sebastian Wagner <sebix@sebix.at>


          ENH: cli: UI enhancements, create and link tickets

8b8a789

Signed-off-by: Sebastian Wagner <sebix@sebix.at>


          ENH: cli: show only not-null columns in data

c3e47d0

Signed-off-by: Sebastian Wagner <sebix@sebix.at>


          ENH: cli use less as pager

501f9a4

Signed-off-by: Sebastian Wagner <sebix@sebix.at>


          ENH: cli: support for asn without contact, save new

d745c05

incidents without known contact are grouped by ASN
New or modified contact can be saved to DB

Signed-off-by: Sebastian Wagner <sebix@sebix.at>


          DOC: document workflow of intelmqcli

d534458

Signed-off-by: Sebastian Wagner <sebix@sebix.at>


          ENH: cli: automatic sending, disable zipping of csv

fe06007

Signed-off-by: Sebastian Wagner <sebix@sebix.at>


          ENH: cli: Save sent_at timestamp with rtir ids to events

50829af

Signed-off-by: Sebastian Wagner <sebix@sebix.at>


          BUG: squelcher: start when called

27b6cda

Signed-off-by: Sebastian Wagner <sebix@sebix.at>


          ENH: cli: remove intelmq dependencies

Signed-off-by: Sebastian Wagner <sebix@sebix.at>


          ENH: cli: 2 configs, recipient command adapted to old behavior

d0207d3

Signed-off-by: Sebastian Wagner <sebix@sebix.at>


          ENH: workarounds for old libraries and packages

5304e32

prettytable,
csv on python 2.x (2.6),
postgres 9.1

minor formatting issues
remove log_level from config, not used anymore

Signed-off-by: Sebastian Wagner <sebix@sebix.at>


          ENH: cli: save recipient, show ASN once

287b97f

Signed-off-by: Sebastian Wagner <sebix@sebix.at>


          ENH: cli: get mail body from database

558dfd0

Signed-off-by: Sebastian Wagner <sebix@sebix.at>


          Merge branch 'master' into certat

c5c86f2


          ENH: cli: better text shortening, more subtle inputs

07625f6

Signed-off-by: Sebastian Wagner <sebix@sebix.at>


          ENH: cli: send single events

02f8025

not active, needs config option first

Signed-off-by: Sebastian Wagner <sebix@sebix.at>


          Merge branch 'master' into certat

948dbc3


          TST: squelcher, truncate table on test start

73f6558

Signed-off-by: Sebastian Wagner <sebix@sebix.at>


          Merge branch 'master' into certat

ff57d73


          Update BOTS

a7268ba

Fixes certtools#359


          change file permissions for intelmqcli

f02769e


          Merge branch 'master' of https://github.com/certtools/intelmq

0dd1ffb


          Merge branch 'master' of https://github.com/certat/intelmq

d3b355a


          Typo

42bee88

e3rd and others added 10 commits

March 17, 2017 01:27


          if testing_to filled the mail gets deleted

842e816


          if testing_to filled, the mail gets deleted

d2f8eb1


          ignore testing_to

88fe873


          ignore testing_to

35dec9f


          Merge branch 'master' of github.com:certtools/intelmq

f0985e2


          Merge branch 'master' into test

efdab31


          botnet start/stop threading

aa90307

we dont have to wait 0.25 s * bot now


          botnet start/stop threading

306b28d

I removed the threading dependecy added last time. This works better: First, we start all the bots and then we wait once and then we ask all the bots for status.


          botnet start/stop threading

a7467ed

I removed the threading dependecy added last time. This works better: First, we start all the bots and then we wait once and then we ask all the bots for status.


          Utilities for debugging intelmq bots.

cf0de2b

ghost self-requested a review

May 15, 2017 08:24

ghost added this to the v1.1 Feature release milestone

ghost added the component: intelmqctl label

ghost suggested changes

ghost left a comment •

edited by ghost

The features are really great. I had something like this in mind too but never had the time to actually do it.

~~Sending messages does not work for me:~~ Wrong parameter, sry

I am missing a detailed explanation including examples for the users in docs/intelmqctl.md
Please fix the code style issues: https://travis-ci.org/certtools/intelmq/jobs/231607021#L2790

intelmq/bin/intelmqctl.py Outdated

+                          retval = 0
+                      except KeyboardInterrupt:
+                          print('Keyboard interrupt.')
+                          retval = 1

ghost May 15, 2017

A Keyboard interrupt is the usual stop method and thus it should be retval 0
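
A minimal sketch of that suggestion, assuming a hypothetical wrapper `run_in_foreground` around whatever callable starts the bot (this is not the code from the pull request):

```python
def run_in_foreground(start):
    """Run `start()` and treat Ctrl+C as a normal, successful stop."""
    try:
        start()
    except KeyboardInterrupt:
        # Interrupting a foreground bot is the expected way to stop it,
        # so it is not reported as a failure.
        print('Keyboard interrupt.')
    return 0
```

Any other exception still propagates to the caller, so genuine errors are not masked by the zero return value.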

intelmq/bin/intelmqctl.py Outdated

		retval = 1
		raise

ghost May 15, 2017

As the print 2 lines above is meaningless, you can remove that block (140-143) altogether.

intelmq/bin/intelmqctl.py

+                          parser_run_subparsers = parser_run.add_subparsers(title='run-subcommands')
+                          parser_run_message = parser_run_subparsers.add_parser(
+                              'message', help='Debug bot\'s pipelines. Get the message in the input pipeline, '
+                                              'pop it (cut it) and display it, or send the message directly to bot\' output pipeline.')

ghost May 15, 2017

missing s after bot\'

intelmq/lib/bot_debugger.py

+                          if message_action_kind == "get":
+                              self.instance.logger.info("Trying to get the message...")
+                              msg = self.instance.receive_message()
+                              print(msg)

ghost May 15, 2017

I'd use pprint here for a nicer output (also in the block below)

Member Author

e3rd May 17, 2017

Note that we couldn't use pprint because it prints strings with single quotes, which intelmqctl cannot parse again: the JSON standard requires double quotes.
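
The difference can be reproduced with the standard library alone. The sample message below is made up for illustration; `json.dumps` with `indent` is shown as one possible way to get readable output that still parses, not necessarily what the pull request ended up using.

```python
import json
from pprint import pformat

# Made-up sample data, only for illustration.
message = {"__type": "Event", "source.ip": "192.0.2.1"}

as_repr = pformat(message)                 # Python repr: strings in single quotes
as_json = json.dumps(message, indent=4)    # JSON: strings in double quotes

try:
    json.loads(as_repr)
    repr_parses = True
except ValueError:
    # Single-quoted strings are not valid JSON.
    repr_parses = False

round_trip = json.loads(as_json)
print(repr_parses, round_trip == message)
```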

intelmq/bin/intelmqctl.py

+                                                          help='Never really pop the message from the input pipeline '
+                                                               'nor send to output pipeline.')
+                          parser_run_process.add_argument('--msg', '-m',
+                                                          help='Trick the bot to process this quoted dict '

ghost May 15, 2017

Actually it's not a quoted dict, but JSON. JSON's syntax is much more strict.

intelmq/lib/bot_debugger.py

+                                      print("Wrong formatted msg.")
+                                      return
+                                  self.instance.send_message(msg)
+                                  self.instance.logger.info("Message send to output pipelines.")

ghost May 15, 2017

s/send/sent/

intelmq/lib/bot_debugger.py

+                      def __init__(self, module_path, bot_id, run_subcommand = None, message_kind = None, dryrun = None, msg = None):
+                          module = import_module(module_path)
+                          bot = getattr(module, 'BOT')
+                          self.instance = bot(bot_id)

ghost May 15, 2017

This runs the bot's initialization. Thus, if the user only wants to get the message, this should not be run.

Member Author

e3rd May 15, 2017 •

edited by ghost

Is that really so bad? Why? Is it a bottleneck, or do you mind the initialisation messages in the console? ("deduplicator-expert-cz: DeduplicatorExpertBot initialized with id deduplicator-expert and version 3.5.2 (default, Nov 17 2016, 17:05:23) as process 5560.") I may just suppress the messages.
There are so many things I would have to implement if I wanted to connect to the bot's pipelines without actually calling bot.init. So much code would have to be reused, and I might make small errors that would make the connection differ between a debug session and the normal lifecycle of a bot: loading configuration from different files, manually calling PipelineFactory...

ghost May 16, 2017

It's not about the log messages. Bots execute code in init(), e.g. connecting, loading and blocking resources etc.

For basic experts this is not really relevant, yes. But when they load and parse big files into memory, which is totally irrelevant for the message operations, that's annoying too.

To solve this we could add a second optional parameter to Bot.__init__ which controls the call to Bot.init.

Member Author

e3rd May 17, 2017

If Bot.init() is the only problem, it seems easier to me to just strip it out with bot.init = lambda self: None before initialization occurs.
We are doing monkeypatching everywhere in this pull request, and I find it nicer than adding another parameter just for this debug case.
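
A sketch of that monkeypatching idea, under stated assumptions: `DemoBot` is a hypothetical stand-in whose `__init__` calls `self.init()`, and `instantiate_without_init` is an illustrative helper, not code from this pull request. A lambda body must be an expression, so the no-op returns `None` rather than using `pass`; because it is set on the class, it also has to accept `self`.

```python
from unittest import mock


class DemoBot:
    """Hypothetical stand-in for a bot class; not intelmq's real Bot."""

    def __init__(self, bot_id):
        self.bot_id = bot_id
        self.resources_loaded = False
        self.init()

    def init(self):
        # Imagine expensive work here: opening connections, parsing big files.
        self.resources_loaded = True


def instantiate_without_init(bot_class, bot_id):
    """Create the bot while init() is temporarily replaced by a no-op."""
    with mock.patch.object(bot_class, "init", lambda self: None):
        return bot_class(bot_id)


bot = instantiate_without_init(DemoBot, "demo-bot")
print(bot.resources_loaded)
```

Using `mock.patch.object` as a context manager restores the original `init` afterwards, so later instances of the class behave normally.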

intelmq/lib/bot_debugger.py

+              from intelmq.lib.message import Event
+              from importlib import import_module
+              from intelmq.lib.utils import StreamHandler
+              from intelmq.lib.message import Event

ghost May 15, 2017

duplicate import

intelmq/lib/bot_debugger.py

+                                  try:
+                                      msg = Event(json.loads(msg))
+                                  except:
+                                      print("Wrong formatted msg.")

ghost May 15, 2017

Could be invalid data (from validation) or invalid syntax (JSON). I'd print the error message (you can use lib.utils.error_message_from_exc for this)

Member Author

e3rd May 15, 2017 •

edited

And what do you think about this?

except (Exception, KeyError, TypeError, json.JSONDecodeError) as exc:
    print("Message can not be parsed from JSON: " + error_message_from_exc(exc))
    return

For the cmd intelmqctl run deduplicator-expert message send '1', the exception message will look like: Message can not be parsed from JSON: 'int' object is not subscriptable

For the cmd intelmqctl run deduplicator-expert-cz message send '{"fsd1": "test"}' we get: Message can not be parsed from JSON: '__type'

Shouldn't we use a default_type = "Message" or something in the MessageFactory.unserialize method? (Btw, thanks for letting me know about MessageFactory.unserialize, I missed that before.)

ghost May 16, 2017

Looks better. json.JSONDecodeError only exists in >= 3.5, below it's ValueError. As JSONDecodeError is a subclass of ValueError, catching only the latter one is fine.
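
A small sketch of the version-independent variant. The helper name `parse_json_message` is hypothetical, and the sketch formats the exception with `str()` instead of intelmq's `error_message_from_exc` so that it stays self-contained:

```python
import json


def parse_json_message(raw):
    """Return (parsed, None) on success or (None, error_text) on invalid JSON."""
    try:
        return json.loads(raw), None
    except ValueError as exc:
        # On Python >= 3.5 this is a json.JSONDecodeError, which subclasses
        # ValueError; older versions raise a plain ValueError.
        return None, "Message can not be parsed from JSON: {}".format(exc)


print(parse_json_message('{"a": "b"}'))
print(parse_json_message("{a:b}"))
```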

The default type would be either Report or Event, based on the bot's group (the parameter "group"). Then the user does not need to give the message type at all and it's always correct :)

Also, for bots without a destination pipeline this throws an exception:

...
file-output: Opening '/opt/intelmq/var/lib/bots/file-output/events.txt' file.
file-output: File '/opt/intelmq/var/lib/bots/file-output/events.txt' is open.
file-output: Loading source pipeline and queue 'file-output-queue'.
file-output: Connected to source queue.
file-output: No destination queues to load.
file-output: Pipeline ready.
Traceback (most recent call last):
  File "/usr/local/bin/intelmqctl", line 9, in <module>
    load_entry_point('intelmq==1.0.0.dev7', 'console_scripts', 'intelmqctl')()
  File "/home/sebastian/dev/intelmq/intelmq/bin/intelmqctl.py", line 885, in main
    return x.run()
  File "/home/sebastian/dev/intelmq/intelmq/bin/intelmqctl.py", line 531, in run
    results = args.func(**args_dict)
  File "/home/sebastian/dev/intelmq/intelmq/bin/intelmqctl.py", line 541, in bot_run
    return self.bot_process_manager.bot_run(bot_id, run_subcommand, message_action_kind, dryrun, msg)
  File "/home/sebastian/dev/intelmq/intelmq/bin/intelmqctl.py", line 135, in bot_run
    BotDebugger(self.__runtime_configuration[bot_id]['module'], bot_id, run_subcommand, message_action_kind, dryrun, msg)
  File "/home/sebastian/dev/intelmq/intelmq/lib/bot_debugger.py", line 45, in __init__
    self._message(message_kind, msg)
  File "/home/sebastian/dev/intelmq/intelmq/lib/bot_debugger.py", line 80, in _message
    self.instance.send_message(msg)
  File "/home/sebastian/dev/intelmq/intelmq/lib/bot.py", line 332, in send_message
    raise exceptions.ConfigurationError('pipeline', 'No destination pipeline given, '
intelmq.lib.exceptions.ConfigurationError: pipeline configuration failed - No destination pipeline given, but needed

Member Author

e3rd May 17, 2017 •

edited

> json.JSONDecodeError only exists in >= 3.5, below it's ValueError

Thanks, I didn't know that. I think 3.5 is quite widespread now, so let's leave it like this.

> The default type would be either Report or Event, based on the bot's group (the parameter "group"). Then the user does not need to give the message type at all and it's always correct ☺

Am I right that Parser gets Report and others get Event (aside from Collectors)? default_type = "Report" if self.runtime_configuration["group"] == "Parser" else "Event"

> Also, for bots without a destination pipeline this throws

Corrected, even for bots without input queue.
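
The default-type rule discussed above can be sketched as a tiny function. The name `default_message_type` is hypothetical. The comparison uses `==`, since `is` tests object identity and is not guaranteed to be true for two equal strings, for example when the group name comes from a parsed configuration file.

```python
def default_message_type(group):
    """Parsers consume Reports; other groups (aside from Collectors) consume Events."""
    return "Report" if group == "Parser" else "Event"


# A string built at runtime, as it would be when read from a config file.
group_from_config = "".join(["Par", "ser"])
print(default_message_type(group_from_config))
print(default_message_type("Expert"))
```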

intelmq/lib/bot_debugger.py

+                          elif message_action_kind == "send":
+                              if msg:
+                                  try:
+                                      msg = Event(json.loads(msg))

ghost May 15, 2017

Reports are not possible. Change it to MessageFactory.unserialize, which takes a string, to support this.

Member Author

e3rd commented May 15, 2017

Wow, thanks a lot for such thorough feedback. I'll be working on the suggestions now and will let you know in the thread!

e3rd added 5 commits

May 15, 2017 17:53


          Merge remote-tracking branch 'origin/bot_debugger' into test

d38e9a3


          first requested changes

17e9640


          working on feedback, pull 973

0c217f5


          Revert "working on feedback, pull 973"

4870f16

This reverts commit 0c217f5.


          Revert "first requested changes"

3d3bf04

This reverts commit 17e9640.

e3rd closed this

e3rd deleted the bot_debugger branch

May 15, 2017 19:14

e3rd mentioned this pull request

Bot debugging #975

Merged

6 tasks

ghost commented May 17, 2017 via email

True, haven't thought of that possibility.

ghost commented May 17, 2017 via email

> json.JSONDecodeError only exists in >= 3.5, below it's ValueError
> Thanks, I didn't know that. I think 3.5 is quite widespread now, so let's leave it like this.

This code would *fail* in py < 3.5. We can talk about 3.3, but we definitely support 3.4. Lots of installations use this version. E.g. jessie has 3.4, which is the current stable version of Debian. Feel free to search for other examples.

> The default type would be either Report or Event, based on the bot's group (the parameter "group"). Then the user does not need to give the message type at all and it's always correct ☺
> Am I right that Parser gets Report and others get Event (aside from Collectors)? `default_type = "Report" if self.runtime_configuration["group"] == "Parser" else "Event"`

yes

Member Author

e3rd commented May 17, 2017

Okok, I misunderstood you before, catching ValueError is definitely better!

Everything seems implemented. Please take a look at it :)

ghost modified the milestones: v1.1 Feature release, v1.0 Stable Release
