Failing to add new samples to model #17

Open
enzok opened this issue Feb 5, 2018 · 2 comments

@enzok

enzok commented Feb 5, 2018

I'm using the following to initialize the default model and add new samples to it:

from mmbot import MaliciousMacroBot

opts = {'benign_path': 'benign',
        'malicious_path': 'malicious',
        'model_path': 'model'}

mmb = MaliciousMacroBot(retain_sample_contents=True)
mmb.set_model_paths(opts["benign_path"],
                    opts["malicious_path"],
                    opts["model_path"])
mmb.mmb_init_model(modelRebuild=True)

I get the following error:

Traceback (most recent call last):
  File "./build2.py", line 17, in <module>
    initresult = mmb.mmb_init_model(modelRebuild=True)
  File "/usr/local/lib/python2.7/dist-packages/mmbot/mmbot.py", line 712, in mmb_init_model
    newdoc_cnt = self.load_model_data(exclude)
  File "/usr/local/lib/python2.7/dist-packages/mmbot/mmbot.py", line 346, in load_model_data
    newdocs = temp[temp['extracted_vba'].isnull()].copy()
  File "/usr/local/lib/python2.7/dist-packages/pandas/core/frame.py", line 2139, in __getitem__
    return self._getitem_column(key)
  File "/usr/local/lib/python2.7/dist-packages/pandas/core/frame.py", line 2146, in _getitem_column
    return self._get_item_cache(key)
  File "/usr/local/lib/python2.7/dist-packages/pandas/core/generic.py", line 1842, in _get_item_cache
    values = self._data.get(item)
  File "/usr/local/lib/python2.7/dist-packages/pandas/core/internals.py", line 3843, in get
    loc = self.items.get_loc(item)
  File "/usr/local/lib/python2.7/dist-packages/pandas/core/indexes/base.py", line 2527, in get_loc
    return self._engine.get_loc(self._maybe_cast_indexer(key))
  File "pandas/_libs/index.pyx", line 117, in pandas._libs.index.IndexEngine.get_loc
  File "pandas/_libs/index.pyx", line 139, in pandas._libs.index.IndexEngine.get_loc
  File "pandas/_libs/hashtable_class_helper.pxi", line 1265, in pandas._libs.hashtable.PyObjectHashTable.get_item
  File "pandas/_libs/hashtable_class_helper.pxi", line 1273, in pandas._libs.hashtable.PyObjectHashTable.get_item
KeyError: u'extracted_vba'

The model builds fine if no model exists yet at the model path.

@egaus
Owner

egaus commented Mar 6, 2018

Hi @enzok, the current library doesn't support adding onto the canned model, only creating entirely new models from scratch. I intentionally removed the source macro data that went into the model because a number of samples contained sensitive information and I didn't want to leak it. Among other things, some of the data contained company names and database connection strings with credentials.

I'm hoping to revisit this library soon and refactor it. I'll look at ways I can allow for adding onto the model that comes with it, without providing all the raw source data that went into building the initial model. I realize incremental updates would be helpful for users, rather than starting from scratch.
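
For anyone hitting the same traceback: the KeyError is what pandas raises when a column is selected from a DataFrame that does not have it, which would be consistent with the shipped model data lacking an `extracted_vba` column (an assumption, not something confirmed in this thread). Below is a minimal sketch of that failure and of one possible guard. The `saved` frame, its columns, and the `rows_missing_vba` helper are hypothetical stand-ins for illustration and are not part of mmbot.

```python
import pandas as pd

# Hypothetical stand-in for saved model data with no 'extracted_vba' column.
# The real columns of the canned model are not shown in this issue.
saved = pd.DataFrame({
    "filename": ["a.doc", "b.doc"],
    "label": ["benign", "malicious"],
})

# Same selection pattern as the failing line in the traceback.
try:
    saved[saved["extracted_vba"].isnull()]
except KeyError as err:
    missing = err.args[0]  # name of the column pandas could not find


def rows_missing_vba(frame, column="extracted_vba"):
    """Return rows whose `column` is null.

    If the column is absent entirely, treat every row as missing it
    instead of raising KeyError.
    """
    if column not in frame.columns:
        return frame.copy()
    return frame[frame[column].isnull()].copy()
```

With this guard, a frame without the column returns all of its rows rather than failing. Whether those rows could then be re-processed depends on the original sample files being available, which is not the case for the canned model.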

@enzok
Author

enzok commented Aug 1, 2018

Hi @egaus, any chance you've had time to look at adding incremental updates to the model?

Cheers!
