
Mongodb reporting error #9

Closed

enzok opened this issue Jun 21, 2017 · 13 comments

enzok commented Jun 21, 2017

Not sure what's going on here, any ideas?

    2017-06-21 15:21:20,505 [modules.reporting.mongodb] WARNING: results['procdump']['yara'] deleted due to >16MB size (29MB)
    2017-06-21 15:21:20,506 [lib.cuckoo.core.plugins] ERROR: Failed to run the reporting module "MongoDB":
    Traceback (most recent call last):
      File "/opt/cuckoo/lib/cuckoo/core/plugins.py", line 631, in process
        current.run(self.results)
      File "/opt/cuckoo/modules/reporting/mongodb.py", line 202, in run
        del report[parent_key][child_key]
    TypeError: list indices must be integers, not str

marirs commented Jun 21, 2017

Yes, that's the MongoDB document limit. If a document crosses the 16MB limit, Mongo cannot save it. The next step Cuckoo takes is to see if it can delete some key and then attempt the save again - but no luck here, so that particular analysis will not be saved into Mongo. If the JSON report was enabled, you should still have report.json inside storage/analysis//reports, but it won't be displayed in the UI.

This happens sometimes when you have lots of reporting output, which exceeds the Mongo document size limit.

To counter this somewhat, compressing the results was the solution.

If you have pulled the latest from Kevin's repo, you can try enabling compressresults in reporting.conf, restart cuckoo, and run that sample again.
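
For a rough idea of what the compression step does - a minimal sketch only, assuming zlib over the JSON-serialised value (the key name here is just an example, not the module's actual behaviour):

    import json
    import zlib

    # Illustration only: serialise a large results key and zlib-compress it so
    # the document handed to MongoDB stays under the 16MB BSON limit.
    results = {"behavior": {"summary": {"files": ["C:\\temp\\sample.exe"] * 200000}}}

    raw = json.dumps(results["behavior"]["summary"]).encode("utf-8")
    results["behavior"]["summary"] = zlib.compress(raw)

    print("%d bytes -> %d bytes" % (len(raw), len(results["behavior"]["summary"])))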

Let us know how that goes :)

enzok commented Jun 22, 2017

This already has compressresults enabled. I'm just curious why the delete failed. Could the delete failure be handled more gracefully, so that just the offending results key is omitted instead of the whole report failing to save?

@kevoreilly

Hi enzok, I agree this failure should be handled more gracefully. I'll try and work out a way to do this - if you can share a sample hash, please do.

enzok commented Jun 27, 2017

I modified mongodb.py with this code to remedy the issue (starting at ~ line 182):

    try:
        self.db.analysis.save(report)
    except InvalidDocument as e:
        parent_key, psize = self.debug_dict_size(report)[0]
        if not self.options.get("fix_large_docs", False):
            # Just log the error and problem keys
            log.error(str(e))
            log.error("Largest parent key: %s (%d MB)" % (parent_key, int(psize) / 1048576))
        else:
            # Delete the problem keys and check for more
            error_saved = True
            while error_saved:
                if type(report) == list:
                    report = report[0]

                try:
                    if type(report[parent_key]) == list:
                        for j, parent_dict in enumerate(report[parent_key]):
                            child_key, csize = self.debug_dict_size(parent_dict)[0]
                            del report[parent_key][j][child_key]
                            log.warn("results['%s']['%s'] deleted due to >16MB" % (parent_key, child_key))
                    else:
                        child_key, csize = self.debug_dict_size(report[parent_key])[0]
                        del report[parent_key][child_key]
                        log.warn("results['%s']['%s'] deleted due to >16MB" % (parent_key, child_key))

                    try:
                        self.db.analysis.save(report)
                        error_saved = False
                    except InvalidDocument as e:
                        parent_key, psize = self.debug_dict_size(report)[0]
                        log.error(str(e))
                        log.error("Largest parent key: %s (%d MB)" % (parent_key, int(psize) / 1048576))
                except Exception as e:
                    log.error("Failed to delete child key: %s" % str(e))
                    error_saved = False

    self.conn.close()

Correct me if I'm wrong, but I don't believe that procdump results are being compressed. I think when there are too many yara strings, the results grow too large.
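
For reference, a quick way to confirm which top-level key is the culprit (a rough sketch of the idea - not the actual debug_dict_size implementation):

    import json

    def key_sizes(report):
        # Hypothetical helper: serialised size of each top-level key, largest first.
        sizes = []
        for key, value in report.items():
            try:
                sizes.append((key, len(json.dumps(value, default=str))))
            except (TypeError, ValueError):
                sizes.append((key, 0))
        return sorted(sizes, key=lambda item: item[1], reverse=True)

    # key_sizes(report)[0] would then give e.g. ("procdump", <size in bytes>)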

@kevoreilly

Ah yes, I will look at adding compression to procdump output too, as well as implementing the fix you have kindly posted above.

Thanks for your help.

@kevoreilly

I have now pushed this fix and enabled compression for procdump. Please let me know if this fixes (or alleviates) this issue.

enzok commented Jul 3, 2017

Thank you.

enzok commented Jul 3, 2017

Will compressing the report results affect the Elasticsearch db (search only)? I noticed I'm now getting serialization errors when storing data into Elasticsearch.

@kevoreilly

Hmm possibly - I vaguely recall seeing problems previously with Elasticsearch and compression. Any chance you could provide some more details to help me try and narrow it down?

enzok commented Jul 3, 2017

It appears that the compressed data doesn't serialize. I added the following code to the elasticsearchdb.py reporting module and it solved the issue.

    import json
    import zlib

~ line 137:

        try:
            report["summary"] = json.loads(zlib.decompress(results.get("behavior", {}).get("summary")))
        except Exception:
            # Fall back to the raw value if the summary isn't compressed.
            report["summary"] = results.get("behavior", {}).get("summary")

marirs commented Jul 3, 2017

I would rather do it this way: since you don't want the compressed results to sit in Elasticsearch, and the views can parse the data whether it's compressed or not, you could swap the order of these two processing modules:

elasticsearchdb.py, line 25:
order = 9998
Change to order = 9997

compressresults.py, line 27:
order = 9997
Change to order = 9998

This way compressresults runs after the Elasticsearch report has been stored.
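
(Roughly how that works, as an illustration only - the report processor runs its reporting modules sorted by their order attribute, lowest first:)

    # Illustration only: modules run in ascending `order`, so after the swap
    # elasticsearchdb (9997) runs before compressresults (9998).
    modules = [("compressresults", 9998), ("elasticsearchdb", 9997)]
    for name, order in sorted(modules, key=lambda m: m[1]):
        print("%d %s" % (order, name))
    # -> 9997 elasticsearchdb
    #    9998 compressresults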

enzok commented Jul 3, 2017

That works for me. I completely forgot about being able to set the order.

@kevoreilly

Ah fantastic - thanks both for finding and fixing this. I will make this change now.

enzok closed this as completed Jul 3, 2017
kevoreilly added a commit that referenced this issue Mar 28, 2019
 Add certutil into suspicious windows tools