better management of unconfigured analyzers #306
The way I see it, there are 2 cases here:
My suggestion:
On another note, we should use a JSON logging format (for example, python-json-logger) instead of just plain text. This would allow users to load the logs into Elastic's APM or other similar services. Thoughts?
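To make the suggestion concrete, here is a minimal stdlib-only sketch of what structured JSON logs would look like. It deliberately avoids the python-json-logger dependency by implementing an equivalent tiny formatter; the logger name and fields are illustrative, not IntelOwl's actual configuration.

```python
import json
import logging
from io import StringIO

# Stdlib-only stand-in for the JsonFormatter that python-json-logger
# provides: every record is serialized as one JSON object per line,
# which Logstash / Elastic APM can ingest without a grok filter.
class JsonFormatter(logging.Formatter):
    def format(self, record: logging.LogRecord) -> str:
        payload = {
            "time": self.formatTime(record),
            "level": record.levelname,
            "logger": record.name,
            "message": record.getMessage(),
        }
        return json.dumps(payload)

# Wire it up the same way you would wire python-json-logger's formatter.
stream = StringIO()  # stand-in for stdout / a log file
handler = logging.StreamHandler(stream)
handler.setFormatter(JsonFormatter())
logger = logging.getLogger("intel_owl_demo")  # hypothetical logger name
logger.addHandler(handler)
logger.setLevel(logging.INFO)

logger.info("analysis started")
print(stream.getvalue().strip())
```

With python-json-logger installed, the `JsonFormatter` class above would simply be replaced by the one the package ships.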
About the JSON logger: I think it could be helpful for some. However, it is already easy to ingest logs into Elasticsearch with the current log format, because it is pretty standard and would just require a simple grok filter + Logstash.
This is a good compromise to avoid making bigger changes. However, it seems a bit of a hack, because the application would lie to the user, who would think that those analyzers never executed.
I'll suggest one method, but you might get disappointed by the amount of work that has to be done for it; still, I feel it's the safe way. In the first case, I believe every API key follows a fixed regex pattern. We should include that pattern in analyzer_config.json and, before sending a request/starting an analysis, check whether the API key matches that regex. If not, set the status to Ignored/Unconfigured, and it shouldn't be counted as an error in the report. Work that needs to be done: we should identify the regex patterns of all existing API keys, and for future analyzers too.
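The regex idea above could be sketched like this. Note that the analyzer names and the patterns themselves are illustrative guesses for the sake of the example, not verified patterns for those services:

```python
import re

# Hypothetical mapping of analyzer name -> API-key regex, as proposed
# for analyzer_config.json. The patterns are placeholders, not the real
# key formats of these services.
API_KEY_PATTERNS = {
    "VirusTotal_v3_Get_Observable": r"[0-9a-f]{64}",
    "Shodan_Search": r"[A-Za-z0-9]{32}",
}

def key_matches_pattern(analyzer_name: str, api_key: str) -> bool:
    """Return True when the key looks plausible for this analyzer."""
    pattern = API_KEY_PATTERNS.get(analyzer_name)
    if pattern is None:
        # No known pattern yet: fall back to a simple presence check.
        return bool(api_key)
    return re.fullmatch(pattern, api_key) is not None
```

This also shows the maintenance cost discussed above: every new analyzer would need a pattern added (or it silently degrades to the presence check).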
I agree with the rest, but we shouldn't just rely on the presence of that API key/secret, because the recommended usage is to copy the template env file. I suggest performing a basic check, like whether that secret differs from the default value assigned in the template, or whether it matches the analyzer's regex pattern (the robust way, but it requires a lot of effort). Edit: I am mistaken, I was assuming the test env file, which has test as the API key. I got your idea: yes, a secret presence check will do the job.
Yes, we could just check whether the key is empty or not; that is the idea.
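The agreed presence check is tiny; a sketch, with an illustrative helper name and env-variable names rather than IntelOwl's actual ones:

```python
import os

# An analyzer counts as configured only when its secret environment
# variable is set and non-empty. Whitespace-only values are treated as
# empty too.
def is_configured(secret_name: str) -> bool:
    return bool(os.environ.get(secret_name, "").strip())

# Demo environment for the three interesting cases:
os.environ["DEMO_API_KEY"] = "some-real-key"  # set -> configured
os.environ["DEMO_EMPTY_KEY"] = ""             # present but empty -> not
os.environ.pop("DEMO_MISSING_KEY", None)      # absent -> not
```

The check would run once per analyzer before the analysis starts, instead of each analyzer failing mid-run.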
I'll work on this, if that's okay.
Sure, feel free, and thank you!
In this way, we are duplicating the API-key variable names. PS: Now that I think about this, it would involve tons of changes everywhere, including the clients, and is maybe overkill for the problem that we are trying to solve, which is just silencing unconfigured analyzers.
Whether we like it or not, all we can do is let the analyzer run (like it does now) and handle the exception, deciding whether we should show it to the user or not.
Yes, there would be an overhead, but is it really relevant? We are trying to solve a user-experience problem, and that is worth a few milliseconds of computation. I know that without a real benchmark this is difficult to judge, but I guess IntelOwl's bottleneck is not there: it is inside all the tools that the analyzers use and all the APIs that the analyzers interact with. Also, we could avoid reading the environment variables twice if we passed those values to the celery worker when we execute the analyzers. However, this would require small changes to all the analyzer modules. It would be painful work to do.
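The "read once, pass to the worker" idea could look roughly like this. This is a framework-free sketch: in IntelOwl the second function would be the body of a celery task, and all names here are illustrative:

```python
import os

# Read each secret once at dispatch time, so the worker does not have to
# re-read the environment inside every analyzer module.
def collect_secrets(secret_names):
    return {name: os.environ.get(name, "") for name in secret_names}

# Stand-in for the celery task body: the analyzer receives its key as an
# argument instead of calling os.environ.get() a second time.
def run_analyzer(analyzer_name, observable, secrets):
    api_key = secrets.get("DEMO_API_KEY", "")
    if not api_key:
        return {"analyzer": analyzer_name, "status": "unconfigured"}
    return {"analyzer": analyzer_name, "status": "running"}

os.environ["DEMO_API_KEY"] = "k"
secrets = collect_secrets(["DEMO_API_KEY"])
result = run_analyzer("Cuckoo_Scan", "1.2.3.4", secrets)
```

This is exactly the "little changes to all the analyzer modules" cost mentioned above: each analyzer's signature would gain a `secrets` argument.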
I feel this would be nice too. I was just thinking: analyzers that require API keys show up in the list when starting a job, even if they aren't configured properly. If the user forgets to set them up, they don't find out until they run the job and get the error in the result JSON. These unconfigured analyzers could be grayed out in the list, and hovering over them could give a tooltip telling the user exactly what they forgot to configure. For accurate tooltips, we might need objects like @eshaan7 mentioned, where at the time of adding an analyzer we specify the required configuration in terms of objects that we can check to know whether the analyzer has been fully configured. If any of those objects of that analyzer class are unconfigured/empty, we show that in the tooltip; when all have been configured, the analyzer is ungrayed and can be chosen. This would prevent any of the unconfigured analyzers from running accidentally or while using the "Run all available analyzers" option.
Yes, one way or another, that's our final goal: the user just shouldn't be able to select an unconfigured analyzer. Forgot to post the idea here. For now, I'm considering only the secret presence check.
This is an example analyzer object inside the new format of `analyzer_config.json`:

```json
"Cuckoo_Scan": {
    "type": "file",
    "disabled": false,
    "description": "scan a file on a Cuckoo instance",
    "python_module": "cuckoo_scan.CuckooAnalysis",
    "config": {
        "soft_time_limit": 500,
        "queue": "long",
        "max_post_tries": 5,
        "max_poll_tries": 20
    },
    "secrets": {
        "api_key_name": {
            "secret_name": "CUCKOO_API_KEY",
            "type": "string",
            "required": true,
            "default": null,
            "description": "API key for your cuckoo instance"
        }
    }
}
```

And this is how it would be stored in the cache after verification:

```json
"Cuckoo_Scan": {
    "type": "file",
    "disabled": false,
    "description": "scan a file on a Cuckoo instance",
    "python_module": "cuckoo_scan.CuckooAnalysis",
    "config": {
        "soft_time_limit": 500,
        "queue": "long",
        "max_post_tries": 5,
        "max_poll_tries": 20
    },
    "secrets": {
        "api_key_name": {
            "secret_name": "CUCKOO_API_KEY",
            "type": "string",
            "required": true,
            "default": null,
            "description": "API key for your cuckoo instance"
        }
    },
    "verification": {
        "configured": false,
        "error_message": "'api_key_name' unknown: expected string, got int (3 of 4 satisfied)",
        "missing_secrets": ["api_key_name"]
    }
}
```

Since the …
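The verification step that would populate the `"verification"` object above could be sketched like this. It is a hypothetical helper: it only performs the presence check on required secrets, not the full type validation hinted at by the example error message:

```python
import os

# Build the proposed "verification" object for one analyzer entry, using
# the field names from the JSON sketch above (simplified: presence check
# only, no type checking).
def verify_analyzer(entry: dict) -> dict:
    secrets = entry.get("secrets", {})
    missing = [
        name
        for name, spec in secrets.items()
        if spec.get("required") and not os.environ.get(spec["secret_name"], "")
    ]
    return {
        "configured": not missing,
        "error_message": None if not missing else f"missing secrets: {missing}",
        "missing_secrets": missing,
    }

entry = {
    "secrets": {
        "api_key_name": {"secret_name": "DEMO_CUCKOO_API_KEY", "required": True}
    }
}
os.environ.pop("DEMO_CUCKOO_API_KEY", None)  # simulate an unconfigured install
verification = verify_analyzer(entry)
```

Running this once at startup (and caching the result, as proposed) would give both the API and the GUI a single place to ask "is this analyzer runnable?".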
Regarding this closed issue (#298), we have a problem with those analyzers that require the use of an API key but either the user does not have the key or has not configured it.

Problems faced by users:

- they execute the "Run all available analyzers" option and get a lot of errors. This could be avoided by executing only the correctly configured analyzers.
- they see those analyzers as "runnable" but actually they are not and, when executed, they instantly crash. I think we should not allow the user to execute them without the configuration set but, at the same time, let the user understand from the GUI that those analyzers exist and can be appropriately configured (the `requires_configuration` flag could be enough for this).

This seems easier than it is. ATM the API keys are checked in the code of each analyzer, while this should be done *before* the analysis actually runs. I am open to suggestions on how to solve this problem.