Add support for new OO-style scanners to ./scan#232
Conversation
…er versions. This means a lot of code repetition, but we can hopefully cut that down over time.
| # Run locally. | ||
| logging.debug("Subclass (noopabc) __init__ method for %s." % domain) | ||
| self.extra_environment = {"variable": domain} | ||
| super().__init__(domain, handles, environment, options, extra) |
There was a problem hiding this comment.
This is more than I'm comfortable seeing scanner writers have to remember to do. The original function signature is just init(environment, options). I'd prefer to see an __init__ method can go in the base class as a non-abstract method, and init remain as init(self, environment, options).
There was a problem hiding this comment.
I'm not sure why they have to remember any of this; the __init__ method is in the base class and they would only need to write this for their scanner if they have per-domain functionality. Having __init__ and init methods seems very confusing; if we want to take the per-domain out of __init__, we could move it to a method called per_domain or something similar.
Also, I'm assuming we would have a commented template for scanners that writers would be able to start from, rather than expecting them to remember what the methods and signatures are.
| # that use the network or are otherwise expensive would go. | ||
| # | ||
| # Runs locally or in the cloud (Lambda). | ||
| domain = self.domain # noqa |
There was a problem hiding this comment.
This is a stray line from earlier testing, I'll remove it.
| pass | ||
|
|
||
| @abstractmethod | ||
| def to_rows(self, data): |
There was a problem hiding this comment.
Is this meant to have a declaration of the return type?
There was a problem hiding this comment.
Yes, I'll add that in.
| # Some subclasses may define an ``extra_environment`` property in their | ||
| # __init__ methods, and we want to fold its values into the | ||
| # ``environment`` property. | ||
| self.environment.update(getattr(self, "extra_environment", {})) |
There was a problem hiding this comment.
There hasn't been an extra or extra_environment value used in scanners. I'd prefer to keep the number of moving parts smaller here if possible.
There was a problem hiding this comment.
The current structure of scan effectively passes around global variables that act as an extra environment. It's possible there's an elegant way to refactor it, but I've tried to change the structure of scan only minimally in this first pass. I'll take another pass through it and try to simplify this.
| @classmethod | ||
| @abstractmethod | ||
| def initialize_environment(cls: Type[ScannerABCT], environment: dict, | ||
| options: dict) -> dict: |
There was a problem hiding this comment.
I would prefer to see the call signature for scanners stay as it's been. I would also like to avoid passing TypeVar's into individual scanners, with the goal of keeping the task of writing a scanner as simple and approachable as possible.
There was a problem hiding this comment.
I'm not sure why this is a factor; I'm assuming that we'll have a template for scanners that writers will copy from, meaning they won't have to worry about this type declaration. As for calling this class method, it remains the same, i.e. scan_class.initialize_environment(scan_class, environment, options). Am I missing a consideration here?
There was a problem hiding this comment.
I'm assuming that we'll have a template for scanners that writers will copy from, meaning they won't have to worry about this type declaration.
I'm still assuming scanners will be written by hand, and would like that to be as accessible as possible. But even when using a template, I'd like it to be clear what's going on in the file, and what each function means and does.
There was a problem hiding this comment.
As for calling this class method, it remains the same, i.e. scan_class.initialize_environment(scan_class, environment, options). Am I missing a consideration here?
I think so? There's never been an initialize_environment(scan_class, environment, options) function. Only an init(environment, options) function.
…aration for turning gather into its own project. Refactor new scan OO appraoch so that what was “overall scan: class/per-domain: instance” is now “overall scan: instance/per-domain: instance.scan()”.
…the only values in handles will be scanner details, and that we know that Python 3 will preserve dict order.
|
@tadhg-ohiggins Is this ready for merge yet? I'm comfortable moving forward with the PR as it is. |
| 'duration': scan_utils.just_microseconds(duration), | ||
| 'durations': durations, | ||
| 'command': start_command, | ||
| 'scan_uuid': scan_uuid |
There was a problem hiding this comment.
Assigning this got moved lower in the script. Is this still set by this time?
There was a problem hiding this comment.
Do you mean the variables durations, start_command, and scan_uuid? If so, yes, they're still in scope at this point. (start_command is still a global variable.)
…3.6; back it out and check for path existence specifically.
… Path.open instead in domains_from.
The old and the new
Sometimes have to co-exist.
Let fitness decide.
This adds support for a new style of object-oriented scanner but still supports the older style. This means a lot of code repetition, but we can hopefully cut that down over time.
Supporting both for a while also means that we can incrementally switch them over.
This initial checkin only added an abstract base class approach for
noop.pyin the form ofnoopabc.pyas a proof-of-concept.