Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Update the crawl.open_api plugin with additional parameters #17194

Merged

Conversation

Projects
None yet
2 participants
@artem-smotrakov
Copy link
Contributor

commented Aug 16, 2018

First of all, thanks for adding crawl.open_api which allows scanning REST APIs. Currently the plugin discovers API endpoints by parsing Open API specification which should be available on a target. If API spec was found in one of the predefined location, crawl.open_api plugin then validates the specification.

Unfortunately it sometimes happens that an application doesn't provide API specification. Furthermore, sometimes applications provide API specs which may be not quite valid, and as a result, validation fails. These two problems with applications may prevent w3af to scan them. Ideally, applications should provide a valid specification in one of the well-known locations but sometimes updating applications may take time and block scans. To prevent such delays, and make crawl.open_api a bit more flexible, I would like to propose the following parameters:

  • no_spec_validation for disabling Open API validation
  • custom_spec_location for providing a local path to Open API spec

I am having some troubles with running tests, so no tests so far. While I am figuring out what's going wrong, I'd like to get early feedback. Thanks!

@andresriancho

This comment has been minimized.

Copy link
Owner

commented Aug 16, 2018

Thanks for the PR.

I completely agree with the two things you mention: sometimes specs are invalid, and in some scenarios the spec might not be available on the same domain where the APIs are.

@andresriancho

This comment has been minimized.

Copy link
Owner

commented Aug 16, 2018

Comments added!

@artem-smotrakov

This comment has been minimized.

Copy link
Contributor Author

commented Aug 21, 2018

@andresriancho What comments did you mean? Am I missing something?

:param spec_dict: The output of load_spec_dict
:return: A Spec instance which holds all the dict information in an
accessible way.
"""
config = {'use_models': False}
if self.no_validation:
om.out.debug('Open API spec validation disabled')
config.update({

This comment has been minimized.

Copy link
@andresriancho

andresriancho Aug 21, 2018

Owner

I haven't played with these settings. What will happen when the validation fails? Are errors written somewhere? Maybe we should write them to the debug log?

This comment has been minimized.

Copy link
@artem-smotrakov

artem-smotrakov Aug 21, 2018

Author Contributor

If the validation fails, then basically Spec.from_dict() returns None, and as a result, no API endpoints are added for further testing. If I understand correctly, any exceptions below are logged only if debug mode is enabled. Here is an example of a error I saw (it may depend on issues in a particular API spec):

The document at "file:///path/to/swagger.yaml" is not a valid Open API specification. The following exception was raised while parsing the dict into a specification object: "('expected string or buffer', TypeError('expected string or buffer',))"

Although I didn't see any additional info. I am not sure if printing out a full stacktrace would help, and I also didn't find a way how to enable additional logs in bravado. Maybe we can always print a warning if the validation fails, not only when debug mode is enabled. What do you think?

This comment has been minimized.

Copy link
@andresriancho

andresriancho Aug 21, 2018

Owner

With that log line we're fine. Thanks for the detailed answer.

with open(self._custom_spec_location, 'r') as f:
custom_spec_as_string = f.read()

headers = Headers([('content-type', 'application/json')])

This comment has been minimized.

Copy link
@andresriancho

andresriancho Aug 21, 2018

Owner

I don't remember exactly how headers were used (or if they were) but openapi supports both yaml and json.

Could you check in the rest of the code if these headers will be used to decide how the contents of the file are parsed? If so, then the header should be dynamically generated based on the file extension?

This comment has been minimized.

Copy link
@artem-smotrakov

artem-smotrakov Aug 21, 2018

Author Contributor

Correct, both yaml and json are supported. It looks like the header is just ignored, or maybe it doesn't affect much parsing the spec because I tested it with yaml, and it worked well. I just followed your advice here

#15087 (comment)

I'll check it it makes sense to set the header with correct type, or if it's fine to drop it.

This comment has been minimized.

Copy link
@artem-smotrakov

artem-smotrakov Aug 22, 2018

Author Contributor

I see that OpenAPI.can_parse() checks Content-Type header but after some testing I don't think that Content-Type: application/json breaks processing an Open API spec in YAML format. Nevertheless, I think it would be better to specify a correct content type here at least to prevent further confusions.

h = ('By default, the plugin looks for the API specification on the target,',
' but sometimes applications do not provide an API specification. ',
' Set this parameter to specify a local path to the API specification.')
o = opt_factory('custom_spec_location', self._custom_spec_location, d, STRING, help=h)

This comment has been minimized.

Copy link
@andresriancho

andresriancho Aug 21, 2018

Owner

Take a look at the input types (STRING and BOOL are input types). There is one for input files which will make sure (during the configuration phase) that the file exists.

This comment has been minimized.

Copy link
@andresriancho

andresriancho Aug 21, 2018

Owner

Change this one and we're A-OK for merging 👍

This comment has been minimized.

Copy link
@artem-smotrakov

artem-smotrakov Aug 22, 2018

Author Contributor

Good point! I'll update it, thanks!

This comment has been minimized.

Copy link
@artem-smotrakov

artem-smotrakov Aug 22, 2018

Author Contributor

I updated the type to be INPUT_FILE. Now if the file doesn't exist, the plugin reports it during configuration. Then, crawling/testing go as usual, but most likely no API will be tested because no API spec provided. It looks to me as an expected behavior, but it can hide configuration issues if w3af is run with a script (-s option). Does w3af have an option to exit with a non-zero code if a configuration step failed? It would help to identify configuration issues (it's probably out of scope in this PR).

@andresriancho

This comment has been minimized.

Copy link
Owner

commented Aug 21, 2018

Sorry, forgot to click on submit.

@andresriancho

This comment has been minimized.

Copy link
Owner

commented Aug 22, 2018

Awesome @artem-smotrakov ! Thanks for contributing, keep those PR's coming!

@andresriancho andresriancho merged commit a4973d4 into andresriancho:develop Aug 22, 2018

@andresriancho

This comment has been minimized.

Copy link
Owner

commented Aug 22, 2018

giphy

@artem-smotrakov

This comment has been minimized.

Copy link
Contributor Author

commented Aug 22, 2018

Thanks for the review @andresriancho !

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
You can’t perform that action at this time.