Add textlint and its config #428
Conversation
|
@sotayamashita why is this in draft? And is this the same action as the action created by @kingthorin for the WSTG? (no issues with that, just asking). |
|
Seems to be the same config file, textlint seems included in package.json. There does not appear to be any invocation/use though. |
I am sorry that I cannot have time to complete it.
Yes, it is. I just copied from @kingthorin's WSTG. |
| yarn.lock |
sotayamashita
Jun 21, 2020
Author
Contributor
I thought it was our policy not to commit these lock files, so I added them.
I thought it was our policy not to commit these lock files, so I added them.
|
Hi @sotayamashita. Thanks for this PR - especially on the older cheat sheets there's a lot of...interesting capitalisation, so adding linting to prevent that in future is certainly a good thing. The amount of changes in this PR shows how inconsistent some of it was before. While I'm happy with the capitalisation changes (apart from a few exceptions flagged in the review), the language changes are more of a problem. These have clearly been done with a find/replace, which has caused various problems such as incorrect capitalisation, broken sentence structure (for example, "allow list" is not a verb, but it's used by one in some places), renaming sections (which would break any links to them), and incorrectly naming external links to other pages or projects. We had a similar issue in #402, which ended up with some very strange phrasing such as describing a systems "accessible to administer". Some of these changes are harmless as they only appear in text and actually read better (such as "he/she" to "they"), but others are more questionable, such as "regex" to "regular expression" (the term "regex" is actually more widely used, and there are some references to specific projects using that word which are now incorrect), and most significantly the use allow/block list. If these kind of language changes are going to be implemented, that needs to be discussed and done properly (rewriting sentence structure, updating the style guidelines, ensuring that the language is consistent with other projects and references, fixing links broken by section title changes, etc), rather than just as a unilateral change made in a PR about linting. My thoughts would be to remove those specific terms from the linter (and revert the associated changes), and then once this is merged the list of terms that we're checking for can gradually evolve over time (and as we spot common errors in the cheat sheets). @ThunderSon @mackowski what are your thoughts? |
| @@ -76,7 +76,7 @@ When it comes to error handling, the Hibernate Validator returns a `BindingResul | |||
|
|
|||
| **Use**: | |||
|
|
|||
| Checks if the annotated string matches the regular expression regex considering the given flag match. Please visit [OWASP Validation Regex Repository](https://owasp.org/www-community/OWASP_Validation_Regex_Repository) for other useful regex's. | |||
| Checks if the annotated string matches the regular expression regular expression considering the given flag match. Please visit [OWASP Validation regular expression Repository](https://owasp.org/www-community/OWASP_Validation_Regex_Repository) for other useful regex's. | |||
rbsec
Jun 21, 2020
Contributor
The project is called "OWASP Validation Regex Repository" - so this change is incorrect. I've seen a couple like this - perhaps the linting should ignore [link text]?
The project is called "OWASP Validation Regex Repository" - so this change is incorrect. I've seen a couple like this - perhaps the linting should ignore [link text]?
| // List of terms | ||
| "terms": [ | ||
| // Brands | ||
| "Airbnb", |
rbsec
Jun 21, 2020
Contributor
Why does this include random unrelated companies that have no relevant to the project? Seems like advertising..
Why does this include random unrelated companies that have no relevant to the project? Seems like advertising..
kingthorin
Jun 21, 2020
Contributor
This is an upstream carry-over which is itself an upstream (or well source) carryover.
This part of the list contains brand names to ensure they're properly capitalized. It could be trimmed, though obviously the thing is they won't be caught if they aren't in the config. So while they might not be applicable at the moment that doesn't mean they won't be in the future.
This is an upstream carry-over which is itself an upstream (or well source) carryover.
This part of the list contains brand names to ensure they're properly capitalized. It could be trimmed, though obviously the thing is they won't be caught if they aren't in the config. So while they might not be applicable at the moment that doesn't mean they won't be in the future.
| ``` | ||
| Please note there are some JavaScript functions that can never safely use untrusted data as input - **EVEN IF JAVASCRIPT ESCAPED!** | ||
| Please note there are some JavaScript functions that can never safely use untrusted data as input - **EVEN IF JavaScript ESCAPED!** |
rbsec
Jun 21, 2020
Contributor
This sentence was in capitals for emphasis, so putting JavaScript in proper case in the middle of it looks wrong.
This sentence was in capitals for emphasis, so putting JavaScript in proper case in the middle of it looks wrong.
kingthorin
Jun 21, 2020
Contributor
I agree this causes it to look out of place. However, I'd also suggest there must be a better way to emphasize things than to YELL at the reader 😉
I agree this causes it to look out of place. However, I'd also suggest there must be a better way to emphasize things than to YELL at the reader
sotayamashita
Jul 2, 2020
•
Author
Contributor
I'll use https://github.com/textlint/textlint-filter-rule-comments to prevent it from being converted
I'll use https://github.com/textlint/textlint-filter-rule-comments to prevent it from being converted
| @@ -410,9 +410,9 @@ The `X-XSS-Protection` header has been deprecated by modern browsers and its use | |||
| | Data Type | Context | Code Sample | Defense | | |||
| |-----------|------------------------------------------|--------------------------------------------------------------------------------------------------------------------|------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------| | |||
| | String | HTML Body | `<span>UNTRUSTED DATA </span>` | HTML Entity Encoding (rule \#1). | | |||
| | String | Safe HTML Attributes | `<input type="text" name="fname" value="UNTRUSTED DATA ">` | Aggressive HTML Entity Encoding (rule \#2), Only place untrusted data into a whitelist of safe attributes (listed below), Strictly validate unsafe attributes such as background, id and name. | | |||
| | String | Safe HTML Attributes | `<input type="text" name="fname" value="UNTRUSTED DATA ">` | Aggressive HTML Entity Encoding (rule \#2), Only place untrusted data into a allow list of safe attributes (listed below), Strictly validate unsafe attributes such as background, ID and name. | | |||
rbsec
Jun 21, 2020
Contributor
The HTML property is "id", not "ID".
The HTML property is "id", not "ID".
kingthorin
Jun 21, 2020
Contributor
"an allow..."
id could be inline code fenced with backticks.
"an allow..."
id could be inline code fenced with backticks.
| | String | GET Parameter | `<a href="/site/search?value=UNTRUSTED DATA ">clickme</a>` | URL Encoding (rule \#5). | | ||
| | String | Untrusted URL in a SRC or HREF attribute | `<a href="UNTRUSTED URL ">clickme</a> <iframe src="UNTRUSTED URL " />` | Canonicalize input, URL Validation, Safe URL verification, Whitelist http and https URL's only (Avoid the JavaScript Protocol to Open a new Window), Attribute encoder. | | ||
| | String | Untrusted URL in a SRC or HREF attribute | `<a href="UNTRUSTED URL ">clickme</a> <iframe src="UNTRUSTED URL " />` | Canonicalize input, URL Validation, Safe URL verification, allow list http and HTTPS URLs only (Avoid the JavaScript Protocol to Open a new Window), Attribute encoder. | |
rbsec
Jun 21, 2020
Contributor
This sentence doesn't make sense "allow list" is not a verb
This sentence doesn't make sense "allow list" is not a verb
kingthorin
Jun 21, 2020
Contributor
The whole sentence is horrible to start with.
I'd suggest a re-write such as:
Canonicalize input, validate URLs, verify safe URLs, allow only HTTP(S) URLs (avoid the JavaScript protocol to open a new window), use an attribute encoder.
(That might not be perfect, I haven't read through the entire content for context. But, hopefully you get the idea.)
The whole sentence is horrible to start with.
I'd suggest a re-write such as:
Canonicalize input, validate URLs, verify safe URLs, allow only HTTP(S) URLs (avoid the JavaScript protocol to open a new window), use an attribute encoder.
(That might not be perfect, I haven't read through the entire content for context. But, hopefully you get the idea.)
| // "server side" | ||
| //], | ||
| [ | ||
| "smartphone(s)?", |
rbsec
Jun 21, 2020
Contributor
These words are not synonyms, "smartphone" is a subset of "mobile phone". Should be removed.
These words are not synonyms, "smartphone" is a subset of "mobile phone". Should be removed.
sotayamashita
Jul 11, 2020
Author
Contributor
Removed.
Removed.
| "build system(s)?", | ||
| "build tool$1" | ||
| ], | ||
| //[ |
rbsec
Jun 21, 2020
Contributor
Remove?
Remove?
sotayamashita
Jul 2, 2020
•
Author
Contributor
Removed.
Removed.
| "jQuery", | ||
| "LinkedIn", | ||
| "Lodash", | ||
| "MacBook", |
rbsec
Jun 21, 2020
Contributor
Again, why random brands in here?
Again, why random brands in here?
| "WebAssembly", | ||
| "WebStorm", | ||
| "WordPress", | ||
| "YouTube", |
rbsec
Jun 21, 2020
Contributor
I'm not too familiar with textlint, but my understanding of the format here is that:
- Individual words/phrases must have that exact capitalisation or they're be flagged
- Arrays will flag the first term (which is a regex) as incorrect, and recommend the second term as a replacement
Is this correct?
If so, it would be good to have a short comment at the top of this file explaining that, so it's clear how to add new terms
I'm not too familiar with textlint, but my understanding of the format here is that:
- Individual words/phrases must have that exact capitalisation or they're be flagged
- Arrays will flag the first term (which is a regex) as incorrect, and recommend the second term as a replacement
Is this correct?
If so, it would be good to have a short comment at the top of this file explaining that, so it's clear how to add new terms
sotayamashita
Jul 2, 2020
Author
Contributor
I am gonna investigate how to do that.
I am gonna investigate how to do that.
|
@sotayamashita will you be able to fix issues that @rbsec highlighted and read all the changes again to review their correctness? |
|
@mackowski I am sorry not to replay quickly. I am gonna fix it until tomorrow. @rbsec Thank you for the detailed review! |
|
No problem at all, thank you! |
|
I am sorry I am working now. |
| @@ -400,7 +400,7 @@ Access issues detected using the BASIC USER point of view: | |||
|
|
|||
| Even if the authorization matrix is stored in a human readable format (XML), it can be interesting to provide an on-the-fly rendering representation of the XML file in order to facilitate the review, audit and discussion about the authorization matrix in order to spot potential inconsistencies. | |||
|
|
|||
| The Following XSL stylesheet can be used: | |||
| The Following XSL style sheet can be used: | |||
sotayamashita
Jul 11, 2020
Author
Contributor
Should be stylesheet
Should be stylesheet
|
Thanks for the changes - I'd made a few small comments where the grammar has ended up a bit strange or where a few changes were missed when reverting. There's still a few comments left in the textlintrc file (would be nice to have a few comments in there explaining how to add stuff, and cleaning out some junk terms we don't need), but other than those things this looks nearly ready to merge. Thanks |
|
@rbsec Thank you for taking the time to read through it, even though I made some major changes. |
|
@sotayamashita thanks for making those changes - looks good to me. |
|
@kingthorin and @rejahrehim when possible, your review on the |
|
Looks good to me. |
|
If you want to tackle the "client-" / "server-" side thing the Web Security Testing Guide has settled on hyphenated variants after twitter and slack polling. Details here: OWASP/wstg#526 |
|
@kingthorin I'd agree on the hyphenated versions (which seem to be much more popular). Let's get this merged first, then we can start adding in extra terms like that. |
This PR covers issue #413