Skip to content

Add RFC9116 security.txt check#730

Merged
mxsasha merged 5 commits into
mainfrom
securitytxt
Oct 4, 2022
Merged

Add RFC9116 security.txt check#730
mxsasha merged 5 commits into
mainfrom
securitytxt

Conversation

@mxsasha
Copy link
Copy Markdown
Collaborator

@mxsasha mxsasha commented Aug 12, 2022

  • Currently fails as STATUS_FAIL, maybe this should be a lighter one like notice or info? Explore using INFO for recommendations, WARN for others.
  • The sectxt library is not yet publicly available so this will not build in CI and can not be deployed.
  • Content TBD in https://github.com/internetstandards/Internet.nl_content/pull/23
  • The text in the tech table is directly from the library and can use a bit of style improvement (PR created)
  • Would be nice to have some domains that have broken security.txt files. If we don't know of any, I can just make one.
  • Some of the naming in the code feels clunky and might need another look.
  • Look into making tech table translatable?
  • We currently do not have a consistency check between different hosts (including v4/v6) -> out of scope for now, new issue
  • Raise file size limit to 100KB
  • Make sure we verify content-type and encoding upon retrieval
  • Accept missing charset, i.e. only text/plain for content-type
  • Look into recognition of 301/302 redirects for false text/html detection
  • Remove canonical check for now
  • Be resilient for invalid utf-8 characters
  • Test against some of https://findsecuritycontacts.com/
  • Move to a separate subsection
  • Show which domain we retrieved the final security.txt?
  • Re-enable 3.7 in CI once sectxt supports it
  • Update to a PyPI version of sectxt before merge
  • 1.6 release notes and deploy docs
  • Documentation including which checks are implemented on our side and paths/connections considered
  • Check the canonical field, by passing both first and last URL in redirects to the parser
  • Generate error for security.txt in the root
  • Refactor: extract network testing logic and add testing
  • Clean up PR commit log
  • Stop after invalid content-type
  • Fix IPv4-only site test invisibility
  • Final code review
  • Final commit cleanup

@mxsasha mxsasha self-assigned this Aug 12, 2022
@mxsasha mxsasha linked an issue Aug 12, 2022 that may be closed by this pull request
@mxsasha mxsasha marked this pull request as ready for review August 15, 2022 08:03
@mxsasha mxsasha requested a review from baknu August 15, 2022 12:26
@mxsasha mxsasha added this to the v1.6 milestone Aug 15, 2022
@mxsasha mxsasha added enhancement Issues that suggest slight improvements to existing code, tests, etc. content Change (needed) to the content repository alongside with this issue/PR labels Aug 15, 2022
@mxsasha mxsasha changed the title [WIP] Add RFC9116 security.txt check Add RFC9116 security.txt check Aug 16, 2022
Comment thread interface/batch/openapi.yaml Outdated
@mxsasha mxsasha force-pushed the securitytxt branch 4 times, most recently from e020bd9 to 2c37e96 Compare August 30, 2022 11:43
@mxsasha mxsasha force-pushed the securitytxt branch 2 times, most recently from d6886e1 to 35aa70e Compare September 15, 2022 15:19
@WKobes
Copy link
Copy Markdown
Collaborator

WKobes commented Sep 26, 2022

Small bug, the security.txt test is invisible when the IPv6 address is unreachable. See screenshot.
https://dev.internet.nl/site/belastingdienst.nl/28435/
https://dev.internet.nl/site/www.belastingdienst.nl/28434/

image

@WKobes
Copy link
Copy Markdown
Collaborator

WKobes commented Sep 27, 2022

@mxsasha another perhaps unwanted outcome:

Several governmental domain names only act as redirect. They ignore the path when requested, and just always redirect. This leads to a long list of errors since the front page is now parsed as security.txt

E.g. https://dev.internet.nl/site/minaz.nl/28443/

Perhaps check the Content-Type first, and if this fails stop testing?

@mxsasha
Copy link
Copy Markdown
Collaborator Author

mxsasha commented Sep 29, 2022

Perhaps check the Content-Type first, and if this fails stop testing?

Seems reasonable to me.

Small bug, the security.txt test is invisible when the IPv6 address is unreachable. See screenshot.

Will look into it.

@mxsasha mxsasha force-pushed the securitytxt branch 2 times, most recently from 5e0b02a to 80f74e7 Compare September 30, 2022 11:17
@mxsasha
Copy link
Copy Markdown
Collaborator Author

mxsasha commented Sep 30, 2022

Perhaps check the Content-Type first, and if this fails stop testing?

Seems reasonable to me.

Small bug, the security.txt test is invisible when the IPv6 address is unreachable. See screenshot.

Will look into it.

@WKobes both should be fixed in dev.internet.nl now :)

Comment thread interface/batch/openapi.yaml Outdated
Comment thread checks/tasks/tls_connection.py Outdated
Comment thread documentation/securitytxt.md
Comment thread interface/batch/__init__.py Outdated
Comment thread interface/batch/openapi.yaml Outdated
Comment thread interface/batch/openapi.yaml Outdated
@mxsasha mxsasha force-pushed the securitytxt branch 2 times, most recently from 3fdb213 to 9faceed Compare September 30, 2022 12:43
@WKobes
Copy link
Copy Markdown
Collaborator

WKobes commented Sep 30, 2022

@mxsasha Thanks!

Did some testing:
This redirect fails, I think because the 302 response includes a different Content-Type? The security.txt itself should be valid.
https://dev.internet.nl/site/digitaltrustcenter.nl/28463/

This test now no longer prints a long list of errors. However, it still lists "Located security.txt on www.rijksoverheid.nl", while it probably still attempted to parse the front page? Maybe we could say in these cases 1that we were not able to determine whether a security.txt was present?
https://dev.internet.nl/site/minaz.nl/28470/

@WKobes
Copy link
Copy Markdown
Collaborator

WKobes commented Sep 30, 2022

Security.txt test dissappears here still. Not sure why, security.txt appears to be valid.
https://dev.internet.nl/site/www.gov.uk/28479/

@mxsasha
Copy link
Copy Markdown
Collaborator Author

mxsasha commented Sep 30, 2022

Did some testing: This redirect fails, I think because the 302 response includes a different Content-Type? The security.txt itself should be valid. https://dev.internet.nl/site/digitaltrustcenter.nl/28463/

Fixed.

This test now no longer prints a long list of errors. However, it still lists "Located security.txt on www.rijksoverheid.nl", while it probably still attempted to parse the front page? Maybe we could say in these cases 1that we were not able to determine whether a security.txt was present? https://dev.internet.nl/site/minaz.nl/28470/

Yes, that message is meant to say where we ended up redirecting to, i.e. which domain we retrieved the final evaluated file from. It's a bit hard to determine for us or not a file was meant to be a security.txt - we just take that if someone redirects .well-known/security.txt to a URL, they intended for that to land on a security.txt.

Security.txt test dissappears here still. Not sure why, security.txt appears to be valid.
https://dev.internet.nl/site/www.gov.uk/28479/

Will check.

@mxsasha
Copy link
Copy Markdown
Collaborator Author

mxsasha commented Sep 30, 2022

Security.txt test dissappears here still. Not sure why, security.txt appears to be valid. https://dev.internet.nl/site/www.gov.uk/28479/

I think this one is fixed now too - let me know if you still see it. Seems to have been intermittent, hence it seeming fixed before when it was not.

@mxsasha mxsasha merged commit 83f0603 into main Oct 4, 2022
@mxsasha mxsasha deleted the securitytxt branch October 4, 2022 09:19
@mxsasha mxsasha mentioned this pull request Oct 5, 2022
3 tasks
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

content Change (needed) to the content repository alongside with this issue/PR enhancement Issues that suggest slight improvements to existing code, tests, etc.

Development

Successfully merging this pull request may close these issues.

Subtest for security.txt (in website test)

2 participants