Security 2024 Updated security.txt Metric #125
  } else if (line.startsWith('Expires: ')) {
-   data['expires'] = line.substring(9);
+   data['expires'].push(line.substring(9).trim());
Why do these need to be arrays instead of just string values?
The specification states that some fields (expires and preferred-languages) are only allowed to occur once, while other fields are allowed to occur several times.
Using arrays lets us easily count how often each field occurs and, as a basic validity check, verify that the fields that are only allowed to occur once do not occur multiple times.
It also means we are not simply choosing the first or the last value if multiple entries exist. We could concatenate the values into one string in such cases, but that would make it harder to split them again if we want to ask whether a field occurs more than once.
Facebook uses two policies for example:
"policy": [
"https://www.facebook.com/whitehat/info/",
"https://about.meta.com/security/vulnerability-disclosure-policy"
],
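To make the array-based approach concrete, here is a minimal sketch of a parser that collects every occurrence of a field and then runs the basic validity check described above. The function names, the `KNOWN_FIELDS` list, and the `noDuplicates` key are illustrative assumptions, not the metric's actual code; treating `contact` and `expires` as the required fields is likewise an assumption based on RFC 9116.

```javascript
// Hypothetical sketch, not the code from this PR.
// Field names mirror the JSON output shown in this thread.
const KNOWN_FIELDS = ['contact', 'expires', 'encryption', 'acknowledgments',
  'preferred-languages', 'canonical', 'policy', 'hiring', 'csaf'];
const REQUIRED = ['contact', 'expires'];                 // assumption: per RFC 9116
const AT_MOST_ONCE = ['expires', 'preferred_languages']; // fields allowed only once

function parseSecurityTxt(text) {
  const data = { other: [] };
  for (const field of KNOWN_FIELDS) {
    data[field.replace('-', '_')] = [];
  }
  for (const rawLine of text.split('\n')) {
    const line = rawLine.trim();
    if (line === '' || line.startsWith('#')) continue; // skip blanks and comments
    const sep = line.indexOf(':');
    if (sep === -1) continue; // not a "Name: value" line
    const originalName = line.slice(0, sep).trim();
    const name = originalName.toLowerCase();
    const value = line.slice(sep + 1).trim();
    if (KNOWN_FIELDS.includes(name)) {
      // Every occurrence is kept, so duplicates can be counted later.
      data[name.replace('-', '_')].push(value);
    } else {
      // Unknown, future, or custom fields are stored as [name, value] pairs.
      data.other.push([originalName, value]);
    }
  }
  return data;
}

function checkValidity(data) {
  const allRequiredExist = REQUIRED.every((f) => data[f].length > 0);
  const noDuplicates = AT_MOST_ONCE.every((f) => data[f].length <= 1);
  return { allRequiredExist, noDuplicates, valid: allRequiredExist && noDuplicates };
}

// Usage: two Policy lines end up as two array entries.
const sample = [
  'Contact: https://www.facebook.com/whitehat/report/',
  'Expires: Thu, 04 Jul 2024 23:55:25 -0700',
  'Policy: https://www.facebook.com/whitehat/info/',
  'Policy: https://about.meta.com/security/vulnerability-disclosure-policy',
].join('\n');
const sampleData = parseSecurityTxt(sample);
console.log(sampleData.policy.length, checkValidity(sampleData).valid);
```

With a single string per field, the second `Policy` line would either overwrite the first or have to be concatenated; with arrays, the duplicate check reduces to a length comparison.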
This feels quite verbose when so many fields are empty:
"/.well-known/security.txt": {
"found": false,
"data": {
"status": 500,
"redirected": false,
"url": "https://example.com/.well-known/security.txt",
"signed": false,
"contact": [],
"expires": [],
"encryption": [],
"acknowledgments": [],
"preferred_languages": [],
"canonical": [],
"policy": [],
"hiring": [],
"csaf": [],
"other": [],
"all_required_exist": false,
"only_one_requirement_broken": false,
"valid": false
}
},
Can we only include the fields when they are present to reduce the storage and query size?
I don't know about the storage and query size of empty arrays. The keys are always the same, so some optimization might be possible. However, I adapted the query to only keep non-empty fields. I hope that does not make the query more complex.
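Keeping only the non-empty fields could look roughly like the sketch below. The helper name `dropEmptyFields` is hypothetical and the actual change in this PR may be implemented differently; the sketch only removes keys whose value is an empty array, so booleans and status codes are preserved.

```javascript
// Hypothetical helper, not the code from this PR:
// drops every key whose value is an empty array.
function dropEmptyFields(data) {
  return Object.fromEntries(
    Object.entries(data).filter(
      ([, value]) => !(Array.isArray(value) && value.length === 0)
    )
  );
}

// Usage: empty "contact" and "expires" arrays disappear, other keys stay.
const pruned = dropEmptyFields({
  status: 404,
  signed: false,
  contact: [],
  expires: [],
  policy: ['https://hackerone.com/slack/'],
});
console.log(Object.keys(pruned));
```

A consumer then has to treat a missing key as "field not present" rather than expecting an empty array.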
"/.well-known/security.txt": {
"found": false,
"data": {
"status": 404,
"redirected": false,
"url": "https://example.com/.well-known/security.txt",
"signed": false,
"other": [
[
"background-color",
"#f0f0f2;"
],
[
"margin",
"0;"
],
[
"padding",
"0;"
],
This was not great, as the inline CSS is detected as "other" directives.
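One way to avoid parsing HTML error pages as security.txt is to look at the response before parsing. The sketch below is an assumption about how such a guard could work, not the condition this PR actually uses: it only accepts a 200 response whose media type is `text/plain`, which is the content type RFC 9116 requires for security.txt. The function name is hypothetical.

```javascript
// Hypothetical guard, not the code from this PR:
// parse the body only for a 200 response served as text/plain.
function shouldParseAsSecurityTxt(status, contentType) {
  if (status !== 200) return false;
  // Ignore parameters such as "; charset=utf-8" and compare case-insensitively.
  const mediaType = (contentType || '').split(';')[0].trim().toLowerCase();
  return mediaType === 'text/plain';
}

// Usage with the content types that appear in the test results of this thread.
console.log(shouldParseAsSecurityTxt(404, 'text/html; charset=UTF-8'));
console.log(shouldParseAsSecurityTxt(200, 'text/plain;charset=utf-8'));
```

A stricter guard like this would also skip sites that serve a valid file with the wrong content type, so whether to apply it is a trade-off between clean data and coverage.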
Custom metrics for https://almanac.httparchive.org/en/2022/
WPT test run results: http://webpagetest.httparchive.org/results.php?test=240605_HZ_7
Custom metrics for https://example.com/
WPT test run results: http://webpagetest.httparchive.org/results.php?test=240605_J4_8
{
"_well-known": {
"/.well-known/assetlinks.json": {
"found": false
},
"/.well-known/apple-app-site-association": {
"found": false
},
"/.well-known/gpc.json": {
"found": false
},
"/robots.txt": {
"found": false
},
"/.well-known/security.txt": {
"found": false,
"data": {
"status": 404,
"redirected": false,
"url": "https://example.com/.well-known/security.txt",
"content_type": "text/html; charset=UTF-8"
}
},
"/.well-known/change-password": {
"found": false,
"data": {
"status": 404,
"redirected": false,
"url": "https://example.com/.well-known/change-password"
}
},
"/.well-known/resource-that-should-not-exist-whose-status-code-should-not-be-200/": {
"found": false,
"data": {
"status": 500,
"redirected": false,
"url": "https://example.com/.well-known/resource-that-should-not-exist-whose-status-code-should-not-be-200/"
}
}
}
}
Custom metrics for https://securitytxt.org/
WPT test run results: http://webpagetest.httparchive.org/results.php?test=240605_8D_9
{
"_well-known": {
"/.well-known/assetlinks.json": {
"found": false
},
"/.well-known/apple-app-site-association": {
"found": false
},
"/.well-known/gpc.json": {
"found": false
},
"/robots.txt": {
"found": true,
"data": {
"matched_disallows": {}
}
},
"/.well-known/security.txt": {
"found": true,
"data": {
"status": 200,
"redirected": false,
"url": "https://securitytxt.org/.well-known/security.txt",
"content_type": "text/plain; charset=utf-8",
"signed": false,
"contact": [
"https://hackerone.com/ed"
],
"expires": [
"2025-03-14T00:00:00.000Z"
],
"acknowledgments": [
"https://hackerone.com/ed/thanks"
],
"preferred_languages": [
"en, fr, de"
],
"canonical": [
"https://securitytxt.org/.well-known/security.txt"
],
"policy": [
"https://hackerone.com/ed?type=team&view_policy=true"
],
"all_required_exist": true,
"only_one_requirement_broken": false,
"valid": true
}
},
"/.well-known/change-password": {
"found": false,
"data": {
"status": 404,
"redirected": false,
"url": "https://securitytxt.org/.well-known/change-password"
}
},
"/.well-known/resource-that-should-not-exist-whose-status-code-should-not-be-200/": {
"found": false,
"data": {
"status": 404,
"redirected": false,
"url": "https://securitytxt.org/.well-known/resource-that-should-not-exist-whose-status-code-should-not-be-200/"
}
}
}
}
Custom metrics for https://facebook.com/
WPT test run results: http://webpagetest.httparchive.org/results.php?test=240605_BP_A
{
"_well-known": {
"/.well-known/assetlinks.json": {
"found": true
},
"/.well-known/apple-app-site-association": {
"found": true
},
"/.well-known/gpc.json": {
"found": false
},
"/robots.txt": {
"found": true,
"data": {
"matched_disallows": {
"Applebot": [
"/login.php*&next=",
"/login.php/?next=",
"/login.php?next=",
"/login/*&next=",
"/login/?next=",
"/login/device-based/regular/login/*&next=",
"/login/device-based/regular/login/?next=",
"/x/oauth/"
],
"baiduspider": [
"/login.php*&next=",
"/login.php/?next=",
"/login.php?next=",
"/login/*&next=",
"/login/?next=",
"/login/device-based/regular/login/*&next=",
"/login/device-based/regular/login/?next=",
"/x/oauth/"
],
"Bingbot": [
"/login.php*&next=",
"/login.php/?next=",
"/login.php?next=",
"/login/*&next=",
"/login/?next=",
"/login/device-based/regular/login/*&next=",
"/login/device-based/regular/login/?next=",
"/x/oauth/"
],
"Discordbot": [
"/login.php*&next=",
"/login.php/?next=",
"/login.php?next=",
"/login/*&next=",
"/login/?next=",
"/login/device-based/regular/login/*&next=",
"/login/device-based/regular/login/?next=",
"/x/oauth/"
],
"DuckDuckBot": [
"/login.php*&next=",
"/login.php/?next=",
"/login.php?next=",
"/login/*&next=",
"/login/?next=",
"/login/device-based/regular/login/*&next=",
"/login/device-based/regular/login/?next=",
"/x/oauth/"
],
"facebookexternalhit": [
"/login.php*&next=",
"/login.php/?next=",
"/login.php?next=",
"/login/*&next=",
"/login/?next=",
"/login/device-based/regular/login/*&next=",
"/login/device-based/regular/login/?next=",
"/x/oauth/"
],
"Googlebot": [
"/login.php*&next=",
"/login.php/?next=",
"/login.php?next=",
"/login/*&next=",
"/login/?next=",
"/login/device-based/regular/login/*&next=",
"/login/device-based/regular/login/?next=",
"/x/oauth/"
],
"Google-Extended": [
"/login.php*&next=",
"/login.php/?next=",
"/login.php?next=",
"/login/*&next=",
"/login/?next=",
"/login/device-based/regular/login/*&next=",
"/login/device-based/regular/login/?next=",
"/x/oauth/"
],
"Googlebot-Image": [
"/login.php*&next=",
"/login.php/?next=",
"/login.php?next=",
"/login/*&next=",
"/login/?next=",
"/login/device-based/regular/login/*&next=",
"/login/device-based/regular/login/?next=",
"/x/oauth/"
],
"GPTBot": [
"/login.php*&next=",
"/login.php/?next=",
"/login.php?next=",
"/login/*&next=",
"/login/?next=",
"/login/device-based/regular/login/*&next=",
"/login/device-based/regular/login/?next=",
"/x/oauth/"
],
"ia_archiver": [
"/login.php*&next=",
"/login.php/?next=",
"/login.php?next=",
"/login/*&next=",
"/login/?next=",
"/login/device-based/regular/login/*&next=",
"/login/device-based/regular/login/?next=",
"/x/oauth/"
],
"LinkedInBot": [
"/login.php*&next=",
"/login.php/?next=",
"/login.php?next=",
"/login/*&next=",
"/login/?next=",
"/login/device-based/regular/login/*&next=",
"/login/device-based/regular/login/?next=",
"/x/oauth/"
],
"msnbot": [
"/login.php*&next=",
"/login.php/?next=",
"/login.php?next=",
"/login/*&next=",
"/login/?next=",
"/login/device-based/regular/login/*&next=",
"/login/device-based/regular/login/?next=",
"/x/oauth/"
],
"Naverbot": [
"/login.php*&next=",
"/login.php/?next=",
"/login.php?next=",
"/login/*&next=",
"/login/?next=",
"/login/device-based/regular/login/*&next=",
"/login/device-based/regular/login/?next=",
"/x/oauth/"
],
"Pinterestbot": [
"/login.php*&next=",
"/login.php/?next=",
"/login.php?next=",
"/login/*&next=",
"/login/?next=",
"/login/device-based/regular/login/*&next=",
"/login/device-based/regular/login/?next=",
"/x/oauth/"
],
"Screaming Frog SEO Spider": [
"/login.php*&next=",
"/login.php/?next=",
"/login.php?next=",
"/login/*&next=",
"/login/?next=",
"/login/device-based/regular/login/*&next=",
"/login/device-based/regular/login/?next=",
"/x/oauth/"
],
"seznambot": [
"/login.php*&next=",
"/login.php/?next=",
"/login.php?next=",
"/login/*&next=",
"/login/?next=",
"/login/device-based/regular/login/*&next=",
"/login/device-based/regular/login/?next=",
"/x/oauth/"
],
"Slurp": [
"/login.php*&next=",
"/login.php/?next=",
"/login.php?next=",
"/login/*&next=",
"/login/?next=",
"/login/device-based/regular/login/*&next=",
"/login/device-based/regular/login/?next=",
"/x/oauth/"
],
"teoma": [
"/login.php*&next=",
"/login.php/?next=",
"/login.php?next=",
"/login/*&next=",
"/login/?next=",
"/login/device-based/regular/login/*&next=",
"/login/device-based/regular/login/?next=",
"/x/oauth/"
],
"TelegramBot": [
"/login.php*&next=",
"/login.php/?next=",
"/login.php?next=",
"/login/*&next=",
"/login/?next=",
"/login/device-based/regular/login/*&next=",
"/login/device-based/regular/login/?next=",
"/x/oauth/"
],
"Twitterbot": [
"/login.php*&next=",
"/login.php/?next=",
"/login.php?next=",
"/login/*&next=",
"/login/?next=",
"/login/device-based/regular/login/*&next=",
"/login/device-based/regular/login/?next=",
"/x/oauth/"
],
"Yandex": [
"/login.php*&next=",
"/login.php/?next=",
"/login.php?next=",
"/login/*&next=",
"/login/?next=",
"/login/device-based/regular/login/*&next=",
"/login/device-based/regular/login/?next=",
"/x/oauth/"
],
"Yeti": [
"/login.php*&next=",
"/login.php/?next=",
"/login.php?next=",
"/login/*&next=",
"/login/?next=",
"/login/device-based/regular/login/*&next=",
"/login/device-based/regular/login/?next=",
"/x/oauth/"
]
}
}
},
"/.well-known/security.txt": {
"found": true,
"data": {
"status": 200,
"redirected": false,
"url": "https://www.facebook.com/.well-known/security.txt",
"content_type": "text/plain;charset=utf-8",
"signed": false,
"contact": [
"https://www.facebook.com/whitehat/report/"
],
"expires": [
"Thu, 04 Jul 2024 23:55:25 -0700"
],
"acknowledgments": [
"https://www.facebook.com/whitehat/thanks/"
],
"policy": [
"https://www.facebook.com/whitehat/info/",
"https://about.meta.com/security/vulnerability-disclosure-policy"
],
"hiring": [
"https://www.metacareers.com/areas-of-work/security/"
],
"all_required_exist": true,
"only_one_requirement_broken": false,
"valid": true
}
},
"/.well-known/change-password": {
"found": true,
"data": {
"status": 200,
"redirected": true,
"url": "https://www.facebook.com/login.php?next=https%3A%2F%2Fwww.facebook.com%2F.well-known%2Fchange-password"
}
},
"/.well-known/resource-that-should-not-exist-whose-status-code-should-not-be-200/": {
"found": false,
"data": {
"status": 404,
"redirected": false,
"url": "https://www.facebook.com/.well-known/resource-that-should-not-exist-whose-status-code-should-not-be-200/"
}
}
}
}
Custom metrics for https://slack.com
WPT test run results: http://webpagetest.httparchive.org/results.php?test=240605_21_B
{
"_well-known": {
"/.well-known/assetlinks.json": {
"found": true
},
"/.well-known/apple-app-site-association": {
"found": true
},
"/.well-known/gpc.json": {
"found": false
},
"/robots.txt": {
"found": true,
"data": {
"matched_disallows": {
"*": [
"/oauth"
]
}
}
},
"/.well-known/security.txt": {
"found": true,
"data": {
"status": 200,
"redirected": false,
"url": "https://slack.com/.well-known/security.txt",
"content_type": "text/plain;charset=utf-8",
"signed": false,
"contact": [
"https://hackerone.com/slack/"
],
"policy": [
"https://hackerone.com/slack/"
],
"other": [
[
"Acknowledgements",
"https://hackerone.com/slack/thanks"
]
],
"all_required_exist": false,
"only_one_requirement_broken": false,
"valid": false
}
},
"/.well-known/change-password": {
"found": true,
"data": {
"status": 200,
"redirected": true,
"url": "https://slack.com/signin?redir=%2Faccount%2Fsettings"
}
},
"/.well-known/resource-that-should-not-exist-whose-status-code-should-not-be-200/": {
"found": false,
"data": {
"status": 404,
"redirected": true,
"url": "https://slack.com/.well-known/resource-that-should-not-exist-whose-status-code-should-not-be-200"
}
}
}
}
@tunetheweb Can this be merged before the crawl starts tomorrow? As written above, there might still be a very small number of sites with incorrect "other" values.
LGTM
Updated custom metric for HTTPArchive/almanac.httparchive.org#3604
Description of the changes:
Update the parsing of .well-known/security.txt to take all newly defined fields into account, save undefined, future, or custom fields, and add a basic check of whether the file is valid (all required fields exist, and no field that is only allowed to occur once occurs more than once).
Test websites: