Bot verification using shared secret header #2711

Closed

hdodov opened this issue Jan 19, 2024 · 1 comment

Comments

hdodov commented Jan 19, 2024

I'm not sure this is the right place or way to submit my idea. If I've made a mistake, please point me in the right direction.

Problem

It's difficult to allow search engines and only search engines to crawl your site:

  • User-agent sniffing doesn't work because the User-Agent header can be faked
  • IP allowlisting doesn't really work because crawler IPs may change over time, and maintaining IP lists is tedious

In my case, we're using Cloudflare — one of the leading cloud cybersecurity providers — and we have to rely on bot scores or verified bot lists to decide whether we let traffic in or out. I think this is a poor solution because:

  • The score is opaque: it's not entirely clear how it's computed
  • The outcome is unpredictable: you can't know in advance which bots will be allowed

Solution

There could be a simple Bot-Secret header that bots attach to their requests, so web servers can decide whether to allow them. For example:

  1. I verify my domain example.com in Google Search Console (GSC)

  2. I generate a bot secret in the GSC admin, e.g. q02u6O6H9vVtxpIscXUNTLT7AqHJeTed

  3. From now on, Googlebot should make requests to example.com with the following header attached:

    Bot-Secret: q02u6O6H9vVtxpIscXUNTLT7AqHJeTed
  4. In Cloudflare, I set up a WAF rule that allows requests whose Bot-Secret header equals q02u6O6H9vVtxpIscXUNTLT7AqHJeTed

This way, I can know that Googlebot and only Googlebot is allowed past the WAF because only it has that secret.
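
To make the idea concrete, here's a minimal sketch of the server-side check, written as Go standard-library middleware. The Bot-Secret header name and the secret value are the ones proposed above; everything else (the handler, the port) is illustrative:

    package main

    import (
        "crypto/subtle"
        "fmt"
        "net/http"
    )

    // The example secret from step 2; in practice it would be generated in
    // the search console and stored in server configuration.
    const botSecret = "q02u6O6H9vVtxpIscXUNTLT7AqHJeTed"

    // requireBotSecret rejects any request whose Bot-Secret header does not
    // match the shared secret, comparing in constant time to avoid leaking
    // information through response timing.
    func requireBotSecret(next http.Handler) http.Handler {
        return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
            got := []byte(r.Header.Get("Bot-Secret"))
            if subtle.ConstantTimeCompare(got, []byte(botSecret)) != 1 {
                http.Error(w, "forbidden", http.StatusForbidden)
                return
            }
            next.ServeHTTP(w, r)
        })
    }

    func main() {
        mux := http.NewServeMux()
        mux.HandleFunc("/", func(w http.ResponseWriter, r *http.Request) {
            fmt.Fprintln(w, "hello, verified crawler")
        })
        http.ListenAndServe(":8080", requireBotSecret(mux))
    }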


If this turns into an agreed-upon standard and is supported by crawlers and cloud providers, I could:

  1. Generate bot secrets in all the search console admins and the like
  2. List them in my cloud provider configuration or my own server middleware (sketched below)
  3. Be sure that exactly these bots, and nothing else, are allowed through
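
A rough sketch of what step 2 might look like in that middleware, with one entry per crawler. Both the secrets and the crawler names below are hypothetical placeholders:

    package main

    import "net/http"

    // knownBotSecrets maps each configured secret to the crawler it was
    // issued for. A production version would compare secrets in constant
    // time rather than using a plain map lookup.
    var knownBotSecrets = map[string]string{
        "q02u6O6H9vVtxpIscXUNTLT7AqHJeTed": "Googlebot",
        "JY3sN0b7Qf1mWr8kZxLcP5vTeGhUaD2i": "Bingbot",
    }

    // identifyBot reports which crawler, if any, presented a known secret.
    func identifyBot(r *http.Request) (string, bool) {
        name, ok := knownBotSecrets[r.Header.Get("Bot-Secret")]
        return name, ok
    }
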
reschke (Contributor) commented Jan 19, 2024

The git repo is for tracking issues in specs we work on (or have worked on).

For discussions, please use the WG's mailing list: https://lists.w3.org/Archives/Public/ietf-http-wg/

(That said: why not simply use HTTP auth for this?)
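
For comparison, a sketch of that alternative: the same shared secret carried in the existing Authorization header via HTTP Basic authentication, so no new header field is needed. The secret is the example value from above; the realm is illustrative:

    package main

    import (
        "crypto/subtle"
        "net/http"
    )

    const sharedSecret = "q02u6O6H9vVtxpIscXUNTLT7AqHJeTed"

    // requireBasicAuth performs the same check using HTTP Basic auth: the
    // crawler sends "Authorization: Basic base64(user:secret)" instead of
    // a new Bot-Secret field.
    func requireBasicAuth(next http.Handler) http.Handler {
        return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
            _, pass, ok := r.BasicAuth()
            if !ok || subtle.ConstantTimeCompare([]byte(pass), []byte(sharedSecret)) != 1 {
                w.Header().Set("WWW-Authenticate", `Basic realm="crawlers"`)
                http.Error(w, "unauthorized", http.StatusUnauthorized)
                return
            }
            next.ServeHTTP(w, r)
        })
    }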

mnot closed this as not planned on Jan 21, 2024