"robots.txt" not working #5585

lcgkm · 2018-10-23T02:42:43Z

We found "robots.txt" file in Vault: https://vault.example.net/ui/robots.txt

File contents:
http://www.robotstxt.org
User-agent: *
Disallow: *

But it's not working, because a request for the URI "/robots.txt" returns a 404 error.
If "/robots.txt" returned a 3XX status code with the location "/ui/robots.txt" (or if a "robots.txt" file existed in the root "/"), it would work.
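
To make the two options above concrete, here is a minimal sketch written as a plain WSGI app. It is not Vault's code (Vault is written in Go); `make_app` and `ROBOTS_BODY` are hypothetical names, and the body is copied from the file contents quoted above.

```python
# Hypothetical sketch, not Vault's implementation.
# Body copied from the file contents quoted in this report.
ROBOTS_BODY = b"User-agent: *\nDisallow: *\n"


def make_app(redirect: bool):
    """Build a WSGI app that answers /robots.txt at the server root.

    redirect=True  -> reply 301 with Location: /ui/robots.txt
    redirect=False -> serve the file body directly from the root
    """
    def app(environ, start_response):
        path = environ.get("PATH_INFO", "")
        if path == "/robots.txt":
            if redirect:
                # Option 1: point crawlers at the copy shipped with the UI.
                # (This sketch does not serve /ui/robots.txt itself.)
                start_response(
                    "301 Moved Permanently",
                    [("Location", "/ui/robots.txt"), ("Content-Length", "0")],
                )
                return [b""]
            # Option 2: serve the file from the root path.
            start_response(
                "200 OK",
                [
                    ("Content-Type", "text/plain"),
                    ("Content-Length", str(len(ROBOTS_BODY))),
                ],
            )
            return [ROBOTS_BODY]
        # Every other path is out of scope for this sketch.
        start_response("404 Not Found", [("Content-Type", "text/plain")])
        return [b"not found\n"]

    return app
```

To try it locally, something like `wsgiref.simple_server.make_server("", 8000, make_app(False)).serve_forever()` should be enough; the port is arbitrary.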

meirish · 2018-10-29T15:15:43Z

Hi @lcgkm! I think this was an oversight on our part, since the file is part of the defaults in the UI framework we use. Can you describe your use case a bit more? I'm more inclined to remove it entirely, since if we added a redirect it would only be present if the UI was enabled.

lcgkm · 2018-10-30T01:47:21Z

How to create a /robots.txt file
Where to put it
The short answer: in the top-level directory of your web server.

The longer answer:

When a robot looks for the "/robots.txt" file for URL, it strips the path component from the URL (everything from the first single slash), and puts "/robots.txt" in its place.

For example, for "http://www.example.com/shop/index.html", it will remove the "/shop/index.html", and replace it with "/robots.txt", and will end up with "http://www.example.com/robots.txt".
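
The path-stripping rule quoted above can be sketched in a few lines of Python using the standard library's `urllib.parse`. The helper name `robots_url` is hypothetical, and this is only an illustration of the rule, not how any particular crawler implements it.

```python
from urllib.parse import urlsplit, urlunsplit


def robots_url(page_url: str) -> str:
    """Return the robots.txt URL a crawler would request for page_url."""
    parts = urlsplit(page_url)
    # Keep the scheme and host; replace the path and drop query/fragment.
    return urlunsplit((parts.scheme, parts.netloc, "/robots.txt", "", ""))


print(robots_url("http://www.example.com/shop/index.html"))
print(robots_url("https://vault.example.net/ui/robots.txt"))
```

Both calls resolve to "/robots.txt" at the root of the host, which is why a copy under "/ui/" is never consulted.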

So, as a web site owner you need to put it in the right place on your web server for that resulting URL to work. Usually that is the same place where you put your web site's main "index.html" welcome page. Where exactly that is, and how to put the file there, depends on your web server software.

Remember to use all lower case for the filename: "robots.txt", not "Robots.TXT".

Reference:
http://www.robotstxt.org/robotstxt.html

So a search engine like Google will check https://vault.example.net/robots.txt,
NOT https://vault.example.net/ui/robots.txt.
We don't need to add a redirect, but we do need to put "robots.txt" in the right place.

meirish · 2018-10-30T04:33:48Z

Yep, I'm familiar with robots.txt. The file is part of the UI code though (at least for now). Exposing Vault publicly is not generally recommended, so I was asking more about why you're doing that (if that's what's happening) so that we can solve the issue for you rather than jumping to the implementation. In the event of no robots.txt, a crawler wouldn't be authorized, and there's no site map, so it wouldn't know of other endpoints to visit.

lcgkm · 2018-10-30T06:27:32Z

> Exposing Vault publicly is not generally recommended

Yes, I totally agree with you. It's just a hypothetical: suppose someone makes a mistake and, as a result, Vault is exposed to a public network. (This is not a present/real situation.)

chrishoffman added the ui label Oct 25, 2018

meirish mentioned this issue Nov 5, 2018

serve robots.txt from the root when the UI is enabled #5686

Merged

chrishoffman closed this as completed in #5686 Nov 5, 2018

lcgkm mentioned this issue Nov 27, 2018

"robots.txt" not working hashicorp/consul#5005

Closed

hanshasselberg mentioned this issue Dec 13, 2018

Serve /robots.txt when UI is enabled. hashicorp/consul#5089

Merged
