🐛 The bug
I found this while debugging, as the issue affects my sitemap.
🛠️ To reproduce
https://www.cymrukitchens.com/sitemap_index.xml
🌈 Expected behavior
After deploying, the sitemap works fine. After some time (and the time varies), the result of the robots check changes for all pages.
You'll see in the debug output that this is the block which is triggered:
[screenshot of the debug output showing the triggered code block]
So it is reading the computed robots.txt, which is pretty simple:
https://www.cymrukitchens.com/robots.txt
# START nuxt-robots (indexable)
User-agent: *
Disallow: /*?s=*
# Block all from operational endpoints
User-agent: *
Disallow: /_cwa/*
# Block non helpful bots
User-agent: Nuclei
User-agent: WikiDo
User-agent: Riddler
User-agent: PetalBot
User-agent: Zoominfobot
User-agent: Go-http-client
User-agent: Node/simplecrawler
User-agent: CazoodleBot
User-agent: dotbot/1.0
User-agent: Gigabot
User-agent: Barkrowler
User-agent: BLEXBot
User-agent: magpie-crawler
Disallow: /
Sitemap: https://www.cymrukitchens.com/sitemap_index.xml
# END nuxt-robots
ℹ️ Additional context
I think it must be the section of the robots.txt that blocks non-helpful bots that triggers this. I'm going through this repo now to see if I can reproduce it locally, instead of waiting a while after every deploy to figure it out.
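For reference, here is a minimal sketch of how those directives should group under the Robots Exclusion Protocol (an illustration only, not nuxt-robots' actual parser): consecutive User-agent lines share the rule lines that follow them, so the trailing Disallow: / belongs only to the named bots, never to *.

```ts
// Minimal robots.txt group parser, for illustration only (assumption:
// this is NOT nuxt-robots' implementation). Consecutive User-agent
// lines form one group; the Disallow lines that follow belong to it.
interface Group { userAgents: string[]; disallow: string[] }

function parseGroups(robotsTxt: string): Group[] {
  const groups: Group[] = []
  let current: Group | null = null
  for (const raw of robotsTxt.split('\n')) {
    const line = raw.split('#')[0].trim() // drop comments and blanks
    if (!line) continue
    const idx = line.indexOf(':')
    if (idx === -1) continue
    const key = line.slice(0, idx).trim().toLowerCase()
    const value = line.slice(idx + 1).trim()
    if (key === 'user-agent') {
      // a User-agent line after any rule line starts a new group
      if (!current || current.disallow.length > 0) {
        current = { userAgents: [], disallow: [] }
        groups.push(current)
      }
      current.userAgents.push(value)
    } else if (key === 'disallow' && current) {
      current.disallow.push(value)
    }
  }
  return groups
}
// For the file above this yields three groups: "*" matches only the
// first two, so a generic agent is blocked from /*?s=* and /_cwa/*
// but NOT from /. If a matcher merged the groups, or attributed the
// trailing "Disallow: /" to "*", every page would report as blocked.
```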
In my module I do configure the robots.txt in a couple of places, mainly via the robots:config hook here:
https://github.com/components-web-app/cwa-nuxt-module/blob/main/src/runtime/server/server-plugin.ts
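For context, this is roughly the shape of that registration: a minimal sketch assuming a Nitro server plugin and assuming the robots:config hook receives the resolved config with a groups array; it is not copied from server-plugin.ts.

```ts
// A minimal sketch, NOT the actual cwa-nuxt-module code: assumes a
// Nitro runtime plugin and that the robots:config hook receives a
// config object with a groups array (the shape is an assumption).
import { defineNitroPlugin } from 'nitropack/runtime'

interface RobotsGroup { userAgent: string[]; disallow: string[] }

export default defineNitroPlugin((nitroApp) => {
  nitroApp.hooks.hook('robots:config', (config: { groups: RobotsGroup[] }) => {
    // block all agents from the module's operational endpoints
    config.groups.push({
      userAgent: ['*'],
      disallow: ['/_cwa/*'],
    })
  })
})
```

If the groups pushed here and the groups generated from the module options are merged in the wrong order at some point after deploy, that could explain why the computed robots.txt only flips after the site has been running for a while.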