
fix: parsing robots config erroneously returns indexable as false #233

@silverbackdan

Description


πŸ› The bug

I found this while debugging, because the issue affects my sitemap:

nuxt-modules/sitemap#466

🛠️ To reproduce

https://www.cymrukitchens.com/sitemap_index.xml

🌈 Expected behavior

After deploying, the sitemap works fine. After some time (and the time varies), the robots-configuration check for all pages changes and they start being reported as non-indexable.

You'll see in the debug output that this is the block being triggered:

https://github.com/nuxt-modules/robots/blob/main/src/runtime/server/composables/getPathRobotConfig.ts#L51-L59


So it is reading the computed robots.txt, which is pretty simple:
https://www.cymrukitchens.com/robots.txt
https://www.cymrukitchens.com/robots.txt

# START nuxt-robots (indexable)
User-agent: *
Disallow: /*?s=*

# Block all from operational endpoints
User-agent: *
Disallow: /_cwa/*

# Block non helpful bots
User-agent: Nuclei
User-agent: WikiDo
User-agent: Riddler
User-agent: PetalBot
User-agent: Zoominfobot
User-agent: Go-http-client
User-agent: Node/simplecrawler
User-agent: CazoodleBot
User-agent: dotbot/1.0
User-agent: Gigabot
User-agent: Barkrowler
User-agent: BLEXBot
User-agent: magpie-crawler
Disallow: /

Sitemap: https://www.cymrukitchens.com/sitemap_index.xml
# END nuxt-robots
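For reference, a correct matcher should scope that final `Disallow: /` to the named bots only, never to `*`. Here is a minimal standalone sketch of group-scoped matching (my own illustration of the expected behaviour, not the module's actual code):

```typescript
interface Group {
  agents: string[]
  disallow: string[]
}

// Parse robots.txt into groups: consecutive User-agent lines share the
// rules that follow; the next User-agent line after a rule starts a new group.
function parseGroups(txt: string): Group[] {
  const groups: Group[] = []
  let current: Group | null = null
  let sawRules = false
  for (const raw of txt.split('\n')) {
    const line = raw.replace(/#.*$/, '').trim()
    if (!line) continue
    const idx = line.indexOf(':')
    if (idx === -1) continue
    const key = line.slice(0, idx).trim().toLowerCase()
    const value = line.slice(idx + 1).trim()
    if (key === 'user-agent') {
      if (!current || sawRules) {
        current = { agents: [], disallow: [] }
        groups.push(current)
        sawRules = false
      }
      current.agents.push(value)
    } else if (key === 'disallow' && current) {
      current.disallow.push(value)
      sawRules = true
    }
  }
  return groups
}

// Match a robots.txt pattern ('*' is a wildcard) against a path.
function matchesPattern(pattern: string, path: string): boolean {
  if (!pattern) return false // an empty Disallow blocks nothing
  const escaped = pattern
    .split('*')
    .map(s => s.replace(/[.*+?^${}()|[\]\\]/g, '\\$&'))
    .join('.*')
  return new RegExp('^' + escaped).test(path)
}

// A path is indexable for an agent unless a group that applies to that
// agent (by name, or via '*') disallows it.
function isIndexable(txt: string, userAgent: string, path: string): boolean {
  const ua = userAgent.toLowerCase()
  for (const group of parseGroups(txt)) {
    const applies = group.agents.some(
      a => a === '*' || ua.includes(a.toLowerCase()),
    )
    if (applies && group.disallow.some(p => matchesPattern(p, path)))
      return false
  }
  return true
}
```

With the robots.txt above, this keeps `/about` indexable for Googlebot while still blocking PetalBot everywhere and `?s=` search pages for all agents.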

ℹ️ Additional context

I think it must be the section that blocks non-helpful bots that triggers this. I'm going through this repo now to see if I can reproduce it locally, instead of waiting a while after every deploy to figure it out.
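To illustrate the symptom (this is an assumption about the failure mode, not something I've confirmed in the module source): if a matcher flattens every `Disallow` in the file into one list instead of scoping rules to their `User-agent` group, the bot-only `Disallow: /` applies to every agent and every path reports indexable as false:

```typescript
// Hypothetical buggy matcher: wildcard matching is fine, but rules are
// no longer scoped to their User-agent group, so the named-bot group's
// 'Disallow: /' is checked against every request.
function isIndexableFlattened(txt: string, path: string): boolean {
  const patterns: string[] = []
  for (const raw of txt.split('\n')) {
    const line = raw.replace(/#.*$/, '').trim()
    const m = /^disallow:\s*(.+)$/i.exec(line)
    if (m) patterns.push(m[1])
  }
  return !patterns.some((p) => {
    const escaped = p
      .split('*')
      .map(s => s.replace(/[.*+?^${}()|[\]\\]/g, '\\$&'))
      .join('.*')
    return new RegExp('^' + escaped).test(path)
  })
}
```

With the robots.txt above, this returns false for `/about` (the observed symptom), while the same file without the named-bot group would return true.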

In my module I configure the robots.txt file in a couple of places, mainly via the robots:config hook:

https://github.com/components-web-app/cwa-nuxt-module/blob/main/src/runtime/server/server-plugin.ts
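For context, the hook usage looks roughly like this (a simplified sketch, with the group shape assumed from the nuxt-robots docs; the agent list here is illustrative):

```typescript
import { defineNitroPlugin } from 'nitropack/runtime'

export default defineNitroPlugin((nitroApp) => {
  // Append an extra rule group when nuxt-robots assembles its config.
  nitroApp.hooks.hook('robots:config', (config) => {
    config.groups.push({
      comment: ['Block non helpful bots'],
      userAgent: ['PetalBot', 'BLEXBot' /* ... */],
      disallow: ['/'],
    })
  })
})
```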
