
robots.txt templates

This is a collection of robots.txt templates.

What is robots.txt?

  • Robots.txt is a file that tells search engine crawlers which pages or files they can or cannot request from your site

What is a crawler?

  • A crawler is a program that browses the web automatically. It is used by search engines to update their web index.
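
Crawlers look for robots.txt at the root of your site (e.g. https://example.com/robots.txt) and check its rules before requesting other pages. Below is a minimal sketch of that check, using Python's standard urllib.robotparser module (the rules and the ExampleBot name are placeholders):

from urllib import robotparser

# Placeholder rules; a real crawler would download these
# from https://example.com/robots.txt instead
rules = """
User-agent: *
Disallow: /private/
"""

parser = robotparser.RobotFileParser()
parser.parse(rules.splitlines())

# The crawler asks before each request whether it may fetch the URL
print(parser.can_fetch("ExampleBot", "https://example.com/private/page.html"))  # False
print(parser.can_fetch("ExampleBot", "https://example.com/index.html"))         # True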

 

Add a comment in robots.txt

# This is a comment

Allow all

User-agent: *
Allow: /

Disallow all

User-agent: *
Disallow: /

Block a folder

User-agent: *
Disallow: /folder/

Block a file

User-agent: *
Disallow: /file.html

Block a file type

User-agent: *
Disallow: /*.pdf$

The * wildcard matches any sequence of characters and $ matches the end of the URL, so this blocks every PDF on the site. Wildcard matching is supported by major crawlers such as Googlebot and Bingbot, but not guaranteed for every crawler.

Allow only Google

User-agent: *
Disallow: /

User-agent: Googlebot
Allow: /
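
Crawlers follow the most specific User-agent group that matches them, so Googlebot obeys its own Allow rule here while every other crawler falls back to the * group and is blocked. A quick way to verify this, sketched with Python's standard urllib.robotparser (OtherBot is a made-up name):

from urllib import robotparser

rules = """
User-agent: *
Disallow: /

User-agent: Googlebot
Allow: /
"""

parser = robotparser.RobotFileParser()
parser.parse(rules.splitlines())

print(parser.can_fetch("Googlebot", "https://example.com/"))  # True
print(parser.can_fetch("OtherBot", "https://example.com/"))   # False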

Disallow only Google

User-agent: *
Allow: /

User-agent: Googlebot
Disallow: /

Link to your sitemap

User-agent: *
Sitemap: https://example.com/sitemap.xml

The Sitemap directive tells crawlers where to find your sitemap. It is independent of any User-agent group, so it can be placed anywhere in the file.

A sitemap is a file that lists the pages of your site. It is used by search engines to index your site.
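
For reference, a minimal sitemap following the Sitemaps XML protocol looks like this (the URLs are placeholders):

<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url>
    <loc>https://example.com/</loc>
  </url>
  <url>
    <loc>https://example.com/about</loc>
  </url>
</urlset>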

Slow down the crawler

User-agent: *
Crawl-delay: 10

The Crawl-delay directive asks the crawler to wait at least 10 seconds between requests to your site. Not every crawler honors it; Googlebot, for example, ignores Crawl-delay.
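
A polite crawler can read this value and pause between requests. A small sketch with Python's standard urllib.robotparser, whose crawl_delay method returns the value for a given user agent (ExampleBot and the URLs are placeholders):

import time
from urllib import robotparser

rules = """
User-agent: *
Crawl-delay: 10
"""

parser = robotparser.RobotFileParser()
parser.parse(rules.splitlines())

delay = parser.crawl_delay("ExampleBot")
print(delay)  # 10

for url in ("https://example.com/a", "https://example.com/b"):
    # ... fetch url here, then wait before the next request ...
    if delay:
        time.sleep(delay)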

 


User Agents

  • Googlebot - Used for Google Search
  • Bingbot - Used for Bing Search
  • Slurp - Yahoo's web crawler
  • DuckDuckBot - Used by the DuckDuckGo search engine
  • Baiduspider - Used by Baidu, a Chinese search engine
  • YandexBot - Used by Yandex, a Russian search engine
  • Facebot - Used by Facebook
  • Pinterestbot - Used by Pinterest
  • Twitterbot - Used by Twitter
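
Putting several of these templates together, one robots.txt can give different rules to different crawlers from the list above. A sketch (the paths are placeholders):

# Block everyone from the admin area
User-agent: *
Disallow: /admin/

# Keep Yahoo's crawler out entirely
User-agent: Slurp
Disallow: /

# Let Googlebot crawl everything
User-agent: Googlebot
Allow: /

Sitemap: https://example.com/sitemap.xml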
