extract_urls_from_sitemap_index

Scrapes every URL from a sitemap index or a single sitemap.xml. The script takes one parameter: the URL of the sitemap index (or sitemap). Only XML sitemaps are supported. The output is an Excel file with three columns:

  • ID
  • Sitemap: the sitemap in which the URL was found
  • Url: each URL that appears in the sitemap(s)
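
The behaviour described above can be sketched roughly as follows. This is not the script from this repository, only a minimal illustration under some assumptions: Python with the standard library only, the standard sitemaps.org XML namespace, and IDs numbered from 1. The names `fetch`, `page_urls` and `extract_rows` are hypothetical.

```python
import xml.etree.ElementTree as ET
from typing import Callable, List, Tuple
from urllib.request import urlopen

# XML namespace defined by the sitemaps.org protocol.
NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}


def fetch(url: str) -> bytes:
    """Download a sitemap (hypothetical helper: no retries, no gzip handling)."""
    with urlopen(url) as response:
        return response.read()


def page_urls(root: ET.Element) -> List[str]:
    """Return the <loc> value of every <url> entry in a parsed sitemap."""
    return [loc.text.strip() for loc in root.findall("sm:url/sm:loc", NS) if loc.text]


def extract_rows(
    sitemap_url: str, fetch_fn: Callable[[str], bytes] = fetch
) -> List[Tuple[int, str, str]]:
    """Return (ID, Sitemap, Url) rows for a sitemap index or a single sitemap."""
    root = ET.fromstring(fetch_fn(sitemap_url))
    if root.tag == "{%s}sitemapindex" % NS["sm"]:
        # Sitemap index: follow each child sitemap and collect its URLs.
        children = [
            loc.text.strip()
            for loc in root.findall("sm:sitemap/sm:loc", NS)
            if loc.text
        ]
        found = [
            (child, url)
            for child in children
            for url in page_urls(ET.fromstring(fetch_fn(child)))
        ]
    else:
        # Plain sitemap: every URL belongs to the sitemap that was passed in.
        found = [(sitemap_url, url) for url in page_urls(root)]
    # Assumption: the ID column is a running number starting at 1.
    return [(i, sitemap, url) for i, (sitemap, url) in enumerate(found, start=1)]
```

If pandas and openpyxl are installed, the rows could then be written to Excel with something like `pandas.DataFrame(rows, columns=["ID", "Sitemap", "Url"]).to_excel("urls.xlsx", index=False)`; the file name here is only a placeholder.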