Skip to content

HTTPS clone URL

Subversion checkout URL

You can clone with
or
.
Download ZIP
Browse files

Fixes to Markdown markup

  • Loading branch information...
commit 77f903d3843990f28cf644f3adba39e719c80b23 1 parent b2e1eda
@alexrabarts alexrabarts authored
Showing with 24 additions and 9 deletions.
  1. +24 −9 README.markdown
View
33 README.markdown
@@ -1,22 +1,23 @@
-= BigSitemap
+# BigSitemap
-== DESCRIPTION:
+## DESCRIPTION:
BigSitemap is a Sitemap generator specifically designed for large sites (although it works equally well with small sites). It splits large Sitemaps into multiple files, gzips the files to minimize bandwidth usage, batches database queries so it doesn't take your site down, can be set up with just a few lines of code and is compatible with just about any framework.
-
-== INSTALL:
+## INSTALL:
* Via git: git clone git://github.com/alexrabarts/big_sitemap.git
* Via gem: gem install alexrabarts-big_sitemap -s http://gems.github.com
-== SYNOPSIS:
+## SYNOPSIS:
The minimum required to generated a sitemap is:
+<pre>
sitemap = BigSitemap.new(:base_url => 'http://example.com')
sitemap.add(:model => MyModel, :path => 'my_controller')
sitemap.generate
+</pre>
You can put this in a rake/thor task and create a cron job to run it periodically. It should be enough for most Rails/Merb applications.
@@ -24,27 +25,38 @@ Your models must provide either a <code>find_for_sitemap</code> or <code>all</co
To generate the URLs, BigSitemap will combine the constructor arguments with the <code>to_param</code> method of each instance returned (provided by ActiveRecord but not DataMapper). If this method is not present, <code>id</code> will be used. The URL is constructed as:
+<pre>
":base_url/:path/:to_param" # (if to_param exists)
":base_url/:path/:id" # (if to_param does not exist)
+</pre>
BigSitemap knows about the document root of Rails and Merb. If you are using another framework then you can specify the document root with the <code>:document_root</code> option. e.g.:
+<pre>
BigSitemap.new(:base_url => 'http://example.com', :document_root => "#{FOO_ROOT}/httpdocs")
+</pre>
By default, the sitemap files are created under <code>/sitemaps</code>. You can modify this with the <code>:path</code> option:
+<pre>
BigSitemap.new(:base_url => 'http://example.com', :path => 'google-sitemaps') # places Sitemaps under /google-sitemaps
+</pre>
Sitemaps will be split across several files if more than 50,000 records are returned. You can customize this limit with the <code>:max_per_sitemap</code> option:
+<pre>
BigSitemap.new(:base_url => 'http://example.com', :max_per_sitemap => 1000) # Max of 1000 URLs per Sitemap
+</pre>
The database is queries in batches to prevent large SQL select statements from locking the database for too long. By default, the batch size is 1001 (not 1000 due to an obscure bug in DataMapper that appears when an offset of 37000 is used). You can customize the batch size with the <code>:batch_size</code> option:
+<pre>
BigSitemap.new(:base_url => 'http://example.com, :batch_size => 5000) # Database is queried in batches of 5,000
+</pre>
Google, Yahoo!, MSN and Ask are pinged once the Sitemap files are generated. You can turn one or more of these off:
+<pre>
BigSitemap.new(
:base_url => 'http://example.com',
:ping_google => false,
@@ -52,24 +64,27 @@ Google, Yahoo!, MSN and Ask are pinged once the Sitemap files are generated. Yo
:ping_msn => false,
:ping_ask => false
)
+</pre>
You must provide an App ID in order to ping Yahoo! (more info at http://developer.yahoo.com/search/siteexplorer/V1/updateNotification.html):
+<pre>
BigSitemap.new(:base_url => 'http://example.com', :yahoo_app_id => 'myYahooAppId') # Yahoo! will now be pinged
+</pre>
-== LIMITATIONS:
+## LIMITATIONS:
If your database is likely to shrink during the time it takes to create the sitemap then you might run into problems (the final, batched SQL select will overrun by setting a limit that is too large since it is calculated from the count, which is queried at the very beginning). Patches welcome!
-== TODO
+## TODO
* Support for priority and changefreq (currently hard-coded to 'weekly')
-== CREDITS
+## CREDITS
Thanks to Alastair Brunton and Harry Love, who's work provided a starting point for this library.
http://scoop.cheerfactory.co.uk/2008/02/26/google-sitemap-generator/
-== COPYRIGHT
+## COPYRIGHT
Copyright (c) 2009 Stateless Systems (http://statelesssystems.com). See LICENSE for details.
Please sign in to comment.
Something went wrong with that request. Please try again.