Add Dataset Basics RDoc file

jeremyevans · Apr 15, 2010 · 81af78a
1 parent c898e4d
commit 81af78a
Showing 2 changed files with 107 additions and 0 deletions.
diff --git a/doc/dataset_basics.rdoc b/doc/dataset_basics.rdoc
@@ -0,0 +1,106 @@
+= Dataset Basics 
+
+== Introduction
+
+Datasets are probably the thing that separates Sequel from other database libraries.  While most database libraries have specific support for updating all records or only a single record, Sequel's ability to represent SQL queries themselves as objects is what gives Sequel most of its power.  However, if you haven't been exposed to the dataset concept before, it can be a little disorienting.  This document aims to give a basic introduction to datasets and how to use them.
+
+== What a Dataset Represents
+
+A Dataset can be thought of as representing one of two concepts:
+
+* An SQL query
+* An abstract set of rows and some related behavior
+
+The first concept is more easily understood, so you should probably start with that assumption.
+
+== Basics
+
+The most basic dataset is the simple selection of all columns in a table:
+
+  ds = DB[:posts]
+  # SELECT * FROM posts
+
+Here, DB represents your Sequel::Database object, and ds is your dataset, with the SQL query it represents below it.
+
+One of the core dataset ideas to understand is that datasets use a functional style of modification, in which methods called on the dataset return modified copies of the dataset rather than modifying the dataset itself:
+
+  ds2 = ds.filter(:id=>1)
+  ds2
+  # SELECT * FROM posts WHERE id = 1
+  ds
+  # SELECT * FROM posts
+
+Note how ds itself is not modified.  This is because ds.filter returns a modified copy of ds, instead of modifying ds itself.  This makes using datasets both thread safe and easy to chain:
+
+  # Thread safe:
+  100.times do |i|
+    Thread.new do
+      ds.filter(:id=>i).first
+    end
+  end
+
+  # Easy to chain:
+  ds3 = ds.select(:id, :name).order(:name).filter{id < 100}
+  # SELECT id, name FROM posts WHERE id < 100 ORDER BY name
+
+You rarely need to think about thread safety, but chainability is core to how Sequel is generally used.  Almost all dataset methods that affect the SQL produced return modified copies of the receiving dataset.
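+
+To make the copy-on-modify idea concrete, here is a minimal sketch in plain Ruby.  ToyDataset and its with_opts helper are hypothetical names invented for this illustration; this is not how Sequel::Dataset is implemented, only a small model of the same style:
+
```ruby
# Toy stand-in for a dataset. NOT Sequel's implementation; every name here
# (ToyDataset, with_opts) is hypothetical and exists only for illustration.
class ToyDataset
  attr_reader :opts

  def initialize(table, opts = {})
    @table = table
    @opts = opts.freeze # frozen, so a copy can never change its parent
  end

  # Each query method builds a new object instead of mutating the receiver.
  def select(*cols)
    with_opts(select: cols)
  end

  def where(cond)
    with_opts(where: cond)
  end

  def order(col)
    with_opts(order: col)
  end

  # Render the options as an SQL string (no quoting or escaping in this toy).
  def sql
    cols = @opts[:select] ? @opts[:select].join(", ") : "*"
    out = "SELECT #{cols} FROM #{@table}"
    out += " WHERE #{@opts[:where]}" if @opts[:where]
    out += " ORDER BY #{@opts[:order]}" if @opts[:order]
    out
  end

  private

  # Merge the extra options into a fresh copy and return it.
  def with_opts(extra)
    self.class.new(@table, @opts.merge(extra))
  end
end

ds  = ToyDataset.new(:posts)
ds2 = ds.where("id = 1")
ds3 = ds.select(:id, :name).order(:name).where("id < 100")

puts ds.sql   # SELECT * FROM posts
puts ds2.sql  # SELECT * FROM posts WHERE id = 1
puts ds3.sql  # SELECT id, name FROM posts WHERE id < 100 ORDER BY name
```
+
+Because every method returns a new object and the options hash is frozen, ds can be shared freely: no caller can change what another caller sees.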
+
+Another important thing to realize is that dataset methods that return modified datasets do not execute the dataset's SQL on the database.  Only dataset methods that return or yield results will execute the SQL on the database:
+
+  # No SQL queries sent:
+  ds3 = ds.select(:id, :name).order(:name).filter{id < 100}
+
+  # Until you call a method that returns results
+  results = ds3.all
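+
+The deferred execution described above can be modeled without a real database.  In this sketch, FakeDatabase and LazyDataset are hypothetical classes made up for the example (they are not part of Sequel); the fake database simply logs each SQL string it is asked to run:
+
```ruby
# Hypothetical stand-ins for illustration; neither class is part of Sequel.
class FakeDatabase
  attr_reader :log

  def initialize(rows)
    @rows = rows
    @log = []
  end

  # Pretend to run SQL: record the statement, then return canned rows.
  def execute(sql)
    @log << sql
    @rows
  end
end

class LazyDataset
  def initialize(db, sql)
    @db = db
    @sql = sql
  end

  # Returns a modified copy; nothing is sent to the database.
  def where(cond)
    self.class.new(@db, "#{@sql} WHERE #{cond}")
  end

  # Returns results, so this is the point where the SQL is executed.
  def all
    @db.execute(@sql)
  end
end

db = FakeDatabase.new([{ id: 1, name: "first" }])
ds = LazyDataset.new(db, "SELECT * FROM posts").where("id < 100")
puts db.log.length  # 0, because no query has been sent yet

results = ds.all
puts db.log.first   # SELECT * FROM posts WHERE id < 100
puts results.length # 1
```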
+
+One important consequence of this API style is that if you use a method chain that includes both methods that return modified copies and a method that executes the SQL, the method that executes the SQL should generally be the last method in the chain:
+
+  # Good
+  ds.select(:id, :name).order(:name).filter{id < 100}.all
+
+  # Bad
+  ds.all.select(:id, :name).order(:name).filter{id < 100}
+
+This is because all returns an array of hashes, and select, order, and filter as used here are dataset methods.  Ruby arrays either lack these methods or define them with different behavior.
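+
+The failure is easy to reproduce in plain Ruby.  The rows array below is a hand-written stand-in for what all returns, so no database is needed:
+
```ruby
# Stand-in for the return value of ds.all: a plain Array of Hashes.
rows = [{ id: 1, name: "a" }, { id: 2, name: "b" }]

# Array has no order method at all.
puts rows.respond_to?(:order)  # false

# Array does define select, but it is the block-based filter and
# accepts no column arguments, so the dataset-style call raises.
begin
  rows.select(:id, :name)
rescue ArgumentError => e
  puts "ArgumentError: #{e.message}"
end

# Filtering the array afterwards works, but it runs in Ruby after every
# row has already been fetched, not in SQL on the database.
small = rows.select { |row| row[:id] < 2 }
puts small.length  # 1
```
+
+Keeping the executing method last lets the database do the filtering and ordering, instead of loading every row into Ruby first.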
+
+== Methods
+
+Most Dataset methods that users will use can be broken down into two types:
+
+* Methods that return modified datasets
+* Methods that execute code on the database
+
+=== Methods that return modified datasets
+
+Most dataset methods fall into this category, which can be further broken down by the clause they affect:
+
+SELECT:: select, select_all, select_append, select_more
+FROM:: from, from_self
+JOIN:: join, join_table
+WHERE:: where, filter, exclude, and, or, grep, invert, unfiltered
+GROUP:: group, group_by, group_and_count, ungrouped
+HAVING:: having, filter, exclude, and, or, grep, invert, unfiltered
+ORDER:: order, order_by, order_more, reverse, reverse_order, unordered
+LIMIT:: limit
+compounds:: union, intersect, except
+locking:: for_update, lock_style
+common table expressions:: with, with_recursive
+qualification:: qualify, qualify_to, qualify_to_first_source
+inserting:: set_defaults, set_overrides
+other:: clone, distinct, naked, server, with_sql
+
+=== Methods that execute code on the database
+
+Most other dataset methods commonly used will execute the dataset's SQL on the database:
+
+SELECT (All Records):: all, each, map, to_hash, select_map, select_order_map, select_hash, to_csv
+SELECT (First Record):: first, last, get, [], empty?
+SELECT (Aggregates):: count, avg, max, min, sum, range, interval
+INSERT:: insert, <<, import, multi_insert, insert_multiple
+UPDATE:: update, set, []=
+DELETE:: delete
+other:: columns, columns!, truncate
+
+=== Other methods
+
+See the Sequel::Dataset RDoc for other methods that are less commonly used.
diff --git a/www/pages/documentation b/www/pages/documentation
@@ -6,6 +6,7 @@
   <li><a href="rdoc/files/README_rdoc.html">README</a></li>
   <li><a href="rdoc/files/doc/cheat_sheet_rdoc.html">Cheat Sheet</a></li>
   <li><a href="rdoc/files/doc/opening_databases_rdoc.html">Connecting to a Database</a></li>
+  <li><a href="rdoc/files/doc/dataset_basics_rdoc.html">Dataset Basics</a></li>
   <li><a href="rdoc/files/doc/dataset_filtering_rdoc.html">Dataset Filtering</a></li>
   <li><a href="rdoc/files/doc/advanced_associations_rdoc.html">Advanced Associations</a></li>
   <li><a href="rdoc/files/doc/prepared_statements_rdoc.html">Prepared Statements/Bound Variables</a></li>