planning documentation for user guide.

JumpMind · Mar 16, 2010 · 7a7853e · 7a7853e
1 parent 18f6958
commit 7a7853e
Showing 1 changed file with 101 additions and 14 deletions.
diff --git a/symmetric/src/docbook/planning.xml b/symmetric/src/docbook/planning.xml
@@ -23,7 +23,7 @@
                </para>
 
 
-    <section id="planning-node">
+    <section id="identifying-nodes">
     <title>Identifying Nodes</title>
     <para>
       A <emphasis>node</emphasis> is a single instance of SymmetricDS. It can be thought of as a proxy for a database 
@@ -44,12 +44,14 @@
 
       </para>
     </section>
-    <section id="node-organization">
-            <title>Node Organization</title>
+    <section id="organizing-nodes">
+            <title>Organizing Nodes</title>
             <para> Nodes in SymmetricDS are organized into an overall node network, with connections based on what data needs
             to be synchronized where.  The exact organization of your nodes will be very specific to your synchronization goals.
             As a starting point, lay out your nodes in diagram form and draw connections between nodes to represent cases in which 
             data is to flow in some manner.  Think in terms of what data is needed at which node, what data is in common to more than one node, etc.
+            If it is helpful, you can also show data flowing into and out of external systems.  As you will discover later,
+            SymmetricDS can also publish data changes from a node using JMS.
     </para>             
 
             <para>Our retail example, as shown in <xref linkend="three-tier-store-server" xrefstyle="table"/>, represents a tree hierarchy with a single central office node connected
@@ -105,7 +107,7 @@
         </para>
         </section>
 
-     <section id="node-group">
+     <section id="grouping-nodes">
        <title>Defining Node Groups</title>
        <para>
          Once the organization of your SymmetricDS nodes has been chosen, you will need to <emphasis>group</emphasis> your nodes
@@ -139,7 +141,7 @@
 
      </section>
 
-      <section id="node-group-link">
+      <section id="linking-nodes">
        <title>Linking Nodes</title>
        <para>
             Now that Node Groups have been chosen, the next step in planning is to document the individual links between
@@ -155,8 +157,9 @@
         but when it comes time to send data to the central office a store node will do a push.
         </para>
      </section>
+
 
-     <section id="channel">
+     <section id="choosing-channels">
       <title>Choosing Data Channels</title>
         <para>When SymmetricDS captures data changes in the database, the changes are captured in the 
         order in which they occur.  In addition, that order is preserved when synchronizing the 
@@ -178,7 +181,9 @@
         <para>
           Choosing Channels is fairly straightforward and can be changed over time, if needed.  Think about the
           differing "types" of data present in your application, the volume of data in the various types, etc.  What 
-          data is considered must-have and can't be delayed due to a high volume load of another type of data?
+          data is considered must-have and can't be delayed due to a high volume load of another type of data?  For example,
+          you might place employee-related data, such as clocking in or out, on one channel, but sales transactions on another.
+          We will define which tables belong to which channels in the next sections.          
         </para>
 
         <important>
@@ -188,17 +193,99 @@
     </important>    
     </section>  
 
-    <section id="triggers-and-routers">
-      <title>Capturing and Routing Data Changes</title>
+    <section id="defining-data-changes">
+      <title>Defining Data Changes to be Captured and Routed</title>
       <para>
-      At this point, you have planned out the nodes in your application, grouped the nodes based on functionality, defined which node groups
-      send and receive data to which others (and by what method), and organized your data into Channels.  The largest remaining
+      At this point, you have designed the node-related aspects of your implementation: choosing nodes, grouping the nodes based on functionality, and defining which node groups
+      send data to and receive data from which others (and by what method).  You have also defined data Channels based on the types and priority of data being synchronized.  The largest remaining
       task prior to starting your implementation is to define and document what data changes are to be captured (by defining SymmetricDS <emphasis>Triggers</emphasis>), 
-      and to decide to which node(s) the data changes are to be <emphasis>routed</emphasis>. 
-
+      and to decide to which node(s) the data changes are to be <emphasis>routed</emphasis> and under what conditions.  We will also, in this section, discuss the concept of
+      an <emphasis>initial load</emphasis> of data into a SymmetricDS node.        
       </para>
-      <section id="triggers">
+      <section id="defining-data-changes-triggers">
+       <title>Defining Triggers</title>
+
+        <para> SymmetricDS uses <emphasis>database triggers</emphasis> to capture and record changes to be synchronized to other nodes. Based on the configuration you provide, SymmetricDS
+        creates the needed database triggers automatically for you.  There is a great deal of flexibility in terms of defining the exact conditions under which a data change is captured.
+        Each trigger you define has a corresponding table associated with it.  In addition, each trigger can specify:
+           <itemizedlist>
+                <listitem>whether to install a trigger for updates, inserts, and/or deletes</listitem>
+                <listitem>conditions on which an insert, update, and/or delete fires</listitem>
+                <listitem>a list of columns that should not be synchronized from this table</listitem>
+                <listitem>a SQL select statement that can be used to hold data needed for routing (known as External Data)</listitem>
+           </itemizedlist>
+        </para>
+       <para>
+       As you define your triggers, consider which data changes are relevant to your application and which ones are not.  Consider under what special conditions
+       you might want to route data, as well.  For our retail example, we likely want to have triggers defined for updating, inserting, and deleting pricing information
+       in the central office so that the data can be routed down to the stores.  Similarly, we need triggers on sales transaction tables such that
+       sales information can be sent back to the central office.
+       </para>
+       </section>
+
+       <section id="defining-data-changes-routers">
+          <title>Defining Routers</title>
 
+         <para>The triggers that have been defined in the previous section only define <emphasis>when</emphasis> data changes are to be captured
+         for synchronization.  They do not define <emphasis>where</emphasis> the data changes are to be sent.  Routers, plus a mapping between Triggers and Routers,
+         define the process for determining which nodes receive the data changes.
+         </para>
+
+         <para>Before we discuss Routers and Trigger Routers, we should probably take a break and discuss the process SymmetricDS uses to keep track
+         of the changes and routing.  As we stated, SymmetricDS relies on auto-created database triggers to capture and record relevant data changes into a table, 
+         the
+          <xref linkend="table_data" xrefstyle="table"/> table.  After the data is captured, a background process
+            chooses the nodes that the data will be synchronized to.  This is called <emphasis>routing</emphasis> and it is performed by the Routing Job.
+            Note that the Routing Job does not actually send any data.  It just organizes and records the decisions on where to send data in two "staging"
+            tables, <xref linkend="table_data_event" xrefstyle="table"/> and <xref linkend="table_outgoing_batch" xrefstyle="table"/>.
+        </para>
+        <para> 
+            Now we are ready to discuss Routers.  The router itself is what defines the configuration of where to send a data change.  Each Router
+            you define can be associated with or assigned to any number of Triggers through a join table that defines the relationship.
+            For each router you define, you will need to specify:
+            <itemizedlist>
+                <listitem>the target table on the destination node to route the data to</listitem>
+                <listitem>the source node group and target node group for the nodes to route the data to</listitem>
+                <listitem>a router <emphasis>type</emphasis> and router <emphasis>expression</emphasis></listitem>
+                <listitem>whether to route updates, inserts, and/or deletes</listitem>
+             </itemizedlist>
+             </para>
+             <para>
+             For now, do not worry about the specific router types.  They will be covered later.  For your design, simply note the information and decisions
+             needed to determine the list of nodes to route to.  You will find later that routers offer considerable flexibility and functionality.
+             For example, you will find you can:
+
+              <itemizedlist>
+                <listitem>send the changes to all nodes that belong to the target node group defined in the router.</listitem>
+                    <listitem>compare old or new column values to a constant value or the value of a node's identity.</listitem>
+                    <listitem>execute a SQL expression against the database to select nodes to
+                        route to. This SQL expression can be passed values of old and new column values.</listitem>
+                    <listitem>execute a BeanShell expression in order to select nodes to route to.
+                        The BeanShell expression can use the old and new column values.</listitem>
+                    <listitem>publish data changes directly to a messaging solution instead
+                        of transmitting changes to registered nodes.  (This router must be configured manually in XML as an extension point.)</listitem>
+               </itemizedlist>
+
+        </para>
+        <para>
+        For each of your Triggers, decide which Router matches the behavior needed for that Trigger.  These Trigger Router combinations will be used to
+        define a mapping between your Triggers and Routers when you implement your design.
+        </para>
+        </section>  
+
+
+        <section id="defining-data-changes-trigger-routers">
+          <title>Planning Initial Loads</title>
+
+          <para>The mapping between Triggers and Routers defines more than just the many-to-many relationship between your Triggers and your Routers.  It also defines
+          two aspects of how initial loads occur, so now is a good time to plan how your <emphasis>Initial Loads</emphasis> will work.
+          SymmetricDS provides the ability to "load" or "seed" a node's database with specific sets of data from its parent node.  This concept is known as an <emphasis>Initial Load</emphasis> of
+          data and is very useful for most applications.  Using our retail example, consider a new store being opened.  Initially, you would want to pre-populate
+          the store's database with all the item, pricing, and tax data for that particular store.  This is achieved through an initial load.  
+          </para>
+          <para>
+          For each Trigger Router you define, you can choose the order in which 
+          </para>
 
       </section>
     </section>