Description: A full featured solution for database replication under a Rails application layer
Homepage: http://www.akitaonrails.com
Clone URL: git://github.com/akitaonrails/acts_as_replica.git
First import
Fabio Akita (author)
Thu Apr 03 09:49:51 -0700 2008
commit  a76f6aeafad3628ec2e77dc36840cef3facbbdcb
tree    0e9cd933ecb51d2dcf152997ac5f000bc0e2fd1b
parent  355f7dc48f85f3a405ee1d61b07d9b4e44d9d41f
@@ -0,0 +1,193 @@
+== ActsAsReplica
+
+This plugin is meant for offline-client scenarios where the same Rails app is
+deployed on both the clients and the main server, for instance when packaged
+with a third-party solution such as Joyent's Slingshot, Rails2Ext and so forth.
+
+Bear in mind that this is not a one-package-solves-all kind of solution. It
+assumes a scenario of multiple offline clients and one master server. It doesn't
+replace heavy industrial-level message queues or database-level merge replication.
+It also doesn't support master-less distributed peer-to-peer replication. Only
+N-clients-1-master is supported for now.
+
+Clients can input data offline. This data is recorded in a local sqlite3 file.
+The client can then connect to the server to pull more recent data from it and
+push its new data back to it.
+
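The pull-then-push cycle described above can be pictured with a small in-memory sketch. This is not the plugin's code: Node, LogEntry, changes_since and the two cursor variables are hypothetical names, and an integer log id stands in for the transaction log entry ID.

```ruby
# Hypothetical in-memory model of one sync cycle; not the plugin's API.
LogEntry = Struct.new(:id, :record_uuid, :created_by)

class Node
  attr_reader :log

  def initialize
    @log = []
  end

  # Append an entry to this node's transaction log with the next integer id.
  def record(record_uuid, created_by)
    entry = LogEntry.new(@log.size + 1, record_uuid, created_by)
    @log << entry
    entry
  end

  # Entries newer than `since`, skipping those written by `except_user`.
  def changes_since(since, except_user: nil)
    @log.select { |e| e.id > since && e.created_by != except_user }
  end
end

server = Node.new
client = Node.new
last_pulled = 0 # newest server log id the client has already applied
last_pushed = 0 # newest client log id the server has already received

server.record("uuid-a", "alice") # written on the server by someone else
client.record("uuid-b", "bob")   # written offline by the local user

# Pull: fetch what others wrote on the server since the last sync.
pulled = server.changes_since(last_pulled, except_user: "bob")
last_pulled = pulled.last.id unless pulled.empty?

# Push: send what was written locally since the last push.
pushed = client.changes_since(last_pushed)
pushed.each { |e| server.record(e.record_uuid, e.created_by) }
last_pushed = pushed.last.id unless pushed.empty?
```

Skipping the current user's own entries on the pull side is what keeps the client from downloading the rows it has just pushed.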
+== Background Job
+
+This solution relies on a background job and batch control. The Rails app can
+trigger the execution of the background job that actually performs the
+replication procedure. The plugin generator creates a sample SyncsController and
+views that you can tailor to your needs. In the background (a 'system' call on
+*nix and Process.create on Win32) it starts a script/runner process that calls
+lib/daemons/replicator.rb. The sample controller reads the replicator's log
+file to give the user on-screen feedback via an Ajax call.
+
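As a rough illustration of the log-polling idea, the sketch below returns only the text appended to a log file since the previous poll. The helper name read_log_since, the byte-offset approach and the temporary log file are assumptions made for this example; the README does not say how the sample controller reads the log.

```ruby
require "tempfile"

# Hypothetical helper: return the text appended to the log since `offset`,
# together with the new offset to use on the next poll.
def read_log_since(path, offset)
  return ["", offset] unless File.exist?(path)

  File.open(path, "rb") do |f|
    f.seek(offset)
    chunk = f.read || ""
    [chunk, offset + chunk.bytesize]
  end
end

# Simulate a background process that appends progress lines to its log.
log = Tempfile.new("replicator")
log.write("handshake ok\n")
log.flush

first_chunk, position = read_log_since(log.path, 0)

log.write("pulled 3 rows\n")
log.flush

second_chunk, position = read_log_since(log.path, position)
```

Each Ajax poll would pass back the last offset it received, so the page only has to append the new lines.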
+== Dependencies
+
+- gem install uuidtools
+- gem install fastercsv
+
+Win32Utils on Windows
+
+== Installation
+
+./script/generate replicator
+
+== Project Assumptions
+
+This plugin follows several assumptions:
+
+- Every replicable table has to have a surrogate UUID-based primary key.
+ This avoids any possible primary key conflict between the clients and the
+ server. Yes, I could use integer ranges for each client, but this would add
+ unnecessary overhead to the process. I could also have made some
+ man-in-the-middle controller that would transact ids back and forth, but this
+ would be even more unnecessary. UUIDs are fast, simple and reliable.
+
+- The app has to have a User class with a singleton 'current_user' method.
+ The app has to make sure User.current_user always contains something (usually
+ by using a before_filter in the controller to get the currently logged-in
+ user). Just declare 'acts_as_auditor' in the User model for this.
+
+- The primary key of the User model also has to be a UUID, and the model also
+ has to have a secondary UUID (column named GUID) that has to be available in
+ the RemoteClient model on the server. This means that the server doesn't need
+ to have a full User table with all the offline clients if it doesn't want to
+ (this may make the deployment process easier). Finally, the User model
+ also has to have a last_synced integer column to record the latest replicated
+ transaction log entry.
+
+- Every replicable table has to have UserStamps (created_by, created_at, updated_by,
+ updated_at) because this plugin uses this data to track changes, so they are
+ not optional. Note that the created_by and updated_by columns hold the UUID
+ primary key of the User.
+
+- The client can be behind an HTTP proxy, use an SSL connection, and the web
+ server can request basic authentication credentials. Configuration can be kept
+ in the config/syncable.yml file. Be careful though: it relies on the same
+ infrastructure as Net::HTTP, so Windows-based servers probably need more
+ testing, as they are usually not standards compliant. Refer to the SyncSetting
+ model for details. Its table will contain only ONE SINGLE ROW for each client
+ machine. Be careful not to duplicate settings, because each setting has a
+ specific UUID bound to the machine. This ID is important because it is used to
+ uniquely identify each client app that replicates back to the server.
+
+- It doesn't use XML for the payload packages for 2 reasons: first, I don't
+ personally like XML for data transfer. Second, YAML is lighter weight,
+ natively supported by all Ruby and Rails objects, and easily human readable.
+ One can write an adapter later, as this is only a matter of marshalling. So it
+ may not be very easy to place message brokers in between the client and server.
+ But as I said, this is a very opinionated piece of software made for my own use.
+
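To make the UUID, UserStamps and YAML assumptions more concrete, here is a minimal sketch of a replicable row travelling as YAML. It is an illustration only: it uses SecureRandom.uuid from the Ruby standard library instead of the uuidtools gem so that it runs without extra gems, plain hashes instead of ActiveRecord models, and integer epoch seconds for the timestamps. The new_row helper is a hypothetical name.

```ruby
require "securerandom"
require "yaml"

# Hypothetical helper: build a replicable row as a plain hash with a UUID
# primary key and the four userstamp columns.
def new_row(attrs, user_id)
  now = Time.now.to_i # epoch seconds keep the YAML round trip trivial
  attrs.merge(
    "id"         => SecureRandom.uuid,
    "created_by" => user_id,
    "updated_by" => user_id,
    "created_at" => now,
    "updated_at" => now
  )
end

user_id = SecureRandom.uuid
rows = [
  new_row({ "name" => "first" }, user_id),
  new_row({ "name" => "second" }, user_id)
]

payload  = YAML.dump(rows)         # what would travel in the request body
restored = YAML.safe_load(payload) # only strings, integers, hashes, arrays
```

Because each side generates its own keys, two clients creating rows offline at the same time do not need to coordinate id ranges with the server.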
+== Basic Workflow (started through /syncs/perform_sync in the client)
+
+(1) The client initiates a handshake process:
+
+GET /syncs/handshake.yaml
+
+(2) The server creates an internal session and sends back a cookie ID
+ (session ID), a hashed challenge key and its own machine ID (UUID).
+
+(3) The client has to look for its internal user's GUID and create a
+ response to the challenge:
+
+POST /syncs/handshake.yaml?client_id=&challenge_response=
+
+(4) The server has the user's GUID mapped in the RemoteClient table so it
+ can compare the received response with its own. When the server receives
+ new data from the client, it looks for a corresponding entry in the
+ RemoteMachine table. Each user can be bound to many machines, each having
+ its own machine UUID. That way the user can choose to work in any client
+ app installed on any machine and still be able to replicate data reliably.
+ Each RemoteMachine records the latest executed transaction log entry, so
+ it knows where to restart the next time.
+
+(5) Now, the client requests the most recent data from the server. It has to
+ look for the last_synced column in its own User table.
+
+POST /syncs/down.yaml?for_when=9999
+
+(6) The server calls Replica.down internally and looks for all new data since
+ the 'for_when' integer received that was not created by the logged-in user.
+ It sends back an ActsAsReplica::Structs::SyncPayload package encoded as YAML.
+
+(7) The client calls Replica.up internally to record the new data. If everything
+ goes fine, it records the latest last_synced transaction entry ID in the User
+ table.
+
+(8) The client calls Replica.down internally, using the latest recorded
+ transaction entry and the machine ID obtained from the server during the
+ handshake described above. It retrieves the newest data it has created
+ offline and also creates an ActsAsReplica::Structs::SyncPayload package that
+ it posts to the server in YAML format:
+
+POST /syncs/up.yaml?syncs=<YAML::Object>
+
+(9) The server calls Replica.up internally and processes the received package.
+ If everything goes fine, it updates the last_synced column in the
+ RemoteMachine table for this particular logged-in user/machine.
+
+(10) The client compiles the results page with all that happened in this
+ transaction.
+
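The handshake in steps (1) to (4) is a challenge-response exchange keyed on the user's GUID. The README does not say how the challenge is hashed, so the sketch below is only one plausible shape: it assumes an HMAC-SHA256 of a random challenge with the GUID as the key, and all three method names are hypothetical.

```ruby
require "openssl"
require "securerandom"

# Server side: issue a random challenge for this session (assumed format).
def issue_challenge
  SecureRandom.hex(16)
end

# Both sides: derive the response from the challenge and the shared user GUID.
# HMAC-SHA256 is an assumption of this sketch, not the plugin's documented scheme.
def challenge_response(challenge, user_guid)
  OpenSSL::HMAC.hexdigest(OpenSSL::Digest.new("SHA256"), user_guid, challenge)
end

# Server side: recompute the expected answer from the GUID it has on file
# and compare it with what the client sent.
def valid_response?(challenge, user_guid, response)
  OpenSSL.secure_compare(challenge_response(challenge, user_guid), response)
end

guid      = SecureRandom.uuid # stands in for the GUID column of the User
challenge = issue_challenge
answer    = challenge_response(challenge, guid)
accepted  = valid_response?(challenge, guid, answer)
```

The GUID itself never crosses the wire in this scheme; only the challenge and the derived response do.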
+== FIRST LOGIN
+
+When a brand new desktop stand-alone installation is done, the database is probably
+empty. But the user has to log into the server. So we have a bootstrap problem:
+how to log in if the local database is void of any user to do so?
+
+You have to integrate a "first login" procedure into your authentication system.
+The user is prompted for a username and password. The authentication proceeds
+with a local verification. If it fails, it checks connectivity and then queries
+the server:
+
+(1) POST /syncs/handshake.yaml?username=XXX&password=YYY
+
+Ideally this is done over an SSL connection so the password is never disclosed
+over a plain-text protocol (further cryptography could help).
+
+(2) The server queries its own local database. If it confirms the credentials,
+it sends back a YAML serialized array containing [@user, @revision]. This
+revision is for SVN upgrading integration (see lib/daemons/upgrade.rb).
+
+(3) The local call will automatically receive the server's serialized User object
+and properly persist it locally. Now you can authenticate the user and
+automatically start a replication/upgrade procedure as described in the previous
+section.
+
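The first-login fallback can be sketched as follows. Everything here is hypothetical scaffolding: Account, FirstLogin and the two callables stand in for the local User table, the POST to the server and the connectivity check, and passwords are compared in plain text only to keep the example short.

```ruby
# Hypothetical stand-in for a locally stored user.
Account = Struct.new(:login, :password)

class FirstLogin
  # local_users:   Hash of login => Account already stored on this machine.
  # remote_lookup: callable standing in for the request to the server;
  #                returns an Account or nil.
  # online:        callable returning true when the server is reachable.
  def initialize(local_users, remote_lookup, online)
    @local_users = local_users
    @remote_lookup = remote_lookup
    @online = online
  end

  def authenticate(login, password)
    user = @local_users[login]
    return user if user && user.password == password # local verification first
    return nil unless @online.call                   # no connectivity, give up

    remote = @remote_lookup.call(login, password)    # ask the server
    return nil unless remote

    @local_users[login] = remote                     # persist for the next login
    remote
  end
end

server_users = { "admin" => Account.new("admin", "admin") }
server_calls = 0
lookup = lambda do |login, password|
  server_calls += 1
  account = server_users[login]
  account if account && account.password == password
end

local_users = {}
flow = FirstLogin.new(local_users, lookup, -> { true })
first  = flow.authenticate("admin", "admin") # goes to the server
second = flow.authenticate("admin", "admin") # answered locally
```

After the first successful call the user exists locally, so later logins work offline.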
+== INITIAL TESTS
+
+As this involves at least two peers, we have to load up at least two mongrel
+processes. In this particular test, we'll use the development and production
+environments at once as a testbed for a simple scenario.
+
+(1) First, every time we want to test the whole scenario, we have to clean the
+databases. Migrations are already set to correctly populate both
+environments. So, from the shell:
+
+rm db/*.sql*; rake db:migrate RAILS_ENV=development; rake db:migrate RAILS_ENV=production
+
+(2) Now, we start 2 mongrel processes in 2 different shells:
+
+./script/server -p 3000 -e development
+ or ./script/runner '@logged_user=User.find_by_login("admin").id; load "lib/daemons/replicator.rb"'
+
+./script/server -p 3001 -e production
+
+(3) Now, log in with username 'admin', password 'admin' at:
+
+http://localhost:3000/users/login
+
+(4) Then manually type this URL:
+
+http://localhost:3000/syncs/perform_sync
+
+(5) The call above simulates a client starting synchronization with a server. If
+everything went fine, we can open the ./script/console [environment] of each
+and check that the totals for ReturnOrder.count and Batch.count are the same in
+both environments. The browser should display something similar to this:
+
+Perform Syncing Results:
+
+./script/runner 'puts ReturnOrder.count; puts Batch.count' -e development
+./script/runner 'puts ReturnOrder.count; puts Batch.count' -e production
+
+The results should be exactly the same.
