Fabio Akita (author)
Thu Apr 03 09:49:51 -0700 2008
commit a76f6aeafad3628ec2e77dc36840cef3facbbdcb
tree 0e9cd933ecb51d2dcf152997ac5f000bc0e2fd1b
parent 355f7dc48f85f3a405ee1d61b07d9b4e44d9d41f
| name | age | message |
|---|---|---|
| .gitignore | Thu Apr 03 09:49:51 -0700 2008 | [Fabio Akita] |
| CHANGELOG | Thu Apr 03 09:49:51 -0700 2008 | [Fabio Akita] |
| MIT-LICENSE | Thu Apr 03 09:49:51 -0700 2008 | [Fabio Akita] |
| README | Thu Apr 03 09:49:51 -0700 2008 | [Fabio Akita] |
| Rakefile | Thu Apr 03 09:49:51 -0700 2008 | [Fabio Akita] |
| TODO | Thu Apr 03 09:49:51 -0700 2008 | [Fabio Akita] |
| generators/ | Thu Apr 03 09:49:51 -0700 2008 | [Fabio Akita] |
| init.rb | Thu Apr 03 09:49:51 -0700 2008 | [Fabio Akita] |
| install.rb | Thu Apr 03 09:49:51 -0700 2008 | [Fabio Akita] |
| lib/ | Thu Apr 03 09:49:51 -0700 2008 | [Fabio Akita] |
| script/ | Thu Apr 03 09:49:51 -0700 2008 | [Fabio Akita] |
| test/ | Thu Apr 03 09:49:51 -0700 2008 | [Fabio Akita] |
| trunk/ | Thu Apr 03 09:49:51 -0700 2008 | [Fabio Akita] |
| uninstall.rb | Thu Apr 03 09:49:51 -0700 2008 | [Fabio Akita] |
README
== ActsAsReplica
This plugin is meant for offline-client scenarios where the same Rails
app is deployed on both the clients and the main server, for instance using
a 3rd party solution such as Joyent's Slingshot, Rails2Ext and so forth.
Bear in mind that this is not a one-package-solves-all kind of solution. It
assumes the scenario of multiple offline clients and one master server. It doesn't
replace heavy industrial-level message queues or database-level merge replication.
It also doesn't support master-less distributed peer-to-peer replication. Only
N-clients-1-master is supported for now.
Clients can input data offline. This data is recorded in a local sqlite3 file.
The client can then connect to the server to pull more recent data from it and
push its new data back to it.
== Background Job
This solution relies on a background job and batch control. The Rails app can
trigger the execution of the background job that actually performs the replication
procedure. The plugin generator creates a sample SyncsController and views
that you can tailor to your needs. In the background (a 'system' call on *nix and
Process.create on Win32) it starts a script/runner process that calls
lib/daemons/replicator.rb. The sample controller reads the resulting log
file to give the user on-screen feedback via an Ajax call.
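
The background-job idea above can be sketched in plain Ruby. This is only an illustration under stated assumptions: it uses the standard Process.spawn instead of the 'system'/Process.create calls the plugin mentions, and the helper names start_background_job and tail_log are hypothetical, not part of the plugin.

```ruby
require "rbconfig"
require "tmpdir"

# Hypothetical helper: starts `command` (an array of program + arguments) in
# the background, appending its stdout and stderr to `log_path`. Returns the
# watcher thread from Process.detach; call #join on it to wait for the child.
def start_background_job(command, log_path)
  pid = Process.spawn(*command, out: [log_path, "a"], err: [:child, :out])
  Process.detach(pid)
end

# Hypothetical helper: returns the last `count` lines of the log, which a
# controller action could render as progress feedback.
def tail_log(log_path, count = 10)
  return [] unless File.exist?(log_path)
  File.readlines(log_path, chomp: true).last(count)
end

Dir.mktmpdir do |dir|
  log = File.join(dir, "replicator.log")
  # A tiny Ruby one-liner stands in for the real script/runner process.
  job = start_background_job([RbConfig.ruby, "-e", "puts 'pulling'; puts 'pushing'"], log)
  job.join
  puts tail_log(log)
end
```

In a real controller the action would return immediately after starting the job, and a separate Ajax-polled action would call something like tail_log.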
== Dependencies
- gem install uuidtools
- gem install fastercsv
- Win32Utils (on Windows)
== Installation
./script/generate replicator
== Project Assumptions
This plugin follows several assumptions:
- Every replicable table has to have a surrogate UUID-based primary key.
This is done to avoid any possible primary key conflict between
the clients and the server. Yes, I could use integer ranges for each client, but this
would add unnecessary overhead to the process. I could also have made some
man-in-the-middle controller that would transact ids back and forth, but this
would be even more unnecessary. UUIDs are fast, simple and reliable.
- The app has to have a User class with a singleton 'current_user' method.
The app has to make sure User.current_user always contains something (usually
with a before_filter in the controller that fetches the currently logged
in user). Just define 'acts_as_auditor' in the User model for this.
- The primary key of the User model also has to be a UUID, and it also has to
have a secondary UUID (column named GUID) that has to be available at the
RemoteClient model in the server. It means that the server doesn't need to
have a full User table with all the offline clients if it doesn't want to
(this may make the deployment process easier). And finally, this User model
also has to have a last_synced integer column to record the latest replicated
transaction log entry.
- Every replicable table has to have UserStamps (created_by, created_at, updated_by,
updated_at) because this plugin uses this data to track the records. So
it's not optional. Note that the created_by and updated_by columns
hold the UUID primary key of the User.
- The client can be behind an HTTP proxy and use an SSL connection, and the web server
can request basic authentication credentials. Configuration can be held in the
config/syncable.yml file. Be careful though: it supports the same
infrastructure as Net::HTTP, so Windows-based servers probably need more testing,
as they are usually not standards compliant. Refer to the SyncSetting model for
details. This table will contain only ONE SINGLE ROW for each client machine.
Be careful not to duplicate settings, because each setting has
a specific UUID bound to the machine. This ID is important because it is used to
uniquely identify each client app that replicates back to the server.
- It doesn't use XML for the payload packages for 2 reasons: first, I don't
personally like XML for data transfer. Second, YAML is lighter weight,
natively supported by all Ruby and Rails objects, and easily human readable.
One can make an adapter later, as this is only a matter of marshalling. So it
may not be very easy to place message brokers in between the client and server.
But as I said, this is a very opinionated piece of software made for my own use.
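
As a rough illustration of the key and UserStamps assumptions above, the sketch below builds the attributes of a new replicable row without ActiveRecord. It is an assumption-laden example: it uses SecureRandom.uuid from the standard library rather than the uuidtools gem, and the helper stamped_attributes is hypothetical, not something the plugin provides.

```ruby
require "securerandom"

# Hypothetical helper: adds the surrogate UUID primary key and the user
# stamps that every replicable row is assumed to carry.
def stamped_attributes(attrs, current_user_id, now = Time.now.utc)
  attrs.merge(
    "id"         => SecureRandom.uuid, # surrogate primary key, unique across clients
    "created_by" => current_user_id,   # UUID primary key of the User
    "created_at" => now,
    "updated_by" => current_user_id,
    "updated_at" => now
  )
end

user_id = SecureRandom.uuid # stands in for User.current_user.id
first   = stamped_attributes({ "number" => 1 }, user_id)
second  = stamped_attributes({ "number" => 2 }, user_id)

puts first["id"] == second["id"]   # false: each row gets its own key
puts first["created_by"] == user_id # true
```

Because the keys are generated independently on each machine, two offline clients can create rows at the same time without coordinating ids through the server.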
== Basic Workflow (started through /syncs/perform_sync in the client)
(1) The client initiates a handshake process:
GET /syncs/handshake.yaml
(2) The server creates an internal session and sends back a cookie ID
(session ID), a hashed challenge key and its own machine ID (UUID).
(3) The client has to look up its internal user's GUID and create a
response to the challenge:
POST /syncs/handshake.yaml?client_id=&challenge_response=
(4) The server has the user's GUID mapped in the RemoteClient table so it
can compare the received response with its own. When the server receives
new data from the client, it looks for a corresponding entry in the
RemoteMachine table. Each user can be bound to many machines, each having
its own machine UUID. That way the user can choose to work in any client
app installed on any machine and still be able to replicate data reliably.
Each RemoteMachine records the latest executed transaction log entry, so
it knows where to restart the next time.
(5) Now, the client requests the most recent data from the server. It has to
look for the last_synced column in its own User table.
POST /syncs/down.yaml?for_when=9999
(6) The server calls Replica.down internally and looks for all new data since the
'for_when' integer received that was not created by the logged-in user. It sends
back an ActsAsReplica::Structs::SyncPayload package encoded as YAML.
(7) The client calls Replica.up internally to record the new data. If everything goes
fine, it records the latest last_synced transaction entry ID in the User table.
(8) The client calls Replica.down internally, using the latest recorded transaction entry
and the machine ID obtained from the server during the handshake described above.
It retrieves the newest data it has created offline and also creates an
ActsAsReplica::Structs::SyncPayload package that it posts to the server in
YAML format:
POST /syncs/up.yaml?syncs=<YAML::Object>
(9) The server calls Replica.up internally and processes the received package. If
everything goes fine, it updates the last_synced column in the
RemoteMachine table for this particular logged-in user/machine.
(10) The client compiles the results page with all that happened in this transaction.
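
The "down" half of the workflow above boils down to selecting transaction log entries newer than 'for_when' that the requesting user did not create, then shipping them as YAML. The following is a minimal sketch of that idea only. The helper names (changes_since, build_payload) and the payload layout are hypothetical; the real Replica.down and ActsAsReplica::Structs::SyncPayload may look quite different.

```ruby
require "yaml"

# Hypothetical stand-in for the transaction log: each entry carries an
# increasing integer "id" and the UUID of the user who created it.
def changes_since(log, for_when, requesting_user_id)
  log.select { |entry| entry["id"] > for_when && entry["created_by"] != requesting_user_id }
     .sort_by { |entry| entry["id"] }
end

# Hypothetical payload layout: the entries plus the log position the
# receiver should store as its new last_synced value.
def build_payload(entries, for_when)
  last_synced = entries.empty? ? for_when : entries.last["id"]
  YAML.dump({ "last_synced" => last_synced, "entries" => entries })
end

log = [
  { "id" => 1, "created_by" => "alice-uuid", "table" => "return_orders" },
  { "id" => 2, "created_by" => "bob-uuid",   "table" => "batches" },
  { "id" => 3, "created_by" => "alice-uuid", "table" => "batches" }
]

# Bob last synced at entry 1, so he only needs Alice's entry 3.
yaml    = build_payload(changes_since(log, 1, "bob-uuid"), 1)
payload = YAML.safe_load(yaml)
puts payload["last_synced"]    # => 3
puts payload["entries"].length # => 1
```

Skipping the requester's own entries keeps a client from receiving back the rows it has just pushed.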
== FIRST LOGIN
When a brand new desktop stand-alone installation is done, the database is probably
empty. But the user has to log into the server. So we have a bootstrap problem:
how to log in if the local database is void of any user to do so?
You have to integrate a "first login" procedure into your authentication system. The
user is prompted for a username and password. Authentication proceeds with a local
verification. If it fails, it checks connectivity and then queries the server:
(1) POST /syncs/handshake.yaml?username=XXX&password=YYY
Ideally this is done through an SSL connection so the password is never disclosed
over a plain-text protocol (further cryptography could help).
(2) The server queries its own local database. If it confirms the credentials, it sends
back a YAML serialized array containing [@user, @revision]. This revision is for
SVN upgrading integration (see lib/daemons/upgrade.rb).
(3) The local call will automatically receive the server's serialized User object
and properly persist it locally. Now you can authenticate the user and
automatically start a replication/upgrade procedure as described in the previous section.
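
The local-then-remote fallback described in this section could look roughly like the sketch below. Everything here is an assumption made for illustration: the first_login helper, the in-memory local_users hash, the "password_digest" field and the unsalted SHA-256 comparison are hypothetical simplifications, and remote_lookup stands in for the HTTP call to /syncs/handshake.yaml.

```ruby
require "digest"

# Hypothetical sketch. `local_users` maps a login to a user hash, and
# `remote_lookup` is any callable that asks the server and returns
# [user, revision], or nil when the server is unreachable or refuses.
def first_login(login, password, local_users, remote_lookup)
  digest = Digest::SHA256.hexdigest(password)
  local  = local_users[login]
  return local if local && local["password_digest"] == digest

  user, _revision = remote_lookup.call(login, password)
  return nil unless user

  local_users[login] = user # persist locally so later logins work offline
  user
end

# Fake server that only knows the admin account.
server = lambda do |login, password|
  next nil unless login == "admin" && password == "admin"
  [{ "login" => "admin", "password_digest" => Digest::SHA256.hexdigest("admin") }, 42]
end

local_users = {}
first_login("admin", "admin", local_users, server)          # asks the server, stores the user
first_login("admin", "admin", local_users, ->(*) { nil })   # now succeeds without the server
```

After the first successful remote login the user exists locally, so the replication/upgrade procedure can be started as usual.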
== INITIAL TESTS
As this involves at least two peers, we have to load up at least two mongrel
processes. In this particular test, we'll use the development and production
environments at once as a testbed for a simple scenario.
(1) First, every time we want to test the whole scenario, we have to clean the
databases. Migrations are already set up to correctly populate both
environments. So, from the shell:
rm db/*.sql*; rake db:migrate RAILS_ENV=development; rake db:migrate RAILS_ENV=production
(2) Now, we start 2 mongrel processes in 2 different shells:
./script/server -p 3000 -e development
or ./script/runner '@logged_user=User.find_by_login("admin").id; load "lib/daemons/replicator.rb"'
./script/server -p 3001 -e production
(3) Now, log in with username 'admin', password 'admin' at:
http://localhost:3000/users/login
(4) Then manually type this URL:
http://localhost:3000/syncs/perform_sync
(5) The call above simulates a client starting synchronization with a server. If
everything went fine, we can open ./script/console [environment] for each
and check that the totals for ReturnOrder.count and Batch.count are the same in both
environments. The browser should display something similar to this:
Perform Syncing Results:
./script/runner 'puts ReturnOrder.count; puts Batch.count' -e development
./script/runner 'puts ReturnOrder.count; puts Batch.count' -e production
The results should be exactly the same in both environments.




