HQ Data Model
The HQ Data Model is one method for managing data across multiple stores. This model centralizes data management and is not designed for resiliency against connectivity problems. On-location data stores are provided for performance as well as minimum necessary functionality during outages as opposed to full redundancy.
- A Database (DB) is a server running SQL, typically MySQL
- A Web Server is a server running the Office UI, typically via Apache
- In diagrams databases and web servers are shown as separate machines. This is for illustrative purposes and not strictly necessary.
- A Lane is a machine running the POS software and checking out customers
- Writes are SQL queries that add, remove, or alter records within the database or structures within the database.
- Reads are SQL queries that are not Writes
- Replication is an automated process where changes to one database are propogated to one or more slave DBs
- Manual Syncs are a non-automated process where data is transferred between databases.
The Master Database is the primary data store for the entire organization. If there is any discrepancy between databases, the Master Database is correct. Directing the vast majority of Writes to the Master Database greatly reduces the potential for discrepancies and inconsistencies. The Master Database always resides at HQ. HQ might be a retail location with its own Lanes but does not have to be.
The Store Database is a local copy of the Master Database. Each non-HQ location should have a Store Database replicating the Master Database. The Store Database serves two purposes. First, it reduces load on the Master Database and latency for end users by handling a portion of Reads. Second, it provides a local data store for lane transaction data. Because transaction data is only INSERTed and never modified it's safe to handle this data out-of-band from replication. One named database on the Store Database server should be excluded from replication for this purpose.
Slave Databases are entirely optional. One or more slave databases at each location can also replicate the Master Database. Slave Database can be used to further spread out the load of Read queries, as hot spares for other hardware, or for taking snapshot backups without taking the Master Database offline.
The Web Servers
The web servers run the Office software and provide the end-user UI. The HQ web server simply Reads from and Writes to the Master Database. A non-HQ web server should still be configured with the Master Database as its primary data store. All Writes will be directed to the Master Database while Reads will be load-balanced across all available databases.
The non-HQ web server is useful for a couple reasons. It has its own configuration specifying a store number and which lanes its responsible for. This allows syncing the correct products to the correct lanes. It also spreads out server load and reduces latency.
Lanes interact with their nearest database. HQ Lanes, if any, will interact with the Master Database. Non-HQ Lanes will interact with their Store Database. Writes to lanes are usually proxied through the location's Web Server which will read from an available database and copy the results out to the lanes. As mentioned above, non-HQ lanes should be configured to Read and Write from the local Store Database. Any loss of site-to-site connectivity will be entirely invisible to the Lanes.
Connectivity and Outages
Loss of connectivity between locations causes problems for non-HQ locations. The Store Web Server can no longer save any changes since all Writes depend on the Master Database and some fraction of Reads will fail as well (this could and should be mitigated in software detecting loss of Read DBs but is not currently). Possible mitigation strategies include:
- Wait / Fix the connectivity problem. Depending on circumstances this may be perfectly reasonable.
- Reconfigure Office to use the Store Database as its primary data store.
- If you use an SQL user account that only has read access to the Store Database, changes cannot be saved but no data inconsistencies will be introduced.
- If you use an SQL user account that has read/write access to the Store Database, you'll need to do some cleanup once connectivity is restored.
- If you use an SQL user account that has read access to the entire Store Database but only has read/write access to one named database, e.g. core_op, the scope of cleanup will be reduced. Giving read/write access as needed on a per-table basis could be even better.
Load Balanced Reads
This setup makes a clear distinction between Reads and Writes. It's the developers responsibility to keep track of these. Calling FannieDB::getReadOnly will return a connection to an available database where as calling FannieDB::get will return a connection to the configured primary data store. Alertnate, read-only databases (both Store DBs and Slave DBs) are configured on Office's Stores tab. Using SQL user account with read-only access is recommended. Developers are human and may accidentally send Write queries to a read-only database.
Changing an Item
Most types of changes are quite simple. A web server running Office writes to the master database at HQ and changes are distributed to other store databases via replication. Changes that should immediately propagate to all lanes are a bit more complicated.
When any web server running Office changes an item it:
- Writes changes to the master database at HQ
- Writes changes to its own lanes
- Sends an HTTP request to other store web server(s) indicating which item has changed
On receiving the HTTP request, other store web servers then sync the item to their own lanes.
Lanes and Non-HQ Databases
In the proposed set up, a non-HQ database server is replicating the HQ database server except for one non-replicated, writable database that receives lane transaction data. So with conventional names
trans_archive would all be replicas of HQ databases but
store_trans would be an intermediate step. Lanes would send their data to
store_trans and a separate process would shuffle this data over to
core_trans on the HQ database server.
- No loss of lane functionality if connectivity to HQ fails. There's stand alone, of course, but suspending & resuming transactions across lanes doesn't work in stand alone.
- Faster. Users notice when drawers are slow to open or receipts are slow to print and the time it takes to ship transaction data to the server is a direct contributor to that.
- More complicated. The lane may have to keep track of more than one database name and direct some queries to a different database.
- Reporting lag. Whatever background process is shuffling data from
store_transto HQ is going to run periodically.
- New place to make mistakes. Less of a long term concern but the extra step copying data will have room for bugs and oversights.
- Suspend/resume won't work across stores. This might be a pro or a con depending on user perspective and situation.