data-coordinator is a set of related services, libraries, and scripts that takes SoQL upserts, inserts them into the truth store, watches the truth store logs, and writes the data to the secondary stores.
- coordinator - contains the REST service for the data coordinator, as well as services for secondary watchers and various scripts, including backup
- coordinatorlib
  - common utilities and data structures for working with the truth store and secondaries, at a lower level
  - database migrations are in src/main/resources
- coordinatorlib-soql - en/decoders between SoQL and JSON, CSV, and SQL
- dummy-secondary - a dummy secondary store implementation for testing only
- secondarylib - trait for Secondary store
To run the tests, from the SBT shell:
project data-coordinator
test:test it:test
To run the data coordinator, from the regular Linux/OSX shell prompt:
bin/start_dc.sh
The above script builds the assembly if it's not present and runs the fat jar on the command line, which is much more memory efficient than running it from sbt. If you need to force a rebuild, simply run sbt clean beforehand.
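The "build only if the assembly is missing" behavior described above can be sketched roughly as follows. This is an illustration only, not the contents of bin/start_dc.sh: the function names `find_assembly` and `ensure_assembly` and the `*-assembly-*.jar` naming pattern are assumptions made for the example.

```shell
#!/usr/bin/env bash
# Sketch of a "build only if missing" launcher helper.
# Assumption: the fat jar matches *-assembly-*.jar inside a known directory.

# find_assembly DIR: print the first *-assembly-*.jar in DIR, or return 1.
find_assembly() {
  local dir="$1" jar
  for jar in "$dir"/*-assembly-*.jar; do
    # An unmatched glob stays literal, so test that the file really exists.
    if [ -f "$jar" ]; then
      printf '%s\n' "$jar"
      return 0
    fi
  done
  return 1
}

# ensure_assembly DIR BUILD_CMD...: run BUILD_CMD only when DIR has no
# assembly jar yet, then print the path of the jar.
ensure_assembly() {
  local dir="$1"
  shift
  if ! find_assembly "$dir" >/dev/null; then
    # Send build output to stderr so stdout carries only the jar path.
    "$@" >&2 || return 1
  fi
  find_assembly "$dir"
}
```

A launcher could then run something like `jar=$(ensure_assembly coordinator/target/scala-2.10 sbt assembly) && java -jar "$jar"`, where the target directory is a guess at the layout. Because an existing jar short-circuits the build, running sbt clean first is what forces a rebuild.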
If you haven't already, edit your /etc/soda2.conf file to remove the contents of the com.socrata.coordinator.common.secondary.instances section. It should look something like this:
com.socrata.coordinator.common = {
  database = ${common-database} {
    app-name = "data coordinator"
    database = "datacoordinator"
  }
  instance = ${data-coordinator-instance}
  secondary {
    ...
    instances {}
    ...
  }
}
From within the soql-postgres-adapter repository, start the secondary watcher like this:
sbt clean assembly
java -Djava.net.preferIPv4Stack=true -Dconfig.file=/etc/pg-secondary.conf -jar store-pg/target/scala-2.10/store-pg-assembly-3.1.4-SNAPSHOT.jar
To run migrations in this project from SBT:
sbt -Dconfig.file=/etc/soda2.conf "coordinator/run-main com.socrata.datacoordinator.primary.MigrateSchema migrate"
Alternatively, to build from scratch and run migrations:
sbt clean
bin/run_migrations.sh
To run migrations without building from scratch: bin/run_migrations.sh
The command is one of migrate, undo, or redo. undo takes an optional second parameter: the number of changes to roll back.
Running from sbt is recommended in a development environment because it ensures you are running the latest migrations without having to build a new assembly.
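The command rules above can be captured in a small wrapper that validates its arguments before handing them to sbt. This is a sketch under stated assumptions: `migration_task` is a hypothetical helper, and it assumes the rollback count is passed as a trailing positional argument after undo.

```shell
#!/usr/bin/env bash
# migration_task COMMAND [COUNT]: print the sbt task string for a migration.
# Hypothetical helper; assumes COUNT follows "undo" as a positional argument.
migration_task() {
  local cmd="$1" count="${2:-}"
  local main="com.socrata.datacoordinator.primary.MigrateSchema"

  # Only the three documented commands are accepted.
  case "$cmd" in
    migrate|undo|redo) ;;
    *) echo "command must be one of: migrate, undo, redo" >&2; return 2 ;;
  esac

  if [ -n "$count" ]; then
    # The count is documented for undo only.
    if [ "$cmd" != "undo" ]; then
      echo "a count is only accepted with undo" >&2
      return 2
    fi
    # Reject anything that is not made up purely of digits.
    case "$count" in
      *[!0-9]*) echo "count must be a whole number" >&2; return 2 ;;
    esac
    printf 'coordinator/run-main %s %s %s\n' "$main" "$cmd" "$count"
  else
    printf 'coordinator/run-main %s %s\n' "$main" "$cmd"
  fi
}
```

For example, `sbt -Dconfig.file=/etc/soda2.conf "$(migration_task undo 2)"` would ask to roll back the last two changes, while `migration_task redo 2` is rejected before sbt starts.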
Below is a copy of the email distributed to engineering when breaking changes were made to the secondary watcher architecture:
Hi All, The secondary architecture has been inverted (thank you @robert.macomber). The secondaries (pg, geocoding) are no longer dynamically loaded as jar files in secondary-watcher. Instead, they are now their own executables, and secondary-watcher is now a library that they use.
The install and start scripts in docs/onramp have been updated (pending merge) -- they also now include the geocoding / region coding secondary.
How to update your stack:
- Pull main of data-coordinator, soql-postgres-adapter, and geocoding-secondary (if you wish)
- Fetch docs and check-out branches en-7807 and aerust/en-7807 respectively (branches aren't quite merged but functioning).
- Run sbt assembly for all of the above scala projects.
- Update your /etc/soda2.conf file; com.socrata.coordinator.common.secondary.instances should be empty (but still needs to be there :( ). Copy the new config files for the secondaries over to /etc:
  sudo cp $DEV_DIR/docs/onramp/services/pg-secondary.conf /etc/
  sudo cp $DEV_DIR/docs/onramp/services/geocoding-secondary.conf /etc/
  If you want to use the geocoding secondary you will need to add a MapQuest app-token to the config and add it to the secondary_stores_config table in datacoordinator (truth).
  INSERT INTO secondary_stores_config (store_id, next_run_time, interval_in_seconds, is_feedback_secondary) VALUES ('geocoding', now(), 5, true);
- You can now run the secondary stores as their own executable. (See docs/onramp/start.sh for specifics)
- You can delete your ~/secondary-stores directory 🎉
Thanks, Alexa
To update the library:
- Make a PR, get it approved and merge your commits into main.
- From main, run sbt release. This will create two commits to bump the version and create a git tag for the release version, and then push them to the remote repo.
- The Jenkins job for the main branch will build the stages "Check for Version Change" and "Publish Library" to publish the library.
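Because the release step commits, tags, and pushes, it can be worth confirming the state of the checkout first. The following is a minimal sketch of such a check, not part of this repository: `release_preflight` is a hypothetical helper, and requiring a clean working tree is a precaution assumed here rather than a documented requirement.

```shell
#!/usr/bin/env bash
# release_preflight: succeed only when the current checkout is on main and
# has no uncommitted or untracked changes. Hypothetical helper.
release_preflight() {
  local branch
  # Name of the currently checked-out branch.
  branch=$(git rev-parse --abbrev-ref HEAD) || return 1
  if [ "$branch" != "main" ]; then
    echo "releases are cut from main, but the current branch is $branch" >&2
    return 1
  fi
  # --porcelain prints one line per changed or untracked file.
  if [ -n "$(git status --porcelain)" ]; then
    echo "working tree has uncommitted changes" >&2
    return 1
  fi
  return 0
}
```

It could be chained in front of the release, as in `release_preflight && sbt release`, so that nothing is committed or pushed from the wrong branch.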