
Maintaining lobid API

Pascal Christoph edited this page Apr 9, 2024 · 7 revisions

configure

Copy the template web/conf/resources.conf_template to web/conf/resources.conf and adjust the values to your environment.
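The copy step could be scripted roughly as follows. This is a minimal sketch, not part of the repository: the function name init_conf is hypothetical, and the choice to never overwrite an existing resources.conf is an assumption made to protect an already configured file.

```shell
#!/usr/bin/env bash
# Sketch: create resources.conf from the template unless it already exists.
# init_conf is a hypothetical helper, not something shipped with the repo.
init_conf() {
  local dir="${1:-web/conf}"
  local template="$dir/resources.conf_template"
  local target="$dir/resources.conf"

  # ASSUMPTION: an existing config is valuable, so never overwrite it.
  if [ -e "$target" ]; then
    echo "$target already exists, leaving it untouched" >&2
    return 0
  fi
  cp "$template" "$target"
}

# Usage, from the root of the repo (not run here):
#   init_conf            # uses web/conf
#   init_conf some/dir   # any directory containing the template
```

After the copy, the values (token, email, file names, index names) still have to be edited by hand.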

webhook: updating, creating and switching an index

The webhook configuration is part of web/conf/resources.conf.

webhook = {
    alma = {
            update = {
                    filename = "/data/other/datenportal/export/alma/prod/update.xml.bgzf"
                    indexname = "resources-alma-fix-staging"
            }
            basedump = {
                    filename = "/data/other/datenportal/export/alma/prod/baseline.xml.bgzf"
                    indexname = "resources-alma-fix"
                    switch = {
                            automatically = "false"
                            minDocs = "83000000"
                            minSize = "49613349376"
                    }
            }
            token = "secret"
    }
    email = "change@me"
}
index = {
  name = "resources-alma-fix-staging"

update an index

If you want to update an index, call the webhook listener like this:

$user@prod:$ curl http://localhost:7507/resources/webhook/update-alma?token=secret

The token parameter must have the value configured in webhook.alma.token. The update call runs the ETL on the file configured in webhook.alma.update.filename and writes into the index named in webhook.alma.update.indexname. The file update.xml.bgzf contains all updates since the last basedump, so indexing this one file is sufficient at any point (provided that the newest basedump is indexed).
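If the call is made from a script, a small wrapper can keep the token out of the command line history and make HTTP errors visible. The following is a sketch under assumptions: the function names are hypothetical, and it is assumed (not confirmed by this page) that the webhook answers with an HTTP error status when the token is wrong.

```shell
#!/usr/bin/env bash
# Sketch: build a webhook URL and call it. Helper names are hypothetical.
# The base URL matches the examples on this page.
BASE_URL="${BASE_URL:-http://localhost:7507/resources/webhook}"

# Print the URL for a webhook action (e.g. update-alma) and a token.
webhook_url() {
  local action="$1" token="$2"
  printf '%s/%s?token=%s\n' "$BASE_URL" "$action" "$token"
}

# Call the webhook. --fail makes curl exit non-zero on HTTP status >= 400.
# ASSUMPTION: the listener signals problems via the HTTP status code.
call_webhook() {
  curl --fail --silent --show-error "$(webhook_url "$1" "$2")"
}

# Usage (not run here), with the token taken from the environment:
#   call_webhook update-alma "$TOKEN"
```

The same wrapper would work for the other webhook actions described on this page by passing a different action name.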

create an index

If you want to create an index, call the webhook listener like this:

$user@prod:$ curl http://localhost:7507/resources/webhook/basedump-alma?token=secret

It is deliberately not possible to create an index without a timestamp suffix, or to assign an alias without the -staging suffix. Both suffixes (the timestamp on the index name and -staging on the index alias) are added automatically. The index alias is derived from the value of webhook.alma.basedump.indexname by appending -staging.
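The naming convention can be illustrated with two small helpers. This is only a sketch: the helper names are hypothetical, and the timestamp format shown is an assumption, since the format the application actually uses is not documented here.

```shell
#!/usr/bin/env bash
# Sketch of the naming convention (hypothetical helper names).

# Alias of a freshly built index: configured name plus "-staging".
staging_alias() {
  printf '%s-staging\n' "$1"
}

# Index name: configured name plus a timestamp suffix.
# ASSUMPTION: the YYYYmmdd-HHMM format is illustrative only.
timestamped_index() {
  local name="$1"
  local stamp="${2:-$(date +%Y%m%d-%H%M)}"
  printf '%s-%s\n' "$name" "$stamp"
}

staging_alias resources-alma-fix
# prints: resources-alma-fix-staging
timestamped_index resources-alma-fix 20240409-2000
# prints: resources-alma-fix-20240409-2000
```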

switching index aliases

We always have at least two indices and two aliases, one of the aliases carrying the -staging suffix. This way we can quickly switch back to an older index if there is a problem with the new one. We also have a staging index for testing purposes (as long as we don't switch the index automatically, that is).

automatically

The aliases are switched automatically if webhook.alma.basedump.switch.automatically is set to true and the newly created index fulfills the thresholds defined there (minDocs and minSize).
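The decision can be pictured as a simple threshold check. The sketch below is not the application's code: the function name is hypothetical, and it assumes that a threshold counts as met when the actual value is greater than or equal to the configured one.

```shell
#!/usr/bin/env bash
# Sketch of the automatic-switch condition (hypothetical helper).
# Arguments: automatically docs size minDocs minSize
# ASSUMPTION: a threshold is met when actual >= configured value.
should_switch() {
  local automatically="$1" docs="$2" size="$3" min_docs="$4" min_size="$5"
  [ "$automatically" = "true" ] &&
    [ "$docs" -ge "$min_docs" ] &&
    [ "$size" -ge "$min_size" ]
}

# Example with the thresholds from the config excerpt above and
# made-up values for the new index:
if should_switch true 83500000 50000000000 83000000 49613349376; then
  echo "switch"
else
  echo "keep"
fi
# prints: switch
```

With automatically = "false", as in the config excerpt, the check never passes and the alias has to be switched manually.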

manually

If you want to switch an alias from e.g. resources-alma-fix-staging to resources-alma-fix, making the index available at https://alma.lobid.org/resources/ , you can switch the alias manually in two ways:

using the Elasticsearch plugin "cerebro"

(see, hbz-internal only: https://dienst-wiki.hbz-nrw.de/display/SEM/Elasticsearch+Indexe?src=contextnavpagetreemode)

invoking the switch-alias webhook

$ curl -Lvvv http://localhost:7507/resources/webhook/switchalias?token=secret

The index alias to be switched is configured in webhook.alma.basedump.indexname. It is swapped with the alias made of this name plus the -staging suffix.

deployment

Note: the deployment is automated via cron. If the Friday deployment is enough for you, you don't need to do anything here. From the crontab:

00 20 * * Fri ssh $user@prod "cd /home/$user/git/lobid-resources-alma/ && git pull && mvn install -DskipTests=true && cd web/ && bash restart.sh lobid-resources-alma"
00 20 * * Tue ssh $user@stage "cd /home/$user/git/lobid-resources-alma/ && git pull && mvn install -DskipTests=true && cd web/ && bash restart.sh lobid-resources-alma"

If you want to deploy a new FIX or new programs: it is crucial to know that it is not sufficient to go to the git repo ($user@stage:[~/git/lobid-resources-test/]) and run the usual git pull and mvn clean install -DskipTests=true. You also have to restart the Play app so that the new Maven build and morphs are picked up. Changes to the webhook parameters in web/conf/resources.conf, by contrast, are loaded dynamically whenever a webhook is called, so no restart is needed after changing them.

Important: as the web app is deployed in a high-availability setup (i.e. mirrored on two different servers), you have to deploy on both servers, $prod and $stage, to keep them in sync. See the "server" section below for how to do this.

Have a look at the logs in web/logs/*.log to see what's going on: application.log is the log file of the web app, ETL.log is the log file of the invoked ETL processes (that's the library you've built by running mvn clean install at the root of the repo).

production

To restart an API it is sufficient to kill the running web app; it is then restarted automatically (see the "production" part of the "server" section for details). First get the PID ("process ID") with lsof -i:7507, then kill it with kill $PID. Wait a minute and check with lsof -i:7507 again whether the web app has come back up.
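The kill-and-wait procedure could be wrapped in a script along these lines. This is a sketch under assumptions: the function names are hypothetical, the retry counts and delays are arbitrary, and it relies on monit bringing the app back up as described on this page.

```shell
#!/usr/bin/env bash
# Sketch: kill the process listening on a port and wait for its return.
# Function names, retry counts and delays are assumptions.

# Retry a command up to $1 times, sleeping $2 seconds after each failure.
wait_until() {
  local attempts="$1" delay="$2" i
  shift 2
  for ((i = 1; i <= attempts; i++)); do
    "$@" && return 0
    sleep "$delay"
  done
  return 1
}

# True if some process listens on TCP port $1.
# lsof -t prints PIDs only; -sTCP:LISTEN ignores client connections.
port_in_use() {
  lsof -t -i:"$1" -sTCP:LISTEN >/dev/null 2>&1
}

restart_app() {
  local port="$1" pid
  pid="$(lsof -t -i:"$port" -sTCP:LISTEN)" || {
    echo "no process listening on port $port" >&2
    return 1
  }
  kill $pid   # unquoted on purpose: lsof may print several PIDs
  sleep 60    # give the monitoring time to notice and restart the app
  wait_until 6 10 port_in_use "$port"
}

# Usage (not run here):
#   restart_app 7507 && echo "web app is back"
```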

test

Test runs at $stage:7502. It is also monitored, so killing the PID triggers a restart (which may take up to 5 minutes). You could also start it manually: first # monit stop lobid-resources-test and then: rm /home/$user/git/lobid-resources-test/web/target/universal/stage/RUNNING_PID; sbt "start 7502"

server

We have git-cloned repos at three locations:

production

The production web app serves "https://alma.lobid.org/resources/". It is deployed on two servers, one of which is the fallback. When $prod:7507 is down, e.g. while it is restarted during a deployment, Apache (using the HA proxy directive) automatically switches to the fallback server at $stage:7507, so the API always appears to be working. Apache switches back to $prod:7507 once it is up again. Both apps are monitored by monit.

The web app (aka "the API") uses the index configured by index.name.

$prod, port 7507

At $user@prod:~/git/lobid-resources-alma the production API is deployed.

$stage, port 7507

At $user@stage:~/git/lobid-resources-alma the production API fallback is deployed.

test

This serves "http://test.lobid.org/resources/".

$stage, port 7502

At $user@stage:~/git/lobid-resources-test the test API is deployed.