Production system upgrade #63

Closed
19 of 22 tasks
datadavev opened this issue Oct 28, 2020 · 2 comments

Comments


datadavev commented Oct 28, 2020

Items here are moved from #20

Real Go-Live during maintenance window (starting 2020-11-12 05:00 PT):

Outcome of this task will be:

EC2 instance: uc3-ezidui01x2-prd
RDS instance: rds-ias-ezid-search4-prd (originally planned as rds-ias-ezid-search2-prd; changed during the upgrade)

Day before production upgrade:

  • ezid: Ensure that the EZID link-checker has been paused (Rushiraj)

Start of upgrade time window:

  • ezid: configure apache service on uc3-ezidx2-prd to display maintenance page (Rushiraj)

  • ias: clone rds instance: rds-ias-ezid-store-prd -> rds-ias-ezid-search2-prd (Martin)

  • ezid: Log in with an SSH tunnel from local port 8080 to port 18880 on the server, then prepare the environment:

ssh ez-new-prd -L 8080:localhost:18880

sudo su - ezid
cdez

. ../etc/ezid_env.sh
echo $DJANGO_SETTINGS_MODULE
# expected output: settings.production

EZDB='rds-ias-ezid-search2-prd.cmcguhglinoa.us-west-2.rds.amazonaws.com'
  • ezid: Ensure apache on uc3-ezidui01x2-prd is shut down:
sudo systemctl stop ezid

After database clone is complete:

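  • ezid: Capture the shoulder table as it is on current prd (before migration and shoulder-merge-master):
# ISO-style date (year-month-day) so the dump files sort chronologically
out="prd.ezidapp_shoulder.not_migrated.$(date '+%Y-%m-%d').sql.xz"
mysqldump -h $EZDB -u ezidrw \
    ezid ezidapp_shoulder \
    --skip-lock-tables \
    | xz > "$out"
# the dump is xz-compressed, so page through it with xzless
xzless "$out"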
  • ezid: Update database host configuration:
out='settings/ezid.conf.shadow'
# rewrite the store_host and search_host values to the new RDS host
perl -i -pe "s/^((store_host|search_host): ).*/\1${EZDB}/" $out
less $out
  • ezid: Set database user to eziddba:
out='settings/common.py'
perl -i -pe 's/ezidrw/eziddba/g' $out
less $out
  • ezid: upgrade db schema on rds-ias-ezid-search2-prd adding new columns to ezidapp_shoulder:
./manage.py migrate
  • ezid: copy most recent version of master_shoulders.txt and minters to uc3-ezidui01x2-prd
  • ezid: synchronize minters (is this action done on db or app server?)
./manage.py shoulder-merge-master
  • ezid: Verify that the BerkeleyDB minter databases are present and working:
./manage.py shoulder-check-minters
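
The host rewrite applied to settings/ezid.conf.shadow can be rehearsed on a scratch file before the real configuration is touched. The following is a minimal sketch, assuming the file holds `store_host: ` and `search_host: ` lines as the substitution pattern implies. The function name `rewrite_db_hosts`, the sample file contents, and the `example.org` host names are hypothetical, and `sed -E` stands in for the perl one-liner so that the result is printed rather than edited in place.

```shell
#!/usr/bin/env bash
# Print the config file with store_host/search_host values replaced by a
# new host. Nothing is modified in place; the result goes to stdout.
rewrite_db_hosts() {
    local new_host="$1" conf_file="$2"
    sed -E "s/^((store_host|search_host): ).*/\1${new_host}/" "$conf_file"
}

# Scratch demo with made-up contents (hypothetical keys and hosts).
scratch="$(mktemp)"
printf '%s\n' \
    'store_host: old-db.example.org' \
    'search_host: old-db.example.org' \
    'other_key: unchanged' > "$scratch"

rewritten="$(rewrite_db_hosts 'new-db.example.org' "$scratch")"
printf '%s\n' "$rewritten"
rm -f "$scratch"
```

Comparing the printed result with the original file (for example with `diff`) shows exactly which lines the substitution would change.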

Bring new EZID instance online:

  • ezid: Start EZID Apache:
sudo systemctl start ezid
  • ezid: test application functionality. url: https://uc3-ezid-ui-prd.cdlib.org

  • ezid: change application to point to the correct DataCite, Crossref, and Handle system instances (Rushiraj)

  • ias: update DNS alias ezid.cdlib.org to point to ALB instance: uc3-ezidui-prd-alb-1936286154.us-west-2.elb.amazonaws.com (Martin)

  • ias: make the ALB publicly accessible (Martin)

  • ezid: test application functionality. url: https://ezid.cdlib.org (everyone who can)

  • ezid: declare that EZID is back online

  • ezid: restore link-checker (will handle in separate issue)

  • ezid: update cron (will handle in separate issue)

  • ezid: clean up development servers etc (will handle in separate issue)


datadavev commented Nov 12, 2020

Issues encountered during the production upgrade:

  • An automated patch process restarted the old production service. Because content had then changed on the old production instance, the database clone had to be restarted. As a consequence, the production RDS instance is now rds-ias-ezid-search4-prd.
  • The value for {production}ezid_base_url in ezid.conf was not updated to use the public address, so five identifiers were given an incorrect target URL. The impacted identifiers were:
MySQL [ezid]> select id, identifier, target from ezidapp_searchidentifier where target like 'https://uc3-ezid-ui-prd.cdlib.org/id/%';
+----------+------------------------+-------------------------------------------------------------+
| id       | identifier             | target                                                      |
+----------+------------------------+-------------------------------------------------------------+
| 34259391 | ark:/99999/fk4z90n85r  | https://uc3-ezid-ui-prd.cdlib.org/id/ark:/99999/fk4z90n85r  |
| 34259392 | doi:10.5072/FK2DJ5J51T | https://uc3-ezid-ui-prd.cdlib.org/id/doi:10.5072/FK2DJ5J51T |
| 34259394 | doi:10.15697/0W3B      | https://uc3-ezid-ui-prd.cdlib.org/id/doi:10.15697/0W3B      |
| 34259625 | ark:/81431/p38w3883x   | https://uc3-ezid-ui-prd.cdlib.org/id/ark:/81431/p38w3883x   |
| 34259626 | ark:/87602/m4/M166769  | https://uc3-ezid-ui-prd.cdlib.org/id/ark:/87602/m4/M166769  |
+----------+------------------------+-------------------------------------------------------------+
  • It was later discovered that a couple of rogue test processes were running, e.g.:
27759 pts/8    Sl+    1:36 /apps/ezid/.pyenv/versions/ezid/bin/python ./manage.py runserver 127.0.0.1:8080

These were writing to an older transaction log with errors like:

2020-11-12 07:53:07,897 ERROR - ERROR register_async._daemonThread/datacite OperationalError: (2005, "Unknown MySQL server host 'rds-ias-ezid-search3-prd.cmcguhglinoa.us-west-2.rds.amazonaws.com' (2)")

Killing these processes resolved the issue. Since they were attempting to write to the old database, it is unlikely there was any consequence. For future upgrades, a server restart before the upgrade is recommended so that no leftover processes are still running.
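
Two of the problems above (leftover `manage.py runserver` processes and a base URL still pointing at the internal host name) could be caught by a short pre-flight check. This is only a sketch under stated assumptions: the function names are hypothetical, the `key: value` layout of the `{production}ezid_base_url` line is assumed, and the second line of the sample `ps` output is made up for illustration.

```shell
#!/usr/bin/env bash
# Read `ps` output on stdin and print lines that look like a Django
# development server. Always returns 0, even when nothing matches.
find_stray_runservers() {
    grep -E 'manage\.py[[:space:]]+runserver' || true
}

# Print (with line numbers) any config lines that still mention the given
# host name. Succeeds only when the host name does not appear at all.
config_is_free_of_host() {
    local host="$1" conf_file="$2"
    if grep -nF -- "$host" "$conf_file"; then
        return 1
    fi
    return 0
}

# Demo on sample data; the sshd line is invented for contrast.
sample_ps='27759 pts/8    Sl+    1:36 /apps/ezid/.pyenv/versions/ezid/bin/python ./manage.py runserver 127.0.0.1:8080
  812 ?        Ss     0:02 /usr/sbin/sshd -D'
stray="$(printf '%s\n' "$sample_ps" | find_stray_runservers)"
echo "stray processes: ${stray:-none}"

scratch="$(mktemp)"
printf '%s\n' '{production}ezid_base_url: https://uc3-ezid-ui-prd.cdlib.org' > "$scratch"
if config_is_free_of_host 'uc3-ezid-ui-prd.cdlib.org' "$scratch" > /dev/null; then
    base_url_status='ok'
else
    base_url_status='still points at internal host'
fi
echo "base URL check: $base_url_status"
rm -f "$scratch"
```

On the server itself the first check would be run as `ps ax | find_stray_runservers`, and the second against the real ezid.conf just before the DNS alias is switched.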


datadavev commented Nov 12, 2020

Some integrity errors are reported in the transaction log:

2020-11-12 07:00:42,236 ERROR - ERROR backproc._backprocDaemon IntegrityError: (1062, "Duplicate entry 'ark:/87602/m4/M166769' for key 'identifier'")
2020-11-12 07:03:18,115 ERROR - ERROR backproc._backprocDaemon IntegrityError: (1062, "Duplicate entry 'ark:/88435/5x21tp714' for key 'identifier'")
2020-11-12 07:08:39,398 ERROR - ERROR backproc._backprocDaemon IntegrityError: (1062, "Duplicate entry 'ark:/99999/fk4030873c' for key 'identifier'")
2020-11-12 07:10:04,825 ERROR - ERROR backproc._backprocDaemon IntegrityError: (1062, "Duplicate entry 'ark:/88435/2801pq60h' for key 'identifier'")
2020-11-12 07:10:14,893 ERROR - ERROR backproc._backprocDaemon IntegrityError: (1062, "Duplicate entry 'doi:10.5072/FK28S4TB5R' for key 'identifier'")
2020-11-12 08:20:39,882 ERROR - ERROR backproc._backprocDaemon IntegrityError: (1062, "Duplicate entry 'ark:/13030/m5sv35t3' for key 'identifier'")
2020-11-12 08:24:55,873 ERROR - ERROR backproc._backprocDaemon IntegrityError: (1062, "Duplicate entry 'ark:/13030/m5p614gz' for key 'identifier'")
2020-11-12 08:35:18,464 ERROR - ERROR backproc._backprocDaemon IntegrityError: (1062, "Duplicate entry 'ark:/13030/qt43k2z2zx' for key 'identifier'")

More info on those identifiers:

select id, identifier,
       FROM_UNIXTIME(createTime) as createTime,
       FROM_UNIXTIME(updateTime) as updateTime,
       status, target, istest
from ezidapp_storeidentifier
where identifier in (
    'ark:/13030/qt43k2z2zx', 'ark:/13030/m5p614gz', 'ark:/13030/m5sv35t3',
    'doi:10.5072/FK28S4TB5R', 'ark:/88435/2801pq60h', 'ark:/99999/fk4030873c',
    'ark:/88435/5x21tp714', 'ark:/87602/m4/M166769'
)
order by updateTime desc;
+----------+------------------------+---------------------+---------------------+--------+------------------------------------------------------------+--------+
| id       | identifier             | createTime          | updateTime          | status | target                                                     | istest |
+----------+------------------------+---------------------+---------------------+--------+------------------------------------------------------------+--------+
| 35274939 | ark:/13030/qt43k2z2zx  | 2020-11-12 08:35:15 | 2020-11-12 08:35:15 | P      | http://merritt.cdlib.org/m/ark%3A%2F13030%2Fqt43k2z2zx     |      0 |
| 35274938 | ark:/13030/m5p614gz    | 2020-11-12 08:24:52 | 2020-11-12 08:24:52 | P      | https://ezid.cdlib.org/id/ark:/13030/m5p614gz              |      0 |
| 35274937 | ark:/13030/m5sv35t3    | 2020-11-12 08:20:39 | 2020-11-12 08:20:39 | P      | https://ezid.cdlib.org/id/ark:/13030/m5sv35t3              |      0 |
| 35274910 | doi:10.5072/FK28S4TB5R | 2020-11-12 07:10:10 | 2020-11-12 07:10:10 | P      | https://google.com                                         |      1 |
| 35274909 | ark:/88435/2801pq60h   | 2020-11-12 07:10:02 | 2020-11-12 07:10:02 | P      | https://catalog.princeton.edu/catalog/62808#view           |      0 |
| 35274908 | ark:/99999/fk4030873c  | 2020-11-12 07:08:39 | 2020-11-12 07:08:39 | P      | https://google.com                                         |      1 |
| 35274907 | ark:/88435/5x21tp714   | 2020-11-12 07:03:15 | 2020-11-12 07:03:15 | P      | https://catalog.princeton.edu/catalog/7150814#view         |      0 |
| 35274906 | ark:/87602/m4/M166769  | 2020-11-12 07:00:41 | 2020-11-12 07:00:41 | R      | https://uc3-ezid-ui-prd.cdlib.org/id/ark:/87602/m4/M166769 |      0 |
+----------+------------------------+---------------------+---------------------+--------+------------------------------------------------------------+--------+
8 rows in set (0.00 sec)
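
The identifier list used in the query can be derived from the transaction log rather than copied by hand. The sketch below assumes log lines in the format quoted in this thread; the function names `duplicate_identifiers` and `sql_in_list` are hypothetical, and the sample input reuses three log lines from this issue.

```shell
#!/usr/bin/env bash
# Read transaction-log lines on stdin and print each identifier named in a
# "Duplicate entry ... for key 'identifier'" error, one per line, sorted
# and de-duplicated. Lines that do not match are ignored.
duplicate_identifiers() {
    sed -n "s/.*Duplicate entry '\([^']*\)' for key 'identifier'.*/\1/p" | sort -u
}

# Read identifiers on stdin (one per line) and join them into a quoted,
# comma-separated list suitable for a SQL IN (...) clause.
sql_in_list() {
    sed "s/.*/'&'/" | paste -sd, -
}

# Demo input: two IntegrityError lines and one unrelated error line.
sample_log="$(mktemp)"
cat > "$sample_log" <<'EOF'
2020-11-12 07:00:42,236 ERROR - ERROR backproc._backprocDaemon IntegrityError: (1062, "Duplicate entry 'ark:/87602/m4/M166769' for key 'identifier'")
2020-11-12 07:03:18,115 ERROR - ERROR backproc._backprocDaemon IntegrityError: (1062, "Duplicate entry 'ark:/88435/5x21tp714' for key 'identifier'")
2020-11-12 07:53:07,897 ERROR - ERROR register_async._daemonThread/datacite OperationalError: (2005, "Unknown MySQL server host 'rds-ias-ezid-search3-prd.cmcguhglinoa.us-west-2.rds.amazonaws.com' (2)")
EOF

in_list="$(duplicate_identifiers < "$sample_log" | sql_in_list)"
echo "where identifier in (${in_list})"
rm -f "$sample_log"
```

Building the list this way avoids transcription mistakes when the number of affected identifiers grows.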
