This repository has been archived by the owner on Nov 3, 2021. It is now read-only.

reduce the number of backup attempts #27

Closed
lotharschulz opened this issue Aug 25, 2016 · 15 comments

@lotharschulz
Contributor

There are too many (zombie) backup processes running at the same time:

  • bus instance:
root     12081  0.0  0.0  45796  1000 ?        S    Aug24   0:00      |       |   \_ CRON
root     12082  0.0  0.0   4500   620 ?        Ss   Aug24   0:00      |       |   |   \_ /bin/sh -c /backup/backup-utils/bin/ghe-backup -v 1>> /var/log/ghe-prod-backup.log 2>&1
root     12083  0.0  0.0   9656   852 ?        S    Aug24   0:00      |       |   |       \_ bash /backup/backup-utils/bin/ghe-backup -v
root     12102  0.0  0.0  11276   560 ?        S    Aug24   0:00      |       |   |           \_ grep ghe-backup
root     12143  0.0  0.0  45796  1000 ?        S    Aug24   0:00      |       |   \_ CRON
root     12144  0.0  0.0   4500   624 ?        Ss   Aug24   0:00      |       |   |   \_ /bin/sh -c /backup/backup-utils/bin/ghe-backup -v 1>> /var/log/ghe-prod-backup.log 2>&1
root     12145  0.0  0.0   9656   852 ?        S    Aug24   0:00      |       |   |       \_ bash /backup/backup-utils/bin/ghe-backup -v
root     12164  0.0  0.0  11276   560 ?        S    Aug24   0:00      |       |   |           \_ grep ghe-backup
root     12216  0.0  0.0  45796  1000 ?        S    Aug24   0:00      |       |   \_ CRON
root     12217  0.0  0.0   4500   624 ?        Ss   Aug24   0:00      |       |   |   \_ /bin/sh -c /backup/backup-utils/bin/ghe-backup -v 1>> /var/log/ghe-prod-backup.log 2>&1
root     12218  0.0  0.0   9656   848 ?        S    Aug24   0:00      |       |   |       \_ bash /backup/backup-utils/bin/ghe-backup -v
root     12237  0.0  0.0  11276   564 ?        S    Aug24   0:00      |       |   |           \_ grep ghe-backup
root     13226  0.0  0.1  45796  1364 ?        S    07:26   0:00      |       |   \_ CRON
root     13227  0.0  0.0   4500   664 ?        Ss   07:26   0:00      |       |   |   \_ /bin/sh -c /backup/backup-utils/bin/ghe-backup -v 1>> /var/log/ghe-prod-backup.log 2>&1
root     13228  0.0  0.1   9656  1512 ?        S    07:26   0:00      |       |   |       \_ bash /backup/backup-utils/bin/ghe-backup -v
root     13247  0.0  0.0  11276   720 ?        S    07:26   0:00      |       |   |           \_ grep ghe-backup
root     13288  0.0  0.1  45796  1364 ?        S    08:26   0:00      |       |   \_ CRON
root     13289  0.0  0.0   4500   660 ?        Ss   08:26   0:00      |       |   |   \_ /bin/sh -c /backup/backup-utils/bin/ghe-backup -v 1>> /var/log/ghe-prod-backup.log 2>&1
root     13290  0.0  0.1   9656  1520 ?        S    08:26   0:00      |       |   |       \_ bash /backup/backup-utils/bin/ghe-backup -v
root     13309  0.0  0.0  11276   724 ?        S    08:26   0:00      |       |   |           \_ grep ghe-backup
root     13350  0.0  0.1  45796  1364 ?        S    09:26   0:00      |       |   \_ CRON
root     13351  0.0  0.0   4500   664 ?        Ss   09:26   0:00      |       |   |   \_ /bin/sh -c /backup/backup-utils/bin/ghe-backup -v 1>> /var/log/ghe-prod-backup.log 2>&1
root     13352  0.0  0.1   9656  1516 ?        S    09:26   0:00      |       |   |       \_ bash /backup/backup-utils/bin/ghe-backup -v
root     13371  0.0  0.0  11276   728 ?        S    09:26   0:00      |       |   |           \_ grep ghe-backup
root     13412  0.0  0.1  45796  1364 ?        S    10:26   0:00      |       |   \_ CRON
root     13413  0.0  0.0   4500   664 ?        Ss   10:26   0:00      |       |   |   \_ /bin/sh -c /backup/backup-utils/bin/ghe-backup -v 1>> /var/log/ghe-prod-backup.log 2>&1
root     13414  0.0  0.1   9656  1516 ?        S    10:26   0:00      |       |   |       \_ bash /backup/backup-utils/bin/ghe-backup -v
root     13433  0.0  0.0  11276   728 ?        S    10:26   0:00      |       |   |           \_ grep ghe-backup
root     13485  0.0  0.1  45796  1364 ?        S    11:26   0:00      |       |   \_ CRON
root     13486  0.0  0.0   4500   664 ?        Ss   11:26   0:00      |       |   |   \_ /bin/sh -c /backup/backup-utils/bin/ghe-backup -v 1>> /var/log/ghe-prod-backup.log 2>&1
root     13487  0.0  0.1   9656  1512 ?        S    11:26   0:00      |       |   |       \_ bash /backup/backup-utils/bin/ghe-backup -v
root     13506  0.0  0.0  11276   724 ?        S    11:26   0:00      |       |   |           \_ grep ghe-backup
root     13547  0.0  0.1  45796  1364 ?        S    12:26   0:00      |       |   \_ CRON
root     13548  0.0  0.0   4500   664 ?        Ss   12:26   0:00      |       |   |   \_ /bin/sh -c /backup/backup-utils/bin/ghe-backup -v 1>> /var/log/ghe-prod-backup.log 2>&1
root     13549  0.0  0.1   9656  1516 ?        S    12:26   0:00      |       |   |       \_ bash /backup/backup-utils/bin/ghe-backup -v
root     13568  0.0  0.0  11276   724 ?        S    12:26   0:00      |       |   |           \_ grep ghe-backup
root     13609  0.0  0.1  45796  1364 ?        S    13:26   0:00      |       |   \_ CRON
root     13610  0.0  0.0   4500   660 ?        Ss   13:26   0:00      |       |       \_ /bin/sh -c /backup/backup-utils/bin/ghe-backup -v 1>> /var/log/ghe-prod-backup.log 2>&1
root     13611  0.0  0.1   9656  1516 ?        S    13:26   0:00      |       |           \_ bash /backup/backup-utils/bin/ghe-backup -v
root     13630  0.0  0.0  11276   728 ?        S    13:26   0:00      |       |               \_ grep ghe-backup
  • automata instance:
root     11015  0.0  0.0  11276   124 ?        S    Aug22   0:00              |   |           \_ grep ghe-backup
root     11132  0.0  0.0  45796   380 ?        S    Aug22   0:00              |   \_ CRON
root     11133  0.0  0.0   4500    96 ?        Ss   Aug22   0:00              |   |   \_ /bin/sh -c /backup/backup-utils/bin/ghe-backup -v 1>> /var/log/ghe-prod-backup.log 2>&1
root     11134  0.0  0.0   9656   296 ?        S    Aug22   0:00              |   |       \_ bash /backup/backup-utils/bin/ghe-backup -v
root     11153  0.0  0.0  11276   124 ?        S    Aug22   0:00              |   |           \_ grep ghe-backup
root     11255  0.0  0.0  45796   380 ?        S    Aug22   0:00              |   \_ CRON
root     11256  0.0  0.0   4500    92 ?        Ss   Aug22   0:00              |   |   \_ /bin/sh -c /backup/backup-utils/bin/ghe-backup -v 1>> /var/log/ghe-prod-backup.log 2>&1
root     11257  0.0  0.0   9656   292 ?        S    Aug22   0:00              |   |       \_ bash /backup/backup-utils/bin/ghe-backup -v
root     11276  0.0  0.0  11276   124 ?        S    Aug22   0:00              |   |           \_ grep ghe-backup
root     11379  0.0  0.0  45796   380 ?        S    Aug22   0:00              |   \_ CRON
root     11380  0.0  0.0   4500   100 ?        Ss   Aug22   0:00              |   |   \_ /bin/sh -c /backup/backup-utils/bin/ghe-backup -v 1>> /var/log/ghe-prod-backup.log 2>&1
root     11381  0.0  0.0   9656   292 ?        S    Aug22   0:00              |   |       \_ bash /backup/backup-utils/bin/ghe-backup -v
root     11400  0.0  0.0  11276   128 ?        S    Aug22   0:00              |   |           \_ grep ghe-backup
root     11505  0.0  0.0  45796   380 ?        S    Aug22   0:00              |   \_ CRON
root     11506  0.0  0.0   4500    96 ?        Ss   Aug22   0:00              |   |   \_ /bin/sh -c /backup/backup-utils/bin/ghe-backup -v 1>> /var/log/ghe-prod-backup.log 2>&1
root     11507  0.0  0.0   9656   300 ?        S    Aug22   0:00              |   |       \_ bash /backup/backup-utils/bin/ghe-backup -v
root     11526  0.0  0.0  11276   128 ?        S    Aug22   0:00              |   |           \_ grep ghe-backup
root     11641  0.0  0.0  45796   380 ?        S    Aug22   0:00              |   \_ CRON
root     11642  0.0  0.0   4500    96 ?        Ss   Aug22   0:00              |   |   \_ /bin/sh -c /backup/backup-utils/bin/ghe-backup -v 1>> /var/log/ghe-prod-backup.log 2>&1
root     11643  0.0  0.0   9656   296 ?        S    Aug22   0:00              |   |       \_ bash /backup/backup-utils/bin/ghe-backup -v
root     11662  0.0  0.0  11276   128 ?        S    Aug22   0:00              |   |           \_ grep ghe-backup
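
One common way to keep cron from stacking up invocations like the ones listed above is to take a non-blocking file lock before starting the backup. The following is a minimal sketch of that idea, not necessarily what the fixes referenced in this issue implement; the lock path and the function names `try_lock` and `run_exclusively` are hypothetical.

```python
import fcntl
import subprocess

# Hypothetical lock path (an assumption); any file on a local filesystem works.
LOCK_PATH = "/tmp/ghe-backup.lock"


def try_lock(path):
    """Return an open file holding an exclusive lock, or None if it is taken."""
    handle = open(path, "w")
    try:
        # LOCK_NB makes flock fail immediately instead of waiting.
        fcntl.flock(handle, fcntl.LOCK_EX | fcntl.LOCK_NB)
    except BlockingIOError:
        handle.close()
        return None
    return handle


def run_exclusively(command, lock_path=LOCK_PATH):
    """Run command only if nobody else holds the lock.

    Returns the command's exit code, or None if the run was skipped.
    """
    lock = try_lock(lock_path)
    if lock is None:
        return None
    try:
        return subprocess.call(command)
    finally:
        lock.close()  # closing the file releases the lock
```

With this, a cron job could call `run_exclusively(["/backup/backup-utils/bin/ghe-backup", "-v"])`; a `None` result means an earlier run still holds the lock. A lock only stops new attempts from piling up; it does not unstick a backup that is already hanging.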
@lotharschulz
Contributor Author

#28 merged

@lotharschulz
Contributor Author

Deployment not successful:

docker/c6a169a6e9a8[797]: python3: can't open file '/delete-instuck-backups/delete-instuck-progress.py':

@lotharschulz
Contributor Author

#29 should fix the issue that caused the failed deployment

@lotharschulz
Contributor Author

@lotharschulz
Contributor Author

lotharschulz commented Aug 29, 2016

  • deployment to automata successful
  • manually started backup currently ongoing

@lotharschulz
Contributor Author

  • manually started backup completed
  • waiting for next cron triggered backup

@lotharschulz
Contributor Author

lotharschulz commented Aug 29, 2016

  • the backup attempt left an in-progress file behind;
    it looks like this is caused by:
...
root       796  0.1  2.0 600936 20936 ?        Ssl  08:07   0:24 /usr/bin/docker daemon --storage-driver=aufs --raw-logs
root      1196  0.0  0.6 138448  6512 ?        Ssl  08:07   0:00  \_ docker-containerd -l /var/run/docker/libcontainerd/docker-containerd.sock --runtime docker-runc
root      9738  0.0  0.0 200464   664 ?        Sl   08:08   0:00      \_ docker-containerd-shim b3c6e6edd07641725c49e5da86cc0452f03c10d8befc75127a7efc81bdb64c39 /var/run/docker/libcontainerd/b3c6e6edd07641725c49e5da86cc0452f03c10d8befc75127a7efc81bdb64c39 docker-runc
root      9756  0.0  0.0  20972   304 ?        Ss   08:08   0:00      |   \_ /bin/bash /backup/final-docker-cmd.sh
root      9787  0.0  0.0  29000   496 ?        Ss   08:08   0:00      |       \_ cron
root     11383  0.0  0.1  45796  1080 ?        S    09:26   0:00      |       |   \_ CRON
root     11384  0.0  0.0   4500   548 ?        Ss   09:26   0:00      |       |       \_ /bin/sh -c /backup/backup-utils/bin/ghe-backup -v 1>> /var/log/ghe-prod-backup.log 2>&1
root     11385  0.0  0.1   9664  1528 ?        S    09:26   0:00      |       |           \_ bash /backup/backup-utils/bin/ghe-backup -v
root     12310  0.0  0.1   9652  1508 ?        S    09:36   0:00      |       |               \_ bash /backup/backup-utils/share/github-backup-utils/ghe-backup-userdata hookshot
root     12343  0.0  0.1   9652  1496 ?        S    09:36   0:00      |       |                   \_ bash /backup/backup-utils/share/github-backup-utils/ghe-rsync -avz -e ghe-ssh -p 122 --rsync-path=sudo -u git rsync --link-dest=../../current/hookshot github.bus.zalan.d
root     12351  0.0  0.0   9652   464 ?        S    09:36   0:00      |       |                       \_ bash /backup/backup-utils/share/github-backup-utils/ghe-rsync -avz -e ghe-ssh -p 122 --rsync-path=sudo -u git rsync --link-dest=../../current/hookshot github.bus.zal
root     12352  0.0  0.0  11276   728 ?        S    09:36   0:00      |       |                           \_ grep -E -v ^(file has vanished: |rsync warning: some files vanished before they could be transferred)
root      9788  0.0  0.0   7316   120 ?        S    08:08   0:00      |       \_ tail -F /var/log/ghe-prod-backup.log
root     10000  0.0  0.0 126732   628 ?        Sl   08:27   0:00      \_ docker-containerd-shim b3c6e6edd07641725c49e5da86cc0452f03c10d8befc75127a7efc81bdb64c39 /var/run/docker/libcontainerd/b3c6e6edd07641725c49e5da86cc0452f03c10d8befc75127a7efc81bdb64c39 docker-runc
root     10014  0.0  0.1  21168  1176 pts/4    Ss+  08:27   0:00          \_ bash
message+   810  0.0  0.0  39216   864 ?        Ss   08:07   0:00 dbus-daemon --system --fork
...

According to https://github.com/github/backup-utils/blob/master/share/github-backup-utils/ghe-rsync, rsync error "24" is not caught.
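
The leftover in-progress file described in this comment could in principle be detected and cleaned up by checking whether the process that wrote it still exists. The sketch below assumes the marker holds a snapshot timestamp and a PID separated by whitespace; that format, and the helper names, are assumptions for illustration and are not taken from the script deployed in this thread.

```python
import os


def read_in_progress(path):
    """Return (snapshot, pid) from the marker file, or None if absent or malformed."""
    try:
        with open(path) as handle:
            fields = handle.read().split()
    except FileNotFoundError:
        return None
    if len(fields) != 2 or not fields[1].isdigit():
        return None
    return fields[0], int(fields[1])


def pid_is_alive(pid):
    """Check whether a process with this PID exists (signal 0 delivers nothing)."""
    try:
        os.kill(pid, 0)
    except ProcessLookupError:
        return False
    except PermissionError:
        return True  # the process exists but belongs to another user
    return True


def remove_if_stale(path):
    """Delete the marker when its recorded PID is gone. Returns True if removed."""
    entry = read_in_progress(path)
    if entry is None:
        return False
    _snapshot, pid = entry
    if pid_is_alive(pid):
        return False
    os.remove(path)
    return True
```

This only covers a marker whose process has died. A backup that is still alive but hanging, as in the process tree above, would pass the PID check and would need a separate age or timeout check.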

@lotharschulz
Contributor Author

  • deploying to bus account as well

@lotharschulz
Contributor Author

  • deployment to bus account successful
  • the next cron-triggered backup attempt will kick in in about 2 hours 50 minutes

@lotharschulz
Contributor Author

The bus account shows several ghe-backup instances, most likely due to vanished files:

root      9830  0.0  0.0 200464   696 ?        Sl   Aug29   0:00      \_ docker-containerd-shim f4b342791d2ce043744d28c2bf694f2c4f81d3f7eb51b611d1bf562fd47af075 /var/run/docker/libcontainerd/f4b342791d2ce043744d28c2bf694f2c4f81d3f7eb51b611d1bf562fd47af075 docker-runc
root      9842  0.0  0.0  20972   312 ?        Ss   Aug29   0:00          \_ /bin/bash /backup/final-docker-cmd.sh
root      9877  0.0  0.0  29000   488 ?        Ss   Aug29   0:00              \_ cron
root     20299  0.0  0.0  45796   524 ?        S    Aug30   0:00              |   \_ CRON
root     20300  0.0  0.0   4500   192 ?        Ss   Aug30   0:00              |   |   \_ /bin/sh -c /backup/backup-utils/bin/ghe-backup -v 1>> /var/log/ghe-prod-backup.log 2>&1
root     20301  0.0  0.0   9664   420 ?        S    Aug30   0:00              |   |       \_ bash /backup/backup-utils/bin/ghe-backup -v
root      4732  0.0  0.0   9652   400 ?        S    Aug30   0:00              |   |           \_ bash /backup/backup-utils/share/github-backup-utils/ghe-backup-userdata hookshot
root      4765  0.0  0.0   9652   400 ?        S    Aug30   0:00              |   |               \_ bash /backup/backup-utils/share/github-backup-utils/ghe-rsync -avz -e ghe-ssh -p 122 --rsync-path=sudo -u git rsync --link-dest=../../current/hookshot github.bus.zalan.do:/data/user/hookshot/ /data/ghe-production-data/20160830T092601/hoo
root      4772  0.0  0.0   9652   288 ?        S    Aug30   0:00              |   |                   \_ bash /backup/backup-utils/share/github-backup-utils/ghe-rsync -avz -e ghe-ssh -p 122 --rsync-path=sudo -u git rsync --link-dest=../../current/hookshot github.bus.zalan.do:/data/user/hookshot/ /data/ghe-production-data/20160830T092601
root      4775  0.0  3.5 103636 36168 ?        S    Aug30   0:03              |   |                   |   \_ rsync -avz -e ghe-ssh -p 122 --rsync-path=sudo -u git rsync --link-dest=../../current/hookshot github.bus.zalan.do:/data/user/hookshot/ /data/ghe-production-data/20160830T092601/hookshot
root      4778  0.0  0.2  48648  2624 ?        S    Aug30   0:00              |   |                   |   |   \_ ssh -p 22 -l admin -o StrictHostKeyChecking=no -p 122 -o BatchMode=yes github.bus.zalan.do -- nice -n 19 ionice -c 3 sudo -u git rsync --server --sender -vlogDtprze.iLsfx . /data/user/hookshot/
root      4792  0.0  2.5 196884 25672 ?        S    Aug30   0:00              |   |                   |   |   \_ rsync -avz -e ghe-ssh -p 122 --rsync-path=sudo -u git rsync --link-dest=../../current/hookshot github.bus.zalan.do:/data/user/hookshot/ /data/ghe-production-data/20160830T092601/hookshot
root      4776  0.0  0.0   9652   292 ?        S    Aug30   0:00              |   |                   |   \_ bash /backup/backup-utils/share/github-backup-utils/ghe-rsync -avz -e ghe-ssh -p 122 --rsync-path=sudo -u git rsync --link-dest=../../current/hookshot github.bus.zalan.do:/data/user/hookshot/ /data/ghe-production-data/20160830T09
root      4777  0.0  0.0  11276   124 ?        S    Aug30   0:00              |   |                   |       \_ grep -E -v ^(file has vanished: |rsync warning: some files vanished before they could be transferred)
root      4773  0.0  0.0   9652   288 ?        S    Aug30   0:00              |   |                   \_ bash /backup/backup-utils/share/github-backup-utils/ghe-rsync -avz -e ghe-ssh -p 122 --rsync-path=sudo -u git rsync --link-dest=../../current/hookshot github.bus.zalan.do:/data/user/hookshot/ /data/ghe-production-data/20160830T092601
root      4774  0.0  0.0  11276   124 ?        S    Aug30   0:00              |   |                       \_ grep -E -v ^(file has vanished: |rsync warning: some files vanished before they could be transferred)
root     17155  0.0  0.0  45796   536 ?        S    Aug30   0:00              |   \_ CRON
root     17156  0.0  0.0   4500   200 ?        Ss   Aug30   0:00              |   |   \_ /bin/sh -c /backup/backup-utils/bin/ghe-backup -v 1>> /var/log/ghe-prod-backup.log 2>&1
root     17157  0.0  0.0   9656   408 ?        S    Aug30   0:00              |   |       \_ bash /backup/backup-utils/bin/ghe-backup -v
root     17176  0.0  0.0  11276   124 ?        S    Aug30   0:00              |   |           \_ grep ghe-backup
root     12718  0.0  0.0  45796   536 ?        S    Aug30   0:00              |   \_ CRON
root     12719  0.0  0.0   4500   200 ?        Ss   Aug30   0:00              |   |   \_ /bin/sh -c /backup/backup-utils/bin/ghe-backup -v 1>> /var/log/ghe-prod-backup.log 2>&1
root     12720  0.0  0.0   9656   412 ?        S    Aug30   0:00              |   |       \_ bash /backup/backup-utils/bin/ghe-backup -v
root     12739  0.0  0.0  11276   124 ?        S    Aug30   0:00              |   |           \_ grep ghe-backup
root     21116  0.0  0.1  45796  1356 ?        S    09:26   0:00              |   \_ CRON
root     21117  0.0  0.0   4500   664 ?        Ss   09:26   0:00              |       \_ /bin/sh -c /backup/backup-utils/bin/ghe-backup -v 1>> /var/log/ghe-prod-backup.log 2>&1
root     21118  0.0  0.1   9656  1500 ?        S    09:26   0:00              |           \_ bash /backup/backup-utils/bin/ghe-backup -v
root      9878  0.0  0.0   7316    88 ?        S    Aug29   0:06              \_ tail -F /var/log/ghe-prod-backup.log
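
rsync documents exit code 24 as "Partial transfer due to vanished source files", which matches the messages filtered by the `grep -E -v` processes in the listing above. As a minimal sketch of how a caller could tolerate that code and also bound the run time, assuming a hypothetical helper named `run_tolerating_vanished` that is not part of backup-utils:

```python
import subprocess

# rsync exit code for "Partial transfer due to vanished source files".
RSYNC_VANISHED = 24


def run_tolerating_vanished(command, timeout=None):
    """Run command, treating exit code 24 as success.

    Other exit codes pass through unchanged. If timeout (in seconds) is
    given and expires, the child is killed and subprocess.TimeoutExpired
    is raised.
    """
    code = subprocess.call(command, timeout=timeout)
    return 0 if code == RSYNC_VANISHED else code
```

The timeout is the part that addresses processes that never return; mapping the exit code alone would not help with a transfer that hangs.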

lotharschulz pushed a commit that referenced this issue Sep 1, 2016
lotharschulz pushed a commit that referenced this issue Sep 1, 2016
@lotharschulz
Contributor Author

lotharschulz commented Sep 2, 2016

root      1233  0.0  0.7 213236  7868 ?        Ssl  Aug29   0:06  \_ docker-containerd -l /var/run/docker/libcontainerd/docker-containerd.sock --runtime docker-runc
root     27548  0.0  0.2 200464  2680 ?        Sl   Sep01   0:00      \_ docker-containerd-shim f4b342791d2ce043744d28c2bf694f2c4f81d3f7eb51b611d1bf562fd47af075 /var/run/docker/libcontainerd/f4b342791d2ce043744d28c2bf694f2c4f81d3f7eb51b611d1bf562fd47af075 docker-runc
root     27576  0.0  0.0  20972   276 ?        Ss   Sep01   0:00      |   \_ /bin/bash /backup/final-docker-cmd.sh
root     27597  0.0  0.0  29000   492 ?        Ss   Sep01   0:00      |       \_ cron
root     25859  0.0  0.0  45796   612 ?        S    09:26   0:00      |       |   \_ CRON
root     25860  0.0  0.0   4500   248 ?        Ss   09:26   0:00      |       |   |   \_ /bin/sh -c /backup/backup-utils/bin/ghe-backup -v 1>> /var/log/ghe-prod-backup.log 2>&1
root     25861  0.0  0.0   9664   592 ?        S    09:26   0:00      |       |   |       \_ bash /backup/backup-utils/bin/ghe-backup -v
root     15975  0.0  0.0   9652   988 ?        S    09:55   0:00      |       |   |           \_ bash /backup/backup-utils/share/github-backup-utils/ghe-backup-userdata hookshot
root     16008  0.0  0.0   9652   988 ?        S    09:55   0:00      |       |   |               \_ bash /backup/backup-utils/share/github-backup-utils/ghe-rsync -avz -e ghe-ssh -p 122 --rsync-path=sudo -u git rsync --link-dest=../../current/hookshot github.bus.zalan.d
root     16015  0.0  0.0   9652   296 ?        S    09:55   0:00      |       |   |                   \_ bash /backup/backup-utils/share/github-backup-utils/ghe-rsync -avz -e ghe-ssh -p 122 --rsync-path=sudo -u git rsync --link-dest=../../current/hookshot github.bus.zal
root     16018  0.0  0.7 103672  7612 ?        S    09:55   0:03      |       |   |                   |   \_ rsync -avz -e ghe-ssh -p 122 --rsync-path=sudo -u git rsync --link-dest=../../current/hookshot github.bus.zalan.do:/data/user/hookshot/ /data/ghe-production-data
root     16021  0.0  0.2  48176  2908 ?        S    09:55   0:00      |       |   |                   |   |   \_ ssh -p 22 -l admin -o StrictHostKeyChecking=no -p 122 -o BatchMode=yes github.bus.zalan.do -- nice -n 19 ionice -c 3 sudo -u git rsync --server --sender -vlo
root     16035  0.0  2.1 179496 21448 ?        S    09:55   0:00      |       |   |                   |   |   \_ rsync -avz -e ghe-ssh -p 122 --rsync-path=sudo -u git rsync --link-dest=../../current/hookshot github.bus.zalan.do:/data/user/hookshot/ /data/ghe-production-
root     16019  0.0  0.0   9652   296 ?        S    09:55   0:00      |       |   |                   |   \_ bash /backup/backup-utils/share/github-backup-utils/ghe-rsync -avz -e ghe-ssh -p 122 --rsync-path=sudo -u git rsync --link-dest=../../current/hookshot github.bus
root     16020  0.0  0.0  11276   632 ?        S    09:55   0:00      |       |   |                   |       \_ grep -E -v ^(file has vanished: |rsync warning: some files vanished before they could be transferred)
root     16016  0.0  0.0   9652   292 ?        S    09:55   0:00      |       |   |                   \_ bash /backup/backup-utils/share/github-backup-utils/ghe-rsync -avz -e ghe-ssh -p 122 --rsync-path=sudo -u git rsync --link-dest=../../current/hookshot github.bus.zal
root     16017  0.0  0.0  11276   632 ?        S    09:55   0:00      |       |   |                       \_ grep -E -v ^(file has vanished: |rsync warning: some files vanished before they could be transferred)
root     22941  0.0  0.1  45796  1360 ?        S    14:26   0:00      |       |   \_ CRON
root     22942  0.0  0.0   4500   664 ?        Ss   14:26   0:00      |       |       \_ /bin/sh -c /backup/backup-utils/bin/ghe-backup -v 1>> /var/log/ghe-prod-backup.log 2>&1
root     22943  0.0  0.1   9656  1524 ?        S    14:26   0:00      |       |           \_ bash /backup/backup-utils/bin/ghe-backup -v
root     22962  0.0  0.0  11276   720 ?        S    14:26   0:00      |       |               \_ grep ghe-backup
root     27598  0.0  0.0   7316    88 ?        S    Sep01   0:04      |       \_ tail -F /var/log/ghe-prod-backup.log
root@f4b342791d2c:/data/ghe-production-data# ll
total 88
drwxr-xr-x 14 root root 28672 Sep  2 14:26 ./
drwxrwxrwx  5 root root  4096 Feb 11  2016 ../
drwxr-xr-x 10 root root  4096 Aug 29 09:42 20160829T092601/
drwxr-xr-x 10 root root  4096 Aug 29 10:41 20160829T102601/
drwxr-xr-x 10 root root  4096 Aug 29 11:49 20160829T112601/
drwxr-xr-x 10 root root  4096 Aug 29 12:45 20160829T122601/
drwxr-xr-x 10 root root  4096 Aug 29 13:51 20160829T132601/
drwxr-xr-x 10 root root  4096 Aug 29 14:47 20160829T142601/
drwxr-xr-x 10 root root  4096 Aug 29 15:47 20160829T152601/
drwxr-xr-x 11 root root  4096 Aug 29 19:49 20160829T192601/
drwxr-xr-x 11 root root  4096 Sep  1 16:33 20160901T154417/
drwxr-xr-x 11 root root  4096 Sep  1 19:56 20160901T192601/
drwxr-xr-x  9 root root  4096 Sep  2 09:54 20160902T092601/
drwxr-xr-x  2 root root  4096 Sep  2 14:26 20160902T142601/
lrwxrwxrwx  1 root root    15 Sep  1 19:56 current -> 20160901T192601/
-rw-r--r--  1 root root    21 Sep  2 09:26 in-progress
root@f4b342791d2c:/data/ghe-production-data# ll 20160902T142601
total 36
drwxr-xr-x  2 root root  4096 Sep  2 14:26 ./
drwxr-xr-x 14 root root 28672 Sep  2 14:26 ../
-rw-r--r--  1 root root     0 Sep  2 14:26 incomplete
root@f4b342791d2c:/data/ghe-production-data# ll 20160901T192601
total 1350400
drwxr-xr-x 11 root root      4096 Sep  1 19:56 ./
drwxr-xr-x 14 root root     28672 Sep  2 14:26 ../
drwx------  6  500  500      4096 Sep  2  2015 alambic_assets/
drwxr-xr-x  2 root root      4096 Sep  1 19:28 audit-log/
-rw-r--r--  1 root root      5139 Sep  1 19:26 authorized-keys.json
drwxr-xr-x  2 root root      4096 Sep  1 19:26 benchmarks/
drwx------  3  601  601      4096 Feb 26  2016 elasticsearch/
-rw-r--r--  1 root root     17520 Sep  1 19:26 enterprise.ghl
drwxr-xr-x  3 root root      4096 Sep  1 19:55 git-hooks/
drwx------ 12  500  500      4096 Sep  1 03:47 hookshot/
-rw-r--r--  1 root root         0 Sep  1 19:26 manage-password+
-rw-r--r--  1 root root 648942490 Sep  1 19:27 mysql.sql.gz
drwx------ 10  500  500      4096 Oct 22  2015 pages/
-rw-r--r--  1 root root 733695063 Sep  1 19:28 redis.rdb
drwx------ 19  500  500      4096 Sep  1 19:28 repositories/
-rw-r--r--  1 root root     10240 Sep  1 19:26 saml-keys.tar
-rw-r--r--  1 root root     22630 Sep  1 19:26 settings.json
-rw-r--r--  1 root root     10240 Sep  1 19:26 ssh-host-keys.tar
drwx------ 18  500  500      4096 Mar 23 08:59 storage/
-rw-r--r--  1 root root         6 Sep  1 19:26 strategy
-rw-r--r--  1 root root         7 Sep  1 19:26 version
root@f4b342791d2c:/data/ghe-production-data# cat 20160901T192601/benchmarks/benchmark.*.log
ghe-backup-store-version took 0s
ghe-backup-settings took 1s
ghe-export-authorized-keys took 0s
ghe-export-ssh-host-keys took 0s
ghe-export-mysql took 102s
ghe-backup-redis took 64s
ghe-backup-es-audit-log took 2s
ghe-backup-es-hookshot took 1s
ghe-backup-repositories-rsync took 1457s
ghe-backup-userdata - pages took 21s
ghe-backup-pages-rsync took 21s
ghe-backup-userdata - alambic_assets took 3s
ghe-backup-userdata - storage took 17s
ghe-backup-userdata - hookshot took 81s
ghe-backup-userdata - git-hooks/repos took 1s
ghe-backup-es-rsync took 61s
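
To see where the time goes in a benchmark log like the one above, the "took Ns" lines can be parsed and ranked. This is a small sketch; the function name `parse_benchmarks` is made up for illustration, and the summed figure may double count steps that wrap other steps (for example the two 21s pages entries), so it is best read as a rough upper bound.

```python
import re

# Matches lines such as "ghe-export-mysql took 102s".
TOOK = re.compile(r"^(?P<step>.+) took (?P<seconds>\d+)s$")

LOG = """\
ghe-backup-store-version took 0s
ghe-backup-settings took 1s
ghe-export-authorized-keys took 0s
ghe-export-ssh-host-keys took 0s
ghe-export-mysql took 102s
ghe-backup-redis took 64s
ghe-backup-es-audit-log took 2s
ghe-backup-es-hookshot took 1s
ghe-backup-repositories-rsync took 1457s
ghe-backup-userdata - pages took 21s
ghe-backup-pages-rsync took 21s
ghe-backup-userdata - alambic_assets took 3s
ghe-backup-userdata - storage took 17s
ghe-backup-userdata - hookshot took 81s
ghe-backup-userdata - git-hooks/repos took 1s
ghe-backup-es-rsync took 61s
"""


def parse_benchmarks(text):
    """Return a list of (step, seconds) tuples for every 'took Ns' line."""
    steps = []
    for line in text.splitlines():
        match = TOOK.match(line.strip())
        if match:
            steps.append((match.group("step"), int(match.group("seconds"))))
    return steps


steps = parse_benchmarks(LOG)
slowest = max(steps, key=lambda item: item[1])
total = sum(seconds for _step, seconds in steps)

print(slowest)  # ('ghe-backup-repositories-rsync', 1457)
print(total)    # 1832
```

The repository rsync accounts for 1457 of the 1832 summed seconds in this successful run.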

@lotharschulz
Contributor Author

lotharschulz pushed a commit that referenced this issue Sep 6, 2016
@lotharschulz
Contributor Author

The change from #27 (comment) is deployed; backups were kicked off manually and are currently running. Current process list:

bus:

root        48  0.0  0.0  21088   436 ?        S    09:55   0:00  \_ bash /backup/backup-utils/bin/ghe-backup
root       535  0.0  0.0  21104   912 ?        S    09:58   0:00  |   \_ bash /backup/backup-utils/share/github-backup-utils/ghe-backup-repositories-rsync
root       664  0.0  0.1  21076  1676 ?        S    10:03   0:00  |       \_ bash /backup/backup-utils/share/github-backup-utils/ghe-rsync -av -e ghe-ssh -p 122 --link-dest=../../current/repositories -H --rsync-path=sudo -u git rsync --include-from=- --exclude=* github.
root       671  0.0  0.0  21076   540 ?        S    10:03   0:00  |           \_ bash /backup/backup-utils/share/github-backup-utils/ghe-rsync -av -e ghe-ssh -p 122 --link-dest=../../current/repositories -H --rsync-path=sudo -u git rsync --include-from=- --exclude=* git
root       674  0.2  1.2 158624 13148 ?        S    10:03   0:04  |           |   \_ rsync -av -e ghe-ssh -p 122 --link-dest=../../current/repositories -H --rsync-path=sudo -u git rsync --include-from=- --exclude=* github.bus.zalan.do:/data/user/repositories/ /data/ghe-
root       677 12.0  0.7  51080  7464 ?        S    10:03   2:52  |           |   |   \_ ssh -p 22 -l admin -o StrictHostKeyChecking=no -p 122 -o BatchMode=yes github.bus.zalan.do -- nice -n 19 ionice -c 3 sudo -u git rsync --server --sender -vlHogDtpre.iLsfx . /data/us
root       691  6.2  3.7 267036 38332 ?        S    10:03   1:29  |           |   |   \_ rsync -av -e ghe-ssh -p 122 --link-dest=../../current/repositories -H --rsync-path=sudo -u git rsync --include-from=- --exclude=* github.bus.zalan.do:/data/user/repositories/ /data/
root       675  0.0  0.0  21076   528 ?        S    10:03   0:00  |           |   \_ bash /backup/backup-utils/share/github-backup-utils/ghe-rsync -av -e ghe-ssh -p 122 --link-dest=../../current/repositories -H --rsync-path=sudo -u git rsync --include-from=- --exclude=*
root       676  0.0  0.1  14344  1052 ?        S    10:03   0:00  |           |       \_ grep -E -v ^(file has vanished: |rsync warning: some files vanished before they could be transferred)
root       672  0.0  0.0  21076   672 ?        S    10:03   0:00  |           \_ bash /backup/backup-utils/share/github-backup-utils/ghe-rsync -av -e ghe-ssh -p 122 --link-dest=../../current/repositories -H --rsync-path=sudo -u git rsync --include-from=- --exclude=* git
root       673  0.0  0.0  14216   864 ?        S    10:03   0:00  |               \_ grep -E -v ^(file has vanished: |rsync warning: some files vanished before they could be transferred)
root       734  0.0  0.1  37356  1564 ?        R+   10:26   0:00  \_ ps -auxwf

automata:

root        49  0.0  0.0  21088   600 ?        S    10:11   0:00  \_ bash /backup/backup-utils/bin/ghe-backup
root       566  0.0  0.1  21112  1052 ?        S    10:15   0:00  |   \_ bash /backup/backup-utils/share/github-backup-utils/ghe-backup-repositories-rsync
root       669  0.0  0.1  21076  1760 ?        S    10:24   0:00  |       \_ bash /backup/backup-utils/share/github-backup-utils/ghe-rsync -av -e ghe-ssh -p 122 --link-dest=../../current/repositories -z --rsync-path=sudo -u git rsync --include-from=- --exclude=* github.
root       676  0.0  0.0  21076   548 ?        S    10:24   0:00  |           \_ bash /backup/backup-utils/share/github-backup-utils/ghe-rsync -av -e ghe-ssh -p 122 --link-dest=../../current/repositories -z --rsync-path=sudo -u git rsync --include-from=- --exclude=* git
root       679  1.6  2.6 165952 26656 ?        D    10:24   0:03  |           |   \_ rsync -av -e ghe-ssh -p 122 --link-dest=../../current/repositories -z --rsync-path=sudo -u git rsync --include-from=- --exclude=* github.bus.zalan.do:/data/user/repositories/ /data/ghe-
root       682  0.0  0.3  47208  3608 ?        S    10:24   0:00  |           |   |   \_ ssh -p 22 -l admin -o StrictHostKeyChecking=no -p 122 -o BatchMode=yes github.bus.zalan.do -- nice -n 19 ionice -c 3 sudo -u git rsync --server --sender -vlogDtprze.iLsfx . /data/us
root       696  0.3  1.0 299116 10768 ?        S    10:24   0:00  |           |   |   \_ rsync -av -e ghe-ssh -p 122 --link-dest=../../current/repositories -z --rsync-path=sudo -u git rsync --include-from=- --exclude=* github.bus.zalan.do:/data/user/repositories/ /data/
root       680  0.0  0.0  21076   556 ?        S    10:24   0:00  |           |   \_ bash /backup/backup-utils/share/github-backup-utils/ghe-rsync -av -e ghe-ssh -p 122 --link-dest=../../current/repositories -z --rsync-path=sudo -u git rsync --include-from=- --exclude=*
root       681  0.0  0.1  14344  1156 ?        S    10:24   0:00  |           |       \_ grep -E -v ^(file has vanished: |rsync warning: some files vanished before they could be transferred)
root       677  0.0  0.0  21076   700 ?        S    10:24   0:00  |           \_ bash /backup/backup-utils/share/github-backup-utils/ghe-rsync -av -e ghe-ssh -p 122 --link-dest=../../current/repositories -z --rsync-path=sudo -u git rsync --include-from=- --exclude=* git
root       678  0.0  0.0  14216   968 ?        S    10:24   0:00  |               \_ grep -E -v ^(file has vanished: |rsync warning: some files vanished before they could be transferred)
root       700  0.0  0.1  37356  1564 ?        R+   10:28   0:00  \_ ps auxwf

@lotharschulz
Contributor Author

The issue no longer really exists since we increased the AWS instance size: we now run r3.4xlarge. We are aware of https://help.github.com/enterprise/2.7/admin/guides/installation/installing-github-enterprise-on-aws/#recommended-instance-types .

(screenshot, 2016-09-14 16:55:11)

@lotharschulz
Contributor Author

lotharschulz commented Sep 14, 2016

The master now has enough capacity to handle all jobs, the load, and the backup attempts.
Hence the crazy process list was an early indicator that the AWS instance size needed an upgrade.
