Need 2x more space for backup #39

Closed
pschonmann opened this issue May 4, 2021 · 13 comments

@pschonmann

Hello, I'm using docker volume-backup to back up some volumes from sentry-on-premise.
I stop all containers and then back up the volumes with this loop:

for i in sentry-data sentry-redis sentry-zookeeper sentry-kafka sentry-clickhouse sentry-symbolicator; do
 docker run -v $i:/volume --rm loomchild/volume-backup backup -c zstd - > $BACKUP_DIR/volume-$i-$DATENOW.tar.zst
.
.
.

The backup runs OK, but it needs 2x the space to complete. I was expecting that piping the archive to stdout would not use much extra space :( Is that a bug, or do I really need 2x the space when doing a backup?

@loomchild
Owner

loomchild commented May 5, 2021

Hm, I don't know why it needs more space; maybe it has something to do with Docker. I will try to investigate it further later.

However, one thing that drew my attention is that you are adding a timestamp to the volume backup file name via the DATENOW environment variable. Does that mean the old backup file is preserved? Maybe that could explain why you need more disk space? Could you share your entire script source?

@pschonmann
Author

Old backup archives are deleted by cron after 2 days, so that's not the problem. The problem appears when I run volume-backup; maybe it needs something like a temp directory?

When I run docker logs I see the stdout output there.

#!/bin/bash
# https://github.com/getsentry/onpremise/issues/364

STAV_ZALOH="/var/tmp/docker_volume_backup.state"
DATENOW=$(date +\%F-\%H-\%M)
PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin



if [ -f $STAV_ZALOH ]; then
  echo " EXISTUJE SOUBOR $STAV_ZALOH - KONCIM - KOUKNI CO OBSAHUJE ZA CHYBU"
  exit 1
fi

# STOP THE CONTAINERS BEFORE BACKUP - OTHERWISE THE DATA WOULD BE CORRUPTED / INCONSISTENT

cd /root/onpremise-20.10.1/; docker-compose -f docker-compose.override.yml down --timeout 30
if [ $? -ne 0 ]; then
 echo "Nepodarilo se mi vypnout docker kontejnery pro zalohu. Bez toho nemuzu pokracovat - hrozi nekonzistence dat - po vyreseni smaz $STAV_ZALOH" > $STAV_ZALOH
 exit 1
fi

mkdir /data/docker-backup/$DATENOW
BACKUP_DIR="/data/docker-backup/$DATENOW"

for i in sentry-data sentry-postgres sentry-redis sentry-zookeeper sentry-kafka sentry-clickhouse sentry-symbolicator; do
# echo $i
 docker run -v $i:/volume --rm loomchild/volume-backup backup -c zstd - > $BACKUP_DIR/volume-$i-$DATENOW.tar.zst
 if [ $? -ne 0 ]; then
   echo "Neco se pokazilo u zalohy $i - koncim - Zkus spustit zalohu znovu - po vyreseni smaz $STAV_ZALOH" > $STAV_ZALOH
   exit 1
 else
   echo "Zalohovano $BACKUP_DIR/volume-$i-$DATENOW.tar.zst"
  fi
done

# Everything OK, bringing Docker back up
cd /root/onpremise-20.10.1/; docker-compose -f docker-compose.override.yml up -d
if [ $? -ne 0 ]; then
  echo "Something went wrong starting Docker - bring it up manually - after fixing, delete $STAV_ZALOH" > $STAV_ZALOH
  exit 1
fi

# https://develop.sentry.dev/self-hosted/backup/
cd /root/onpremise-20.10.1/; docker-compose -f docker-compose.override.yml run --rm -T -e SENTRY_LOG_LEVEL=CRITICAL web export > $BACKUP_DIR/backup-$DATENOW.json

@loomchild
Owner

loomchild commented May 12, 2021

Hey, sorry for the delay. Unfortunately I won't be able to help you much with this, as there's nothing specific in the script that uses temporary storage, and the tar and compression commands shouldn't use it either.

You can try to:

  • use the alternative version of the volume-backup command without piping, e.g.:

      docker run -v $i:/volume -v $BACKUP_DIR:/backup --rm loomchild/volume-backup backup -c zstd volume-$i-$DATENOW
    
  • add a --tmpfs mount, as maybe Docker stores some data on disk when piping (see the sketch below)
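
For illustration, the --tmpfs idea applied to the loop from your script might look roughly like this (mounting the tmpfs at /tmp is only an assumption about where scratch data would go, not something verified):

      # same command as before, plus an in-memory /tmp inside the backup container
      docker run --tmpfs /tmp -v $i:/volume --rm loomchild/volume-backup backup -c zstd - > $BACKUP_DIR/volume-$i-$DATENOW.tar.zst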

Could you also clarify how you know that additional space is used during the backup?
Is the double space used only during each individual volume backup or across all volumes (in other words, is the space used n * v + v or 2 * n * v, where v is the size of each volume and n is the number of volumes)?

@pschonmann
Author

This is a graph where you can see that the space used doubles during the backup.
[image: disk usage graph]

I'll check your suggestions soon and we will see.

@loomchild
Owner

It looks normal that the backup file takes more and more space while archiving; what is weird is that the usage suddenly decreases after the backup is complete (at 00:35 - 00:36). Is the /data/docker-backup directory on a different disk / storage? Is it moved anywhere after the backup is complete?

@pschonmann
Author

Before I start, I want to note that when I look at the logs of the running docker container, I apparently see the stdout output. The final disk usage drop happens when the backup job finishes. Are some temp files/logs deleted at that point?

root@sentry:~# docker ps
CONTAINER ID        IMAGE                     COMMAND                  CREATED              STATUS              PORTS               NAMES
799402a6f168        loomchild/volume-backup   "/usr/bin/dumb-init …"   About a minute ago   Up About a minute                       beautiful_montalcini
root@sentry:~# docker logs 799402a6f168 | head
(�/�XTQZ	�(:ih���e���c{$�Z�f�v{�x�#
��8��
��.R��?�WhB����A�`<H��L@d
                         �%��1hj0D�O�#��a�P,γ&?�-��y"Ok0,��"Yc
dI�3�$�@Y�by�3I�`@ 5iZ6=
             laI
,� S���	��{���	X�	Oϲg��L&�@j(�@�F�E"�M�`Oc�*s����lZ�M,1I0sH�@6Y�%YW$
I���"�,�&s(@c�VTa��4���
                       )�4�	��G�5p�ɴ*,��Q�*(����* �f�Bi"��R�5�,�ؑh7M���Pl��

@loomchild
Owner

Interesting find. Yes, what you see in docker logs is the content of stdout, i.e. the compressed archive.

It's possible that Docker stores it on disk somewhere while the container is running (we should research that).

My suggestion for the next step: please try executing volume-backup without stdout/redirection, using the command I mentioned above, and see if the same storage usage spike occurs.

@loomchild
Owner

From very quick research (first hit on Google, I am on mobile now): https://sematext.com/blog/docker-logs-location/

Could you monitor whether the file /var/lib/docker/containers/<container_id>/<container_id>-json.log grows during the backup and is deleted afterwards? (docker ps should give you the container_id.)
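
Something along these lines should work from a second shell while the backup container is running (a sketch assuming the default json-file log driver and the default /var/lib/docker data root; substitute the id printed by docker ps for <container_id>):

      # refresh the log file size every 5 seconds during the backup
      watch -n 5 "du -h /var/lib/docker/containers/<container_id>/<container_id>-json.log"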

@loomchild
Owner

Better resource:
https://docs.docker.com/config/containers/logging/configure/

As an alternative to changing the command, you can also disable the logging driver when starting volume-backup by adding the --log-driver none option, or use the local driver (as explained in the link above):

Tip: use the “local” logging driver to prevent disk-exhaustion
By default, no log-rotation is performed. As a result, log-files stored by the default json-file logging driver can cause a significant amount of disk space to be used for containers that generate much output, which can lead to disk space exhaustion.
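
For reference, applied to the backup loop in your script, the --log-driver none variant would look roughly like this (an untested sketch; everything else is unchanged from your original command):

      # disable container logging so stdout is not duplicated into the json-file log
      docker run --log-driver none -v $i:/volume --rm loomchild/volume-backup backup -c zstd - > $BACKUP_DIR/volume-$i-$DATENOW.tar.zst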

@pschonmann
Author

Better resource:
https://docs.docker.com/config/containers/logging/configure/

As an alternative to changing the command, you can also disable the logging driver when starting volume-backup by adding the --log-driver none option, or use the local driver (as explained in the link above):

Seems promising. I'll check that in the next nightly run.

@loomchild
Owner

Hi @pschonmann, did you see an improvement after making the change?

@pschonmann
Author

Setting the logging driver to none fixed my problem. Thanks.

@loomchild
Owner

That's great! Thank you for taking the time to investigate this; I will update the documentation for others.
