OVERVIEW

Zaloha_Snapshot is an add-on script for Zaloha that creates hardlink-based
snapshots of the backup directory (condition: hardlinks must be supported by the
underlying filesystem type).

This makes it possible to build "Time Machine"-like backup solutions.

Zaloha_Snapshot has been created using the same technology and style as Zaloha
itself. Read Zaloha documentation to get acquainted with relevant terminology,
features, cautions and limitations.

On Linux/Unix, Zaloha_Snapshot runs natively. On Windows, Cygwin is needed.

Repository: https://github.com/Fitus/Zaloha2_Snapshot.sh

Repository of Zaloha: https://github.com/Fitus/Zaloha2.sh

MORE DETAILED DESCRIPTION

How hardlink-based snapshots work: Assume a file exists in <backupDir>.
If Zaloha_Snapshot is invoked to create a snapshot directory (= <snapDir>)
of <backupDir>, it creates a hardlink in <snapDir> that points to the original
file in <backupDir>.

What happens at the next run of Zaloha:

Scenario 1: No action occurs on the file in <backupDir> (because the source file
in <sourceDir> has not changed): The situation stays as described above.
Please note that the physical storage space is occupied only once (the
hardlink itself takes very little additional space).

Scenario 2: The file in <backupDir> will be updated by Zaloha (due to change of
the source file in <sourceDir>): The update performed by Zaloha consists of
unlinking the hardlink (rm -f) and copying of the changed file to <backupDir>
(cp). The result will be that <snapDir> will contain the original file,
and <backupDir> will contain the updated file, both files now single-linked.

Please note that the --noUnlink option of Zaloha must NOT be used in order for
this to work.

Scenario 3: The file in <backupDir> will be removed by Zaloha (due to removal of
the source file in <sourceDir>): The removal (rm -f) will delete the file in
<backupDir>, but the hardlink (now single-linked file) in <snapDir> will stay.

Result after all three scenarios: <snapDir> still keeps the state of <backupDir>
at the time when it was created.
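
The three scenarios can be reproduced by hand with plain ln, rm and cp. The following is a minimal sketch that runs in a temporary directory and does not call Zaloha or Zaloha_Snapshot at all. All directory and file names in it are illustrative assumptions, and it presumes that the temporary directory is on a filesystem that supports hardlinks.

```shell
# Self-contained demo in a temporary directory; all names are illustrative.
demo=$(mktemp -d)
mkdir "$demo/backup" "$demo/snap"

# State before the snapshot: one file in the backup directory.
echo "version 1" > "$demo/backup/file.txt"

# Snapshot: a hardlink in the snapshot directory that points to the same file.
ln "$demo/backup/file.txt" "$demo/snap/file.txt"

# Scenario 2: update by unlinking (rm -f) and copying in the changed file (cp).
echo "version 2" > "$demo/source_file.txt"
rm -f "$demo/backup/file.txt"
cp "$demo/source_file.txt" "$demo/backup/file.txt"

snap_content=$(cat "$demo/snap/file.txt")      # still the old content
backup_content=$(cat "$demo/backup/file.txt")  # the updated content

# Scenario 3: removal from the backup directory leaves the snapshot intact.
rm -f "$demo/backup/file.txt"
snap_after_removal=$(cat "$demo/snap/file.txt")

echo "snapshot: $snap_content / backup: $backup_content / after removal: $snap_after_removal"
rm -rf "$demo"
```

By contrast, overwriting a file in place (without unlinking it first) changes the content seen through every hardlink, which is presumably why the --noUnlink option is ruled out.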

LIMITATIONS

First of all, the filesystem type of <backupDir> and of <snapDir> must support
hardlinks (e.g. the ext4 filesystem supports hardlinks).

Next, hardlinks must reside on the same storage device as the original file.
In practice, this means that the whole <backupDir> and the whole <snapDir> must
reside on one and the same storage device (same device number).
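
The same-device condition can be checked before the first snapshot is taken. The sketch below compares device numbers as reported by stat. It is not part of Zaloha_Snapshot; the function names device_of and same_device and the variables backupDir and snapParent are hypothetical. The stat format option is an assumption about the platform: -c %d is the GNU form, and -f %d is tried as a fallback for BSD-style stat.

```shell
# Print the device number of the filesystem that holds the given path.
# Assumption: GNU stat (-c %d); falls back to BSD-style stat (-f %d).
device_of() {
    stat -c %d "$1" 2>/dev/null || stat -f %d "$1"
}

# Succeed if both paths reside on the same device.
same_device() {
    [ "$(device_of "$1")" = "$(device_of "$2")" ]
}

# <snapDir> must not exist yet, so its parent directory is checked instead.
backupDir=${backupDir:-.}      # hypothetical; set to your backup directory
snapParent=${snapParent:-.}    # hypothetical; parent of the planned <snapDir>

if same_device "$backupDir" "$snapParent"; then
    echo "same device: hardlinks between the two locations should be possible"
else
    echo "different devices: hardlink-based snapshots will not work here"
fi
```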

Then, Zaloha_Snapshot is incompatible with some operation modes of Zaloha:

 * as already stated above, the --noUnlink option must NOT be used

 * further, the --revNew, --revNewAll and --revUp options are also not
   compatible, because they imply user activity on <backupDir>, which is
   inconsistent with the whole concept.

 * further, <backupDir> and all snapshot directories should be accompanied by
   Zaloha metadata directories. One key reason is that objects other than files
   and directories are kept in metadata only. The default location of the Zaloha
   metadata directory is <backupDir>/.Zaloha_metadata, which is a good location
   as it is inside of <backupDir>. Placing the Zaloha metadata directory in a
   different location (via the --metaDir option) would create hard-to-manage
   situations with the (potentially many) snapshot directories, so this is
   treated as incompatible as well.

INVOCATION

Zaloha2_Snapshot.sh --backupDir=<backupDir> --snapDir=<snapDir> [ other opts ]

--backupDir=<backupDir> is mandatory. <backupDir> must exist, otherwise
    Zaloha_Snapshot throws an error.

--snapDir=<snapDir> is mandatory. <snapDir> must NOT exist, otherwise
    Zaloha_Snapshot throws an error.

--noExec        ... do not actually create the contents of <snapDir> (= the
    subdirectories and the hardlinks), only prepare the script (file 930).
    The prepared script will not contain the "set -e" instruction. This means
    that the script ignores individual failed commands and tries to do as much
    work as possible, which differs from the interactive mode, where the
    script halts on the first error.

--noSnapHdr     ... do not write the header to the shellscript that creates the
    snapshot directory (file 930). This option can be used only together with
    the --noExec option. The header contains definitions used in the body of
    the script. A header-less script (i.e. body only) can easily be used with
    an alternative header that contains different definitions.

--saveSpace     ... compress the CSV metadata file 505 and, unless the option
    --noExec has been given, remove the shellscript that creates the snapshot
    directory (file 930) upon exit. Saving space matters more the more
    snapshot directories exist.

--noProgress    ... suppress progress messages (no screen output).

--mawk          ... use mawk, the very fast AWK implementation based on a
                    bytecode interpreter. Without this option, awk is used,
                    which usually maps to GNU awk (but not always).

--lTest         ... (do not use in real operations) support for lint-testing
                    of AWK programs

--help          ... show Zaloha_Snapshot documentation (using the LESS program)
                    and exit

In case of failure: resolve the problem, remove <snapDir> if it has been
created, and re-run Zaloha_Snapshot with the same parameters.
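
For "Time Machine"-like use, each run needs a fresh <snapDir> name, for example one derived from the current date and time. The sketch below only assembles and prints such a command line; it does not run Zaloha2_Snapshot.sh. The paths and the naming scheme are assumptions chosen for illustration, and only options documented above are used.

```shell
# Hypothetical locations; adjust to your setup.
backupDir="/backup/current"
snapDir="/backup/snapshots/$(date +%Y-%m-%d_%H%M%S)"

# Build the command line from the documented options; printed here, not run.
snapshot_cmd="Zaloha2_Snapshot.sh --backupDir=$backupDir --snapDir=$snapDir --noProgress"
echo "$snapshot_cmd"

# After a failed run: fix the cause, remove the partial <snapDir>, then re-run
# with the same parameters, for example:
#   rm -rf "$snapDir"
#   Zaloha2_Snapshot.sh --backupDir="$backupDir" --snapDir="$snapDir" --noProgress
```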

TESTING, DEPLOYMENT, INTEGRATION

See corresponding section in Zaloha documentation for general issues.

For Zaloha_Snapshot, it is important to verify that the overall concept works
in your environment under all three scenarios described in section More Detailed
Description above. The Simple Demo scripts from the repository contain a
relevant minimalistic test case.

RESTORE FROM A SNAPSHOT DIRECTORY

As a restore from a snapshot directory is a less likely scenario and the
shellscripts for the case of restore (scripts 800 through 870) occupy space,
Zaloha_Snapshot (unlike Zaloha) does not prepare these scripts.

If needed, prepare them manually by running the AWK program 700 on the CSV
metadata file 505:

  awk -f "<AWK program 700>"                  \
      -v backupDir="<snapDir>"                \
      -v restoreDir="<restoreDir>"            \
      -v remoteBackup=<0 or 1>                \
      -v backupUserHost="<backupUserHost>"    \
      -v remoteRestore=<0 or 1>               \
      -v restoreUserHost="<restoreUserHost>"  \
      -v scpExecOpt="<scpExecOpt>"            \
      -v cpRestoreOpt="<cpRestoreOpt>"        \
      -v f800="<script 800 to be created>"    \
      -v f810="<script 810 to be created>"    \
      -v f820="<script 820 to be created>"    \
      -v f830="<script 830 to be created>"    \
      -v f840="<script 840 to be created>"    \
      -v f850="<script 850 to be created>"    \
      -v f860="<script 860 to be created>"    \
      -v f870="<script 870 to be created>"    \
      -v noR800Hdr=<0 or 1>                   \
      -v noR810Hdr=<0 or 1>                   \
      -v noR820Hdr=<0 or 1>                   \
      -v noR830Hdr=<0 or 1>                   \
      -v noR840Hdr=<0 or 1>                   \
      -v noR850Hdr=<0 or 1>                   \
      -v noR860Hdr=<0 or 1>                   \
      -v noR870Hdr=<0 or 1>                   \
      "<CSV metadata file 505>"

Note 1: All filenames/paths should begin with a "/" (if absolute) or with a "./"
(if relative), and <snapDir> and <restoreDir> must end with a terminating "/".

Note 2: If any of the filenames/paths passed into AWK as variables (<snapDir>,
<restoreDir> and the <scripts 8xx to be created>) contain backslashes as "weird
characters", replace them by ///b. The AWK program 700 will replace ///b back
to backslashes inside.
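
The replacement described in Note 2 can be done with sed before the values are handed to AWK. This is a minimal sketch: the helper name escape_backslashes and the example path are hypothetical. The likely reason the step is needed is that AWK interprets backslash escape sequences in values assigned via -v.

```shell
# Replace every backslash in a path by ///b before passing it to AWK via -v.
escape_backslashes() {
    printf '%s\n' "$1" | sed 's|\\|///b|g'
}

snapDirArg=$(escape_backslashes './snap\dir/')   # hypothetical path
echo "$snapDirArg"
```

The resulting value would then be used in the invocation above, e.g. as -v backupDir="$snapDirArg".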

SPECIAL AND CORNER CASES

Updates of ONLY the file attributes (owner, group, mode) by Zaloha: If Zaloha
operates with the --pUser, --pGroup and/or --pMode options, it updates
the attributes on <backupDir> to reflect <sourceDir>. When ONLY the attributes
are updated (i.e. not the file itself, so no unlinking occurs), the attribute
changes also take effect on all hardlinks in the snapshot directories.
(More precisely, this depends on the type of the underlying filesystem:
there are some filesystem types that allow hardlinks to the same file to have
different sets of attributes).
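
The effect is easy to observe on a test file. The sketch below changes the mode through one link and reads it back through the other. File names are illustrative, and the result assumes a filesystem where all hardlinks to a file share one set of attributes.

```shell
demo=$(mktemp -d)
echo "data" > "$demo/in_backup"
ln "$demo/in_backup" "$demo/in_snapshot"

chmod 600 "$demo/in_backup"    # change the mode through one link only

# Compare the permission strings (first field of the ls -l output).
mode_backup=$(ls -l "$demo/in_backup" | awk '{ print $1 }')
mode_snapshot=$(ls -l "$demo/in_snapshot" | awk '{ print $1 }')
echo "backup: $mode_backup  snapshot: $mode_snapshot"
rm -rf "$demo"
```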

Hardlinks on <sourceDir> and hardlinks on <backupDir>: Let's summarize the
situation to leave no room for confusion:

<sourceDir> is a user-maintained directory where hardlinks between files may
exist. Without the --detectHLinksS option, Zaloha will treat each hardlink as
a separate regular file, and will synchronize each such file to <backupDir>.
With the --detectHLinksS option, Zaloha will treat only the first hardlink as
a file, and will synchronize that file to <backupDir>. The second, third etc
hardlinks will be treated as "hardlinks" and will be kept in metadata only
(the 505 file).
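
To see whether <sourceDir> contains hardlinked files at all (and hence whether the --detectHLinksS option is relevant), the link count can be inspected with find. The sketch below builds a small illustrative directory and lists the files whose link count is greater than one; the directory and file names are assumptions.

```shell
demo=$(mktemp -d)
mkdir "$demo/source"
echo "a" > "$demo/source/single"
echo "b" > "$demo/source/first"
ln "$demo/source/first" "$demo/source/second"   # second hardlink to "first"

# List regular files that have more than one hardlink.
multi=$(find "$demo/source" -type f -links +1 | sort)
multi_count=$(printf '%s\n' "$multi" | wc -l | tr -d ' ')
echo "$multi"
rm -rf "$demo"
```

Here "first" and "second" are listed, while "single" is not.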

<backupDir>, on the other hand, must be a directory maintained solely by Zaloha
(user activity on <backupDir> is inconsistent with the concept of snapshots).
Zaloha never creates hardlinks on <backupDir>, so there should be none.
It is Zaloha_Snapshot that brings hardlinks into play, in the form that
snapshot directories contain hardlinks to files in <backupDir>.

HOW ZALOHA_SNAPSHOT WORKS INTERNALLY

Handling and checking of input parameters should be self-explanatory.

Zaloha_Snapshot then creates <snapDir> along with <snapDir>/.Zaloha_metadata
and then copies files 000, 100, 505 and 700 from <backupDir>/.Zaloha_metadata
into it.

The AWK program AWKSNAPCHECK then checks the 000 file and raises an error
if <backupDir> is maintained by an instance of Zaloha with options incompatible
with Zaloha_Snapshot (see section Limitations above).

The AWK program AWKSNAPSHOT then prepares a shellscript to create the contents
of the snapshot directory (the subdirectories and the hardlinks).

The prepared shellscript is then sourced to perform actual work (unless the
--noExec option is given).
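
To illustrate the kind of work the prepared shellscript performs (create the subdirectories, then the hardlinks), here is a strongly simplified sketch. It is not the script that Zaloha_Snapshot generates and does not use the metadata files: it walks the directory tree with find instead, the function name hardlink_tree is hypothetical, and file names containing newlines are not handled.

```shell
# Recreate the directory tree of $1 under $2 and hardlink every regular file.
# Simplified illustration only: file names with newlines are not handled.
hardlink_tree() {
    src=$1
    dst=$2
    (cd "$src" && find . -type d) | while IFS= read -r d; do
        mkdir -p "$dst/$d"
    done
    (cd "$src" && find . -type f) | while IFS= read -r f; do
        ln "$src/$f" "$dst/$f"
    done
}

demo=$(mktemp -d)
mkdir -p "$demo/backup/sub"
echo "one" > "$demo/backup/a.txt"
echo "two" > "$demo/backup/sub/b.txt"

hardlink_tree "$demo/backup" "$demo/snap"

# Every file in the snapshot should now have a link count of 2.
linked=$(find "$demo/snap" -type f -links 2 | wc -l | tr -d ' ')
echo "hardlinked files in snapshot: $linked"
rm -rf "$demo"
```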