Add possibility to restore filesystems in parallel with TSM. #2232

Closed

cookie33

Relax-and-Recover (ReaR) Pull Request Template

Pull Request Details:
  • Type: Enhancement

  • Impact: Normal

  • Reference to related issue (URL):

  • How was this pull request tested?

A restore of a SLES12SP3 system was done with this version of the TSM restore script with the parallel mode set to true.

  • Brief description of the changes in this pull request:
  • Add a new parameter to differentiate between the normal (old) serial behaviour and parallel mode. The default is serial behaviour.
  • Make the TSM restore of a filesystem a function and call it either serially or in parallel.

rear-asm4.log

@cookie33
Author

Extra actions done before the restore worked on sles12sp3:

  • copy /usr/lib64/libsnapper.so* to rescue image after boot from it
  • copy /usr/lib64/libboost.so* to rescue image after boot from it
  • copy /usr/lib64/libbtrfs.so* to rescue image after boot from it
  • copy /usr/lib64/libstdc++* to rescue image after boot from it
  • set LD_LIBRARY_PATH before rear recover to:
/usr/lib/usr/lib64:/opt/tivoli/tsm/client/ba/bin:/opt/tivoli/tsm/client/api/bin64:/opt/tivoli/tsm/client/api/bin:/opt/tivoli/tsm/client/api/bin64/cit/bin

@gdha added the labels enhancement (Adaptions and new features) and external tool (The issue depends on other software e.g. third-party backup tools.) on Sep 13, 2019
@gdha self-assigned this Sep 13, 2019
@gdha requested a review from jsmeix September 13, 2019 06:59
@gdha
Member

gdha commented Sep 13, 2019

Extra actions done before the restore worked on sles12sp3:
* copy /usr/lib64/libsnapper.so* to rescue image after boot from it
* copy /usr/lib64/libboost.so* to rescue image after boot from it
* copy /usr/lib64/libbtrfs.so* to rescue image after boot from it
* copy /usr/lib64/libstdc++* to rescue image after boot from it
* set LD_LIBRARY_PATH before rear recover to:
/usr/lib/usr/lib64:/opt/tivoli/tsm/client/ba/bin:/opt/tivoli/tsm/client/api/bin64:/opt/tivoli/tsm/client/api/bin:/opt/tivoli/tsm/client/api/bin64/cit/bin

A few remarks and comments:

  • these libraries were automatically copied to the rescue image, right?
  • could you verify the script /usr/share/rear/prep/TSM/default/400_prep_tsm.sh, as it defines TSM_LD_LIBRARY_PATH=$TSM_LD_LIBRARY_PATH:$gsk_dir
  • perhaps you could write this variable to the $ROOTFS_DIR//etc/rear/rescue.conf file in the above-mentioned prep script:
echo "TSM_LD_LIBRARY_PATH=\"$TSM_LD_LIBRARY_PATH:$gsk_dir\"" >> $ROOTFS_DIR//etc/rear/rescue.conf
  • give it a try and, if it works, add it to the PR, as we are not able to test the PR ourselves due to lack of hardware
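
The suggestion above can be tried in isolation with a minimal sketch like the following. It is not the actual prep script: ROOTFS_DIR points at a throwaway temporary directory, and the values of TSM_LD_LIBRARY_PATH and gsk_dir are placeholders chosen for illustration only.

```shell
#!/bin/bash
# Sketch only: ROOTFS_DIR, TSM_LD_LIBRARY_PATH and gsk_dir are placeholder
# values here; in ReaR they are set by the framework and the TSM prep script.
ROOTFS_DIR="$(mktemp -d)"
TSM_LD_LIBRARY_PATH="/opt/tivoli/tsm/client/ba/bin"
gsk_dir="/usr/local/ibm/gsk8_64/lib64"   # hypothetical GSKit library directory

mkdir -p "$ROOTFS_DIR/etc/rear"

# Append the extended library path so that it is available in the rescue system.
echo "TSM_LD_LIBRARY_PATH=\"$TSM_LD_LIBRARY_PATH:$gsk_dir\"" >> "$ROOTFS_DIR/etc/rear/rescue.conf"

# Show what was written.
cat "$ROOTFS_DIR/etc/rear/rescue.conf"
```

Because rescue.conf is sourced as a shell file, the escaped double quotes keep the whole colon-separated path as one value.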

@gdha self-requested a review September 13, 2019 07:00
@jsmeix added this to the ReaR future milestone Sep 13, 2019
@jsmeix
Member

jsmeix commented Sep 13, 2019

@schabrolles
could you please review this one because neither I nor @gdha
have TSM so that we cannot actually review it.

@@ -20,9 +20,10 @@ local backup_restore_log_suffix="restore.log"
# echo -n $CONFIG_APPEND_FILES (without double quotes) is used to avoid leading and trailing spaces and newlines:
test "$CONFIG_APPEND_FILES" && backup_restore_log_prefix=$backup_restore_log_prefix.$( echo -n $CONFIG_APPEND_FILES | tr -d -c '[:alnum:]/[:space:]' | tr -s '/[:space:]' ':_' )
local backup_restore_log_filespace=""
local dsmc_parallel="false"
Member

@jsmeix jsmeix Sep 13, 2019

I think such an option should be user controllable
via a new TSM_... config variable in default.conf
https://github.com/rear/rear/blob/master/usr/share/rear/conf/default.conf#L1586
e.g. something like

# Restore TSM filespaces in parallel by running
# for each TSM filespace a separated 'dsmc restore' process:
TSM_DSMC_RESTORE_PARALLEL="false"

and then use TSM_DSMC_RESTORE_PARALLEL
instead of dsmc_parallel in the code here.
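
How such a default and a user override would interact can be sketched as follows. This assumes the usual pattern of sourcing the defaults first and the user's configuration afterwards; the two files are created in a temporary directory purely for the example, and TSM_DSMC_RESTORE_PARALLEL is the name proposed above, not an existing ReaR variable.

```shell
#!/bin/bash
# Sketch only: both files are temporary stand-ins for default.conf and a
# user's local.conf; TSM_DSMC_RESTORE_PARALLEL is the proposed variable name.
conf_dir="$(mktemp -d)"

cat > "$conf_dir/default.conf" <<'EOF'
# Restore TSM filespaces in parallel by running
# for each TSM filespace a separated 'dsmc restore' process:
TSM_DSMC_RESTORE_PARALLEL="false"
EOF

cat > "$conf_dir/local.conf" <<'EOF'
TSM_DSMC_RESTORE_PARALLEL="true"
EOF

# Defaults first, user settings second, so the user's value wins.
source "$conf_dir/default.conf"
source "$conf_dir/local.conf"

echo "TSM_DSMC_RESTORE_PARALLEL=$TSM_DSMC_RESTORE_PARALLEL"
```

With this layout the serial behaviour stays the default and a user opts in to parallel restore with a single line of configuration.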

}

for num in $TSM_RESTORE_FILESPACE_NUMS ; do
if test "$dsmc_parallel" == "true" ; then
Member

Usually we do not use things like "$VAR" == "true" because
for quite some time we have had the is_true and is_false functions;
see usr/share/rear/lib/global-functions.sh
https://raw.githubusercontent.com/rear/rear/master/usr/share/rear/lib/global-functions.sh
for how to use them.

In general have a look at
https://github.com/rear/rear/wiki/Coding-Style
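
A rough sketch of the suggested style follows. The is_true function here is a simplified stand-in written so that the example runs on its own; the set of accepted spellings is an assumption modeled on the real function in global-functions.sh, and TSM_DSMC_RESTORE_PARALLEL is the variable name proposed earlier in this review.

```shell
#!/bin/bash
# Simplified stand-in for ReaR's is_true so that the sketch is self-contained;
# the real function lives in usr/share/rear/lib/global-functions.sh.
is_true() {
    case "$1" in
        ([tT] | [yY] | [yY][eE][sS] | [tT][rR][uU][eE] | 1) return 0 ;;
    esac
    return 1
}

TSM_DSMC_RESTORE_PARALLEL="yes"   # proposed variable name, assumed here

# The helper accepts several spellings, unlike a literal comparison with "true".
if is_true "$TSM_DSMC_RESTORE_PARALLEL" ; then
    echo "restoring filespaces in parallel"
else
    echo "restoring filespaces serially"
fi
```

The advantage over a string comparison is that values such as "yes" or "1" in a user's configuration behave as expected.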

@jsmeix
Member

jsmeix commented Sep 13, 2019

@cookie33
I do not have TSM but out of curiosity
I wonder how the messages look on the terminal
while several dsmc restore processes are running in parallel.
Does that look somewhat confusing or perhaps even messed up?

@jsmeix
Member

jsmeix commented Sep 13, 2019

I think each dsmc restore process needs its own
separate backup_restore_log_file because otherwise
the error handling no longer works correctly.
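
One possible shape for this is sketched below, under stated assumptions: restore_filespace is a hypothetical stand-in for the real 'dsmc restore' call so that the example runs without TSM, and the log file naming is made up for illustration. Each background process writes to its own log, and the exit status of each one is collected separately.

```shell
#!/bin/bash
# Sketch only: restore_filespace is a hypothetical stand-in for the real
# 'dsmc restore' call, so this runs without TSM installed.
restore_filespace() {
    local filespace="$1"
    echo "restoring $filespace"
    # Simulate one failing restore to show per-process error detection.
    [ "$filespace" != "/broken" ]
}

log_dir="$(mktemp -d)"
filespaces=( "/" "/home" "/broken" )
pids=()
failed=()

for i in "${!filespaces[@]}" ; do
    # One log file per background process keeps the outputs apart.
    restore_filespace "${filespaces[$i]}" > "$log_dir/restore.$i.log" 2>&1 &
    pids[$i]=$!
done

for i in "${!pids[@]}" ; do
    # 'wait PID' returns the exit status of that particular process.
    wait "${pids[$i]}" || failed+=( "${filespaces[$i]}" )
done

echo "failed filespaces: ${failed[*]:-none}"
```

Waiting on each PID individually, rather than on all jobs at once, is what makes it possible to tell which filespace failed and to point at the matching log file.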

Member

@jsmeix jsmeix left a comment

I think multiple dsmc restore processes need separate log files.

@schabrolles
Member

@jsmeix, I’m currently on vacation until the end of the month. I will try to do my best in October, but I will be busy with on-site client requests.

@jsmeix
Member

jsmeix commented Sep 14, 2019

@schabrolles
take your time (this is an enhancement for "ReaR future")
and thank you in advance!

I am also out of the office currently and for some more weeks,
so I cannot do much for ReaR.
In particular I cannot try out or test anything for ReaR.
I expect to be back in the office around the beginning of October.
I also expect that I will first have to do other stuff with higher priority.

@jsmeix
Member

jsmeix commented Sep 14, 2019

@cookie33 @schabrolles
I am wondering about another possible generic issue with parallel restores.

In the section "Running Multiple Backups and Restores in Parallel" in
https://github.com/rear/rear/blob/master/doc/user-guide/11-multiple-backups.adoc
I wrote in particular (excerpt a bit modified here)

system recovery with multiple backups requires that
first and foremost the basic system is recovered
where all files must be restored that are needed
to ... [get] ... the basic system into a normal usable state

One reason is that in particular the tree of directories
of the basic system must have been restored as a precondition
that subsequent backup restore operations can succeed.

The concern is that subsequent backup restore operations may fail
or restore incorrectly when basic system directories are not yet there.

For example, assume the files in /opt/mystuff/ are in a separate backup.
When the files of the basic system (in this example the /opt/ directory)
are restored in parallel with the separate backup of /opt/mystuff/
it may happen that the files in /opt/mystuff/ are restored before
the /opt/ directory was restored.

The concern is that it is not clear what the final result is in that case.

Perhaps it fails to restore the files in /opt/mystuff/ when /opt/ is not yet there?

Perhaps it does not fail to restore the files in /opt/mystuff/ when /opt/ is not yet there
but it creates the missing /opt/ directory with fallback owner/group/permissions/ACLs/...
that may differ from what /opt/ had on the original system?

So the concern with multiple backup restores in parallel is
how to ensure that the final overall backup restore result
always matches exactly what there was on the original system.
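
The second possibility can be illustrated with a small sketch. It uses plain mkdir in a temporary directory as a stand-in for a restore; how dsmc itself treats a missing parent directory is not verified here, and the original mode 750 for /opt/ is an assumed value for the example. It relies on GNU stat for the '-c' option.

```shell
#!/bin/bash
# Sketch only: plain mkdir stands in for a backup tool's restore; how dsmc
# itself handles a missing parent directory is not verified here.
umask 022
target="$(mktemp -d)"

# Assumed for the example: on the original system /opt had mode 750.
original_opt_mode="750"

# The separate backup of /opt/mystuff happens to be restored first, so the
# missing parent /opt gets created with default permissions.
mkdir -p "$target/opt/mystuff"
early_opt_mode="$(stat -c '%a' "$target/opt")"

echo "original /opt mode: $original_opt_mode"
echo "mode after /opt/mystuff was restored first: $early_opt_mode"
```

Unless the later restore of the basic system resets the metadata of the already existing /opt/ directory, the recreated system would differ from the original in this respect.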

Member

@gdha gdha left a comment

@cookie33 I fully agree with @jsmeix's comments. Please make the necessary changes before @schabrolles comes back from holiday.

@gdha
Member

gdha commented Feb 21, 2020

@schabrolles Could you please take a moment to review this PR and give @cookie33 feedback?

Member

@schabrolles schabrolles left a comment

I agree with the general idea.
But as @jsmeix suggests, I would use a specific TSM_DSMC_RESTORE_FC_PARALLEL variable in the configuration files and a separate log for each filesystem restored.

@github-actions

Stale pull request message

Labels
enhancement Adaptions and new features external tool The issue depends on other software e.g. third-party backup tools. no-pr-activity

4 participants