Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Protect against colons in pvdisplay output #2903

Merged
merged 1 commit into from Jan 8, 2023

Conversation

pcahyna
Copy link
Member

@pcahyna pcahyna commented Jan 2, 2023

Pull Request Details:
  • Type: Bug Fix

  • Impact: Normal

  • Reference to related issue (URL): fixes Support by-path PVs #1958

  • How was this pull request tested?

    • full back-up and recovery on RHEL 8.7 / x86_64 (BIOS) with SATA disk, LVM in default configuration
    • full back-up and recovery on RHEL 8.7 / x86_64 (BIOS) with SATA disk, LVM configured to show /dev/disk/by-path device symlinks instead of the usual /dev/sd* device nodes by using this line in /etc/lvm/lvm.conf:
      filter = [ "r|/dev/disk/by-path/.*-usb-|", "a|/dev/disk/by-path/pci-.*-nvme-|", "a|/dev/disk/by-path/pci-.*-scsi-|", "a|/dev/disk/by-path/pci-.*-ata-|", "a|/dev/disk/by-path/pci-.*-sas-|", "a|loop|", "r|.*|" ].
      This case breaks with the original code already during rear savelayout and is fixed by this change.
  • Brief description of the changes in this pull request:

LVM can be configured to show device names under /dev/disk/by-path in command output. These names often contain colons that separate fields like channel and target (for example /dev/disk/by-path/pci-*-scsi-0:0:1:0-*, similarly the pci-* part, which contains colon-separated PCI bus and device numbers). Since the "pvdisplay -c" output also uses colons as field separators and does not escape embedded colons in any way, embedded colons break parsing of this output.

As a fix, use the pipe character '|' as the field separator in pvdisplay output. (This would break if a PV device has a '|' in its name, but this is very much less likely than having a ':' .)

I did a survey of how similar problems are handled in other tools. Some Solaris tools (ipadm, dladm, zoneadm) have a special switch (-p) to request machine-parseable format. This format has colon-separated fields, and embedded colons are escaped by backslash to allow parsing by the shell "read" command (this escaping is the key functionality missing in the pvdisplay command). zfs has a -H option which uses an alternative parseable format: in addition to omitting headers it separates the fields by a single TAB character. This would be in principle ideal for our purpose (TABs are unlikely to occur in the values), but in our case the output is processed via an unquoted "echo" command, which would not preserve the TAB character. (This is needed to remove the two leading spaces in the output, not sure what their purpose is.) I believe that using | as the separator is the best choice in our situation.

Also, configure explicitly what fields to output - "pvdisplay -c" prints many fields, but I have not found documentation about what fields is it using exactly, so one had to guess what the output means. Using "pvdisplay -C" and selecting the fields explicitly is much clearer.

This also changes the PV size field to match documentation, the comment says that size is in bytes, but it actually was not in bytes. As nothing is actually using the PV size field, this inconsistency has not caused any problem in practice, and no code needs adjusting for the change.

Fix provided by a customer in https://bugzilla.redhat.com/show_bug.cgi?id=2091163 .

LVM can be configured to show device names under /dev/disk/by-path
in command output. These names often contain colons that separate fields
like channel and target (for example /dev/disk/by-path/pci-*-scsi-0:0:1:0-*,
similarly the pci-* part, which contains colon-separated PCI bus and
device numbers). Since the "pvdisplay -c" output also uses colons as
field separators and does not escape embedded colons in any way,
embedded colons break parsing of this output.

As a fix, use the pipe character '|' as the field separator in pvdisplay
output. (This would break if a PV device has a '|' in its name, but this
is very much less likely than having a ':' .)

Also, configure explicitly what fields to output - "pvdisplay -c"
prints many fields, but I have not found documentation about what fields
is it using exactly, so one had to guess what the output means. Using
"pvdisplay -C" and selecting the fields explicitly is much clearer.

This also changes the PV size field to match documentation, the comment
says that size is in bytes, but it actually was not in bytes. As nothing
is actually using the PV size field, this inconsistency has not caused
any problem in practice, and no code needs adjusting for the change.
@pcahyna pcahyna mentioned this pull request Jan 2, 2023
@pcahyna pcahyna requested a review from a team January 2, 2023 15:23
@jsmeix
Copy link
Member

jsmeix commented Jan 2, 2023

@pcahyna
while you are at that code
and when you like to do it:

I think when we have '|' as field separator
the whole parsing could be simplified and
made more straightforward as follows

# { echo '  /dev/disk/by-path/pci-0000:03:00.0-scsi-0:0:1:0-part1|system|107340627968|7wwpcO-KmNN-qsTE-7sp7-JBJS-vBdC-Zyt1W7' ; \
    echo '  /dev/disk/by-path/pci-0000:03:00.0-scsi-0:0:1:0-part2 | system2 |  207340627962|2wwpcO-KmNN-qsTE-7sp7-JBJS-vBdC-Zyt1W2  ' ; } \
  | tr -d ' ' \
  | while IFS='|' read pdev vgrp size uuid ; \
    do echo -e "\npdev='$pdev'\nvgrp='$vgrp'\nsize='$size'\nuuid='$uuid'\n" ; \
    done

pdev='/dev/disk/by-path/pci-0000:03:00.0-scsi-0:0:1:0-part1'
vgrp='system'
size='107340627968'
uuid='7wwpcO-KmNN-qsTE-7sp7-JBJS-vBdC-Zyt1W7'


pdev='/dev/disk/by-path/pci-0000:03:00.0-scsi-0:0:1:0-part2'
vgrp='system2'
size='207340627962'
uuid='2wwpcO-KmNN-qsTE-7sp7-JBJS-vBdC-Zyt1W2'

I remove all space characters via tr -d ' '
because space is the field separator in disklayout.conf
according to "Disk layout file syntax" in
https://github.com/rear/rear/blob/master/doc/user-guide/06-layout-configuration.adoc
so spaces in values are forbidden anyway.

The

... | while IFS='|' read ... ; do ...

sets the non-standard IFS value only for the 'read' command
but it keeps the standard IFS value (space, tab, newline)
for the commands in the 'while' body:

# echo 'foo|bar' | while IFS='|' read a b ; do echo "a=$a" ; echo -n "IFS=$IFS" | od -a ; echo "b=$b" ; done

a=foo
0000000   I   F   S   =  sp  ht  nl
0000007
b=bar

@jsmeix jsmeix added this to the ReaR v2.8 milestone Jan 4, 2023
@pcahyna
Copy link
Member Author

pcahyna commented Jan 4, 2023

Hello @jsmeix , thank you for the suggestion, but I must say I am reluctant to make such changes to the layout code, if there is no bug to fix (except the bug fixed by the PR in the current form) .

@jsmeix
Copy link
Member

jsmeix commented Jan 5, 2023

@pcahyna
it is perfectly fine with me to only fix the actual issue here.

I had meant it only as a suggestion for further improvement.
Perhaps I may do that when I have to work on that code again.

@pcahyna pcahyna merged commit 80a6842 into rear:master Jan 8, 2023
11 of 13 checks passed
pcahyna added a commit to pcahyna/rear that referenced this pull request Feb 17, 2023
Protect against colons in pvdisplay output
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Support by-path PVs
3 participants