Add excludeFromBackup on macOS #2159

norio-nomura
Contributor

@norio-nomura norio-nomura commented Jan 23, 2024

# Exclude from backup: "all", "disks" or "none"
# Specify which file in the instance's configuration to exclude from backup.
# Only available on macOS.
# 🟢 Builtin default: "disks"
excludeFromBackup: null

Sets the NSURLIsExcludedFromBackupKey attribute to:

  • "all": the instance directory
  • "disks": basedisk and diffdisk files
  • "none": none
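The mapping from the setting to the affected paths can be sketched roughly as follows. This is a minimal illustration and not the code from this PR: `excludeTargets` and the example instance path are hypothetical, and actually setting the attribute on each path is left out.

```go
package main

import (
	"fmt"
	"path/filepath"
)

// excludeTargets returns the paths that would receive the backup-exclusion
// attribute for a given excludeFromBackup value.
// Hypothetical helper for illustration only; not Lima's implementation.
func excludeTargets(instDir, mode string) ([]string, error) {
	switch mode {
	case "all":
		// The whole instance directory is flagged.
		return []string{instDir}, nil
	case "disks":
		// Only the two disk images are flagged.
		return []string{
			filepath.Join(instDir, "basedisk"),
			filepath.Join(instDir, "diffdisk"),
		}, nil
	case "none":
		return nil, nil
	default:
		return nil, fmt.Errorf("excludeFromBackup must be \"all\", \"disks\" or \"none\", got %q", mode)
	}
}

func main() {
	// The instance path below is an assumed example.
	for _, mode := range []string{"all", "disks", "none"} {
		targets, err := excludeTargets("/Users/me/.lima/docker", mode)
		if err != nil {
			panic(err)
		}
		fmt.Printf("%-5s -> %v\n", mode, targets)
	}
}
```

An unset (`null`) value would first have to be resolved to the builtin default `"disks"` before calling such a helper.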

@jandubois
Member

I'm curious what this setting does. I assumed NSURLIsExcludedFromBackupKey would only apply to files under ~/Documents and would prevent them from being backed up to iCloud. This normally does not apply to Lima VMs, which are stored under ~/.lima by default.

Is this flag also used by Time Machine? I know that the Time Machine exclusion list is stored in a plist file somewhere and can be edited with the Time Machine preference panel, or via the tmutil command-line tool. Would files with the NSURLIsExcludedFromBackupKey flag also show up in that exclusion list?

Just trying to understand the motivation for this PR, and the actual scope.

@norio-nomura
Contributor Author

norio-nomura commented Jan 23, 2024

You can check the exclusion by using tmutil isexcluded.

With excludeFromBackup: "disks"

$ tmutil isexcluded ~/.lima/docker{,/*}
[Included]    /System/Volumes/Data/Users/norio/.lima/docker
[Excluded]    /System/Volumes/Data/Users/norio/.lima/docker/basedisk
[Included]    /System/Volumes/Data/Users/norio/.lima/docker/cidata.iso
[Included]    /System/Volumes/Data/Users/norio/.lima/docker/default_ep.sock
[Included]    /System/Volumes/Data/Users/norio/.lima/docker/default_fd.sock
[Excluded]    /System/Volumes/Data/Users/norio/.lima/docker/diffdisk
[Included]    /System/Volumes/Data/Users/norio/.lima/docker/ha.pid
[Included]    /System/Volumes/Data/Users/norio/.lima/docker/ha.sock
[Included]    /System/Volumes/Data/Users/norio/.lima/docker/ha.stderr.log
[Included]    /System/Volumes/Data/Users/norio/.lima/docker/ha.stdout.log
[Included]    /System/Volumes/Data/Users/norio/.lima/docker/launchd.stderr.log
[Included]    /System/Volumes/Data/Users/norio/.lima/docker/launchd.stdout.log
[Included]    /System/Volumes/Data/Users/norio/.lima/docker/lima.yaml
[Included]    /System/Volumes/Data/Users/norio/.lima/docker/serialv.log
[Included]    /System/Volumes/Data/Users/norio/.lima/docker/sock
[Included]    /System/Volumes/Data/Users/norio/.lima/docker/ssh.config
[Included]    /System/Volumes/Data/Users/norio/.lima/docker/ssh.sock
[Included]    /System/Volumes/Data/Users/norio/.lima/docker/vz-efi
[Included]    /System/Volumes/Data/Users/norio/.lima/docker/vz-identifier
[Included]    /System/Volumes/Data/Users/norio/.lima/docker/vz.pid
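For scripted checks, output in the shape shown above could be turned into a lookup table. The sketch below only assumes the `[Included]` / `[Excluded]` line format visible in this comment; `parseIsExcluded` is a hypothetical helper, and the sample paths are shortened.

```go
package main

import (
	"bufio"
	"fmt"
	"strings"
)

// parseIsExcluded parses lines of the form "[Included]    /path" or
// "[Excluded]    /path" and returns a map from path to "is excluded".
// Hypothetical helper; it assumes only the format shown in this thread.
func parseIsExcluded(output string) (map[string]bool, error) {
	result := make(map[string]bool)
	sc := bufio.NewScanner(strings.NewReader(output))
	for sc.Scan() {
		line := strings.TrimSpace(sc.Text())
		if line == "" {
			continue // skip blank lines
		}
		switch {
		case strings.HasPrefix(line, "[Excluded]"):
			result[strings.TrimSpace(strings.TrimPrefix(line, "[Excluded]"))] = true
		case strings.HasPrefix(line, "[Included]"):
			result[strings.TrimSpace(strings.TrimPrefix(line, "[Included]"))] = false
		default:
			return nil, fmt.Errorf("unexpected line: %q", line)
		}
	}
	return result, sc.Err()
}

func main() {
	sample := "[Included]    /Users/me/.lima/docker/lima.yaml\n" +
		"[Excluded]    /Users/me/.lima/docker/basedisk\n"
	status, err := parseIsExcluded(sample)
	if err != nil {
		panic(err)
	}
	fmt.Println("basedisk excluded:", status["/Users/me/.lima/docker/basedisk"])
	fmt.Println("lima.yaml excluded:", status["/Users/me/.lima/docker/lima.yaml"])
}
```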

@norio-nomura
Contributor Author

With excludeFromBackup: "all"

$ tmutil isexcluded ~/.lima/docker{,/*}
[Excluded]    /System/Volumes/Data/Users/norio/.lima/docker
[Excluded]    /System/Volumes/Data/Users/norio/.lima/docker/basedisk
[Excluded]    /System/Volumes/Data/Users/norio/.lima/docker/cidata.iso
[Excluded]    /System/Volumes/Data/Users/norio/.lima/docker/default_ep.sock
[Excluded]    /System/Volumes/Data/Users/norio/.lima/docker/default_fd.sock
[Excluded]    /System/Volumes/Data/Users/norio/.lima/docker/diffdisk
[Excluded]    /System/Volumes/Data/Users/norio/.lima/docker/ha.pid
[Excluded]    /System/Volumes/Data/Users/norio/.lima/docker/ha.sock
[Excluded]    /System/Volumes/Data/Users/norio/.lima/docker/ha.stderr.log
[Excluded]    /System/Volumes/Data/Users/norio/.lima/docker/ha.stdout.log
[Excluded]    /System/Volumes/Data/Users/norio/.lima/docker/launchd.stderr.log
[Excluded]    /System/Volumes/Data/Users/norio/.lima/docker/launchd.stdout.log
[Excluded]    /System/Volumes/Data/Users/norio/.lima/docker/lima.yaml
[Excluded]    /System/Volumes/Data/Users/norio/.lima/docker/serialv.log
[Excluded]    /System/Volumes/Data/Users/norio/.lima/docker/sock
[Excluded]    /System/Volumes/Data/Users/norio/.lima/docker/ssh.config
[Excluded]    /System/Volumes/Data/Users/norio/.lima/docker/ssh.sock
[Excluded]    /System/Volumes/Data/Users/norio/.lima/docker/vz-efi
[Excluded]    /System/Volumes/Data/Users/norio/.lima/docker/vz-identifier
[Excluded]    /System/Volumes/Data/Users/norio/.lima/docker/vz.pid

@afbjorklund

This comment was marked as off-topic.

Member

@jandubois jandubois left a comment

I just did a quick look, but didn't actually test it. Just some spelling nits.

pkg/start/start.go (outdated, resolved)
pkg/limayaml/validate.go (outdated, resolved)
```yml
# Exclude from backup: "all", "disks" or "none"
# Specify which file in the instance's configuration to exclude from backup.
# Only available on macOS.
# 🟢 Builtin default: "disks"
excludeFromBackup: null
```

Sets the `NSURLIsExcludedFromBackupKey` attribute to:
- "all": the instance directory
- "disks": `basedisk` and `diffdisk` files
- "none": none

Signed-off-by: Norio Nomura <norio.nomura@gmail.com>

Update pkg/start/start.go

Fix typo

Co-authored-by: Jan Dubois <jan@jandubois.com>

Update pkg/limayaml/validate.go

Fix typo

Co-authored-by: Jan Dubois <jan@jandubois.com>
@jandubois
Member

> • "all": the instance directory
> • "disks": basedisk and diffdisk files
> • "none": none

I'm not sure if we need the disks setting; I think it could be just a boolean. The only file besides the disk worth backing up is lima.yaml; everything else is generated data.

So I think we still could do a boolean for excludeFromBackup:

  • true: only lima.yaml will be backed up
  • false: everything will be backed up

It feels simpler to me, but curious what others think.

@afbjorklund
Contributor

afbjorklund commented Jan 23, 2024

Am I right in understanding that this only affects installations where you have changed the location, and only in the case where $LIMA_HOME has not been added to the backup exclude list? (I think it will be backed up on Linux, though*.)

* since Lima doesn't follow the XDG standard (except for ~/.cache dir)

@norio-nomura
Contributor Author

> true: only lima.yaml will be backed up

Excluding files other than lima.yaml from backup requires setting NSURLIsExcludedFromBackupKey for each file. I feel the balance is not good between the amount of code changes required and the backup reduction effect achieved by this.

@jandubois
Member

> Am I right in understanding that this only affects installations where you have changed the location, and only in the case where $LIMA_HOME has not been added to the backup exclude list? (I think it will be backed up on Linux, though*.)

No, it seems to be effective for Time Machine as well, so would apply to any location. I was mistaken in thinking it only applies to iCloud backups from ~/Documents.

> Excluding files other than lima.yaml from backup requires setting NSURLIsExcludedFromBackupKey for each file. I feel the balance is not good between the amount of code changes required and the backup reduction effect achieved by this.

That is a good point. I agree because I suspect it would also pollute the exclusion list displayed in the Time Machine preference pane with too many entries.

@norio-nomura
Contributor Author

> I agree because I suspect it would also pollute the exclusion list displayed in the Time Machine preference pane with too many entries.

Files excluded from backup with NSURLIsExcludedFromBackupKey do not appear in the Time Machine preference.

@jandubois
Member

I wonder if the backup status of instances should be included in limactl ls output. If yes, what would be a compact name for the column?

I have two more questions that are related to, but outside the scope of, this PR:

  1. Should there be a way to exclude all of $LIMA_HOME from backups instead of setting it on each instance individually?

    This can be achieved with override.yaml and default.yaml, or, better, manually via tmutil or the preference dialog. So I think this just needs documentation, and no code.

    I suspect excluding all of $LIMA_HOME means that the excludeFromBackup settings of individual instances will be ignored, but I don't actually know. If they are ignored, should we display a warning?

  2. Should there be a way to exclude data disks from backups? What should be the mechanism to specify that?

    I think we could add another subcommand limactl disk exclude-from-backups $INSTANCE or similar. The status should then be included in limactl disk ls output.

@jandubois
Member

> Files excluded from backup with NSURLIsExcludedFromBackupKey do not appear in the Time Machine preference.

So what happens if you set NSURLIsExcludedFromBackupKey on the directory to true, but on lima.yaml to false. Will lima.yaml be excluded because the directory is excluded, or will the setting on the file override the setting on the directory?

@norio-nomura
Contributor Author

> So what happens if you set NSURLIsExcludedFromBackupKey on the directory to true, but on lima.yaml to false.

As indicated in #2159 (comment), lima.yaml is excluded because the directory is excluded.
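The behavior reported here (a file counts as excluded once any enclosing directory is flagged) can be modeled with a short sketch. It only mirrors what was observed in this thread and says nothing about how Time Machine decides internally; `effectivelyExcluded` and the paths are hypothetical.

```go
package main

import (
	"fmt"
	"path/filepath"
)

// effectivelyExcluded reports whether path, or any of its ancestors, is in
// the flagged set. This models the observation in this thread only; it is
// not a description of Time Machine's implementation.
func effectivelyExcluded(path string, flagged map[string]bool) bool {
	for p := filepath.Clean(path); ; p = filepath.Dir(p) {
		if flagged[p] {
			return true
		}
		if filepath.Dir(p) == p {
			// Reached the root (or "." for relative paths) without a match.
			return false
		}
	}
}

func main() {
	// Assumed example: only the instance directory carries the attribute.
	flagged := map[string]bool{"/Users/me/.lima/docker": true}
	fmt.Println(effectivelyExcluded("/Users/me/.lima/docker/lima.yaml", flagged))
	fmt.Println(effectivelyExcluded("/Users/me/.lima/other/lima.yaml", flagged))
}
```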

@norio-nomura
Contributor Author

The individual NSURLIsExcludedFromBackupKey settings can be checked with xattr.
With excludeFromBackup: "disks":

$ xattr -r ~/.lima/docker 2>/dev/null|grep exclude
/Users/norio/.lima/docker/basedisk: com.apple.metadata:com_apple_backup_excludeItem
/Users/norio/.lima/docker/diffdisk: com.apple.metadata:com_apple_backup_excludeItem

With excludeFromBackup: "all":

$ xattr -r ~/.lima/docker 2>/dev/null|grep exclude
/Users/norio/.lima/docker: com.apple.metadata:com_apple_backup_excludeItem

@norio-nomura
Contributor Author

I've come up with a better idea, so I'm closing this PR. I might reopen this if that doesn't go well.

norio-nomura added a commit to norio-nomura/lima that referenced this pull request Feb 3, 2024
Host provisioning scripts are executed every time before starting the instance.
- the working directory is the instance directory `{{.Dir}}`
- the `runtime.GOOS` is used to determine the host OS. e.g. `darwin` for macOS, `linux` for Linux, and `windows` for Windows.
- if `wait` is true and the script exits with a non-zero status, the instance start will be aborted.

`shell` and `script` can include these template variables:
- `{{.ScriptName}}` that represents the temporary script file path.
- `{{.Index}}` that represents the index in the list of host provisioning scripts (0-based).
- template variables available in `limactl list --format` command.
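Expanding these template variables could look roughly like the sketch below, which uses Go's `text/template`. The `scriptVars` struct and `render` function are hypothetical stand-ins, and only `Dir`, `ScriptName` and `Index` are modeled; the referenced commit may do this differently.

```go
package main

import (
	"fmt"
	"strings"
	"text/template"
)

// scriptVars holds the template variables named in the commit message.
// Hypothetical type; Dir stands in for the `limactl list --format` variables.
type scriptVars struct {
	Dir        string
	ScriptName string
	Index      int
}

// render expands {{.Dir}}, {{.ScriptName}} and {{.Index}} in text.
func render(text string, vars scriptVars) (string, error) {
	tmpl, err := template.New("hostProvision").Parse(text)
	if err != nil {
		return "", err
	}
	var sb strings.Builder
	if err := tmpl.Execute(&sb, vars); err != nil {
		return "", err
	}
	return sb.String(), nil
}

func main() {
	// Single braces such as {basedisk,diffdisk} are left untouched by the template engine.
	script := "xattr -w com.apple.metadata:com_apple_backup_excludeItem true {{.Dir}}/{basedisk,diffdisk}"
	out, err := render(script, scriptVars{Dir: "/Users/me/.lima/docker"})
	if err != nil {
		panic(err)
	}
	fmt.Println(out)
}
```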

🟢 Builtin default: null

e.g.
```yaml
hostProvision:
- debug: false      # change the temporary script location to {{.Dir}} and not delete it after execution. default: false
  hostOS: darwin    # string or []string. The script is executed only on the specified host OS.
  script: |         # passed to the shell as temporary file argument if exists
    xattr -w com.apple.metadata:com_apple_backup_excludeItem true {{.Dir}}/{basedisk,diffdisk}
  shell: bash       # default: null
  wait: true        # wait for the script to finish before starting the instance. default: true
```
If no shell is given, the default shell is selected based on the host OS. If the default shell is not found on the PATH, it falls back to `sh` (when the host OS is not Windows) or `powershell` (when the host OS is Windows).
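The fallback rule can be sketched as below. This is an assumption-laden illustration rather than the commit's code: `defaultShell` and `lookPathFrom` are hypothetical, the preferred shells (`bash`, `pwsh`) are taken from the keyword table in this commit message, and the PATH lookup is injected so the logic can be exercised without the shells being installed.

```go
package main

import (
	"errors"
	"fmt"
	"os/exec"
	"runtime"
)

var errNotFound = errors.New("not found")

// lookPathFrom returns a fake PATH lookup that only knows the given names.
// It exists purely so the selection logic can be demonstrated.
func lookPathFrom(available ...string) func(string) (string, error) {
	return func(name string) (string, error) {
		for _, a := range available {
			if a == name {
				return "/usr/bin/" + name, nil
			}
		}
		return "", errNotFound
	}
}

// defaultShell picks the preferred shell for goos and falls back when the
// preferred one cannot be found. Hypothetical helper.
func defaultShell(goos string, lookPath func(string) (string, error)) string {
	preferred, fallback := "bash", "sh"
	if goos == "windows" {
		preferred, fallback = "pwsh", "powershell"
	}
	if _, err := lookPath(preferred); err == nil {
		return preferred
	}
	return fallback
}

func main() {
	// Real lookup on the current host.
	fmt.Println(defaultShell(runtime.GOOS, exec.LookPath))
	// Simulated Windows host where only powershell is installed.
	fmt.Println(defaultShell("windows", lookPathFrom("powershell")))
}
```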

`shell` can be either:
1. Builtin / Explicitly supported keywords

| Keyword      | Command run internally                                 | Description                                    |
| ------------ | ------------------------------------------------------ | ---------------------------------------------- |
| `bash`       | `bash --noprofile --norc -eo pipefail {{.ScriptName}}` | The default shell when host OS is not windows. |
| `sh`         | `sh -e {{.ScriptName}}`                                |                                                |
| `pwsh`       | `pwsh -command ". '{{.ScriptName}}'"`                  | The default shell when host OS is windows.     |
| `powershell` | `powershell -command ". '{{.ScriptName}}'"`            |                                                |
| `cmd`        | `cmd /D /E:ON /V:OFF /S /C "CALL "{{.ScriptName}}""`   |                                                |

2. Template string: `command [...options] {{.ScriptName}} [...more_options]`
  `{{.ScriptName}}` is replaced with the temporary script file path
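Resolving a `shell` value to a command line, under the two cases listed above, might be sketched as follows. The keyword templates are copied from the table; `commandLine` is a hypothetical helper, and plain string replacement is used here for brevity, which is an assumption about the mechanism.

```go
package main

import (
	"fmt"
	"strings"
)

// builtinShells maps the keywords from the table to their command templates.
var builtinShells = map[string]string{
	"bash":       `bash --noprofile --norc -eo pipefail {{.ScriptName}}`,
	"sh":         `sh -e {{.ScriptName}}`,
	"pwsh":       `pwsh -command ". '{{.ScriptName}}'"`,
	"powershell": `powershell -command ". '{{.ScriptName}}'"`,
	"cmd":        `cmd /D /E:ON /V:OFF /S /C "CALL "{{.ScriptName}}""`,
}

// commandLine resolves a shell value to a command line for scriptName.
// A known keyword selects its builtin template; any other value is treated
// as a template string itself. Hypothetical helper for illustration.
func commandLine(shell, scriptName string) string {
	tmpl, ok := builtinShells[shell]
	if !ok {
		tmpl = shell
	}
	return strings.ReplaceAll(tmpl, "{{.ScriptName}}", scriptName)
}

func main() {
	fmt.Println(commandLine("bash", "/tmp/script.sh"))
	fmt.Println(commandLine("zsh -e {{.ScriptName}}", "/tmp/script.sh"))
}
```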

There are shorthand forms for the builtin shells:
```yaml
- bash: echo "executed by bash"                    # interpreted as {shell: bash, hostOS: [darwin, linux], script: ...}
- sh: echo "executed by sh"                        # interpreted as {shell: sh, hostOS: [darwin, linux], script: ...}
- pwsh: Write-Host "executed by pwsh"              # interpreted as {shell: pwsh, hostOS: [windows], script: ...}
- powershell: Write-Host "executed by powershell"  # interpreted as {shell: powershell, hostOS: [windows], script: ...}
- cmd: echo "executed by cmd"                      # interpreted as {shell: cmd, hostOS: [windows], script: ...}
```
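The expansion described in the comments above can be sketched like this. The `hostProvision` struct and `expandShorthand` are hypothetical, the entry is simplified to a string map, and extra keys that may accompany a shorthand (such as `debug`, `hostOS` or `wait`) are not handled.

```go
package main

import "fmt"

// hostProvision is a reduced, hypothetical form of one list entry.
type hostProvision struct {
	Shell  string
	HostOS []string
	Script string
}

// shorthands lists the shorthand keys and the host OSes each implies,
// in a fixed order so that the lookup is deterministic.
var shorthands = []struct {
	key    string
	hostOS []string
}{
	{"bash", []string{"darwin", "linux"}},
	{"sh", []string{"darwin", "linux"}},
	{"pwsh", []string{"windows"}},
	{"powershell", []string{"windows"}},
	{"cmd", []string{"windows"}},
}

// expandShorthand turns e.g. {bash: "..."} into the long form
// {shell: bash, hostOS: [darwin, linux], script: "..."}.
// It returns false when the entry uses no shorthand key.
func expandShorthand(entry map[string]string) (hostProvision, bool) {
	for _, s := range shorthands {
		if script, ok := entry[s.key]; ok {
			return hostProvision{Shell: s.key, HostOS: s.hostOS, Script: script}, true
		}
	}
	return hostProvision{}, false
}

func main() {
	p, ok := expandShorthand(map[string]string{"bash": `echo "executed by bash"`})
	fmt.Printf("%v %+v\n", ok, p)
}
```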

e.g.
```yaml
- bash: | # Post a notification when an error by the hostProvision script is detected
    jq=/opt/homebrew/bin/jq && test -x $jq || exit 0
    tail -n0 -F ha.stderr.log | while read -r line; do
      msg=$(echo "$line"|$jq -er '
        select(.hostProvision and .hostProvision != {{.Index}})| # select log lines from other hostProvision scripts
        select(.level == "error")|                               # select error log lines
        .msg
      ') || continue
      osascript -e "on run argv" -e "display notification (item 1 of argv) with title \"Lima\"" -e "end run" "$msg"
      echo Posted a notification
    done
  debug: false
  hostOS: darwin
  wait: false
```

This PR is an alternative solution to lima-vm#2159.

Signed-off-by: Norio Nomura <norio.nomura@gmail.com>
norio-nomura added a commit to norio-nomura/lima that referenced this pull request Feb 4, 2024
norio-nomura added a commit to norio-nomura/lima that referenced this pull request Feb 4, 2024