debian: make builder more defensive against network problems #1424

Open
marmarek opened this Issue Nov 14, 2015 · 3 comments

@marmarek
Member

marmarek commented Nov 14, 2015

When building in a TorVM-connected DispVM, it's hard to get an actual result because
a single apt-get error makes the whole build fail. It would be nice to force
apt-get to retry (even more) on network-related errors.

There is already -o Acquire::Retries=3, but apparently it isn't enough.
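
One way to go beyond `Acquire::Retries` would be to retry the whole apt-get invocation from the calling script. The following is only a minimal sketch, not what builder-debian does: `retry`, `RETRY_MAX` and `RETRY_WAIT` are hypothetical names introduced here for illustration.

```shell
#!/bin/bash
# Hypothetical helper: run a command and repeat it on any non-zero exit status.
# RETRY_MAX (attempts) and RETRY_WAIT (seconds) are illustrative names,
# not existing builder options.
retry() {
    local max="${RETRY_MAX:-5}" delay="${RETRY_WAIT:-10}"
    local attempt=1 status=0
    while true; do
        status=0
        "$@" || status=$?
        if [ "$status" -eq 0 ]; then
            return 0
        fi
        if [ "$attempt" -ge "$max" ]; then
            echo "retry: giving up after $attempt attempts: $*" >&2
            return "$status"
        fi
        echo "retry: attempt $attempt failed (exit $status), waiting ${delay}s" >&2
        sleep "$delay"
        attempt=$((attempt + 1))
    done
}
```

For example, the install step could be wrapped as `retry chroot /mnt/removable/chroot-jessie apt-get -o Acquire::Retries=3 -y install reprepro`. Packages that were already downloaded stay in apt's archive cache, so a repeated run should only need to fetch what is still missing.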

Example errors:

+ chroot /mnt/removable/chroot-jessie eatmydata apt-get -o Dpkg::Options::=--force-unsafe-io -o Acquire::Retries=3 -y install reprepro build-essential devscripts git git-buildpackage debhelper quilt libxen-dev python libpulse-dev libtool automake xorg-dev xutils-dev libxdamage-dev libxcomposite-dev libxt-dev libx11-dev
(...)
E: Failed to fetch http://http.debian.net/debian/pool/main/u/unzip/unzip_6.0-16_amd64.deb  Undetermined Error

E: Failed to fetch http://http.debian.net/debian/pool/main/w/wdiff/wdiff_1.2.2-1_amd64.deb  Undetermined Error

E: Failed to fetch http://http.debian.net/debian/pool/main/x/xdg-user-dirs/xdg-user-dirs_0.15-2_amd64.deb  Undetermined Error

E: Failed to fetch http://http.debian.net/debian/pool/main/r/reprepro/reprepro_4.16.0-1_amd64.deb  Undetermined Error

E: Unable to fetch some archives, maybe run apt-get update or try with --fix-missing?
/mnt/removable/qubes-src/builder-debian/Makefile.debian:151: recipe for target '/mnt/removable/chroot-jessie/home/user/.prepared_base' failed
make[1]: *** [/mnt/removable/chroot-jessie/home/user/.prepared_base] Error 100
Makefile:193: recipe for target 'vmm-xen-vm' failed
make: *** [vmm-xen-vm] Error 1
+ exit 1

After this operation, 1252 kB of additional disk space will be used.
Get:1 http://http.debian.net/debian/ jessie/main python-setuptools all 5.5.1-1 [242 kB]
Get:2 http://http.debian.net/debian/ jessie/main dh-systemd all 1.22 [18.1 kB]
Err http://http.debian.net/debian/ jessie/main libsystemd-dev amd64 215-17+deb8u2
  Could not resolve 'debian.mirror.lhisp.com'
Err http://http.debian.net/debian/ jessie/main python-all amd64 2.7.9-1
  Could not resolve 'debian.mirror.lhisp.com'
Err http://http.debian.net/debian/ jessie/main libsystemd-daemon-dev amd64 215-17+deb8u2
  Could not resolve 'debian.mirror.lhisp.com'
Fetched 260 kB in 3s (85.9 kB/s)
E: Failed to fetch http://http.debian.net/debian/pool/main/s/systemd/libsystemd-dev_215-17+deb8u2_amd64.deb  Could not resolve 'debian.mirror.lhisp.com'

E: Failed to fetch http://http.debian.net/debian/pool/main/p/python-defaults/python-all_2.7.9-1_amd64.deb  Could not resolve 'debian.mirror.lhisp.com'

E: Failed to fetch http://http.debian.net/debian/pool/main/s/systemd/libsystemd-daemon-dev_215-17+deb8u2_amd64.deb  Could not resolve 'debian.mirror.lhisp.com'

E: Unable to fetch some archives, maybe run apt-get update or try with --fix-missing?
/mnt/removable/qubes-src/builder-debian/Makefile.debian:170: recipe for target 'dist-build-dep' failed
make[2]: *** [dist-build-dep] Error 123
--> build failed!
Makefile.generic:138: recipe for target 'packages' failed
make[1]: *** [packages] Error 1
Makefile:193: recipe for target 'core-qubesdb-vm' failed
make: *** [core-qubesdb-vm] Error 1
+ exit 1
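
Since only network-related failures are worth retrying, a wrapper could inspect the apt-get output before deciding. The sketch below is an assumption-laden illustration: `looks_like_network_error`, `run_with_network_retry`, `RETRY_MAX` and `RETRY_WAIT` are hypothetical names, and the patterns are simply the messages quoted in the logs above. Matching on message text is brittle, and "Failed to fetch" also covers non-transient failures, so the list would need tuning.

```shell
#!/bin/bash
# Return 0 if the log file contains messages that look network-related.
# The patterns come from the failures quoted above; the list is an
# assumption and certainly incomplete.
looks_like_network_error() {
    grep -Eq "Could not resolve|Failed to fetch|Undetermined Error" "$1"
}

# Run a command, capture its output, and repeat it only when the failure
# looks network-related. Output is printed after each attempt finishes.
run_with_network_retry() {
    local max="${RETRY_MAX:-5}" delay="${RETRY_WAIT:-10}"
    local attempt=1 status=0 log
    log=$(mktemp) || return 1
    while true; do
        status=0
        "$@" >"$log" 2>&1 || status=$?
        cat "$log"
        if [ "$status" -eq 0 ]; then
            break
        fi
        if ! looks_like_network_error "$log"; then
            echo "not a network-related failure, not retrying" >&2
            break
        fi
        if [ "$attempt" -ge "$max" ]; then
            echo "giving up after $attempt attempts" >&2
            break
        fi
        sleep "$delay"
        attempt=$((attempt + 1))
    done
    rm -f "$log"
    return "$status"
}
```

Buffering the output in a temporary file keeps the sketch simple, at the cost of not streaming the build log live.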

@marmarek marmarek added this to the Release 3.1 milestone Nov 14, 2015

Member

marmarek commented Nov 14, 2015

any ideas? @adrelanos

Member

adrelanos commented Nov 14, 2015

An auto-retry feature, or a fully featured error handler.


Whonix has help-steps/pre (snapshot). It includes a fully featured error handler.

  • auto retries the last failed command (configurable with --retry-max and --retry-wait)
  • interactive (configurable) error handler
  • supports --retry-before and --retry-after hooks
  • offers opening an interactive shell within the image on error (to do manual fix-ups and continue later [useful for personal debug builds only] [only a time saver])
  • allows attempting another manual retry (useful if you know your ISP just did its usual 24-hour forced disconnect, or something like that)
  • allows continuing, i.e. ignoring the error [release-quality builds should not use this]
  • output includes a function back trace
  • output includes a process back trace

Here is a demo:

+ sudo -u user git submodule sync
Synchronizing submodule url for 'packages/anon-apt-sources-list'
...
Synchronizing submodule url for 'packages/anon-shared-helper-scripts'
++ errorhandlergeneral ERR
++ last_failed_exit_code=143
++ last_failed_bash_command='sudo -u "$user_name" git submodule sync'
++ true 'INFO: Middle of function errorhandlergeneral of ././build-steps.d/1100_prepare-build-machine.'
++ errorhandlerprocessshared ERR
++ last_script=././build-steps.d/1100_prepare-build-machine
++ trap_signal_type_previous=
++ '[' '' = '' ']'
++ trap_signal_type_previous=unset
++ trap_signal_type_last=ERR
++ whonix_build_error_counter=1
+++ benchmarktimeend 1447509137
++++ date +%s
+++ benchmarktimeend=1447509139
+++ benchmark_took_seconds=2
++++ convertsecs 2
++++ local h m s
++++ (( h=2/3600 ))
++++ true
++++ (( m=(2%3600)/60 ))
++++ true
++++ (( s=2%60 ))
++++ printf '%02d:%02d:%02d\n' 0 0 2
+++ echo 00:00:02
++ benchmark_took_time=00:00:02
++ processbacktracefunction
++ true 'INFO: BEGIN: processbacktracefunction'
++ '[' -o xtrace ']'
++ set +x
++ true 'INFO: END  : processbacktracefunction'
++ functiontracefunction
++ true 'INFO: BEGIN: functiontracefunction'
++ '[' -o xtrace ']'
++ set +x
++ true 'INFO: END  : functiontracefunction'
++ true '
############################################################
ERROR in ././build-steps.d/1100_prepare-build-machine detected!
anon_dist_build_version: 
(whonix_build_error_counter: 1)
(benchmark: 00:00:02)
trap_signal_type_previous: unset
trap_signal_type_last    : ERR
process_backtrace_result:
1: : /sbin/init 
2: : konsole 
3: : /bin/bash 
4: : sudo ./whonix_build --flavor whonix-gateway -- --build --target raw 
5: : /bin/bash ./whonix_build --flavor whonix-gateway -- --build --target raw 
6: : /bin/bash ./help-steps/whonix_build_one --flavor whonix-gateway --build --target raw 
7: : /bin/bash ././build-steps.d/1100_prepare-build-machine 
function_trace_result:
main (line number: 341)
main (line number: 205)
errorhandlergeneral (line number: 311)
errorhandlerprocessshared (line number: 159)
last_failed_bash_command: sudo -u "$user_name" git submodule sync
last_failed_exit_code: 143
ERROR in ././build-steps.d/1100_prepare-build-machine detected!
############################################################
'
++ '[' ERR = INT ']'
++ '[' ERR = TERM ']'
++ '[' ERR = ERR ']'
++ true 'INFO: trap_signal_type_last: ERR, considering auto retry...'
++ '[' '!' 1 = 0 ']'
++ '[' '' = '' ']'
++ whonix_build_auto_retry_counter=1
++ '[' -n 1 ']'
++ '[' -n 5 ']'
++ local first
++ read -r first _
++ '[' sudo = error_ ']'
++ '[' 1 -gt 1 ']'
++ true 'INFO: Auto retry attempt number: 1. Max retry attempts: 1 (--retry-max). Auto retry... '
++ whonix_build_auto_retry_counter=2
++ '[' '!' 5 = 0 ']'
++ true 'INFO: Waiting (--retry-wait) 5 seconds before auto retry... '
++ wait 8989
++ sleep 5
++ ignore_error=true
++ error_handler_do_retry=true
++ errorhandlerretry
++ '[' '!' '' = '' ']'
++ true 'INFO: Skipping whonix_build_dispatch_before_retry (--retry-before), because empty, ok.'
++ true 'INFO: Retrying last_failed_bash_command...: sudo -u "$user_name" git submodule sync '
++ retry_last_failed_bash_command_exit_code=0
++ eval sudo -u '"$user_name"' git submodule sync
+++ sudo -u user git submodule sync
Synchronizing submodule url for 'packages/anon-apt-sources-list'
...
Synchronizing submodule url for 'packages/tcp-timestamps-disable'
++ retry_last_failed_bash_command_exit_code=143
++ true
++ '[' 143 = 0 ']'
++ true 'INFO: Retry failed. exit code of last_failed_bash_command: 143 '
++ last_failed_exit_code=143
++ last_failed_bash_command='sudo -u "$user_name" git submodule sync'
++ '[' '!' '' = '' ']'
++ true 'INFO: Skipping whonix_build_dispatch_after_retry (--retry-after), because empty, ok.'
++ '[' 143 = 0 ']'
++ errorhandlerprocessshared 'NONE_(called_by_errorhandlerretry)'
++ last_script=././build-steps.d/1100_prepare-build-machine
++ trap_signal_type_previous=ERR
++ '[' ERR = '' ']'
++ trap_signal_type_last='NONE_(called_by_errorhandlerretry)'
++ whonix_build_error_counter=2
+++ benchmarktimeend 1447509137
++++ date +%s
+++ benchmarktimeend=1447509148
+++ benchmark_took_seconds=11
++++ convertsecs 11
++++ local h m s
++++ (( h=11/3600 ))
++++ true
++++ (( m=(11%3600)/60 ))
++++ true
++++ (( s=11%60 ))
++++ printf '%02d:%02d:%02d\n' 0 0 11
+++ echo 00:00:11
++ benchmark_took_time=00:00:11
++ processbacktracefunction
++ true 'INFO: BEGIN: processbacktracefunction'
++ '[' -o xtrace ']'
++ set +x
++ true 'INFO: END  : processbacktracefunction'
++ functiontracefunction
++ true 'INFO: BEGIN: functiontracefunction'
++ '[' -o xtrace ']'
++ set +x
++ true 'INFO: END  : functiontracefunction'
++ true '
############################################################
ERROR in ././build-steps.d/1100_prepare-build-machine detected!
anon_dist_build_version: 
(whonix_build_error_counter: 2)
(benchmark: 00:00:11)
trap_signal_type_previous: ERR
trap_signal_type_last    : NONE_(called_by_errorhandlerretry)
process_backtrace_result:
1: : /sbin/init 
2: : konsole 
3: : /bin/bash 
4: : sudo ./whonix_build --flavor whonix-gateway -- --build --target raw 
5: : /bin/bash ./whonix_build --flavor whonix-gateway -- --build --target raw 
6: : /bin/bash ./help-steps/whonix_build_one --flavor whonix-gateway --build --target raw 
7: : /bin/bash ././build-steps.d/1100_prepare-build-machine 
function_trace_result:
main (line number: 341)
main (line number: 205)
errorhandlergeneral (line number: 311)
errorhandlerprocessshared (line number: 209)
errorhandlerretry (line number: 144)
errorhandlerprocessshared (line number: 159)
errorhandlerprocessshared (line number: 159)
errorhandlergeneral (line number: 311)
main (line number: 205)
main (line number: 341)
last_failed_bash_command: sudo -u "$user_name" git submodule sync
last_failed_exit_code: 143
ERROR in ././build-steps.d/1100_prepare-build-machine detected!
############################################################
'
++ '[' 'NONE_(called_by_errorhandlerretry)' = INT ']'
++ '[' 'NONE_(called_by_errorhandlerretry)' = TERM ']'
++ '[' 'NONE_(called_by_errorhandlerretry)' = ERR ']'
++ '[' 'NONE_(called_by_errorhandlerretry)' = 'NONE_(called_by_errorhandlerretry)' ']'
++ true 'INFO: trap_signal_type_last: NONE_(called_by_errorhandlerretry), considering auto retry...'
++ '[' '!' 1 = 0 ']'
++ '[' 2 = '' ']'
++ '[' -n 1 ']'
++ '[' -n 5 ']'
++ local first
++ read -r first _
++ '[' sudo = error_ ']'
++ '[' 2 -gt 1 ']'
++ true 'INFO: Auto retried (--retry-max) already 1 times. No more auto retry. '
++ unset whonix_build_auto_retry_counter
++ ignore_error=false
++ answer=
++ '[' 'NONE_(called_by_errorhandlerretry)' = ERR ']'
++ '[' 'NONE_(called_by_errorhandlerretry)' = 'NONE_(called_by_errorhandlerretry)' ']'
++ true 'INFO: whonix_build_non_interactive: '
++ '[' '' = true ']'
++ '[' -t 0 ']'
++ true 'INFO: stdin connected to terminal, using interactive error handler.'
++ true 'ERROR in ././build-steps.d/1100_prepare-build-machine detected!
Please have a look above (the block within ###...), note the command that failed, last_failed_exit_code and its output (further above).
- Please enter c and press enter to ignore the error and continue building. (Recommended against!)
- Please press r and enter to retry.
- Please press s and enter to open an chroot interactive shell.
- Please press enter to cleanup and exit.'
++ read -p 'Answer? ' answer
Answer?
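
The auto-retry part of the trace above can be approximated with a bash ERR trap. This is a rough sketch under stated assumptions, not the Whonix implementation: `error_handler`, `retry_max` and `retry_wait` are illustrative names that mirror `--retry-max` and `--retry-wait`, and the interactive prompt, the hooks and the backtraces are left out. Like the trace, it re-runs the failed command text with `eval`.

```shell
#!/bin/bash
# Sketch only: auto-retry the command that triggered the ERR trap.
# retry_max and retry_wait are illustrative names, not Whonix variables.
retry_max="${retry_max:-1}"
retry_wait="${retry_wait:-5}"

error_handler() {
    local exit_code=$1 failed_command=$2 attempt=1
    echo "ERROR: '$failed_command' failed with exit code $exit_code" >&2
    while [ "$attempt" -le "$retry_max" ]; do
        echo "INFO: auto retry $attempt of $retry_max after ${retry_wait}s" >&2
        sleep "$retry_wait"
        exit_code=0
        # Re-run the failed command text; "||" keeps a second failure
        # from triggering the ERR trap again.
        eval "$failed_command" || exit_code=$?
        if [ "$exit_code" -eq 0 ]; then
            echo "INFO: retry succeeded, continuing" >&2
            return 0
        fi
        attempt=$((attempt + 1))
    done
    echo "ERROR: still failing after $retry_max retries, giving up" >&2
    exit "$exit_code"
}

# Let functions and subshells inherit the trap, then install it.
# Single quotes delay expansion until the trap actually fires.
set -o errtrace
trap 'error_handler "$?" "$BASH_COMMAND"' ERR
```

Because the command text is evaluated a second time, this only suits commands that are safe to run twice, such as a package download or `git submodule sync`.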

@marmarek marmarek referenced this issue in QubesOS/qubes-gui-agent-linux Nov 8, 2016

Merged

Exclude Trolltech.conf in Xenial build #7

Member

unman commented Apr 16, 2017

@andrewdavidwong Confirmed: this issue still arises in the 3.2 milestone.
