Proposal: Generate an error file #326

@nan-yu

Objective

The goal of this feature is to export git-sync errors to a file that can be shared with other sidecar containers.

Motivation

The git-sync process currently writes error information only to its log output, which cannot be read from other containers. To check the error details, users have to dump the logs using kubectl logs. This proposal lets users read the most recent error directly from other sidecar containers.

Goals

  • Share the most recent git-sync error with other sidecar containers.

Non-Goals

  • This proposal does not intend to redirect all logging information to a file.
  • The error file only keeps the most recent error. It is not supposed to be used as the error history.

High-level Design

A new flag specifies the path of the error file; leaving it empty disables the feature. When the flag is set, the error file may or may not exist, depending on the sync status.
If a sync succeeds, the file is removed. Otherwise, the file is created or overwritten with the most recent error message.
To guarantee an atomic update of the error file, we always write to a temp file first and then rename it to the error file. Because the rename replaces the file itself, the client can tell that the file has been updated even if the contents are the same.
The client polls the error file to get the error message on failures.

Implementation details

Add a flag for --error-file

var flErrorFile = flag.String("error-file", envString("GIT_SYNC_ERROR_FILE", ""), "the path to the error file where to dump the most recent error details")

When to export the error information

1. Before calling os.Exit() with a non-zero code

There are some common steps if the process exits abnormally with a non-zero code: output the error to standard error, print the usage information (optional), export the error to the error file and exit the process. Therefore, we extract these steps into a function.

// handleError prints the error to the standard error, prints the usage if the `printUsage` flag is true,
// exports the error to the error file and exits the process with the exit code.
func handleError(exitCode int, printUsage bool, format string, a ...interface{}) {
	fmt.Fprintf(os.Stderr, format, a...)
	if printUsage {
		flag.Usage()
	}
	exportError(fmt.Sprintf(format, a...))
	os.Exit(exitCode)
}

Below are the cases when the process exits with a non-zero code:

log.Error(err, "too many failures, aborting", "failCount", failCount)
exportError(err.Error())
os.Exit(1)

log.Error(err, "can't tell if rev is a git hash, exiting", "rev", *flRev)
exportError(err.Error())
os.Exit(1)

2. Before retry

log.Error(err, "unexpected error syncing repo, will retry")
exportError(err.Error())
log.V(0).Info("waiting before retrying", "waitTime", waitTime(*flWait))
cancel()

When to clean up the error file

If the sync is successful, we need to remove any error file left over from a previous sync. There are three places where this happens:

1. If it is a one-time sync

if *flOneTime {
	deleteErrorFile()
	os.Exit(0)
}

2. If --rev is a commit hash

else if isHash {
	log.V(0).Info("rev appears to be a git hash, no further sync needed", "rev", *flRev)
	deleteErrorFile()
	sleepForever()
}

3. Before the next sync

failCount = 0
deleteErrorFile()
log.V(1).Info("next sync", "wait_time", waitTime(*flWait))

Validation, export and delete functions

The validation function for the new flag:

// validateErrorPath checks that the parent directory of `--error-file` exists and creates it if it does not.
// It also checks that the parent directory is readable and writable.
func validateErrorPath(errPath string) {
	errDir := filepath.Dir(errPath)
	if errDirInfo, err := os.Stat(errDir); err != nil {
		fmt.Printf("The error parent path %s doesn't exist. Attempt to create it\n", errDir)
		if err = os.MkdirAll(errDir, 0755); err != nil {
			fmt.Fprintf(os.Stderr, "ERROR: can't create directory: %v\n", err)
			os.Exit(1)
		}
	} else if !errDirInfo.IsDir() {
		fmt.Fprintf(os.Stderr, "ERROR: The error parent path %s is not a directory\n", errDir)
		os.Exit(1)
	} else if err = syscall.Access(errDir, 0x4|0x2); err != nil { // R_OK|W_OK; O_RDWR is an open(2) flag, not an access(2) mode
		fmt.Fprintf(os.Stderr, "ERROR: can't access %s: %v\n", errDir, err)
		os.Exit(1)
	}
}

The export function:

// exportError writes the error content to the error file.
// The temporary file is created next to the error file, because os.Rename
// is only atomic (and only guaranteed to work) within a single filesystem.
func exportError(content string) {
	if *flErrorFile != "" {
		tmpFile, err := ioutil.TempFile(filepath.Dir(*flErrorFile), "err-")
		if err != nil {
			fmt.Fprintf(os.Stderr, "ERROR: can't create temporary file: %v\n", err)
			os.Exit(1)
		}

		if _, err = tmpFile.WriteString(content); err != nil {
			fmt.Fprintf(os.Stderr, "ERROR: can't write to temporary file: %v\n", err)
			os.Exit(1)
		}
		// Close before renaming: a deferred close would run only after the
		// rename and would be skipped entirely by os.Exit.
		if err := tmpFile.Close(); err != nil {
			fmt.Fprintf(os.Stderr, "ERROR: can't close temporary file: %v\n", err)
		}

		if err := os.Rename(tmpFile.Name(), *flErrorFile); err != nil {
			fmt.Fprintf(os.Stderr, "ERROR: can't rename to error file: %v\n", err)
			os.Exit(1)
		}
	}
}

The delete function to clean up the error file:

// deleteErrorFile deletes the error file. A missing file is not an error.
func deleteErrorFile() {
	if *flErrorFile != "" {
		// Remove directly rather than Stat first; this avoids a second
		// system call and the window between the check and the removal.
		if err := os.Remove(*flErrorFile); err != nil && !os.IsNotExist(err) {
			fmt.Fprintf(os.Stderr, "ERROR: can't delete the error file: %v\n", err)
			os.Exit(1)
		}
	}
}
