Proposal: Generate an error file

# Objective
The goal of this feature is to export git-sync errors to a file that other sidecar containers can read.

# Motivation
The git-sync process currently writes error information to standard output, which is inaccessible from outside the container. To see the details of an error, users have to dump the logs with `kubectl logs`. This proposal lets users check the errors directly from other sidecar containers.

# Goals
- Share the most recent git-sync error with other sidecar containers.

# Non-Goals
- This proposal does not intend to redirect all logging information to a file.
- The error file only keeps the most recent error. It is not supposed to be used as the error history.

# High-level Design
A new flag indicates whether to generate the error file. When the flag is set, the error file may or may not exist, depending on the sync status.
If the sync succeeds, the file is removed. Otherwise, the file is created or overwritten with the most recent error message.
To guarantee atomic updates to the error file, we always write to a temp file first and then rename it to the error file. Because every update replaces the file, the client can tell that it has been updated even if the contents are the same.
The client polls the error file to get the error message on failures.
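
To illustrate the client side, here is a minimal sketch of a single poll of the error file from a sidecar. The names `syncStatus`, `checkSyncError`, and `demo` are hypothetical, and the proposal does not prescribe a client implementation; the sketch only assumes the rules stated above (file present means the last sync failed, file absent means it did not).

```go
package main

import (
	"fmt"
	"io"
	"os"
	"path/filepath"
	"time"
)

// syncStatus is a hypothetical snapshot of the error file as seen by a client.
type syncStatus struct {
	Failed  bool      // true when the error file exists
	Message string    // contents of the error file; empty when it is absent
	ModTime time.Time // modification time of the file that was read
}

// checkSyncError reads the error file once. A missing file is reported as
// "not failed" rather than as an error, matching the rule that the file is
// removed after a successful sync.
func checkSyncError(path string) (syncStatus, error) {
	f, err := os.Open(path)
	if os.IsNotExist(err) {
		return syncStatus{}, nil
	}
	if err != nil {
		return syncStatus{}, err
	}
	defer f.Close()

	// Stat and read through the same handle so both describe the same file,
	// even if a new file is renamed into place in the meantime.
	info, err := f.Stat()
	if err != nil {
		return syncStatus{}, err
	}
	data, err := io.ReadAll(f)
	if err != nil {
		return syncStatus{}, err
	}
	return syncStatus{Failed: true, Message: string(data), ModTime: info.ModTime()}, nil
}

// demo simulates a failed sync followed by a successful one in a temporary
// directory and returns what a client would observe after each.
func demo() (afterFailure, afterSuccess syncStatus, err error) {
	dir, err := os.MkdirTemp("", "git-sync-demo-")
	if err != nil {
		return
	}
	defer os.RemoveAll(dir)
	path := filepath.Join(dir, "error")

	if err = os.WriteFile(path, []byte("sync failed: timeout"), 0644); err != nil {
		return
	}
	if afterFailure, err = checkSyncError(path); err != nil {
		return
	}
	if err = os.Remove(path); err != nil {
		return
	}
	afterSuccess, err = checkSyncError(path)
	return
}

func main() {
	failed, ok, err := demo()
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	fmt.Printf("after failure: failed=%v message=%q\n", failed.Failed, failed.Message)
	fmt.Printf("after success: failed=%v\n", ok.Failed)
}
```

A real client would presumably call `checkSyncError` on a timer and compare `ModTime` between polls to notice a rewrite whose contents did not change.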

# Implementation details
## Add a flag for `--error-file`
```
var flErrorFile = flag.String("error-file", envString("GIT_SYNC_ERROR_FILE", ""), "the path of the file to which the most recent error details are written")
```
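
The flag takes its default from the `GIT_SYNC_ERROR_FILE` environment variable through `envString`, which this proposal uses but does not define. The sketch below assumes `envString` returns the variable's value when it is set and non-empty, and the given default otherwise; the real helper in `main.go` may differ. `resolveErrorFile` is a hypothetical function that exists only to show the resulting precedence.

```go
package main

import (
	"flag"
	"fmt"
	"os"
)

// envString returns the value of the environment variable key, or def when
// the variable is unset or empty. This is an assumed behavior, not a copy of
// the project's helper.
func envString(key, def string) string {
	if val := os.Getenv(key); val != "" {
		return val
	}
	return def
}

// resolveErrorFile is a hypothetical helper for this sketch: it sets the
// environment variable, registers the flag on a fresh FlagSet, parses args,
// and returns the effective error file path.
func resolveErrorFile(envValue string, args []string) (string, error) {
	if err := os.Setenv("GIT_SYNC_ERROR_FILE", envValue); err != nil {
		return "", err
	}
	fs := flag.NewFlagSet("git-sync", flag.ContinueOnError)
	errorFile := fs.String("error-file", envString("GIT_SYNC_ERROR_FILE", ""),
		"the path to the error file")
	if err := fs.Parse(args); err != nil {
		return "", err
	}
	return *errorFile, nil
}

func main() {
	// The command-line flag, when given, overrides the environment default.
	path, err := resolveErrorFile("/tmp/from-env", []string{"--error-file=/tmp/from-flag"})
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	fmt.Println(path)
}
```

With neither the flag nor the variable set, the path is empty and the feature stays off, which is what the `*flErrorFile != ""` checks in the functions below rely on.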

## When to export the error information
### 1. Before calling `os.Exit()` with a non-zero code
Whenever the process exits abnormally with a non-zero code, it performs the same steps: print the error to standard error, optionally print the usage information, export the error to the error file, and exit. We therefore extract these steps into a function.
```
// handleError prints the error to the standard error, prints the usage if the `printUsage` flag is true,
// exports the error to the error file and exits the process with the exit code.
func handleError(exitCode int, printUsage bool, format string, a ...interface{}) {
	fmt.Fprintf(os.Stderr, format, a...)
	if printUsage {
		flag.Usage()
	}
	exportError(fmt.Sprintf(format, a...))
	os.Exit(exitCode)
}
```

Below are the cases when the process exits with a non-zero code:
- [Flag initialization errors](https://github.com/kubernetes/git-sync/blob/release-3.x/cmd/git-sync/main.go#L253)
  If any flag is invalid, export the error before exiting by calling `handleError(1, true|false, "ERROR: can't ....: %v\n", err)`

- [Unhandled pid1 errors](https://github.com/kubernetes/git-sync/blob/release-3.x/cmd/git-sync/main.go#L239): exit via `handleError(127, false, "ERROR: unhandled pid1 error: %v\n", err)`

- Errors in [`syncRepo`](https://github.com/kubernetes/git-sync/blob/release-3.x/cmd/git-sync/main.go#L456)
```
log.Error(err, "too many failures, aborting", "failCount", failCount)
exportError(err.Error())
os.Exit(1)
```

- Errors in [`revIsHash`](https://github.com/kubernetes/git-sync/blob/release-3.x/cmd/git-sync/main.go#L480)
```
log.Error(err, "can't tell if rev is a git hash, exiting", "rev", *flRev)
exportError(err.Error())
os.Exit(1)
```

### 2. Before [retry](https://github.com/kubernetes/git-sync/blob/release-3.x/cmd/git-sync/main.go#L460)
```
log.Error(err, "unexpected error syncing repo, will retry")
exportError(err.Error())
log.V(0).Info("waiting before retrying", "waitTime", waitTime(*flWait))
cancel()
```

## When to clean up the error file
If the sync succeeds, we need to clean up any error file left by a previous sync. The file is cleaned up in three cases:
### 1. If it is a [one-time](https://github.com/kubernetes/git-sync/blob/release-3.x/cmd/git-sync/main.go#L475) sync
```
if *flOneTime {
	deleteErrorFile()
	os.Exit(0)
}
```

### 2. If `--rev` is a [commit hash](https://github.com/kubernetes/git-sync/blob/release-3.x/cmd/git-sync/main.go#L482)
```
else if isHash {
	log.V(0).Info("rev appears to be a git hash, no further sync needed", "rev", *flRev)
	deleteErrorFile()
	sleepForever()
}
```

### 3. Before the [next sync](https://github.com/kubernetes/git-sync/blob/release-3.x/cmd/git-sync/main.go#L489)
```
failCount = 0
deleteErrorFile()
log.V(1).Info("next sync", "wait_time", waitTime(*flWait))
```
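
Taken together, the export and cleanup rules mean the error file always reflects the most recent sync attempt. The following is a simplified model of that lifecycle, not the actual main loop: `syncLoop` and `simulate` are hypothetical names, sync results are represented as plain strings, the `failCount` abort is left out, and `os.WriteFile` stands in for the atomic `exportError`.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
)

// syncLoop is a hypothetical, simplified model of the main loop. outcomes
// holds one entry per sync attempt: an error message for a failed attempt, or
// the empty string for a successful one. It returns the contents of the error
// file after the last attempt ("" when the file does not exist).
func syncLoop(errorFile string, outcomes []string) (string, error) {
	for _, msg := range outcomes {
		if msg != "" {
			// Failed attempt: record the most recent error before retrying.
			// os.WriteFile stands in for exportError and is not atomic.
			if err := os.WriteFile(errorFile, []byte(msg), 0644); err != nil {
				return "", err
			}
			continue
		}
		// Successful attempt: remove any error left by an earlier failure.
		if err := os.Remove(errorFile); err != nil && !os.IsNotExist(err) {
			return "", err
		}
	}
	data, err := os.ReadFile(errorFile)
	if os.IsNotExist(err) {
		return "", nil
	}
	return string(data), err
}

// simulate runs syncLoop against an error file in a fresh temporary directory.
func simulate(outcomes []string) (string, error) {
	dir, err := os.MkdirTemp("", "git-sync-loop-")
	if err != nil {
		return "", err
	}
	defer os.RemoveAll(dir)
	return syncLoop(filepath.Join(dir, "error"), outcomes)
}

func main() {
	for _, outcomes := range [][]string{
		{"clone failed", "fetch failed"}, // two failures: the later error wins
		{"clone failed", ""},             // failure, then success: file removed
	} {
		content, err := simulate(outcomes)
		if err != nil {
			fmt.Fprintln(os.Stderr, err)
			os.Exit(1)
		}
		fmt.Printf("outcomes=%q error file=%q\n", outcomes, content)
	}
}
```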

## Validation, export and delete functions
The validation function for the new flag:
```
// validateErrorPath validates that the parent directory of `--error-file` exists and creates it if it does not.
// It also checks that the parent directory is readable and writable.
func validateErrorPath(errPath string) {
	errDir := filepath.Dir(errPath)
	if errDirInfo, err := os.Stat(errDir); os.IsNotExist(err) {
		fmt.Printf("The error parent path %s doesn't exist. Attempting to create it\n", errDir)
		if err = os.MkdirAll(errDir, 0755); err != nil {
			fmt.Fprintf(os.Stderr, "ERROR: can't create directory: %v\n", err)
			os.Exit(1)
		}
	} else if err != nil {
		// Any other Stat failure (for example, permission denied) is fatal.
		fmt.Fprintf(os.Stderr, "ERROR: can't stat %s: %v\n", errDir, err)
		os.Exit(1)
	} else if !errDirInfo.IsDir() {
		fmt.Fprintf(os.Stderr, "ERROR: The error parent path %s is not a directory\n", errDir)
		os.Exit(1)
	} else if err = syscall.Access(errDir, 0x4|0x2); err != nil {
		// 0x4|0x2 is the access(2) mode R_OK|W_OK; syscall.O_RDWR is an open(2) flag.
		fmt.Fprintf(os.Stderr, "ERROR: can't access %s: %v\n", errDir, err)
		os.Exit(1)
	}
}
```
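
As a possible alternative to `syscall.Access`, writability can be probed by creating and removing a temporary file in the directory. This is a different technique from the one proposed above, sketched under the assumption that a short-lived probe file is acceptable. The name `checkErrorDir` is hypothetical, and the function returns an error instead of exiting so that it can be unit-tested.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
)

// checkErrorDir ensures that the parent directory of errPath exists (creating
// it if needed), is a directory, and accepts new files.
func checkErrorDir(errPath string) error {
	dir := filepath.Dir(errPath)
	info, err := os.Stat(dir)
	switch {
	case os.IsNotExist(err):
		if err := os.MkdirAll(dir, 0755); err != nil {
			return fmt.Errorf("can't create directory %s: %w", dir, err)
		}
	case err != nil:
		return fmt.Errorf("can't stat %s: %w", dir, err)
	case !info.IsDir():
		return fmt.Errorf("%s is not a directory", dir)
	}
	// Probe writability by creating and removing a temporary file.
	probe, err := os.CreateTemp(dir, ".probe-")
	if err != nil {
		return fmt.Errorf("can't write to %s: %w", dir, err)
	}
	probe.Close()
	return os.Remove(probe.Name())
}

// demo exercises checkErrorDir in a temporary directory: once with a missing
// parent directory and once with a regular file in the parent position.
func demo() (createdMissing, rejectedFile bool, err error) {
	root, err := os.MkdirTemp("", "git-sync-validate-")
	if err != nil {
		return
	}
	defer os.RemoveAll(root)

	// Case 1: the parent does not exist yet and should be created.
	nested := filepath.Join(root, "a", "b", "error")
	if err = checkErrorDir(nested); err != nil {
		return
	}
	info, err := os.Stat(filepath.Dir(nested))
	if err != nil {
		return
	}
	createdMissing = info.IsDir()

	// Case 2: the parent path is a regular file and should be rejected.
	blocker := filepath.Join(root, "file")
	if err = os.WriteFile(blocker, nil, 0644); err != nil {
		return
	}
	rejectedFile = checkErrorDir(filepath.Join(blocker, "error")) != nil
	return
}

func main() {
	created, rejected, err := demo()
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	fmt.Printf("created missing parent: %v, rejected file parent: %v\n", created, rejected)
}
```

The probe exercises the same operation that `exportError` later depends on (creating a file in that directory), at the cost of briefly creating a file a client might see.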

The export function:
```
// exportError writes the error content to the error file.
func exportError(content string) {
	if *flErrorFile == "" {
		return
	}
	// Create the temp file in the error file's directory: os.Rename fails
	// across filesystems, and os.TempDir() may be on a different one.
	tmpFile, err := ioutil.TempFile(filepath.Dir(*flErrorFile), "tmp-err-")
	if err != nil {
		fmt.Fprintf(os.Stderr, "ERROR: can't create temporary file: %v\n", err)
		os.Exit(1)
	}

	if _, err = tmpFile.WriteString(content); err != nil {
		fmt.Fprintf(os.Stderr, "ERROR: can't write to temporary file: %v\n", err)
		os.Exit(1)
	}
	// Close before renaming so the file is complete when it becomes visible.
	if err = tmpFile.Close(); err != nil {
		fmt.Fprintf(os.Stderr, "ERROR: can't close temporary file: %v\n", err)
	}

	if err = os.Rename(tmpFile.Name(), *flErrorFile); err != nil {
		fmt.Fprintf(os.Stderr, "ERROR: can't rename to error file: %v\n", err)
		os.Exit(1)
	}
}
```

The delete function to clean up the error file:
```
// deleteErrorFile deletes the error file if it exists.
func deleteErrorFile() {
	if *flErrorFile == "" {
		return
	}
	// Remove directly instead of calling Stat first: this avoids a race
	// between the two calls, and a missing file is not an error.
	if err := os.Remove(*flErrorFile); err != nil && !os.IsNotExist(err) {
		fmt.Fprintf(os.Stderr, "ERROR: can't delete the error file: %v\n", err)
		os.Exit(1)
	}
}
```
