Add health check endpoint #18465

Merged
merged 16 commits into from
May 4, 2022
Changes from 2 commits
44 changes: 44 additions & 0 deletions docs/content/doc/installation/on-kubernetes.en-us.md
@@ -25,3 +25,47 @@ helm install gitea gitea-charts/gitea
```

If you would like to customize your install, which includes kubernetes ingress, please refer to the complete [Gitea helm chart configuration details](https://gitea.com/gitea/helm-chart/)

## Health check endpoint

Gitea comes with a health check endpoint, `/api/healthz`. You can configure it in Kubernetes like this:

```yaml
livenessProbe:
  httpGet:
    path: /api/healthz
    port: http
  initialDelaySeconds: 200
  timeoutSeconds: 5
  periodSeconds: 10
  successThreshold: 1
  failureThreshold: 10
```

A successful health check responds with HTTP status code `200`. Here is an example:

```
HTTP/1.1 200 OK

{
  "status": "pass",
  "description": "Gitea: Git with a cup of tea",
  "checks": {
    "cache:ping": [
      {
        "status": "pass",
        "time": "2022-02-19T09:16:08Z"
      }
    ],
    "database:ping": [
      {
        "status": "pass",
        "time": "2022-02-19T09:16:08Z"
      }
    ]
  }
}
```

For more information, please refer to the Kubernetes documentation: [Define a liveness HTTP request](https://kubernetes.io/docs/tasks/configure-pod-container/configure-liveness-readiness-startup-probes/#define-a-liveness-http-request)
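
A monitoring script outside Kubernetes could consume the same response. The following is a minimal Go sketch, assuming the JSON shape shown in the example response; the type names, the `failedChecks` helper, and the failing `sampleBody` are illustrative and not part of Gitea.

```go
package main

import (
	"encoding/json"
	"fmt"
	"sort"
)

// componentStatus and healthResponse mirror the JSON fields of the example
// response. The Go type names are hypothetical.
type componentStatus struct {
	Status string `json:"status"`
	Time   string `json:"time"`
	Output string `json:"output,omitempty"`
}

type healthResponse struct {
	Status      string                       `json:"status"`
	Description string                       `json:"description"`
	Checks      map[string][]componentStatus `json:"checks"`
}

// sampleBody is a made-up failing response used only for this illustration.
const sampleBody = `{"status":"fail","description":"Gitea: Git with a cup of tea","checks":{"cache:ping":[{"status":"pass","time":"2022-02-19T09:16:08Z"}],"database:ping":[{"status":"fail","time":"2022-02-19T09:16:08Z"}]}}`

// failedChecks decodes a health response body and returns the sorted names
// of all checks that contain at least one entry not reporting "pass".
func failedChecks(body []byte) ([]string, error) {
	var rsp healthResponse
	if err := json.Unmarshal(body, &rsp); err != nil {
		return nil, err
	}
	var failed []string
	for name, results := range rsp.Checks {
		for _, r := range results {
			if r.Status != "pass" {
				failed = append(failed, name)
				break
			}
		}
	}
	sort.Strings(failed)
	return failed, nil
}

func main() {
	failed, err := failedChecks([]byte(sampleBody))
	if err != nil {
		fmt.Println("could not decode health response:", err)
		return
	}
	fmt.Println("failed checks:", failed)
}
```

In a real script the body would come from an HTTP GET against `/api/healthz`; the decoding step is kept separate here so it can be exercised without a running server.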
44 changes: 44 additions & 0 deletions docs/content/doc/installation/on-kubernetes.zh-tw.md
@@ -25,3 +25,47 @@ helm install gitea gitea-charts/gitea
```

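If you would like to customize your install, including the use of Kubernetes ingress, please refer to the complete [Gitea helm chart configuration details](https://gitea.com/gitea/helm-chart/)

## Health check endpoint

Gitea comes with a health check endpoint, `/api/healthz`. You can configure it in Kubernetes like this: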

```yaml
livenessProbe:
  httpGet:
    path: /api/healthz
    port: http
  initialDelaySeconds: 200
  timeoutSeconds: 5
  periodSeconds: 10
  successThreshold: 1
  failureThreshold: 10
```

A successful health check responds with HTTP status code `200`. Here is an example:

```
HTTP/1.1 200 OK

{
  "status": "pass",
  "description": "Gitea: Git with a cup of tea",
  "checks": {
    "cache:ping": [
      {
        "status": "pass",
        "time": "2022-02-19T09:16:08Z"
      }
    ],
    "database:ping": [
      {
        "status": "pass",
        "time": "2022-02-19T09:16:08Z"
      }
    ]
  }
}
```

For more information, please refer to the Kubernetes documentation: [Define a liveness HTTP request](https://kubernetes.io/zh/docs/tasks/configure-pod-container/configure-liveness-readiness-startup-probes/)
38 changes: 26 additions & 12 deletions modules/cache/cache.go
@@ -5,6 +5,7 @@
package cache

import (
"errors"
"fmt"
"strconv"

@@ -34,25 +35,38 @@ func NewContext() error {
if conn, err = newCache(setting.CacheService.Cache); err != nil {
return err
}
const testKey = "__gitea_cache_test"
const testVal = "test-value"
if err = conn.Put(testKey, testVal, 10); err != nil {
err = Ping()
if err != nil {
return err
}
val := conn.Get(testKey)
if valStr, ok := val.(string); !ok || valStr != testVal {
// If the cache is full, the Get may not read the expected value stored by Put.
// Since we have checked that Put can success, so we just show a warning here, do not return an error to panic.
log.Warn("cache (adapter:%s, config:%s) doesn't seem to work correctly, set test value '%v' but get '%v'",
setting.CacheService.Cache.Adapter, setting.CacheService.Cache.Conn,
testVal, val,
)
}
}

return err
}

// Ping checks whether the cache service works; if not, it returns an error
func Ping() error {
	if conn == nil {
		return errors.New("cache not available")
	}
	var err error
	const testKey = "__gitea_cache_test"
	const testVal = "test-value"
	if err = conn.Put(testKey, testVal, 10); err != nil {
		return err
	}
	val := conn.Get(testKey)
	if valStr, ok := val.(string); !ok || valStr != testVal {
		// If the cache is full, Get may not read the expected value stored by Put.
		// Since Put has already succeeded, only log a warning here instead of returning an error that would cause a panic.
		log.Warn("cache (adapter:%s, config:%s) doesn't seem to work correctly, set test value '%v' but get '%v'",
			setting.CacheService.Cache.Adapter, setting.CacheService.Cache.Conn,
			testVal, val,
		)
	}
	return nil
}
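
The write-then-read pattern used by `Ping` can be tried in isolation. Below is a minimal sketch, assuming a hypothetical `kvCache` interface and two toy implementations (`memCache`, `dropCache`) that stand in for the real cache adapter; none of these names exist in Gitea. Unlike `Ping`, which only logs a warning on a read mismatch, this sketch returns the mismatch as a boolean so the caller can decide how to treat it.

```go
package main

import (
	"errors"
	"fmt"
)

// kvCache is a hypothetical stand-in for the cache connection.
type kvCache interface {
	Put(key string, val interface{}, ttlSeconds int64) error
	Get(key string) interface{}
}

// memCache is a toy in-memory cache for illustration; it ignores the TTL.
type memCache struct{ data map[string]interface{} }

func newMemCache() *memCache { return &memCache{data: map[string]interface{}{}} }

func (m *memCache) Put(key string, val interface{}, _ int64) error {
	m.data[key] = val
	return nil
}

func (m *memCache) Get(key string) interface{} { return m.data[key] }

// dropCache accepts writes but never stores them, imitating a full cache.
type dropCache struct{}

func (dropCache) Put(string, interface{}, int64) error { return nil }
func (dropCache) Get(string) interface{}               { return nil }

// ping writes a test value and reads it back. The error is non-nil when the
// cache is missing or the write fails; the boolean reports whether the value
// actually round-tripped.
func ping(c kvCache) (bool, error) {
	if c == nil {
		return false, errors.New("cache not available")
	}
	const testKey = "__cache_test"
	const testVal = "test-value"
	if err := c.Put(testKey, testVal, 10); err != nil {
		return false, err
	}
	got, ok := c.Get(testKey).(string)
	return ok && got == testVal, nil
}

func main() {
	roundTripped, err := ping(newMemCache())
	fmt.Println("memCache:", roundTripped, err)

	roundTripped, err = ping(dropCache{})
	fmt.Println("dropCache:", roundTripped, err)
}
```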

// GetCache returns the currently configured cache
func GetCache() mc.Cache {
return conn
142 changes: 142 additions & 0 deletions routers/web/healthcheck/check.go
@@ -0,0 +1,142 @@
// Copyright 2022 The Gitea Authors. All rights reserved.
// Use of this source code is governed by a MIT-style
// license that can be found in the LICENSE file.

package healthcheck

import (
	"net/http"
	"os"
	"time"

	"code.gitea.io/gitea/models/db"
	"code.gitea.io/gitea/modules/cache"
	"code.gitea.io/gitea/modules/json"
	"code.gitea.io/gitea/modules/log"
	"code.gitea.io/gitea/modules/setting"
)

type status string

const (
	// pass healthy (acceptable aliases: "ok" to support Node's Terminus and "up" for Java's SpringBoot)
	// fail unhealthy (acceptable aliases: "error" to support Node's Terminus and "down" for Java's SpringBoot), and
	// warn healthy, with some concerns.
	//
	// ref https://datatracker.ietf.org/doc/html/draft-inadarei-api-health-check#section-3.1
	// status: (required) indicates whether the service status is acceptable
	// or not. API publishers SHOULD use following values for the field:
	// The value of the status field is case-insensitive and is tightly
	// related with the HTTP response code returned by the health endpoint.
	// For "pass" status, HTTP response code in the 2xx-3xx range MUST be
	// used. For "fail" status, HTTP response code in the 4xx-5xx range
	// MUST be used. In case of the "warn" status, endpoints MUST return
	// HTTP status in the 2xx-3xx range, and additional information SHOULD
	// be provided, utilizing optional fields of the response.
	pass status = "pass"
	fail status = "fail"
	warn status = "warn"
)

func (s status) ToHTTPStatus() int {
	if s == pass || s == warn {
		return http.StatusOK
	}
	return http.StatusFailedDependency
}

type checks map[string][]componentStatus

// response is the data returned by the health endpoint, which will be marshaled to JSON format
type response struct {
	Status      status `json:"status"`
	Description string `json:"description"` // a human-friendly description of the service
	Checks      checks `json:"checks"`      // The Checks Object
}

// componentStatus represents the status of a single check object:
// an object that provides detailed health statuses of additional downstream systems and endpoints
// which can affect the overall health of the main API.
type componentStatus struct {
	Status status `json:"status"`
	Time   string `json:"time"`             // the date-time, in ISO8601 format
	Output string `json:"output,omitempty"` // this field SHOULD be omitted for "pass" state.
}

// Check is the health check API handler
func Check(w http.ResponseWriter, r *http.Request) {
	rsp := response{
		Status:      pass,
		Description: "Gitea: Git with a cup of tea",
		Checks:      make(checks),
	}

	statuses := make([]status, 0)
	statuses = append(statuses, checkDatabase(rsp.Checks))
	statuses = append(statuses, checkCache(rsp.Checks))

	for _, s := range statuses {
		if s != pass {
			rsp.Status = fail
			break
		}
	}

	data, _ := json.MarshalIndent(rsp, "", " ")
	w.Header().Set("Content-Type", "application/json")
	w.WriteHeader(rsp.Status.ToHTTPStatus())
	_, _ = w.Write(data)
}
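
The aggregation in `Check` (any non-passing component turns the overall status into a failure, which maps to HTTP 424) can be exercised without a Gitea instance. The following is a simplified sketch using only the standard library; `checkFunc`, `healthHandler`, `probe`, `alwaysPass`, and `alwaysFail` are hypothetical names, and the response body is reduced to one status string per check.

```go
package main

import (
	"encoding/json"
	"errors"
	"fmt"
	"net/http"
	"net/http/httptest"
)

// checkFunc is a hypothetical signature for one component check.
type checkFunc func() error

func alwaysPass() error { return nil }
func alwaysFail() error { return errors.New("component unavailable") }

// healthHandler reports "pass" with 200 only if every check succeeds;
// otherwise it reports "fail" with 424 Failed Dependency.
func healthHandler(checks map[string]checkFunc) http.HandlerFunc {
	return func(w http.ResponseWriter, r *http.Request) {
		overall := "pass"
		results := make(map[string]string, len(checks))
		for name, check := range checks {
			if err := check(); err != nil {
				results[name] = "fail"
				overall = "fail"
			} else {
				results[name] = "pass"
			}
		}

		code := http.StatusOK
		if overall != "pass" {
			code = http.StatusFailedDependency
		}
		data, _ := json.Marshal(map[string]interface{}{
			"status": overall,
			"checks": results,
		})
		w.Header().Set("Content-Type", "application/json")
		w.WriteHeader(code)
		_, _ = w.Write(data)
	}
}

// probe sends one in-memory GET request to the handler and returns the
// HTTP status code it wrote.
func probe(h http.Handler) int {
	rec := httptest.NewRecorder()
	h.ServeHTTP(rec, httptest.NewRequest(http.MethodGet, "/api/healthz", nil))
	return rec.Code
}

func main() {
	healthy := healthHandler(map[string]checkFunc{
		"database:ping": alwaysPass,
		"cache:ping":    alwaysPass,
	})
	degraded := healthHandler(map[string]checkFunc{
		"database:ping": alwaysPass,
		"cache:ping":    alwaysFail,
	})
	fmt.Println(probe(healthy), probe(degraded)) // prints "200 424"
}
```

Because a Kubernetes HTTP probe treats any status code outside the 200-399 range as a failure, the 424 response is enough for the liveness probe to count the check as failed.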

// checkDatabase checks the Gitea database status
func checkDatabase(checks checks) status {
	st := componentStatus{}
	if err := db.GetEngine(db.DefaultContext).Ping(); err != nil {
		st.Status = fail
		st.Time = getCheckTime()
		log.Error("database ping failed with error: %v", err)
	} else {
		st.Status = pass
		st.Time = getCheckTime()
	}

	if setting.Database.UseSQLite3 && st.Status == pass {
		if !setting.EnableSQLite3 {
			st.Status = fail
			st.Time = getCheckTime()
			log.Error("SQLite3 health check failed with error: %v", "this Gitea binary is built without SQLite3 enabled")
		} else {
			if _, err := os.Stat(setting.Database.Path); err != nil {
				st.Status = fail
				st.Time = getCheckTime()
				log.Error("SQLite3 file exists check failed with error: %v", err)
			}
		}
	}

	checks["database:ping"] = []componentStatus{st}
	return st.Status
}

// checkCache checks the Gitea cache status
func checkCache(checks checks) status {
	if !setting.CacheService.Enabled {
		return pass
	}

	st := componentStatus{}
	if err := cache.Ping(); err != nil {
		st.Status = fail
		st.Time = getCheckTime()
		log.Error("cache ping failed with error: %v", err)
	} else {
		st.Status = pass
		st.Time = getCheckTime()
	}
	checks["cache:ping"] = []componentStatus{st}
	return st.Status
}

func getCheckTime() string {
	return time.Now().UTC().Format(time.RFC3339)
}
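
For reference, the `time` values in the response come from formatting a UTC timestamp with `time.RFC3339`. A small sketch with a fixed input shows the conversion; `formatCheckTime` and `sampleTime` are illustrative names, and the one-hour offset is chosen arbitrarily.

```go
package main

import (
	"fmt"
	"time"
)

// formatCheckTime converts any time to UTC and renders it as RFC 3339.
func formatCheckTime(t time.Time) string {
	return t.UTC().Format(time.RFC3339)
}

// sampleTime returns a fixed timestamp in a zone one hour ahead of UTC.
func sampleTime() time.Time {
	return time.Date(2022, 2, 19, 10, 16, 8, 0, time.FixedZone("UTC+1", 3600))
}

func main() {
	fmt.Println(formatCheckTime(sampleTime())) // prints "2022-02-19T09:16:08Z"
}
```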
3 changes: 3 additions & 0 deletions routers/web/web.go
@@ -29,6 +29,7 @@ import (
"code.gitea.io/gitea/routers/web/dev"
"code.gitea.io/gitea/routers/web/events"
"code.gitea.io/gitea/routers/web/explore"
"code.gitea.io/gitea/routers/web/healthcheck"
"code.gitea.io/gitea/routers/web/org"
"code.gitea.io/gitea/routers/web/repo"
"code.gitea.io/gitea/routers/web/user"
@@ -150,6 +151,8 @@ func Routes(sessioner func(http.Handler) http.Handler) *web.Route {
rw.WriteHeader(200)
})

routes.Get("/api/healthz", healthcheck.Check)

// Removed: toolbox.Toolboxer middleware will provide debug information which seems unnecessary
common = append(common, context.Contexter())
