Healthcheck endpoint in OpenCGA #1015

Closed
lawrencegripper opened this issue Jan 15, 2019 · 10 comments

@lawrencegripper
Contributor

lawrencegripper commented Jan 15, 2019

A healthcheck endpoint would be very useful. After some thought, it seems a good idea to make this new status private, so we propose implementing it at this endpoint:

_admin/status_

Note: we will keep (and improve) the current meta/status endpoint, which returns very basic information in the style of https://www.githubstatus.com/, for instance: {catalog: OK, storage: warning, ...}

New admin/status Endpoint
This new healthcheck endpoint will provide detailed information about the whole cluster installation. We are thinking of something like this:

version: 1.4.0,
commit: "",
installationDir: "",
javaVersion: "",

catalog: {
    installation: {
        installed: false,           // installation CLI executed successfully
        incidents: [
            "conf/configuration.yml not found"
        ]
    },

    database: {
        name: "MongoDB 3.6.6",
        driver: "mongodb-java-driver-3.6.4",
        running: true,              // MongoDB is running
        writable: false,            // OpenCGA can connect and write into Catalog
        incidents: [
            "MongoDB driver is 3.6.4", 
            "We cannot write into database"
        ]
    },

    sessions: {
        path: "/A/B/C",
        readable: true,
        writable: true
    },

    search: {
        ...
    }
},

storage: {
    ...
},

server: {
    name: "Tomcat 8.0.53",
    running: true,
    accessToInstallationDir: true,  // checks if it can read configuration files
    accessToSessions: true          // checks if OpenCGA can read and write in the 'sessions' folder
},
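
As an illustration of how one part of this structure could be filled in, here is a minimal Python sketch of the `sessions` section. It is not OpenCGA code: the function names `check_sessions` and `incidents_for` are hypothetical, and the wording of the incident messages is an assumption.

```python
import os


def check_sessions(path):
    """Return a dict shaped like the proposed `sessions` section.

    os.access reports whether the current process may read or write `path`;
    it returns False for a path that does not exist.
    """
    return {
        "path": path,
        "readable": os.access(path, os.R_OK),
        "writable": os.access(path, os.W_OK),
    }


def incidents_for(section):
    """List one incident message per failed check (message text is illustrative)."""
    return [
        f"sessions folder {section['path']} is not {key}"
        for key in ("readable", "writable")
        if not section[key]
    ]
```

The same pattern (a few booleans plus an `incidents` list) would presumably apply to the `installation` and `database` sections.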
@wbari
Contributor

wbari commented Feb 8, 2019

AZ requirements:

1: A single status WS
2: HTTP 500 error code in case of failure

@wbari
Contributor

wbari commented Feb 8, 2019

#1136

@wbari
Contributor

wbari commented Feb 8, 2019

Merged the code changes for AZ. The existing meta/status WS will now return the status of CatalogMongoDB, StorageDb (MongoDB or HBase, based on storageEngineId in the configuration) and Solr (which can optionally be deactivated in the configuration), and will return status code 500 in case of error.
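
The behaviour described here can be sketched roughly as follows. This is a Python illustration under stated assumptions, not the merged Java code: `run_status_checks`, the `solrEnabled` key, the `checks` mapping and the "ERROR" value are all hypothetical names chosen for the example.

```python
def run_status_checks(config, checks):
    """Run the enabled checks and return (http_status, results).

    config: dict with 'storageEngineId' and a hypothetical 'solrEnabled' flag.
    checks: dict mapping a component name to a zero-argument callable that
            returns True when the component is healthy (assumed interface).
    """
    engine = config.get("storageEngineId")
    if not engine:
        # A missing storage engine is treated as a failure.
        return 500, {"error": "No storageEngineId is set in configuration"}

    names = ["CatalogMongoDB", f"VariantStorage {engine}"]
    if config.get("solrEnabled", True):  # Solr can be switched off
        names.append("Solr")

    results = {}
    for name in names:
        try:
            healthy = checks[name]()
        except Exception:
            # A missing or crashing check counts as unhealthy.
            healthy = False
        results[name] = "OK" if healthy else "ERROR"

    status = 200 if all(v == "OK" for v in results.values()) else 500
    return status, results
```

Returning 500 as soon as any enabled component fails lets a load balancer or probe treat the single status WS as one pass/fail signal.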

@lawrencegripper
Contributor Author

Awesome, looking forward to giving this a test next week!

@marrobi
Contributor

marrobi commented Feb 12, 2019

@wbari Solr seems to be good now, but I'm getting:

{"apiVersion":"v1","time":-1,"warning":"","error":"No storageEngineId is set in configuration","queryOptions":{"metadata":true,"skipCount":true,"limit":2000},"response":[{"id":"Status","dbTime":0,"numResults":-1,"numTotalResults":-1,"warningMsg":"Future errors will ONLY be shown in the QueryResponse body","errorMsg":"DEPRECATED: No storageEngineId is set in configuration","resultType":"","result":[]}]}

@marrobi
Contributor

marrobi commented Feb 13, 2019

{"apiVersion":"v1","time":885,"warning":"","error":"","queryOptions":{"metadata":true,"skipCount":true,"limit":2000},"response":[{"id":"Status","dbTime":0,"numResults":-1,"numTotalResults":-1,"warningMsg":"","errorMsg":"","resultType":"","result":[{"VariantStorage hadoop":"OK","Solr":"OK","CatalogMongoDB":"OK"}]}]} 👍
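
For anyone consuming this from a script, a response body like the one above could be checked along these lines. This is a small Python sketch; the helper names `component_statuses` and `is_healthy` are made up for the example, and it assumes the body always has the `error` / `response[].result[]` shape shown in this thread.

```python
import json

# The healthy body posted above, split across lines for readability.
HEALTHY = (
    '{"apiVersion":"v1","time":885,"warning":"","error":"",'
    '"queryOptions":{"metadata":true,"skipCount":true,"limit":2000},'
    '"response":[{"id":"Status","dbTime":0,"numResults":-1,"numTotalResults":-1,'
    '"warningMsg":"","errorMsg":"","resultType":"",'
    '"result":[{"VariantStorage hadoop":"OK","Solr":"OK","CatalogMongoDB":"OK"}]}]}'
)


def component_statuses(body):
    """Return (error_message, {component: status}) parsed from a status body."""
    doc = json.loads(body)
    statuses = {}
    for response in doc.get("response", []):
        for result in response.get("result", []):
            statuses.update(result)
    return doc.get("error", ""), statuses


def is_healthy(body):
    """True only if there is no error and every reported component is OK."""
    error, statuses = component_statuses(body)
    return not error and bool(statuses) and all(v == "OK" for v in statuses.values())
```

Since the WS also returns status code 500 on failure, checking the HTTP status alone is enough for a simple probe; parsing the body is only needed to see which component failed.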

@martinpeck
Contributor

martinpeck commented Feb 13, 2019

Actions:

  • move config set to GitHub (currently in Confluence)
  • allow configurable time interval
  • allow/consider caching the health check response
  • check success of CloudInit scripts
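
To make the interval and caching items concrete, here is one possible shape for a cached health check, sketched in Python. The class name `CachedHealthCheck` and its parameters are hypothetical, and the real implementation may look quite different.

```python
import time


class CachedHealthCheck:
    """Re-run an expensive check at most once per configurable interval."""

    def __init__(self, check, interval_seconds, clock=time.monotonic):
        self._check = check                # zero-argument callable doing the real work
        self._interval = interval_seconds  # the configurable time interval
        self._clock = clock                # injectable so tests can control time
        self._expires_at = None
        self._cached = None

    def get(self):
        """Return the cached response, refreshing it once the interval has elapsed."""
        now = self._clock()
        if self._expires_at is None or now >= self._expires_at:
            self._cached = self._check()
            self._expires_at = now + self._interval
        return self._cached
```

With this approach, frequent probes from a load balancer do not each trigger a round of database and Solr connections; the trade-off is that a failure can go unreported for up to one interval.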

@martinpeck
Contributor

related to PR #1150

@wbari
Contributor

wbari commented Feb 13, 2019

Created tasks for each item (except the last one, which lawrencegripper mentioned is already handled in TF).

@wbari wbari closed this as completed Feb 13, 2019
@wbari wbari reopened this Feb 13, 2019
@lawrencegripper
Contributor Author

lawrencegripper commented Feb 14, 2019

I agree this is good to close off now - Thanks @wbari!
