dasgoclient does not fail on error #19

Closed
smuzaffar opened this issue May 23, 2018 · 5 comments

@smuzaffar

During CMSSW IB validation tests we noticed that a dasgoclient query failed but dasgoclient did not exit with a non-zero code. For example, we got an error like [a], yet the dasgoclient process exited with exit code zero.

These failures are now fairly frequent (this happened twice in the last 10 days). First, do we know why the upstream server is failing? Second, can we make dasgoclient fail in such cases?

[a]

      "file": [
        {
          "error": "DBS unable to unmarshal the data into DAS record, api=file4DatasetRunLumi, data=<!DOCTYPE HTML PUBLIC \"-//IETF//DTD HTML 2.0//EN\">\n<html><head>\n<title>502 Proxy Error</title>\n</head><body>\n<h1>Proxy Error</h1>\n<p>The proxy server received an invalid\r\nresponse from an upstream server.<br />\r\nThe proxy server could not handle the request <em><a href=\"/auth/complete/dbs/prod/global/DBSReader/files/\">GET&nbsp;/auth/complete/dbs/prod/global/DBSReader/files/</a></em>.<p>\nReason: <strong>Error reading from remote server</strong></p></p>\n</body></html>\n, error=invalid character '<' looking for beginning of value"
        }
      ],
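Until dasgoclient reports such failures through its exit status, a caller could scan the output for error records like the one in [a]. The following is a minimal sketch, assuming the output is JSON in which failures appear under an "error" key as shown above; the function name das_output_has_error is hypothetical and not part of dasgoclient, and the pattern match is a heuristic rather than a real JSON parse.

```shell
# Hypothetical helper, not part of dasgoclient.
# Reads text on stdin and succeeds (status 0) if it contains an
# "error" key, as in the record shown in [a].
# Assumption: error records use the literal key "error"; this is a
# plain pattern match, not a JSON parser.
das_output_has_error() {
  grep -q '"error"[[:space:]]*:'
}
```

A validation job could pipe the captured dasgoclient output into das_output_has_error and mark the test as failed when it succeeds, even if the process itself exited with zero.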
@smuzaffar
Author

FYI @vkuznet

@vkuznet
Collaborator

vkuznet commented May 23, 2018

Shahzad,
this is an upstream error and not a dasgoclient-specific error. Therefore I suggest introducing different error codes to address different use cases, e.g. 1 for a dasgoclient-specific error, 2 for a DBS upstream error, 3 for PhEDEx, etc. I also suggest adding an -errorCodes flag which will print out all error codes.

Regarding your question about the cause and frequency of this type of error: I have said for many years that the instabilities of our cmsweb data-services are increasing. This particular error comes from DBS and is then wrapped by our frontend. The obvious cause is increased load on the DBS server(s). I have already shown that the throughput of our Python-based services is so-so, so I'm not surprised. Of course it would be nice to notify the DBS maintainer (Yuyi) about it and provide timestamps/queries that she can look up and correlate in the DBS/frontend logs.

@vkuznet
Collaborator

vkuznet commented May 23, 2018

ok, here is a first implementation (commit: ff21d73)

$ ./dasgoclient -errorCodes
DAS error codes:
1 DAS error
2 DBS upstream error
3 PhEDEx upstream error
4 ReqMgr upstream error
5 RunRegistry upstream error
6 McM upstream error
7 Dashboard upstream error
8 SiteDB upstream error
9 CondDB upstream error
10 Combined error
11 MongoDB error
12 DAS proxy error
13 DAS query error

$ ./dasgoclient -query="bla=1"
ERRO[0000] DAS QL error                                  Query="bla = 1" idx=0 msg="Wrong DAS key: bla"
Unable to parse DAS query, no select keys are found <DASQuery="" inst= hash= time=0001-01-01 00:00:00>

# the exit code here is 13, which is DASQueryError
$ echo $?
13

$ ./dasgoclient -query="run=160915"
160915

# even though DAS returns a valid result here (from DBS), it fails to query the RunRegistry data-service (I didn't set up a proper ssh tunnel to its URL), therefore it returns error code 5 (RunRegistryError)
$ echo $?
5

Let me know your feedback.
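A calling script could act on these exit codes with a small wrapper. The following is a minimal sketch, assuming the code-to-label mapping matches the -errorCodes listing above and that dasgoclient is on PATH; the function names das_exit_label and run_das_query and the DASGOCLIENT variable are hypothetical and not part of dasgoclient.

```shell
# Hypothetical helpers, not part of dasgoclient.

# Translate an exit code into the label shown in the listing above.
das_exit_label() {
  case "$1" in
    0)  echo "success" ;;
    1)  echo "DAS error" ;;
    2)  echo "DBS upstream error" ;;
    3)  echo "PhEDEx upstream error" ;;
    4)  echo "ReqMgr upstream error" ;;
    5)  echo "RunRegistry upstream error" ;;
    6)  echo "McM upstream error" ;;
    7)  echo "Dashboard upstream error" ;;
    8)  echo "SiteDB upstream error" ;;
    9)  echo "CondDB upstream error" ;;
    10) echo "Combined error" ;;
    11) echo "MongoDB error" ;;
    12) echo "DAS proxy error" ;;
    13) echo "DAS query error" ;;
    *)  echo "unknown exit code $1" ;;
  esac
}

# Run a query and propagate the client's exit status.
# Results go to stdout unchanged; a diagnostic goes to stderr on failure.
# Assumption: DASGOCLIENT may override the binary name (default: dasgoclient).
run_das_query() {
  "${DASGOCLIENT:-dasgoclient}" -query="$1"
  das_rc=$?
  if [ "$das_rc" -ne 0 ]; then
    echo "dasgoclient failed: $(das_exit_label "$das_rc") (exit code $das_rc)" >&2
  fi
  return "$das_rc"
}
```

For example, `run_das_query "run=160915" || exit 1` would stop a job on any non-zero code. As the run=160915 example above shows, a non-zero code can accompany a valid result, so a caller may prefer to treat only selected codes (such as 2 for DBS) as fatal.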

@vkuznet
Collaborator

vkuznet commented May 29, 2018

I changed -errorCodes to the more appropriate -exitCodes.

@smuzaffar
Author

thanks
