Switch to use DBSError reason/srvCode instead of if/else exception block #11375
@vkuznet Valentin, I have a general comment on this implementation.
We should be replacing all those checks of string errors by the actual server error code.
In other words, we will still have the if/elif/else logic, but it will be replaced by server code errors. Example:
    srvCode = dbsError.getServerCode()
    if srvCode == xxx:  # block already exists
        mark call as success
    elif srvCode == yyy:  # missing parent
        mark call as failure
    else:
        mark call as failure, unexpected error
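The pseudocode above can be turned into a small runnable sketch. Everything here is an assumption for illustration: `FakeDBSError` is a hypothetical stand-in that only mimics the `getServerCode()` method named in the comment, and the two code constants are placeholders, not the real DBS server codes.

```python
# Placeholder values for illustration only; NOT the real DBS server codes.
BLOCK_EXISTS = 1001
MISSING_PARENT = 1002


class FakeDBSError:
    """Hypothetical stand-in exposing only getServerCode()."""

    def __init__(self, srvCode):
        self._srvCode = srvCode

    def getServerCode(self):
        return self._srvCode


def classifyUpload(dbsError):
    """Map a server code to an outcome, mirroring the if/elif/else above."""
    srvCode = dbsError.getServerCode()
    if srvCode == BLOCK_EXISTS:      # block already exists
        return "success"
    elif srvCode == MISSING_PARENT:  # missing parent
        return "failure"
    else:                            # anything we did not anticipate
        return "failure, unexpected error"
```

Keeping the expected codes in named constants makes it easy to start from a small set and extend it later.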
90ae94b to e4b328d
@amaltaro , ok, how about now? I added all existing server codes to the code and wrote an appropriate message for each. Please review.
I think we should check only for the expected error codes, which AFAICT are:
We should ditch all the others. From what I understand, the transaction on the server side will never yield many of the errors that you implemented. If we see that we are getting hit by another one of those, then we keep adding them. But I would start with a very small set and stick to that if possible.
See comment above.
@amaltaro , I'm sorry but I disagree with your assessment. The codes I added can be hit in DBSUploadPoller; e.g. in order to insert a block/dataset, an acquisition or processing era must already be registered, etc. In other words, because of the existing HTTP race conditions (#11106), two HTTP requests can overlap and one of the errors I mentioned and implemented may occur in a request. In my view it is better to have full coverage now rather than add yet another code later to cover a rare case. Please reconsider.
Valentin, I see two better alternatives then, such that we can remove all this DBS error handling here and rely more on the DBSError class that you recently provided. They are:
a) either rely on the DBSError.getMessage() to get a very short and specific error message (hopefully it would be similar to what you defined here). Of course, we would have to enrich the error message with something like "Temporary failure for block {block}. Error: {dbsError.getMessage()}"...
b) we define a new method in DBSError that would map the "getServerCode" output to such very short error messages.
Of course, we still need to check if the error is meant to be considered a success (block already exists) or error (other failures). What do you think?
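A minimal sketch of how the two alternatives could fit together, under stated assumptions: the codes and texts in `SHORT_MESSAGES` are placeholders rather than real DBS values, and `shortMessage` and `enrich` are hypothetical helper names, not methods of the actual DBSError class.

```python
# Placeholder codes and texts, purely illustrative.
SHORT_MESSAGES = {
    2001: "block already exists",
    2002: "missing parent",
}


def shortMessage(srvCode):
    """Option (b): map a server code to a short message, with a fallback."""
    return SHORT_MESSAGES.get(srvCode, "unexpected DBS error, server code %s" % srvCode)


def enrich(block, detail):
    """Option (a): wrap a short error detail with the block name."""
    return "Temporary failure for block {}. Error: {}".format(block, detail)
```

For example, `enrich(blockName, shortMessage(srvCode))` yields a single log line; the decision whether a given code counts as success would still be made separately.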
@amaltaro , as a matter of good design, the DBSError class only parses errors and provides details. It has no logic about DMWM workflows or data injection, therefore we should not enrich it with such logic. If a block exists in DBS and someone tries to insert it again, it is an error from the DBS side, period. But from the DMWM point of view it is not, since the block exists and we can proceed safely. Therefore, neither of your proposals appeals to me, and I still think the proper logic is correctly defined in DBSUploadPoller, which knows the details of the data injection logic.
I have to argue that the long if/elif statement is doing exactly what you mentioned: parsing the error details provided by the DBS server. Hence a good design practice would be to encapsulate that same logic in its own class (DBSError). If you don't want to rewrite the error message, as currently provided in this PR, then I think we can use the output of
Alan, you should not be upset with the if/else, because it walks through all possible codes. If you want, we can put it into a separate function for code clarity. What I'm saying is that this function or if/else block does not belong in the generic DBSError class, because it represents data-processing logic and not the codes themselves. It is WMCore code which needs this logic, not the other way around. The purpose of DBSError is to extract error codes and get the message; it is not about placing calls to DBS. If we put this if/else block into DBSError then we'll need to parse the URLs of the calls, while here we know the
@amaltaro , I do not want to put this PR aside; please re-read my reply and let me know how you would like to proceed. If you still think that the if/else code should be put in the DBSError class I'll move it there, but it will not provide details of the input API parameters (as I explained in my reply, this would require parsing URLs rather than taking input parameters from the code as it does right now).
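The separation of concerns argued for here can be sketched as follows. This is only an illustration under assumptions, not the code in this PR: `ParsedError` is a hypothetical stand-in for DBSError, `BLOCK_EXISTS` is a placeholder value, and `handleInsertError` is a hypothetical caller-side helper of the kind DBSUploadPoller might contain.

```python
BLOCK_EXISTS = 3001  # placeholder, not a real DBS server code


class ParsedError:
    """Hypothetical stand-in for DBSError: exposes details, holds no workflow logic."""

    def __init__(self, srvCode, message):
        self._srvCode = srvCode
        self._message = message

    def getServerCode(self):
        return self._srvCode

    def getMessage(self):
        return self._message


def handleInsertError(blockName, err):
    """Caller-side decision; returns (succeeded, logMessage).

    The caller already knows the input parameter (the block name),
    so nothing has to be recovered by parsing request URLs.
    """
    if err.getServerCode() == BLOCK_EXISTS:
        return True, "Block %s already exists in DBS, proceeding" % blockName
    return False, "Failed to insert block %s: %s" % (blockName, err.getMessage())
```

The error class stays generic, while the interpretation of "block already exists" as a success lives with the code that performed the insert.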
Valentin, it looks like you misunderstood my previous message. I am not saying that the DBSError class should be in charge of making calls to DBS and/or parsing any URLs. Can you please remind me if it's the
In my opinion, the current proposal should be refactored to something like (very short version...):
In other words, if the DBS server is already providing a clear and meaningful error message - as those that you extensively defined in the if/elif/else statement - then we should simply use that instead of redefining them in this component. |
ok, here is what DBS server returns, I only show you relevant parts:
Therefore
In case of failed block the |
Thanks for providing this extra information. Yes, then I agree with your statement above. |
Yes, it's much better now, thanks!
I took the opportunity to make a few extra suggestions along the code. Please have a look.
bbdb136 to 2f6de0e
@amaltaro I made the necessary changes, please review. I also saw one unit test failure, but I doubt it is related to these changes; I think it is rather an unstable test. Please check and advise further.
2f6de0e to 3e412bb
Thanks, it looks good to me.
test this please |
Fixes #11365
Status
not-tested
Description
Replace if/else exception block with DBSError reason and server code
Is it backward compatible (if not, which system it affects?)
MAYBE
Related PRs
It is a continuation of the effort started in #11173 and #10962
External dependencies / deployment changes
yes, it requires new dbs3-client, version 4.0.12