Better handle for DBS3Upload errors due to "Error: concurrency error" #11167

Closed
amaltaro opened this issue May 27, 2022 · 1 comment · Fixed by #11176

Comments

@amaltaro
Contributor

Impact of the bug
WMAgent

Describe the bug
The real fix requires a (minor) redesign of how data gets injected into the DBS server, to be addressed in GH ticket #11106.

Meanwhile, we should reduce the noisy errors and tracebacks that the DBS3Upload component logs for this case [1].

How to reproduce it
Concurrently inject multiple blocks for the same dataset.

Expected behavior
My suggestion is to catch block injections that fail with "Error: concurrency error" and log a simple, friendly error message explaining that the same block will be retried in the next cycle.
In other words, we should stop dumping the traceback and the long, nested error messages in the logs for such well-known scenarios.
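
A minimal sketch of this suggestion, not the actual fix. The names `inject_block`, `is_concurrency_error` and `insert_func` are hypothetical, and reading the server message from a `body` attribute on the exception is an assumption about the HTTP client error class:

```python
import logging

CONCURRENCY_MSG = "concurrency error"


def is_concurrency_error(exc):
    """Return True if the exception mentions the known DBS concurrency error.

    Assumption: the server message is available either in str(exc) or in a
    `body` attribute of the exception object.
    """
    text = "%s %s" % (exc, getattr(exc, "body", ""))
    return CONCURRENCY_MSG in text.lower()


def inject_block(insert_func, block_name, logger=None):
    """Try to inject a block; return True on success, False if it must be retried.

    insert_func is a hypothetical zero-argument callable wrapping the real
    injection call, e.g. lambda: dbsApi.insertBulkBlock(blockDump=block).
    """
    logger = logger or logging.getLogger(__name__)
    try:
        insert_func()
    except Exception as exc:
        if is_concurrency_error(exc):
            # Well-known, transient case: one short line and no traceback.
            logger.warning("Block %s hit a DBS concurrency error; "
                           "it will be retried in the next cycle.", block_name)
        else:
            # Anything else keeps the full traceback for debugging.
            logger.exception("Error trying to process block %s through DBS.",
                             block_name)
        return False
    return True
```

Returning a boolean lets the caller leave the block in its pending state, so the next polling cycle picks it up again.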

Additional context and error message
[1]

2022-05-27 14:21:45,736:139988738467584:ERROR:DBSUploadPoller:Error trying to process block /BuToDStarX_ToD0Pi_inclusive_SoftQCD_TuneCP5_13TeV-pythia8-evtgen/RunIISummer20UL18NanoAODv9-Custom_RDStarPU_BParking_106X_upgrade2018_realistic_v16_L1v1-v1/NANOAODSIM#b55338fc-46d1-4e97-b65f-1d09a9769c1f through DBS. Error: [{"error":{"reason":"DBSError Code:110 Description:DBS DB insert record error Function:dbs.bulkblocks.insertFilesViaChunks Message: Error: concurrency error","message":"7cf3dee6307fdaa1f72d0dc03551fd0e6665f32114d752c37819c238cef7a231 unable to insert files, error DBSError Code:110 Description:DBS DB insert record error Function:dbs.bulkblocks.insertFilesViaChunks Message: Error: concurrency error","function":"dbs.bulkblocks.InsertBulkBlocksConcurrently","code":110},"http":{"method":"POST","code":400,"timestamp":"2022-05-27 19:21:45.67710595 +0000 UTC m=+191328.489143316","path":"/dbs/prod/global/DBSWriter/bulkblocks","user_agent":"DBSClient/Unknown/","x_forwarded_host":"dbs-prod.cern.ch","x_forwarded_for":"137.138.157.32","remote_addr":"137.138.63.204:39950"},"exception":400,"type":"HTTPError","message":"DBSError Code:110 Description:DBS DB insert record error Function:dbs.bulkblocks.InsertBulkBlocksConcurrently Message:7cf3dee6307fdaa1f72d0dc03551fd0e6665f32114d752c37819c238cef7a231 unable to insert files, error DBSError Code:110 Description:DBS DB insert record error Function:dbs.bulkblocks.insertFilesViaChunks Message: Error: concurrency error Error: nested DBSError Code:110 Description:DBS DB insert record error Function:dbs.bulkblocks.insertFilesViaChunks Message: Error: concurrency error"}]
Traceback (most recent call last):
  File "/data/srv/wmagent/v2.0.2.patch1/sw/slc7_amd64_gcc630/cms/wmagentpy3/2.0.2.patch1/lib/python3.8/site-packages/WMComponent/DBS3Buffer/DBSUploadPoller.py", line 92, in uploadWorker
    dbsApi.insertBulkBlock(blockDump=block)
  File "/data/srv/wmagent/v2.0.2.patch1/sw/slc7_amd64_gcc630/cms/py3-dbs3-client/4.0.7/lib/python3.8/site-packages/dbs/apis/dbsClient.py", line 606, in insertBulkBlock
    result =  self.__callServer("bulkblocks", data=blockDump, callmethod='POST' )
  File "/data/srv/wmagent/v2.0.2.patch1/sw/slc7_amd64_gcc630/cms/py3-dbs3-client/4.0.7/lib/python3.8/site-packages/dbs/apis/dbsClient.py", line 448, in __callServer
    self.__parseForException(http_error)
  File "/data/srv/wmagent/v2.0.2.patch1/sw/slc7_amd64_gcc630/cms/py3-dbs3-client/4.0.7/lib/python3.8/site-packages/dbs/apis/dbsClient.py", line 492, in __parseForException
    raise http_error
  File "/data/srv/wmagent/v2.0.2.patch1/sw/slc7_amd64_gcc630/cms/py3-dbs3-client/4.0.7/lib/python3.8/site-packages/dbs/apis/dbsClient.py", line 445, in __callServer
    self.http_response = method_func(self.url, method, params, data, request_headers)
  File "/data/srv/wmagent/v2.0.2.patch1/sw/slc7_amd64_gcc630/cms/py3-dbs3-pycurl/3.17.7-comp/lib/python3.8/site-packages/RestClient/RestApi.py", line 42, in post
    return http_request(self._curl)
  File "/data/srv/wmagent/v2.0.2.patch1/sw/slc7_amd64_gcc630/cms/py3-dbs3-pycurl/3.17.7-comp/lib/python3.8/site-packages/RestClient/RequestHandling/HTTPRequest.py", line 62, in __call__
    raise HTTPError(effective_url, http_code, http_response.msg, http_response.raw_header, http_response.body)
RestClient.ErrorHandling.RestClientExceptions.HTTPError: HTTP Error 400: Bad Request
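
The error payload in the log above is a JSON list with `error` and `http` records, so a short summary could be extracted from it instead of logging the whole nested message. A rough sketch, assuming the payload keeps the layout shown in this log; `summarize_dbs_error` is a hypothetical helper, not an existing WMCore or DBS client function:

```python
import json


def summarize_dbs_error(payload):
    """Return a one-line summary of a DBS error payload, or None if unparsable."""
    try:
        records = json.loads(payload)
    except (TypeError, ValueError):
        return None
    if isinstance(records, dict):
        records = [records]
    if not isinstance(records, list) or not records:
        return None
    first = records[0]
    if not isinstance(first, dict) or not isinstance(first.get("error"), dict):
        return None
    error = first["error"]
    http = first.get("http") if isinstance(first.get("http"), dict) else {}
    # Keep only the text after the last "Message:" marker in the reason field.
    reason = str(error.get("reason", ""))
    message = reason.rsplit("Message:", 1)[-1].strip()
    return "DBS code %s, HTTP %s, function %s: %s" % (
        error.get("code"), http.get("code"), error.get("function"), message)
```

If the payload cannot be parsed the helper returns None, and the caller can fall back to logging the raw error text.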

amaltaro commented Jun 9, 2022

Since I know you are working on it, I am moving this to the Project "In progress" column. Please do the same in the future when you pick an issue to work on @vkuznet

vkuznet added a commit to vkuznet/WMCore that referenced this issue Jun 16, 2022