Clean runTimeouts when deleting a run #409
Comments
Just in case anyone stumbles across this issue, the problem is that the director will constantly complain that it can't check runtimeouts. Here's a sample:
The workaround for now is to run this against the db:
@tico24 - Thanks for sharing the workaround! Not sure what's going on, but I'm still getting the errors after trying to run the workaround command. We deployed sorry-cypress with the Helm charts to spin up the services, and this morning, after deleting and recreating everything, this error started popping up. Currently we have 3 mongodb services:
Installed version: Tried to run MongoDB server version: 4.4.6
{ "acknowledged" : true, "deletedCount" : 0 }
Weird.. didn't delete anything, even though the director is screaming a bunch of errors:
[run-timeout] Error checking run timeout for runId: b26788a178d75552779886c0905a26dc, task id: 6109794f7fc1664b04cb0da0
AppError: AppError
    at allRunSpecsCompleted (/app/packages/director/dist/execution/mongo/runs/run.controller.js:181:11)
    at runMicrotasks (<anonymous>)
    at processTicksAndRejections (internal/process/task_queues.js:95:5)
    at async maybeSetRunCompleted (/app/packages/director/dist/execution/mongo/runCompletion/runCompletion.js:17:7)
    at async checkRunCompletionOnTimeout (/app/packages/director/dist/execution/mongo/runCompletion/runCompletion.js:31:7)
    at async /app/packages/director/dist/execution/mongo/runCompletion/runCompletion.js:56:7 {
  code: 'RUN_NOT_EXISTS'
}
MongoError: not master and slaveOk=false
    at MessageStream.messageHandler2 (/app/packages/mongo/dist/index.js:17055:24)
    at MessageStream.emit (events.js:376:20)
    at MessageStream.emit (domain.js:470:12)
    at processIncomingData (/app/packages/mongo/dist/index.js:16773:16)
    at MessageStream._write (/app/packages/mongo/dist/index.js:16704:9)
    at writeOrBuffer (internal/streams/writable.js:358:12)
    at MessageStream.Writable.write (internal/streams/writable.js:303:10)
    at Socket.ondata (internal/streams/readable.js:745:22)
    at Socket.emit (events.js:376:20)
    at Socket.emit (domain.js:470:12) {
  topologyVersion: { processId: 61096dfc8a9b8936e9aceb3d, counter: 3 },
  operationTime: Timestamp2 { _bsontype: 'Timestamp', low_: 1, high_: 1628020667 },
  ok: 0,
  code: 13435,
  codeName: 'NotPrimaryNoSecondaryOk',
  '$clusterTime': {
    clusterTime: Timestamp2 { _bsontype: 'Timestamp', low_: 1, high_: 1628020667 },
    signature: { hash: [Binary2], keyId: 0 }
  },
  [Symbol(errorLabels)]: Set(1) { 'RetryableWriteError' }
}
[run-timeout] Checking run timeouts.
Tried to run MongoDB server version: 4.4.6
uncaught exception: WriteCommandError({
"topologyVersion" : {
"processId" : ObjectId("61096dfc8a9b8936e9aceb3d"),
"counter" : NumberLong(3)
},
"operationTime" : Timestamp(1628020187, 1),
"ok" : 0,
"errmsg" : "not master",
"code" : 10107,
"codeName" : "NotWritablePrimary",
"$clusterTime" : {
"clusterTime" : Timestamp(1628020187, 1),
"signature" : {
"hash" : BinData(0,"AAAAAAAAAAAAAAAAAAAAAAAAAAA="),
"keyId" : NumberLong(0)
}
}
}) :
WriteCommandError({
"topologyVersion" : {
"processId" : ObjectId("61096dfc8a9b8936e9aceb3d"),
"counter" : NumberLong(3)
},
"operationTime" : Timestamp(1628020187, 1),
"ok" : 0,
"errmsg" : "not master",
"code" : 10107,
"codeName" : "NotWritablePrimary",
"$clusterTime" : {
"clusterTime" : Timestamp(1628020187, 1),
"signature" : {
"hash" : BinData(0,"AAAAAAAAAAAAAAAAAAAAAAAAAAA="),
"keyId" : NumberLong(0)
}
}
})
WriteCommandError@src/mongo/shell/bulk_api.js:417:48
executeBatch@src/mongo/shell/bulk_api.js:915:23
Bulk/this.execute@src/mongo/shell/bulk_api.js:1163:21
DBCollection.prototype.deleteMany@src/mongo/shell/crud_api.js:432:17
Do you have any other thoughts or suggestions? Thanks!
Your URI is most likely wrong. Make sure you're using the correct database.
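For anyone checking their own URI: the snippet below is a minimal sketch, not part of sorry-cypress, that inspects a connection string without connecting to anything. `inspectMongoUri` is a hypothetical helper name. It flags two things that seem relevant to this thread: a URI with no database name (the legacy `mongo` shell then falls back to the `test` database, which would explain an empty `db.getCollectionNames()`), and a single-host URI with no `replicaSet` option (depending on the driver version, the client may then talk only to whichever member it reached, which could be a secondary).

```javascript
// Hypothetical helper, not part of sorry-cypress or the MongoDB driver.
// It only parses the string; it never opens a connection.
function inspectMongoUri(uri) {
  // scheme://[credentials@]host[,host...][/database][?options]
  const match = /^mongodb(\+srv)?:\/\/([^/?]+)(?:\/([^?]*))?(?:\?(.*))?$/.exec(uri);
  if (!match) {
    return { valid: false, warnings: ["not a mongodb:// or mongodb+srv:// URI"] };
  }

  const [, , authority, database = "", query = ""] = match;
  // Drop optional "user:password@" and split the comma-separated host list.
  const hosts = authority.split("@").pop().split(",");
  const options = Object.fromEntries(new URLSearchParams(query));

  const warnings = [];
  if (database === "") {
    warnings.push("no database name in the URI path");
  }
  if (!options.replicaSet && hosts.length === 1) {
    warnings.push("single host and no replicaSet option");
  }
  return { valid: true, hosts, database, options, warnings };
}

// Usage: the URI from this thread versus one that names a database and the
// replica set. The host names and "mydb" in the second URI are placeholders.
console.log(inspectMongoUri("mongodb://mongodb-headless.cypress.svc.cluster.local:27017"));
console.log(inspectMongoUri("mongodb://mongo-0.example:27017,mongo-1.example:27017/mydb?replicaSet=rs0"));
```

A warning here is only a hint to look closer; a URI without these parts can still be valid for some setups.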
@tim-sendible
@mongodb-0:/$ mongo --host mongodb://mongodb-headless.cypress.svc.cluster.local:27017
MongoDB shell version v4.4.6
connecting to: mongodb://mongodb-headless.cypress.svc.cluster.local:27017/?compressors=disabled&gssapiServiceName=mongodb
Implicit session: session { "id" : UUID("9b4b2843-ce71-4189-9739-ad86408ed6b1") }
MongoDB server version: 4.4.6
---
The server generated these startup warnings when booting:
2021-08-03T16:24:56.821+00:00: Using the XFS filesystem is strongly recommended with the WiredTiger storage engine. See http://dochub.mongodb.org/core/prodnotes-filesystem
---
---
Enable MongoDB's free cloud-based monitoring service, which will then receive and display
metrics about your deployment (disk utilization, CPU, operation statistics, etc).
The monitoring data will be available on a MongoDB website with a unique URL accessible to you
and anyone you share the URL with. MongoDB may use this information to make product
improvements and to suggest MongoDB products and deployment options to you.
To enable free monitoring, run the following command: db.enableFreeMonitoring()
To permanently disable this reminder, run the following command: db.disableFreeMonitoring()
---
rs0:PRIMARY> exit
bye
Another quick test:
I have no name!@mongodb-0:/$ mongo mongodb://mongodb-headless.cypress.svc.cluster.local:27017 --eval 'db.getCollectionNames()'
MongoDB shell version v4.4.6
connecting to: mongodb://mongodb-headless.cypress.svc.cluster.local:27017/?compressors=disabled&gssapiServiceName=mongodb
Implicit session: session { "id" : UUID("0f5709bf-ce2c-4d22-be0e-41b17076e8da") }
MongoDB server version: 4.4.6
[ ]
I have no name!@mongodb-0:/$
I tried deleting all the mongo services and recreating them, but I'm getting a bunch of MongoDB errors on the director side:
[run-timeout] Checking run timeouts...
(node:1) UnhandledPromiseRejectionWarning: MongoError: not master and slaveOk=false
    at MessageStream.messageHandler2 (/app/packages/mongo/dist/index.js:17055:24)
    at MessageStream.emit (events.js:376:20)
    at MessageStream.emit (domain.js:470:12)
    at processIncomingData (/app/packages/mongo/dist/index.js:16773:16)
    at MessageStream._write (/app/packages/mongo/dist/index.js:16704:9)
    at writeOrBuffer (internal/streams/writable.js:358:12)
    at MessageStream.Writable.write (internal/streams/writable.js:303:10)
    at Socket.ondata (internal/streams/readable.js:745:22)
    at Socket.emit (events.js:376:20)
    at Socket.emit (domain.js:470:12)
(node:1) Unhandled promise rejection. This error originated either by throwing inside of an async function without a catch block, or by rejecting a promise which was not handled with .catch().
This morning we got a weird error on the mongo replica set that was blocking MongoDB from spinning up. After deleting the replica set and recreating it, MongoDB at least starts to "work", but not 100%. Running out of options 🤣
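Note that the logs in this thread mix two different failures: `RUN_NOT_EXISTS` (the leftover timeout this issue is about) and the "not master" errors with codes 13435 and 10107, which point at the connection reaching a non-primary replica set member. The following is a rough sketch for telling them apart when reading director logs. The function names are made up for illustration; only the error codes and code names come from the output quoted above.

```javascript
// Error codes copied from the logs quoted in this thread.
const NOT_PRIMARY_CODES = new Set([
  10107, // NotWritablePrimary ("not master")
  13435, // NotPrimaryNoSecondaryOk ("not master and slaveOk=false")
]);

// Hypothetical helper: true when the error says we reached a non-primary member.
function isNotPrimaryError(err) {
  return Boolean(err) && NOT_PRIMARY_CODES.has(err.code);
}

// Hypothetical helper: a one-line hint for a logged error object.
function describeError(err) {
  if (isNotPrimaryError(err)) {
    return `non-primary member reached (${err.codeName || err.code}): check the URI and replica set state`;
  }
  if (err && err.code === "RUN_NOT_EXISTS") {
    return "timeout task refers to a run that no longer exists";
  }
  return "unclassified error";
}

console.log(describeError({ code: 13435, codeName: "NotPrimaryNoSecondaryOk" }));
console.log(describeError({ code: "RUN_NOT_EXISTS" }));
```

Only the second kind is fixed by cleaning up `runTimeouts`; the first needs the MongoDB setup or the connection string sorted out.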
@everton-nasc I don't really know how to help here. The director is just trying to connect using the MongoDB credentials you've provided, and judging by the error it looks like a MongoDB configuration problem. I would try the following steps:
@agoldis I got it fixed by deleting all the mongo StatefulSets along with their volumes and re-applying the templates. At least the master and slave have started. There's another error showing up, where the groups parameter is not working properly: it is splitting up the execution of the same test spec across the build IDs. I'll send a message on Slack about it. Thanks!
Summary
runTimeouts is not getting cleaned up when deleting a run, causing error messages in the logs.
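To make the problem concrete, here is a small in-memory sketch of what "not cleaned up" means: timeout entries whose run has already been deleted. It assumes each timeout document and each run document carries a `runId` field (the director log prints a `runId` for each timeout task, but the real schema may differ), and `findOrphanedTimeouts` is a hypothetical name, not a function from the codebase.

```javascript
// Hypothetical sketch: the field name `runId` on both document types is an
// assumption based on the director log line, not a confirmed schema.
function findOrphanedTimeouts(runs, runTimeouts) {
  const knownRunIds = new Set(runs.map((run) => run.runId));
  // A timeout is orphaned when no remaining run has its runId.
  return runTimeouts.filter((timeout) => !knownRunIds.has(timeout.runId));
}

// Sample data; the second runId is the one from the log in this thread.
const runs = [{ runId: "still-here" }];
const runTimeouts = [
  { runId: "still-here" },
  { runId: "b26788a178d75552779886c0905a26dc" },
];

console.log(findOrphanedTimeouts(runs, runTimeouts));
```

Deleting a run would ideally remove its matching timeout entries in the same step, so that a list like this stays empty.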