Unable to connect with database: JC_BAD_CONNECTION #235
Thank you for using GridDB and for your report.
I suspect an error has occurred on the server side, e.g. IO_ERROR. Could you check the event log files under $GS_LOG?
Sure, I will, thanks. Can you give more specific examples, please? What kind of IO errors could they be? A read/write permission error or something like that?
The log files in $GS_LOG say:

2020-01-26T15:53:07.742Z xxx 11745 ERROR EVENT_ENGINE [1042:CM_PLATFORM_ERROR] Listener error (engine=TRANSACTION_SERVICE, reason=Platform error (errno=24, description=Too many open files))

and also:

2020-01-26T15:53:07.744Z xxx 11751 ERROR CLUSTER_OPERATION [1042:CM_PLATFORM_ERROR] [ERROR] SetError requested (reason=Prepare checkpoint failed. reason=(Failed to open (path=data/gs_log_1_1307.log, reason=Platform error (errno=24, description=Too many open files)))

What does "Too many open files" mean? The above errors repeat for a while, and then the following appears:

2020-01-26T15:53:07.746Z xxx 11751 WARNING CLUSTER_OPERATION [180060:CLM_STATUS_TO_SUBMASTER] Cluster status change to SUB_MASTER
2020-01-26T15:57:08.109Z xxx 11734 INFO SYSTEM_SERVICE [50912:SC_TRACE_STATS] clusterStatus=SUB_CLUSTER, nodeStatus=ABNORMAL

The cluster status changes to SUB_MASTER and then nodeStatus changes to ABNORMAL. What is the root cause of this issue? When it says "too many open files", which files does it mean?
Thank you for the error information.
Let me confirm some points. What is the output of the following command on the server?
ulimit -n
My application connects to the database to run the query and then, once the query is done, leaves the connection open. (Is there any recommended approach to handling DB connections, or any pooling solution available for GridDB, like HikariCP?)
If you call GridStoreFactory#getGridStore() but don't call GridStore#close(), file descriptors are not released. If so, please use the following method (A) or (B):

(A) Close the GridStore object after each use

GridStore store = GridStoreFactory.getInstance().getGridStore(props);
...
store.close();

(B) Reuse the GridStore object

this.store = GridStoreFactory.getInstance().getGridStore(props);
ts = this.store.putTimeSeries(name, XXX.class); // or store.getTimeSeries(name, XXX.class)
...
ts.close();
ts = this.store.putTimeSeries(name, XXX.class); // or store.getTimeSeries(name, XXX.class)
...
ts.close();
...

The Java client library has a connection pool. It might also be better to increase the file descriptor limit.
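As a concrete illustration of pattern (B), here is a minimal sketch of a class that keeps one GridStore open for the lifetime of the application and only opens and closes the per-query objects. The Point row class, container name, and TQL condition are assumptions made up for this example, not taken from the reporter's application:

```java
import java.util.Date;
import java.util.Properties;

import com.toshiba.mwcloud.gs.GSException;
import com.toshiba.mwcloud.gs.GridStore;
import com.toshiba.mwcloud.gs.GridStoreFactory;
import com.toshiba.mwcloud.gs.Query;
import com.toshiba.mwcloud.gs.RowKey;
import com.toshiba.mwcloud.gs.RowSet;
import com.toshiba.mwcloud.gs.TimeSeries;

public class SamplingDao {

    // Hypothetical time-series row type used only for illustration.
    public static class Point {
        @RowKey public Date timestamp;
        public double value;
    }

    // Created once and reused for every query (pattern B).
    private final GridStore store;

    public SamplingDao(Properties props) throws GSException {
        this.store = GridStoreFactory.getInstance().getGridStore(props);
    }

    // Runs one sampling query. The TimeSeries/Query/RowSet are closed per call,
    // while the GridStore (and its pooled connections) stays open.
    public void sampleOnce(String containerName) throws GSException {
        TimeSeries<Point> ts = store.putTimeSeries(containerName, Point.class);
        try {
            // Example TQL: rows from the last five seconds.
            Query<Point> query = ts.query(
                "select * where timestamp > TIMESTAMPADD(SECOND, NOW(), -5)");
            RowSet<Point> rows = query.fetch();
            while (rows.hasNext()) {
                Point p = rows.next();
                System.out.println(p.timestamp + " " + p.value);
            }
            rows.close();
            query.close();
        } finally {
            ts.close();
        }
    }

    // Call once on application shutdown.
    public void shutdown() throws GSException {
        store.close();
    }
}
```

With this structure, a query every 5 seconds does not create a new connection per request, so the descriptor count on the node should stay flat.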
Thank you very much for the directions.
Also, of the methods you suggested above, which one is recommended if I make calls to the DB frequently? (Usually the front end makes a data-sampling query roughly every 5 seconds.)
This is the message for errno=24 in the Linux OS errors:
#define EMFILE 24 /* Too many open files */
(see http://man7.org/linux/man-pages/man3/errno.3.html)
Here "files" means file descriptors, which include things like sockets.
I often use case (B) when queries are repeated in a thread or a function. Note: the GridStore object isn't thread-safe, so if you use multiple threads, please assign a separate GridStore object to each thread.
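For multi-threaded callers, one possible way (a sketch, not the library's official pattern) to give each worker thread its own GridStore is a ThreadLocal holder; it assumes all threads share the same connection Properties:

```java
import java.util.Properties;

import com.toshiba.mwcloud.gs.GSException;
import com.toshiba.mwcloud.gs.GridStore;
import com.toshiba.mwcloud.gs.GridStoreFactory;

public final class PerThreadStore {

    // Connection properties shared by all threads; set once at startup.
    private static volatile Properties props;

    // Lazily creates one GridStore per thread on first access.
    private static final ThreadLocal<GridStore> STORE =
        ThreadLocal.withInitial(() -> {
            try {
                return GridStoreFactory.getInstance().getGridStore(props);
            } catch (GSException e) {
                throw new IllegalStateException("failed to open GridStore", e);
            }
        });

    public static void init(Properties connectionProps) {
        props = connectionProps;
    }

    // Each thread sees only its own GridStore instance.
    public static GridStore get() {
        return STORE.get();
    }

    // Close and discard the current thread's GridStore when its worker stops,
    // so the descriptors behind it are released.
    public static void closeForCurrentThread() throws GSException {
        GridStore store = STORE.get();
        STORE.remove();
        store.close();
    }

    private PerThreadStore() {}
}
```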
But my server doesn't have any open files. Is this referring to database connections in this case?
My application runs a sampling query periodically and it was working fine for around 30 hours. Then it suddenly gave the following error:
Caused by: com.toshiba.mwcloud.gs.common.GSConnectionException: [145028:JC_BAD_CONNECTION] Failed to update by notification (address=/239.0.0.1:31999, reason=Receive timed out)
at com.toshiba.mwcloud.gs.subnet.NodeResolver.updateMasterInfo(NodeResolver.java:815)
at com.toshiba.mwcloud.gs.subnet.NodeResolver.prepareConnectionAndClusterInfo(NodeResolver.java:522)
at com.toshiba.mwcloud.gs.subnet.NodeResolver.getPartitionCount(NodeResolver.java:205)
at com.toshiba.mwcloud.gs.subnet.GridStoreChannel$5.execute(GridStoreChannel.java:2106)
at com.toshiba.mwcloud.gs.subnet.GridStoreChannel.executeStatement(GridStoreChannel.java:1675)
... 38 more
Caused by: java.net.SocketTimeoutException: Receive timed out
Why is this happening? What is the cause?
This is the output of gs_stat -u admin/admin:
{
"checkpoint": {
"archiveLog": 0,
"backupOperation": 0,
"duplicateLog": 0,
"endTime": 1580053987745,
"mode": "NORMAL_CHECKPOINT",
"normalCheckpointOperation": 1470,
"pendingPartition": 0,
"periodicCheckpoint": "ACTIVE",
"requestedCheckpointOperation": 0,
"startTime": 1580053987741
},
"cluster": {
"activeCount": 0,
"autoGoal": "ACTIVE",
"clusterName": "defaultCluster",
"clusterRevisionId": "4e9be62e-7911-48a4-8d93-19af09be7a15",
"clusterRevisionNo": 17651,
"clusterStatus": "SUB_CLUSTER",
"designatedCount": 1,
"loadBalancer": "ACTIVE",
"nodeList": [
{
"address": "10.128.0.2",
"port": 10040
}
],
"nodeStatus": "ABNORMAL",
"notificationMode": "MULTICAST",
"partitionStatus": "INITIAL",
"startupTime": "2020-01-25T15:20:31.377Z",
"syncCount": 0
},
"currentTime": "2020-01-26T17:20:39Z",
"performance": {
"backupCount": 0,
"batchFree": 0,
"bufferHashCollisionCount": 0,
"checkpointFileAllocateSize": 5443584,
"checkpointFileFlushCount": 0,
"checkpointFileFlushTime": 0,
"checkpointFileSize": 5439488,
"checkpointFileUsageRate": 0.927710843373494,
"checkpointMemory": 196608,
"checkpointMemoryLimit": 1073741824,
"checkpointWriteSize": 270139392,
"checkpointWriteTime": 214,
"currentCheckpointWriteBufferSize": 0,
"currentTime": 1580059239771,
"expirationDetail": {
"autoExpire": false,
"erasableExpiredTime": "1970-01-01T00:00:00.000Z",
"latestExpirationCheckTime": "1970-01-01T00:00:00.000Z"
},
"logFileFlushCount": 8832,
"logFileFlushTime": 38224,
"numBackground": 0,
"numConnection": 2,
"numNoExpireTxn": 0,
"numSession": 0,
"numTxn": 0,
"ownerCount": 128,
"peakProcessMemory": 86626304,
"processMemory": 86626304,
"recoveryReadSize": 262144,
"recoveryReadTime": 0,
"recoveryReadUncompressTime": 0,
"storeCompressionMode": "NO_BLOCK_COMPRESSION",
"storeDetail": {
"batchFreeMapData": {
"storeMemory": 0,
"storeUse": 0,
"swapRead": 0,
"swapWrite": 0
},
"batchFreeRowData": {
"storeMemory": 0,
"storeUse": 0,
"swapRead": 0,
"swapWrite": 0
},
"mapData": {
"storeMemory": 131072,
"storeUse": 131072,
"swapRead": 0,
"swapWrite": 0
},
"metaData": {
"storeMemory": 131072,
"storeUse": 131072,
"swapRead": 0,
"swapWrite": 0
},
"rowData": {
"storeMemory": 4784128,
"storeUse": 4784128,
"swapRead": 0,
"swapWrite": 0
}
},
"storeMemory": 5046272,
"storeMemoryLimit": 1073741824,
"storeTotalUse": 5046272,
"swapRead": 0,
"swapReadSize": 0,
"swapReadTime": 0,
"swapReadUncompressTime": 0,
"swapWrite": 0,
"swapWriteCompressTime": 0,
"swapWriteSize": 0,
"swapWriteTime": 0,
"syncReadSize": 0,
"syncReadTime": 0,
"syncReadUncompressTime": 0,
"totalBackupLsn": 0,
"totalLockConflictCount": 0,
"totalOtherLsn": 0,
"totalOwnerLsn": 110220,
"totalReadOperation": 4733,
"totalRowRead": 2325894,
"totalRowWrite": 55108,
"totalWriteOperation": 55108,
"txnDetail": {
"totalBackgroundOperation": 0
}
},
"recovery": {
"progressRate": 1
},
"version": "4.3.0-36424 CE"
}