Wrong HttpCode of HealthResource #27

Closed
atellwu opened this issue May 29, 2019 · 0 comments

atellwu commented May 29, 2019

Describe the bug

HealthResource must return HTTP 200 when the service is healthy; otherwise it should return HTTP 500.
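
As a minimal sketch of the rule described above (not the actual SOFARegistry code), the mapping from health state to status code could look like the following. The names `HealthStatusSketch`, `HealthResponse`, and `check` are hypothetical and exist only for illustration; the sketch also assumes that a probe which throws should count as unhealthy.

```java
import java.util.function.BooleanSupplier;

/** Hypothetical sketch: maps a health probe result to an HTTP status code. */
public class HealthStatusSketch {

    static final int HTTP_OK = 200;
    static final int HTTP_INTERNAL_ERROR = 500;

    /** Simplified stand-in for a REST response: a status code plus a message. */
    static final class HealthResponse {
        final int status;
        final String message;

        HealthResponse(int status, String message) {
            this.status = status;
            this.message = message;
        }
    }

    /**
     * Returns 200 when the probe reports healthy and 500 otherwise.
     * A probe that throws is treated as unhealthy (an assumption of this sketch).
     */
    static HealthResponse check(BooleanSupplier probe) {
        boolean healthy;
        try {
            healthy = probe.getAsBoolean();
        } catch (RuntimeException e) {
            healthy = false;
        }
        return healthy
            ? new HealthResponse(HTTP_OK, "healthy")
            : new HealthResponse(HTTP_INTERNAL_ERROR, "unhealthy");
    }

    public static void main(String[] args) {
        System.out.println(check(() -> true).status);
        System.out.println(check(() -> false).status);
    }
}
```

The point of returning a non-2xx code is that load balancers and monitoring probes usually look only at the status line, so an unhealthy node that still answers 200 would go unnoticed.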

@atellwu atellwu added the bug Something isn't working label May 29, 2019
@atellwu atellwu self-assigned this May 29, 2019
atellwu pushed a commit to atellwu/sofa-registry that referenced this issue May 29, 2019
atellwu pushed a commit to atellwu/sofa-registry that referenced this issue May 29, 2019
Synex-wh added a commit that referenced this issue Sep 26, 2019
* fix temp push

* update version 5.2.1-SNAPSHOT

* fix test case

* fix jetty version,and fix rest api for dataInfoIds

* fix hashcode test

* fix working to init bug

* fix start task log

* fix Watcher can't get provide data, retry and finally return new

* add data server list api

* add server list api

* remove log

* fix issue 21

* add query by id function

* fix issue 22

* delay client off process and sync data process to working status

* fix data connect meta error

* fix inject NotifyDataSyncHandler

* fix start log

* add send sub log

* fix subscriber to send log

* bugfix: #27

* bugfix: #27

* feature: Add monitoring logs #29

* feature: Add monitoring logs #29
(1) bugfix CommonResponse
(2) format

* bugfix: During meta startup, leader may not register itself #30

* bugfix: Sometimes receive "Not leader" response from leader in OnStartingFollowing() #31

* temp add

* add renew request

* data snapshot module

* add calculate digest service

* fix word cache clientid

* data renew module

* data renew/expired module

* add renew datum request

* add WriteDataAcceptor

* session renew/expired module

* 1. bugfix ReNewDatumHandler: getByConnectId -> getOwnByConnectId
2. refactor DatumCache from static to instance

* add blacklist wrapper and filter

* upgrade jraft version to 1.2.5

* blacklist ut

* add clientoff delay time

* bugfix: The timing of snapshot construction is not right

* rename: ReNew -> Renew

* fix blacklist test case

* rename: unpub -> unPub

* add threadSize and queueSize limit

* bugfix: revert SessionRegistry

* fix sub fetch retry all error,and reset datainfoid version

* fix client fast chain breakage data can not be cleaned up

* (1) remove logback.xml DEBUG level;
(2) dataServerBootstrapConfig rename;
(3) print conf when startup

* update log

* fix update zero version,and fix log

* add clientOffDelayMs default value

* fix clientOffDelayMs

* Task(DatumSnapshot/Pub/UnPub) add retry strategy

* bugfix DataNodeServiceImpl: retryTimes

* (1)cancelDataTaskListener duplicate
(2)bugfix DataNodeServiceImpl and SessionRegistry

* refactor datum version

* add hessian black list

* bugfix: log "retryTimes"

* bugfix DatumLeaseManager:  Consider the situation of connectId lose after data restart; ownConnectId should calculate dynamically

* add jvm blacklist api

* fix file name

* some code optimization

* data:refactor snapshot

* fix jetty version

* bugfix DatumLeaseManager: If in a non-working state, cannot clean up because the renew request cannot be received at this time.

* remove SessionSerialFilterResource

* WriteDataProcessor add TaskEvent log; Cache print task update

* data bugfix: snapshot must notify session

* fix SubscriberPushEmptyTask default implement

* merge new

* fix protect

* 1. When the pub of connectId is 0, no clearance action is triggered.
2. Print map.size regularly
3. Delete the log: "ConnectId (%s) expired, lastRenewTime is %s, pub.size is 0"

* DataNodeExchanger: print but ignore if from renew module, cause renew request is too much

* reduce log of renew

* data bugfix: Data coverage is also allowed when versions are equal. Consistent with session design.

* DatumCache bugfix: Index coverage should be updated after pubMap update

* DatumSnapshotHandler: limit print; do not call dataChangeEventCenter.onChange if no diff

* bugfix unpub npe (pub maybe already clean by DatumLeaseManager);LIMITED_LIST_SIZE_FOR_PRINT change to 30

* some code refactor

* add code comment

* fix data working to init,and fix empty push version

* consider unpub is isWriteRequest, Reduce Snapshot frequency

* RefreshUpdateTime is at the top, otherwise multiple snapshot can be issued concurrently

* update config: reduce retryTimes, increase delayTime, the purpose is to reduce performance consumption

* put resume() in finally code block, avoid lock leak

* modify renewDatumWheelTaskDelay and datumTimeToLiveSec

* When session receives a connection and generates renew tasks, it randomly delays different times to avoid everyone launching renew at the same time.

* data: add executor for handler
session: bugfix snapshot
session: refactor wheelTimer of renew to add executor

* add get data log

* snapshot and lastUpdateTimestamp: Specific to dataServerIP

* 1. DataServer: RenewDatumHandler must return GenericResponse but not CommonResponse, or else session will class cast exception
2. No need to update timestamp after renew
3. snapshot: Need to specify DataServerIP

* add logs

* 1. dataServer: reduce log of snapshotHandler
2. update logs

* dataServer: renew logic should delay for some time after status is WORKING, cause Data is processed asynchronously after synchronization from other DataServer

* bugfix bean; update log

* ignore renew request log

* fix UT

* fix .travis.yml

* fix version 5.3.0-SNAPSHOT

* fix online notify connect error

* fix push confirm error,and fix datum update version,pub threadpool config,add accesslimit service

* add switch renew and expire

* implement renew enable/disable switch

* fix data client exchange log

* fix datum fetch connect error

* bugfix CacheService: set version zero when first sub and get datum error

* fix clean task for fetch

* bugfix DatumCache: Forget to clean up the index in datumCache.putSnapshot

* fix fetch datum word cache

* fix test case time

* fix test case

* fix test case

* fix test case

* fix ut case: StopPushDataSwitchTest

* ut case:renew module

* fix ut case:TempPublisherTest

* bugfix ut case: increase sleep time

* fix ut case:RenewTest

* fix ut case:RenewTest format

* fix pom version

* fix ut case:do not run parallelly
atellwu added a commit that referenced this issue Feb 3, 2020
* fix temp push

* update version 5.2.1-SNAPSHOT

* fix test case

* fix jetty version,and fix rest api for dataInfoIds

* fix hashcode test

* fix working to init bug

* fix start task log

* fix Watcher can't get provide data, retry and finally return new

* add data server list api

* add server list api

* remove log

* fix issue 21

* add query by id function

* fix issue 22

* delay client off process and sync data process to working status

* fix data connect meta error

* fix inject NotifyDataSyncHandler

* fix start log

* add send sub log

* fix subscriber to send log

* bugfix: #27

* bugfix: #27

* feature: Add monitoring logs #29

* feature: Add monitoring logs #29
(1) bugfix CommonResponse
(2) format

* bugfix: During meta startup, leader may not register itself #30

* bugfix: Sometimes receive "Not leader" response from leader in OnStartingFollowing() #31

* temp add

* add renew request

* data snapshot module

* add calculate digest service

* fix word cache clientid

* data renew module

* data renew/expired module

* add renew datum request

* add WriteDataAcceptor

* session renew/expired module

* 1. bugfix ReNewDatumHandler: getByConnectId -> getOwnByConnectId
2. refactor DatumCache from static to instance

* add blacklist wrapper and filter

* upgrade jraft version to 1.2.5

* blacklist ut

* add clientoff delay time

* bugfix: The timing of snapshot construction is not right

* rename: ReNew -> Renew

* fix blacklist test case

* rename: unpub -> unPub

* add threadSize and queueSize limit

* bugfix: revert SessionRegistry

* fix sub fetch retry all error,and reset datainfoid version

* fix client fast chain breakage data can not be cleaned up

* (1) remove logback.xml DEBUG level;
(2) dataServerBootstrapConfig rename;
(3) print conf when startup

* update log

* fix update zero version,and fix log

* add clientOffDelayMs default value

* fix clientOffDelayMs

* Task(DatumSnapshot/Pub/UnPub) add retry strategy

* bugfix DataNodeServiceImpl: retryTimes

* (1)cancelDataTaskListener duplicate
(2)bugfix DataNodeServiceImpl and SessionRegistry

* refactor datum version

* add hessian black list

* bugfix: log "retryTimes"

* bugfix DatumLeaseManager:  Consider the situation of connectId lose after data restart; ownConnectId should calculate dynamically

* add jvm blacklist api

* fix file name

* some code optimization

* data:refactor snapshot

* fix jetty version

* bugfix DatumLeaseManager: If in a non-working state, cannot clean up because the renew request cannot be received at this time.

* remove SessionSerialFilterResource

* WriteDataProcessor add TaskEvent log; Cache print task update

* data bugfix: snapshot must notify session

* fix SubscriberPushEmptyTask default implement

* merge new

* fix protect

* 1. When the pub of connectId is 0, no clearance action is triggered.
2. Print map.size regularly
3. Delete the log: "ConnectId (%s) expired, lastRenewTime is %s, pub.size is 0"

* DataNodeExchanger: print but ignore if from renew module, cause renew request is too much

* reduce log of renew

* data bugfix: Data coverage is also allowed when versions are equal. Consistent with session design.

* DatumCache bugfix: Index coverage should be updated after pubMap update

* DatumSnapshotHandler: limit print; do not call dataChangeEventCenter.onChange if no diff

* bugfix unpub npe (pub maybe already clean by DatumLeaseManager);LIMITED_LIST_SIZE_FOR_PRINT change to 30

* some code refactor

* add code comment

* fix data working to init,and fix empty push version

* consider unpub is isWriteRequest, Reduce Snapshot frequency

* RefreshUpdateTime is at the top, otherwise multiple snapshot can be issued concurrently

* update config: reduce retryTimes, increase delayTime, the purpose is to reduce performance consumption

* put resume() in finally code block, avoid lock leak

* modify renewDatumWheelTaskDelay and datumTimeToLiveSec

* When session receives a connection and generates renew tasks, it randomly delays different times to avoid everyone launching renew at the same time.

* data: add executor for handler
session: bugfix snapshot
session: refactor wheelTimer of renew to add executor

* add get data log

* snapshot and lastUpdateTimestamp: Specific to dataServerIP

* 1. DataServer: RenewDatumHandler must return GenericResponse but not CommonResponse, or else session will class cast exception
2. No need to update timestamp after renew
3. snapshot: Need to specify DataServerIP

* add logs

* 1. dataServer: reduce log of snapshotHandler
2. update logs

* dataServer: renew logic should delay for some time after status is WORKING, cause Data is processed asynchronously after synchronization from other DataServer

* bugfix bean; update log

* ignore renew request log

* fix UT

* fix .travis.yml

* fix version 5.3.0-SNAPSHOT

* fix online notify connect error

* fix push confirm error,and fix datum update version,pub threadpool config,add accesslimit service

* fix push confirm error,and fix datum update version,pub threadpool config,add accesslimit service (#45)

* add switch renew and expire

* implement renew enable/disable switch

* fix data client exchange log

* fix datum fetch connect error

* bugfix CacheService: set version zero when first sub and get datum error

* fix clean task for fetch

* bugfix DatumCache: Forget to clean up the index in datumCache.putSnapshot

* Session&Data increase WordCache use

* code optimize

* WordCache: registerId do not add WordCache

* fix fetch datum word cache

* fix NotifyFetchDatumHandler npe

* fix test case time

* fix test case

* fix test case

* fix test case

* fix ut case: StopPushDataSwitchTest

* ut case:renew module

* fix ut case:TempPublisherTest

* fix version,and merge new

* bugfix ut case: increase sleep time

* fix ut case:RenewTest

* fix version and fix callback executor,fix log error

* fix ut case:RenewTest format

* fix pom version

* fix ut case:do not run parallelly

* refactor providerdata process

* Memory optimization:Datum.processDatum

* add session notify test

* copy from mybank:
1. Update Subscriber: support for push context
2. increase queueSize of checkPushExecutor
3. fix the isolation function of Gzone and Rzone

* Modify the deny policy of accessDataExecutor of SessionServer

* remove useless code

* fix call back

* fix meta methodhandle cache

* fix push confirm success

* Change the communication between session and data to multi connection

* resolve compile error

* fix processor

* BoltClient: the creation of ConnectionEventAdapter should be inheritable

* fix currentTimeMillis product from source

* add client Invalid check task

* use multiple RpcClient instances instead of one RpcClient with multiple connections,and start a heartbeat thread to ensure connection pool because bolt does not maintain the number of connection pools

* refactor TaskListener and use map instead of list in DefaultTaskListenerManager; refactor getSingleTaskDispatcher()

* DataChangeRequestHandler:optimize performance

* refactor: Heartbeat between session and data

* fix: Synex-wh#20 (review)

* update

* BoltClient use just one RpcClient;
remove heartbeat between session and data;

* SyncDataCallback reduce ThreadSize for saving cpu

* reduce NOTIFY_SESSION_CALLBACK_EXECUTOR threadSize

* fix version in DataChangeFetchTask

* 1. filter out the unPubs of datum when first put, Otherwise, "syncData" or "fetchData" get Datum may contains unPubs, which will result something error
2. add jul-to-slf4j for some lib which use jul log, e.g. hessian

* fix meta mem

* fix test case

* fix temp case

* fix syncConfigRetryInterval 60s

* fix format

Co-authored-by: wukezhu <atell@qq.com>
Synex-wh added a commit that referenced this issue Feb 14, 2020
* fix temp push

* update version 5.2.1-SNAPSHOT

* fix test case

* fix jetty version,and fix rest api for dataInfoIds

* fix hashcode test

* fix working to init bug

* fix start task log

* fix Watcher can't get provide data, retry and finally return new

* add data server list api

* add server list api

* remove log

* fix issue 21

* add query by id function

* fix issue 22

* delay client off process and sync data process to working status

* fix data connect meta error

* fix inject NotifyDataSyncHandler

* fix start log

* add send sub log

* fix subscriber to send log

* bugfix: #27

* bugfix: #27

* feature: Add monitoring logs #29

* feature: Add monitoring logs #29
(1) bugfix CommonResponse
(2) format

* bugfix: During meta startup, leader may not register itself #30

* bugfix: Sometimes receive "Not leader" response from leader in OnStartingFollowing() #31

* temp add

* add renew request

* data snapshot module

* add calculate digest service

* fix word cache clientid

* data renew module

* data renew/expired module

* add renew datum request

* add WriteDataAcceptor

* session renew/expired module

* 1. bugfix ReNewDatumHandler: getByConnectId -> getOwnByConnectId
2. refactor DatumCache from static to instance

* add blacklist wrapper and filter

* upgrade jraft version to 1.2.5

* blacklist ut

* add clientoff delay time

* bugfix: The timing of snapshot construction is not right

* rename: ReNew -> Renew

* fix blacklist test case

* rename: unpub -> unPub

* add threadSize and queueSize limit

* bugfix: revert SessionRegistry

* fix sub fetch retry all error,and reset datainfoid version

* fix client fast chain breakage data can not be cleaned up

* (1) remove logback.xml DEBUG level;
(2) dataServerBootstrapConfig rename;
(3) print conf when startup

* update log

* fix update zero version,and fix log

* add clientOffDelayMs default value

* fix clientOffDelayMs

* Task(DatumSnapshot/Pub/UnPub) add retry strategy

* bugfix DataNodeServiceImpl: retryTimes

* (1)cancelDataTaskListener duplicate
(2)bugfix DataNodeServiceImpl and SessionRegistry

* refactor datum version

* add hessian black list

* bugfix: log "retryTimes"

* bugfix DatumLeaseManager:  Consider the situation of connectId lose after data restart; ownConnectId should calculate dynamically

* add jvm blacklist api

* fix file name

* some code optimization

* data:refactor snapshot

* fix jetty version

* bugfix DatumLeaseManager: If in a non-working state, cannot clean up because the renew request cannot be received at this time.

* remove SessionSerialFilterResource

* WriteDataProcessor add TaskEvent log; Cache print task update

* data bugfix: snapshot must notify session

* fix SubscriberPushEmptyTask default implement

* merge new

* fix protect

* 1. When the pub of connectId is 0, no clearance action is triggered.
2. Print map.size regularly
3. Delete the log: "ConnectId (%s) expired, lastRenewTime is %s, pub.size is 0"

* DataNodeExchanger: print but ignore if from renew module, cause renew request is too much

* reduce log of renew

* data bugfix: Data coverage is also allowed when versions are equal. Consistent with session design.

* DatumCache bugfix: Index coverage should be updated after pubMap update

* DatumSnapshotHandler: limit print; do not call dataChangeEventCenter.onChange if no diff

* bugfix unpub npe (pub maybe already clean by DatumLeaseManager);LIMITED_LIST_SIZE_FOR_PRINT change to 30

* some code refactor

* add code comment

* fix data working to init,and fix empty push version

* consider unpub is isWriteRequest, Reduce Snapshot frequency

* RefreshUpdateTime is at the top, otherwise multiple snapshot can be issued concurrently

* update config: reduce retryTimes, increase delayTime, the purpose is to reduce performance consumption

* put resume() in finally code block, avoid lock leak

* modify renewDatumWheelTaskDelay and datumTimeToLiveSec

* When session receives a connection and generates renew tasks, it randomly delays different times to avoid everyone launching renew at the same time.

* data: add executor for handler
session: bugfix snapshot
session: refactor wheelTimer of renew to add executor

* add get data log

* snapshot and lastUpdateTimestamp: Specific to dataServerIP

* 1. DataServer: RenewDatumHandler must return GenericResponse but not CommonResponse, or else session will class cast exception
2. No need to update timestamp after renew
3. snapshot: Need to specify DataServerIP

* add logs

* 1. dataServer: reduce log of snapshotHandler
2. update logs

* dataServer: renew logic should delay for some time after status is WORKING, cause Data is processed asynchronously after synchronization from other DataServer

* bugfix bean; update log

* ignore renew request log

* fix UT

* fix .travis.yml

* fix version 5.3.0-SNAPSHOT

* fix online notify connect error

* fix push confirm error,and fix datum update version,pub threadpool config,add accesslimit service

* fix push confirm error,and fix datum update version,pub threadpool config,add accesslimit service (#45)

* add switch renew and expire

* implement renew enable/disable switch

* fix data client exchange log

* fix datum fetch connect error

* bugfix CacheService: set version zero when first sub and get datum error

* fix clean task for fetch

* bugfix DatumCache: Forget to clean up the index in datumCache.putSnapshot

* Session&Data increase WordCache use

* code optimize

* WordCache: registerId do not add WordCache

* fix fetch datum word cache

* fix NotifyFetchDatumHandler npe

* fix test case time

* fix test case

* fix test case

* fix test case

* fix ut case: StopPushDataSwitchTest

* ut case:renew module

* fix ut case:TempPublisherTest

* fix version,and merge new

* bugfix ut case: increase sleep time

* fix ut case:RenewTest

* fix version and fix callback executor,fix log error

* fix ut case:RenewTest format

* fix pom version

* fix ut case:do not run parallelly

* refactor providerdata process

* Memory optimization:Datum.processDatum

* add session notify test

* copy from mybank:
1. Update Subscriber: support for push context
2. increase queueSize of checkPushExecutor
3. fix the isolation function of Gzone and Rzone

* Modify the deny policy of accessDataExecutor of SessionServer

* remove useless code

* fix call back

* fix meta methodhandle cache

* fix push confirm success

* Change the communication between session and data to multi connection

* resolve compile error

* fix processor

* BoltClient: the creation of ConnectionEventAdapter should be inheritable

* fix currentTimeMillis product from source

* add client Invalid check task

* use multiple RpcClient instances instead of one RpcClient with multiple connections,and start a heartbeat thread to ensure connection pool because bolt does not maintain the number of connection pools

* refactor TaskListener and use map instead of list in DefaultTaskListenerManager; refactor getSingleTaskDispatcher()

* DataChangeRequestHandler:optimize performance

* refactor: Heartbeat between session and data

* fix: Synex-wh#20 (review)

* update

* BoltClient use just one RpcClient;
remove heartbeat between session and data;

* SyncDataCallback reduce ThreadSize for saving cpu

* reduce NOTIFY_SESSION_CALLBACK_EXECUTOR threadSize

* 1. filter out the unPubs of datum when first put, Otherwise, "syncData" or "fetchData" get Datum may contains unPubs, which will result something error
2. add jul-to-slf4j for some lib which use jul log, e.g. hessian

* update for idc sync:
1. add a interface DatumStorage and implemented by LocalDatumStorage
2. remove Sync from BackUpNotifier
3. add RemoteDataServerChangeEvent

* 1. NotifyProvideDataChange support multiple nodeTypes
2. refactor provideData code of DataServer, just like SessionServer
3. remove GetChangeListRequestHandler to enterprise version because it's about multiple data centers

* use getClientRegion() instead of getSessionServerRegion() for push

* bugfix LocalDatumStorage#getVersions

* bugfix DataDigestResource api

* bugfix DataDigestResource api

* fix BoltClient: remove unnecessary code

* give more thread for getOtherDataCenterNodeAndUpdate, because otherwise it would rejected if too much task

* refresh for keep connect other dataServers: should use dataServerCache but not DataServerNodeFactory

* revert "delay cache invalid in DataChangeFetchTask&DataChangeFetchCloudTask",because if the old datum is not invalid, the new subscriber will get the old datum directly from the cache

* bugfix MetaStoreService&DataStoreService: "return" -> "continue"

* fix Memory waste of ServerDataBox

* revert MetaDigestResource api

* Request add method "getTimeout"

* bugfix: remove @ConditionalOnMissingBean for fetchDataHandler

* fix compile error

* RequestException: limit message size

* bugfix: empty dataServerList cause NPE because calculateOldConsistentHash return null

* trigger github ci

* trigger github ci

* fix ut

* update version to 5.4.0

Co-authored-by: Synex-wh <241809311@qq.com>
dzdx pushed a commit that referenced this issue Dec 13, 2021
dzdx pushed a commit that referenced this issue Dec 13, 2021
* fix temp push

* update version 5.2.1-SNAPSHOT

* fix test case

* fix jetty version,and fix rest api for dataInfoIds

* fix hashcode test

* fix working to init bug

* fix start task log

* fix Watcher can't get providate data,retry and finally return new

* add data server list api

* add server list api

* remove log

* fix isssue 21

* add query by id function

* fix issue 22

* delay client off process and sync data process to working status

* fix data connet meta error

* fix inject NotifyDataSyncHandler

* fix start log

* add send sub log

* fix subscriber to send log

* bugfix: #27

* bugfix: #27

* feature: Add monitoring logs #29

* feature: Add monitoring logs #29
(1) bugfix CommonResponse
(2) format

* bugfix: During meta startup, leader may not register itself #30

* bugfix: Sometimes receive "Not leader" response from leader in OnStartingFollowing() #31

* temp add

* add renew request

* data snapshot module

* add calculate digest service

* fix word cache clientid

* data renew module

* data renew/expired module

* add renew datuem request

* add WriteDataAcceptor

* session renew/expired module

* 1. bugfix ReNewDatumHandler: getByConnectId -> getOwnByConnectId
2. reactor DatumCache from static to instance

* add blacklist wrapper and filter

* upgrade jraft version to 1.2.5

* blacklist ut

* add clientoff delay time

* bugfix: The timing of snapshot construction is not right

* rename: ReNew -> Renew

* fix blacklist test case

* rename: unpub -> unPub

* add threadSize and queueSize limit

* bugfix: revert SessionRegistry

* fix sub fetch retry all error,and reset datainfoid version

* fix client fast chain breakage data can not be cleaned up”

* (1) remove logback.xml DEBUG level;
(2) dataServerBootstrapConfig rename;
(3) print conf when startup

* update log

* fix update zero version,and fix log

* add clientOffDelayMs default value

* fix clientOffDelayMs

* Task(DatumSnapshot/Pub/UnPub) add retry strategy

* bugfix DataNodeServiceImpl: retryTimes

* (1)cancelDataTaskListener duplicate
(2)bugfix DataNodeServiceImpl and SessionRegistry

* refactor datum version

* add hessian black list

* bugfix: log "retryTimes"

* bugfix DatumLeaseManager:  Consider the situation of connectId lose after data restart; ownConnectId should calculate dynamically

* add jvm blacklist api

* fix file name

* some code optimization

* data:refactor snapshot

* fix jetty version

* bugfix DatumLeaseManager: If in a non-working state, cannot clean up because the renew request cannot be received at this time.

* remove SessionSerialFilterResource

* WriteDataProcessor add TaskEvent log; Cache print task update

* data bugfix: snapshot must notify session

* fix SubscriberPushEmptyTask default implement

* merge new

* fix protect

* 1. When the pub of connectId is 0, no clearance action is triggered.
2. Print map.size regularly
3. Delete the log: "ConnectId (%s) expired, lastRenewTime is %s, pub.size is 0"

* DataNodeExchanger: print but ignore if from renew module, cause renew request is too much

* reduce log of renew

* data bugfix: Data coverage is also allowed when versions are equal. Consistent with session design.

* DatumCache bugfix: Index coverage should be updated after pubMap update

* DatumSnapshotHandler: limit print; do not call dataChangeEventCenter.onChange if no diff

* bugfix unpub npe (pub maybe already clean by DatumLeaseManager);LIMITED_LIST_SIZE_FOR_PRINT change to 30

* some code refactor

* add code comment

* fix data working to init,and fix empty push version

* consider unpub is isWriteRequest, Reduce Snapshot frequency

* RefreshUpdateTime is at the top, otherwise multiple snapshot can be issued concurrently

* update config: reduce retryTimes, increase delayTime, the purpose is to reduce performance consumption

* put resume() in finally code block, avoid lock leak

* modify renewDatumWheelTaskDelay and datumTimeToLiveSec

* When session receives a connection and generates renew tasks, it randomly delays different times to avoid everyone launching renew at the same time.

* data: add executor for handler
session: bugfix snapshot
session: refactor wheelTimer of renew to add executor

* add get data log

* snapshot and lastUpdateTimestamp: Specific to dataServerIP

* 1. DataServer: RenewDatumHandler must return GenericResponse but not CommonResponse, or else session will class cast exception
2. No need to update timestamp after renew
3. snapshot: Need to specify DataServerIP

* add logs

* 1. dataServer: reduce log of snapshotHandler
2. update logs

* dataServer: renew logic should delay for some time after status is WORKING, cause Data is processed asynchronously after synchronization from other DataServer

* bugfix bean; update log

* ignore renew request log

* fix UT

* fix .travis.yml

* fix version 5.3.0-SNAPSHOT

* fix online notify connect error

* fix push confirm error,and fix datum update version,pub threadpool config,add accesslimit service

* fix push confirm error,and fix datum update version,pub threadpool config,add accesslimit service (#45)

* add switch renew and expire

* implement renew enable/disable switch

* fix data client exchange log

* fix datum fetch connect error

* bugfix CacheService: set version zero when first sub and get datum error

* fix clean task for fetch

* bugfix DatumCache: Forget to clean up the index in datumCache.putSnapshot

* Session&Data increase WordCache use

* code optimize

* WordCache: registerId do not add WordCache

* fix fetch datum word cache

* fix NotifyFetchDatumHandler npe

* fix test case time

* fix test case

* fix test case

* fix test case

* fix ut case: StopPushDataSwitchTest

* ut case:renew module

* fix ut case:TempPublisherTest

* fix version,and merge new

* bugfix ut case: increase sleep time

* fix ut case:RenewTest

* fix version and fix callback executor,fix log error

* fix ut case:RenewTest format

* fix pom version

* fix ut case:do not run parallelly

* refactor providerdata process

* Memory optimization:Datum.processDatum

* add session notify test

* copy from mybank:
1. Update Subscriber: support for push context
2. increase queueSize of checkPushExecutor
3. fix the isolation function of Gzone and Rzone

* Modify the deny policy of accessDataExecutor of SessionServer

* remove useless code

* fix call back

* fix meta methodhandle cache

* fix push confirm success

* Change the communication between session and data to multi connection

* resolve compile error

* fix processor

* BoltClient: the creation of ConnectionEventAdapter should be inheritable

* fix currentTimeMillis product from source

* add client Invalid check task

* use multiple RpcClient instances instead of one RpcClient with multiple connections,and start a heartbeat thread to ensure connection pool because bolt does not maintain the number of connection pools

* refactor TaskListener and use map instead of list in DefaultTaskListenerManager; refactor getSingleTaskDispatcher()

* DataChangeRequestHandler:optimize performance

* refactor: Heartbeat between session and data

* fix: Synex-wh#20 (review)

* update

* BoltClient use just one RpcClient;
remove heartbeat between session and data;

* SyncDataCallback reduce ThreadSize for saving cpu

* reduce NOTIFY_SESSION_CALLBACK_EXECUTOR threadSize

* fix version in DataChangeFetchTask

* 1. filter out the unPubs of datum when first put, Otherwise, "syncData" or "fetchData" get Datum may contains unPubs, which will result something error
2. add jul-to-slf4j for some lib which use jul log, e.g. hessian

* fix meta mem

* fix test case

* fix temp case

* fix syncConfigRetryInterval 60s

* fix format

Co-authored-by: wukezhu <atell@qq.com>
dzdx pushed a commit that referenced this issue Dec 13, 2021
* fix temp push

* update version 5.2.1-SNAPSHOT

* fix test case

* fix jetty version,and fix rest api for dataInfoIds

* fix hashcode test

* fix working to init bug

* fix start task log

* fix Watcher can't get providate data,retry and finally return new

* add data server list api

* add server list api

* remove log

* fix issue 21

* add query by id function

* fix issue 22

* delay client off process and sync data process to working status

* fix data connect meta error

* fix inject NotifyDataSyncHandler

* fix start log

* add send sub log

* fix subscriber to send log

* bugfix: #27

* bugfix: #27

* feature: Add monitoring logs #29

* feature: Add monitoring logs #29
(1) bugfix CommonResponse
(2) format

* bugfix: During meta startup, leader may not register itself #30

* bugfix: Sometimes receive "Not leader" response from leader in OnStartingFollowing() #31

* temp add

* add renew request

* data snapshot module

* add calculate digest service

* fix word cache clientid

* data renew module

* data renew/expired module

* add renew datum request

* add WriteDataAcceptor

* session renew/expired module

* 1. bugfix ReNewDatumHandler: getByConnectId -> getOwnByConnectId
2. refactor DatumCache from static to instance

* add blacklist wrapper and filter

* upgrade jraft version to 1.2.5

* blacklist ut

* add clientoff delay time

* bugfix: The timing of snapshot construction is not right

* rename: ReNew -> Renew

* fix blacklist test case

* rename: unpub -> unPub

* add threadSize and queueSize limit

* bugfix: revert SessionRegistry

* fix sub fetch retry all error,and reset datainfoid version

* fix client fast chain breakage data can not be cleaned up

* (1) remove logback.xml DEBUG level;
(2) dataServerBootstrapConfig rename;
(3) print conf when startup

* update log

* fix update zero version,and fix log

* add clientOffDelayMs default value

* fix clientOffDelayMs

* Task(DatumSnapshot/Pub/UnPub) add retry strategy

* bugfix DataNodeServiceImpl: retryTimes

* (1)cancelDataTaskListener duplicate
(2)bugfix DataNodeServiceImpl and SessionRegistry

* refactor datum version

* add hessian black list

* bugfix: log "retryTimes"

* bugfix DatumLeaseManager:  Consider the situation of connectId lose after data restart; ownConnectId should calculate dynamically

* add jvm blacklist api

* fix file name

* some code optimization

* data:refactor snapshot

* fix jetty version

* bugfix DatumLeaseManager: If in a non-working state, cannot clean up because the renew request cannot be received at this time.

* remove SessionSerialFilterResource

* WriteDataProcessor add TaskEvent log; Cache print task update

* data bugfix: snapshot must notify session

* fix SubscriberPushEmptyTask default implement

* merge new

* fix protect

* 1. When the pub of connectId is 0, no clearance action is triggered.
2. Print map.size regularly
3. Delete the log: "ConnectId (%s) expired, lastRenewTime is %s, pub.size is 0"

* DataNodeExchanger: print but ignore if from renew module, cause renew request is too much

* reduce log of renew

* data bugfix: Data coverage is also allowed when versions are equal. Consistent with session design.

* DatumCache bugfix: Index coverage should be updated after pubMap update

* DatumSnapshotHandler: limit print; do not call dataChangeEventCenter.onChange if no diff

* bugfix unpub npe (pub maybe already clean by DatumLeaseManager);LIMITED_LIST_SIZE_FOR_PRINT change to 30

* some code refactor

* add code comment

* fix data working to init,and fix empty push version

* consider unpub is isWriteRequest, Reduce Snapshot frequency

* RefreshUpdateTime is at the top, otherwise multiple snapshot can be issued concurrently

* update config: reduce retryTimes, increase delayTime, the purpose is to reduce performance consumption

* put resume() in finally code block, avoid lock leak

* modify renewDatumWheelTaskDelay and datumTimeToLiveSec

* When session receives a connection and generates renew tasks, it randomly delays different times to avoid everyone launching renew at the same time.

* data: add executor for handler
session: bugfix snapshot
session: refactor wheelTimer of renew to add executor

* add get data log

* snapshot and lastUpdateTimestamp: Specific to dataServerIP

* 1. DataServer: RenewDatumHandler must return GenericResponse but not CommonResponse, or else session will class cast exception
2. No need to update timestamp after renew
3. snapshot: Need to specify DataServerIP

* add logs

* 1. dataServer: reduce log of snapshotHandler
2. update logs

* dataServer: renew logic should delay for some time after status is WORKING, cause Data is processed asynchronously after synchronization from other DataServer

* bugfix bean; update log

* ignore renew request log

* fix UT

* fix .travis.yml

* fix version 5.3.0-SNAPSHOT

* fix online notify connect error

* fix push confirm error,and fix datum update version,pub threadpool config,add accesslimit service

* fix push confirm error,and fix datum update version,pub threadpool config,add accesslimit service (#45)

* add switch renew and expire

* implement renew enable/disable switch

* fix data client exchange log

* fix datum fetch connect error

* bugfix CacheService: set version zero when first sub and get datum error

* fix clean task for fetch

* bugfix DatumCache: Forget to clean up the index in datumCache.putSnapshot

* Session&Data increase WordCache use

* code optimize

* WordCache: registerId do not add WordCache

* fix fetch datum word cache

* fix NotifyFetchDatumHandler npe

* fix test case time

* fix test case

* fix test case

* fix test case

* fix ut case: StopPushDataSwitchTest

* ut case:renew module

* fix ut case:TempPublisherTest

* fix version,and merge new

* bugfix ut case: increase sleep time

* fix ut case:RenewTest

* fix version and fix callback executor,fix log error

* fix ut case:RenewTest format

* fix pom version

* fix ut case:do not run parallelly

* refactor providerdata process

* Memory optimization:Datum.processDatum

* add session notify test

* copy from mybank:
1. Update Subscriber: support for push context
2. increase queueSize of checkPushExecutor
3. fix the isolation function of Gzone and Rzone

* Modify the deny policy of accessDataExecutor of SessionServer

* remove useless code

* fix call back

* fix meta methodhandle cache

* fix push confirm success

* Change the communication between session and data to multi connection

* resolve compile error

* fix processor

* BoltClient: the creation of ConnectionEventAdapter should be inheritable

* fix currentTimeMillis product from source

* add client Invalid check task

* use multiple RpcClient instances instead of one RpcClient with multiple connections,and start a heartbeat thread to ensure connection pool because bolt does not maintain the number of connection pools

* refactor TaskListener and use map instead of list in DefaultTaskListenerManager; refactor getSingleTaskDispatcher()

* DataChangeRequestHandler:optimize performance

* refactor: Heartbeat between session and data

* fix: Synex-wh#20 (review)

* update

* BoltClient use just one RpcClient;
remove heartbeat between session and data;

* SyncDataCallback reduce ThreadSize for saving cpu

* reduce NOTIFY_SESSION_CALLBACK_EXECUTOR threadSize

* 1. filter out the unPubs of datum when first put, Otherwise, "syncData" or "fetchData" get Datum may contains unPubs, which will result something error
2. add jul-to-slf4j for some lib which use jul log, e.g. hessian

* update for idc sync:
1. add a interface DatumStorage and implemented by LocalDatumStorage
2. remove Sync from BackUpNotifier
3. add RemoteDataServerChangeEvent

* 1. NotifyProvideDataChange support multiple nodeTypes
2. refactor provideData code of DataServer, just like SessionServer
3. remove GetChangeListRequestHandler to enterprise version because it's about multiple data centers

* use getClientRegion() instead of getSessionServerRegion() for push

* bugfix LocalDatumStorage#getVersions

* bugfix DataDigestResource api

* bugfix DataDigestResource api

* fix BoltClient: remove unnecessary code

* give more thread for getOtherDataCenterNodeAndUpdate, because otherwise it would rejected if too much task

* refresh for keep connect other dataServers: should use dataServerCache but not DataServerNodeFactory

* revert "delay cache invalid in DataChangeFetchTask&DataChangeFetchCloudTask",because if the old datum is not invalid, the new subscriber will get the old datum directly from the cache

* bugfix MetaStoreService&DataStoreService: "return" -> "continue"

* fix Memory waste of ServerDataBox

* revert MetaDigestResource api

* Request add method "getTimeout"

* bugfix: remove @ConditionalOnMissingBean for fetchDataHandler

* fix compile error

* RequestException: limit message size

* bugfix: empty dataServerList cause NPE because calculateOldConsistentHash return null

* trigger github ci

* trigger github ci

* fix ut

* update version to 5.4.0

Co-authored-by: Synex-wh <241809311@qq.com>
@dzdx dzdx closed this as completed Mar 9, 2022