HA may be unavailable when the slave's commitLog does not match the master's #3388

@qsrg

Description

Versions: 4.3.2, 4.4.0
When a slave's machine has been down for several days and the slave is started again after the machine is repaired, the master cannot send commitLog data to the slave. As a result, with SYNC_MASTER, sending a message to this broker returns SLAVE_NOT_AVAILABLE. Searching the master's broker.log shows 'Slave fall behind master:xxx'.

After analysis, I found that the slave's commitLog offset does not exist in the master's commitLog. Because of this, the master's getCommitLogData call is given an out-of-range offset and returns null, so the master only sends heartbeats.

The relevant source code is as follows:

SelectMappedBufferResult selectResult =
    HAConnection.this.haService.getDefaultMessageStore().getCommitLogData(this.nextTransferFromWhere);
if (selectResult != null) {
    int size = selectResult.getSize();
    if (size > HAConnection.this.haService.getDefaultMessageStore().getMessageStoreConfig().getHaTransferBatchSize()) {
        size = HAConnection.this.haService.getDefaultMessageStore().getMessageStoreConfig().getHaTransferBatchSize();
    }

    long thisOffset = this.nextTransferFromWhere;
    this.nextTransferFromWhere += size;

    selectResult.getByteBuffer().limit(size);
    this.selectMappedBufferResult = selectResult;

    // Build Header
    this.byteBufferHeader.position(0);
    this.byteBufferHeader.limit(headerSize);
    this.byteBufferHeader.putLong(thisOffset);
    this.byteBufferHeader.putInt(size);
    this.byteBufferHeader.flip();

    this.lastWriteOver = this.transferData();
} else {
    service.getWaitNotifyObject().wakeupAll();
    HAConnection.this.haService.getWaitNotifyObject().allWaitForRunning(100);
}

I think that when the master receives slaveRequestOffset, the calculation of nextTransferFromWhere should check that slaveRequestOffset is valid, rather than just doing:

this.nextTransferFromWhere = HAConnection.this.slaveRequestOffset;
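
To make the idea concrete, here is a minimal standalone sketch of such a check. It is not the RocketMQ implementation: the class SlaveOffsetCheck, the enum OffsetStatus, and the parameters masterMinOffset and masterMaxOffset are hypothetical names, and it assumes the master can report the smallest and largest offsets its commitLog currently holds. What the master should do with an out-of-range offset (log an error, close the connection, or restart from a valid offset) is a separate policy decision that this sketch leaves open.

```java
public class SlaveOffsetCheck {

    /** Hypothetical classification of the offset reported by a slave. */
    enum OffsetStatus { IN_RANGE, BELOW_MIN, ABOVE_MAX }

    /**
     * Compares the slave's requested offset with the range of offsets the
     * master still holds. masterMinOffset and masterMaxOffset are assumed
     * inputs; this sketch does not read them from a real message store.
     */
    static OffsetStatus classify(long slaveRequestOffset,
                                 long masterMinOffset,
                                 long masterMaxOffset) {
        if (slaveRequestOffset < masterMinOffset) {
            // The master no longer holds the data the slave asks for
            // (for example, old commitLog files were deleted).
            return OffsetStatus.BELOW_MIN;
        }
        if (slaveRequestOffset > masterMaxOffset) {
            // The slave claims data the master never wrote.
            return OffsetStatus.ABOVE_MAX;
        }
        return OffsetStatus.IN_RANGE;
    }

    public static void main(String[] args) {
        long masterMin = 1_000L;
        long masterMax = 2_000L;
        // A slave that was offline for days may report an offset below the master's minimum.
        System.out.println(classify(500L, masterMin, masterMax));
        System.out.println(classify(1_500L, masterMin, masterMax));
        System.out.println(classify(2_500L, masterMin, masterMax));
    }
}
```

With a check like this, the master could detect the mismatch when it first receives slaveRequestOffset and report it explicitly, instead of repeatedly reading from an offset that always yields null.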
