Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fixed handling errors for client requests #8518

Merged
merged 1 commit into from
Nov 11, 2020

Conversation

merlimat
Copy link
Contributor

Motivation

ClientCnx is maintaining different maps for different types of requests. The problem is that on handleError() it's only checking on the main map, the one containing requests for creating producers and consumers, but it's ignoring the error on other kind of requests.
For example, an error returned by broker on the GetLastMessageId request, will result in calls to hasMessageAvailable() to get stuck, since the error is ignored and the future never completed:

18:11:09.083 [pulsar-client-io-1-1:org.apache.pulsar.client.impl.ClientCnx@602] WARN  org.apache.pulsar.client.impl.ClientCnx - [id: 0xa378924b, L:/127.0.0.1:58267 - R:localhost/127.0.0.1:6650] Received error from server: XXX 
18:11:09.084 [pulsar-client-io-1-1:org.apache.pulsar.client.impl.ClientCnx@612] WARN  org.apache.pulsar.client.impl.ClientCnx - [id: 0xa378924b, L:/127.0.0.1:58267 - R:localhost/127.0.0.1:6650] Received unknown request id from server: 3

Modifications

There's no good reason to keep 1 map per kind of request. Instead, we should use a single map.

@merlimat merlimat added type/bug The PR fixed a bug or issue reported a bug release/2.6.3 labels Nov 11, 2020
@merlimat merlimat added this to the 2.7.0 milestone Nov 11, 2020
@merlimat merlimat self-assigned this Nov 11, 2020
@@ -465,7 +443,7 @@ protected void handleGetLastMessageIdSuccess(CommandGetLastMessageIdResponse suc
log.debug("{} Received success GetLastMessageId response from server: {}", ctx.channel(), success.getRequestId());
}
long requestId = success.getRequestId();
CompletableFuture<MessageIdData> requestFuture = pendingGetLastMessageIdRequests.remove(requestId);
CompletableFuture<MessageIdData> requestFuture = (CompletableFuture<MessageIdData>) pendingRequests.remove(requestId);
Copy link
Contributor

@jerrypeng jerrypeng Nov 11, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if there is a cast exception, shouldn't we handle it? If the client somehow is returned a wrong request id that is for a different request type, a cast exception may happen here. If it does, the requestFuture will never get completed and may cause the client api call to block forever.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The connection will get automatically closed and therefore the future will be completed there.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i see

@sijie sijie merged commit bb45b8c into apache:master Nov 11, 2020
@merlimat merlimat deleted the handle-request-errors branch November 11, 2020 18:57
codelipenghui pushed a commit that referenced this pull request Nov 12, 2020
`ClientCnx` is maintaining different maps for different types of requests. The problem is that on `handleError()` it's only checking on the main map, the one containing requests for creating producers and consumers, but it's ignoring the error on other kind of requests.
For example, an error returned by broker on the `GetLastMessageId` request, will result in calls to `hasMessageAvailable()` to get stuck, since the error is ignored and the future never completed:

```
18:11:09.083 [pulsar-client-io-1-1:org.apache.pulsar.client.impl.ClientCnx@602] WARN  org.apache.pulsar.client.impl.ClientCnx - [id: 0xa378924b, L:/127.0.0.1:58267 - R:localhost/127.0.0.1:6650] Received error from server: XXX
18:11:09.084 [pulsar-client-io-1-1:org.apache.pulsar.client.impl.ClientCnx@612] WARN  org.apache.pulsar.client.impl.ClientCnx - [id: 0xa378924b, L:/127.0.0.1:58267 - R:localhost/127.0.0.1:6650] Received unknown request id from server: 3
```

There's no good reason to keep 1 map per kind of request. Instead, we should use a single map.

(cherry picked from commit bb45b8c)
@codelipenghui
Copy link
Contributor

cherry-picked to branch-2.6

codelipenghui pushed a commit to streamnative/pulsar-archived that referenced this pull request Nov 13, 2020
`ClientCnx` is maintaining different maps for different types of requests. The problem is that on `handleError()` it's only checking on the main map, the one containing requests for creating producers and consumers, but it's ignoring the error on other kind of requests.
For example, an error returned by broker on the `GetLastMessageId` request, will result in calls to `hasMessageAvailable()` to get stuck, since the error is ignored and the future never completed:

```
18:11:09.083 [pulsar-client-io-1-1:org.apache.pulsar.client.impl.ClientCnx@602] WARN  org.apache.pulsar.client.impl.ClientCnx - [id: 0xa378924b, L:/127.0.0.1:58267 - R:localhost/127.0.0.1:6650] Received error from server: XXX
18:11:09.084 [pulsar-client-io-1-1:org.apache.pulsar.client.impl.ClientCnx@612] WARN  org.apache.pulsar.client.impl.ClientCnx - [id: 0xa378924b, L:/127.0.0.1:58267 - R:localhost/127.0.0.1:6650] Received unknown request id from server: 3
```

There's no good reason to keep 1 map per kind of request. Instead, we should use a single map.

(cherry picked from commit bb45b8c)
(cherry picked from commit eb5e81b)
flowchartsman pushed a commit to flowchartsman/pulsar that referenced this pull request Nov 17, 2020
### Motivation

`ClientCnx` is maintaining different maps for different types of requests. The problem is that on `handleError()` it's only checking on the main map, the one containing requests for creating producers and consumers, but it's ignoring the error on other kind of requests. 
For example, an error returned by broker on the `GetLastMessageId` request, will result in calls to `hasMessageAvailable()` to get stuck, since the error is ignored and the future never completed: 

```
18:11:09.083 [pulsar-client-io-1-1:org.apache.pulsar.client.impl.ClientCnx@602] WARN  org.apache.pulsar.client.impl.ClientCnx - [id: 0xa378924b, L:/127.0.0.1:58267 - R:localhost/127.0.0.1:6650] Received error from server: XXX 
18:11:09.084 [pulsar-client-io-1-1:org.apache.pulsar.client.impl.ClientCnx@612] WARN  org.apache.pulsar.client.impl.ClientCnx - [id: 0xa378924b, L:/127.0.0.1:58267 - R:localhost/127.0.0.1:6650] Received unknown request id from server: 3
```

### Modifications

There's no good reason to keep 1 map per kind of request. Instead, we should use a single map.
merlimat added a commit to merlimat/pulsar that referenced this pull request Dec 19, 2020
### Motivation

`ClientCnx` is maintaining different maps for different types of requests. The problem is that on `handleError()` it's only checking on the main map, the one containing requests for creating producers and consumers, but it's ignoring the error on other kind of requests. 
For example, an error returned by broker on the `GetLastMessageId` request, will result in calls to `hasMessageAvailable()` to get stuck, since the error is ignored and the future never completed: 

```
18:11:09.083 [pulsar-client-io-1-1:org.apache.pulsar.client.impl.ClientCnx@602] WARN  org.apache.pulsar.client.impl.ClientCnx - [id: 0xa378924b, L:/127.0.0.1:58267 - R:localhost/127.0.0.1:6650] Received error from server: XXX 
18:11:09.084 [pulsar-client-io-1-1:org.apache.pulsar.client.impl.ClientCnx@612] WARN  org.apache.pulsar.client.impl.ClientCnx - [id: 0xa378924b, L:/127.0.0.1:58267 - R:localhost/127.0.0.1:6650] Received unknown request id from server: 3
```

### Modifications

There's no good reason to keep 1 map per kind of request. Instead, we should use a single map.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
release/2.6.3 type/bug The PR fixed a bug or issue reported a bug
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants