Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

EventHubClient (with transportType=Websockets) creation times out when running on java-x86_64 (mixed mode) #385

Closed
SreeramGarlapati opened this issue Sep 24, 2018 · 14 comments
Assignees
Labels

Comments

@SreeramGarlapati
Copy link
Contributor

Issue raised by @dak0ta - #264

we had an older squid that wasn't working great with HTTP CONNECT tunnelling so I setup a newer one.

New one works fine with CURL to test, but when trying out the sample EventHubs code we get

Exception in thread "main" com.microsoft.azure.eventhubs.TimeoutException: Opening MessagingFactory timed out.
    at com.microsoft.azure.eventhubs.impl.MessagingFactory$1.run(MessagingFactory.java:125)
    at com.microsoft.azure.eventhubs.impl.Timer$ScheduledTask.onEvent(Timer.java:49)
    at com.microsoft.azure.eventhubs.impl.DispatchHandler.onTimerTask(DispatchHandler.java:9)
    at org.apache.qpid.proton.engine.BaseHandler.handle(BaseHandler.java:233)
    at org.apache.qpid.proton.engine.impl.EventImpl.dispatch(EventImpl.java:108)
    at org.apache.qpid.proton.reactor.impl.ReactorImpl.dispatch(ReactorImpl.java:324)
    at org.apache.qpid.proton.reactor.impl.ReactorImpl.process(ReactorImpl.java:291)
    at com.microsoft.azure.eventhubs.impl.MessagingFactory$RunReactor.run(MessagingFactory.java:491)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1153)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
    at java.lang.Thread.run(Thread.java:785)

Squid logs show a request to the eventhubs on 443, but I don't think that's what this request here is.

1537807898.328 59999 X.X.X.X TCP_TUNNEL/200 4011 CONNECT XXXXXXXXXXXX.servicebus.windows.net:443 - HIER_DIRECT/40.86.225.142 -

Is this first request succeeding then a subsequent AMQP request failing? Or is this first one actually failing?

adding our squid proxy expert - @lee0c - can you please help @dak0ta - with the Squid proxy configuration to get EventHubs java library talking to it.

@SreeramGarlapati SreeramGarlapati self-assigned this Sep 24, 2018
@SreeramGarlapati
Copy link
Contributor Author

SreeramGarlapati commented Sep 24, 2018

@dak0ta - yes, this is the first request and it failed (timed out). Here's how to tell that:

This (59999) part of the squid proxy log

1537807898.328 59999 X.X.X.X TCP_TUNNEL/200 4011 CONNECT XXXXXXXXXXXX.servicebus.windows.net:443 - HIER_DIRECT/40.86.225.142 -

shows the time taken for the request - which is ~60s - which is the default timeout configured on EventHubClient => eventually, EventHubClient instance closed this socket as the request timed out.

I have 2 ideas to make progress:

Suggestion-1

please double check if the CURL request that succeeded is going via port 443 ?

With these symptoms - I would definitely double check the restrictions on outbound connections for the environment where the proxy server is running - allow for port: 443 and for processname=SQUID_PROCESS_NAME - to begin with.

Suggestion-2

To test this one out - we set up an Azure VM running Ubuntu 18.04 with Squid installed. Please see what's different from this configuration:

Here's just what we had to do - to get this to work (when we tested this a while back with Squid proxy):

  1. In Azure, add a network rule to allow inbound traffic on port 3128, Squid’s default port (which can be changed if needed). In /etc/squid/ the squid.conf file manages all the configuration settings. add a few lines to this to add access from your computer.
# Pre-existing lines defining possible access control lists 
#acl localnet … 
 
# Add a line defining the network you will access the proxy from 
acl my_network src 131.107.0.0/16 
  1. Here “my_network” is the name, “src” defines that requests coming from this range of addresses fall into this ACL, and then the range of addresses is defined.
# Find this preexisting line 
http_access deny all 
 
# And add access for the acl you just created directly before that line 
http_access allow my_network 
http_access deny all 

Here's how our successful proxy CONNECT frames looked like:

1536184900.355 9289 131.107.174.191 TCP_TUNNEL/200 7816 CONNECT EVENTHUBS_NAMESPACE.servicebus.windows.net:443 - HIER_DIRECT/52.151.58.121 -

@apraovjr
Copy link

apraovjr commented Sep 24, 2018

squid.txt
@dak0ta - Please rename the file back to .conf as it wasnt allowing me to upload a conf file.

In this file , on line 981, acl my_network src 131.107.0.0/16 is added which is IP range of local machine

and on line 1176 http_access allow my_network which basically allows access for that network.

I used above squid configuration and was able to send/receive messages through the proxy

@ChrisCollinsIBM
Copy link

ChrisCollinsIBM commented Sep 25, 2018

Thanks for the info, so I've tried these settings and still running into the same issue. I've even gone as far as just setting http_access allow all and dropped the rest of the rules just to rule out permissions.

Here's a connect from CURL, followed by a connect from the Azure test app followed by curl again from another host:

1537886275.651   1333 172.X.X.X TCP_TUNNEL/200 4299 CONNECT XXXXXX.servicebus.windows.net:443 - HIER_DIRECT/40.86.225.142 -
1537886516.000  60001 9.1.X.X TCP_TUNNEL/200 4011 CONNECT XXXXXX.servicebus.windows.net:443 - HIER_DIRECT/40.86.225.142 -
1537891429.947   1160 9.2.X.X TCP_TUNNEL/200 6180 CONNECT XXXXXX.servicebus.windows.net:443 - HIER_DIRECT/40.86.225.142 -

So the request made it in to the proxy and wasn't denied, but the one from the client software is timing out. I can play with settings a bit more but this is definitely odd.

@SreeramGarlapati
Copy link
Contributor Author

@dak0ta - can you bump up the logging level to understand what went on in this squid request.
if squid doesn't receive the response from EventHubs service - it wouldn't tunnel it back to the client. And if client doesn't receive any response - it wouldn't have anything to act on. So, it would timeout.
So, next logical step is, to understand why EventHubs service did not respond for this specific request - one approach here could be - to find the difference between succeeding curl requests and this one.
@apraovjr , @lee0c - any more ideas?

@apraovjr
Copy link

@SreeramGarlapati I agree with your approach, unless we get more information in logs, it would be tricky to debug what exactly is going on

@ChrisCollinsIBM
Copy link

ChrisCollinsIBM commented Sep 25, 2018

So part of that is a me problem for sure, I bumped up the debugging in Squid but I didn't seem to get any more logging. Squid docs say it goes to stderr but I didn't see the logs in any of the usual suspects... I'll chase that down.

What I did do is get the test client run with java.net.debug=all on to have a look at what was requested and what the response was.

So there's an initial tunnel request, certificate exchange, TLS negotiation, etc...

Then there's the main request/response:

Request [
GET https://XXXXXXXXXX.servicebus.windows.net:443/$servicebus/websocket HTTP/1.1
Connection: Upgrade,Keep-Alive
Upgrade: websocket
Sec-WebSocket-Version: 13
Sec-WebSocket-Key: Vo7CB9/XXXXXXXXB7oX7A==
Sec-WebSocket-Protocol: AMQPWSB10
Host: XXXXXXXXXX.servicebus.windows.net:443

-Small amount of binary data-
]

Response [
HTTP/1.1 101 Switching Protocols
Upgrade: websocket
Server: Microsoft-HTTPAPI/2.0
Sec-WebSocket-Protocol: AMQPWSB10
Connection: Upgrade
Sec-WebSocket-Accept: JDMgv6JDpsdiXXXXXXXXSv+SnPdA=
Date: Tue, 25 Sep 2018 18:25:20 GMT

-Small amount of binary data-
]

After this there's a 60 second hang and the connection is closed.

@ChrisCollinsIBM
Copy link

ChrisCollinsIBM commented Sep 25, 2018

At a glance everything on the server side appears to following https://tools.ietf.org/html/rfc6455, but it seems the test Client doesn't actually post the data once the HTTP/1.1 101 Switching Protocols is complete.

This data was captured from the JVM of the machine running the eventhubs test code (to post an event) - so this request definitely left that machine and made it to the eventhubs endpoint via the proxy and the response had to have made it back as well otherwise it wouldn't be visible in the javax.net.debug output.

So what's the next steps?

I'm not bound to Squid but I'm guessing many of our customers will also be using it, and there's no reason it shouldn't work from a feature perspective.

Thanks in advance for your assistance with this issue!

@SreeramGarlapati
Copy link
Contributor Author

@dak0ta - the client is blocked on response for 101. Then, this is really just websocket layer. Can you please run your testClient without any proxy settings but, keeping the setting TransportType=AmqpWebScokets on the EventHub connection string. I am curious - if has anything to do with squid proxy.

@ChrisCollinsIBM
Copy link

ChrisCollinsIBM commented Sep 26, 2018

Looks to be the same:

Padded plaintext before ENCRYPTION:  len = 320
0000: 47 45 54 20 68 74 74 70  73 3a 2f 2f 71 72 61 64  GET.https...XXXX
0010: 61 72 72 6f 61 64 6d 61  70 65 68 2e 73 65 72 76 XXXXXXXXXX.serv
0020: 69 63 65 62 75 73 2e 77  69 6e 64 6f 77 73 2e 6e  icebus.windows.n
0030: 65 74 3a 34 34 33 2f 24  73 65 72 76 69 63 65 62  et.443..serviceb
0040: 75 73 2f 77 65 62 73 6f  63 6b 65 74 20 48 54 54  us.websocket.HTT
0050: 50 2f 31 2e 31 0d 0a 43  6f 6e 6e 65 63 74 69 6f  P.1.1..Connectio
0060: 6e 3a 20 55 70 67 72 61  64 65 2c 4b 65 65 70 2d  n..Upgrade.Keep.
0070: 41 6c 69 76 65 0d 0a 55  70 67 72 61 64 65 3a 20  Alive..Upgrade..
0080: 77 65 62 73 6f 63 6b 65  74 0d 0a 53 65 63 2d 57  websocket..Sec.W
0090: 65 62 53 6f 63 6b 65 74  2d 56 65 72 73 69 6f 6e  ebSocket.Version
00a0: 3a 20 31 33 0d 0a 53 65  63 2d 57 65 62 53 6f 63  ..13..Sec.WebSoc
00b0: 6b 65 74 2d 4b 65 79 3a  20 56 48 69 32 4e 51 4e  ket.Key..VHi2NQN
00c0: 30 33 78 6a 5a 67 51 64  55 56 71 73 6f 43 77 3d  03xjZgQdUVqsoCw.
00d0: 3d 0d 0a 53 65 63 2d 57  65 62 53 6f 63 6b 65 74  ...Sec.WebSocket
00e0: 2d 50 72 6f 74 6f 63 6f  6c 3a 20 41 4d 51 50 57  .Protocol..AMQPW
00f0: 53 42 31 30 0d 0a 48 6f  73 74 3a 20 71 72 61 64  SB10..Host..XXXX
0100: 61 72 72 6f 61 64 6d 61  70 65 68 2e 73 65 72 76  XXXXXXXXX.serv
0110: 69 63 65 62 75 73 2e 77  69 6e 64 6f 77 73 2e 6e  icebus.windows.n
0120: 65 74 3a 34 34 33 0d 0a  0d 0a f0 90 ad 74 fc e7  et.443.......t..
0130: de e5 c5 be 13 df 9e 06  f3 9c 2e 84 66 1d 01 01  ............f...

Padded plaintext after DECRYPTION:  len = 32
0000: 48 41 c2 f6 ca 73 04 68  c4 bf c5 c7 31 15 6b c5  HA...s.h....1.k.
0010: f3 d9 f6 74 ed 0a 0a 0a  0a 0a 0a 0a 0a 0a 0a 0a  ...t............

Padded plaintext after DECRYPTION:  len = 256
0000: 54 54 50 2f 31 2e 31 20  31 30 31 20 53 77 69 74  TTP.1.1.101.Swit
0010: 63 68 69 6e 67 20 50 72  6f 74 6f 63 6f 6c 73 0d  ching.Protocols.
0020: 0a 55 70 67 72 61 64 65  3a 20 77 65 62 73 6f 63  .Upgrade..websoc
0030: 6b 65 74 0d 0a 53 65 72  76 65 72 3a 20 4d 69 63  ket..Server..Mic
0040: 72 6f 73 6f 66 74 2d 48  54 54 50 41 50 49 2f 32  rosoft.HTTPAPI.2
0050: 2e 30 0d 0a 53 65 63 2d  57 65 62 53 6f 63 6b 65  .0..Sec.WebSocke
0060: 74 2d 50 72 6f 74 6f 63  6f 6c 3a 20 41 4d 51 50  t.Protocol..AMQP
0070: 57 53 42 31 30 0d 0a 43  6f 6e 6e 65 63 74 69 6f  WSB10..Connectio
0080: 6e 3a 20 55 70 67 72 61  64 65 0d 0a 53 65 63 2d  n..Upgrade..Sec.
0090: 57 65 62 53 6f 63 6b 65  74 2d 41 63 63 65 70 74  WebSocket.Accept
00a0: 3a 20 73 56 6f 63 6c 30  64 46 57 33 57 37 65 7a  ..sVocl0dFW3W7ez
00b0: 4d 37 45 78 2b 61 5a 32  56 42 70 52 59 3d 0d 0a  M7Ex.aZ2VBpRY...
00c0: 44 61 74 65 3a 20 57 65  64 2c 20 32 36 20 53 65  Date..Wed..26.Se
00d0: 70 20 32 30 31 38 20 31  31 3a 33 39 3a 33 39 20  p.2018.11.39.39.
00e0: 47 4d 54 0d 0a 0d 0a 1a  b3 55 9e f2 04 f1 93 02  GMT......U......
00f0: 71 e6 d0 f8 73 49 39 66  c4 7d 5b 04 04 04 04 04  q...sI9f........

And then a timeout with the same stack trace.

@SreeramGarlapati SreeramGarlapati changed the title how to get through squid proxy? EventHubClient (with transportType=Websockets) creation times out in some cases Sep 26, 2018
@ChrisCollinsIBM
Copy link

Were you able to re-create this at all @SreeramGarlapati?

@SreeramGarlapati
Copy link
Contributor Author

No.
is it possible for you send me details on which namespace you are trying to talk to: sreeramg@microsoft.com.
I want to understand - if there is anything in the service which can reproduce this on our end.

@SreeramGarlapati SreeramGarlapati changed the title EventHubClient (with transportType=Websockets) creation times out in some cases EventHubClient (with transportType=Websockets) creation times out when running on java-x86_64 (mixed mode) Sep 28, 2018
@SreeramGarlapati
Copy link
Contributor Author

@dak0ta - can you please build using Azure/qpid-proton-j-extensions#9 - and acknowledge if this fix unblocks you?
pls. use skipTests flag (mvn package -DskipTests).

@SreeramGarlapati
Copy link
Contributor Author

@dak0ta - I revamped the release notes for proxy support to help with setting proxy properties; appreciate your eye & any comments!

@JamesBirdsall
Copy link
Contributor

This is now a legacy client in maintenance mode and we will not be making enhancements. You can find the new Java client azure-messaging-eventhubs at https://github.com/Azure/azure-sdk-for-java/tree/master/sdk/eventhubs/azure-messaging-eventhubs

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

No branches or pull requests

4 participants