NBitcoin doesn't connect to the Bitcoin P2P network only on OSX under 8GB memory #345
Comments
@NicolasDorier I would like to fix this but I need some help. The problem is that I cannot configure @nopara73's test application to get the logs; it seems the System.Diagnostics classes cannot be configured in dotnet core 2.0 (take a look at this and this). Troubleshooting without logs is hard, and I am working blind here. Do you have any recommendation? I started writing another logging mechanism to replace the System.Diagnostics classes, but it doesn't look right. (Just as a side note: dotnet core doesn't provide a console listener either, though I may be wrong.) |
As alternative strategies I have a few things I can think of:
|
I think NBitcoin should use ILogger from .NETStandard. I am unsure when they started doing it. |
I will take a look to improve logging on NBitcoin. |
@nopara73 right now NBitcoin is using AppVeyor. I am not against adding Travis support and running all tests on macOS as well. |
Add a way to extend log right now... |
@lontivero can you use this f1dcd9a I added an extension point for logs for environment without TraceSource. I need to eventually drop TraceSource completely. |
@dev0tion if I remember correctly you are using OSX. Did you ever experience it? |
Oh as I see Stratis excludes these tests from CI. Is it possible it never even worked? https://github.com/stratisproject/StratisBitcoinFullNode/blob/master/build.sh
|
@nopara73 I fixed a bug in the netcore version. You can now inject a logger through By default it is this one, but you can make your own to print into console logs of NBitcoin.
|
Logging was @lontivero's request. @lontivero Can you make progress with this logging? I am not sure it helps me. I'm trying another route: getting an OSX virtual machine, if possible, so I'll be able to debug into the code. |
@nopara73 yeah we exclude integration tests from the builds on Travis as it's sometimes unreliable. Better run it on a real Mac and see what you get 😰 |
By "unreliable" I guess you mean that some tests are flaky (sometimes pass, sometimes fail). The ideal thing to do in this situation is finding which of the tests are flaky and disable them, not disable the whole test suite. |
Raw code should never fail. But code that communicates with a server can fail sometimes. Code that communicates with a peer to peer network is even more unreliable. Anyway I got an OSX virtual machine set up, so let's see. |
TLDR: The AddressManager fails on OSX High Sierra under 8GB memory. It doesn't fail with 8GB. @NicolasDorier @lontivero Just to further complicate the issue: it works fine with my virtual machine: OSX High Sierra, the same as the CI uses (OS X 10.12.6). Any idea what's going on? Update: I lowered the VM's resources to exactly match the CI's, and this time it failed. I'll repeat it to see if I get consistent results and report back. I increased the CPU count and memory back: it works again (tested 2 times). I think we can say it's consistent now. Let's figure out if it's CPU or memory. |
I guess by Raw code you mean unit tests. If yes, I agree.
An integration test (instead of unit), which should be placed in a different test suite.
If the server is not part of the test fixture setup, yes. Ideally, integration tests should set up and bootstrap all the elements they will need themselves. When this doesn't happen, it's when flaky tests start to appear. |
The summary of my testing of under what conditions this bug is present: OSX under 8GB memory. |
What? @nopara73, could you check the logs? |
@NicolasDorier I cannot find the logs. Where should I look for them? |
inject |
Meh, I don't get DI. Could you make a PR? I guess it's a one liner: https://github.com/nopara73/NBitcoinBitcoinP2pOsxBug/blob/master/NBitcoinP2pOsxBug/Program.cs |
@NicolasDorier @lontivero Here you go. I've found the problem, not sure about the solution yet. An exception is thrown here: https://github.com/MetacoSA/NBitcoin/blob/master/NBitcoin/Protocol/Node.cs#L561
var node = Node.Connect(network, addr.Endpoint, param2);
It is caught and swallowed here: https://github.com/MetacoSA/NBitcoin/blob/master/NBitcoin/Protocol/Node.cs#L569-L572
catch(SocketException)
{
    parameters.ConnectCancellation.WaitHandle.WaitOne(500);
}
With this, a node.Connect() function never returns. The exception is: "no socket buffer space available." |
It seems that this catch block is swallowing too much. The SocketException class has an ErrorCode property that can make error handling more specific. The commit that introduced the call to WaitOne(500) should have made sure it only did it for the specific errorCode that it was expecting, instead of all error codes. Any chance this errorCode can be reproduced again? |
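To make the suggestion above concrete, here is a minimal sketch of a narrower catch using a C# exception filter on `SocketException.SocketErrorCode`. The `ConnectRetry` class, the `IsTransient` helper, and the set of error codes treated as retryable are assumptions for illustration; the thread does not say which error the original `WaitOne(500)` was meant to handle, and this is not NBitcoin's actual code.

```csharp
using System;
using System.Net.Sockets;
using System.Threading;

public static class ConnectRetry
{
    // Hypothetical policy: only these errors are assumed to be worth a short pause and a retry.
    public static bool IsTransient(SocketException ex) =>
        ex.SocketErrorCode == SocketError.ConnectionRefused ||
        ex.SocketErrorCode == SocketError.TimedOut ||
        ex.SocketErrorCode == SocketError.HostUnreachable;

    // Runs connect(); retries transient socket errors up to maxAttempts times.
    // Any other SocketException (e.g. NoBufferSpaceAvailable) propagates to the caller.
    public static T ConnectWithRetry<T>(Func<T> connect, int maxAttempts, CancellationToken cancellation)
    {
        for (int attempt = 1; ; attempt++)
        {
            cancellation.ThrowIfCancellationRequested();
            try
            {
                return connect();
            }
            catch (SocketException ex) when (IsTransient(ex) && attempt < maxAttempts)
            {
                // Same 500 ms pause as the original catch block, but only for expected errors.
                cancellation.WaitHandle.WaitOne(500);
            }
        }
    }
}
```

With a filter like this, "no socket buffer space available" would surface immediately instead of being retried forever, which would have made the OSX failure visible without extra logging.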
@knocte The full exception (it is thrown a million times):
|
@wintercooled tried it on 4GB High Sierra and it worked for him. Maybe this problem is specific to virtual machines? |
In my short experience with OSX I have seen that it imposes stricter limits in many respects. I had to deal with ulimit (max open files) days ago. In this case the problem seems to be the sockets' send/receive buffers. I hardcoded smaller buffer size values and @nopara73's test now passes green in Travis. I must make the test with the appropriate |
I can make the tests pass simply by changing the NodeConnectionParameters parameters (my change is below):
- var connectionParameters = new NodeConnectionParameters();
+ var connectionParameters = new NodeConnectionParameters()
+ {
+     ReceiveBufferSize = 100 * 5000,
+     SendBufferSize = 100 * 1000
+ };
The article Mac OSX Tuning says that
The article also says that the default new values for Settings for OSX Yosemite/Mavericks/Sierra are:
In NBitcoin, the default values are hardcoded as follows:
ReceiveBufferSize = 1000 * 5000;
SendBufferSize = 1000 * 1000;
IMO, even though NBitcoin works perfectly well and can be parametrized easily, I think those default values should be modified so that OSX programmers don't have to deal with this. I'm not sure what the best strategy is (changing the values to fit the known max limits, leaving the defaults as they are, or something else). This is not a blocker for us. |
This is a good suggestion. |
@lontivero can you check that setting both to 1048576 works fine? |
I made a commit which should fix it. Can you try it, just to be sure? |
Can you also check whether macOS throws an exception that I can catch when the value is above that limit? |
@lontivero last question, can you also try
with size being above the threshold? |
@NicolasDorier it works with the documented max values. Here you can see my commit, and the Travis CI build is still green. About your last question: I am working with Linux right now (this environment is new for me) and I don't have ILSpy, but I am 99.99% sure that |
I am wondering if an exception is thrown so I can attempt to cap the value and retry at Node.cs level in case it is higher. |
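The cap-and-retry idea could look roughly like the sketch below. It assumes that the OS rejects an oversized buffer with a `SocketException` at the moment the value is applied, which this thread does not confirm. `BufferSizing` and `ApplyWithCap` are hypothetical names, not part of NBitcoin.

```csharp
using System;
using System.Net.Sockets;

public static class BufferSizing
{
    // Tries to apply the requested size; if that throws and the request was
    // above the cap, applies the cap instead. Returns the size that was applied.
    public static int ApplyWithCap(Action<int> apply, int requested, int cap)
    {
        try
        {
            apply(requested);
            return requested;
        }
        catch (SocketException) when (requested > cap)
        {
            // Assumption: the failure was caused by the size exceeding an OS limit.
            apply(cap);
            return cap;
        }
    }
}
```

For a real socket the call might be `BufferSizing.ApplyWithCap(size => socket.ReceiveBufferSize = size, 1000 * 5000, 1048576)`, where 1048576 is the value discussed earlier in this thread. If the error only appears later, at connect time, this approach would not help and the defaults would have to be lowered up front.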
I have no way to test it. I just believe that what @nopara73 said is accurate and that it is swallowing the exceptions. And yes, I was right, |
Thanks for your testing, I am publishing new NBitcoin version now. |
Ok pushed |
Closing this; reopen if the problem persists. |
Update. The AddressManager fails on OSX High Sierra under 8GB memory. It doesn't fail with or over 8GB.
I created a repository that reproduces this issue with a few lines of code in Program.cs.
Repository: https://github.com/nopara73/NBitcoinBitcoinP2pOsxBug
I also added travis CI integration that shows it runs on Linux, but it doesn't run on OSX. (It runs on Windows, too.)
Travis builds: https://travis-ci.org/nopara73/NBitcoinBitcoinP2pOsxBug/builds/344928253