Socket.Select() method doesn't work correctly in linux. #4631

Closed
sirentek opened this Issue Nov 22, 2015 · 33 comments

sirentek commented Nov 22, 2015

I use Linux Mint MATE 17.2, which is based on Ubuntu and Debian.
I cannot connect to a PostgreSQL database through Npgsql from a Linux client, but I can connect from a Windows client.

The cause is the following call:
=>Socket.Select(null, write, error, perIpTimeout);

On Linux:
After the call, the list write doesn't have any elements.
As a result, execution enters the if (!write.Any()) block, which throws a TimeoutException.

On Windows:
After the call, the list write still contains the socket. It works.

Here is sample code to reproduce it. It is a console application that runs on CoreCLR.

// Program.cs
using System;
// using Npgsql;
using System.Net;
using System.Net.Sockets;
using System.Collections.Generic;
using System.Linq;

namespace Sample
{
    public class Program
    {
        public void Main()
        {
            Console.WriteLine("App started..");

            string Host = "192.168.1.72";
            int Port = 5432;

            int perIpTimeout = -1;
            var ips = Dns.GetHostAddressesAsync(Host).Result;
            var ep = new IPEndPoint(ips[0], Port);
            var socket = new Socket(ep.AddressFamily, SocketType.Stream, ProtocolType.Tcp)
            {
                Blocking = false
            };


            try
            {
                try
                {
                    socket.Connect(ep);
                }
                catch (SocketException e)
                {
                    if (e.SocketErrorCode != SocketError.WouldBlock)
                    {
                        throw;
                    }
                }
                var write = new List<Socket> { socket };
                var error = new List<Socket> { socket };
                Socket.Select(null, write, error, perIpTimeout);
                var errorCode = (int)socket.GetSocketOption(SocketOptionLevel.Socket, SocketOptionName.Error);
                if (errorCode != 0)
                {
                    // Reuse the value read above: on Linux, reading SO_ERROR clears it, so a second read may return 0.
                    throw new SocketException(errorCode);
                }
                if (!write.Any())
                {
                    // On Linux, execution enters this block!
                    Console.WriteLine(
                        $"Timeout after {new TimeSpan(perIpTimeout * 10).TotalSeconds} seconds when connecting to {ips[0]}");
                    try { socket.Dispose(); }
                    catch
                    {
                        // ignored
                    }

                    // if i == 0
                    Console.WriteLine("i==0 exception");
                    throw new TimeoutException();

                    // continue;
                }
                socket.Blocking = true;

                return;
            }
            catch (TimeoutException) { throw; }
            catch
            {
                try { socket.Dispose(); }
                catch
                {
                    // ignored
                }

                Console.WriteLine("Failed to connect to " + ips[0]);

                // if (i == 0)

                Console.WriteLine("i==0 exception 2");
                throw;

            }
        }
    }

    // TODO: Remove. Will be fixed in the next release of EF.
    public class Startup
    {
        public void Configure() { }
    }
}

@davidsh davidsh added this to the 1.0.0-rc2 milestone Nov 22, 2015

Member

davidsh commented Nov 22, 2015

sirentek commented Nov 23, 2015

Here is another version of the program that makes the stack trace clearer:

            try
            {
                try
                {
                    socket.Connect(ep);
                }
                catch (SocketException e)
                {
                    // => The socket exception is caught here!

                    Console.WriteLine("Excpetion Message=" + e.Message);
                    Console.WriteLine("Excpetion Stack=" + e.StackTrace);

                    Console.WriteLine("Socket Error Code=" + e.SocketErrorCode.ToString());

                    if (e.SocketErrorCode != SocketError.WouldBlock)
                    {
                        throw;
                    }
                }
                var write = new List<Socket> { socket };
                var error = new List<Socket> { socket };
                Socket.Select(null, write, error, perIpTimeout);
                var errorCode = (int)socket.GetSocketOption(SocketOptionLevel.Socket, SocketOptionName.Error);
                if (errorCode != 0)
                {
                    Console.WriteLine("error code is not zero!");
                    // Reuse the value read above: on Linux, reading SO_ERROR clears it, so a second read may return 0.
                    throw new SocketException(errorCode);
                }
                if (!write.Any())
                {
                    Console.WriteLine(
                        $"Timeout after {new TimeSpan(perIpTimeout * 10).TotalSeconds} seconds when connecting to {ips[0]}");
                    try { socket.Dispose(); }
                    catch
                    {
                        // ignored
                    }

                    // if i == 0
                    Console.WriteLine("i==0 exception");
                    throw new TimeoutException();

                    // continue;
                }
                socket.Blocking = true;

                return;
            }
            catch (TimeoutException) { throw; }
            catch
            {
                try { socket.Dispose(); }
                catch
                {
                    // ignored
                }

                Console.WriteLine("Failed to connect to " + ips[0]);

                // if (i == 0)

                Console.WriteLine("i==0 exception 2");
                throw;

            }

Here is my output from the above program:

Excpetion Message=Unknown error 10035
Excpetion Stack= at System.Net.Sockets.Socket.DoConnect(EndPoint endPointSnapshot, SocketAddress socketAddress)
at System.Net.Sockets.Socket.Connect(EndPoint remoteEP)
at SirenTek.Program.Main(String[] args)
Socket Error Code=WouldBlock
Timeout after -1E-06 seconds when connecting to 192.168.1.72
i==0 exception
System.TimeoutException: The operation has timed out.

sirentek commented Nov 23, 2015

The Socket.DoConnect method throws an exception. That is why the Socket.Select() method behaves differently on Linux.

Contributor

pgavlin commented Nov 23, 2015

A fix is in progress. FWIW, the exception above is expected and is not related to the later failure. The later failure is due to broken code that translates between CoreFX data structures and platform data structures.

sirentek commented Nov 23, 2015

@pgavlin Thank you for the detailed information. Do you think it will take a long time to fix?
(For me, a long time means one month.) I am excited to see that a fix is in progress!

Contributor

pgavlin commented Nov 23, 2015

There should be a fix in our master branch today or tomorrow. I am not sure exactly how long it will take to propagate to the NuGet feeds once it's checked in--perhaps @joshfree can comment on that.

Member

stephentoub commented Nov 23, 2015

I am not sure exactly how long it will take to propagate to the NuGet feeds once it's checked in

Nightly builds out of master should show up at https://www.myget.org/gallery/dotnet-core the next morning.

Member

joshfree commented Nov 23, 2015

RC2 daily builds are at https://myget.org/gallery/dotnet-core as @stephentoub said above. We'll only be pushing new bits to nuget.org for an RC1-servicing event (think really bad blocking bug for RC1 adopters) or when RC2 ships early next year.

sirentek commented Nov 23, 2015

This is very good news. PostgreSQL is the only database that has an EF provider and works under Linux. That makes the fix very important for me, and probably for others. Many thanks and kind regards!

sirentek commented Nov 24, 2015

I have tested System.Net.Sockets -Version 4.1.0-rc2-23523 from https://www.myget.org/gallery/dotnet-core on Linux. I got the same errors again. I think this version doesn't include the fix. I am looking forward to seeing the newer version.

Member

stephentoub commented Nov 24, 2015

@sirentek, Pat's fix hasn't gone in yet. This issue will be updated when it does, so you don't need to check in the meantime.

Contributor

pgavlin commented Nov 30, 2015

Closed by #4652.

PeteX commented Dec 17, 2015

In case anyone comes here and is having trouble with this, I've built a new System.Native.so which you may find helpful. It's built from the tree as of rc1-update1, so it should be compatible with the rest of that release, but it also includes this one patch. You can download it from http://tmp.chown.org.uk/System.Native.so and you should install it in ~/.dnx/runtimes/dnx-coreclr-linux-x64.1.0.0-rc1-update1/bin .

I'm running 64bit Ubuntu Trusty. If you have a different system I doubt this file will work, and you'll probably have to build it yourself.

With this fix, the Postgres driver and EntityFramework7.Npgsql work as expected, which is great news.
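
For anyone following along, the install step above can be sketched as a small shell function. This is only a sketch under assumptions: `install_native_so` is a hypothetical helper name, not part of DNX, and the `.orig` backup suffix is an arbitrary choice made here so the shipped library can be restored later.

```shell
#!/bin/sh
# Hypothetical helper (not part of DNX): swap in a patched System.Native.so,
# keeping a one-time backup of the library that shipped with the runtime.
install_native_so() {
    src=$1      # path to the patched System.Native.so
    bin_dir=$2  # the runtime's bin directory

    if [ ! -f "$src" ]; then
        echo "patched library not found: $src" >&2
        return 1
    fi
    if [ ! -d "$bin_dir" ]; then
        echo "runtime bin directory not found: $bin_dir" >&2
        return 1
    fi

    target="$bin_dir/System.Native.so"
    # Back up the original only once, so re-running never overwrites the backup.
    if [ -f "$target" ] && [ ! -f "$target.orig" ]; then
        cp -p "$target" "$target.orig" || return 1
    fi
    cp "$src" "$target"
}
```

It could be called as `install_native_so ./System.Native.so "$HOME/.dnx/runtimes/dnx-coreclr-linux-x64.1.0.0-rc1-update1/bin"`. Copying `System.Native.so.orig` back over `System.Native.so` undoes the change.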

Contributor

roji commented Dec 17, 2015

That's great @PeteX, thanks!

sirentek commented Dec 18, 2015

@PeteX I have installed your System.Native library on my OS (Linux Mint 17.2 MATE).
Although it didn't work for me, I appreciate your work.

rodrigo-a-moreno commented Dec 28, 2015

Thanks @PeteX, it works perfectly.

sirentek commented Jan 1, 2016

@freakZoid How did you get it working? Did you just copy the System.Native.so file into your ~/.dnx/runtimes/dnx-coreclr-linux-x64.1.0.0-rc1-update1/bin directory, and that was all?

Rembel commented Jan 2, 2016

@sirentek I replaced the file, and after a reboot it works perfectly.

sirentek commented Jan 2, 2016

@Rembel thanks for the reply. I tried again after reboot and I still have the same exceptions mentioned above.

thmulvany commented Jan 7, 2016

@PeteX , thank you so much for this! It works. My app can talk to AWS RDS PostgreSQL!

I take it you have submitted a corefx pull request with these changes :)

Anyone ( @sirentek ) wanting to use it can just add the file to your project root and then use docker COPY like this:
COPY ./System.Native.so /opt/DNX_BRANCH/runtimes/dnx-coreclr-linux-x64.1.0.0-rc1-update1/bin

Be sure to also have these in your project.json; otherwise this new file from Pete won't help at all, since there are really several problems with getting a .NET Core (non-Mono) app that uses EF7 and the Npgsql EF7 provider working:

"runtime.linux.System.Net.NetworkInformation": "4.1.0-beta-23516",
"runtime.unix.System.Net.Security": "4.0.0-beta-23516"

I was told some magic addition to project.json would help resolve correct System.Net.* files

"Microsoft.NETCore.Platforms": "1.0.1-beta-23516"

but it did NOT, so you can exclude this.

I have even tried to
dnu restore --runtime coreclr50 -s https://www.myget.org/F/aspnetvnext/api/v2/ -f https://www.nuget.org/api/v2/
(restoring from RC2 feed) and
dnu build to --framework dnxcore50
to no avail so don't bother trying this.

Hope this all gets ironed out soon or in RC2 and does not fall through the cracks.

sirentek commented Jan 7, 2016

@thmulvany , I will give it a try tonight and share the results.

PeteX commented Jan 7, 2016

Hello @thmulvany, glad it worked for you. There's no need for a pull request because the fix is already in RC2. All I did was backport it to RC1 and build the resulting tree. I'm looking forward to RC2 coming out, so I can stop using this hack.

If my binary isn't working for you, it's not too hard to build the fixed tree yourself. Clone the corefx tree, check out the v1.0.0-rc1 tag, then cherry-pick the relevant commit (git cherry-pick -m1 339dcf2 I think). I seem to remember that the changes apply cleanly. Follow the normal build instructions, but at the end, take only System.Native.so—unless you want to run your own build.
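
The backport steps above can be sketched as a script. Treat it as a sketch under assumptions: the clone URL is the public dotnet/corefx repository, the commit id is the one quoted above with an "I think" (so it is unverified), and `run`, `backport_select_fix` and the `DRY_RUN` switch are hypothetical names introduced here. The build itself is deliberately left out; follow the repository's own build instructions for that.

```shell
#!/bin/sh
# Hypothetical wrapper: with DRY_RUN set, print each command instead of running it.
run() {
    if [ -n "${DRY_RUN:-}" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Clone corefx, check out the RC1 tag, and cherry-pick the fix onto it.
# The default commit id is the unverified one quoted in the comment above.
backport_select_fix() {
    dir=${1:-corefx}
    commit=${2:-339dcf2}

    run git clone https://github.com/dotnet/corefx.git "$dir" &&
    run git -C "$dir" checkout v1.0.0-rc1 &&
    run git -C "$dir" cherry-pick -m1 "$commit"
    # Next: build the tree per the repository's instructions and keep only
    # the resulting System.Native.so.
}
```

Running `DRY_RUN=1; backport_select_fix work` prints the three git commands without touching the network, which is a cheap way to check them before committing to a full clone and build.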

sirentek commented Jan 7, 2016

I have added System.Native.so to the rc1-update1 runtime bin folder
and added the following 2 lines to project.json:
"runtime.linux.System.Net.NetworkInformation": "4.1.0-beta-23516",
"runtime.unix.System.Net.Security": "4.0.0-beta-23516"
It still didn't work.

I tried runtime coreclr rc-2-16357 and again it didn't work for me.
I didn't try building the tree myself.

Thanks for the information (@PeteX and @thmulvany )

hasanoruc commented Jan 8, 2016

@PeteX, thank you very, very much. It works for me.
I received this error:
System.TimeoutException: The operation has timed out.
at Npgsql.NpgsqlConnector.Connect(NpgsqlTimeout timeout)
at Npgsql.NpgsqlConnector.RawOpen(NpgsqlTimeout timeout)
at Npgsql.NpgsqlConnector.Open(NpgsqlTimeout timeout)
at Npgsql.NpgsqlConnector.Open()
at Npgsql.NpgsqlConnectorPool.GetPooledConnector(NpgsqlConnection Connection)
at Npgsql.NpgsqlConnectorPool.RequestConnector(NpgsqlConnection connection)
at Npgsql.NpgsqlConnection.OpenInternal(NpgsqlTimeout timeout)
at Npgsql.NpgsqlConnection.Open()
... and special error.
Then I installed your System.Native.so file by adding it to my bin folder, and added these packages:
"runtime.linux.System.Net.NetworkInformation": "4.1.0-beta-23516",
"runtime.unix.System.Net.Security": "4.0.0-beta-23516"
Then I restored my packages with dnu restore, selected 1.0.0-rc1-update1, rebooted, and voilà.
Thank you very much again.

thmulvany commented Jan 8, 2016

@sirentek, @hasanoruc, if you are deploying a console app to Linux (most likely via Docker), the .so file has to end up in /opt/DNX_BRANCH/runtimes ... You can use the docker COPY as one way to achieve this (see my comment above). For web app deploys, System.Native.so needs to be in the $HOME/.dnx/runtimes/dnx-coreclr-linux-x64.1.0.0-rc1-update1/bin folder. When you dnu publish ... --runtime "dnx-coreclr-linux-x64.1.0.0-rc1-update1" ... it will use $HOME/.dnx/runtimes/dnx-coreclr-linux-x64.1.0.0-rc1-update1/bin and place it under the publish point root /approot/runtimes/..., which means that when you docker build ... the .so will make it into the image. If you can't get it working, I can create a repo that demonstrates that it works.

sirentek commented Jan 8, 2016

@thmulvany ,
I am developing a web app. I have the .so file in the $HOME/.dnx/runtimes/dnx-coreclr-linux-x64.1.0.0-rc1-update1/bin folder as you mentioned. My project.json file includes the following 2 packages.
"runtime.linux.System.Net.NetworkInformation": "4.1.0-beta-23516",
"runtime.unix.System.Net.Security": "4.0.0-beta-23516"

I use coreclr rc1-update1, but no luck. I don't use Docker, but I believe it should work without Docker too. I didn't try Npgsql; I tried the code above, and socket.Connect() throws exception 10035.

sirentek commented Jan 8, 2016

@thmulvany @hasanoruc @PeteX
I have tried it using with Npgsql and it worked!
Thanks to everybody.
Now I can delete all of my mono runtimes :)

hasanoruc commented Jan 8, 2016

@thmulvany, you are right. I am deploying a web app with only the Kestrel web server, without Docker. I run it on internal port 5000 and forward the external SSL port 443 to it with nginx. I don't know how secure this method is, but from what I have read elsewhere it is secure. Kestrel is very new, but it will grow.

thmulvany commented Jan 10, 2016

This is beyond me with SSL, nginx, etc.
One thing to try is to do a dnvm install latest -r coreclr - x64 -u (for
unstable RC2) and then dnu publish --runtime ... with that runtime
specified.

All I know for sure is that by including those two deps in project json and
updating System.Native.so it fixed all of my
System.Net.NetworkInformation, System.Net.Security and System.Net.Sockets
errors and then finally the .so file fixed the "timeout" which ONLY
manifested for me when the NpgSql (base alpha6 or EF7 provider) tried to
make low level net call to the DB instance. Outside of this scenario, I'm
afraid I will not be very useful. This of course only applies to errors on
Linux not OSX.

Sorry to hear of your problems.

Terry

snissim commented Jan 11, 2016

@PeteX you sir, are a gentleman and a scholar.

@thmulvany thanks for the docker command! Note that it requires a trailing slash: COPY ./System.Native.so /opt/DNX_BRANCH/runtimes/dnx-coreclr-linux-x64.1.0.0-rc1-update1/bin/

iwnow commented Jan 23, 2016

Thank you all! 👍

markvincze commented Feb 22, 2016

Thanks a lot @PeteX and @thmulvany!

One thing I had to modify: I don't know why, but on my system the coreclr binaries reside in /opt/runtimes/dnx-coreclr-linux-x64.1.0.0-rc1-update1/bin/ instead of /opt/DNX_BRANCH/runtimes/dnx-coreclr-linux-x64.1.0.0-rc1-update1/bin/, so I needed to add the following line to my Dockerfile:

COPY ./System.Native.so /opt/runtimes/dnx-coreclr-linux-x64.1.0.0-rc1-update1/bin/

If someone else is in doubt, you can always start your docker image in interactive mode with bash and look around.
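
As a rough sketch of that "look around" step, the function below lists every copy of the library under a given root, so the right COPY destination can be read off its output. `find_native_so` is a hypothetical name introduced here, and searching from `/` by default is an assumption.

```shell
#!/bin/sh
# Hypothetical helper: list every System.Native.so under a root directory
# (default: the whole filesystem), ignoring unreadable directories.
find_native_so() {
    root=${1:-/}
    find "$root" -type f -name 'System.Native.so' 2>/dev/null
}
```

Inside a container started with something like `docker run -it --entrypoint bash <image>` (where `<image>` stands for your own image name), running `find_native_so /opt` should print the runtime's bin path if the runtime lives under /opt.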

Sinishin commented May 6, 2016

Many thanks, @PeteX.

I was able to adapt the ASP.NET HelloMVC project to connect to PostgreSQL using Npgsql 3.1.0-alpha6 on Ubuntu 14.04.2 LTS. I just replaced System.Native.so with your version of the file in "~/.dnx/runtimes/dnx-coreclr-linux-x64.1.0.0-rc1-update2/bin". I didn't have to add the dependencies you mentioned, though. The problem I was facing is described in a separate issue.
Here's additional info about my setup for anybody toying with ASP.NET on Ubuntu using Docker and .NET Core:

dnvm version
1.0.0-rc2-15546

dnx --version
Microsoft .NET Execution environment
Version: 1.0.0-rc1-16609
Type: CoreClr
Architecture: x64
OS Name: Linux
OS Version: ubuntu 14.04
Runtime Id: ubuntu.14.04-x64

Looking forward to RC2
