Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Async notification fixes #61

Merged
merged 3 commits into from
Oct 2, 2013
Merged

Async notification fixes #61

merged 3 commits into from
Oct 2, 2013

Conversation

glenebob
Copy link
Contributor

Francisco,

I'm re-submitting the two fixes in PR 46 that are clean and correct.

This will fix the CPU hog problem you were seeing, and eliminates the need for the odd Sleep() call in ProcessServerMessages().

-Glen

Switch to the correct thread syncronization method, Monitor and lock{}.
Use a reasonable timeout on Poll(), which take micrseconds, not milliseconds.
@franciscojunior
Copy link
Member

Excellent, @glenebob ! I'll review it and merge it soon.

@franciscojunior
Copy link
Member

Sorry for late reply.

Excellent job, @glenebob !

Indeed I could confirm that the high cpu usage only happens on Mono. On Windows or OSX. ( I didn't test it on Gnu/Linux yes, but I'm almost sure it will happen there too.)

With your changes, the only drawback I noticed is that on Mono, the main thread waits a little longer to resume processing normal commands after the notification thread is stopped.
I did a test where I waited for notifications and then tried to sen the query select version() This select waited a little bit to start to be processed. Only on Mono I got this behavior. On ms.net it worked perfectly.

Do you have a Mono installation in your machine? I'll investigate here where the thread is being suspended. If it is in the Poll method, or if in the Monitor calls and will let you know so we can check what is happening and if there is anything we can do about it.

Other than that, I think your patch is excellent. It uses much less cpu.

@glenebob
Copy link
Contributor Author

On Sun, Sep 29, 2013 at 6:22 AM, Francisco Figueiredo Jr. <
notifications@github.com> wrote:

Excellent job, @glenebob https://github.com/glenebob !

Indeed I could confirm that the high cpu usage only happens on Mono. On
Windows or OSX. ( I didn't test it on Gnu/Linux yes, but I'm almost sure it
will happen there too.)

With your changes, the only drawback I noticed is that on Mono, the main
thread waits a little longer to resume processing normal commands after the
notification thread is stopped.
I did a test where I waited for notifications and then tried to sen the
query select version() This select waited a little bit to start to be
processed. Only on Mono I got this behavior. On ms.net it worked
perfectly.

Do you have a Mono installation in your machine? I'll investigate here
where the thread is being suspended. If it is in the Poll method, or if in
the Monitor calls and will let you know so we can check what is happening
and if there is anything we can do about it.

Francisco,

I don't have mono installed. I can install it (in Win8) if needed, and I
can also install it on an Ubuntu VM I have at Azure, but first... How long
is it waiting? It should wait for up to 20 ms but no more.

How are you testing it? Are you simply executing a query with the
notification thread running, or are you first removing the notification
thread explicitly?

-Glen

Other than that, I think your patch is excellent. It uses much less cpu.


Reply to this email directly or view it on GitHubhttps://github.com//pull/61#issuecomment-25320169
.

@franciscojunior
Copy link
Member

It is waiting something like 1 or 2 seconds.

I'm not in my development machine, I'm in my Android phone, so sorry for not writing a proper code. It was something like that:

Open the connection
Setup the notification callback
Send the listen command
Console.readline

Write line ( Npgsqlcommand ("select version")) ;

That's basically it.

Note that this behavior only occurs on mono, even on Windows. You can try to get a ready installed vm on virtualboxes.org and install mono on it with apt-get. I think this would be the fastest way to get a mono. Other way would be to install mono for Windows from mono-project.com

Please, give it a try and let me know if you can reproduce this behavior.

@glenebob
Copy link
Contributor Author

Francisco,

Problem solved and documented.

I did a bunch of testing on my Ubuntu VM, and found that System.Threading,Monitor has a bug on mono. The above commit includes a nice explanation comment.

I executed the code below (by calling MonitorTest()) on Windows and Linux. This code starts two threads which, in a loop, repeatedly compete for a lock on Locker. Once the lock is obtained, the thread sleeps for 20 ms and then issues a console line reporting which thread got the lock. This simulates the Npgsql notification thread competing with a user thread for access to the _socket. On Windows, the output is perfectly interwoven between the first and second threads, as it should be. On Linux, the output shows one thread obtaining the lock many times in a row even though the other one has been waiting for it for some time.

By introducing Thread.Sleep(1) just before the lock{} construct, the program behaves properly on Linux as well as Windows.

Note that Sleep(0) had no effect whatsoever in my tests.

-Glen

object Locker;
CountdownEvent Done;

void MonitorTest()
{
Locker = new object();
Done = new CountdownEvent(2);

Thread t1 = new Thread(MonitorTestThreadHandler);
Thread t2 = new Thread(MonitorTestThreadHandler);

t1.Start(1);
t2.Start(2);

Done.Wait();
}

void MonitorTestThreadHandler(object oContext)
{
DateTime s = DateTime.Now;

while (DateTime.Now < s.AddSeconds(10)) {
lock (Locker) {
Thread.Sleep(20);
Console.Error.WriteLine("Thread {0} here", oContext);
}
}

Done.Signal();
}

@roji
Copy link
Member

roji commented Sep 30, 2013

@glenebob, I ran your testcase on a very recent mono (3.2.3) on Linux and got the same behavior, this is clearly a serious mono issue...

I can file a bug with them and track it if you want.

@glenebob
Copy link
Contributor Author

Hi @roji,

I'm not sure how "serious" I'd call it. I've been doing some reading, and it seems that Microsoft's implementation isn't guaranteed to be FIFO either. It gets a lot closer, but the spec doesn't seem to say it has to even try. You can report it if you want, but I'd bet that without some standards compliance issue to reference, it will be considered low priority at best.

We could implement a true FIFO mutex, but then we're really just writing bandaids to fix bandaids. The real fix to all of this is to get down to one thread doing all the socket read operations. That will move the thread synchronization problem to a domain where FIFO'ness means nothing. The only reason FIFO'ness is of any real consequence now is that the "two threads fighting over a socket" approach is kind of asking for trouble to begin with.

-Glen

@roji
Copy link
Member

roji commented Sep 30, 2013

You're right, I don't think anyone makes guarantees about FIFO lock behavior under contention. But the behavior you show seems to me to be a pretty clear-cut case of "scheduling" starvation which would have a pretty negative impact on many other apps... Let's see what the mono people say. In any case I really don't think we should be impleenting our own mutext...

All this regardless of the specific issue of threads fighting over a socket - you're right it's best to avoid that kind of thing in the first place...

@roji
Copy link
Member

roji commented Sep 30, 2013

Here we go: https://bugzilla.xamarin.com/show_bug.cgi?id=15062

Let's wait and see...

@glenebob
Copy link
Contributor Author

glenebob commented Oct 1, 2013

Francisco,

Did you get a chance to re-test this since the latest commit?

-Glen

@franciscojunior
Copy link
Member

Hi, @glenebob! Sorry for not having commented it.
I'll review and merge it.

One question : I noticed you added a compile time condition to workaround mono bug, but I also noticed that this problem happens on mono on Windows.

I have one suggestion : what if you add the thread.sleep call without conditions? Does it give problems when running under ms .net? If it doesn't, I think we could go without adding a condition.

Another possibility could be add a runtime check for mono.

I'd rather go with option 1 though.

@glenebob
Copy link
Contributor Author

glenebob commented Oct 2, 2013

On Tue, Oct 1, 2013 at 7:16 PM, Francisco Figueiredo Jr. <
notifications@github.com> wrote:

Hi, @glenebob https://github.com/glenebob! Sorry for not having
commented it.
I'll review and merge it.

One question : I noticed you added a compile time condition to workaround
mono bug, but I also noticed that this problem happens on mono on Windows.

Oh... I assumed it was OK on mono/Windows, because the synchronization
code must be based on Windows primitives, and should therefore inherit the
quasi-FIFO behavior.

Oh, I guess you did mention that before :)

I have one suggestion : what if you add the thread.sleep call without
conditions? Does it give problems when running under ms .net? If it
doesn't, I think we could go without adding a condition.

Another possibility could be add a runtime check for mono.

I'd rather go with option 1 though.

Let's do that. I was about to suggest trying to runtime check it, but the
Sleep(1) is pretty cheap and does work fine on Windows. This code's days
are numbered anyway; as soon as I figure out how to get to an
only-one-thread-reads-the-socket design, none of this will matter.

-Glen

franciscojunior added a commit that referenced this pull request Oct 2, 2013
Async notification fixes

This will fix the CPU hog problem you were seeing, and eliminates the need for the odd Sleep() call in ProcessServerMessages().

-Glen
@franciscojunior franciscojunior merged commit 9a88a14 into npgsql:master Oct 2, 2013
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants