Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix for #956 (GUI freeze on Windows) #980

Closed
wants to merge 26 commits into from
Closed

Fix for #956 (GUI freeze on Windows) #980

wants to merge 26 commits into from

Conversation

Diapolo
Copy link

@Diapolo Diapolo commented Mar 23, 2012

Okay, so my former assumption boost 1.47 would fix #956 was wrong and boost 1.49 did not fix it either. It turns out, that I could recreate the error condition by running Bitcoin-Qt and simply switch of my PC. So there were the 2 orphan files from boost interprocess that cause the problem.

I found out, that no interprocess_exception was thrown, not in qtipcserver.cpp and not in bitcoin.cpp. Next step was to look through the code and the "create_or_open" option in the message_queue constructor got my attention. I changed this to create_only and voila there was an exception I could work with. It sais "file already exists", which makes sens as there are those orphan files left.

In the end I introduced a new ipcRecover() function, which is called after catching interprocess::already_exists_error. This function works with the path were the stale message queue file "BitcoinURL" resides and checks if the file is there and tries to remove it. If that succeeds the function returns true and ipcInit() is called again. If this fails an error is written to the debug.log.

Test case with 1.47 boost libs:

  • kill running Bitcoin-Qt instance by switching PC of
  • check for stale files, which are there
  • starting the client again with the fix disabled in the code
  • no exception, no log entries and a frozen GUI
  • starting the client again with fix enabled results in a not frozen GUI and normal client operation

Log:
ipcInit - boost interprocess exception #9: File exists.
ipcRecover - old message queue found, trying to remove C:\ProgramData\boost_interprocess\BitcoinURL ...success

I know there are many commits in here, so I would suggest to first only look at the diff and if you are interested in the history of the fix take a look at the commits itself.

As I only run Windows the Linux / Mac devs should test this code, during our discussion.

@TheBlueMatt
Copy link
Contributor

How does bitcoin-qt cope if you start a second instance which runs this patch (ie a testnet one and a mainnet one at the same time)?

// try a recovery to fix #956 and pass our message queue name
if (ipcRecover("BitcoinURL"))
// if that worked try init once more
ipcInit();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Though it looks like it should probably never happen, the possibility of an infinite recursion here scares me.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I had the same in my mind and came to the conclusion it should never happen ... I don't decide it, but I guess there are several places in the code where are theoretical chances that something bad happens, you could surround nearly everything with try - catch. But let's discuss it :).

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Or just add a simple flag to say, dont try again if you fail here...

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, that's even better.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Don't shoot me, but I suggest goto...

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wont ever use a goto in C-like languages ^^ ... you would have to force me to do this :D!

@Diapolo
Copy link
Author

Diapolo commented Mar 23, 2012

@TheBlueMatt I will have to try this ... never used testnet before (does this work in the default and same data-dir?). But I have an idea if it causes problems in the current state. Perhaps I could add an "testnet" string to the message_queue name, which is now "BitcoinURL". Is there a utility function to query where I mine?

@TheBlueMatt
Copy link
Contributor

Adding a testnet string doesnt fix the problem, there are also valid uses for opening two bitcoin-qt.exes on mainnet at the same time (using different datadirs and ports).

@Diapolo
Copy link
Author

Diapolo commented Mar 23, 2012

@TheBlueMatt I will do some tests on this tomorrow. I don't even know if there are issues with your mentioned case at all, but thanks for your input :)!

@sipa
Copy link
Member

sipa commented Mar 24, 2012

  1. can you rebase this into one commit? (15 is a bit excessive for a bug fix...)
  2. there seem to be some coding style errors and unneccesary code moves.
  3. if this indeed fixes 0.6rc4 gui frozen on Windows 7 #956, nice! that was the last blocking bug for 0.6.0, afaik.

using namespace boost;
using namespace std;

// Global state
bool fInitCalledAfterRecovery = false;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please make this a parameter to ipcInit instead of a global

@Diapolo
Copy link
Author

Diapolo commented Mar 24, 2012

@sipa What do you mean by coding style errors, would be nice if you can point me to them, so I can fix it. I only added so many commits, to get some kind of history, will rebase.

@@ -256,6 +257,8 @@ int main(int argc, char *argv[])
mq.try_send(strURL, strlen(strURL), 0);
}
catch (boost::interprocess::interprocess_exception &ex) {
printf("boost interprocess exception #%d: %s\n", ex.get_error_code(), ex.what());
break;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

as you asked for coding style errors: please indent this break the same depth as the printf :)

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, used tabs there, doh ...

@Diapolo
Copy link
Author

Diapolo commented Mar 24, 2012

Question on coding style, is it common to only use ex for exeption handling, no matter what type they are? If yes, I will revert that, too. And what about boost::interprocess::message_queue mq -> boost::interprocess::message_queue mqMessageQueue, good or bad? I'm asking, because in bitcoin.cpp it's still mq.

@Diapolo
Copy link
Author

Diapolo commented Mar 24, 2012

But back to the test case ... as I currently can't compile Bitcoin-Qt with static libs, I'm not able to run 2 instanced at the same time. But I think this definately needs some further investigation and clarification!

So 2 instances with different ports and different data-dirs currently work on RC4? They even share the same message queue, right? Then this fix needs to take this into account and I have an idea, but need your thoughts.

Current fix version will do the following (I assume now they share the message queue file BitcoinURL):

  • start 1st instance, BitcoinURL will be created, because of interprocess::create_only flag in ipcInit() -> OK
  • start 2nd instance, BitcoinURL can't be created, because it's already there and the files is locked by the 1st instance -> triggers the exception interprocess::already_exists_error and tries to remove the file via ipcRecover(true)
  • this will cause an system::errc::broken_pipe error, which is catched and BitcoinURL is not deleted and should be normaly used in the further client execution, because in bitcoin.cpp we use "interprocess::message_queue mq(interprocess::open_only, "BitcoinURL");", which should work. Running 2 instances would trigger some entries to debug.log, but should not cause a program failure. What are your oppinions?

Edit: I'm off for a few hours, will be back later :).

@laanwj
Copy link
Member

laanwj commented Mar 24, 2012

So to summarize: If you start multiple instances, it will work, but the most recent instance will get the URLs?

That behavior is OK with me.

I don't think many people start multiple instances of bitcoin, and if they do, they generally have no use case involving a browser (more for servers etc...). However, it should be possible to start multiple instances with multiple datadirs.

@Diapolo
Copy link
Author

Diapolo commented Mar 24, 2012

Okay, so the latest version is more friendly in terms of what is displayed in the debug.log.

stale message queue IS there and can be removed:
ipcInit - boost interprocess exception #9: File already exists.
ipcRecover - possible stale message queue found, trying to remove C:\ProgramData\boost_interprocess\BitcoinURL ...success

This tries ipcInit() again!


message queue IS there and locked (this would be if another instance is running):
ipcInit - boost interprocess exception #9: File already exists.
ipcRecover - possible stale message queue found, trying to remove C:\ProgramData\boost_interprocess\BitcoinURL ...unneeded

This doesn't try ipcInit() again!


(stale?) message queue IS there, not locked, but cannot be removed:
ipcInit - boost interprocess exception #9: File already exists.
ipcRecover - possible stale message queue found, trying to remove C:\ProgramData\boost_interprocess\BitcoinURL ...failed
ipcRecover - removal of old message queue failed with error (value): (message)

This doesn't try ipcInit() again! But URL handling will most likely not work ...


Edit: I will rebase after the discussion is finished, if this is okay!

message_queue* mq = (message_queue*)parg;
std::string strIpcDir;
// get path to stale ipc message queue file
interprocess::ipcdetail::tmp_filename(pszFilename, strIpcDir);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is no interprocess::ipcdetail on Linux.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Investigating... got it, with boost 1.47 this is "detail" instead of "ipcdetail" for 1.49 ... will see if the fix still compiles with 1.47 (I was on 1.49 in my build-lab).

I would vote for a switch to 1.49 in the near future!

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed with last commit, compiles again with boost 1.47!

@sipa
Copy link
Member

sipa commented Mar 25, 2012

Since this is a workaround for a boost-specific windows-only problem, I have no problem with using undocumented and windows-only boost internals. You do need to make sure it keeps working on other systems though, so enclose the fix in #ifdef WIN32 / #else / #endif, and try to keep the windows-specific part as small as possible. You'll probably need other #ifdefs to select specific boost versions as well, but for now that seems the only option. At least building on non-windows systems should keep working with all (recent) boost versions.

Regarding explicit namespace or using namespace: do whatever makes the code easiest to read. I don't care.

@Diapolo
Copy link
Author

Diapolo commented Mar 25, 2012

For now I will add the needed #ifdefs, so that the fix is activated only on Windows. After we are sure this works and breaks nothing, I will definately look into how I can write this in a platform independent way (new pull request and no blocking change ^^).

Oh and thanks for the "using namespace" clarification ... that was in my mind, too. Easier to read code and better understandable code.

@gavinandresen
Copy link
Contributor

It feels to me like this feature (click-to-open bitcoin: URIs) needs a re-think on Windows; poking boost::interprocess internals is a bad idea that we should avoid if at all possible.

For the 0.6 release, I'm inclined to disable click-to-open bitcoin: URIs on Windows entirely, and get a cleaner fix for 0.7.

@Diapolo
Copy link
Author

Diapolo commented Mar 25, 2012

Tell me the intended release date for 0.6 ... I'm trying to get a cleaner fix ready.

@TheBlueMatt
Copy link
Contributor

Due to the massive list of changes since rc4, we really need to push an rc5 and give it a while to simmer before we can really release a 0.6 release (IMO). In that case, I have no problem merging this and poking windows-only boost internals, as long as all the changes in here are within #ifdef WIN32's

@Diapolo
Copy link
Author

Diapolo commented Mar 25, 2012

May I suggest the following, as this one has plenty of commits, I will rebase it into a single one with the current code here.

It could then be used in a possible RC5, but I would really like a dev to test a compilation and execution with Linux / Mac before, as I can't do this. During the testing phase of a possible RC5 I have time to work on a cross-platform compatible version without the use of boost::interprocess internals.

@sipa
Copy link
Member

sipa commented Mar 25, 2012

Don't worry, we won't merge it if it doesn't compile cleanly.

@Diapolo
Copy link
Author

Diapolo commented Mar 25, 2012

Last commit does not use any boost::interprocess internals, it works and removes stale message queues (tested this after I once more crashed / switched of my Windows machine to create a stale mq).

If an existing message queue is detected the first try is to remove it. If it's locked (i.e. by another BC instance) that won't work and I try ipcInit() again with the open_or_create mode on the message queue to use it.
If the stale message queue could be removed, ipcInit() is called once more with the create_only mode for the mq and a new one is created.

Perhaps I can now even safely remove the #ifdef WIN32, but this should be checked by the Linux / Mac devs.

Edit: The old ipcRecover() is in just for reference during the discussion, I would like to remove that one and change ipcRecover2() into ipcRecover()!

@TheBlueMatt
Copy link
Contributor

An alternative to this that (should) work and also fixes #981 can be found at:
TheBlueMatt@1b327da
though its more of a hack since I really, really dont feel like diving into the win32 apis and figuring out whats going on in that ugly mess.

@Diapolo
Copy link
Author

Diapolo commented Mar 25, 2012

@TheBlueMatt I compiled Bitcoin-Qt with boost 1.49 and that did NOT fix this bug. Is your commit a backport from boost > 1.47 to 1.47?

@TheBlueMatt
Copy link
Contributor

No, that commit is something I wrote, afaict boost has not yet fixed the bug (though it was filed 12 months ago as https://svn.boost.org/trac/boost/ticket/5392 )

@gavinandresen
Copy link
Contributor

I'm going to be the "bad guy" for this one and take the conservative route: lets get this fixed and thoroughly tested early in the 0.7 release, and disable bitcoin: URI handling on Windows for the 0.6 release.

@sipa
Copy link
Member

sipa commented Mar 26, 2012

Given #985, it seems URI's never worked on windows anyway...

@Diapolo
Copy link
Author

Diapolo commented Mar 26, 2012

The URI feature is little known, so perhaps the problem with URI handling in general was not observed, since no one ever used it on Windows (i.e. because currently there are very few links out in the internet).

@gavinandresen I'm fine with 0.7 ... you would only be the bad guy, if you said that code sucks all over the place, go away ;) ... but you didn't do that.

Philip Kaufmann added 22 commits March 27, 2012 11:47
…, if access to the file is denied and made ipcRecover() return a bool
…erprocess:detail / interprocess::ipcdetail internals
@Diapolo
Copy link
Author

Diapolo commented Mar 27, 2012

Rebased with current master and fixed up a little git mess I created ...

@Diapolo
Copy link
Author

Diapolo commented Mar 29, 2012

Doh, this one seems completely obsolete in the light of my current work on the IPC server and URI-handling code. It's a pain in Windows to use a "file" for doing IPC communication it seems (at least when things go wrong ^^). There a severe handling problems with stale / wrong mq files and the combination of 2 Bitcoin-Qt instances ... but I made some good progress.

I would like to close this one tomorrow and will publish my work until the weekend, so it can be discussed and reviewed!

@Diapolo
Copy link
Author

Diapolo commented Apr 1, 2012

Closed!

@Diapolo Diapolo closed this Apr 1, 2012
suprnurd pushed a commit to chaincoin-legacy/chaincoin that referenced this pull request Dec 5, 2017
ptschip pushed a commit to ptschip/bitcoin that referenced this pull request Mar 7, 2018
lateminer pushed a commit to lateminer/bitcoin that referenced this pull request Oct 30, 2019
c7fa318 Remove Bitcoin Core 0.8 block hardlinking (JSKitty)

Pull request description:

  This isn't used in PIVX since none of our clients are running with Bitcoin Core's old block file structure, time to kill it?

ACKs for top commit:
  furszy:
    Pretty straightforward cleanup 👍 ,  utACK [c7fa318](PIVX-Project@c7fa318)
  random-zebra:
    utACK PIVX-Project@c7fa318 and merging...

Tree-SHA512: 8e9fc875c6367d102685861c5ab6f02b6b634d6bfd52ba8817c776523339f0f3359cb5d2ca071ef8f6b7773ee8ac7a1fed5b5bdd7144aa66751b459facc43294
@bitcoin bitcoin locked as resolved and limited conversation to collaborators Sep 8, 2021
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

0.6rc4 gui frozen on Windows 7
6 participants