LabRecorder deadlocks writing a streams footer, if it has not yet received anything from this stream #9

Closed
agricolab opened this issue Dec 3, 2018 · 10 comments

@agricolab
Contributor

agricolab commented Dec 3, 2018

I noticed that LabRecorderCLI won't shut down after sending process.communicate('b\n') (or pressing Enter when run from a terminal). Instead, it hangs at "Offsets thread is finished". The same happens with LabRecorder.exe. I traced it to the recorder waiting for a stream that has not yet sent any sample. Once I sent a sample, it wrote the footer for the stream, closed the XDF file, and returned.

Don't know whether it's a bug or a feature, but probably worthwhile to know.

@agricolab agricolab changed the title from "LabRecorderCLI deadlocks writing a streams footer, if it has not yet received anything from this stream" to "LabRecorder deadlocks writing a streams footer, if it has not yet received anything from this stream" Dec 3, 2018
@tstenner tstenner self-assigned this Dec 4, 2018
@tstenner tstenner added the bug Something isn't working label Dec 5, 2018
@tstenner
Contributor

tstenner commented Dec 5, 2018

Definitely a bug, I was already looking into something related to this but a stream without data is a lot easier to reproduce.

@kowalej

kowalej commented Dec 14, 2018

Hey, I just uncovered this bug myself. The problem is that the calls to pull_sample and pull_chunk_multiplexed do not pass a timeout parameter, so they use the LSL default, which is "FOREVER". As a result, typed_transfer_loop never re-checks the shutdown flag, and the thread doesn't finish properly. If the inlet actually has data to pull, the code won't hang on the pull calls, but for a "dead" stream it just keeps waiting.

recording.cpp typed_transfer_loop
		first_timestamp = last_timestamp = in->pull_sample(chunk);
		timestamps.push_back(first_timestamp);
		file_.write_data_chunk(streamid, timestamps, chunk, in->get_channel_count());

		auto next_pull = Clock::now();
		while (!shutdown_) {
			// get a chunk from the stream
			in->pull_chunk_multiplexed(chunk, &timestamps);
			// for each sample...
			for (double &ts : timestamps) {
				// if the time stamp can be deduced from the previous one...
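
The effect of bounding each pull with a timeout can be sketched without liblsl. This is a minimal illustration, not the LabRecorder code: `PullFn` and `transfer_loop` are hypothetical names, and `PullFn` only mimics the LSL convention that a pull returns the sample's timestamp, or 0.0 if nothing arrived before the timeout.

```cpp
#include <atomic>
#include <cassert>
#include <functional>
#include <vector>

// Hypothetical stand-in for an inlet pull: returns the sample's timestamp,
// or 0.0 if nothing arrived within `timeout` seconds.
using PullFn = std::function<double(double timeout)>;

// Collects timestamps until `shutdown` is set. Because every pull is bounded
// by `timeout`, the flag is re-checked after each pull returns, even when the
// stream never delivers any data.
std::vector<double> transfer_loop(const PullFn &pull,
                                  const std::atomic<bool> &shutdown,
                                  double timeout = 0.5) {
    assert(timeout > 0.0);  // an unbounded wait would reintroduce the hang
    std::vector<double> timestamps;
    while (!shutdown) {
        const double ts = pull(timeout);
        if (ts != 0.0) timestamps.push_back(ts);  // 0.0 means "timed out"
    }
    return timestamps;
}
```

With a pull that never returns data, the loop still exits as soon as the flag is set, because no single call can block longer than `timeout`.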

@tstenner
Contributor

Thanks @kowalej, that's exactly what I was looking for. I've committed a fix in fded925 and will upload binaries for that soon.

@kowalej

kowalej commented Dec 17, 2018

@tstenner I just finished doing a PR for the console message stuff and noticed that your commit fded925 has a potential bug.

//recording.cpp Line 377
while(first_timestamp == 0.0)
	first_timestamp = last_timestamp = in->pull_sample(chunk, 4.0); 

Since you are in a while loop and pull_sample keeps returning 0.0 whenever it times out, you will get stuck in this loop during shutdown, because the loop doesn't also check the shutdown state.

It should be:

while (!shutdown_ && first_timestamp == 0.0)
	first_timestamp = last_timestamp = in->pull_sample(chunk, 4.0); 
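
The difference the extra condition makes can be checked in isolation. The sketch below is an assumption-laden stand-in rather than the actual recording.cpp code: `wait_for_first_sample` and `PullFn` are hypothetical, and the fake pull simply returns 0.0 to imitate a timed-out `pull_sample`.

```cpp
#include <atomic>
#include <cassert>
#include <functional>

// Hypothetical stand-in for in->pull_sample(chunk, timeout): returns the
// timestamp of the pulled sample, or 0.0 if the call timed out.
using PullFn = std::function<double(double timeout)>;

// Waits for the first sample, but gives up once `shutdown` is set.
// Returns 0.0 if shutdown was requested before any sample arrived.
double wait_for_first_sample(const PullFn &pull,
                             const std::atomic<bool> &shutdown,
                             double timeout = 4.0) {
    assert(timeout > 0.0);
    double first_timestamp = 0.0;
    while (!shutdown && first_timestamp == 0.0)
        first_timestamp = pull(timeout);
    return first_timestamp;
}
```

A caller would need to treat a return value of 0.0 as "no data was ever received" and skip writing any samples for that stream.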

@tstenner
Contributor

You're right, I fixed it.

@cboulay
Contributor

cboulay commented Jan 9, 2019

We're still getting this bug.
Also the status bar kb-written seems to be broken.

@xbbsky

xbbsky commented Jan 10, 2019

@cboulay I've fixed the status bar kb-written bug:

//mainwindow.cpp line 91 should be
auto fileinfo = QFileInfo(ui->rootEdit->text()+QDir::separator()+recFilename);

@cboulay
Contributor

cboulay commented Jan 10, 2019

@xbbsky That didn't do much for me because it turns out that the file size wasn't incrementing. When I reduced the timeout parameter, everything seemed to fall into place. Thanks for your comment on that commit.

@cboulay
Contributor

cboulay commented Jan 10, 2019

Maybe there's still some discussion to be had as to what the timeout parameter should be, but the bug as described in this issue is fixed.
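
One way to think about the timeout trade-off: a blocked pull can only notice the shutdown flag once its timeout expires, so shutdown latency grows with the timeout. The sketch below is a hypothetical illustration and not what LabRecorder does. It assumes a helper named `pull_in_slices` that splits a long overall wait into short pulls and checks the flag between them; `PullFn` again stands in for a pull that returns 0.0 on timeout.

```cpp
#include <algorithm>
#include <atomic>
#include <cassert>
#include <functional>

// Hypothetical stand-in for a pull with timeout: returns the sample's
// timestamp, or 0.0 if nothing arrived within `timeout` seconds.
using PullFn = std::function<double(double timeout)>;

// Waits up to `total_timeout` seconds for a sample, but in pieces of at most
// `slice` seconds, so a shutdown request is noticed after one slice at worst.
// Returns the timestamp, or 0.0 on timeout or shutdown.
double pull_in_slices(const PullFn &pull,
                      const std::atomic<bool> &shutdown,
                      double total_timeout,
                      double slice) {
    assert(slice > 0.0);
    double remaining = total_timeout;
    while (!shutdown && remaining > 0.0) {
        const double t = std::min(slice, remaining);
        const double ts = pull(t);
        if (ts != 0.0) return ts;  // got a sample
        remaining -= t;            // timed out, spend one slice
    }
    return 0.0;
}
```

With a 4-second overall wait and 0.5-second slices, a dead stream costs at most eight short pulls, and a shutdown request is honored after the pull that is currently in flight.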

@cboulay cboulay closed this as completed Jan 10, 2019
@kowalej

kowalej commented Jan 11, 2019

I also experienced this issue again recently and setting a low value for timeout definitely fixed it. However, I don't really understand why it fixed the issue since the timeout was only a few seconds previously. Also, I may just be crazy but I remember the previous fix (with checking the shutdown parameter) as having resolved this issue for me before the holiday. Now it's a few weeks later and my recent tests weren't shutting down properly. It's like this issue cropped up again out of the blue.
