Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Addressed Seeking Quirks in FLAC Reader #1162

Merged
merged 1 commit into from Jan 5, 2017

Conversation

Cobaltergeist
Copy link
Contributor

Here's another fix for a bug I've identified while working on #1118. That PR seemed to fix the behavior of WAV and OGG files, but FLAC was still behaving oddly. I did some digging and figured out why:

The problem was that the underlying FLAC decoder has 2 seeking quirks that SoundFileReaderFlac::seek() and streamWrite() overlook.

First, every seek operation triggers a streamWrite() callback. Previously, this case would simply cause the function to return, but it turns out that it actually needs to do some work. When seeking, you'll land mid-block more often than not. And when that happens, the decoder feeds all the samples from the seek destination to the end of its block into streamWrite(). Ignoring those bytes is incorrect, as the next read call actually starts at the beginning of the next block. By skipping this partial block, we leave the stream at a later point than higher-level trackers (namely InputSoundFile) expect it to be. This causes the "early" short-read. To fix this I changed streamWrite() to act on these seek-decodes, and dump the output samples into the leftovers vector, where they will be picked up by the next SoundFileReaderFlac::read().

Second, the FLAC decoder does not like seeking straight to EOF. The workaround is easy. I just seek to one sample before the EOF, call FLAC__stream_decoder_skip_single_frame(), and re-clear the leftovers vector. In the process of fixing this one, I noticed that the m_channelCount variable added to address part of #966 is actually redundant. I removed it in favor of the already-available m_clientData.info.channelCount.

TL;DR, FLACs play fine from the start, but if you start jumping around, samples get dropped. And if you try to overrun the file, the underlying decoder resets and you start inadvertently playing sound from the beginning of the stream when you think you're at the end.

Overall, this is a smaller change, so hopefully it won't be too tough to review. I think I also found a small potential buffer overrun in an obscure condition within SoundFileReaderFlac::read. That code seemingly has the potential to overflow samples in the unlikely case where the leftovers buffer is larger than maxCount. It's a tiny tweak to fix, so I went ahead and did that.

@LaurentGomila
Copy link
Member

In the process of fixing this one, I noticed that the m_channelCount variable added to address part of #966 is actually redundant. I removed it in favor of the already-available m_clientData.info.channelCount.

I think it was added because m_clientData is only meant to contain the data that need to be shared between the reader instance and the FLAC callbacks, not persistent reader attributes. So strictly speaking, it was not redundancy but rather extra safety / clarity ;) But well, it's ok like this too, of course.

That code seemingly has the potential to overflow samples in the unlikely case where the leftovers buffer is larger than maxCount

It's more than a "potential overflow", it is an overflow everytime this code gets called. Because if we get there, we've already checked that leftovers.size() was greater than maxCount (ie. the size of the destination buffer).

Second, the FLAC decoder does not like seeking straight to EOF.

What happened exactly when the reader was asked to seek at EOF, before your fix?

*data->buffer++ = sample;
// If there's no output buffer, it means that we are seeking, so just skip the output write
if (data->buffer)
*data->buffer++ = sample;
data->remaining--;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What's the effect of decrementing data->remaining without adding data to the buffer? Well, is it useful at all to execute this statement if there's no buffer?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Previously, the whole function returned early when data->buffer was null. Since I removed that check, I figured I'd put a smaller one around the point where the buffer is actually dereferenced. I know that in practice, data->remaining will always be 0 when data->buffer is null (since that's the seek condition), and the data->leftovers.push_back(sample) path should always be taken then. But I'm always wary of dereferencing a pointer that hasn't been checked locally. So I just covered the single dereferencing line of code with the new check. I didn't touch the decrement because that preserves the countdown to where it will start loading the samples into the always-valid leftover buffer. But that's a bit of moot point since that should never happen anyway.

Maybe I should change the surrounding if condition to if (data->buffer && data->remaining > 0).

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe I should change the surrounding if condition to if (data->buffer && data->remaining > 0).

That's the way to go in my opinion. I'd also change the comments because now the else block is also for the seeking case.

{
// The "write" callback will populate the leftovers buffer with the first batch of samples from the
// seek destination, and since we want that data in this typical case, we don't re-clear it afterward
FLAC__stream_decoder_seek_absolute(m_decoder, sampleOffset / m_clientData.info.channelCount);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Slight improvement: I'd compute and store (and comment) sampleOffset / m_clientData.info.channelCount before the test, instead of duplicating it.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this purely for readability? The result of that division op is only used once in the if case and once in the mutually-exclusive else case. It's never repeated in any single call to seek(), so I can't see it optimizing anything.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Of course it is for maintainability reasons, not for optimizing anything. If the expression ever has to change in the future, there will only be one place to modify, not two. Duplicated code, as small as it is, is never a good thing. Plus, it gives a name to the expression, making more obvious what it represents.

But now that I think about it, the else bock should use m_clientData.info.sampleCount, because the requested sample offset may be greater than that (and in this case, subtracting one sample still leads to EOF and the case that we're trying to avoid).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Whoops. I think m_clientData.info.sampleCount was my original intention, but that somehow got left out while I was cleaning up my implementation. I'll fix that.

@Cobaltergeist
Copy link
Contributor Author

What happened exactly when the reader was asked to seek at EOF, before your fix?

The effect was described by @eXpl0it3r:

When I run my test app posted on the forum, I still see the value count up at offset 200, but it only runs about 1s and the stops playing.

Under the hood, when the FLAC decoder is asked to seek to (or beyond) the EOF, it returns a failure code, and resets its internal state, effectively seeking to 0 of its own accord. Because of this, on its next read, InputSoundFile will get back a full buffer of samples at the beginning of the file, instead of an empty buffer at the end of the file like it expects. Because of this mismatch, the tracked sample offset will overrun the sample count, and the EOF check will still be tripped, but since there's nonzero data, SoundStream can't be told to immediately clear the playing offset. Because of this, that offset will count up from the requested invalid position. To top it off, the buffer is processed, and the user will hear 1 second of data from the beginning of the file, before it loops or stops, correcting itself.

(What happens now is that InputSoundFile gets its empty buffer, the tracked offset stays at the EOF, the EOF check trips fully, the offset is reset, and the player stops/loops immediately).

@LaurentGomila
Copy link
Member

Ok I see. I feel like there's a cleaner way to avoid this issue, without writing specific code for this special case, but I can't find one so I'm fine with your fix.

@Cobaltergeist
Copy link
Contributor Author

If you're talking about the special case of seeking to the EOF, there is one other thing I have in mind. Instead of rigging up a real "seek" to EOF, the reader could just seek to zero and set a flag that will cause the next read() to return empty. Not sure if that's any better though, as it adds a special case of its own, but it means we wouldn't have to trick the decoder.

But if you're talking about the error case(s) in general, I think the best permanent solution would be to change the call/return sequence from Soundstream all the way down to the readers. Currently, everything fits together in a pretty fragile manner. This makes stuff like EOF detection and Loop Point addition tricky to reliably achieve within the constraint of preserving the API. I think once 3.0 comes around, we should rewrite the buffer-filling stack so that more information is propagated in parameters and returns than just sample counts and bools, so the classes don't have to go out of their way to infer the state of lower levels and coax state-changes in upper levels.

@LaurentGomila
Copy link
Member

Yes, we definitely need more robust error handling there.

@Cobaltergeist
Copy link
Contributor Author

I moved the buffer check, and tweaked the EOF seek to use the sample count.

I didn't pursue the "EOF flag" idea that I brought up, as nobody weighed in on it.

@@ -133,7 +133,6 @@ class SoundFileReaderFlac : public SoundFileReader
////////////////////////////////////////////////////////////
FLAC__StreamDecoder* m_decoder; ///< FLAC decoder
ClientData m_clientData; ///< Structure passed to the decoder callbacks
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Now that m_channelCount is gone, you can remove two spaces before the comments XD

@eXpl0it3r
Copy link
Member

@Cobaltergeist Can you rebase this onto master?
@LaurentGomila Can you do a (final) review?

@Cobaltergeist
Copy link
Contributor Author

Rebased.

@LaurentGomila
Copy link
Member

Good for me, if this has been tested thoroughly.

@Cobaltergeist
Copy link
Contributor Author

Cobaltergeist commented Nov 16, 2016

I feel like this should be able to make it into 2.4.2, considering it's approved already, and also because the 2.4.2 project has Issue #966, which should be fully fixed once this PR gets merged.

@eXpl0it3r
Copy link
Member

Does this depend on your previous PR? We weren't able to include that in 2.4.1, because it added (or removed) functions, thus changing the ABI. If it doesn't depend on your previous changes, could you rebase the PR onto 2.4.x instead of master?

@Cobaltergeist
Copy link
Contributor Author

Wait, I thought my previous change was in 2.4.1. I guess it was changed since I last looked at it because it affected the ABI. Technically this is a non-conflicting change, but it still makes the most sense alongside #1118 even if it's not a strict prerequisite. So forget my last comment, we can keep this branch as-is.

That said, why is #966 in 2.4.2 if the PRs resolving it are in 2.5.0?

@eXpl0it3r
Copy link
Member

eXpl0it3r commented Nov 16, 2016

That said, why is #966 in 2.4.2 if the PRs resolving it are in 2.5.0?

Because it got added without knowing what it depends on. 😉
I changed it back to 2.5.0 just now.

@eXpl0it3r eXpl0it3r merged commit 5eaa748 into SFML:master Jan 5, 2017
@SFML SFML deleted a comment from eXpl0it3r Dec 31, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
No open projects
SFML 2.5.0
  
Merged / Superseded
Development

Successfully merging this pull request may close these issues.

None yet

4 participants