Loop start/end position may be wrong (slightly off) #4

Open
segfault-bilibili opened this issue Mar 3, 2022 · 9 comments

Comments

@segfault-bilibili

segfault-bilibili commented Mar 3, 2022

Currently it seems to be:

  1. loop start sample offset = loop start block index * 0x400
  2. loop end sample offset = loop end block index * 0x400

The loop header section seems to be interpreted as:

  1. loop start block index: big endian unsigned 32-bit integer
  2. loop end block index: big endian unsigned 32-bit integer
  3. loop cycle count (a value of 128 means infinite): big endian unsigned 16-bit integer
  4. loop r01 (sorry, but I can't understand what "r01" means): big endian unsigned 16-bit integer

However, according to VGAudio:

https://github.com/Thealexbarney/VGAudio/blob/9d8f6ea04c83cccccb3dd7851a631bbd53a8dbbe/src/VGAudio/Codecs/CriHca/HcaInfo.cs#L35

        public int LoopStartSample => LoopStartFrame * 1024 + PreLoopSamples - InsertedSamples;
        public int LoopEndSample => (LoopEndFrame + 1) * 1024 - PostLoopSamples - InsertedSamples;

https://github.com/Thealexbarney/VGAudio/blob/9d8f6ea04c83cccccb3dd7851a631bbd53a8dbbe/src/VGAudio/Containers/Hca/HcaReader.cs#L189

        private static void ReadLoopChunk(BinaryReader reader, HcaStructure structure)
        {
            structure.Hca.Looping = true;
            structure.Hca.LoopStartFrame = reader.ReadInt32();
            structure.Hca.LoopEndFrame = reader.ReadInt32();
            structure.Hca.PreLoopSamples = reader.ReadInt16();
            structure.Hca.PostLoopSamples = reader.ReadInt16();
            structure.Hca.SampleCount = Math.Min(structure.Hca.SampleCount, structure.Hca.LoopEndSample);
        }
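
For anyone following along outside C#, the VGAudio reading quoted above can be sketched in Python. This is only a minimal sketch under stated assumptions, not VGAudio's code: it assumes the 12 bytes after the chunk signature are big endian (as described at the top of this issue), and `LoopInfo`, `parse_loop_chunk`, the `inserted_samples` argument, and the example bytes are all made up for illustration.

```python
import struct
from dataclasses import dataclass

SAMPLES_PER_FRAME = 1024  # 0x400, the frame size used in the formulas above


@dataclass
class LoopInfo:
    start_frame: int
    end_frame: int
    pre_loop_samples: int
    post_loop_samples: int

    def start_sample(self, inserted_samples: int) -> int:
        # Mirrors LoopStartSample from the quoted HcaInfo.cs line.
        return (self.start_frame * SAMPLES_PER_FRAME
                + self.pre_loop_samples - inserted_samples)

    def end_sample(self, inserted_samples: int) -> int:
        # Mirrors LoopEndSample from the quoted HcaInfo.cs line.
        return ((self.end_frame + 1) * SAMPLES_PER_FRAME
                - self.post_loop_samples - inserted_samples)


def parse_loop_chunk(payload: bytes) -> LoopInfo:
    # Assumption: payload holds the 12 bytes after the chunk signature,
    # laid out as two signed 32-bit and two signed 16-bit big-endian values.
    return LoopInfo(*struct.unpack(">iihh", payload[:12]))


# Invented example bytes: frames 2..9, pre-loop 0x80, post-loop 0x40.
payload = bytes.fromhex("00000002" "00000009" "0080" "0040")
loop = parse_loop_chunk(payload)
print(loop.start_sample(inserted_samples=128))  # 2048
print(loop.end_sample(inserted_samples=128))    # 10048
```

Under the block-index reading, the same bytes would give a loop start of 2 * 0x400 = 2048 only by coincidence, because the pre-loop samples and inserted samples happen to cancel in this example.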

@segfault-bilibili changed the title from "Loop start/end position may be wrong" to "Loop start/end position may be wrong (slightly off)" on Mar 3, 2022

@segfault-bilibili
Author

(Sorry for mistyping something above. Now it should be corrected)

I'm not very sure which interpretation of the loop header is correct. Or maybe both make some sense?

@segfault-bilibili
Author

@Thealexbarney

@Thealexbarney

Well, think about it a little. If your first two points are true, then no HCA file could have a loop that's not a multiple of 0x400. This would be extremely restrictive and result in funky loop points, since they would all have to be multiples of 0x400.

The structure of the loop block is

int LoopStartFrame;
int LoopEndFrame;
short PreLoopSamples;
short PostLoopSamples;

The pre-loop samples are the number of samples in the loop start frame that come before the loop point. The post-loop samples are the number of samples in the loop end frame that come after the loop point.
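
To make the pre-loop/post-loop description concrete, here is a rough sketch that goes the other way: it splits an arbitrary loop position into a frame index plus a remainder. It is plain arithmetic derived from the two formulas quoted earlier in this issue, not the strategy any real encoder uses, and `split_loop_start` and `split_loop_end` are hypothetical helper names.

```python
SAMPLES_PER_FRAME = 1024  # 0x400


def split_loop_start(sample: int, inserted_samples: int = 0):
    """Return (LoopStartFrame, PreLoopSamples) for a loop start sample."""
    pos = sample + inserted_samples  # position in the encoded stream
    return divmod(pos, SAMPLES_PER_FRAME)


def split_loop_end(sample: int, inserted_samples: int = 0):
    """Return (LoopEndFrame, PostLoopSamples) for a loop end sample (> 0)."""
    pos = sample + inserted_samples
    frame = -(-pos // SAMPLES_PER_FRAME) - 1  # ceil(pos / 1024) - 1
    post = (frame + 1) * SAMPLES_PER_FRAME - pos
    return frame, post


start_frame, pre = split_loop_start(5000)
end_frame, post = split_loop_end(90000)
print(start_frame, pre)  # 4 904
print(end_frame, post)   # 87 112

# Reading only the frame index would place the loop start 904 samples early.
print(5000 - start_frame * 0x400)  # 904
```

Plugging the results back into the formulas gives 4 * 1024 + 904 = 5000 and (87 + 1) * 1024 - 112 = 90000, so the frame index alone cannot describe these loop points.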

Or maybe both make some sense?

Nah, that decoder's interpretation of those values in the structure is completely wrong.

My decoder/encoder should be completely correct and has been thoroughly tested against CRI's decoders/encoders that are available.

@segfault-bilibili
Author

If your first two points are true

Well, actually it's not "my" point - I didn't know HCA at all until I came across https://github.com/y2361547758/hca.js, which is a TypeScript port of this project (https://github.com/Nyagamon/HCADecoder). Even now I still have no idea how the stock/official HCA decoder works (which would require reverse engineering).

no HCA file could have a loop that's not a multiple of 0x400. This would be extremely restrictive and result in funky loop points, since they would all have to be multiples of 0x400.

That's actually exactly what I had thought. However, I could not work out where the more precise loop start/end positions would be stored until I came across your project (https://github.com/Thealexbarney/VGAudio).

I think Nyagamon's interpretation of the loop header may make some sense because every HCA file I have examined that loops infinitely in game (admittedly only a few) seems to have loop.PreLoopSamples == 0x0080. So I guessed that 0x0080 probably means "loop count is infinite".

@segfault-bilibili
Author

By the way, I wonder whether these are signed or unsigned integers? It doesn't seem to make sense to use negative values here.

@Thealexbarney

Even now I still have no idea how the stock/official HCA decoder works (which would require reverse engineering).

I've reverse engineered the HCA encoder/decoder. The implementation in VGAudio is functionally the same as the official one, producing the exact same data output for both encoding and decoding, so it makes a good reference.

(Note: I replaced the IMDCT implementation CRI uses with a faster one. The only difference in the output from the current master VGAudio build will be due to tiny rounding differences. When using the IMDCT implementation CRI uses, the outputs are identical.)

I think Nyagamon's interpretation of the loop header may make some sense because every HCA file I have examined that loops infinitely in game (admittedly only a few) seems to have loop.PreLoopSamples == 0x0080. So I guessed that 0x0080 probably means "loop count is infinite".

No, that's because of how the encoder works. The encoder inserts a subframe of audio at the start because decoding a subframe requires some of the data from the previous subframe.
Then the encoder adds enough samples to align the loop start to the beginning of a frame so the minimum amount of processing is needed to seek to the loop point.

This results in the loop start being one subframe past the start of a frame since the decoder needs data from the previous subframe to decode the next one.

BTW, be sure to account for the InsertedSamples and AppendedSamples when doing everything. These are empty samples added to the beginning and end of the actual audio because encoding to HCA requires the number of samples to be a multiple of the frame size.
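
One way to read the explanation above is sketched below, showing why PreLoopSamples tends to come out as 0x80. This is an illustrative model only, not the code of VGAudio or CRI's encoder: it assumes the inserted samples are one 128-sample subframe plus whatever padding moves the loop start onto a frame boundary, and that the appended samples simply pad the total to a multiple of the frame size. All function names are hypothetical.

```python
SUBFRAME = 128  # samples per subframe
FRAME = 1024    # samples per frame


def inserted_samples_for(loop_start: int) -> int:
    # Padding that would move the loop start onto a frame boundary...
    alignment = (-loop_start) % FRAME
    # ...plus the extra leading subframe the decoder needs as history.
    return SUBFRAME + alignment


def appended_samples_for(sample_count: int, inserted: int) -> int:
    # Assumption: pad the end so the total is a whole number of frames.
    return (-(sample_count + inserted)) % FRAME


def pre_loop_samples(loop_start: int) -> int:
    return (loop_start + inserted_samples_for(loop_start)) % FRAME


inserted = inserted_samples_for(5000)
appended = appended_samples_for(100000, inserted)
print(inserted, appended)            # 248 104
print(hex(pre_loop_samples(5000)))   # 0x80
```

Under this model every looped file ends up with a pre-loop value of 0x80 regardless of where the loop starts, which would match the observation about the examined files without 0x80 meaning "infinite".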

@Thealexbarney

By the way, I wonder whether these are signed or unsigned integers? It doesn't seem to make sense to use negative values here.

They're signed, but it doesn't really matter since they won't get anywhere near the limit of the signed types.
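
A quick sketch of why the signedness makes no practical difference for realistic values: the two readings only disagree once the top bit is set. The byte strings here are invented for illustration, and big-endian order is assumed as elsewhere in this issue.

```python
import struct

# Invented example: a frame index of 300 and a sample count of 128.
raw = bytes.fromhex("0000012c" "0080")

signed = struct.unpack(">ih", raw)    # signed 32-bit, signed 16-bit
unsigned = struct.unpack(">IH", raw)  # unsigned 32-bit, unsigned 16-bit
print(signed, unsigned)  # (300, 128) (300, 128)

# The readings diverge only when the high bit is set.
high_bit = bytes.fromhex("8000")
print(struct.unpack(">h", high_bit)[0])  # -32768
print(struct.unpack(">H", high_bit)[0])  # 32768
```

Since a pre-loop or post-loop count stays below the 1024-sample frame size, it never gets near the 16-bit limit.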

@segfault-bilibili
Author

Thank you very much!

decoding a subframe requires some of the data from the previous subframe.

  1. Is a subframe 128 samples long (and does a frame consist of 8 subframes)? I once observed this in a hex editor but I'm still not sure how long the "influence" would last.

  2. Is any successive (following) data also needed to decode one frame? I once heard that (I)MDCT has reference to both previous and successive data.


To be honest I know almost nothing about signal processing etc... I feel sorry if my noob questions occupied your time. However these two questions should be the last ones I want to ask.

Again, thanks a lot!

@Thealexbarney

  1. Is a subframe 128 samples long (and does a frame consist of 8 subframes)?

Yes. This is true for all HCA files.

Side thought: Oops, I just noticed that naming inconsistency

public const int SubframesPerFrame = 8;
public const int SubFrameSamplesBits = 7;

  2. Is any successive (following) data also needed to decode one frame? I once heard that (I)MDCT has reference to both previous and successive data

Oversimplifying enough to answer your question, decoding subframe (SF) N requires only the encoded data from both SF N-1 and SF N. It doesn't need any data from any other SF, so it doesn't need data from either SF N-2 or SF N+1.

For example, decoding the audio in subframe 4 requires only the encoded data from both SF 3 and SF 4. It doesn't need any data from SF 2, SF 5 or any other SF.

This is why an extra subframe is inserted at the beginning during encoding and thrown out as garbage when decoding.
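
The frame/subframe layout and the dependency rule described above can be sketched as follows. This is only bookkeeping around the constants quoted in this thread (8 subframes per frame, 1 << 7 = 128 samples per subframe); `locate` and `subframes_needed` are hypothetical helpers, and no actual decoding is performed.

```python
SUBFRAMES_PER_FRAME = 8
SUBFRAME_SAMPLES_BITS = 7
SUBFRAME_SAMPLES = 1 << SUBFRAME_SAMPLES_BITS               # 128
SAMPLES_PER_FRAME = SUBFRAMES_PER_FRAME * SUBFRAME_SAMPLES  # 1024


def locate(sample: int):
    """Map a position in the encoded stream to
    (frame, subframe within the frame, offset within the subframe)."""
    global_subframe, offset = divmod(sample, SUBFRAME_SAMPLES)
    frame, subframe = divmod(global_subframe, SUBFRAMES_PER_FRAME)
    return frame, subframe, offset


def subframes_needed(n: int):
    """Subframes whose encoded data is needed to decode subframe n."""
    # Subframe 0 has no predecessor, so its output is not usable; that is
    # the extra leading subframe that gets thrown out when decoding.
    return [n] if n == 0 else [n - 1, n]


print(locate(2176))         # (2, 1, 0)
print(subframes_needed(4))  # [3, 4]
```

For example, position 2176 is frame 2 plus 128 samples, which lands exactly at the start of the second subframe of that frame.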
