Logical bug in MaestroDataset #26

Open
almostimplemented opened this issue May 18, 2023 · 0 comments

In utils/data_generator.py, line 86, we ensure we grab a segment that is contained by the waveform:

        # Load hdf5
        with h5py.File(hdf5_path, 'r') as hf:
            start_sample = int(start_time * self.sample_rate)
            end_sample = start_sample + self.segment_samples

            if end_sample >= hf['waveform'].shape[0]:
                start_sample -= self.segment_samples 
                end_sample -= self.segment_samples

However, start_time is not updated to match, so when target_dict is computed from start_time later on, the targets are offset from the audio segment by self.segment_seconds.

            # Process MIDI events to target
            (target_dict, note_events, pedal_events) = \
                self.target_processor.process(start_time, midi_events_time, 
                    midi_events, extend_pedal=True, note_shift=note_shift)

I don't think this causes problems in practice, because the Sampler logic only constructs meta for valid segments:

while (start_time + self.segment_seconds < hf.attrs['duration']):

but it is still a logical error, so I thought I would report it and offer a fix.
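
One possible fix is to shift start_time together with the sample indices whenever the segment is moved back. The following is a minimal standalone sketch, not the repository's code: the function name `clamp_segment` and its parameters are hypothetical, and it assumes segment_seconds equals segment_samples / sample_rate. In utils/data_generator.py the equivalent change would presumably be one extra line inside the existing branch, `start_time -= self.segment_seconds`.

```python
def clamp_segment(start_time, sample_rate, segment_samples, num_samples):
    """Return (start_time, start_sample, end_sample) for one segment.

    Hypothetical helper mirroring the logic quoted above. If the segment
    would run past the end of the waveform, it is shifted back by one
    segment length, and start_time is shifted by the same amount so that
    the MIDI targets stay aligned with the audio.
    """
    # Assumption: the segment length in seconds is derived from samples.
    segment_seconds = segment_samples / sample_rate

    start_sample = int(start_time * sample_rate)
    end_sample = start_sample + segment_samples

    if end_sample >= num_samples:
        start_sample -= segment_samples
        end_sample -= segment_samples
        # The missing step: keep start_time consistent with start_sample.
        start_time -= segment_seconds

    return start_time, start_sample, end_sample


# A 30 s waveform at 16 kHz with 10 s segments (illustrative values only).
shifted = clamp_segment(25.0, 16000, 160000, 480000)
unchanged = clamp_segment(5.0, 16000, 160000, 480000)
```

With the returned start_time passed on to self.target_processor.process, the targets would describe the same window as the audio that was actually loaded.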
