fix_manifests function cost much time #1281

Closed
xiangxyq opened this issue Feb 5, 2024 · 3 comments · Fixed by #1284

xiangxyq commented Feb 5, 2024

Hi,
I prepared my own data with tag 1.19.0 and it worked fine.

But after updating the code to tag 1.20.0, preparing my data takes far too long. Looking at the code, the slowdown is caused by the fix_manifests function, but I don't know how to fix it.

Part of my code:

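from concurrent.futures import ProcessPoolExecutor

from lhotse import RecordingSet, SupervisionSet
from lhotse.qa import fix_manifests, validate_recordings_and_supervisions
from tqdm import tqdm

# num_jobs, parse_utterance, raw_manifests, and manifests are defined elsewhere (not shown here).
# Parse the utterances in parallel; each call yields a (recording, segment) pair.
with ProcessPoolExecutor(num_jobs) as ex:
    for recording, segment in tqdm(
        ex.map(parse_utterance, raw_manifests),
        desc="Processing Corpus",
    ):
        manifests["recordings"].append(recording)
        manifests["supervisions"].append(segment)

# This is the step that became slow after updating to 1.20.0.
recordings, supervisions = fix_manifests(
    recordings=RecordingSet.from_recordings(manifests["recordings"]),
    supervisions=SupervisionSet.from_segments(manifests["supervisions"]),
)
validate_recordings_and_supervisions(
    recordings=recordings, supervisions=supervisions
)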

Thanks

@Keith-Hon

Same issue here.

Collaborator

pzelasko commented Feb 9, 2024

Will try to look into it tomorrow

pzelasko linked a pull request on Feb 9, 2024 that will close this issue

pzelasko commented Feb 9, 2024

Please try again with PR #1284
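
To check where the time actually goes, and to compare 1.19.0, 1.20.0, and the PR branch, each step of the snippet above can be timed separately. The following is only a minimal sketch using the standard library: the `timed` helper and the `timings` dict are hypothetical names, not part of lhotse, and the workloads shown are stand-ins for the real calls.

```python
import time
from contextlib import contextmanager


@contextmanager
def timed(label, timings):
    """Record the wall-clock duration of the enclosed block under `label`."""
    start = time.perf_counter()
    try:
        yield
    finally:
        timings[label] = time.perf_counter() - start


timings = {}

# Stand-in workload. In the real script this block would wrap the
# RecordingSet.from_recordings(...) and SupervisionSet.from_segments(...) calls.
with timed("build_sets", timings):
    data = list(range(100_000))

# Placeholder for fix_manifests(recordings=..., supervisions=...).
with timed("fix_manifests", timings):
    time.sleep(0.05)

for label, seconds in timings.items():
    print(f"{label}: {seconds:.3f}s")
```

Running the same script against each version and comparing the printed `fix_manifests` entry should show whether the regression is gone.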
