Remove pandas dependency from meldataset.py by netlinux-ai · Pull Request #362 · yl4579/StyleTTS2

netlinux-ai · 2026-05-09T06:50:34Z

Problem

meldataset.py imports pandas and uses it for one operation: filtering data_list by speaker_id when sampling a reference clip. Pandas is otherwise unused.

This adds a heavyweight dependency for a single filter call. It also pulls in pandas's transitive pyarrow dependency, which is the source of compatibility friction:

pyarrow's published wheels assume an x86-64-v2 baseline CPU (SSE4.1 instructions appear in static initialisers); older CPUs cannot load them.
pyarrow does not always keep pace with new Python releases; wheels for the newest interpreter sometimes lag by a release.
The pip install footprint grows by ~80 MB for a one-line filter.

Fix

Replace the single pandas usage with a list comprehension + random.choice:

# before
ref_data = (self.df[self.df[2] == str(speaker_id)]).sample(n=1).iloc[0].tolist()

# after
matching = [r for r in self.data_list if r[2] == str(speaker_id)]
ref_data = random.choice(matching)

This drops both pandas and pyarrow from the dependency graph. The now-unused self.df member and the import pandas as pd line are removed as well.
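One possible follow-up, not part of this PR: the list comprehension rescans data_list on every lookup, as the pandas boolean filter did. The sketch below groups rows by speaker once, assuming rows are laid out as (path, text, speaker_id) as implied by r[2]. ReferenceSampler, by_speaker and sample are hypothetical names for illustration and do not exist in meldataset.py.

```python
import random
from collections import defaultdict


class ReferenceSampler:
    """Hypothetical helper that groups rows by speaker id once, up front."""

    def __init__(self, data_list):
        self.data_list = data_list
        # Key on r[2] exactly as stored, so lookups behave like
        # `r[2] == str(speaker_id)` in the comprehension version.
        self.by_speaker = defaultdict(list)
        for row in data_list:
            self.by_speaker[row[2]].append(row)

    def sample(self, speaker_id, rng=random):
        # .get() avoids creating an empty entry for unknown speakers.
        matching = self.by_speaker.get(str(speaker_id), [])
        if not matching:
            # random.choice on an empty list raises IndexError;
            # fail with a message that names the speaker instead.
            raise KeyError(f"no rows for speaker_id {speaker_id!r}")
        return rng.choice(matching)


# Made-up rows for illustration only.
rows = [
    ["a.wav", "hello", "0"],
    ["b.wav", "world", "1"],
    ["c.wav", "again", "0"],
]
sampler = ReferenceSampler(rows)
ref_data = sampler.sample(0)  # one of the two rows whose speaker id is "0"
```

Unlike the pandas version, which built a new list through tolist(), this returns the stored row object itself, so callers that mutate the result should copy it first.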

Tested with

Full fine-tune training run on PyTorch 2.7.0
Reference-clip sampling behaviour confirmed identical: same uniform-random selection within speaker
~80 MB dep removal verified
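
A minimal sketch of how the uniform-selection claim could be spot-checked without pandas. This is not the procedure used for the PR; the rows, the sample_reference helper and the draw count are assumptions made for illustration.

```python
import random
from collections import Counter

# Made-up rows in the (path, text, speaker_id) layout implied by r[2].
data_list = [
    ["a.wav", "one", "0"],
    ["b.wav", "two", "1"],
    ["c.wav", "three", "0"],
    ["d.wav", "four", "0"],
]


def sample_reference(data_list, speaker_id, rng):
    # Same two steps as the "after" snippet: filter, then pick uniformly.
    matching = [r for r in data_list if r[2] == str(speaker_id)]
    return rng.choice(matching)


rng = random.Random(0)  # seeded so the check is repeatable
draws = 30_000
counts = Counter(sample_reference(data_list, 0, rng)[0] for _ in range(draws))

# Three rows belong to speaker "0", so each should get about a third of the draws.
expected = draws / 3
max_deviation = max(abs(c - expected) for c in counts.values())
print(counts, max_deviation)
```

Only the three speaker-"0" paths should appear in counts, each close to expected; a row from another speaker or a heavily skewed count would point to a filtering or sampling bug.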

The only use of pandas was a single speaker_id filter when sampling a reference clip; replaced with a list comprehension and random.choice. Drops pandas (and its transitive pyarrow dep) from the dependency graph.

Remove pandas dependency from meldataset.py

694516e


netlinux-ai wants to merge 1 commit into yl4579:main from netlinux-ai:chore/drop-pandas
