WSS3Shard: remote audio shards on S3 #34

Open
jpc wants to merge 4 commits into jpc/fields-cleanup from jpc/s3-shards

jpc commented on Feb 6, 2026

📚 PR Stack

| # | Branch | Base | PR |
|---|---|---|---|
| 1 | jpc/preloading | main | #16 |
| 2 | jpc/validate-shards-in-sql | jpc/preloading | #18 |
| 3 | jpc/shard-pipe | jpc/validate-shards-in-sql | #19 |
| 4 | jpc/sql-without-index | jpc/shard-pipe | #20 |
| 5 | jpc/special-shard-columns | jpc/sql-without-index | #21 |
| 6 | jpc/remove-in-progress | jpc/special-shard-columns | #22 |
| 7 | jpc/fix-jupyter-repr | jpc/remove-in-progress | #23 |
| 8 | jpc/sql-select-dotted | jpc/fix-jupyter-repr | #24 |
| 9 | jpc/optimize-durations | jpc/sql-select-dotted | #25 |
| 10 | jpc/source-links | jpc/optimize-durations | #26 |
| 11 | jpc/audio-qol | jpc/source-links | #27 |
| 12 | jpc/warn-subsampling-in-dataloader | jpc/audio-qol | #28 |
| 13 | jpc/wsds-inspect-head | jpc/warn-subsampling-in-dataloader | #29 |
| 14 | jpc/feather-index | jpc/wsds-inspect-head | #30 |
| 15 | jpc/indexer | jpc/feather-index | #31 |
| 16 | jpc/keys-diff | jpc/indexer | #32 |
| 17 | jpc/fields-cleanup | jpc/keys-diff | #33 |
| 18 | jpc/s3-shards | jpc/fields-cleanup | #34 ◀️ |
| 19 | jpc/indexing-fixes | jpc/s3-shards | #35 |
| 20 | jpc/validate-keys-on-the-fly | jpc/indexing-fixes | #36 |
| 21 | jpc/repr-missing-last | jpc/validate-keys-on-the-fly | #37 |
| 22 | jpc/drop-slots | jpc/repr-missing-last | #38 |

  • WSDataset: scan the dataset folder even if the index contains a field list
  • WSShardInterface: remove source_dataset from the from_link interface
  • WSS3Shard: remote audio shards on S3
  • pupyarrow: a pure-Python PyArrow implementation with good lazy-loading support

jpc changed the title from "WSDataset: scan the dataset folder even if the index contains a field list; WSShardInterface: remove source_dataset from the from_link interface; WSS3Shard: remote audio shards on S3; pupyarrow: a pure-Python PyArrow implementation with good lazy-loading support" to "WSS3Shard: remote audio shards on S3" on Feb 6, 2026
jpc marked this pull request as ready for review on February 6, 2026 at 19:36