Releases: huggingface/dataset-viewer

0.21.0

14 Feb 10:49
ded9a8c

What's Changed

  • split the code and move to a monorepo by @severo in #210
  • Docker by @severo in #214
  • Send docker images to ecr by @severo in #218
  • Rename to datasets server by @severo in #221
  • Use kubernetes by @severo in #227
  • Add datasets-server-worker to the Kube cluster by @severo in #236
  • Nginx proxy by @severo in #245
  • feat: 🎸 upgrade datasets to 2.2.0 by @severo in #246
  • feat: 🎸 upgrade the docker images to use datasets 2.2.0 by @severo in #247
  • feat: 🎸 upgrade datasets to 2.2.1 by @severo in #253
  • feat: 🎸 use images with datasets 2.2.1 by @severo in #254
  • Add metrics by @severo in #258
  • feat: 🎸 upgrade images to get /prometheus endpoint by @severo in #262
  • fix: 🐛 add support for mongodb+srv:// URLs using dnspython by @severo in #263
  • Prod env by @severo in #266
  • feat: 🎸 upgrade images by @severo in #267
  • fix: 🐛 fix loop by @severo in #268
  • feat: 🎸 upgrade image by @severo in #269
  • fix: 🐛 fix the query to get the list of jobs in the queue by @severo in #271
  • Upgrade worker by @severo in #272
  • Add service monitor by @severo in #260
  • fix: 🐛 fix nfs mount by @severo in #274
  • feat: 🎸 add the admin service (to run admin scripts) by @severo in #275
  • feat: 🎸 enable monitoring in prod by @severo in #276
  • fix: 🐛 the block list must be a comma-separated list by @severo in #278
  • Fix ram in prod by @severo in #280
  • feat: 🎸 upgrade images by @severo in #281
  • fix: 🐛 disable the metrics about cache and queue by @severo in #282
  • feat: 🎸 upgrade images by @severo in #283
  • test: 💍 fix test by @severo in #284
  • feat: 🎸 update prod values by @severo in #285
  • perf: ⚡️ reduce the number of workers by @severo in #287
  • fix: 🐛 increase resources for api, and block big datasets by @severo in #289
  • feat: 🎸 upgrade datasets to 2.2.2 (and minor upgrades) by @severo in #290
  • feat: 🎸 update docker images by @severo in #291
  • Fix valid endpoint query by @severo in #292
  • Update docker images by @severo in #294
  • feat: 🎸 add indexes in mongo by @severo in #295
  • feat: 🎸 update docker images by @severo in #296
  • Reenable metrics by @severo in #298
  • feat: 🎸 update docker images by @severo in #299
  • fix: 🐛 disable cache and queue metrics for now by @severo in #300
  • feat: 🎸 update the docker images by @severo in #303
  • perf: ⚡️ increase the number of replicas for the API by @severo in #304
  • feat: 🎸 block two datasets by @severo in #305
  • ci: 🎡 use cache (gha) when building the docker images by @severo in #313
  • ci: 🎡 use cache with poetry by @severo in #314
  • ci: 🎡 launch e2e after docker build, and use the images by @severo in #316
  • feat: 🎸 use only one uvicorn worker per api pod by @severo in #317
  • feat: 🎸 adapt the value of resources based on monitoring by @severo in #321
  • feat: 🎸 upgrade dependencies by @severo in #322
  • Respond to datasets-server.huggingface.co by @severo in #328
  • Optimize the query behind /splits by @severo in #329
  • feat: 🎸 update the docker image for api by @severo in #330
  • feat: 🎸 use the tls certificate with two domains by @severo in #331
  • fix: 🐛 optimize the query to get the list of valid datasets by @severo in #333
  • feat: 🎸 update api docker image by @severo in #335
  • feat: 🎸 update dependencies to update libcache and libqueue by @severo in #336
  • feat: 🎸 update docker image by @severo in #337
  • feat: 🎸 add an index to optimize the distinct query by @severo in #338
  • feat: 🎸 update docker image by @severo in #339
  • Add metrics endpoint to admin by @severo in #340
  • Expose admin metrics by @severo in #341
  • fix: 🐛 give every servicemonitor its name by @severo in #342
  • ci: 🎡 use reusable workflows, and conditional runs on path by @severo in #344
  • Be more explicit about the current docker images by @severo in #345
  • Be more explicit about the current docker images by @severo in #346
  • ci: 🎡 fix the file extension by @severo in #347
  • ci: 🎡 checkout the repo before accessing a file by @severo in #348
  • ci: 🎡 fix missing replace by @severo in #349
  • feat: 🎸 remove old domain datasets-server.huggingface.tech by @severo in #351
  • Remove the datasets blocklist and re-enqueue server errors by @severo in #352
  • feat: 🎸 upgrade libqueue and libcache by @severo in #353
  • Fix worker by @severo in #354
  • feat: 🎸 update images by @severo in #356
  • feat: 🎸 increase resources for the workers by @severo in #357
  • feat: 🎸 update the resources by trial and error by @severo in #358
  • fix: 🐛 adapt the pods resources by @severo in #359
  • feat: 🎸 use the new certificate by @severo in #360
  • fix: 🐛 ensure the NUMBA_CACHE_DIR is set by @severo in #361
  • fix: 🐛 use a new name for the numba cache preparation by @severo in #362
  • Allow none path in audio by @severo in #363
  • fix: 🐛 don't mark empty splits as stalled by @severo in #366
  • docs: ✏️ add doc about k8 by @severo in #370
  • Fix dockerfiles by @severo in #372
  • Add timestamp type by @severo in #374
  • feat: 🎸 upgrade datasets to 2.3.1 by @severo in #375
  • fix: 🐛 fix the log name by @severo in #377
  • feat: 🎸 upgrade datasets (and dependencies) by @severo in #381
  • feat: 🎸 adjust the prod resources by @severo in #383
  • feat: use new cache locations (to have empty ones) by @severo in #385
  • feat: 🎸 increase the log verbosity to help debug by @severo in https://github.com/huggingface/datasets-se...

0.20.2

14 Apr 13:24
751053e

What's Changed

Full Changelog: 0.20.1...0.20.2

0.20.1

12 Apr 11:45
4f940cb

What's Changed

  • fix: 🐛 allow streaming=False in get_rows by @severo in #207

Full Changelog: 0.20.0...0.20.1

0.20.0

12 Apr 08:17
623606d

What's Changed

  • feat: 🎸 install libsndfile 1.0.30 and support opus files by @severo in #195
  • Fix detection of pending jobs by @severo in #198
  • [BREAKING] fix: 🐛 quick fix to avoid mongodb errors with big rows by @severo in #201
  • Simplify cache by dropping two collections by @severo in #202

Migration: the cache database structure has been modified. Run 20220408_cache_remove_dbrow_dbcolumn.py to migrate the database.

Full Changelog: 0.19.1...0.20.0

0.19.1

04 Apr 16:27
4a9bf7a

What's Changed

  • test: 💍 re-enable tests for temporarily disabled datasets by @severo in #192
  • give reason in error if dataset/split cache is refreshing by @severo in #193

Full Changelog: 0.19.0...0.19.1

0.19.0

04 Apr 09:28
1a6eb0c

What's Changed

  • remove "gated datasets unlock" logic by @severo in #189. Note that it's a breaking change that requires the use of the new "app tokens" from moon-landing.

Full Changelog: 0.18.3...0.19.0

0.18.3

25 Mar 13:57
de2ff07

What's Changed

Full Changelog: 0.18.2...0.18.3

0.18.2

16 Mar 11:11
6f1b609

What's Changed

  • feat: 🎸 upgrade to datasets 2.0.0 by @severo in #182

Full Changelog: 0.18.1...0.18.2

0.18.1

14 Mar 14:33
155843f

What's Changed

  • feat: 🎸 revert double limit on the rows size (reverts #162) by @severo in #179

Full Changelog: 0.18.0...0.18.1

0.18.0

14 Mar 14:14
f406c0d

What's Changed

  • feat: 🎸 truncate cell contents instead of removing rows by @severo in #178

Full Changelog: 0.17.8...0.18.0
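
The 0.18.0 entry above replaces "remove rows" with "truncate cell contents". As a rough illustration of that idea only, here is a minimal sketch. It is not the project's actual code: the function names `truncate_cell` and `truncate_rows`, the `max_chars` parameter, and the choice to truncate only string cells are all assumptions made for this example.

```python
from typing import Any, Dict, List


def truncate_cell(value: Any, max_chars: int) -> Any:
    """Shorten a string cell longer than max_chars; leave other values as-is.

    Hypothetical helper: the real service may measure size differently
    (for example in bytes of the serialized cell).
    """
    if isinstance(value, str) and len(value) > max_chars:
        return value[:max_chars]
    return value


def truncate_rows(rows: List[Dict[str, Any]], max_chars: int) -> List[Dict[str, Any]]:
    """Return every row, with oversized string cells truncated.

    No row is dropped, which is the point of the change: the number of
    rows in the response stays the same.
    """
    return [
        {column: truncate_cell(cell, max_chars) for column, cell in row.items()}
        for row in rows
    ]


# Example input: the second row has an oversized "text" cell.
rows = [
    {"id": 1, "text": "short"},
    {"id": 2, "text": "x" * 50},
]
truncated = truncate_rows(rows, max_chars=10)
```

With this approach a client still receives both rows; only the long cell is shortened, whereas dropping rows would have hidden the second row entirely.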