[add:lib] Add sequentia.datasets module #214

Merged
merged 4 commits into from
Jun 19, 2022

eonu
Owner

@eonu eonu commented Jun 19, 2022

Implements #141 and partially #194 by adding sequentia.datasets.

Major changes

  • Add sequentia.datasets module:
    • sequentia.datasets.load_digits for loading the Free Spoken Digit Dataset (stored in lib/sequentia/datasets/data/digits.npz).
    • sequentia.datasets.base.Dataset for representing generic dataset objects.
    • sequentia.datasets.base.TorchDataset for representing torch-compatible datasets.
    • Add RTD documentation and code examples.
    • Update notebooks to use sequentia.datasets.
  • Modify _Validator.is_observation_sequences() to accept a specific dtype for numpy.ndarray objects (defaults to numpy.float32).
  • Use numpy.float64 for KNNClassifier.
  • Move check_package into lib/sequentia/internals/versions.py and also add is_torch_installed() for checking torch versions.
  • Remove torchaudio, torchvision and torchfsdd dependencies.
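
To illustrate the validator and optional-dependency changes listed above, here is a minimal sketch. It is not the code from this PR: `validate_sequences` and `torch_available` are hypothetical names, the validation rules are assumptions, and the check only tests whether `torch` is importable rather than inspecting its version.

```python
import importlib.util

import numpy as np


def torch_available() -> bool:
    """Hypothetical helper: True if the optional torch package can be found."""
    return importlib.util.find_spec("torch") is not None


def validate_sequences(sequences, dtype=np.float32):
    """Hypothetical validator: check observation sequences and cast to `dtype`.

    Each sequence is expected to be array-like with shape (T,) or (T, D).
    One-dimensional sequences are treated as having a single feature.
    """
    validated = []
    for i, seq in enumerate(sequences):
        arr = np.asarray(seq)
        if arr.ndim == 1:
            arr = arr.reshape(-1, 1)
        if arr.ndim != 2:
            raise ValueError(f"Sequence {i} must be 1D or 2D, got {arr.ndim}D")
        validated.append(arr.astype(dtype))

    # All sequences must agree on the number of features D.
    if len({arr.shape[1] for arr in validated}) > 1:
        raise ValueError("Sequences must all have the same number of features")
    return validated


# A caller that needs double precision (as the KNN classifier does in this PR)
# could request it explicitly instead of relying on the float32 default.
X = validate_sequences([[1, 2, 3], [[4.0], [5.0]]], dtype=np.float64)
```

Making the dtype a parameter keeps `numpy.float32` as the default while letting individual estimators opt in to `numpy.float64`.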

Minor changes

  • Remove sentiment classification example from README.md.
  • Shorten RTD preprocessing warning about torch compatibility.
  • Add notebooks/nbutils.py with a helper function play_audio for playing audio samples from numpy.ndarray objects.
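
For readers wondering what a helper like `play_audio` might involve, below is a rough sketch of one way to play a `numpy.ndarray` waveform in a notebook. It is an assumption, not the contents of `notebooks/nbutils.py`: the names `to_wav_bytes` and `play_audio_sketch`, the mono float input in [-1, 1], and the 8 kHz default rate are all illustrative choices.

```python
import io
import wave

import numpy as np


def to_wav_bytes(samples, rate):
    """Encode a mono float waveform in [-1, 1] as 16-bit PCM WAV bytes."""
    mono = np.asarray(samples, dtype=np.float64).reshape(-1)
    # Clip out-of-range values, then scale to little-endian signed 16-bit ints.
    pcm = (np.clip(mono, -1.0, 1.0) * 32767).astype("<i2")
    buffer = io.BytesIO()
    with wave.open(buffer, "wb") as wav:
        wav.setnchannels(1)   # mono
        wav.setsampwidth(2)   # 2 bytes per sample (16-bit)
        wav.setframerate(rate)
        wav.writeframes(pcm.tobytes())
    return buffer.getvalue()


def play_audio_sketch(samples, rate=8000):
    """Return an IPython audio widget for the waveform (notebook use only)."""
    # Imported lazily so the encoding helper works without IPython installed.
    from IPython.display import Audio

    return Audio(data=to_wav_bytes(samples, rate))
```

In a notebook cell, `play_audio_sketch(waveform)` as the last expression would render an audio player for `waveform`.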

@eonu eonu merged commit 8c473ea into dev Jun 19, 2022
@eonu eonu deleted the add/datasets-module branch June 19, 2022 01:24
@eonu eonu mentioned this pull request Jun 26, 2022
Development

Successfully merging this pull request may close these issues.

Add sequentia.datasets module for readily available real-world datasets