MLDatasets.jl

This package is a community effort to provide a common interface for accessing widely used Machine Learning datasets. In contrast to other data-related Julia packages, MLDatasets.jl focuses specifically on downloading, unpacking, and accessing benchmark datasets. Functionality for data processing or visualization is provided only where it is specific to a particular dataset.

This package is a part of the JuliaML ecosystem. Its functionality is built on top of the package DataDeps.jl.

Available Datasets

Datasets are grouped into different categories. Click on the links below for a full list of datasets available in each category.

Graphs - Datasets with an underlying graph structure: Cora, PubMed, CiteSeer, ...
Misc - Datasets that do not fall into any of the other categories: Iris, BostonHousing, ...
Text - Datasets for language models.
Vision - Vision-related datasets such as MNIST, CIFAR10, CIFAR100, ...

Installation

To install MLDatasets.jl, start Julia and type the following code snippet into the REPL. It uses the native Julia package manager.

import Pkg
Pkg.add("MLDatasets")
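
Once the package is installed, loading a dataset might look like the following minimal sketch. It assumes MLDatasets v0.7 or later, where each dataset is constructed as an object and indexed to retrieve observations. The choice of MNIST and the variable names are only illustrative. The first use of a dataset triggers a download through DataDeps.jl.

```julia
# Optional: accept DataDeps.jl download prompts automatically, which is handy in
# non-interactive scripts. Remove this line to be asked for confirmation instead.
ENV["DATADEPS_ALWAYS_ACCEPT"] = "true"

using MLDatasets

# Assumption: MLDatasets v0.7+ constructor syntax with a `split` keyword.
trainset = MNIST(split = :train)

# Indexing returns a named tuple with `features` and `targets`;
# here we take the first five observations.
X, y = trainset[1:5]

# Inspect the shapes of the returned arrays.
println(size(X))
println(length(y))
```

Other datasets listed above are expected to follow the same pattern, although the available splits and the element types of the features differ from dataset to dataset.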

Contributing to MLDatasets

Pull requests contributing new datasets are warmly welcome. See the source code of any existing dataset for implementation examples.

Other data repositories for Julia

If you can't find the dataset you are looking for here, please let us know by opening an issue. You can also check out these other packages to find what you need:

License

This code is free to use under the terms of the MIT license.
