PMBio/mudatasets

Multimodal Datasets

mudatasets provides access to public multimodal datasets, primarily multimodal omics datasets.

MuData library | MuData documentation

Installation

# Stable, with muon
pip install "mudatasets[muon]"
# Dev
pip install git+https://github.com/gtca/mudatasets

Getting started

import mudatasets as mds

Find available datasets

mds.list_datasets()

Load a dataset

mdata = mds.load("pbmc3k_multiome")
print(mdata)

Some common arguments for .load() are:

  • data_dir= sets the location where the dataset is saved (~/mudatasets/ by default)
  • with_info=True also returns a second value, a dictionary with the dataset description (False by default)
  • backed=True reads the data in backed mode, only for .h5mu and .h5ad files (True by default)
  • files= downloads specific files from the dataset
  • full=True downloads all the files defined for the dataset (False by default)
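
As a minimal sketch of how these arguments combine, the snippet below wraps mds.load() in a small helper that sets data_dir= and with_info=True. The helper names resolve_data_dir and load_with_description are hypothetical and not part of mudatasets; expanding the path before passing it on is an assumption made for clarity, not a requirement stated above.

```python
import os


def resolve_data_dir(data_dir="~/mudatasets/"):
    """Expand a leading '~' so the download location is an explicit path.

    Hypothetical helper for illustration only; not part of mudatasets.
    """
    return os.path.expanduser(data_dir)


def load_with_description(name, data_dir="~/mudatasets/"):
    """Load a dataset and also return its description dictionary.

    Hypothetical wrapper. mudatasets is imported lazily, so defining
    this function does not require the package to be installed.
    """
    import mudatasets as mds

    # with_info=True makes load() return the description as a second value
    mdata, info = mds.load(
        name,
        data_dir=resolve_data_dir(data_dir),
        with_info=True,
    )
    return mdata, info
```

Calling load_with_description("pbmc3k_multiome") would then download the dataset on first use and return both the loaded object and its description.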

Get dataset info

mds.info("pbmc3k_multiome")

List dataset file names

mds.list_files("pbmc3k_multiome")

Webpage with all the files

mds.serve_webpage(port=8000)

This command launches a server that serves a simple, temporarily created HTML page at http://localhost:8000 listing the files for all of the datasets.