alex-medvedev-msc/ukb_loader

UK Biobank data loader

This repository provides a library and set of utilities for the efficient loading of phenotype and genotype data from the UK Biobank.

Features include:

  • Loading quantitative and categorical phenotypes, including self-reported phenotypes and phenotypes based on ICD-10 disease codes.
  • Fast parallelized loading that leverages chunked and compressed Zarr arrays.
  • Utilities for splitting the dataset samples randomly, or based on a predefined structure.
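
To make the two splitting modes concrete, here is a minimal, standard-library sketch of a random split and a split driven by a predefined assignment. It is not this repository's API: the function names `random_split` and `predefined_split`, the `fractions` and `seed` parameters, and the `{sample_id: split_name}` mapping are all hypothetical and only illustrate the idea.

```python
import random


def random_split(sample_ids, fractions=(0.8, 0.1, 0.1), seed=0):
    """Shuffle sample IDs and cut them into train/val/test lists.

    Hypothetical helper for illustration; not part of ukb_loader.
    """
    ids = list(sample_ids)
    # A dedicated, seeded generator keeps the split reproducible.
    random.Random(seed).shuffle(ids)
    n_train = int(len(ids) * fractions[0])
    n_val = int(len(ids) * fractions[1])
    return {
        "train": ids[:n_train],
        "val": ids[n_train:n_train + n_val],
        # The test set takes whatever remains after train and val.
        "test": ids[n_train + n_val:],
    }


def predefined_split(sample_ids, assignment, default="train"):
    """Group sample IDs by a precomputed {sample_id: split_name} mapping.

    Samples missing from the mapping fall into `default`.
    Hypothetical helper for illustration; not part of ukb_loader.
    """
    splits = {}
    for sid in sample_ids:
        splits.setdefault(assignment.get(sid, default), []).append(sid)
    return splits
```

Fixing the seed makes the random split repeatable across runs, which matters when the converted dataset is regenerated and results need to stay comparable.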

Usage

First, the UKB dataset needs to be converted into the Zarr format with the desired test/train/validation split. For this, use the provided conversion script.

For examples of loading various types of phenotypes, see this example notebook.
