Advanced BigQuery examples on genomic data.


The data stories and queries in this repository demonstrate working with genomic data via Google BigQuery. All examples are built upon public datasets.

Have other data stories you would like to see here? Have any data stories you would like to share? Have corrections to the biology covered in this material? Have query simplifications or speed improvements? Let us know by filing an issue or contacting us directly.

Getting Started

If you are new to BigQuery, start here instead: Analyze Variants Using BigQuery.

Otherwise, navigate through the tree of content in this repository. You will find queries, RMarkdown, rendered analyses, and provenance details.
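As a rough flavor of the kind of query found in the tree, here is a minimal sketch that builds a per-chromosome variant count in standard SQL and, optionally, runs it with the google-cloud-bigquery Python client. The table name `genomics-public-data.1000_genomes.variants` and the `reference_name` column are assumptions made for illustration, as are the helper names `build_variant_count_query` and `run_query`; check them against the dataset you are actually exploring.

```python
# Assumed table name for illustration; replace with the dataset you are exploring.
DEFAULT_TABLE = "genomics-public-data.1000_genomes.variants"


def build_variant_count_query(table: str = DEFAULT_TABLE, limit: int = 25) -> str:
    """Return standard SQL that counts variant rows per reference sequence.

    Assumes the table has a `reference_name` column (hypothetical schema).
    """
    if limit <= 0:
        raise ValueError("limit must be positive")
    return (
        "SELECT reference_name, COUNT(*) AS variant_count\n"
        f"FROM `{table}`\n"
        "GROUP BY reference_name\n"
        "ORDER BY variant_count DESC\n"
        f"LIMIT {limit}"
    )


def run_query(sql: str):
    """Run the query and return the rows as a list.

    Requires the google-cloud-bigquery package and configured credentials.
    """
    # Imported lazily so the query builder can be used without the client installed.
    from google.cloud import bigquery

    client = bigquery.Client()
    return list(client.query(sql).result())


if __name__ == "__main__":
    # Print the SQL only; call run_query(...) yourself once credentials are set up.
    print(build_variant_count_query())
```

Keeping the SQL in a small builder function makes it easy to paste the printed query into the BigQuery web UI, or to pass it to `run_query` from a notebook.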

Loading your own Variant Data into BigQuery

After trying these queries on public data, you can load your own variant data into BigQuery.

For other types of data, such as variant annotations, see Preparing Data for BigQuery and, for more detail, BigQuery in Practice: Loading Data Sets That are Terabytes and Beyond.

The mailing list

The Google Genomics Discuss mailing list is a good way to sync up with other people who use googlegenomics, including the core developers. You can subscribe by email or post using the web forum page.