<a href="https://colab.research.google.com/github/isb-cgc/Community-Notebooks/blob/master/MitelmanDB/Mitelman_DB_Views.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Using BigQuery Views in the Mitelman Database BigQuery Tables

Check out other notebooks at our [Community Notebooks Repository](https://github.com/isb-cgc/Community-Notebooks)!

- Title: Using BigQuery Views in the Mitelman Database BigQuery Tables
- Author: Jacob Wilson
- Created: 2025-02-26
- URL: https://github.com/isb-cgc/Community-Notebooks/blob/master/MitelmanDB/Mitelman_DB_Views.ipynb
- Purpose: Extend our capabilities in obtaining data in the Mitelman Database BigQuery tables by saving join queries as BigQuery views.
<br/>

In this notebook, we will take what we learned in previous notebooks about finding information in the database and put it to use in storable queries called views.

## Background on BigQuery Views
In previous notebooks we demonstrated how to find information in the Mitelman Database BigQuery tables using SQL and joins. These queries let us gather information from complex databases. One extension of this ability is to use views in BigQuery. A view is a virtual table defined by a user-written SQL query. This lets you store a complex query and access its result as if it were a table of its own.

See more about BigQuery views from Google: https://cloud.google.com/bigquery/docs/views-intro
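To make the idea of a virtual table concrete, here is a minimal sketch that runs without any cloud access. It uses Python's built-in SQLite module as a stand-in for BigQuery, and the `cases` table and `bcr_cases` view are hypothetical names chosen for illustration. The general behavior it shows (a view stores a query rather than a copy of the rows) is the same idea described above, although BigQuery's syntax and features differ.

```python
import sqlite3

# SQLite stands in for BigQuery here purely to illustrate what a view is.
# The table and view names are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE cases (ref_no INTEGER, gene TEXT)")
conn.executemany(
    "INSERT INTO cases VALUES (?, ?)",
    [(1, "BCR"), (2, "ABL1"), (3, "BCR")],
)

# The view stores the query, not a copy of the rows.
conn.execute(
    "CREATE VIEW bcr_cases AS SELECT ref_no FROM cases WHERE gene = 'BCR'"
)


def view_ref_nos(connection):
    """Query the view exactly as if it were a table."""
    rows = connection.execute("SELECT ref_no FROM bcr_cases ORDER BY ref_no")
    return [row[0] for row in rows]


before = view_ref_nos(conn)

# A row added to the underlying table later shows up in the view.
conn.execute("INSERT INTO cases VALUES (4, 'BCR')")
after = view_ref_nos(conn)

print(before)
print(after)
```

The second result includes the newly inserted row because the view's query is evaluated each time the view is read.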

## Initialize Notebook Environment

Before beginning, we first need to load dependencies and authenticate to BigQuery.

## Install Dependencies

In [1]:
# GCP libraries
from google.cloud import bigquery
from google.colab import auth

## Authenticate

To use BigQuery, we must first authenticate with Google Cloud.

In [2]:
# if you're using Google Colab, authenticate to gcloud with the following
auth.authenticate_user()

# alternatively, use the gcloud SDK
#!gcloud auth application-default login

## Google project ID

Set your own Google project ID for use with this notebook.

In [3]:
# set the google project that will be billed for this notebook's computations
google_project = 'your_project_id'  ## change this

## BigQuery Client

In [7]:
# Initialize a client to access the data within BigQuery
if google_project == 'your_project_id':
    print('Please update the project ID with your Google Cloud Project')
else:
    client = bigquery.Client(project=google_project)

# set the project and dataset that will contain the newly created view
bq_project = ''  ## change this
bq_dataset = ''  ## change this

# set the Mitelman Database project
mitelman_project = 'mitelman-db'
mitelman_dataset = 'prod'
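The view name used later in this notebook is assembled from `bq_project` and `bq_dataset`, which are empty strings until you fill them in. As a small optional safeguard, the sketch below checks that each part is set before joining them. The helper name `make_view_id` and the example values `my-project` and `my_dataset` are hypothetical and not part of the original notebook.

```python
def make_view_id(project, dataset, name):
    """Return 'project.dataset.name', refusing empty components."""
    parts = {"project": project, "dataset": dataset, "name": name}
    # Collect the labels of any parts that are empty or only whitespace.
    missing = [label for label, value in parts.items() if not value.strip()]
    if missing:
        raise ValueError(f"Please set a value for: {', '.join(missing)}")
    return ".".join(value.strip() for value in parts.values())


# Hypothetical values; substitute your own project and dataset.
example_id = make_view_id("my-project", "my_dataset", "ALL_BCR_cases")
print(example_id)
```

In the notebook, the same helper could be called as `make_view_id(bq_project, bq_dataset, 'ALL_BCR_cases')` so that an unset project or dataset fails early with a readable message.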

## Creating a BigQuery View

In [8]:
# give a named location for the view
view_name = f'''{bq_project}.{bq_dataset}.ALL_BCR_cases'''
view = bigquery.Table(view_name)

# The following query will return a table that includes the PubMed ID, short
# publication title, gene, and morphology for all cases having BCR gene fusions
# and ALL (acute lymphoblastic leukemia) morphology.
view.view_query = f'''
WITH TargetCases AS (
  SELECT g.RefNo
  FROM `{mitelman_project}.{mitelman_dataset}.MolClinGene` g
  WHERE g.Gene LIKE "BCR"
)
SELECT DISTINCT
  r.Pubmed,
  r.titleShort,
  g.Gene,
  k.Benamning AS Morphology
FROM
  TargetCases tc
JOIN `{mitelman_project}.{mitelman_dataset}.Reference` r ON r.RefNo = tc.RefNo
JOIN `{mitelman_project}.{mitelman_dataset}.MolClinGene` g ON g.RefNo = tc.RefNo
JOIN `{mitelman_project}.{mitelman_dataset}.MolBiolClinAssoc` m ON m.RefNo = tc.RefNo
JOIN `{mitelman_project}.{mitelman_dataset}.Koder` k ON k.Kod = m.Morph
WHERE m.Morph LIKE "1602"'''

# Make an API request to create the view.
view = client.create_table(view)
print(f"Created {view.table_type}: {str(view.reference)}")

Created VIEW: isb-project-zero.jaw_scratch.ALL_BCR_cases
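Once the view exists, it can be queried like any other table. The following is a minimal sketch of how that might look, assuming `client` is the `bigquery.Client` created earlier and that the view ID matches the one you used. The helper names `build_preview_sql` and `preview_view` are hypothetical.

```python
def build_preview_sql(view_id, limit=10):
    """Build a SELECT statement that reads a few rows from the view."""
    if not isinstance(limit, int) or limit < 1:
        raise ValueError("limit must be a positive integer")
    return f"SELECT * FROM `{view_id}` LIMIT {limit}"


def preview_view(client, view_id, limit=10):
    """Run the preview query and return the rows as a list of dicts.

    `client` is assumed to behave like google.cloud.bigquery.Client:
    client.query(sql) returns a job whose result() yields rows that
    support .items().
    """
    job = client.query(build_preview_sql(view_id, limit))
    return [dict(row.items()) for row in job.result()]
```

In the notebook this could be called as `preview_view(client, view_name, limit=5)`. Because a standard view stores only the query, each call reruns that query against the underlying Mitelman tables and is billed like any other query.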


## Conclusion
BigQuery views can be extremely useful if you run the same queries often, for example when tracking specific data over time. The results of querying a view can be used in downstream projects or exported, for example as a CSV file that opens in Excel.
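As one possible way to export results, the sketch below writes rows to CSV using only the Python standard library. The helper name `rows_to_csv` is hypothetical, and the sample rows are made-up placeholders so the code runs without BigQuery access. It assumes each row supports `.items()`, as the rows returned by a BigQuery query result do.

```python
import csv
import io


def rows_to_csv(rows, handle):
    """Write mapping-like rows to an open text handle as CSV.

    Returns the number of data rows written. The header is taken from
    the keys of the first row.
    """
    writer = None
    count = 0
    for row in rows:
        record = dict(row.items())
        if writer is None:
            writer = csv.DictWriter(handle, fieldnames=list(record))
            writer.writeheader()
        writer.writerow(record)
        count += 1
    return count


# Made-up placeholder rows, not real Mitelman data.
sample = [
    {"Gene": "GENE_A", "Morphology": "example morphology 1"},
    {"Gene": "GENE_B", "Morphology": "example morphology 2"},
]
buffer = io.StringIO()
written = rows_to_csv(sample, buffer)
print(buffer.getvalue())
```

To write a file instead of an in-memory buffer, pass a handle opened with `open(path, "w", newline="")`.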

With this series of notebooks covering queries, joins, and views, you should be able to find the Mitelman data needed for your analysis. See our other Mitelman notebooks that cover more in-depth data analysis: https://github.com/isb-cgc/Community-Notebooks/tree/master/MitelmanDB