<a href="https://colab.research.google.com/github/marthab1/brainx92/blob/master/Using_Loom_scRNAseq_files.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Starting Out
##### This code opens loom-format files containing single-cell RNA sequencing data. Once they are imported, we will manipulate the files to extract the data we need about cell-type-specific expression in the fly nervous system.

##### We are using the Google Colab platform; you can learn more about how it runs Python scripts here: https://www.jcchouinard.com/google-colab-with-python/

##### Last updated on July 22, 2023 by Martha Bhattacharya.

##### First we will import pandas (used to manipulate data frames) and install loompy (which helps us read loom files). I am working from this page, which details the code necessary to get started:

##### https://linnarssonlab.org/loompy/installation/index.html

In [None]:
import pandas as pd
!pip install -U loompy
#humfly = pd.read_csv(r'C:\Users\marthab1\Downloads\BioMart_fly_human_20210506_unique.csv')

Collecting loompy
  Downloading loompy-3.0.7.tar.gz (4.8 MB)
  Preparing metadata (setup.py) ... done
Collecting numpy-groupies (from loompy)
  Downloading numpy_groupies-0.9.22.tar.gz (53 kB)
  Preparing metadata (setup.py) ... done
Building wheels for collected packages: loompy, numpy-groupies
  Building wheel for loompy (setup.py) ... done
  Created wheel for loompy: filename=loompy-3.0.7-py3-none-any.whl size=52018 sha256=0f84716b23cfa83ddd3b87f0eb4aa1702a753595a31105772dbd2ebd6ad80fd5
  Stored in directory: /root/.cache/pip/wheels/2c/22/1f/792a4621bb631e538bf1c21feae9bbaa6b19fd6d6ab382d1fd
  Building wheel for numpy-groupies (setup.py) ... done
  Created wheel for numpy-groupies: filenam
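
After an install like the one above, it can help to confirm which version actually ended up in the environment. The following is a minimal sketch using only the Python standard library; `installed_version` is a hypothetical helper name, not something from loompy or pandas.

```python
from importlib import metadata


def installed_version(package_name):
    """Return the installed version string of a package, or None if it is absent."""
    try:
        return metadata.version(package_name)
    except metadata.PackageNotFoundError:
        return None


# Report the packages this notebook relies on (prints None for any that are missing)
for name in ("pandas", "loompy"):
    print(name, installed_version(name))
```

A result of `None` for loompy would suggest rerunning the install cell before continuing.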

In [None]:
!python --version

Python 3.10.6


In [None]:
import numpy as np

Now we need to get the data set to use. Adult fly whole body scRNAseq and brain RNAseq datasets are available through FlyCellAtlas, which is based on this paper: https://pubmed.ncbi.nlm.nih.gov/35239393/
The link to the file is here:
https://cloud.flycellatlas.org/index.php/s/yNGaMWYFaNkFSKY/download/s_fca_biohub_body_10x.loom

We will first connect to Google Drive, and then we will download the dataset we need.

In [None]:
from google.colab import drive
drive.mount('/content/gdrive')

Mounted at /content/gdrive


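The next cells install scvi-tools and then fetch the file from its URL (copy this first) with wget. As written, wget saves the file to the Colab working directory (/content) rather than to your MyDrive folder. You may be able to modify the file destination, but I have not done this below.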

In [None]:
import sys

#if branch is stable, will install via pypi, else will install from source
branch = "stable"
IN_COLAB = "google.colab" in sys.modules

if IN_COLAB and branch == "stable":
    !pip install --quiet scvi-tools[tutorials]
elif IN_COLAB and branch != "stable":
    !pip install --quiet --upgrade jsonschema
    !pip install --quiet git+https://github.com/yoseflab/scvi-tools@$branch#egg=scvi-tools[tutorials]


In [None]:
import scvi
import scanpy as sc



In [None]:
!wget 'https://cloud.flycellatlas.org/index.php/s/yNGaMWYFaNkFSKY/download/s_fca_biohub_body_10x.loom'

--2023-07-24 03:04:43--  https://cloud.flycellatlas.org/index.php/s/yNGaMWYFaNkFSKY/download/s_fca_biohub_body_10x.loom
Resolving cloud.flycellatlas.org (cloud.flycellatlas.org)... 134.58.50.9
Connecting to cloud.flycellatlas.org (cloud.flycellatlas.org)|134.58.50.9|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 863715561 (824M) [application/octet-stream]
Saving to: ‘s_fca_biohub_body_10x.loom’


2023-07-24 03:05:42 (14.2 MB/s) - ‘s_fca_biohub_body_10x.loom’ saved [863715561/863715561]
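
The log above shows wget saving the file under its own name in the current working directory. If you would rather choose the destination, for example a folder on the mounted Drive, one possible approach is sketched below. `download_to` is a hypothetical helper (it is not part of this notebook or of any library used here), and `/content/gdrive/MyDrive` is an assumed path based on the mount point used earlier.

```python
import os
import urllib.request
from urllib.parse import urlparse


def download_to(url, dest_dir):
    """Download `url` into `dest_dir`, keeping the file name from the URL path."""
    os.makedirs(dest_dir, exist_ok=True)
    file_name = os.path.basename(urlparse(url).path)
    dest_path = os.path.join(dest_dir, file_name)
    # urlretrieve writes the response to disk in blocks instead of loading it all at once
    urllib.request.urlretrieve(url, dest_path)
    return dest_path


# Hypothetical usage in Colab, after drive.mount('/content/gdrive'):
# loom_path = download_to(
#     "https://cloud.flycellatlas.org/index.php/s/yNGaMWYFaNkFSKY/download/s_fca_biohub_body_10x.loom",
#     "/content/gdrive/MyDrive",
# )
# print(loom_path, os.path.getsize(loom_path))  # the wget log reported 863715561 bytes
```

Comparing the size on disk with the length reported in the wget log is a quick way to spot an incomplete download. Any later cell that opens the file would need to use the returned path instead of `/content/...`.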



In [None]:
body_loomfile = scvi.data.read_loom("s_fca_biohub_body_10x.loom")

In [None]:

import loompy

# Connect to the file ("r" opens it read-only; "r+" would allow read/write)
ds = loompy.connect("/content/s_fca_biohub_body_10x.loom", "r")
# Read the "cluster_seurat" column attribute (one value per cell) into the variable clusters
clusters = ds.col_attrs["cluster_seurat"]

# Close the file handle
ds.close()
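
Once `clusters` is in memory, a quick sanity check is to count how many cells carry each label. The sketch below uses only the standard library and a made-up toy list in place of the real array; `summarize_labels` is a hypothetical helper name, and the toy values are not taken from the dataset.

```python
from collections import Counter


def summarize_labels(labels, top_n=5):
    """Return (number of distinct labels, the top_n most common labels with counts)."""
    # Convert each label to str so numbers, bytes, and NumPy scalars display uniformly
    counts = Counter(str(label) for label in labels)
    return len(counts), counts.most_common(top_n)


# Toy stand-in for the `clusters` array; these values are invented for illustration
toy_clusters = [0, 1, 0, 2, 0, 1]
n_labels, most_common = summarize_labels(toy_clusters, top_n=2)
print(n_labels, most_common)  # 3 [('0', 3), ('1', 2)]
```

In the notebook, the same call could be made as `summarize_labels(clusters)`, since the function accepts any iterable of labels.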

Now let's try to open the dataset and see what it looks like!