### 1. General information on dataset GSE162454

This notebook processes GEO dataset GSE162454. The dataset provides barcodes/genes/matrix files for each sample.

Thus, we simply need to combine these barcodes/genes/matrix files and generate an AnnData object for each sample. Of the 6 samples in the dataset, 3 are pediatric (OS_1, OS_5, OS_6).

<span style="color:green">**[OS]**</span> Osteosarcoma

In [1]:
# Environment setup
import numpy as np
import pandas as pd
import scanpy as sc
import anndata
import scipy

### 2. AnnData object of each sample

<span style="color:red">**IMPORTANT:**</span> rename `features.tsv` to `genes.tsv` in each sample directory before loading the data.
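
The renaming step can be scripted rather than done by hand. The following is a minimal sketch, assuming each sample sits in its own subdirectory and holds an uncompressed file named exactly `features.tsv`. The helper name `rename_features_to_genes` is hypothetical and not part of the original notebook.

```python
from pathlib import Path


def rename_features_to_genes(data_directory):
    """Rename features.tsv to genes.tsv in each sample subdirectory.

    Hypothetical helper: assumes one subdirectory per sample and
    uncompressed files. Returns the list of newly created genes.tsv paths.
    """
    renamed = []
    for sample_directory in sorted(Path(data_directory).iterdir()):
        if not sample_directory.is_dir():
            continue  # ignore stray files next to the sample folders
        features = sample_directory / "features.tsv"
        genes = sample_directory / "genes.tsv"
        # Only rename when the source exists and the target is not already there
        if features.is_file() and not genes.exists():
            features.rename(genes)
            renamed.append(genes)
    return renamed
```

Calling the helper a second time on the same directory is harmless, since it skips samples that already have a `genes.tsv`.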

1. `barcodes.tsv`: cell barcodes, which go into `.obs`
2. `genes.tsv`: gene IDs and gene names, which go into `.var`
3. `matrix.mtx`: the expression matrix, which goes into `.X`

In [2]:
from pathlib import Path

# Specify directory paths
data_directory = Path('/scratch/user/s4543064/xiaohan-john-project/data/GSE162454')
write_directory = Path('/scratch/user/s4543064/xiaohan-john-project/write/GSE162454')

# Loop through all sample directories
for sample_directory in data_directory.iterdir():
    # Use .name rather than .stem so a dot in the folder name is not truncated
    sample_name = sample_directory.name
    sample_h5ad = sample_name + '.h5ad'

    # Read barcodes.tsv, genes.tsv and matrix.mtx into an AnnData object
    sample = sc.read_10x_mtx(
        sample_directory,
        var_names='gene_symbols',
        cache=False
    )
    print(sample_name)

    # Save the AnnData object
    output_path = write_directory / sample_h5ad
    sample.write_h5ad(output_path, compression="gzip")

GSM5155200_OS_6
GSM4952363_OS_1
GSM5155199_OS_5


### 3. Confirmation of created AnnData objects

In [3]:
from pathlib import Path

# Specify directory paths
write_directory = Path('/scratch/user/s4543064/xiaohan-john-project/write/GSE162454')

# Loop through all files in the directory
for file in write_directory.iterdir():
    sample = anndata.read_h5ad(file)
    print(sample)

AnnData object with n_obs × n_vars = 8894 × 33538
    var: 'gene_ids'
AnnData object with n_obs × n_vars = 10926 × 33538
    var: 'gene_ids'
AnnData object with n_obs × n_vars = 9689 × 33538
    var: 'gene_ids'
