This notebook shows how the data used to measure runtime and memory performance was generated.
The final merged `anndata.AnnData` object can be downloaded from here:
[https://ndownloader.figshare.com/files/25120694?private_link=a187bbb4aa21f7223523](https://ndownloader.figshare.com/files/25120694?private_link=a187bbb4aa21f7223523).

# Preliminaries

## Import packages

In [1]:
# import standard packages
from pathlib import Path

import os
import sys

# import single-cell packages
import scanpy as sc
import scvelo as scv

# set verbosity levels
sc.settings.verbosity = 2
scv.settings.verbosity = 3 

## Print package versions for reproducibility

In [2]:
scv.logging.print_versions()

scvelo==0.2.2  scanpy==1.6.0  anndata==0.7.4  loompy==3.0.6  numpy==1.19.2  scipy==1.5.2  matplotlib==3.3.2  sklearn==0.23.2  pandas==1.1.3  
 Your version: 		 0.2.2 
Latest version: 	 modeling


## Set up paths

In [3]:
sys.path.insert(0, "../../..")  # this depends on the notebook depth and must be adapted per notebook

from paths import DATA_DIR
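
The `paths` module itself is not shown in this notebook. As a rough illustration only, a module of that kind might look like the sketch below; the file location, the `ROOT` name and the `data` subdirectory are assumptions, not the repository's actual layout.

```python
# Hypothetical paths.py at the repository root (not shown in this notebook).
from pathlib import Path

# Directory containing this file; fall back to the working directory when
# __file__ is undefined (for example in an interactive session).
try:
    ROOT = Path(__file__).resolve().parent
except NameError:
    ROOT = Path.cwd()

# Assumed layout: all datasets live under <repository root>/data.
DATA_DIR = ROOT / "data"
```

Because `DATA_DIR` is a `pathlib.Path`, expressions such as `DATA_DIR / "morris_data_raw_loom_files"` build sub-paths without manual string joining.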

### Define path for the Morris dataset

This dataset comes from [Biddy, B.A., Kong, W., Kamimoto, K. et al. Single-cell mapping of lineage and identity in direct reprogramming. Nature 564, 219–224 (2018)](https://doi.org/10.1038/s41586-018-0744-4).

In [8]:
root = DATA_DIR / "morris_data_raw_loom_files"

## Load the data

In [9]:
adatas = []
for dirname in os.listdir(root):
    if dirname.startswith("hf"):
        fname = [f for f in os.listdir(root / dirname) if f.endswith(".loom")][0]
        adatas.append(scv.read_loom(root / dirname / fname))
        adatas[-1].var_names_make_unique()  # for merging
        adatas[-1].obs_names_make_unique()
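
`os.listdir` returns entries in arbitrary order, so the order of the loaded objects (and therefore the batch indices assigned during merging) may differ between machines. A minimal sketch of a deterministic alternative is given below; `find_loom_files` is a hypothetical helper, not part of scvelo, and it assumes that every matching subdirectory contains at least one `.loom` file.

```python
from pathlib import Path
from typing import List


def find_loom_files(root: Path, prefix: str = "hf") -> List[Path]:
    """Return one .loom file per subdirectory of `root` whose name starts with `prefix`."""
    loom_files = []
    # Sort the subdirectories so that the result does not depend on the file system.
    for subdir in sorted(p for p in root.iterdir() if p.is_dir() and p.name.startswith(prefix)):
        candidates = sorted(subdir.glob("*.loom"))
        if not candidates:
            raise FileNotFoundError(f"No .loom file found in {subdir}")
        # Take the alphabetically first file, mirroring the `[0]` in the cell above.
        loom_files.append(candidates[0])
    return loom_files
```

The returned paths could then be passed to `scv.read_loom` one by one, in place of the `os.listdir` loop.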

Variable names are not unique. To make them unique, call `.var_names_make_unique`.

## Concatenate the objects

The dataset is saved in multiple parts, so we have to merge the objects.

In [10]:
adata = adatas[0].concatenate(adatas[1:])

adata.var_names_make_unique()
adata.obs_names_make_unique()

adata

AnnData object with n_obs × n_vars = 104679 × 22630
    obs: 'batch'
    var: 'Accession', 'Chromosome', 'End', 'Start', 'Strand'
    layers: 'matrix', 'ambiguous', 'spliced', 'unspliced'
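
As background for the merge: by default, `concatenate` keeps only the variables shared by all objects (`join="inner"`) and appends the batch index to each observation name (`index_unique="-"`). The plain-Python sketch below only mimics that naming and intersection logic on toy lists. `merge_names` is a hypothetical helper, not part of anndata, the barcodes and gene names are made up, and the sorted variable order is an arbitrary choice of the sketch.

```python
from typing import Dict, List, Tuple


def merge_names(batches: List[Dict[str, List[str]]]) -> Tuple[List[str], List[str]]:
    """Mimic an inner join on variable names and batch-suffixed observation names."""
    # Keep only the variables that occur in every batch.
    shared = set(batches[0]["var"])
    for batch in batches[1:]:
        shared &= set(batch["var"])
    # Append the batch index so that identical barcodes from different batches stay distinct.
    obs_names = [f"{name}-{i}" for i, batch in enumerate(batches) for name in batch["obs"]]
    return obs_names, sorted(shared)


toy = [
    {"obs": ["AAAC", "AAAG"], "var": ["Gapdh", "Actb", "Sox2"]},
    {"obs": ["AAAC"], "var": ["Actb", "Gapdh"]},
]
obs_names, var_names = merge_names(toy)
print(obs_names)  # ['AAAC-0', 'AAAG-0', 'AAAC-1']
print(var_names)  # ['Actb', 'Gapdh']
```

The suffixing is why a barcode that appears in several loom files does not collide after merging.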

Remove the variable annotations and the layers that are not needed.

In [11]:
del adata.var, adata.layers['matrix'], adata.layers['ambiguous']

## Write the data

In [13]:
sc.write(DATA_DIR / "morris_data" / "adata.h5ad", adata)