### Notebook to add `scNym` labels and scores to query TB PBMC object

- **Developed by**: Carlos Talavera-López Ph.D
- **Institute of Computational Biology - Computational Health Centre - Helmholtz Munich**
- v221017

### Import required modules

In [1]:
import anndata
import scipy as sp
import scipy.sparse  # make sure `sp.sparse` is loaded for the sparse conversion below
import pandas as pd
import scanpy as sc

### Set up working environment

In [2]:
sc.settings.verbosity = 3
sc.logging.print_versions()
sc.settings.set_figure_params(dpi = 180, color_map = 'magma_r', dpi_save = 300, vector_friendly = True, format = 'svg')

-----
anndata     0.8.0
scanpy      1.9.1
-----
PIL                         9.2.0
asttokens                   NA
backcall                    0.2.0
beta_ufunc                  NA
binom_ufunc                 NA
cffi                        1.15.1
colorama                    0.4.5
cycler                      0.10.0
cython_runtime              NA
dateutil                    2.8.2
debugpy                     1.5.1
decorator                   5.1.1
entrypoints                 0.4
executing                   0.8.3
h5py                        3.7.0
hypergeom_ufunc             NA
ipykernel                   6.9.1
jedi                        0.18.1
joblib                      1.2.0
kiwisolver                  1.4.4
llvmlite                    0.39.1
matplotlib                  3.6.1
mpl_toolkits                NA
natsort                     8.2.0
nbinom_ufunc                NA
ncf_ufunc                   NA
numba                       0.56.2
numpy                       1.23.3
packaging           

### Format CTRL object

- Read in the `scNym`-annotated object to extract labels

In [3]:
query_scnym = sc.read('/home/cartalop/data/single_cell/lung/tb/working_objects/CaiY_PBMC_TB_post-scnym_ctl220717.h5ad')
query_scnym

AnnData object with n_obs × n_vars = 145381 × 20199
    obs: 'object', 'domain_label', 'cell_states', 'scNym', 'scNym_confidence'
    var: 'gene_id-query', 'n_cells', 'n_counts'
    uns: 'cell_states_colors', 'log1p', 'neighbors', 'object_colors', 'scNym_colors', 'scNym_probabilities', 'umap'
    obsm: 'X_scnym', 'X_umap'
    obsp: 'connectivities', 'distances'

- Read in raw object

In [4]:
query_raw = sc.read_h5ad('/home/cartalop/data/single_cell/lung/tb/merged/CaiY_PBMC-TB_QCed_pre-processed_ctl221017.h5ad')
query_raw

FileNotFoundError: [Errno 2] Unable to open file (unable to open file: name = '/home/cartalop/data/single_cell/lung/tb/merged/CaiY_PBMC-TB_QCed_pre-processed_ctl221017.h5ad', errno = 2, error message = 'No such file or directory', flags = 0, o_flags = 0)
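
The `FileNotFoundError` above means that no file exists at the given path, so every later cell that uses `query_raw` cannot run. One way to diagnose this is to check the path before reading and list the `.h5ad` files that are present in the same directory. The helper below is a minimal sketch; `find_h5ad_candidates` is a hypothetical name and is not part of `scanpy` or `anndata`.

```python
from pathlib import Path


def find_h5ad_candidates(path):
    """Return `path` as a Path if the file exists.

    Otherwise raise FileNotFoundError with the names of the .h5ad files
    found in the same directory, to help spot naming differences.
    (Hypothetical helper, not part of scanpy or anndata.)
    """
    path = Path(path)
    if path.is_file():
        return path
    parent = path.parent
    # Only list siblings when the parent directory itself exists.
    candidates = sorted(p.name for p in parent.glob("*.h5ad")) if parent.is_dir() else []
    listing = ", ".join(candidates) if candidates else "none (directory missing or empty)"
    raise FileNotFoundError(f"{path} not found; .h5ad files in {parent}: {listing}")
```

The returned path can then be passed to `sc.read_h5ad` as in the cell above.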

### Add annotations to raw object

In [None]:
query_annotated = query_raw.copy()
query_annotated

### Copy `scNym` labels and confidence scores to annotated object

In [None]:
query_annotated.obs.head()

In [None]:
# Positional assignment: assumes both objects hold the same cells in the same order
query_annotated.obs['scNym'] = query_scnym.obs['scNym'].values
query_annotated.obs['scNym_confidence'] = query_scnym.obs['scNym_confidence'].values
query_annotated.obs['scNym'].value_counts()

In [None]:
query_annotated.obs['scNym'].cat.categories
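
Assigning through `.values` copies the labels by position, so it is only correct when both objects contain the same cells in the same order. If that cannot be guaranteed, the columns can be aligned on the `obs` index (the cell barcodes) instead. The sketch below shows the idea on toy `pandas` data frames; `transfer_obs_columns` and the toy barcodes are hypothetical and only illustrate the alignment step.

```python
import pandas as pd


def transfer_obs_columns(target_obs, source_obs, columns):
    """Copy `columns` from source_obs into a copy of target_obs, aligned on the index.

    (Hypothetical helper for illustration; not part of scanpy or anndata.)
    """
    if not source_obs.index.is_unique:
        raise ValueError("source index contains duplicated cell barcodes")
    missing = target_obs.index.difference(source_obs.index)
    if len(missing) > 0:
        raise ValueError(f"{len(missing)} target cells are absent from the source")
    out = target_obs.copy()
    for col in columns:
        # .loc reorders the source rows to match the target's barcode order
        out[col] = source_obs.loc[out.index, col].values
    return out


# Toy data: same three cells, stored in a different order in each table
source = pd.DataFrame(
    {"scNym": pd.Categorical(["B", "T", "NK"]), "scNym_confidence": [0.9, 0.8, 0.7]},
    index=["c1", "c2", "c3"],
)
target = pd.DataFrame({"n_counts": [10, 20, 30]}, index=["c3", "c1", "c2"])

annotated = transfer_obs_columns(target, source, ["scNym", "scNym_confidence"])
```

Applied to this notebook, `target_obs` would be `query_annotated.obs` and `source_obs` would be `query_scnym.obs`, assuming both use the same barcodes as index.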

### Make matrix sparse

In [None]:
query_annotated.X = sp.sparse.csr_matrix(query_annotated.X)
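
Converting to CSR stores only the non-zero entries of the matrix. If `X` may already be sparse, the conversion can be skipped or reduced to a format change. The following is a small sketch on a toy array; `to_csr` is a hypothetical helper name.

```python
import numpy as np
from scipy import sparse


def to_csr(X):
    """Return X as a CSR matrix (hypothetical helper for illustration)."""
    if sparse.issparse(X):
        # Changes format only; an input that is already CSR is returned as-is
        return X.tocsr()
    return sparse.csr_matrix(X)


# Toy dense matrix: 2 cells x 3 genes with two non-zero entries
dense = np.array([[0.0, 0.0, 3.0], [1.0, 0.0, 0.0]])
X = to_csr(dense)

# Fraction of stored (non-zero) entries
density = X.nnz / (X.shape[0] * X.shape[1])
```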

### Save object

In [None]:
query_annotated.write('/home/cartalop/data/single_cell/lung/tb/working_objects/CaiY_TB-PBMC_scnym_annotated_ctl221017.h5ad')