## Integration

Only necessary if you are runing an analysis with multiple samples. The experiment sampled 19 COVID-19 patients and 7 control patients. There are a total of 26 samples in this analysis, thus, integration is necessary.

We are essentially gonna now do the same thing as we did for (1) loading the data, (2) doublet removal, (3) preprocessing, and (4) Clustering. However, we will do this for all samples, to make this more efficient, we made this into a function and have the function iterate through each sample file.

In [4]:
import gc  # Import garbage collector
import scanpy as sc
import scvi
import pandas as pd
import numpy as np
import os

def pp(csv_path):
    adata = sc.read_csv(csv_path).T
    sc.pp.filter_genes(adata, min_cells=10)
    sc.pp.highly_variable_genes(adata, n_top_genes=2000, subset=True, flavor='seurat_v3')
    scvi.model.SCVI.setup_anndata(adata)
    vae = scvi.model.SCVI(adata)
    vae.train()
    solo = scvi.external.SOLO.from_scvi_model(vae)
    solo.train()
    df = solo.predict()
    df['prediction'] = solo.predict(soft=False)
    df.index = df.index.map(lambda x: x[:-2])
    df['dif'] = df.doublet - df.singlet
    doublets = df[(df.prediction == 'doublet') & (df.dif > 0.4)]
    
    adata = sc.read_csv(csv_path).T
    adata.obs['Sample'] = csv_path.split('_')[2]
    adata.obs['doublet'] = adata.obs.index.isin(doublets.index)
    adata = adata[~adata.obs.doublet]
    
    sc.pp.filter_cells(adata, min_genes=200)
    adata.var['mt'] = adata.var_names.str.startswith('mt-')
    
    ribo_url = "http://software.broadinstitute.org/gsea/msigdb/download_geneset.jsp?geneSetName=KEGG_RIBOSOME&fileType=txt"
    ribo_genes = pd.read_table(ribo_url, skiprows=2, header=None)
    adata.var['ribo'] = adata.var_names.isin(ribo_genes[0].values)
    
    sc.pp.calculate_qc_metrics(adata, qc_vars=['mt', 'ribo'], percent_top=None, log1p=False, inplace=True)
    
    upper_lim = np.quantile(adata.obs.n_genes_by_counts.values, .98)
    adata = adata[adata.obs.n_genes_by_counts < upper_lim]
    adata = adata[adata.obs.pct_counts_mt < 20]
    adata = adata[adata.obs.pct_counts_ribo < 2]
    
    return adata

In [5]:
out = []
for file in os.listdir('GSE171524_RAW/'):
    processed_data = pp('GSE171524_RAW/' + file)
    out.append(processed_data)
    
    # Clear memory
    del processed_data
    gc.collect()

GPU available: False, used: False
TPU available: False, using: 0 TPU cores
HPU available: False, using: 0 HPUs
c:\Users\alexg\anaconda3\envs\bioenv\Lib\site-packages\lightning\pytorch\trainer\connectors\data_connector.py:424: The 'train_dataloader' does not have many workers which may be a bottleneck. Consider increasing the value of the `num_workers` argument` to `num_workers=11` in the `DataLoader` to improve performance.


Epoch 400/400: 100%|██████████| 400/400 [12:28<00:00,  2.08s/it, v_num=1, train_loss_step=335, train_loss_epoch=322]

`Trainer.fit` stopped: `max_epochs=400` reached.


Epoch 400/400: 100%|██████████| 400/400 [12:28<00:00,  1.87s/it, v_num=1, train_loss_step=335, train_loss_epoch=322]
[34mINFO    [0m Creating doublets, preparing SOLO model.                                                                  


GPU available: False, used: False
TPU available: False, using: 0 TPU cores
HPU available: False, using: 0 HPUs
c:\Users\alexg\anaconda3\envs\bioenv\Lib\site-packages\lightning\pytorch\trainer\connectors\data_connector.py:424: The 'train_dataloader' does not have many workers which may be a bottleneck. Consider increasing the value of the `num_workers` argument` to `num_workers=11` in the `DataLoader` to improve performance.
c:\Users\alexg\anaconda3\envs\bioenv\Lib\site-packages\lightning\pytorch\trainer\connectors\data_connector.py:424: The 'val_dataloader' does not have many workers which may be a bottleneck. Consider increasing the value of the `num_workers` argument` to `num_workers=11` in the `DataLoader` to improve performance.


Epoch 221/400:  55%|█████▌    | 221/400 [02:12<01:47,  1.67it/s, v_num=1, train_loss_step=0.232, train_loss_epoch=0.292]
Monitored metric validation_loss did not improve in the last 30 records. Best score: 0.274. Signaling Trainer to stop.


  return func(*args, **kwargs)
  return func(*args, **kwargs)
  adata.obs["n_genes"] = number
GPU available: False, used: False
TPU available: False, using: 0 TPU cores
HPU available: False, using: 0 HPUs
c:\Users\alexg\anaconda3\envs\bioenv\Lib\site-packages\lightning\pytorch\trainer\connectors\data_connector.py:424: The 'train_dataloader' does not have many workers which may be a bottleneck. Consider increasing the value of the `num_workers` argument` to `num_workers=11` in the `DataLoader` to improve performance.


Epoch 400/400: 100%|██████████| 400/400 [08:34<00:00,  1.18s/it, v_num=1, train_loss_step=416, train_loss_epoch=397]

`Trainer.fit` stopped: `max_epochs=400` reached.


Epoch 400/400: 100%|██████████| 400/400 [08:34<00:00,  1.29s/it, v_num=1, train_loss_step=416, train_loss_epoch=397]
[34mINFO    [0m Creating doublets, preparing SOLO model.                                                                  


GPU available: False, used: False
TPU available: False, using: 0 TPU cores
HPU available: False, using: 0 HPUs
c:\Users\alexg\anaconda3\envs\bioenv\Lib\site-packages\lightning\pytorch\trainer\connectors\data_connector.py:424: The 'train_dataloader' does not have many workers which may be a bottleneck. Consider increasing the value of the `num_workers` argument` to `num_workers=11` in the `DataLoader` to improve performance.
c:\Users\alexg\anaconda3\envs\bioenv\Lib\site-packages\lightning\pytorch\trainer\connectors\data_connector.py:424: The 'val_dataloader' does not have many workers which may be a bottleneck. Consider increasing the value of the `num_workers` argument` to `num_workers=11` in the `DataLoader` to improve performance.


Epoch 254/400:  64%|██████▎   | 254/400 [01:29<00:51,  2.83it/s, v_num=1, train_loss_step=0.304, train_loss_epoch=0.296]
Monitored metric validation_loss did not improve in the last 30 records. Best score: 0.312. Signaling Trainer to stop.


  return func(*args, **kwargs)
  return func(*args, **kwargs)
  adata.obs["n_genes"] = number
GPU available: False, used: False
TPU available: False, using: 0 TPU cores
HPU available: False, using: 0 HPUs
c:\Users\alexg\anaconda3\envs\bioenv\Lib\site-packages\lightning\pytorch\trainer\connectors\data_connector.py:424: The 'train_dataloader' does not have many workers which may be a bottleneck. Consider increasing the value of the `num_workers` argument` to `num_workers=11` in the `DataLoader` to improve performance.


Epoch 400/400: 100%|██████████| 400/400 [13:12<00:00,  1.99s/it, v_num=1, train_loss_step=615, train_loss_epoch=334]   

`Trainer.fit` stopped: `max_epochs=400` reached.


Epoch 400/400: 100%|██████████| 400/400 [13:12<00:00,  1.98s/it, v_num=1, train_loss_step=615, train_loss_epoch=334]
[34mINFO    [0m Creating doublets, preparing SOLO model.                                                                  


GPU available: False, used: False
TPU available: False, using: 0 TPU cores
HPU available: False, using: 0 HPUs
c:\Users\alexg\anaconda3\envs\bioenv\Lib\site-packages\lightning\pytorch\trainer\connectors\data_connector.py:424: The 'train_dataloader' does not have many workers which may be a bottleneck. Consider increasing the value of the `num_workers` argument` to `num_workers=11` in the `DataLoader` to improve performance.
c:\Users\alexg\anaconda3\envs\bioenv\Lib\site-packages\lightning\pytorch\trainer\connectors\data_connector.py:424: The 'val_dataloader' does not have many workers which may be a bottleneck. Consider increasing the value of the `num_workers` argument` to `num_workers=11` in the `DataLoader` to improve performance.


Epoch 230/400:  57%|█████▊    | 230/400 [02:00<01:29,  1.90it/s, v_num=1, train_loss_step=0.373, train_loss_epoch=0.311] 
Monitored metric validation_loss did not improve in the last 30 records. Best score: 0.312. Signaling Trainer to stop.


  return func(*args, **kwargs)
  return func(*args, **kwargs)
  adata.obs["n_genes"] = number
GPU available: False, used: False
TPU available: False, using: 0 TPU cores
HPU available: False, using: 0 HPUs
c:\Users\alexg\anaconda3\envs\bioenv\Lib\site-packages\lightning\pytorch\trainer\connectors\data_connector.py:424: The 'train_dataloader' does not have many workers which may be a bottleneck. Consider increasing the value of the `num_workers` argument` to `num_workers=11` in the `DataLoader` to improve performance.


Epoch 400/400: 100%|██████████| 400/400 [08:17<00:00,  1.33s/it, v_num=1, train_loss_step=302, train_loss_epoch=308]

`Trainer.fit` stopped: `max_epochs=400` reached.


Epoch 400/400: 100%|██████████| 400/400 [08:17<00:00,  1.24s/it, v_num=1, train_loss_step=302, train_loss_epoch=308]
[34mINFO    [0m Creating doublets, preparing SOLO model.                                                                  


GPU available: False, used: False
TPU available: False, using: 0 TPU cores
HPU available: False, using: 0 HPUs
c:\Users\alexg\anaconda3\envs\bioenv\Lib\site-packages\lightning\pytorch\trainer\connectors\data_connector.py:424: The 'train_dataloader' does not have many workers which may be a bottleneck. Consider increasing the value of the `num_workers` argument` to `num_workers=11` in the `DataLoader` to improve performance.
c:\Users\alexg\anaconda3\envs\bioenv\Lib\site-packages\lightning\pytorch\trainer\connectors\data_connector.py:424: The 'val_dataloader' does not have many workers which may be a bottleneck. Consider increasing the value of the `num_workers` argument` to `num_workers=11` in the `DataLoader` to improve performance.


Epoch 202/400:  50%|█████     | 202/400 [01:17<01:16,  2.60it/s, v_num=1, train_loss_step=0.259, train_loss_epoch=0.235] 
Monitored metric validation_loss did not improve in the last 30 records. Best score: 0.253. Signaling Trainer to stop.


  return func(*args, **kwargs)
  return func(*args, **kwargs)
  adata.obs["n_genes"] = number
GPU available: False, used: False
TPU available: False, using: 0 TPU cores
HPU available: False, using: 0 HPUs
c:\Users\alexg\anaconda3\envs\bioenv\Lib\site-packages\lightning\pytorch\trainer\connectors\data_connector.py:424: The 'train_dataloader' does not have many workers which may be a bottleneck. Consider increasing the value of the `num_workers` argument` to `num_workers=11` in the `DataLoader` to improve performance.


Epoch 400/400: 100%|██████████| 400/400 [10:34<00:00,  1.54s/it, v_num=1, train_loss_step=318, train_loss_epoch=306]

`Trainer.fit` stopped: `max_epochs=400` reached.


Epoch 400/400: 100%|██████████| 400/400 [10:34<00:00,  1.59s/it, v_num=1, train_loss_step=318, train_loss_epoch=306]
[34mINFO    [0m Creating doublets, preparing SOLO model.                                                                  


GPU available: False, used: False
TPU available: False, using: 0 TPU cores
HPU available: False, using: 0 HPUs
c:\Users\alexg\anaconda3\envs\bioenv\Lib\site-packages\lightning\pytorch\trainer\connectors\data_connector.py:424: The 'train_dataloader' does not have many workers which may be a bottleneck. Consider increasing the value of the `num_workers` argument` to `num_workers=11` in the `DataLoader` to improve performance.
c:\Users\alexg\anaconda3\envs\bioenv\Lib\site-packages\lightning\pytorch\trainer\connectors\data_connector.py:424: The 'val_dataloader' does not have many workers which may be a bottleneck. Consider increasing the value of the `num_workers` argument` to `num_workers=11` in the `DataLoader` to improve performance.


Epoch 188/400:  47%|████▋     | 188/400 [01:19<01:29,  2.36it/s, v_num=1, train_loss_step=0.29, train_loss_epoch=0.232] 
Monitored metric validation_loss did not improve in the last 30 records. Best score: 0.210. Signaling Trainer to stop.


  return func(*args, **kwargs)
  return func(*args, **kwargs)
  adata.obs["n_genes"] = number
GPU available: False, used: False
TPU available: False, using: 0 TPU cores
HPU available: False, using: 0 HPUs
c:\Users\alexg\anaconda3\envs\bioenv\Lib\site-packages\lightning\pytorch\trainer\connectors\data_connector.py:424: The 'train_dataloader' does not have many workers which may be a bottleneck. Consider increasing the value of the `num_workers` argument` to `num_workers=11` in the `DataLoader` to improve performance.


Epoch 400/400: 100%|██████████| 400/400 [07:27<00:00,  1.14s/it, v_num=1, train_loss_step=358, train_loss_epoch=325]

`Trainer.fit` stopped: `max_epochs=400` reached.


Epoch 400/400: 100%|██████████| 400/400 [07:27<00:00,  1.12s/it, v_num=1, train_loss_step=358, train_loss_epoch=325]
[34mINFO    [0m Creating doublets, preparing SOLO model.                                                                  


GPU available: False, used: False
TPU available: False, using: 0 TPU cores
HPU available: False, using: 0 HPUs
c:\Users\alexg\anaconda3\envs\bioenv\Lib\site-packages\lightning\pytorch\trainer\connectors\data_connector.py:424: The 'train_dataloader' does not have many workers which may be a bottleneck. Consider increasing the value of the `num_workers` argument` to `num_workers=11` in the `DataLoader` to improve performance.
c:\Users\alexg\anaconda3\envs\bioenv\Lib\site-packages\lightning\pytorch\trainer\connectors\data_connector.py:424: The 'val_dataloader' does not have many workers which may be a bottleneck. Consider increasing the value of the `num_workers` argument` to `num_workers=11` in the `DataLoader` to improve performance.


Epoch 152/400:  38%|███▊      | 152/400 [00:48<01:18,  3.15it/s, v_num=1, train_loss_step=0.229, train_loss_epoch=0.266]
Monitored metric validation_loss did not improve in the last 30 records. Best score: 0.257. Signaling Trainer to stop.


  return func(*args, **kwargs)
  return func(*args, **kwargs)
  adata.obs["n_genes"] = number
GPU available: False, used: False
TPU available: False, using: 0 TPU cores
HPU available: False, using: 0 HPUs
c:\Users\alexg\anaconda3\envs\bioenv\Lib\site-packages\lightning\pytorch\trainer\connectors\data_connector.py:424: The 'train_dataloader' does not have many workers which may be a bottleneck. Consider increasing the value of the `num_workers` argument` to `num_workers=11` in the `DataLoader` to improve performance.


Epoch 400/400: 100%|██████████| 400/400 [08:57<00:00,  1.26s/it, v_num=1, train_loss_step=300, train_loss_epoch=290]

`Trainer.fit` stopped: `max_epochs=400` reached.


Epoch 400/400: 100%|██████████| 400/400 [08:57<00:00,  1.34s/it, v_num=1, train_loss_step=300, train_loss_epoch=290]
[34mINFO    [0m Creating doublets, preparing SOLO model.                                                                  


GPU available: False, used: False
TPU available: False, using: 0 TPU cores
HPU available: False, using: 0 HPUs
c:\Users\alexg\anaconda3\envs\bioenv\Lib\site-packages\lightning\pytorch\trainer\connectors\data_connector.py:424: The 'train_dataloader' does not have many workers which may be a bottleneck. Consider increasing the value of the `num_workers` argument` to `num_workers=11` in the `DataLoader` to improve performance.
c:\Users\alexg\anaconda3\envs\bioenv\Lib\site-packages\lightning\pytorch\trainer\connectors\data_connector.py:424: The 'val_dataloader' does not have many workers which may be a bottleneck. Consider increasing the value of the `num_workers` argument` to `num_workers=11` in the `DataLoader` to improve performance.


Epoch 271/400:  68%|██████▊   | 271/400 [01:42<00:48,  2.64it/s, v_num=1, train_loss_step=0.466, train_loss_epoch=0.246]  
Monitored metric validation_loss did not improve in the last 30 records. Best score: 0.236. Signaling Trainer to stop.


  return func(*args, **kwargs)
  return func(*args, **kwargs)
  adata.obs["n_genes"] = number
GPU available: False, used: False
TPU available: False, using: 0 TPU cores
HPU available: False, using: 0 HPUs
c:\Users\alexg\anaconda3\envs\bioenv\Lib\site-packages\lightning\pytorch\trainer\connectors\data_connector.py:424: The 'train_dataloader' does not have many workers which may be a bottleneck. Consider increasing the value of the `num_workers` argument` to `num_workers=11` in the `DataLoader` to improve performance.


Epoch 400/400: 100%|██████████| 400/400 [05:37<00:00,  1.19it/s, v_num=1, train_loss_step=356, train_loss_epoch=332]

`Trainer.fit` stopped: `max_epochs=400` reached.


Epoch 400/400: 100%|██████████| 400/400 [05:37<00:00,  1.19it/s, v_num=1, train_loss_step=356, train_loss_epoch=332]
[34mINFO    [0m Creating doublets, preparing SOLO model.                                                                  


GPU available: False, used: False
TPU available: False, using: 0 TPU cores
HPU available: False, using: 0 HPUs
c:\Users\alexg\anaconda3\envs\bioenv\Lib\site-packages\lightning\pytorch\trainer\connectors\data_connector.py:424: The 'train_dataloader' does not have many workers which may be a bottleneck. Consider increasing the value of the `num_workers` argument` to `num_workers=11` in the `DataLoader` to improve performance.
c:\Users\alexg\anaconda3\envs\bioenv\Lib\site-packages\lightning\pytorch\trainer\connectors\data_connector.py:424: The 'val_dataloader' does not have many workers which may be a bottleneck. Consider increasing the value of the `num_workers` argument` to `num_workers=11` in the `DataLoader` to improve performance.


Epoch 144/400:  36%|███▌      | 144/400 [00:33<00:58,  4.35it/s, v_num=1, train_loss_step=0.275, train_loss_epoch=0.278]
Monitored metric validation_loss did not improve in the last 30 records. Best score: 0.258. Signaling Trainer to stop.


  return func(*args, **kwargs)
  return func(*args, **kwargs)
  adata.obs["n_genes"] = number
GPU available: False, used: False
TPU available: False, using: 0 TPU cores
HPU available: False, using: 0 HPUs
c:\Users\alexg\anaconda3\envs\bioenv\Lib\site-packages\lightning\pytorch\trainer\connectors\data_connector.py:424: The 'train_dataloader' does not have many workers which may be a bottleneck. Consider increasing the value of the `num_workers` argument` to `num_workers=11` in the `DataLoader` to improve performance.


Epoch 400/400: 100%|██████████| 400/400 [09:08<00:00,  1.49s/it, v_num=1, train_loss_step=533, train_loss_epoch=475]

`Trainer.fit` stopped: `max_epochs=400` reached.


Epoch 400/400: 100%|██████████| 400/400 [09:08<00:00,  1.37s/it, v_num=1, train_loss_step=533, train_loss_epoch=475]
[34mINFO    [0m Creating doublets, preparing SOLO model.                                                                  


GPU available: False, used: False
TPU available: False, using: 0 TPU cores
HPU available: False, using: 0 HPUs
c:\Users\alexg\anaconda3\envs\bioenv\Lib\site-packages\lightning\pytorch\trainer\connectors\data_connector.py:424: The 'train_dataloader' does not have many workers which may be a bottleneck. Consider increasing the value of the `num_workers` argument` to `num_workers=11` in the `DataLoader` to improve performance.
c:\Users\alexg\anaconda3\envs\bioenv\Lib\site-packages\lightning\pytorch\trainer\connectors\data_connector.py:424: The 'val_dataloader' does not have many workers which may be a bottleneck. Consider increasing the value of the `num_workers` argument` to `num_workers=11` in the `DataLoader` to improve performance.


Epoch 168/400:  42%|████▏     | 168/400 [01:02<01:26,  2.70it/s, v_num=1, train_loss_step=0.357, train_loss_epoch=0.352]
Monitored metric validation_loss did not improve in the last 30 records. Best score: 0.334. Signaling Trainer to stop.


  return func(*args, **kwargs)
  return func(*args, **kwargs)
  adata.obs["n_genes"] = number
GPU available: False, used: False
TPU available: False, using: 0 TPU cores
HPU available: False, using: 0 HPUs
c:\Users\alexg\anaconda3\envs\bioenv\Lib\site-packages\lightning\pytorch\trainer\connectors\data_connector.py:424: The 'train_dataloader' does not have many workers which may be a bottleneck. Consider increasing the value of the `num_workers` argument` to `num_workers=11` in the `DataLoader` to improve performance.


Epoch 400/400: 100%|██████████| 400/400 [06:38<00:00,  1.02s/it, v_num=1, train_loss_step=271, train_loss_epoch=260]

`Trainer.fit` stopped: `max_epochs=400` reached.


Epoch 400/400: 100%|██████████| 400/400 [06:38<00:00,  1.00it/s, v_num=1, train_loss_step=271, train_loss_epoch=260]
[34mINFO    [0m Creating doublets, preparing SOLO model.                                                                  


GPU available: False, used: False
TPU available: False, using: 0 TPU cores
HPU available: False, using: 0 HPUs
c:\Users\alexg\anaconda3\envs\bioenv\Lib\site-packages\lightning\pytorch\trainer\connectors\data_connector.py:424: The 'train_dataloader' does not have many workers which may be a bottleneck. Consider increasing the value of the `num_workers` argument` to `num_workers=11` in the `DataLoader` to improve performance.
c:\Users\alexg\anaconda3\envs\bioenv\Lib\site-packages\lightning\pytorch\trainer\connectors\data_connector.py:424: The 'val_dataloader' does not have many workers which may be a bottleneck. Consider increasing the value of the `num_workers` argument` to `num_workers=11` in the `DataLoader` to improve performance.


Epoch 199/400:  50%|████▉     | 199/400 [00:52<00:53,  3.77it/s, v_num=1, train_loss_step=0.288, train_loss_epoch=0.275]
Monitored metric validation_loss did not improve in the last 30 records. Best score: 0.296. Signaling Trainer to stop.


  return func(*args, **kwargs)
  return func(*args, **kwargs)
  adata.obs["n_genes"] = number
GPU available: False, used: False
TPU available: False, using: 0 TPU cores
HPU available: False, using: 0 HPUs
c:\Users\alexg\anaconda3\envs\bioenv\Lib\site-packages\lightning\pytorch\trainer\connectors\data_connector.py:424: The 'train_dataloader' does not have many workers which may be a bottleneck. Consider increasing the value of the `num_workers` argument` to `num_workers=11` in the `DataLoader` to improve performance.


Epoch 400/400: 100%|██████████| 400/400 [08:31<00:00,  1.21s/it, v_num=1, train_loss_step=404, train_loss_epoch=327]

`Trainer.fit` stopped: `max_epochs=400` reached.


Epoch 400/400: 100%|██████████| 400/400 [08:31<00:00,  1.28s/it, v_num=1, train_loss_step=404, train_loss_epoch=327]
[34mINFO    [0m Creating doublets, preparing SOLO model.                                                                  


GPU available: False, used: False
TPU available: False, using: 0 TPU cores
HPU available: False, using: 0 HPUs
c:\Users\alexg\anaconda3\envs\bioenv\Lib\site-packages\lightning\pytorch\trainer\connectors\data_connector.py:424: The 'train_dataloader' does not have many workers which may be a bottleneck. Consider increasing the value of the `num_workers` argument` to `num_workers=11` in the `DataLoader` to improve performance.
c:\Users\alexg\anaconda3\envs\bioenv\Lib\site-packages\lightning\pytorch\trainer\connectors\data_connector.py:424: The 'val_dataloader' does not have many workers which may be a bottleneck. Consider increasing the value of the `num_workers` argument` to `num_workers=11` in the `DataLoader` to improve performance.


Epoch 125/400:  31%|███▏      | 125/400 [00:44<01:38,  2.79it/s, v_num=1, train_loss_step=0.123, train_loss_epoch=0.299]
Monitored metric validation_loss did not improve in the last 30 records. Best score: 0.279. Signaling Trainer to stop.


  return func(*args, **kwargs)
  return func(*args, **kwargs)
  adata.obs["n_genes"] = number
GPU available: False, used: False
TPU available: False, using: 0 TPU cores
HPU available: False, using: 0 HPUs
c:\Users\alexg\anaconda3\envs\bioenv\Lib\site-packages\lightning\pytorch\trainer\connectors\data_connector.py:424: The 'train_dataloader' does not have many workers which may be a bottleneck. Consider increasing the value of the `num_workers` argument` to `num_workers=11` in the `DataLoader` to improve performance.


Epoch 400/400: 100%|██████████| 400/400 [05:32<00:00,  1.19it/s, v_num=1, train_loss_step=306, train_loss_epoch=344]

`Trainer.fit` stopped: `max_epochs=400` reached.


Epoch 400/400: 100%|██████████| 400/400 [05:32<00:00,  1.20it/s, v_num=1, train_loss_step=306, train_loss_epoch=344]
[34mINFO    [0m Creating doublets, preparing SOLO model.                                                                  


GPU available: False, used: False
TPU available: False, using: 0 TPU cores
HPU available: False, using: 0 HPUs
c:\Users\alexg\anaconda3\envs\bioenv\Lib\site-packages\lightning\pytorch\trainer\connectors\data_connector.py:424: The 'train_dataloader' does not have many workers which may be a bottleneck. Consider increasing the value of the `num_workers` argument` to `num_workers=11` in the `DataLoader` to improve performance.
c:\Users\alexg\anaconda3\envs\bioenv\Lib\site-packages\lightning\pytorch\trainer\connectors\data_connector.py:424: The 'val_dataloader' does not have many workers which may be a bottleneck. Consider increasing the value of the `num_workers` argument` to `num_workers=11` in the `DataLoader` to improve performance.


Epoch 186/400:  46%|████▋     | 186/400 [00:47<00:54,  3.93it/s, v_num=1, train_loss_step=0.44, train_loss_epoch=0.329] 
Monitored metric validation_loss did not improve in the last 30 records. Best score: 0.352. Signaling Trainer to stop.


  return func(*args, **kwargs)
  return func(*args, **kwargs)
  adata.obs["n_genes"] = number
GPU available: False, used: False
TPU available: False, using: 0 TPU cores
HPU available: False, using: 0 HPUs
c:\Users\alexg\anaconda3\envs\bioenv\Lib\site-packages\lightning\pytorch\trainer\connectors\data_connector.py:424: The 'train_dataloader' does not have many workers which may be a bottleneck. Consider increasing the value of the `num_workers` argument` to `num_workers=11` in the `DataLoader` to improve performance.


Epoch 400/400: 100%|██████████| 400/400 [14:04<00:00,  2.07s/it, v_num=1, train_loss_step=319, train_loss_epoch=363]

`Trainer.fit` stopped: `max_epochs=400` reached.


Epoch 400/400: 100%|██████████| 400/400 [14:04<00:00,  2.11s/it, v_num=1, train_loss_step=319, train_loss_epoch=363]
[34mINFO    [0m Creating doublets, preparing SOLO model.                                                                  


GPU available: False, used: False
TPU available: False, using: 0 TPU cores
HPU available: False, using: 0 HPUs
c:\Users\alexg\anaconda3\envs\bioenv\Lib\site-packages\lightning\pytorch\trainer\connectors\data_connector.py:424: The 'train_dataloader' does not have many workers which may be a bottleneck. Consider increasing the value of the `num_workers` argument` to `num_workers=11` in the `DataLoader` to improve performance.
c:\Users\alexg\anaconda3\envs\bioenv\Lib\site-packages\lightning\pytorch\trainer\connectors\data_connector.py:424: The 'val_dataloader' does not have many workers which may be a bottleneck. Consider increasing the value of the `num_workers` argument` to `num_workers=11` in the `DataLoader` to improve performance.


Epoch 177/400:  44%|████▍     | 177/400 [01:52<02:21,  1.57it/s, v_num=1, train_loss_step=0.337, train_loss_epoch=0.297]
Monitored metric validation_loss did not improve in the last 30 records. Best score: 0.293. Signaling Trainer to stop.


  return func(*args, **kwargs)
  return func(*args, **kwargs)
  adata.obs["n_genes"] = number
GPU available: False, used: False
TPU available: False, using: 0 TPU cores
HPU available: False, using: 0 HPUs
c:\Users\alexg\anaconda3\envs\bioenv\Lib\site-packages\lightning\pytorch\trainer\connectors\data_connector.py:424: The 'train_dataloader' does not have many workers which may be a bottleneck. Consider increasing the value of the `num_workers` argument` to `num_workers=11` in the `DataLoader` to improve performance.


Epoch 400/400: 100%|██████████| 400/400 [09:09<00:00,  1.31s/it, v_num=1, train_loss_step=327, train_loss_epoch=345]

`Trainer.fit` stopped: `max_epochs=400` reached.


Epoch 400/400: 100%|██████████| 400/400 [09:09<00:00,  1.37s/it, v_num=1, train_loss_step=327, train_loss_epoch=345]
[34mINFO    [0m Creating doublets, preparing SOLO model.                                                                  


GPU available: False, used: False
TPU available: False, using: 0 TPU cores
HPU available: False, using: 0 HPUs
c:\Users\alexg\anaconda3\envs\bioenv\Lib\site-packages\lightning\pytorch\trainer\connectors\data_connector.py:424: The 'train_dataloader' does not have many workers which may be a bottleneck. Consider increasing the value of the `num_workers` argument` to `num_workers=11` in the `DataLoader` to improve performance.
c:\Users\alexg\anaconda3\envs\bioenv\Lib\site-packages\lightning\pytorch\trainer\connectors\data_connector.py:424: The 'val_dataloader' does not have many workers which may be a bottleneck. Consider increasing the value of the `num_workers` argument` to `num_workers=11` in the `DataLoader` to improve performance.


Epoch 225/400:  56%|█████▋    | 225/400 [01:24<01:05,  2.66it/s, v_num=1, train_loss_step=1.07, train_loss_epoch=0.359]  
Monitored metric validation_loss did not improve in the last 30 records. Best score: 0.335. Signaling Trainer to stop.


  return func(*args, **kwargs)
  return func(*args, **kwargs)
  adata.obs["n_genes"] = number
GPU available: False, used: False
TPU available: False, using: 0 TPU cores
HPU available: False, using: 0 HPUs
c:\Users\alexg\anaconda3\envs\bioenv\Lib\site-packages\lightning\pytorch\trainer\connectors\data_connector.py:424: The 'train_dataloader' does not have many workers which may be a bottleneck. Consider increasing the value of the `num_workers` argument` to `num_workers=11` in the `DataLoader` to improve performance.


Epoch 400/400: 100%|██████████| 400/400 [07:42<00:00,  1.10s/it, v_num=1, train_loss_step=366, train_loss_epoch=342]

`Trainer.fit` stopped: `max_epochs=400` reached.


Epoch 400/400: 100%|██████████| 400/400 [07:42<00:00,  1.16s/it, v_num=1, train_loss_step=366, train_loss_epoch=342]
[34mINFO    [0m Creating doublets, preparing SOLO model.                                                                  


GPU available: False, used: False
TPU available: False, using: 0 TPU cores
HPU available: False, using: 0 HPUs
c:\Users\alexg\anaconda3\envs\bioenv\Lib\site-packages\lightning\pytorch\trainer\connectors\data_connector.py:424: The 'train_dataloader' does not have many workers which may be a bottleneck. Consider increasing the value of the `num_workers` argument` to `num_workers=11` in the `DataLoader` to improve performance.
c:\Users\alexg\anaconda3\envs\bioenv\Lib\site-packages\lightning\pytorch\trainer\connectors\data_connector.py:424: The 'val_dataloader' does not have many workers which may be a bottleneck. Consider increasing the value of the `num_workers` argument` to `num_workers=11` in the `DataLoader` to improve performance.


Epoch 180/400:  45%|████▌     | 180/400 [01:01<01:14,  2.94it/s, v_num=1, train_loss_step=0.401, train_loss_epoch=0.306]
Monitored metric validation_loss did not improve in the last 30 records. Best score: 0.289. Signaling Trainer to stop.


  return func(*args, **kwargs)
  return func(*args, **kwargs)
  adata.obs["n_genes"] = number
GPU available: False, used: False
TPU available: False, using: 0 TPU cores
HPU available: False, using: 0 HPUs
c:\Users\alexg\anaconda3\envs\bioenv\Lib\site-packages\lightning\pytorch\trainer\connectors\data_connector.py:424: The 'train_dataloader' does not have many workers which may be a bottleneck. Consider increasing the value of the `num_workers` argument` to `num_workers=11` in the `DataLoader` to improve performance.


Epoch 400/400: 100%|██████████| 400/400 [06:51<00:00,  1.01s/it, v_num=1, train_loss_step=351, train_loss_epoch=338]

`Trainer.fit` stopped: `max_epochs=400` reached.


Epoch 400/400: 100%|██████████| 400/400 [06:51<00:00,  1.03s/it, v_num=1, train_loss_step=351, train_loss_epoch=338]
[34mINFO    [0m Creating doublets, preparing SOLO model.                                                                  


GPU available: False, used: False
TPU available: False, using: 0 TPU cores
HPU available: False, using: 0 HPUs
c:\Users\alexg\anaconda3\envs\bioenv\Lib\site-packages\lightning\pytorch\trainer\connectors\data_connector.py:424: The 'train_dataloader' does not have many workers which may be a bottleneck. Consider increasing the value of the `num_workers` argument` to `num_workers=11` in the `DataLoader` to improve performance.
c:\Users\alexg\anaconda3\envs\bioenv\Lib\site-packages\lightning\pytorch\trainer\connectors\data_connector.py:424: The 'val_dataloader' does not have many workers which may be a bottleneck. Consider increasing the value of the `num_workers` argument` to `num_workers=11` in the `DataLoader` to improve performance.


Epoch 237/400:  59%|█████▉    | 237/400 [01:04<00:44,  3.67it/s, v_num=1, train_loss_step=1.17, train_loss_epoch=0.251]  
Monitored metric validation_loss did not improve in the last 30 records. Best score: 0.253. Signaling Trainer to stop.


  return func(*args, **kwargs)
  return func(*args, **kwargs)
  adata.obs["n_genes"] = number
GPU available: False, used: False
TPU available: False, using: 0 TPU cores
HPU available: False, using: 0 HPUs
c:\Users\alexg\anaconda3\envs\bioenv\Lib\site-packages\lightning\pytorch\trainer\connectors\data_connector.py:424: The 'train_dataloader' does not have many workers which may be a bottleneck. Consider increasing the value of the `num_workers` argument` to `num_workers=11` in the `DataLoader` to improve performance.


Epoch 400/400: 100%|██████████| 400/400 [02:56<00:00,  2.48it/s, v_num=1, train_loss_step=318, train_loss_epoch=314]

`Trainer.fit` stopped: `max_epochs=400` reached.


Epoch 400/400: 100%|██████████| 400/400 [02:56<00:00,  2.27it/s, v_num=1, train_loss_step=318, train_loss_epoch=314]
[34mINFO    [0m Creating doublets, preparing SOLO model.                                                                  


GPU available: False, used: False
TPU available: False, using: 0 TPU cores
HPU available: False, using: 0 HPUs
c:\Users\alexg\anaconda3\envs\bioenv\Lib\site-packages\lightning\pytorch\trainer\connectors\data_connector.py:424: The 'train_dataloader' does not have many workers which may be a bottleneck. Consider increasing the value of the `num_workers` argument` to `num_workers=11` in the `DataLoader` to improve performance.
c:\Users\alexg\anaconda3\envs\bioenv\Lib\site-packages\lightning\pytorch\trainer\connectors\data_connector.py:424: The 'val_dataloader' does not have many workers which may be a bottleneck. Consider increasing the value of the `num_workers` argument` to `num_workers=11` in the `DataLoader` to improve performance.


Epoch 209/400:  52%|█████▏    | 209/400 [00:25<00:23,  8.19it/s, v_num=1, train_loss_step=0.307, train_loss_epoch=0.276]
Monitored metric validation_loss did not improve in the last 30 records. Best score: 0.290. Signaling Trainer to stop.


  return func(*args, **kwargs)
  return func(*args, **kwargs)
  adata.obs["n_genes"] = number
GPU available: False, used: False
TPU available: False, using: 0 TPU cores
HPU available: False, using: 0 HPUs
c:\Users\alexg\anaconda3\envs\bioenv\Lib\site-packages\lightning\pytorch\trainer\connectors\data_connector.py:424: The 'train_dataloader' does not have many workers which may be a bottleneck. Consider increasing the value of the `num_workers` argument` to `num_workers=11` in the `DataLoader` to improve performance.


Epoch 400/400: 100%|██████████| 400/400 [06:13<00:00,  1.12it/s, v_num=1, train_loss_step=332, train_loss_epoch=361]

`Trainer.fit` stopped: `max_epochs=400` reached.


Epoch 400/400: 100%|██████████| 400/400 [06:13<00:00,  1.07it/s, v_num=1, train_loss_step=332, train_loss_epoch=361]
[34mINFO    [0m Creating doublets, preparing SOLO model.                                                                  


GPU available: False, used: False
TPU available: False, using: 0 TPU cores
HPU available: False, using: 0 HPUs
c:\Users\alexg\anaconda3\envs\bioenv\Lib\site-packages\lightning\pytorch\trainer\connectors\data_connector.py:424: The 'train_dataloader' does not have many workers which may be a bottleneck. Consider increasing the value of the `num_workers` argument` to `num_workers=11` in the `DataLoader` to improve performance.
c:\Users\alexg\anaconda3\envs\bioenv\Lib\site-packages\lightning\pytorch\trainer\connectors\data_connector.py:424: The 'val_dataloader' does not have many workers which may be a bottleneck. Consider increasing the value of the `num_workers` argument` to `num_workers=11` in the `DataLoader` to improve performance.


Epoch 224/400:  56%|█████▌    | 224/400 [00:58<00:46,  3.81it/s, v_num=1, train_loss_step=0.299, train_loss_epoch=0.346]
Monitored metric validation_loss did not improve in the last 30 records. Best score: 0.353. Signaling Trainer to stop.


  return func(*args, **kwargs)
  return func(*args, **kwargs)
  adata.obs["n_genes"] = number
GPU available: False, used: False
TPU available: False, using: 0 TPU cores
HPU available: False, using: 0 HPUs
c:\Users\alexg\anaconda3\envs\bioenv\Lib\site-packages\lightning\pytorch\trainer\connectors\data_connector.py:424: The 'train_dataloader' does not have many workers which may be a bottleneck. Consider increasing the value of the `num_workers` argument` to `num_workers=11` in the `DataLoader` to improve performance.


Epoch 400/400: 100%|██████████| 400/400 [07:14<00:00,  1.12s/it, v_num=1, train_loss_step=349, train_loss_epoch=319]

`Trainer.fit` stopped: `max_epochs=400` reached.


Epoch 400/400: 100%|██████████| 400/400 [07:14<00:00,  1.09s/it, v_num=1, train_loss_step=349, train_loss_epoch=319]
[34mINFO    [0m Creating doublets, preparing SOLO model.                                                                  


GPU available: False, used: False
TPU available: False, using: 0 TPU cores
HPU available: False, using: 0 HPUs
c:\Users\alexg\anaconda3\envs\bioenv\Lib\site-packages\lightning\pytorch\trainer\connectors\data_connector.py:424: The 'train_dataloader' does not have many workers which may be a bottleneck. Consider increasing the value of the `num_workers` argument` to `num_workers=11` in the `DataLoader` to improve performance.
c:\Users\alexg\anaconda3\envs\bioenv\Lib\site-packages\lightning\pytorch\trainer\connectors\data_connector.py:424: The 'val_dataloader' does not have many workers which may be a bottleneck. Consider increasing the value of the `num_workers` argument` to `num_workers=11` in the `DataLoader` to improve performance.


Epoch 297/400:  74%|███████▍  | 297/400 [01:29<00:31,  3.31it/s, v_num=1, train_loss_step=0.272, train_loss_epoch=0.308]
Monitored metric validation_loss did not improve in the last 30 records. Best score: 0.314. Signaling Trainer to stop.


  return func(*args, **kwargs)
  return func(*args, **kwargs)
  adata.obs["n_genes"] = number
GPU available: False, used: False
TPU available: False, using: 0 TPU cores
HPU available: False, using: 0 HPUs
c:\Users\alexg\anaconda3\envs\bioenv\Lib\site-packages\lightning\pytorch\trainer\connectors\data_connector.py:424: The 'train_dataloader' does not have many workers which may be a bottleneck. Consider increasing the value of the `num_workers` argument` to `num_workers=11` in the `DataLoader` to improve performance.


Epoch 400/400: 100%|██████████| 400/400 [09:12<00:00,  1.37s/it, v_num=1, train_loss_step=315, train_loss_epoch=371]

`Trainer.fit` stopped: `max_epochs=400` reached.


Epoch 400/400: 100%|██████████| 400/400 [09:12<00:00,  1.38s/it, v_num=1, train_loss_step=315, train_loss_epoch=371]
[34mINFO    [0m Creating doublets, preparing SOLO model.                                                                  


GPU available: False, used: False
TPU available: False, using: 0 TPU cores
HPU available: False, using: 0 HPUs
c:\Users\alexg\anaconda3\envs\bioenv\Lib\site-packages\lightning\pytorch\trainer\connectors\data_connector.py:424: The 'train_dataloader' does not have many workers which may be a bottleneck. Consider increasing the value of the `num_workers` argument` to `num_workers=11` in the `DataLoader` to improve performance.
c:\Users\alexg\anaconda3\envs\bioenv\Lib\site-packages\lightning\pytorch\trainer\connectors\data_connector.py:424: The 'val_dataloader' does not have many workers which may be a bottleneck. Consider increasing the value of the `num_workers` argument` to `num_workers=11` in the `DataLoader` to improve performance.


Epoch 182/400:  46%|████▌     | 182/400 [01:15<01:30,  2.42it/s, v_num=1, train_loss_step=0.261, train_loss_epoch=0.301]
Monitored metric validation_loss did not improve in the last 30 records. Best score: 0.313. Signaling Trainer to stop.


  return func(*args, **kwargs)
  return func(*args, **kwargs)
  adata.obs["n_genes"] = number
GPU available: False, used: False
TPU available: False, using: 0 TPU cores
HPU available: False, using: 0 HPUs
c:\Users\alexg\anaconda3\envs\bioenv\Lib\site-packages\lightning\pytorch\trainer\connectors\data_connector.py:424: The 'train_dataloader' does not have many workers which may be a bottleneck. Consider increasing the value of the `num_workers` argument` to `num_workers=11` in the `DataLoader` to improve performance.


Epoch 400/400: 100%|██████████| 400/400 [07:33<00:00,  1.07s/it, v_num=1, train_loss_step=401, train_loss_epoch=351]

`Trainer.fit` stopped: `max_epochs=400` reached.


Epoch 400/400: 100%|██████████| 400/400 [07:33<00:00,  1.13s/it, v_num=1, train_loss_step=401, train_loss_epoch=351]
[34mINFO    [0m Creating doublets, preparing SOLO model.                                                                  


GPU available: False, used: False
TPU available: False, using: 0 TPU cores
HPU available: False, using: 0 HPUs
c:\Users\alexg\anaconda3\envs\bioenv\Lib\site-packages\lightning\pytorch\trainer\connectors\data_connector.py:424: The 'train_dataloader' does not have many workers which may be a bottleneck. Consider increasing the value of the `num_workers` argument` to `num_workers=11` in the `DataLoader` to improve performance.
c:\Users\alexg\anaconda3\envs\bioenv\Lib\site-packages\lightning\pytorch\trainer\connectors\data_connector.py:424: The 'val_dataloader' does not have many workers which may be a bottleneck. Consider increasing the value of the `num_workers` argument` to `num_workers=11` in the `DataLoader` to improve performance.


Epoch 316/400:  79%|███████▉  | 316/400 [01:33<00:24,  3.39it/s, v_num=1, train_loss_step=0.304, train_loss_epoch=0.291]
Monitored metric validation_loss did not improve in the last 30 records. Best score: 0.275. Signaling Trainer to stop.


  return func(*args, **kwargs)
  return func(*args, **kwargs)
  adata.obs["n_genes"] = number
GPU available: False, used: False
TPU available: False, using: 0 TPU cores
HPU available: False, using: 0 HPUs
c:\Users\alexg\anaconda3\envs\bioenv\Lib\site-packages\lightning\pytorch\trainer\connectors\data_connector.py:424: The 'train_dataloader' does not have many workers which may be a bottleneck. Consider increasing the value of the `num_workers` argument` to `num_workers=11` in the `DataLoader` to improve performance.


Epoch 400/400: 100%|██████████| 400/400 [03:25<00:00,  2.07it/s, v_num=1, train_loss_step=370, train_loss_epoch=378]

`Trainer.fit` stopped: `max_epochs=400` reached.


Epoch 400/400: 100%|██████████| 400/400 [03:25<00:00,  1.95it/s, v_num=1, train_loss_step=370, train_loss_epoch=378]
[34mINFO    [0m Creating doublets, preparing SOLO model.                                                                  


GPU available: False, used: False
TPU available: False, using: 0 TPU cores
HPU available: False, using: 0 HPUs
c:\Users\alexg\anaconda3\envs\bioenv\Lib\site-packages\lightning\pytorch\trainer\connectors\data_connector.py:424: The 'train_dataloader' does not have many workers which may be a bottleneck. Consider increasing the value of the `num_workers` argument` to `num_workers=11` in the `DataLoader` to improve performance.
c:\Users\alexg\anaconda3\envs\bioenv\Lib\site-packages\lightning\pytorch\trainer\connectors\data_connector.py:424: The 'val_dataloader' does not have many workers which may be a bottleneck. Consider increasing the value of the `num_workers` argument` to `num_workers=11` in the `DataLoader` to improve performance.


Epoch 284/400:  71%|███████   | 284/400 [00:41<00:17,  6.80it/s, v_num=1, train_loss_step=0.282, train_loss_epoch=0.296]
Monitored metric validation_loss did not improve in the last 30 records. Best score: 0.334. Signaling Trainer to stop.


  return func(*args, **kwargs)
  return func(*args, **kwargs)
  adata.obs["n_genes"] = number
GPU available: False, used: False
TPU available: False, using: 0 TPU cores
HPU available: False, using: 0 HPUs
c:\Users\alexg\anaconda3\envs\bioenv\Lib\site-packages\lightning\pytorch\trainer\connectors\data_connector.py:424: The 'train_dataloader' does not have many workers which may be a bottleneck. Consider increasing the value of the `num_workers` argument` to `num_workers=11` in the `DataLoader` to improve performance.


Epoch 400/400: 100%|██████████| 400/400 [08:40<00:00,  1.23s/it, v_num=1, train_loss_step=335, train_loss_epoch=342]

`Trainer.fit` stopped: `max_epochs=400` reached.


Epoch 400/400: 100%|██████████| 400/400 [08:40<00:00,  1.30s/it, v_num=1, train_loss_step=335, train_loss_epoch=342]
[34mINFO    [0m Creating doublets, preparing SOLO model.                                                                  


GPU available: False, used: False
TPU available: False, using: 0 TPU cores
HPU available: False, using: 0 HPUs
c:\Users\alexg\anaconda3\envs\bioenv\Lib\site-packages\lightning\pytorch\trainer\connectors\data_connector.py:424: The 'train_dataloader' does not have many workers which may be a bottleneck. Consider increasing the value of the `num_workers` argument` to `num_workers=11` in the `DataLoader` to improve performance.
c:\Users\alexg\anaconda3\envs\bioenv\Lib\site-packages\lightning\pytorch\trainer\connectors\data_connector.py:424: The 'val_dataloader' does not have many workers which may be a bottleneck. Consider increasing the value of the `num_workers` argument` to `num_workers=11` in the `DataLoader` to improve performance.


Epoch 171/400:  43%|████▎     | 171/400 [00:58<01:18,  2.90it/s, v_num=1, train_loss_step=0.442, train_loss_epoch=0.267] 
Monitored metric validation_loss did not improve in the last 30 records. Best score: 0.236. Signaling Trainer to stop.


  return func(*args, **kwargs)
  return func(*args, **kwargs)
  adata.obs["n_genes"] = number
GPU available: False, used: False
TPU available: False, using: 0 TPU cores
HPU available: False, using: 0 HPUs
c:\Users\alexg\anaconda3\envs\bioenv\Lib\site-packages\lightning\pytorch\trainer\connectors\data_connector.py:424: The 'train_dataloader' does not have many workers which may be a bottleneck. Consider increasing the value of the `num_workers` argument` to `num_workers=11` in the `DataLoader` to improve performance.


Epoch 400/400: 100%|██████████| 400/400 [05:18<00:00,  1.27it/s, v_num=1, train_loss_step=310, train_loss_epoch=303]

`Trainer.fit` stopped: `max_epochs=400` reached.


Epoch 400/400: 100%|██████████| 400/400 [05:18<00:00,  1.26it/s, v_num=1, train_loss_step=310, train_loss_epoch=303]
[34mINFO    [0m Creating doublets, preparing SOLO model.                                                                  


GPU available: False, used: False
TPU available: False, using: 0 TPU cores
HPU available: False, using: 0 HPUs
c:\Users\alexg\anaconda3\envs\bioenv\Lib\site-packages\lightning\pytorch\trainer\connectors\data_connector.py:424: The 'train_dataloader' does not have many workers which may be a bottleneck. Consider increasing the value of the `num_workers` argument` to `num_workers=11` in the `DataLoader` to improve performance.
c:\Users\alexg\anaconda3\envs\bioenv\Lib\site-packages\lightning\pytorch\trainer\connectors\data_connector.py:424: The 'val_dataloader' does not have many workers which may be a bottleneck. Consider increasing the value of the `num_workers` argument` to `num_workers=11` in the `DataLoader` to improve performance.


Epoch 184/400:  46%|████▌     | 184/400 [00:38<00:45,  4.78it/s, v_num=1, train_loss_step=0.189, train_loss_epoch=0.193] 
Monitored metric validation_loss did not improve in the last 30 records. Best score: 0.219. Signaling Trainer to stop.


  return func(*args, **kwargs)
  return func(*args, **kwargs)
  adata.obs["n_genes"] = number
GPU available: False, used: False
TPU available: False, using: 0 TPU cores
HPU available: False, using: 0 HPUs
c:\Users\alexg\anaconda3\envs\bioenv\Lib\site-packages\lightning\pytorch\trainer\connectors\data_connector.py:424: The 'train_dataloader' does not have many workers which may be a bottleneck. Consider increasing the value of the `num_workers` argument` to `num_workers=11` in the `DataLoader` to improve performance.


Epoch 400/400: 100%|██████████| 400/400 [04:39<00:00,  1.50it/s, v_num=1, train_loss_step=464, train_loss_epoch=418]

`Trainer.fit` stopped: `max_epochs=400` reached.


Epoch 400/400: 100%|██████████| 400/400 [04:39<00:00,  1.43it/s, v_num=1, train_loss_step=464, train_loss_epoch=418]
[34mINFO    [0m Creating doublets, preparing SOLO model.                                                                  


GPU available: False, used: False
TPU available: False, using: 0 TPU cores
HPU available: False, using: 0 HPUs
c:\Users\alexg\anaconda3\envs\bioenv\Lib\site-packages\lightning\pytorch\trainer\connectors\data_connector.py:424: The 'train_dataloader' does not have many workers which may be a bottleneck. Consider increasing the value of the `num_workers` argument` to `num_workers=11` in the `DataLoader` to improve performance.
c:\Users\alexg\anaconda3\envs\bioenv\Lib\site-packages\lightning\pytorch\trainer\connectors\data_connector.py:424: The 'val_dataloader' does not have many workers which may be a bottleneck. Consider increasing the value of the `num_workers` argument` to `num_workers=11` in the `DataLoader` to improve performance.


Epoch 167/400:  42%|████▏     | 167/400 [00:31<00:43,  5.31it/s, v_num=1, train_loss_step=0.356, train_loss_epoch=0.339]
Monitored metric validation_loss did not improve in the last 30 records. Best score: 0.370. Signaling Trainer to stop.


  return func(*args, **kwargs)
  return func(*args, **kwargs)
  adata.obs["n_genes"] = number
GPU available: False, used: False
TPU available: False, using: 0 TPU cores
HPU available: False, using: 0 HPUs
c:\Users\alexg\anaconda3\envs\bioenv\Lib\site-packages\lightning\pytorch\trainer\connectors\data_connector.py:424: The 'train_dataloader' does not have many workers which may be a bottleneck. Consider increasing the value of the `num_workers` argument` to `num_workers=11` in the `DataLoader` to improve performance.


Epoch 400/400: 100%|██████████| 400/400 [06:23<00:00,  1.03it/s, v_num=1, train_loss_step=287, train_loss_epoch=318]

`Trainer.fit` stopped: `max_epochs=400` reached.


Epoch 400/400: 100%|██████████| 400/400 [06:23<00:00,  1.04it/s, v_num=1, train_loss_step=287, train_loss_epoch=318]
[34mINFO    [0m Creating doublets, preparing SOLO model.                                                                  


GPU available: False, used: False
TPU available: False, using: 0 TPU cores
HPU available: False, using: 0 HPUs
c:\Users\alexg\anaconda3\envs\bioenv\Lib\site-packages\lightning\pytorch\trainer\connectors\data_connector.py:424: The 'train_dataloader' does not have many workers which may be a bottleneck. Consider increasing the value of the `num_workers` argument` to `num_workers=11` in the `DataLoader` to improve performance.
c:\Users\alexg\anaconda3\envs\bioenv\Lib\site-packages\lightning\pytorch\trainer\connectors\data_connector.py:424: The 'val_dataloader' does not have many workers which may be a bottleneck. Consider increasing the value of the `num_workers` argument` to `num_workers=11` in the `DataLoader` to improve performance.


Epoch 302/400:  76%|███████▌  | 302/400 [01:27<00:28,  3.45it/s, v_num=1, train_loss_step=0.184, train_loss_epoch=0.3]  
Monitored metric validation_loss did not improve in the last 30 records. Best score: 0.284. Signaling Trainer to stop.


  return func(*args, **kwargs)
  return func(*args, **kwargs)
  adata.obs["n_genes"] = number
GPU available: False, used: False
TPU available: False, using: 0 TPU cores
HPU available: False, using: 0 HPUs
c:\Users\alexg\anaconda3\envs\bioenv\Lib\site-packages\lightning\pytorch\trainer\connectors\data_connector.py:424: The 'train_dataloader' does not have many workers which may be a bottleneck. Consider increasing the value of the `num_workers` argument` to `num_workers=11` in the `DataLoader` to improve performance.


Epoch 400/400: 100%|██████████| 400/400 [13:00<00:00,  1.95s/it, v_num=1, train_loss_step=357, train_loss_epoch=340]

`Trainer.fit` stopped: `max_epochs=400` reached.


Epoch 400/400: 100%|██████████| 400/400 [13:00<00:00,  1.95s/it, v_num=1, train_loss_step=357, train_loss_epoch=340]
[34mINFO    [0m Creating doublets, preparing SOLO model.                                                                  


GPU available: False, used: False
TPU available: False, using: 0 TPU cores
HPU available: False, using: 0 HPUs
c:\Users\alexg\anaconda3\envs\bioenv\Lib\site-packages\lightning\pytorch\trainer\connectors\data_connector.py:424: The 'train_dataloader' does not have many workers which may be a bottleneck. Consider increasing the value of the `num_workers` argument` to `num_workers=11` in the `DataLoader` to improve performance.
c:\Users\alexg\anaconda3\envs\bioenv\Lib\site-packages\lightning\pytorch\trainer\connectors\data_connector.py:424: The 'val_dataloader' does not have many workers which may be a bottleneck. Consider increasing the value of the `num_workers` argument` to `num_workers=11` in the `DataLoader` to improve performance.


Epoch 241/400:  60%|██████    | 241/400 [02:08<01:24,  1.87it/s, v_num=1, train_loss_step=0.338, train_loss_epoch=0.317]
Monitored metric validation_loss did not improve in the last 30 records. Best score: 0.319. Signaling Trainer to stop.


  return func(*args, **kwargs)
  return func(*args, **kwargs)
  adata.obs["n_genes"] = number


In [6]:
adata = sc.concat(out)

In [7]:
adata

AnnData object with n_obs × n_vars = 108822 × 34546
    obs: 'Sample', 'doublet', 'n_genes', 'n_genes_by_counts', 'total_counts', 'total_counts_mt', 'pct_counts_mt', 'total_counts_ribo', 'pct_counts_ribo'

In [8]:
sc.pp.filter_genes(adata, min_cells=10)

In [9]:
adata.X

array([[0., 0., 0., ..., 0., 0., 0.],
       [0., 0., 0., ..., 0., 0., 0.],
       [0., 0., 0., ..., 0., 0., 0.],
       ...,
       [0., 0., 0., ..., 0., 0., 0.],
       [0., 0., 0., ..., 0., 0., 0.],
       [0., 0., 0., ..., 0., 0., 0.]], dtype=float32)

In [10]:
from scipy.sparse import csr_matrix

In [11]:
adata.X = csr_matrix(adata.X) #help compress the data for memory efficiency

In [12]:
adata.X

<Compressed Sparse Row sparse matrix of dtype 'float32'
	with 95760461 stored elements and shape (108822, 29581)>

In [13]:
adata.write_h5ad('combined.h5ad')

Once the data is now saved into the hard drive. You can restart the Kernel and clear up cached memory to process the rest of the analysis without memory error.