# Description

This notebook profiles the functions that compute the correlation between the predicted expression of pairs of genes. Each notebook in this series is meant to be run at a specific changeset.

**Before running this notebook**, make sure you are in this changeset:
```bash
git checkout 6149a6f90f41534d0979b434cd16d17cc28d2c5f
```

In [1]:
%load_ext line_profiler

# Modules

In [2]:
from entity import Gene

# Functions

In [3]:
def compute_ssm_correlation(all_genes):
    res = []
    for g1_idx, g1 in enumerate(all_genes[:-1]):
        for g2 in all_genes[g1_idx:]:
            c = g1.get_ssm_correlation(
                g2,
                reference_panel="1000G",
                model_type="MASHR",
                use_within_distance=False,
            )
            res.append(c)
    return res
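
To make the pairing in the nested loop explicit, here is a minimal sketch that reproduces the same iteration pattern with placeholder strings instead of `Gene` objects. The helper name `enumerate_pairs` and the placeholder values are hypothetical and only serve the illustration.

```python
def enumerate_pairs(items):
    """Yield pairs in the same order as the nested loop in compute_ssm_correlation."""
    # The outer loop skips the last item; the inner loop starts at the outer index,
    # so each outer item is also paired with itself.
    for idx, first in enumerate(items[:-1]):
        for second in items[idx:]:
            yield first, second


# Hypothetical stand-ins for the four Gene objects used in the test case.
placeholder_genes = ["g1", "g2", "g3", "g4"]
pairs = list(enumerate_pairs(placeholder_genes))

print(len(pairs))
for pair in pairs:
    print(pair)
```

With four items the loop visits nine pairs, which is consistent with the nine `get_ssm_correlation` calls reported in the `%prun` output. Note that the first three items are paired with themselves, while the last item is not.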

# Test case

In [4]:
gene1 = Gene(ensembl_id="ENSG00000180596")
gene2 = Gene(ensembl_id="ENSG00000180573")
gene3 = Gene(ensembl_id="ENSG00000274641")
gene4 = Gene(ensembl_id="ENSG00000277224")

all_genes = [gene1, gene2, gene3, gene4]

In [5]:
assert len(set([g.chromosome for g in all_genes])) == 1

# Run timeit

In [6]:
%timeit compute_ssm_correlation(all_genes)

40.4 s ± 30.5 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)


# Profile

In [7]:
%prun -l 20 -s cumulative compute_ssm_correlation(all_genes)

 

         106908919 function calls (105751327 primitive calls) in 68.422 seconds

   Ordered by: cumulative time
   List reduced from 507 to 20 due to restriction <20>

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
        1    0.000    0.000   68.422   68.422 {built-in method builtins.exec}
        1    0.000    0.000   68.422   68.422 <string>:1(<module>)
        1    0.000    0.000   68.422   68.422 511008798.py:1(compute_ssm_correlation)
        9    0.001    0.000   68.422    7.602 entity.py:1018(get_ssm_correlation)
       27    0.216    0.008   68.402    2.533 entity.py:966(get_tissues_correlations)
    64827    1.652    0.000   68.068    0.001 entity.py:877(get_expression_correlation)
    88734    0.216    0.000   48.663    0.001 indexing.py:864(__getitem__)
    29560    9.317    0.000   41.885    0.001 entity.py:762(_get_snps_cov)
    29560    0.154    0.000   31.837    0.001 indexing.py:1042(_getitem_tuple)
    29560    0.082    0.000   30.350    0.00
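
Outside IPython, a comparable report can be produced with the standard library's `cProfile` and `pstats` modules. The sketch below assumes a hypothetical stand-in workload (`slow_sum`) rather than `compute_ssm_correlation`, since the latter depends on the project's `entity` module; it sorts by cumulative time and limits the listing to 20 entries, mirroring `-s cumulative -l 20`.

```python
import cProfile
import io
import pstats


def slow_sum(n):
    """Hypothetical stand-in workload; replace with the function to profile."""
    total = 0
    for i in range(n):
        total += i * i
    return total


# Collect profiling data only around the call of interest.
profiler = cProfile.Profile()
profiler.enable()
result = slow_sum(100_000)
profiler.disable()

# Sort by cumulative time and keep at most 20 entries in the listing.
buffer = io.StringIO()
stats = pstats.Stats(profiler, stream=buffer)
stats.sort_stats("cumulative").print_stats(20)
report = buffer.getvalue()
print(report)
```

Writing the report to a `StringIO` buffer makes it easy to save or compare the output across changesets.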

# Profile by line

## Function `get_expression_correlation`

In [8]:
%lprun -f Gene.get_expression_correlation compute_ssm_correlation(all_genes)

Timer unit: 1e-06 s

Total time: 82.5494 s
File: /opt/code/libs/entity.py
Function: get_expression_correlation at line 877

Line #      Hits         Time  Per Hit   % Time  Line Contents
   877                                               def get_expression_correlation(
   878                                                   self,
   879                                                   other_gene,
   880                                                   tissue: str,
   881                                                   other_tissue: str = None,
   882                                                   reference_panel: str = "GTEX_V8",
   883                                                   model_type: str = "MASHR",
   884                                                   use_within_distance=True,
   885                                               ):
   886                                                   """
   887                                                   Given anoth

## Function `_get_snps_cov`

In [9]:
%lprun -f Gene._get_snps_cov compute_ssm_correlation(all_genes)

Timer unit: 1e-06 s

Total time: 51.2996 s
File: /opt/code/libs/entity.py
Function: _get_snps_cov at line 762

Line #      Hits         Time  Per Hit   % Time  Line Contents
   762                                               @staticmethod
   763                                               def _get_snps_cov(
   764                                                   snps_ids_list1,
   765                                                   snps_ids_list2=None,
   766                                                   check=False,
   767                                                   reference_panel="GTEX_V8",
   768                                                   model_type="MASHR",
   769                                               ):
   770                                                   """
   771                                                   Given one or (optionally) two lists of SNPs IDs, it returns the
   772                                                   covariance