Script to query data from Cell Census.
    
    
SOMA = Stack Of Matrices, Annotated: https://github.com/single-cell-data/SOMA/blob/main/abstract_specification.md

CELLxGENE dataset schema: https://github.com/chanzuckerberg/single-cell-curation/blob/main/schema/3.0.0/schema.md

Helpful links:
https://github.com/chanzuckerberg/cell-census/blob/main/api/python/notebooks/api_demo/census_query_extract.ipynb

Overview of AnnData: https://adamgayoso.com/posts/ten_min_to_adata/

Functions to write:

1) get data using get_anndata
2) check data query for existing keywords so it doesn't time out - DONE

In [1]:
import cell_census
import anndata as ad

In [2]:
census = cell_census.open_soma(census_version="latest")


In ```obs``` there are a few ontology-related terms. One of these might be our target variable, perhaps ```cell_type_ontology_term_id```?

- ```cell_type_ontology_term_id``` 
- ```development_stage_ontology_term_id``` 
- ```disease_ontology_term_id``` 
- ```self_reported_ethnicity_ontology_term_id``` 
- ```sex_ontology_term_id``` 
- ```tissue_ontology_term_id``` 
- ```tissue_general_ontology_term_id```

```obs``` = cell metadata
```var``` = feature metadata

Data is stored in ```adata.X``` which is a sparse matrix 
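
To make "sparse matrix" concrete, here is a minimal pure-Python sketch of the Compressed Sparse Row (CSR) layout, the format reported for ```adata.X``` later in this notebook. It is only an illustration: `dense_to_csr` and `csr_row` are hypothetical helper names, and the toy counts are invented. Real code would normally rely on `scipy.sparse.csr_matrix` rather than hand-rolled lists.

```python
def dense_to_csr(rows):
    """Convert a list of equal-length rows into CSR arrays (data, indices, indptr)."""
    data, indices, indptr = [], [], [0]
    for row in rows:
        for col, value in enumerate(row):
            if value != 0:
                data.append(value)     # the non-zero value itself
                indices.append(col)    # the column that value lives in
        indptr.append(len(data))       # offset in `data` where the next row starts
    return data, indices, indptr


def csr_row(data, indices, indptr, i, n_cols):
    """Rebuild dense row i from the CSR arrays."""
    row = [0] * n_cols
    for k in range(indptr[i], indptr[i + 1]):
        row[indices[k]] = data[k]
    return row


# Toy counts matrix (invented): 3 cells (rows) x 4 genes (columns), mostly zeros.
counts = [
    [0, 2, 0, 0],
    [0, 0, 0, 0],
    [5, 0, 0, 1],
]
data, indices, indptr = dense_to_csr(counts)
print(data, indices, indptr)
```

Only the non-zero entries are stored, which is why a cells-by-genes matrix that is mostly zeros can fit in memory in this form when a dense copy would not.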

## Check Query

Code to check a query before running get_anndata, so that a bad filter does not crash the kernel.

Assumes there is a census object already open.

In [42]:
cell_10v3.columns

Index(['soma_joinid', 'dataset_id', 'assay', 'assay_ontology_term_id',
       'cell_type', 'cell_type_ontology_term_id', 'development_stage',
       'development_stage_ontology_term_id', 'disease',
       'disease_ontology_term_id', 'donor_id', 'is_primary_data',
       'self_reported_ethnicity', 'self_reported_ethnicity_ontology_term_id',
       'sex', 'sex_ontology_term_id', 'suspension_type', 'tissue',
       'tissue_ontology_term_id', 'tissue_general',
       'tissue_general_ontology_term_id'],
      dtype='object')

In [6]:
def check_cell_census_query(metadata_columns,col_vals):
    '''
    This function checks an active census object to see if you can successfully filter on the inputs. This is
    a quick way to check your query before running get_anndata(), which can result in a kernel crash if the 
    filtering is not correct.
    
    Assumes there is an active census object already open. Assumes you only want to query on cell metadata. 
    Gene metadata querying not currently supported.
    
    Parameters
    ----------
    metadata_columns : list
        list of strings containing obs parameters to query
        
    col_vals : list
        list of strings containing obs parameters values you hope to filter on
        
    Returns
    -------
        printed string detailing whether each filter value is valid or not
    
    '''
    cell_metadata_check = census["census_data"]["homo_sapiens"].obs.read(column_names=metadata_columns).concat().to_pandas()

    # Pair each column with the value we hope to filter on
    for column, value in zip(metadata_columns, col_vals):
        if value in cell_metadata_check[column].unique():
            print(value, ' is in query')
        else:
            print(value, ' is NOT in query. Rewrite query before running get_anndata()')

In [8]:
metadata_columns = ['tissue_general_ontology_term_id','assay']
col_vals = ['UBERON:0002405',"10x 3' v3"]

check_cell_census_query(metadata_columns,col_vals)

UBERON:0002405  is in query
10x 3' v3  is in query
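
Because the check above only prints, its result cannot be used to gate a later call. Below is a minimal sketch of a variant that returns the missing pairs instead. `find_missing_values` is a hypothetical name, and `available` is a toy stand-in for the unique values you would pull from ```obs``` (for example via `df[col].unique()`), not the real census contents.

```python
def find_missing_values(available, wanted):
    """Return (column, value) pairs from `wanted` that are absent from `available`.

    available : dict mapping column name -> iterable of values present in obs
    wanted    : dict mapping column name -> value you plan to filter on
    """
    missing = []
    for column, value in wanted.items():
        if value not in set(available.get(column, ())):
            missing.append((column, value))
    return missing


# Toy subset standing in for the unique values read from obs.
available = {
    "tissue_general_ontology_term_id": ["UBERON:0002048", "UBERON:0000178"],
    "assay": ["10x 3' v3", "10x 3' v2"],
}
wanted = {
    "tissue_general_ontology_term_id": "UBERON:0002405",
    "assay": "10x 3' v3",
}

missing = find_missing_values(available, wanted)
if missing:
    print("Rewrite the query before calling get_anndata():", missing)
```

An empty list means every requested value exists, so the caller can proceed with `if not missing:` before the expensive call.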


## Find Query Options

Before you run a query, see the options for a subset of columns.

The *obs* columns to query are

- soma_joinid
- dataset_id
- assay
- assay_ontology_term_id
- cell_type
- cell_type_ontology_term_id
- development_stage
- development_stage_ontology_term_id
- disease
- disease_ontology_term_id
- donor_id
- is_primary_data
- self_reported_ethnicity
- self_reported_ethnicity_ontology_term_id
- sex
- sex_ontology_term_id
- suspension_type
- tissue
- tissue_ontology_term_id
- tissue_general
- tissue_general_ontology_term_id


In [12]:
def see_cell_census_column_options(column_to_check):
    '''
    This function checks an active census object to identify the unique values contained in the
    column of interest.
    
    Assumes there is an active census object already open. Assumes you only want to query on cell metadata. 
    Gene metadata querying not currently supported.
    
    Parameters
    ----------
    column_to_check : list
        list of strings containing the obs columns to query
                
    Returns
    -------
        printed string detailing unique values for input column
    
    '''
    # Use the function argument rather than a global variable
    cell_column_check = census["census_data"]["homo_sapiens"].obs.read(column_names=column_to_check).concat().to_pandas()
    for col in column_to_check:
        print('The unique values in ', col, ' are')
        print(cell_column_check[col].unique())
        print('')

In [14]:
column_check = ['cell_type_ontology_term_id']

see_cell_census_column_options(column_check)

The unique values in  cell_type_ontology_term_id  are
['CL:0000649' 'CL:0002187' 'CL:0000148' 'CL:0000312' 'CL:0000242'
 'CL:0000988' 'CL:2000092' 'CL:0002189' 'CL:0000499' 'CL:0000623'
 'CL:0000192' 'CL:0000151' 'CL:0000067' 'CL:0000235' 'CL:0000669'
 'CL:0000236' 'CL:0000097' 'CL:0000115' 'CL:0002138' 'CL:0000738'
 'CL:1000334' 'CL:0019032' 'CL:0002071' 'CL:0009039' 'CL:0000677'
 'CL:1000495' 'CL:0009042' 'CL:0009041' 'CL:0009043' 'CL:0009017'
 'CL:0002254' 'CL:0009012' 'CL:0009011' 'CL:0011026' 'CL:0009006'
 'CL:1000343' 'CL:1000353' 'CL:0000576' 'CL:0000451' 'CL:0000084'
 'CL:4030006' 'CL:0000057' 'CL:0000786' 'CL:0000003' 'CL:0000171'
 'CL:0000173' 'CL:0000169' 'CL:0002275' 'CL:1000329' 'CL:0000787'
 'CL:0000798' 'CL:0000909' 'CL:1000348' 'CL:0000064' 'CL:0000898'
 'CL:0000939' 'CL:0005012' 'CL:0000775' 'CL:0000158' 'CL:0000068'
 'CL:0000453' 'CL:0017000' 'CL:0000788' 'CL:0000990' 'CL:0000814'
 'CL:0000890' 'CL:0001065' 'CL:0000076' 'CL:0001058' 'CL:0000815'
 'CL:0000938' 'CL:0000
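
Reading a whole column into pandas just to list its unique values can be heavy. A sketch of an alternative is to accumulate unique values chunk by chunk. This assumes the reader returned by `obs.read(...)` can be iterated in pieces before `.concat()` is called, which is worth confirming against the tiledbsoma documentation. `unique_from_chunks` is a hypothetical helper, and the toy chunks below are plain dicts rather than real census data.

```python
def unique_from_chunks(chunks, column):
    """Collect unique values of `column` across chunks, preserving first-seen order.

    chunks : iterable of mappings, each mapping column name -> list of values
    """
    seen = {}  # dicts keep insertion order, so the result is reproducible
    for chunk in chunks:
        for value in chunk[column]:
            seen.setdefault(value, True)
    return list(seen)


# Toy chunks standing in for successive pieces of the obs table.
toy_chunks = [
    {"cell_type_ontology_term_id": ["CL:0000649", "CL:0002187", "CL:0000649"]},
    {"cell_type_ontology_term_id": ["CL:0002187", "CL:0000148"]},
]
print(unique_from_chunks(toy_chunks, "cell_type_ontology_term_id"))
```

If the chunks turn out to be pyarrow tables, `table.to_pydict()` should give the dict-of-lists shape this helper expects.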

## Check Subset Of Data

Let's write a function so that we can filter on one set of data and check for the presence of a possible secondary filter.

In [4]:
def check_subset(value_filter,col):
    '''
    This function checks an active census object to identify the unique values contained in the
    column of interest, after filtering on an initial column.
    
    Assumes there is an active census object already open. Assumes you only want to query on cell metadata. 
    Gene metadata querying not currently supported. Currently only supports querying one column at a time.
    
    Parameters
    ----------
    value_filter : string
        string containing the obs value filter expression
        
    col : string
        string containing column of interest for identifying unique values
                
    Returns
    -------
        printed string detailing unique values for input column after applying filter
    
    '''
    # `value_filter` avoids shadowing Python's built-in filter()
    cell_data = (
        census["census_data"]["homo_sapiens"]
        .obs.read(value_filter=value_filter)
        .concat()
        .to_pandas()
    )
    
    print('After filtering on ', value_filter, 'the unique values for ', col, 'are:')
    print(cell_data[col].unique())
    

In [4]:
check_subset('''assay == "10x 3' v3"''','self_reported_ethnicity')

After filtering on  assay == "10x 3' v3" the unique values for  self_reported_ethnicity are:
['European' 'Asian' 'African American'
 'African American or Afro-Caribbean' 'unknown' 'multiethnic'
 'Greater Middle Eastern  (Middle Eastern, North African or Persian)'
 'Hispanic or Latin American' 'African' 'Chinese']


In [16]:
check_subset('tissue_general_ontology_term_id == "UBERON:0002405"', 'assay')

["10x 5' v1" "10x 3' v2"]


In [5]:
check_subset('assay == "10x 3\' v3"',
             'cell_type_ontology_term_id')

After filtering on  assay == "10x 3' v3" the unique values for  cell_type_ontology_term_id are:
['CL:0000151' 'CL:0000115' 'CL:0000499' 'CL:0000192' 'CL:0000669'
 'CL:0000623' 'CL:0000236' 'CL:0002138' 'CL:0000235' 'CL:0000097'
 'CL:0000067' 'CL:0000738' 'CL:1000334' 'CL:0019032' 'CL:0002071'
 'CL:0009039' 'CL:0000677' 'CL:1000495' 'CL:0009042' 'CL:0009041'
 'CL:0009043' 'CL:0009017' 'CL:0002254' 'CL:0009012' 'CL:0009011'
 'CL:0011026' 'CL:0009006' 'CL:1000343' 'CL:1000353' 'CL:0000576'
 'CL:0000451' 'CL:0000084' 'CL:4030006' 'CL:0000057' 'CL:0000786'
 'CL:0000003' 'CL:4023040' 'CL:0002605' 'CL:4023051' 'CL:4023070'
 'CL:4023012' 'CL:4023013' 'CL:0000128' 'CL:4023041' 'CL:4023017'
 'CL:1001602' 'CL:4023011' 'CL:4023038' 'CL:4023016' 'CL:4023036'
 'CL:4023018' 'CL:0000129' 'CL:4023015' 'CL:0002453' 'CL:0000583'
 'CL:0002063' 'CL:0002632' 'CL:0002062' 'CL:0000064' 'CL:0000745'
 'CL:0000750' 'CL:0000749' 'CL:0000636' 'CL:0000127' 'CL:0000604'
 'CL:0000573' 'CL:0000561' 'CL:1001509' 'CL:00
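
Raw CL identifiers like the ones above are hard to read. Since ```obs``` carries both ```cell_type``` and ```cell_type_ontology_term_id```, a small lookup can translate IDs to labels. This is a sketch only: `term_labels` is a hypothetical helper, and the records are copied from the ```cell_10v3``` preview elsewhere in this notebook.

```python
def term_labels(records, id_col, label_col):
    """Map each ontology term ID to its human-readable label.

    records : iterable of dicts, one per obs row
    """
    labels = {}
    for row in records:
        # Keep the first label seen for each ID
        labels.setdefault(row[id_col], row[label_col])
    return labels


# Rows taken from the cell_10v3 preview shown in this notebook.
records = [
    {"cell_type": "T cell", "cell_type_ontology_term_id": "CL:0000084"},
    {"cell_type": "monocyte", "cell_type_ontology_term_id": "CL:0000576"},
    {"cell_type": "T cell", "cell_type_ontology_term_id": "CL:0000084"},
    {"cell_type": "CD1c-positive myeloid dendritic cell",
     "cell_type_ontology_term_id": "CL:0002399"},
]
lookup = term_labels(records, "cell_type_ontology_term_id", "cell_type")
print(lookup["CL:0000576"])
```

With a pandas DataFrame, `df.to_dict("records")` produces the list-of-dicts shape used here.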

## Find 10X 3' V3 data from human immune cells

10x 3' v3 is the assay.
Homo sapiens gets us the human portion.

How do we get immune cells only?

From (https://www.cancer.gov/publications/dictionaries/cancer-terms/def/immune-cell), immune cells include neutrophils, eosinophils, basophils, mast cells, monocytes, macrophages, dendritic cells, natural killer cells, and lymphocytes (B cells and T cells).

All of these show up as a ```cell_type``` value on the ```obs``` axis. Some show up in multiple ways (for example, both 'macrophage' and 'alveolar macrophage'). We could create a list of cell types to search for.
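
One way to build that list is to match the observed ```cell_type``` labels against the immune cell names quoted above. The sketch below is an assumption about how this could be done, not a vetted classifier: `IMMUNE_KEYWORDS` and `match_immune_labels` are hypothetical names, and the labels are a hand-picked subset of values seen in this notebook. Word boundaries matter here, because a plain substring test for "T cell" would also match "colon goblet cell" and "mast cell".

```python
import re

# Immune cell names taken from the cancer.gov definition quoted above.
IMMUNE_KEYWORDS = [
    "neutrophil", "eosinophil", "basophil", "mast cell", "monocyte",
    "macrophage", "dendritic cell", "natural killer cell",
    "lymphocyte", "B cell", "T cell",
]


def match_immune_labels(labels, keywords=IMMUNE_KEYWORDS):
    """Return the labels that contain any keyword as a whole word or phrase."""
    pattern = re.compile(
        r"\b(?:" + "|".join(re.escape(k) for k in keywords) + r")s?\b",
        flags=re.IGNORECASE,
    )
    return [label for label in labels if pattern.search(label)]


# Hand-picked subset of cell_type values seen in this notebook.
observed = [
    "T cell", "monocyte", "dendritic cell", "alveolar macrophage",
    "natural killer cell", "B cell", "mast cell", "macrophage",
    "plasma cell", "type II pneumocyte", "colon goblet cell",
]
immune_labels = match_immune_labels(observed)
print(immune_labels)
```

Keyword matching only follows the quoted list, so labels such as 'plasma cell' are not picked up; the result should be reviewed by hand before it is used in a filter.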

Use cell_census.get_anndata to get the gene expression data

### UBERON:0002405 is the ontology code for the immune system

We need to find cells whose tissue ontology term is located within the immune system, but this code is not included in any of the ontologies for humans. Either

- we are using the wrong code, or
- we need another way to identify immune cells


In [6]:
# obs.read() only brings in the cell metadata. To get the gene expression
# data as well, use cell_census.get_anndata().

# the below takes ~2 minutes to run

# I know we want this particular assay, so we'll include this as the only value,
# then we need to look at the different ontology terms to try to find tissue

cell_10v3 = (
   census["census_data"]["homo_sapiens"].obs.read(value_filter='''assay == "10x 3\' v3" and cell_type_ontology_term_id == "CL:0000738"''').concat().to_pandas()
)
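
Hand-writing filter strings is error-prone because the assay name itself contains a single quote. A minimal sketch of a helper that assembles the same kind of filter from a dict is shown below. `quote_value` and `build_value_filter` are hypothetical names, and only `==` conditions joined by `and` are covered, which is the form used in this notebook.

```python
def quote_value(value):
    """Quote a string for a value filter, using a quote style the value does not contain."""
    if '"' not in value:
        return f'"{value}"'
    if "'" not in value:
        return f"'{value}'"
    raise ValueError(f"cannot quote a value containing both quote styles: {value!r}")


def build_value_filter(conditions):
    """Join `column == value` conditions with `and`.

    conditions : dict mapping obs column name -> string value
    """
    return " and ".join(
        f"{column} == {quote_value(value)}" for column, value in conditions.items()
    )


obs_filter = build_value_filter(
    {"assay": "10x 3' v3", "cell_type_ontology_term_id": "CL:0000738"}
)
print(obs_filter)
```

The resulting string can be passed as `value_filter` to `obs.read()` or as `obs_value_filter` to `get_anndata()`.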


In [8]:
cell_10v3.shape

(68291, 21)

In [4]:
cell_10v3.columns

Index(['soma_joinid', 'dataset_id', 'assay', 'assay_ontology_term_id',
       'cell_type', 'cell_type_ontology_term_id', 'development_stage',
       'development_stage_ontology_term_id', 'disease',
       'disease_ontology_term_id', 'donor_id', 'is_primary_data',
       'self_reported_ethnicity', 'self_reported_ethnicity_ontology_term_id',
       'sex', 'sex_ontology_term_id', 'suspension_type', 'tissue',
       'tissue_ontology_term_id', 'tissue_general',
       'tissue_general_ontology_term_id'],
      dtype='object')

In [10]:
# both tissue_ontology_term_id and tissue_general_ontology_term_id contain
# UBERON codes, but neither contains UBERON:0002405 (the immune system)

cell_10v3['tissue_general_ontology_term_id'].unique()


array(['UBERON:0003889', 'UBERON:0000992', 'UBERON:0002108',
       'UBERON:0001155', 'UBERON:0000995', 'UBERON:0002358',
       'UBERON:0003688', 'UBERON:0000160', 'UBERON:0007795',
       'UBERON:0001366', 'UBERON:0001255', 'UBERON:0001015',
       'UBERON:0002107', 'UBERON:0000059', 'UBERON:0000916',
       'UBERON:0000029', 'UBERON:0003697', 'UBERON:0035210',
       'UBERON:0000955', 'UBERON:0002048', 'UBERON:0000970',
       'UBERON:0000178', 'UBERON:0002240', 'UBERON:0001004',
       'UBERON:0000948', 'UBERON:0002367', 'UBERON:0001836',
       'UBERON:0000310', 'UBERON:0002113', 'UBERON:0002371',
       'UBERON:0001087', 'UBERON:0009472', 'UBERON:0002369',
       'UBERON:0002049', 'UBERON:0001007', 'UBERON:0002368',
       'UBERON:0000030', 'UBERON:0002106', 'UBERON:0001264',
       'UBERON:0002365', 'UBERON:0001723', 'UBERON:0001013',
       'UBERON:0002097', 'UBERON:0018707'], dtype=object)

In [16]:

human_immune_data = cell_census.get_anndata(
        census=census,
        organism = "Homo sapiens",
        obs_value_filter = 'tissue_ontology_term_id == "UBERON:0002299" and assay == "10x 3\' v3"',
        column_names={"obs": ["sex"]},
        )


# the below runs in 2 minutes:

#obs_value_filter="tissue_ontology_term_id=='UBERON:0002048'

# adata = cell_census.get_anndata(
#     census=census,
#     organism="Homo sapiens",
#     var_value_filter="feature_id in ['ENSG00000161798', 'ENSG00000188229']",
#     obs_value_filter="cell_type == 'B cell' and tissue_general == 'lung' and disease == 'COVID-19'",
#     column_names={"obs": ["sex"]},
# )

start 1135
stopped 1136
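
Rather than noting clock times by hand, the run time can be measured in code. The sketch below uses only the standard library; `timed` is a hypothetical helper, and the summation is a stand-in for the `get_anndata()` call.

```python
import time
from contextlib import contextmanager


@contextmanager
def timed(label):
    """Measure and print how long the enclosed block took, in seconds."""
    start = time.perf_counter()
    result = {}
    try:
        yield result
    finally:
        # Recorded even if the block raises, so failed queries are timed too
        result["seconds"] = time.perf_counter() - start
        print(f"{label}: {result['seconds']:.2f} s")


with timed("toy workload") as t:
    total = sum(range(100_000))  # stand-in for a get_anndata() call
```

Afterwards `t["seconds"]` holds the elapsed time, which makes it easy to compare a slow query against a fast one.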

If the obs_value_filter actually matches, then this is quick. If it does not match, the kernel crashes. We should first check whether the search terms match.

Or am I grabbing a different dataset somehow?
Am I running out of memory?

1) The only difference now is filtering on tissue_ontology_term_id. Double-check that this search term is valid.

In [17]:
human_immune_data

AnnData object with n_obs × n_vars = 57747 × 60664
    obs: 'sex', 'tissue_ontology_term_id', 'assay'
    var: 'soma_joinid', 'feature_id', 'feature_name', 'feature_length'
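
On the memory question, a rough estimate helps. The sketch below compares a dense float32 matrix with a CSR matrix for the shape reported above. `dense_bytes` and `csr_bytes` are hypothetical helpers, 4-byte indices are assumed, and the 5% density is a made-up figure because the number of stored elements is not shown in the output.

```python
def dense_bytes(n_obs, n_vars, itemsize=4):
    """Bytes needed to hold an n_obs x n_vars matrix densely (float32 by default)."""
    return n_obs * n_vars * itemsize


def csr_bytes(n_obs, nnz, itemsize=4, index_size=4):
    """Approximate bytes for a CSR matrix: data + column indices + row pointers."""
    return nnz * (itemsize + index_size) + (n_obs + 1) * index_size


n_obs, n_vars = 57747, 60664  # shape reported for human_immune_data
print(f"dense float32: {dense_bytes(n_obs, n_vars) / 2**30:.1f} GiB")

# The real number of non-zeros is unknown here; 5% density is an invented example.
assumed_nnz = int(0.05 * n_obs * n_vars)
print(f"CSR at 5% density: {csr_bytes(n_obs, assumed_nnz) / 2**30:.1f} GiB")
```

If any step converts ```adata.X``` to a dense array, the dense figure is the one that matters, and it can exceed the memory of a typical laptop.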

In [5]:
help(cell_census.get_anndata)

Help on function get_anndata in module cell_census.get_anndata:

get_anndata(census: tiledbsoma.collection.Collection, organism: str, measurement_name: str = 'RNA', X_name: str = 'raw', obs_value_filter: Union[str, NoneType] = None, obs_coords: Union[NoneType, int, slice, Sequence[int], pyarrow.lib.Array, pyarrow.lib.ChunkedArray, numpy.ndarray[Any, numpy.dtype[numpy.integer]]] = None, var_value_filter: Union[str, NoneType] = None, var_coords: Union[NoneType, int, slice, Sequence[int], pyarrow.lib.Array, pyarrow.lib.ChunkedArray, numpy.ndarray[Any, numpy.dtype[numpy.integer]]] = None, column_names: Union[somacore.query.query.AxisColumnNames, NoneType] = None) -> anndata._core.anndata.AnnData
    Convience wrapper around soma.Experiment query, to build and execute a query,
    and return it as an AnnData object.
    
    [lifecycle: experimental]
    
    Parameters
    ----------
    census : soma.Collection
        The census object, usually returned by `cell_census.open_soma()`
    o

In [None]:
adata = cell_census.get_anndata(
        census=census,
        organism = "Homo sapiens",
        obs_value_filter = 'tissue_ontology_term_id == "UBERON:0002299" and assay == "10x 3\' v3"',
        column_names={"obs": ["development_stage"]},
        )

display(adata)

In [16]:
cell_10v3.head()

Unnamed: 0,soma_joinid,dataset_id,assay,assay_ontology_term_id,cell_type,cell_type_ontology_term_id,development_stage,development_stage_ontology_term_id,disease,disease_ontology_term_id,...,is_primary_data,self_reported_ethnicity,self_reported_ethnicity_ontology_term_id,sex,sex_ontology_term_id,suspension_type,tissue,tissue_ontology_term_id,tissue_general,tissue_general_ontology_term_id
0,68036,1e5bd3b8-6a0e-4959-8d69-cafed30fe814,10x 3' v3,EFO:0009922,T cell,CL:0000084,63-year-old human stage,HsapDv:0000157,normal,PATO:0000461,...,True,European,HANCESTRO:0005,male,PATO:0000384,cell,alveolus of lung,UBERON:0002299,lung,UBERON:0002048
1,68037,1e5bd3b8-6a0e-4959-8d69-cafed30fe814,10x 3' v3,EFO:0009922,T cell,CL:0000084,63-year-old human stage,HsapDv:0000157,normal,PATO:0000461,...,True,European,HANCESTRO:0005,male,PATO:0000384,cell,alveolus of lung,UBERON:0002299,lung,UBERON:0002048
2,68038,1e5bd3b8-6a0e-4959-8d69-cafed30fe814,10x 3' v3,EFO:0009922,monocyte,CL:0000576,63-year-old human stage,HsapDv:0000157,normal,PATO:0000461,...,True,European,HANCESTRO:0005,male,PATO:0000384,cell,alveolus of lung,UBERON:0002299,lung,UBERON:0002048
3,68039,1e5bd3b8-6a0e-4959-8d69-cafed30fe814,10x 3' v3,EFO:0009922,T cell,CL:0000084,63-year-old human stage,HsapDv:0000157,normal,PATO:0000461,...,True,European,HANCESTRO:0005,male,PATO:0000384,cell,alveolus of lung,UBERON:0002299,lung,UBERON:0002048
4,68040,1e5bd3b8-6a0e-4959-8d69-cafed30fe814,10x 3' v3,EFO:0009922,monocyte,CL:0000576,63-year-old human stage,HsapDv:0000157,normal,PATO:0000461,...,True,European,HANCESTRO:0005,male,PATO:0000384,cell,alveolus of lung,UBERON:0002299,lung,UBERON:0002048


In [28]:
cell_10v3

Unnamed: 0,soma_joinid,dataset_id,assay,assay_ontology_term_id,cell_type,cell_type_ontology_term_id,development_stage,development_stage_ontology_term_id,disease,disease_ontology_term_id,...,is_primary_data,self_reported_ethnicity,self_reported_ethnicity_ontology_term_id,sex,sex_ontology_term_id,suspension_type,tissue,tissue_ontology_term_id,tissue_general,tissue_general_ontology_term_id
0,68036,1e5bd3b8-6a0e-4959-8d69-cafed30fe814,10x 3' v3,EFO:0009922,T cell,CL:0000084,63-year-old human stage,HsapDv:0000157,normal,PATO:0000461,...,True,European,HANCESTRO:0005,male,PATO:0000384,cell,alveolus of lung,UBERON:0002299,lung,UBERON:0002048
1,68037,1e5bd3b8-6a0e-4959-8d69-cafed30fe814,10x 3' v3,EFO:0009922,T cell,CL:0000084,63-year-old human stage,HsapDv:0000157,normal,PATO:0000461,...,True,European,HANCESTRO:0005,male,PATO:0000384,cell,alveolus of lung,UBERON:0002299,lung,UBERON:0002048
2,68038,1e5bd3b8-6a0e-4959-8d69-cafed30fe814,10x 3' v3,EFO:0009922,monocyte,CL:0000576,63-year-old human stage,HsapDv:0000157,normal,PATO:0000461,...,True,European,HANCESTRO:0005,male,PATO:0000384,cell,alveolus of lung,UBERON:0002299,lung,UBERON:0002048
3,68039,1e5bd3b8-6a0e-4959-8d69-cafed30fe814,10x 3' v3,EFO:0009922,T cell,CL:0000084,63-year-old human stage,HsapDv:0000157,normal,PATO:0000461,...,True,European,HANCESTRO:0005,male,PATO:0000384,cell,alveolus of lung,UBERON:0002299,lung,UBERON:0002048
4,68040,1e5bd3b8-6a0e-4959-8d69-cafed30fe814,10x 3' v3,EFO:0009922,monocyte,CL:0000576,63-year-old human stage,HsapDv:0000157,normal,PATO:0000461,...,True,European,HANCESTRO:0005,male,PATO:0000384,cell,alveolus of lung,UBERON:0002299,lung,UBERON:0002048
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
18278131,43548596,d3a83885-5198-4b04-8314-b753b66ef9a8,10x 3' v3,EFO:0009922,"effector CD8-positive, alpha-beta T cell",CL:0001050,71-year-old human stage,HsapDv:0000165,benign prostatic hyperplasia,MONDO:0010811,...,False,European,HANCESTRO:0005,male,PATO:0000384,cell,transition zone of prostate,UBERON:8410025,prostate gland,UBERON:0002367
18278132,43548597,d3a83885-5198-4b04-8314-b753b66ef9a8,10x 3' v3,EFO:0009922,"effector CD8-positive, alpha-beta T cell",CL:0001050,71-year-old human stage,HsapDv:0000165,benign prostatic hyperplasia,MONDO:0010811,...,False,European,HANCESTRO:0005,male,PATO:0000384,cell,transition zone of prostate,UBERON:8410025,prostate gland,UBERON:0002367
18278133,43548598,d3a83885-5198-4b04-8314-b753b66ef9a8,10x 3' v3,EFO:0009922,CD1c-positive myeloid dendritic cell,CL:0002399,71-year-old human stage,HsapDv:0000165,benign prostatic hyperplasia,MONDO:0010811,...,False,European,HANCESTRO:0005,male,PATO:0000384,cell,transition zone of prostate,UBERON:8410025,prostate gland,UBERON:0002367
18278134,43548599,d3a83885-5198-4b04-8314-b753b66ef9a8,10x 3' v3,EFO:0009922,"effector CD8-positive, alpha-beta T cell",CL:0001050,71-year-old human stage,HsapDv:0000165,benign prostatic hyperplasia,MONDO:0010811,...,False,European,HANCESTRO:0005,male,PATO:0000384,cell,transition zone of prostate,UBERON:8410025,prostate gland,UBERON:0002367


In [26]:
cell_types = cell_10v3['cell_type'].unique()

In [27]:
print(cell_types)

['T cell' 'monocyte' 'dendritic cell' 'alveolar macrophage'
 'natural killer cell' 'B cell' 'mast cell' 'macrophage' 'plasma cell'
 'type II pneumocyte' 'endothelial cell'
 'epithelial cell of lower respiratory tract' 'smooth muscle cell'
 'fibroblast' 'type I pneumocyte' 'endothelial cell of lymphatic vessel'
 'ciliated cell' 'pericyte' 'enterocyte of epithelium of small intestine'
 'intestinal tuft cell' 'enterocyte of epithelium of large intestine'
 'colon goblet cell' 'gut absorptive cell' 'small intestine goblet cell'
 'enteroendocrine cell of colon' 'tuft cell of colon'
 'intestinal crypt stem cell of colon'
 'intestinal crypt stem cell of small intestine'
 'epithelial cell of small intestine'
 'transit amplifying cell of small intestine'
 'transit amplifying cell of colon' 'progenitor cell'
 'enteroendocrine cell of small intestine'
 'paneth cell of epithelium of small intestine'
 'microfold cell of epithelium of small intestine'
 'luminal epithelial cell of mammary gland' 'basa

## Test

In [15]:
adata = cell_census.get_anndata(
    census=census,
    organism="Homo sapiens",
    obs_value_filter='''assay == "10x 3\' v3" and cell_type_ontology_term_id == "CL:0000542"''',
    column_names={"obs": ["sex"]},
)


In [16]:
display(adata)

AnnData object with n_obs × n_vars = 42769 × 60664
    obs: 'sex', 'assay', 'cell_type_ontology_term_id'
    var: 'soma_joinid', 'feature_id', 'feature_name', 'feature_length'

In [12]:
adata.X

<704x2 sparse matrix of type '<class 'numpy.float32'>'
	with 102 stored elements in Compressed Sparse Row format>

Timing for the example query:
start 1449
end 1451

Why is this so fast and mine so slow?