# Census summary cell counts example

*Goal:* demonstrate basic use of the `census_summary_cell_counts` dataframe.

Each Cell Census contains a top-level dataframe summarizing counts of various cell labels. You can read this into a Pandas DataFrame:

In [3]:
import cell_census

census = cell_census.open_soma()
census_summary_cell_counts = census["census_info"]["summary_cell_counts"].read_as_pandas_all()

# Dropping the soma_joinid column as it isn't useful in this demo
census_summary_cell_counts = census_summary_cell_counts.drop(columns=["soma_joinid"])

census_summary_cell_counts

| | organism | category | ontology_term_id | unique_cell_count | total_cell_count | label |
|---|---|---|---|---|---|---|
| 0 | Homo sapiens | all | na | 22044980 | 34115852 | na |
| 1 | Homo sapiens | assay | EFO:0008722 | 177719 | 260396 | Drop-seq |
| 2 | Homo sapiens | assay | EFO:0008780 | 0 | 51304 | inDrop |
| 3 | Homo sapiens | assay | EFO:0008913 | 133511 | 133511 | single-cell RNA sequencing |
| 4 | Homo sapiens | assay | EFO:0008919 | 44721 | 161998 | Seq-Well |
| ... | ... | ... | ... | ... | ... | ... |
| 1147 | Mus musculus | tissue_general | UBERON:0002113 | 164881 | 188361 | kidney |
| 1148 | Mus musculus | tissue_general | UBERON:0002365 | 15577 | 31154 | exocrine gland |
| 1149 | Mus musculus | tissue_general | UBERON:0002367 | 37715 | 130135 | prostate gland |
| 1150 | Mus musculus | tissue_general | UBERON:0002368 | 13322 | 26644 | endocrine gland |


This dataframe is precomputed from the experiments in the Cell Census and is intended to give a quick overview of the Census contents.
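
As one example of such a quick look, the summary can be filtered with ordinary Pandas indexing. The sketch below rebuilds a few of the rows shown in the output above as a small stand-in dataframe (the name `summary` is introduced here for illustration) and lists the human assays by total cell count. With an open Census you would apply the same filter to `census_summary_cell_counts` directly.

```python
import pandas as pd

# Stand-in built from rows visible in the output above; a live session would
# use the `census_summary_cell_counts` dataframe instead.
summary = pd.DataFrame(
    {
        "organism": ["Homo sapiens"] * 5 + ["Mus musculus"],
        "category": ["all", "assay", "assay", "assay", "assay", "tissue_general"],
        "ontology_term_id": [
            "na", "EFO:0008722", "EFO:0008780",
            "EFO:0008913", "EFO:0008919", "UBERON:0002113",
        ],
        "unique_cell_count": [22044980, 177719, 0, 133511, 44721, 164881],
        "total_cell_count": [34115852, 260396, 51304, 133511, 161998, 188361],
        "label": [
            "na", "Drop-seq", "inDrop",
            "single-cell RNA sequencing", "Seq-Well", "kidney",
        ],
    }
)

# Keep only the human assay rows, largest total first.
is_human_assay = (summary["organism"] == "Homo sapiens") & (summary["category"] == "assay")
human_assays = summary[is_human_assay].sort_values("total_cell_count", ascending=False)

print(human_assays[["label", "unique_cell_count", "total_cell_count"]])
```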

You can compute similar group statistics using Pandas `groupby` functions.
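
To make the `groupby` pattern concrete without opening the Census, here is a minimal sketch on a toy dataframe. The rows and the names `toy_obs` and `toy_counts` are illustrative and are not Census data. The columns are stored as categoricals here purely to show what `observed=True` does: it restricts the result to label combinations that actually occur, rather than every possible pairing of categories.

```python
import pandas as pd

# Toy stand-in for an `obs` dataframe (illustrative rows, not Census data).
toy_obs = pd.DataFrame(
    {
        "cell_type_ontology_term_id": ["CL:0000003", "CL:0000003", "CL:0000019"],
        "cell_type": ["native cell", "native cell", "sperm"],
    }
).astype("category")

# One row per observed (term id, label) pair, with the row count in `size`.
toy_counts = toy_obs.groupby(
    by=["cell_type_ontology_term_id", "cell_type"], as_index=False, observed=True
).size()

print(toy_counts)
```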

The code below reproduces counts from the summary dataframe, grouped by cell type, using the full `obs` dataframe in the `homo_sapiens` experiment.

Keep in mind that the Cell Census is very large, and an unrestricted query can return a significant amount of data. You can manage that by narrowing the request with the `column_names` and `value_filter` arguments in your query.

In [4]:
human = census["census_data"]["homo_sapiens"]
obs_df = human.obs.read_as_pandas_all(column_names=["cell_type_ontology_term_id", "cell_type"])
obs_df.groupby(by=["cell_type_ontology_term_id", "cell_type"], as_index=False, observed=True).size()

| | cell_type_ontology_term_id | cell_type | size |
|---|---|---|---|
| 0 | CL:0000001 | primary cultured cell | 80 |
| 1 | CL:0000003 | native cell | 611233 |
| 2 | CL:0000006 | neuronal receptor cell | 2502 |
| 3 | CL:0000019 | sperm | 11 |
| 4 | CL:0000031 | neuroblast (sensu Vertebrata) | 2355 |
| ... | ... | ... | ... |
| 540 | CL:4023041 | L5 extratelencephalic projecting glutamatergic... | 2361 |
| 541 | CL:4023051 | vascular leptomeningeal cell | 3937 |
| 542 | CL:4023070 | caudal ganglionic eminence derived GABAergic c... | 8463 |
| 543 | CL:4028002 | alveolar capillary type 1 endothelial cell | 16048 |
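
The query above narrows the result by column only. A row filter can be added in the same call, as in the sketch below. The helper `count_cell_types` is hypothetical. It assumes that `read_as_pandas_all` accepts `value_filter` alongside `column_names`, as the text above suggests, and that the filter may refer to a column that is not among the returned columns. The filter string in the usage comment is also illustrative and assumes `obs` has a `tissue_general` column.

```python
def count_cell_types(obs, value_filter):
    """Count cells per cell type among the rows matching `value_filter`.

    `obs` is assumed to be an experiment's `obs` dataframe (for example
    `human.obs`) whose `read_as_pandas_all` method accepts both
    `column_names` and `value_filter` keyword arguments.
    """
    columns = ["cell_type_ontology_term_id", "cell_type"]
    # Fetch only the two label columns, and only the rows passing the filter.
    obs_df = obs.read_as_pandas_all(column_names=columns, value_filter=value_filter)
    return obs_df.groupby(by=columns, as_index=False, observed=True).size()


# Hypothetical usage against an open Census:
# kidney_counts = count_cell_types(human.obs, "tissue_general == 'kidney'")
```

Filtering at read time means less data is loaded into memory before `groupby` runs.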
