Description: 
We integrated and analyzed the single-cell transcriptomes of 48 human pancreatic islets from 17 healthy, 14 pre-diabetic, and 17 type 2 diabetic cadaveric donors matched for sex, age, and ancestry. The dataset consists of 245,878 high-quality cells clustered into 14 cell types. We summarize the findings for the integrated cell type transcriptomes and for the re-integrated alpha, beta, and delta cell types.

Reference:
Single-cell decoding of human islet cell type-specific alterations in type 2 diabetes reveals converging genetic- and state-driven β-cell gene expression defects
Khushdeep Bandesh, Efthymios Motakis, Siddhi Nargund, Romy Kursawe, Vijay Selvam, Redwan M Bhuiyan, Giray Naim Eryilmaz, Sai Nivedita Krishnan, Cassandra N. Spracklen, Duygu Ucar, Michael L. Stitzel
bioRxiv 2025.01.17.633590; doi: https://doi.org/10.1101/2025.01.17.633590

Data: https://datasets.cellxgene.cziscience.com/d45ff50f-90e1-4983-9388-c5b2ca1f2866.h5ad

In [3]:
!pip install scanpy igraph leidenalg anndata pandas seaborn matplotlib requests



In [4]:
import scanpy as sc
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt
import numpy as np

In [None]:
import requests

url = 'https://datasets.cellxgene.cziscience.com/d45ff50f-90e1-4983-9388-c5b2ca1f2866.h5ad'
response = requests.get(url)

In [4]:
#Save the file locally
with open('d45ff50f-90e1-4983-9388-c5b2ca1f2866.h5ad', 'wb') as file:
    file.write(response.content)
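
The download above holds the whole `.h5ad` file in memory before writing it. For a file of this size, a streamed download may be gentler on RAM. The following is a minimal sketch using only the standard library; `download_file` and the `.part` temporary suffix are hypothetical names introduced here, not part of the original notebook.

```python
import shutil
import urllib.request
from pathlib import Path


def download_file(url, dest, chunk_size=1024 * 1024):
    """Stream `url` to `dest` in chunks; skip the download if `dest` exists."""
    dest = Path(dest)
    if dest.exists():
        return dest
    # Write to a temporary name first so an interrupted download
    # does not leave a truncated file under the final name.
    tmp = dest.with_name(dest.name + '.part')
    with urllib.request.urlopen(url) as response, open(tmp, 'wb') as out:
        shutil.copyfileobj(response, out, length=chunk_size)
    tmp.replace(dest)
    return dest


# Example call (not executed here):
# download_file(url, 'd45ff50f-90e1-4983-9388-c5b2ca1f2866.h5ad')
```

Because the function returns early when the target file already exists, re-running the cell does not fetch the data again.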

In [5]:
sc_data = sc.read_h5ad("d45ff50f-90e1-4983-9388-c5b2ca1f2866.h5ad")

In [None]:
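# Work on a copy so the loaded object stays untouched
output_h5ad = sc_data.copy()

# Scale every cell to a common total count, then apply log(1 + x)
sc.pp.normalize_total(output_h5ad)
sc.pp.log1p(output_h5ad)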
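
To make the two preprocessing calls concrete, here is a small NumPy sketch of the arithmetic they perform with default arguments: `normalize_total` rescales each cell so its total equals the median of the per-cell totals, and `log1p` applies log(1 + x). The toy matrix is made up, and the sketch assumes `.X` holds raw counts, which the notebook does not verify for this dataset.

```python
import numpy as np

# Toy count matrix: 3 cells (rows) x 3 genes (columns); values are made up.
counts = np.array([
    [1.0, 3.0, 0.0],
    [2.0, 2.0, 4.0],
    [0.0, 5.0, 5.0],
])

# Per-cell totals: 4, 8 and 10
totals = counts.sum(axis=1)

# With target_sum=None, the target is the median of the per-cell totals (8 here)
target = np.median(totals)

# Rescale each cell so its counts sum to the target
normalized = counts / totals[:, None] * target

# Natural log of (1 + x), applied element-wise
logged = np.log1p(normalized)
```

If the matrix were already normalized, applying these steps a second time would distort the values, so it may be worth checking a few row sums of `.X` first.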

In [5]:
print(output_h5ad.var.columns)

Index(['vst.mean', 'vst.variance', 'vst.variance.expected',
       'vst.variance.standardized', 'vst.variable', 'feature_is_filtered',
       'feature_name', 'feature_reference', 'feature_biotype',
       'feature_length', 'feature_type'],
      dtype='object')


In [6]:
print(output_h5ad.var['feature_name'])

ENSG00000241860    ENSG00000241860
ENSG00000286448    ENSG00000286448
ENSG00000237491          LINC01409
ENSG00000177757             FAM87B
ENSG00000228794          LINC01128
                        ...       
ENSG00000261442    ENSG00000261442
ENSG00000262815    ENSG00000262815
ENSG00000267480          LINC02979
ENSG00000236352    ENSG00000236352
ENSG00000228459          LINC01546
Name: feature_name, Length: 26936, dtype: category
Categories (26896, object): ['A1BG', 'A1BG-AS1', 'A1CF', 'A2M', ..., 'ZYG11B', 'ZYX', 'ZZEF1', 'ZZZ3']


In [7]:
# Find genes whose name contains 'KIF21A'
matches = [gene for gene in output_h5ad.var.feature_name if 'KIF21A' in gene]
print(matches)

['KIF21A']
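
The substring search above can return more than one gene when names share a prefix, while the later subsetting relies on an exact match against `feature_name`. The sketch below contrasts the two on a toy stand-in for `output_h5ad.var`; the IDs, the extra gene names, and the helper `find_gene_ids` are all hypothetical.

```python
import pandas as pd

# Toy stand-in for `output_h5ad.var`; IDs and names below are made up.
var = pd.DataFrame(
    {'feature_name': ['GENE1', 'KIF21A', 'KIF21A-like']},
    index=['ENSG_TOY_1', 'ENSG_TOY_2', 'ENSG_TOY_3'],
)


def find_gene_ids(var, gene_name, exact=True):
    """Return the var index entries whose feature_name matches gene_name."""
    names = var['feature_name'].astype(str)
    if exact:
        mask = names == gene_name
    else:
        # Plain substring match, no regular expressions
        mask = names.str.contains(gene_name, regex=False)
    return var.index[mask].tolist()


exact_ids = find_gene_ids(var, 'KIF21A')
substring_ids = find_gene_ids(var, 'KIF21A', exact=False)
```

Checking that the exact match returns a single ID before extracting expression helps avoid silently mixing several genes into one vector.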


In [None]:
normal_h5ad = output_h5ad[output_h5ad.obs.disease == 'normal', :]
T2D_h5ad = output_h5ad[output_h5ad.obs.disease == 'type 2 diabetes mellitus', :]

In [None]:
gene_name = 'KIF21A'

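# Expression vector for the gene in each group (dense, one value per cell)
normal_expr = normal_h5ad[:, normal_h5ad.var.feature_name == gene_name].X.toarray().flatten()
T2D_expr = T2D_h5ad[:, T2D_h5ad.var.feature_name == gene_name].X.toarray().flatten()

# Keep only cells with non-zero expression; lists are kept for the padding step below
normal_expr = normal_expr[normal_expr != 0].tolist()
T2D_expr = T2D_expr[T2D_expr != 0].tolist()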
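
Dropping zeros means the later comparison only covers cells that express the gene, so the share of expressing cells in each group is lost. A small helper like the one below could record it before filtering; `summarize_expression` is a hypothetical name and the input values are made up.

```python
import numpy as np


def summarize_expression(values):
    """Summarize one group's per-cell expression vector (zeros included)."""
    values = np.asarray(values, dtype=float)
    nonzero = values[values != 0]
    return {
        'n_cells': int(values.size),
        'n_expressing': int(nonzero.size),
        'fraction_expressing': nonzero.size / values.size if values.size else float('nan'),
        'mean_nonzero': float(nonzero.mean()) if nonzero.size else float('nan'),
    }


# Toy vector: 5 cells, 2 of which express the gene
toy = [0.0, 1.2, 0.0, 0.8, 0.0]
summary = summarize_expression(toy)
```

Applied to the unfiltered vectors for each group, this would show whether the groups differ in how many cells express the gene at all, which the non-zero values alone cannot reveal.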

In [None]:
max_length = max(len(normal_expr), len(T2D_expr))
df = pd.DataFrame({
    'normal': normal_expr + [None] * (max_length - len(normal_expr)),
    'type 2 diabetes mellitus': T2D_expr + [None] * (max_length - len(T2D_expr)),
})
df.to_csv('sc_human_KIF21A_expr_disease.csv', index=False)

In [None]:
df = pd.DataFrame({
    gene_name: np.concatenate([normal_expr, T2D_expr]),
    'Group': ['normal'] * len(normal_expr) + ['type 2 diabetes mellitus'] * len(T2D_expr)
})

# Create violin plot (only cells with non-zero expression are included)
plt.figure(figsize=(8, 6))
sns.violinplot(x='Group', y=gene_name, data=df)
plt.title(f'{gene_name} Expression in human normal vs type 2 diabetes mellitus')
plt.ylabel(f'{gene_name} Gene Expression')
plt.show()