#SARS-CoV-2 entry factors are highly expressed in nasal epithelial cells together with innate immune genes

Authors: Waradon Sungnak, Ni Huang, Christophe Bécavin, Marijn Berg, Rachel Queen, Monika Litvinukova, Carlos Talavera-López, Henrike Maatz, Daniel Reichart, Fotios Sampaziotis, Kaylee B. Worlock, Masahiro Yoshida, Josephine L. Barnes & HCA Lung Biological Network -  Nat Med 26, 681–687 (2020). https://doi.org/10.1038/s41591-020-0868-6

The authors investigated SARS-CoV-2 potential tropism by surveying expression of viral entry-associated genes in single-cell RNA-sequencing data from multiple tissues from healthy human donors.

They co-detected these transcripts in specific respiratory, corneal and intestinal epithelial cells, potentially explaining the high efficiency of SARS-CoV-2 transmission. These genes are co-expressed in nasal epithelial cells with genes involved in innate immunity, highlighting the cells’ potential role in initial viral infection, spread and clearance. The study offers a useful resource for further lines of inquiry with valuable clinical samples from COVID-19 patients, and the authors provide their data in a comprehensive, open and user-friendly fashion at www.covid19cellatlas.org.

https://www.nature.com/articles/s41591-020-0868-6#citeas

In [None]:
# This Python 3 environment comes with many helpful analytics libraries installed
# It is defined by the kaggle/python Docker image: https://github.com/kaggle/docker-python
# For example, here's several helpful packages to load

import numpy as np # linear algebra
import pandas as pd # data processing, CSV file I/O (e.g. pd.read_csv)
import matplotlib.pyplot as plt
import seaborn as sns
import plotly.offline as py
import plotly.graph_objs as go
import plotly.express as px

# Input data files are available in the read-only "../input/" directory
# For example, running this (by clicking run or pressing Shift+Enter) will list all files under the input directory

import os
for dirname, _, filenames in os.walk('/kaggle/input'):
    for filename in filenames:
        print(os.path.join(dirname, filename))

# You can write up to 20GB to the current directory (/kaggle/working/) that gets preserved as output when you create a version using "Save & Run All" 
# You can also write temporary files to /kaggle/temp/, but they won't be saved outside of the current session

#ACE2 as a receptor for cellular entry

In symptomatic patients, nasal swabs have yielded higher viral loads than throat swabs. The same distribution was observed in an asymptomatic patient, implicating the nasal epithelium as a portal for initial infection and transmission. Cellular entry of coronaviruses depends on the binding of the spike (S) protein to a specific cellular receptor and subsequent S protein priming by cellular proteases. Like SARS-CoV, SARS-CoV-2 employs ACE2 as a receptor for cellular entry.

The binding affinity of the S protein and ACE2 was found to be a major determinant of SARS-CoV replication rate and disease severity. Viral entry also depends on the activity of the protease TMPRSS2 (transmembrane protease, serine 2, an enzyme encoded in humans by the TMPRSS2 gene), although cathepsin B/L activity may be able to substitute for it.
https://www.nature.com/articles/s41591-020-0868-6#citeas

In [None]:
df = pd.read_csv('/kaggle/input/ai4all-project/results/deconvolution/CIBERSORTx_Results_Krasnow_facs_droplet.csv')
df.head()

In [None]:
import plotly.express as px

# Grouping it by viral load and B Cell
plot_data = df.groupby(['viral_load', 'B cell'], as_index=False).Neutrophil.sum()

fig = px.bar(plot_data, x='viral_load', y='Neutrophil', color='B cell')
fig.show()

#ACE2 (Angiotensin-converting enzyme 2) and TMPRSS2 (Transmembrane protease, serine 2).

ACE2 and TMPRSS2 (Transmembrane protease, serine 2) have been detected in both nasal and bronchial epithelium by immunohistochemistry. Gene expression of ACE2 and TMPRSS2 has been reported to occur largely in alveolar epithelial type II cells, which are central to SARS-CoV pathogenesis, whereas a different study reported the absence of ACE2 in the upper airway. 

To clarify the expression patterns of ACE2 and TMPRSS2 (Transmembrane protease, serine 2), the authors analyzed their expression, and that of other genes potentially associated with SARS-CoV-2 pathogenesis, at cellular resolution. They used single-cell RNA sequencing (scRNA-seq) datasets from healthy donors generated by the Human Cell Atlas consortium and other resources, with the aim of informing and prioritizing the use of precious, limited clinical material that is becoming available from COVID-19 patients.
https://www.nature.com/articles/s41591-020-0868-6#citeas
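
As a rough illustration of what co-detection "at cellular resolution" means, the sketch below counts, per cell type, the fraction of cells in which both ACE2 and TMPRSS2 have at least one transcript. The toy count table and the helper name `codetection_fraction` are made up for illustration; this is not the authors' pipeline or data.

```python
import pandas as pd

# Hypothetical toy counts: one row per cell, raw transcript counts per gene.
toy_counts = pd.DataFrame(
    {
        "cell_type": ["Goblet", "Goblet", "Ciliated", "Ciliated", "Basal", "Basal"],
        "ACE2": [2, 0, 1, 3, 0, 0],
        "TMPRSS2": [1, 4, 0, 2, 5, 0],
    }
)


def codetection_fraction(counts, genes, group_col="cell_type"):
    """Fraction of cells per group in which every gene in `genes` has a count > 0."""
    detected_all = (counts[genes] > 0).all(axis=1)
    return detected_all.groupby(counts[group_col]).mean()


codetected = codetection_fraction(toy_counts, ["ACE2", "TMPRSS2"])
print(codetected)
```

In this toy table half of the goblet and ciliated cells carry both transcripts and none of the basal cells do. Real scRNA-seq data are sparse, so a zero count does not prove the gene is silent in that cell.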

In [None]:
from plotly.subplots import make_subplots


fig= make_subplots(rows= 2,cols=2, 
                    specs=[[{'secondary_y': True},{'secondary_y': True}],[{'secondary_y': True},{'secondary_y': True}]],
                    subplot_titles=("Basophil/Mast","Dendritic","T cell","Goblet")
                   )
# One bar trace per panel. Plotly names color axes 'coloraxis', 'coloraxis2', ...
# (there is no separate 'coloraxis1'), so all four panels share a single
# color axis here, which keeps one common color scale and one colorbar.
fig.add_trace(go.Bar(x=df['viral_load'],y=df['Basophil/Mast'],
                    marker=dict(color=df['Basophil/Mast'],coloraxis='coloraxis')),1,1)

fig.add_trace(go.Bar(x=df['viral_load'],y=df['Dendritic'],
                    marker=dict(color=df['Dendritic'],coloraxis='coloraxis')),1,2)

fig.add_trace(go.Bar(x=df['viral_load'],y=df['T cell'],
                    marker=dict(color=df['T cell'],coloraxis='coloraxis')),2,1)

fig.add_trace(go.Bar(x=df['viral_load'],y=df['Goblet'],
                    marker=dict(color=df['Goblet'],coloraxis='coloraxis')),2,2)
fig.show()

#Gene expression of ACE2 (Angiotensin-converting enzyme 2)

Gene expression of ACE2 (Angiotensin-converting enzyme 2) in an in vitro air-liquid interface (ALI) system. An epithelial regeneration system from nasal epithelial cells was used for in vitro cultures on successive days, resulting in different epithelial cell types along the differentiation trajectory.

The cultures were differentiated in Pneumacult media. The schematic illustration depicts the respective cell types in the differentiation trajectory, and the dot plot illustrates the cultured cell types along the differentiation pseudotime, along with their respective location within the epithelial layers.

For gene expression results in the dot plot: the dot size represents the proportion of cells within the respective cell type expressing the gene and the dot color represents the average gene expression level within the particular cell type.

![](https://www.researchgate.net/publication/340867442/figure/fig1/AS:883712338178050@1587704848523/Gene-expression-of-ACE2-in-an-in-vitro-air-liquid-interface-ALI-system-Epithelial.jpg)
https://www.researchgate.net/figure/Gene-expression-of-ACE2-in-an-in-vitro-air-liquid-interface-ALI-system-Epithelial_fig1_340867442
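
The two quantities a dot plot encodes can be computed directly. The sketch below is a minimal illustration on invented values, assuming one row per cell and that the average is taken over all cells of a type; the cell-type labels, numbers and the helper `dotplot_stats` are hypothetical.

```python
import pandas as pd

# Hypothetical toy expression values (e.g. normalized counts), one row per cell.
toy_expr = pd.DataFrame(
    {
        "cell_type": ["Secretory"] * 3 + ["Ciliated"] * 4,
        "ACE2": [0.0, 1.0, 2.0, 0.0, 0.0, 0.0, 4.0],
    }
)


def dotplot_stats(expr, gene, group_col="cell_type"):
    """Per cell type: fraction of cells expressing `gene` and its mean expression."""
    grouped = expr.groupby(group_col)[gene]
    return pd.DataFrame(
        {
            # Dot size: proportion of cells with non-zero expression.
            "fraction_expressing": grouped.apply(lambda s: (s > 0).mean()),
            # Dot color: average expression over all cells of that type.
            "mean_expression": grouped.mean(),
        }
    )


dot_stats = dotplot_stats(toy_expr, "ACE2")
print(dot_stats)
```

Both toy cell types have the same mean expression, yet a quarter of the ciliated cells express the gene versus two thirds of the secretory cells. This is why the plot shows size and color separately.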

In [None]:
fig= make_subplots(rows= 2,cols=2, 
                    specs=[[{'secondary_y': True},{'secondary_y': True}],[{'secondary_y': True},{'secondary_y': True}]],
                    subplot_titles=("Basal","Ciliated","Ionocyte","Monocytes/macrophages")
                   )
fig.add_trace(go.Bar(x=df['viral_load'],y=df['Basal'],
                    marker=dict(color=df['Basal'],coloraxis='coloraxis')),1,1)

fig.add_trace(go.Bar(x=df['viral_load'],y=df['Ciliated'],
                    marker=dict(color=df['Ciliated'],coloraxis='coloraxis')),1,2)

fig.add_trace(go.Bar(x=df['viral_load'],y=df['Ionocyte'],
                    marker=dict(color=df['Ionocyte'],coloraxis='coloraxis')),2,1)

fig.add_trace(go.Bar(x=df['viral_load'],y=df['Monocytes/macrophages'],
                    marker=dict(color=df['Monocytes/macrophages'],coloraxis='coloraxis')),2,2)

#Viral receptors and entry-associated molecules in the nasal region.

The authors asked whether enriched expression of viral receptors and entry-associated molecules in the nasal region/upper airway might be relevant for viral transmissibility. They assessed the expression of viral receptor genes that are used by other coronaviruses and influenza viruses in their datasets.

They looked for ANPEP, used by HCoV-229E, and DPP4, used by MERS-CoV, as well as the enzymes ST6GAL1 and ST3GAL4, which are important for the synthesis of α(2,6)-linked and α(2,3)-linked sialic acids recognized by influenza viruses.

Notably, their expression distribution coincided with viral transmissibility patterns based on a comparison to the basic reproduction number (R0), which estimates the number of people who can become infected from a single infected person.

The skewed distribution of the receptors/enzymes toward the upper airway is observed for viruses with higher R0/infectivity, including SARS-CoV/SARS-CoV-2 (R0 ~1.4–5.0), influenza (mean R0 ~1.3) and HCoV-229E (unidentified R0; associated with common cold).

This distribution is in distinct contrast with that of DPP4, the receptor for MERS-CoV (R0 ~0.3–0.8), a coronavirus with limited human-to-human transmission, in which expression skews toward lower airway/lung parenchyma. 

Therefore, their data highlight the possibility that viral transmissibility is dependent on the spatial distribution of receptor accessibility along the respiratory tract.
https://www.nature.com/articles/s41591-020-0868-6#citeas

In [None]:
import plotly.express as px

fig = px.histogram(df, x="viral_load", y="B cell", color = 'czb_id',
                   marginal="rug", # or violin, rug,
                   hover_data=df.columns,
                   color_discrete_sequence=['indianred','lightblue'],
                   )

fig.update_layout(
    title="Covid-19 Viral Load",
    xaxis_title="Viral Load",
    yaxis_title="B cells",
)
fig.update_yaxes(tickangle=-30, tickfont=dict(size=7.5))

fig.show();

#Expression of ACE2 (an entry receptor for SARS-CoV and SARS-CoV-2)

a, Expression of ACE2 (an entry receptor for SARS-CoV and SARS-CoV-2), ANPEP (an entry receptor for HCoV-229E), ST6GAL1/ST3GAL4 (enzymes important for synthesis of influenza entry receptors) and DPP4 (an entry receptor for MERS-CoV) from the airway epithelial datasets: Vieira Braga et al. (left) and Deprez et al. (right). The basic reproductive number (R0) for respective viruses, if available, is shown.

b, Respiratory epithelial expression of the top 50 genes correlated with ACE2 expression based on Spearman’s correlation analysis (with Benjamini–Hochberg-adjusted P values) performed on all cells within the Vieira Braga et al. airway epithelial dataset.

The colored gene names represent genes that are immune-associated (GO:0002376, immune system process or GO:0002526, acute inflammatory response). For gene expression results in the dot plots, the dot size represents the proportion of cells within the respective cell type expressing the gene and the color represents the average gene expression level within the particular cell type.

![](https://media.springernature.com/lw685/springer-static/image/art%3A10.1038%2Fs41591-020-0868-6/MediaObjects/41591_2020_868_Fig2_HTML.png?as=webp)
https://www.nature.com/articles/s41591-020-0868-6#citeas

In [None]:
import plotly.express as px

fig = px.histogram(df, x="viral_load", y="T cell", color = 'Dendritic',
                   marginal="rug", # or violin, rug,
                   hover_data=df.columns,
                   color_discrete_sequence=['fuchsia','cyan'],
                   )

fig.update_layout(
    title="Covid-19 Viral Load",
    xaxis_title="Viral Load",
    yaxis_title="T cells",
)
fig.update_yaxes(tickangle=-30, tickfont=dict(size=7.5))

fig.show();

#Study findings: Drugs/Vaccines administered intranasally could be highly effective in limiting spread.

In this study, the authors explored multiple scRNA-seq datasets generated within the Human Cell Atlas (HCA) consortium and other resources and found that the SARS-CoV-2 entry receptor ACE2 and viral entry-associated protease TMPRSS2 are highly expressed in nasal goblet and ciliated cells.

This finding implicates these cells as loci of original infection and possible reservoirs for dissemination within and between individuals. Co-expression in other barrier surface tissues could also suggest further investigation into alternative transmission routes. 

For example, the co-expression in esophagus, ileum and colon could explain viral fecal shedding observed clinically, with implications for potential fecal–oral transmission, whereas the co-expression in superficial conjunctival cells could explain an ocular phenotype observed in a small portion of COVID-19 patients, with the potential of spread through the nasolacrimal duct.

Moreover, as SARS-CoV-2 is an enveloped virus, its release does not require cell lysis. Thus, the virus might exploit existing secretory pathways in nasal goblet cells sustained at a presymptomatic stage. These discoveries could have translational implications. For example, given that nasal carriage is likely to be a key feature of transmission, drugs/vaccines administered intranasally could be highly effective in limiting spread.
https://www.nature.com/articles/s41591-020-0868-6#citeas

In [None]:
fig = px.parallel_categories(df, color="B cell", color_continuous_scale=px.colors.sequential.OrRd)
fig.show()

#Methods

For correlation analysis with ACE2, the authors performed Spearman’s correlation with statistical tests using the R Hmisc package (v.4.3-1) and P values were adjusted with the Benjamini–Hochberg method with the R stats package (v.3.6.1).
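
The authors ran this step in R (Hmisc and stats). Purely as an illustration of the same idea in this notebook's language, the sketch below computes Spearman's correlation of every gene with ACE2 and adjusts the P values with the Benjamini-Hochberg procedure. It assumes SciPy is installed; the data are simulated, and `GENE_A`, `GENE_B` and both helper functions are hypothetical names, not part of the published analysis.

```python
import numpy as np
import pandas as pd
from scipy.stats import spearmanr


def benjamini_hochberg(pvalues):
    """Benjamini-Hochberg adjusted P values (the procedure behind R's p.adjust 'BH')."""
    p = np.asarray(pvalues, dtype=float)
    n = p.size
    order = np.argsort(p)
    scaled = p[order] * n / np.arange(1, n + 1)
    # Enforce monotonicity starting from the largest P value, then cap at 1.
    adjusted_sorted = np.minimum.accumulate(scaled[::-1])[::-1]
    adjusted = np.empty(n)
    adjusted[order] = np.minimum(adjusted_sorted, 1.0)
    return adjusted


def correlate_with_target(expr, target="ACE2"):
    """Spearman's rho and BH-adjusted P value of each column against `target`."""
    rows = []
    for gene in expr.columns.drop(target):
        rho, pval = spearmanr(expr[target], expr[gene])
        rows.append((gene, rho, pval))
    out = pd.DataFrame(rows, columns=["gene", "rho", "p_value"]).set_index("gene")
    out["p_adjusted"] = benjamini_hochberg(out["p_value"].to_numpy())
    return out.sort_values("rho", ascending=False)


# Simulated toy data: 200 cells, placeholder gene names.
rng = np.random.default_rng(0)
ace2 = rng.poisson(1.0, size=200).astype(float)
toy = pd.DataFrame(
    {
        "ACE2": ace2,
        "GENE_A": ace2 + rng.normal(0, 0.5, size=200),  # built to track ACE2
        "GENE_B": rng.poisson(1.0, size=200).astype(float),  # independent of ACE2
    }
)
result = correlate_with_target(toy)
print(result)
```

Sorting by `rho` and keeping the first rows would give a "top correlated genes" list of the kind described above.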

They also tested multiple additional approaches, including Kendall’s correlation, data transformation by the sctransform function in the Seurat package and data imputation by the Markov affinity-based graph imputation of cells (MAGIC) algorithm, to compare correlation results.

While imputation significantly improved correlations, the top genes correlated with ACE2 were largely the same as in the analysis performed on un-imputed data. Because it is uncertain to what extent imputation artificially distorts the data, they reported results with no imputation, even though correlations were low.
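
One simple way to check that two analyses agree on the top genes is to measure how much their top-k lists overlap. The sketch below is only an illustration of that comparison: the coefficients are invented, `top_k_overlap` is a hypothetical helper, and it assumes `k` is no larger than the number of genes.

```python
import pandas as pd


def top_k_overlap(corr_a, corr_b, k=50):
    """Fraction of the top-k genes (ranked by correlation) shared by two analyses."""
    top_a = set(corr_a.nlargest(k).index)
    top_b = set(corr_b.nlargest(k).index)
    return len(top_a & top_b) / k


# Invented coefficients: the second analysis has larger values but a similar ranking.
unimputed = pd.Series({"G1": 0.12, "G2": 0.10, "G3": 0.08, "G4": 0.03, "G5": 0.01})
imputed = pd.Series({"G1": 0.80, "G2": 0.75, "G3": 0.40, "G4": 0.60, "G5": 0.10})

overlap = top_k_overlap(unimputed, imputed, k=3)
print(overlap)
```

The measure depends only on rank order, so it stays meaningful even when one method inflates the absolute correlation values.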

The correlation coefficients for all genes are included as Supplementary Data. The top 50 genes in each dataset were characterized based on gene ontology classes from the Gene Ontology database and associated pathways in PathCards were from the Pathway Unification database.
https://www.nature.com/articles/s41591-020-0868-6#citeas

Analysis notebooks are available at github.com/Teichlab/covid19_MS1.

In [None]:
import plotly.figure_factory as ff
fig = make_subplots(rows=1, cols=5)
df_num = df[['B cell', 'T cell', 'Neutrophil', 'Club', 'Monocytes/macrophages']]

fig1 = ff.create_distplot([df_num['B cell']], ['B cell'])
fig2 = ff.create_distplot([df_num['T cell']], ['T cell'])
fig3 =  ff.create_distplot([df_num['Neutrophil']], ['Neutrophil'])
fig4 =  ff.create_distplot([df_num['Club']], ['Club'])
fig5 =  ff.create_distplot([df_num['Monocytes/macrophages']], ['Monocytes/macrophages'])

fig.add_trace(go.Histogram(fig1['data'][0], marker_color='blue'), row=1, col=1)
fig.add_trace(go.Histogram(fig2['data'][0],marker_color='red'), row=1, col=2)
fig.add_trace(go.Histogram(fig3['data'][0], marker_color='green'), row=1, col=3)
fig.add_trace(go.Histogram(fig4['data'][0],marker_color='yellow'), row=1, col=4)
fig.add_trace(go.Histogram(fig5['data'][0],marker_color='purple'), row=1, col=5)


fig.show()

#Current Understanding of Nasal Epithelial Cell Mis-Differentiation

Authors: Agmal Scherzad, Rudolf Hagen, and Stephan Hackenberg - J Inflamm Res. 2019; 12: 309–317.
Published online 2019 Dec 13. doi: 10.2147/JIR.S180853

The functional role of the respiratory epithelium is to generate a physical barrier. In addition, the epithelium supports the innate and acquired immune system through various cytokines and chemokines. However, epithelial cells are also involved in the pathogenesis of various respiratory diseases, some of which are mediated by increased permeability of the mucosal membrane or disturbed mucociliary transport. 

In addition, it has been shown that epithelial cells are involved in the development of inflammatory respiratory diseases. The following review article focuses on the aspects of epithelial mis-differentiation, in particular with respect to nasal mucosal barrier function, epithelial immunogenicity, nasal epithelial–mesenchymal transition and nasal microbiome.
https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6916682/

In [None]:
df1 = pd.read_csv('/kaggle/input/ai4all-project/figures/classifier/auc_table.csv')
df1.head()

In [None]:
ax = df1['Covid vs Other virus'].head(10).value_counts().plot.barh(figsize=(14, 6), color='orange')
ax.set_title('Covid vs Other virus Distribution',color='green', size=18)
ax.set_ylabel('Covid vs Other virus', size=14)
ax.set_xlabel('Count', size=14)

#Nasal mucosal barrier function

The nasal mucosa represents an interface between the environment and the inside of the human organism. It is the first barrier against continuously inhaled substances such as pathogens and allergens.

An important intrinsic defense system is the mucociliary clearance of the nasal cavity. Cilia beat in a well-orchestrated and coordinated manner, which results in a wave motion leading to a successful elimination of foreign bodies. The respiratory epithelium contains about 200 cilia per cell. These have nine peripheral microtubule pairs that surround a central microtubule pair, which leads to the well-known “9+2” arrangement of microtubules.

Chronic inflammation or locally applied medication can impair epithelial function through disturbed or missing ciliary activity and epithelial metaplasia, both of which impair mucociliary clearance. Thus, the integrity of the nasal protective mechanisms may be further compromised.

Other possible etiologic factors for nasal epithelial metaplasia are cigarette smoke, ozone, and heavy metals. Chronic inflammation such as chronic rhinosinusitis (CRS) or asthma leads to epithelial damage resulting in increased paracellular permeability, impaired epithelial repair mechanisms and inflammation.

Histologically, the respiratory epithelium changes into a hypersecretory mucus state with increased proliferation rates of goblet cells, hypertrophy of submucosal glands, basement membrane thickening, hypertrophy of smooth muscles, and a thick layer of mucus on the apical surface.
https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6916682/

In [None]:
#Code by Taha07  https://www.kaggle.com/taha07/data-scientists-jobs-analysis-visualization/notebook

color = plt.cm.Pastel1(np.linspace(0,1,20))
df1["Model"].value_counts().sort_values(ascending=False).head(10).plot.pie(y="Covid vs Other virus",colors=color,autopct="%0.1f%%")
plt.title("Covid vs Other Virus Models")
plt.axis("off")
plt.show()

#The Claudin family

The claudin family comprises 27 members. Claudins are transmembrane proteins that form the structural basis for a close tight junction (TJ) connection. Typically, claudins have a unique secondary protein structure: four transmembrane (TM) domains, N- and C-terminal domains oriented toward the cytosol, two extracellular domains, and a short intracellular loop.

Functionally, the main task of the claudins is the formation of the paracellular TJ barrier, which gives them a key position with regard to the permeability of individual epithelia. In addition, claudins are categorized according to their abilities, i.e., formation of paracellular channels (pore-forming claudins) and restriction of paracellular permeability (sealing claudins).

This emphasizes the different characteristics of the individual claudin family members with regard to their barrier properties. Some claudin subtypes, such as claudin-3, −4, −5, and −8, are mainly detectable in impermeable epithelial cells.

Other claudins, such as claudin-2, are found in permeable epithelia like the surface epithelium of duodenum, ileum and jejunum. This shows the different roles of claudins in the epithelial barrier functionality.
https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6916682/

In [None]:
fig = px.bar(df1, x= "Covid vs Other virus", y= "Model", color_discrete_sequence=['purple'], title="Covid vs Other virus Models")
fig.show()

#Nasal Epithelial Innate Immunogenicity - Host defense

The mechanical barrier function of the nose has been known for a long time, but its immunogenic function has been the subject of new findings in recent decades, which show that the nasal mucosa is an amazingly active participant in the innate immunity of the respiratory tract. 

However, the detailed function of this immunogenicity is not fully understood. As part of the innate immune defense, the nasal mucosa includes receptors for the identification of pathogenic structures of microorganisms, fungi and viruses. 

Further mechanisms are chemical components such as antimicrobial peptides and cellular components such as neutrophil granulocytes, macrophages and dendritic cells. Although pathogens can be distinguished from non-pathogens, innate immunity is relatively unspecific compared to the adaptive immune response. The recognition of pathogens is achieved by pattern-recognition receptors (PRRs) on the mucosal surface, which were first described by Charles Janeway Jr. PRRs can be divided into three large families, namely the Toll-like receptors (TLRs), retinoic acid-inducible gene (RIG)-I-like receptors (RLRs) and nucleotide-binding oligomerisation domain (NOD)-like receptors (NLRs).

NLRs (NOD-like receptors) are intracellular PRRs that induce an immune response after the detection of pathogen-associated molecular patterns (PAMPs) or damage-associated molecular patterns (DAMPs). The activation of NLRs serves different functions, which can be divided into four main categories: inflammasome formation, signaling transduction, transcription activation, and autophagy. In humans, 22 NLRs are known, and the association of their malfunction with human diseases reflects their vital role in host defense.
https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6916682/

In [None]:
fig = px.bar(df1, x= "Model", y= "All", color_discrete_sequence=['darksalmon'], title="Covid vs No virus Models")
fig.show()

In [None]:
#Code by Olga Belitskaya https://www.kaggle.com/olgabelitskaya/sequential-data/comments
from IPython.display import display,HTML
c1,c2,f1,f2,fs1,fs2=\
'#eb3434','#eb3446','Akronim','Smokum',30,15
def dhtml(string,fontcolor=c1,font=f1,fontsize=fs1):
    display(HTML("""<style>
    @import 'https://fonts.googleapis.com/css?family="""\
    +font+"""&effect=3d-float';</style>
    <h1 class='font-effect-3d-float' style='font-family:"""+\
    font+"""; color:"""+fontcolor+"""; font-size:"""+\
    str(fontsize)+"""px;'>%s</h1>"""%string))
    
    
dhtml('Marília Prata, @mpwolke Was here' )