#What is the ACE2 receptor, how is it connected to coronavirus and why might it be key to treating COVID-19?

Authors: Krishna Sriram: Postdoctoral Fellow, University of California San Diego 
Paul Insel: Professor of Pharmacology and Medicine, University of California San Diego 
Rohit Loomba: Professor of Medicine, University of California San Diego 

In the search for treatments for COVID-19, many researchers are focusing their attention on a specific protein that allows the virus to infect human cells. Called the angiotensin-converting enzyme 2, or ACE2 “receptor,” the protein provides the entry point for the coronavirus to hook into and infect a wide range of human cells.

ACE2 is a protein on the surface of many cell types. It is an enzyme that generates small proteins – by cutting up the larger protein angiotensinogen – that then go on to regulate functions in the cell.

Using the spike-like protein on its surface, the SARS-CoV-2 virus binds to ACE2 – like a key being inserted into a lock – prior to entry and infection of cells. Hence, ACE2 acts as a cellular doorway – a receptor – for the virus that causes COVID-19. https://theconversation.com/what-is-the-ace2-receptor-how-is-it-connected-to-coronavirus-and-why-might-it-be-key-to-treating-covid-19-the-experts-explain-136928

In [None]:
#codes from Rodrigo Lima  @rodrigolima82
from IPython.display import Image
Image(url = 'https://encrypted-tbn0.gstatic.com/images?q=tbn%3AANd9GcT3hI0zL5ogpWGhM1kedDeZvdpRtewiXOh5ZIAaf7M-Ask5CM8-&usqp=CAU',width=400,height=400)

In [None]:
# This Python 3 environment comes with many helpful analytics libraries installed
# It is defined by the kaggle/python Docker image: https://github.com/kaggle/docker-python
# For example, here's several helpful packages to load

import numpy as np # linear algebra
import pandas as pd # data processing, CSV file I/O (e.g. pd.read_csv)
import seaborn as sns
import matplotlib.pyplot as plt
import plotly.graph_objs as go
import plotly.offline as py
import plotly.express as px

# Input data files are available in the read-only "../input/" directory
# For example, running this (by clicking run or pressing Shift+Enter) will list all files under the input directory

import os
for dirname, _, filenames in os.walk('/kaggle/input'):
    for filename in filenames:
        print(os.path.join(dirname, filename))

# You can write up to 5GB to the current directory (/kaggle/working/) that gets preserved as output when you create a version using "Save & Run All" 
# You can also write temporary files to /kaggle/temp/, but they won't be saved outside of the current session

ACE2 is present in many cell types and tissues including the lungs, heart, blood vessels, kidneys, liver and gastrointestinal tract.  In the lungs, ACE2 is highly abundant on type 2 pneumocytes, an important cell type present in chambers within the lung called alveoli, where oxygen is absorbed and waste carbon dioxide is released.

ACE2 is a vital element in a biochemical pathway that is critical to regulating processes such as blood pressure, wound healing and inflammation, called the renin-angiotensin-aldosterone system (RAAS) pathway.

ACE2 helps modulate the many activities of a protein called angiotensin II (ANG II) that increases blood pressure and inflammation.

Of greatest relevance to COVID-19, ANG II can increase inflammation and the death of cells in the alveoli which are critical for bringing oxygen into the body. https://theconversation.com/what-is-the-ace2-receptor-how-is-it-connected-to-coronavirus-and-why-might-it-be-key-to-treating-covid-19-the-experts-explain-136928

In [None]:
df = pd.read_csv('../input/cusersmarildownloadsangiotensincsv/angiotensin.csv', sep=';')
df

#Codes from Nitin Datta  https://www.kaggle.com/nitindatta/finance-data-analysis

In [None]:
# Count the rows for each value of the 'Genotyping' column
sns.countplot(x=df['Genotyping'], linewidth=3, palette="Set2", edgecolor='black')
plt.show()

When the SARS-CoV-2 virus binds to ACE2, it prevents ACE2 from performing its normal function to regulate ANG II signaling. Thus, ACE2 action is “inhibited,” removing the brakes from ANG II signaling and making more ANG II available to injure tissues. This “decreased braking” likely contributes to injury, especially to the lungs and heart, in COVID-19 patients. 

ACE2 is present in all people but the quantity can vary among individuals and in different tissues and cells. Some evidence suggests that ACE2 may be higher in patients with hypertension, diabetes and coronary heart disease.

A lack of ACE2 is associated with severe tissue injury in the heart, lungs and other tissue types. 

The SARS-CoV-2 virus requires ACE2 to infect cells, but the precise relationship between ACE2 levels, viral infectivity and severity of infection is not well understood. https://theconversation.com/what-is-the-ace2-receptor-how-is-it-connected-to-coronavirus-and-why-might-it-be-key-to-treating-covid-19-the-experts-explain-136928

In [None]:
# Count the rows for each value of the 'End_point' column
sns.countplot(x=df['End_point'], linewidth=3, palette="Set2", edgecolor='black')
plt.show()

When the amount of ACE2 is reduced because the virus is occupying the receptor, individuals may be more susceptible to severe illness from COVID-19. That is because enough ACE2 is available to facilitate viral entry but the decrease in available ACE2 contributes to more ANG II-mediated injury. In particular, reducing ACE2 will increase susceptibility to inflammation, cell death and organ failure, especially in the heart and the lung.

The lungs are the primary site of injury by SARS-CoV-2 infection, which causes COVID-19. The virus reaches the lungs after entry in the nose or mouth.

The virus also impacts other tissues that express ACE2, including the heart, where damage and inflammation (myocarditis) can occur. The kidneys, liver and digestive tract can also be injured. Blood vessels may also be a site for damage. 

Abnormally high ANG II activity can be a key factor that determines severity of damage in patients with COVID-19.
https://theconversation.com/what-is-the-ace2-receptor-how-is-it-connected-to-coronavirus-and-why-might-it-be-key-to-treating-covid-19-the-experts-explain-136928

In [None]:
# Count the rows for each value of the 'Country' column
sns.countplot(x=df['Country'], linewidth=3, palette="Set3", edgecolor='black')
plt.show()

Angiotensin converting enzyme (ACE, aka ACE1) is another protein, also found in tissues such as the lung and heart, where ACE2 is present. Drugs that inhibit the actions of ACE1 are called ACE inhibitors.

Examples of these drugs are ramipril, lisinopril, and enalapril. These drugs block the actions of ACE1 but not ACE2. ACE1 drives the production of ANG II. In effect, ACE1 and ACE2 have a “yin-yang” relationship; ACE1 increases the amount of ANG II, whereas ACE2 reduces ANG II. 

By inhibiting ACE1, ACE inhibitors reduce the levels of ANG II and its ability to increase blood pressure and tissue injury. ACE inhibitors are commonly prescribed for patients with hypertension, heart failure and kidney disease. 

A multidisciplinary group of investigators has initiated a multicenter (randomized, double-blinded, placebo-controlled) clinical trial to examine the efficacy of ramipril - an ACE inhibitor - compared to a placebo in reducing mortality, ICU admission or need for mechanical ventilation in patients with COVID-19. https://theconversation.com/what-is-the-ace2-receptor-how-is-it-connected-to-coronavirus-and-why-might-it-be-key-to-treating-covid-19-the-experts-explain-136928

In [None]:
sns.countplot(x=df['Selection'],palette='coolwarm',linewidth=2,edgecolor='black')

#High expression of ACE2 receptor of 2019-nCoV on the epithelial cells of oral mucosa

Citation: Xu, H., Zhong, L., Deng, J. et al. High expression of ACE2 receptor of 2019-nCoV on the epithelial cells of oral mucosa. Int J Oral Sci 12, 8 (2020). https://doi.org/10.1038/s41368-020-0074-x

ACE2 is expressed on the mucosa of the oral cavity, and this receptor was highly enriched in epithelial cells of the tongue. Those findings explain the basic mechanism by which the oral cavity is potentially at high risk of 2019-nCoV infection, and they provide a piece of evidence for the future `prevention strategy in dental clinical practice` as well as daily life.

The expression and distribution of ACE2 in the human body may indicate the potential infection routes of 2019-nCoV. Using single-cell RNA sequencing (scRNA-Seq) and single-cell transcriptomes from public databases, researchers analyzed the ACE2 RNA expression profile at single-cell resolution.

High ACE2 expression was identified in type II alveolar cells (AT2) of lung, and many others. These findings indicated that those organs with high ACE2-expressing cells should be considered as potential high risk for 2019-nCoV infection.

ACE2 was found to be expressed in the oral cavity and was highly enriched in epithelial cells. Moreover, among different oral sites, ACE2 expression was higher in the tongue than in buccal and gingival tissues. These findings indicate that the mucosa of the oral cavity may be a potentially high-risk route of 2019-nCoV infection. https://www.nature.com/articles/s41368-020-0074-x#citeas

In [None]:
plt.figure(figsize=(18,6))
plt.subplot(1, 2, 1)
sns.countplot(x=df['LVH_GG'],hue=df['Outcomer'],palette='summer',linewidth=3,edgecolor='white')
plt.title('Outcomer')
plt.subplot(1, 2, 2)
sns.countplot(x=df['LVH_GG'],hue=df['Comparability'],palette='hot',linewidth=3,edgecolor='white')
plt.title('Comparability')
plt.show()

In [None]:
#codes from Rodrigo Lima  @rodrigolima82
from IPython.display import Image
Image(url = 'https://media.springernature.com/lw685/springer-static/image/art%3A10.1038%2Fs41368-020-0074-x/MediaObjects/41368_2020_74_Fig1_HTML.png?as=webp',width=400,height=400)

#Bulk RNA-seq dataset analysis 

Authors: Xu, H., Zhong, L., Deng, J. et al. High expression of ACE2 receptor of 2019-nCoV on the epithelial cells of oral mucosa. Int J Oral Sci 12, 8 (2020). https://doi.org/10.1038/s41368-020-0074-x

Contributions
H.X., TW.L. and QM.C. contributed equally by conceiving and designing the study. H.X., JX.D., L.Z., JK.P., HX.D., X.Z. and TW.L. performed the tissue collection, experiments, and analyzed the data. H.X., JX.D., L.Z., TW.L. and QM.C. wrote the paper. https://www.nature.com/articles/s41368-020-0074-x#citeas

In [None]:
sns.countplot(x=df['LVH_GA'],hue=df['Selection'],palette='Oranges',linewidth=2,edgecolor='black')
plt.title('LVH GA')
plt.show()

To investigate ACE2 expression on the mucosa of the oral cavity, the authors looked at ACE2 expression in different oral sites. According to the site information provided by the TCGA, among the 32 adjacent normal tissues, 13 were located in the oral tongue, 2 in the base of the tongue and 3 in the floor of the mouth, while 14 had no definite site recorded and were simply placed in the category of oral cavity. When they combined the base of tongue, floor of mouth and oral cavity as other sites and compared them with the oral tongue, they found a clear tendency for the mean expression of ACE2 to be higher in the oral tongue (13 tissues) than in the other sites (19 tissues). https://www.nature.com/articles/s41368-020-0074-x#citeas

As far as I can say as a DDS, that's PROBABLY due to the tongue's villosity, which can absorb more because of its anatomy.
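
The tissue counts quoted above from the paper can be tallied as a quick consistency check. This is only a sketch of the arithmetic: the dictionary keys are labels chosen here for illustration, and the floor-of-mouth count is obtained by subtraction from the total of 32, which is an assumption of this sketch rather than a value read from the dataset.

```python
# Tissue counts per oral site, as quoted from the TCGA site information.
total_tissues = 32
site_counts = {
    "oral tongue": 13,
    "base of tongue": 2,
    "oral cavity (site not specified)": 14,
}

# Assumption: the floor-of-mouth count is inferred by subtraction from the total.
site_counts["floor of mouth"] = total_tissues - sum(site_counts.values())

# The comparison in the text is oral tongue versus every other site combined.
tongue = site_counts["oral tongue"]
others = total_tissues - tongue

print(site_counts["floor of mouth"])  # 3
print(tongue, others)                 # 13 19
```

The two groups (13 and 19 tissues) match the group sizes reported for the comparison of mean ACE2 expression.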

In [None]:
fig = px.bar(df, x= "Country", y= "LVH_GG", color_discrete_sequence=['crimson'],)
fig.show()

In [None]:
#codes from Rodrigo Lima  @rodrigolima82
from IPython.display import Image
Image(url = 'https://media.springernature.com/lw685/springer-static/image/art%3A10.1038%2Fs41368-020-0074-x/MediaObjects/41368_2020_74_Fig2_HTML.png?as=webp',width=400,height=400)

Xu, H., Zhong, L., Deng, J. et al. High expression of ACE2 receptor of 2019-nCoV on the epithelial cells of oral mucosa. Int J Oral Sci 12, 8 (2020). https://doi.org/10.1038/s41368-020-0074-x

https://www.nature.com/articles/s41368-020-0074-x#citeas

The above results indicated that the ACE2 could be expressed on the epithelial cells of the oral mucosa and highly enriched in tongue epithelial cells.

#Codes from rossinEndrew https://www.kaggle.com/endrewrossin/fast-initial-lightgbm-model-to-detect-exam-result/comments

In [None]:
import shap
import lightgbm as lgb
from sklearn.preprocessing import LabelEncoder
from sklearn.model_selection import KFold
import random

In [None]:
df.isnull().sum()

In [None]:
df['LVH_GG'] = df['LVH_GG'].replace(['negative','positive'], [0,1])

In [None]:
SEED = 99
random.seed(SEED)
np.random.seed(SEED)

In [None]:
dfmodel = df.copy()

# read the "object" columns and use labelEncoder to transform to numeric
for col in dfmodel.columns[dfmodel.dtypes == 'object']:
    le = LabelEncoder()
    dfmodel[col] = dfmodel[col].astype(str)
    le.fit(dfmodel[col])
    dfmodel[col] = le.transform(dfmodel[col])
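
To make the encoding step above more concrete, here is a minimal pure-Python sketch of what label encoding does: each distinct string gets an integer code, with the classes taken in sorted order. The function `label_encode` and the sample values are hypothetical and are not taken from `angiotensin.csv`; the notebook itself relies on scikit-learn's `LabelEncoder`.

```python
def label_encode(values):
    """Map each distinct string to an integer code, with classes in sorted order."""
    classes = sorted(set(values))
    mapping = {label: code for code, label in enumerate(classes)}
    return [mapping[v] for v in values], classes


# Hypothetical column values. The string 'nan' stands for a missing entry
# after astype(str), so it becomes a class of its own.
sample_column = ["PCR", "RFLP", "PCR", "nan", "RFLP"]
codes, classes = label_encode(sample_column)

print(classes)  # ['PCR', 'RFLP', 'nan']
print(codes)    # [0, 1, 0, 2, 1]
```

Note that the integer codes carry no real ordering. Tree-based models such as LightGBM can usually cope with that, but the codes should not be read as magnitudes.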

In [None]:
#change columns names to alphanumeric
dfmodel.columns = ["".join (c if c.isalnum() else "_" for c in str(x)) for x in dfmodel.columns]
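
As an illustration of the renaming step above, the same expression can be applied to a few made-up column names. The names below are hypothetical and only show the effect; the likely motivation, as far as I know, is that LightGBM rejects feature names containing certain special characters.

```python
def sanitize(name):
    """Replace every non-alphanumeric character with an underscore."""
    return "".join(c if c.isalnum() else "_" for c in str(name))


# Hypothetical column names, chosen only to show the effect of the rule.
hypothetical_columns = ["Author and year", "LVH_GG", "End-point (%)"]
cleaned = [sanitize(name) for name in hypothetical_columns]

print(cleaned)  # ['Author_and_year', 'LVH_GG', 'End_point____']
```

Names that are already alphanumeric apart from underscores, such as `LVH_GG`, pass through unchanged.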

In [None]:
X = dfmodel.drop(['LVH_GG','Author_and_year'], axis = 1)
y = dfmodel['LVH_GG']

In [None]:
lgb_params = {
                    'objective':'binary',
                    'metric':'auc',
                    'n_jobs':-1,
                    'learning_rate':0.005,
                    'num_leaves': 20,
                    'max_depth':-1,
                    'subsample':0.9,
                    'n_estimators':2500,
                    'seed': SEED,
                    'early_stopping_rounds':100, 
                }

In [None]:
# choose the number of folds, and create a variable to store the auc values and the iteration values.
K = 5
folds = KFold(K, shuffle = True, random_state = SEED)
best_scorecv= 0
best_iteration=0

# Separate data in folds, create train and validation dataframes, train the model and calculate the mean AUC.
for fold , (train_index,test_index) in enumerate(folds.split(X, y)):
    print('Fold:',fold+1)
          
    X_traincv, X_testcv = X.iloc[train_index], X.iloc[test_index]
    y_traincv, y_testcv = y.iloc[train_index], y.iloc[test_index]
    
    train_data = lgb.Dataset(X_traincv, y_traincv)
    val_data   = lgb.Dataset(X_testcv, y_testcv)
    
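    # verbose_eval was removed in LightGBM 4.0; log_evaluation (available since 3.3) logs every 250 rounds
    LGBM = lgb.train(lgb_params, train_data, valid_sets=[train_data, val_data], callbacks=[lgb.log_evaluation(period=250)])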
    best_scorecv += LGBM.best_score['valid_1']['auc']
    best_iteration += LGBM.best_iteration

best_scorecv /= K
best_iteration /= K
print('\n Mean AUC score:', best_scorecv)
print('\n Mean best iteration:', best_iteration)
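
The cross-validation loop above averages the validation AUC over the folds. As a rough sketch of what that number means, AUC can be computed as the fraction of (positive, negative) pairs in which the positive example gets the higher score. The function `pairwise_auc` and the labels and scores below are hypothetical toy values, not output from the model.

```python
def pairwise_auc(labels, scores):
    """Fraction of (positive, negative) pairs ranked correctly; ties count as half."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical (labels, predicted scores) for two validation folds.
toy_folds = [
    ([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8]),
    ([0, 1, 1, 0], [0.2, 0.9, 0.6, 0.3]),
]

fold_aucs = [pairwise_auc(y, s) for y, s in toy_folds]
mean_auc = sum(fold_aucs) / len(fold_aucs)

print(fold_aucs)  # [0.75, 1.0]
print(mean_auc)   # 0.875
```

If a validation fold contains only one class, the denominator is zero and AUC is undefined. That can happen with a small dataset and plain `KFold`.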

In [None]:
lgb_params = {
                    'objective':'binary',
                    'metric':'auc',
                    'n_jobs':-1,
                    'learning_rate':0.05,
                    'num_leaves': 20,
                    'max_depth':-1,
                    'subsample':0.9,
                    'n_estimators':round(best_iteration),
                    'seed': SEED,
                    'early_stopping_rounds':None, 
                }

# Train the final model on the full dataset, not on the last CV fold's training split
train_data_final = lgb.Dataset(X, y)
LGBM = lgb.train(lgb_params, train_data_final)

In [None]:
print(LGBM)

In [None]:
# telling which model to use
explainer = shap.TreeExplainer(LGBM)
# Calculating the Shap values of X features
shap_values = explainer.shap_values(X)
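
For intuition about what the SHAP values above represent, here is a small exact Shapley-value computation on a toy two-feature "game". This is only a sketch of the underlying idea (average marginal contribution over all feature orderings), not the algorithm `TreeExplainer` uses internally. The payoff table and the names `shapley_values` and `toy_value` are hypothetical.

```python
from itertools import permutations


def shapley_values(value, players):
    """Exact Shapley values: average marginal contribution over all orderings."""
    totals = {p: 0.0 for p in players}
    orderings = list(permutations(players))
    for order in orderings:
        coalition = set()
        for p in order:
            before = value(frozenset(coalition))
            coalition.add(p)
            totals[p] += value(frozenset(coalition)) - before
    return {p: total / len(orderings) for p, total in totals.items()}


# Hypothetical payoff for each subset of features "a" and "b".
toy_value = {
    frozenset(): 0.0,
    frozenset({"a"}): 1.0,
    frozenset({"b"}): 2.0,
    frozenset({"a", "b"}): 4.0,
}

phi = shapley_values(toy_value.__getitem__, ["a", "b"])
print(phi)  # {'a': 1.5, 'b': 2.5}
```

The contributions add up to the difference between the full payoff and the empty one (4.0 here). The same additivity property is what lets SHAP values be read as per-feature shares of a prediction.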

In [None]:
shap.summary_plot(shap_values[1], X, plot_type="bar")

In [None]:
shap.summary_plot(shap_values[1], X)

Kaggle Notebook Runner: Marília Prata   @mpwolke