#  Drugmakers are halting drug studies as COVID-19 cases hit hospitals hard.

<h3> Task Details </h3>

The Roche Data Science Coalition is a group of like-minded public and private organizations with a common mission and vision to bring actionable intelligence to patients, frontline healthcare providers, institutions, supply chains, and government. The tasks associated with this dataset were developed and evaluated by global frontline healthcare providers, hospitals, suppliers, and policy makers. They represent key research questions where insights developed by the Kaggle community can be most impactful in the areas of at-risk population evaluation and capacity management. - COVID19 Uncover Challenge

<h3> Overview of Topics Researched in this Notebook </h3>

This notebook uses some datasets that I created and posted publicly under the Kaggle datasets for COVID-19. It looks for data and related information on delays in clinical trials caused by the coronavirus pandemic. It wrangles multiple sources from the internet to study how the pandemic shaped research into a COVID-19 cure, how it halted ongoing clinical trials for other diseases, and what the possible impact could be. It studies the spread of cases and clinical trials, and compares them with the conditions under which drugmakers are halting drug studies as COVID-19 hits.

Special thanks to Marilia Prata for providing contents for this notebook.



# Importing the Essential Libraries

In [None]:
#Data Analysis Libraries
import pandas as pd                
import numpy as np    
from urllib.request import urlopen
import json
import glob
import os

#Importing Data plotting libraries
import matplotlib.pyplot as plt     
import plotly.express as px       
import plotly.offline as py       
import seaborn as sns             
import plotly.graph_objects as go 
from plotly.subplots import make_subplots
import matplotlib.ticker as ticker
import matplotlib.animation as animation

#Other Miscellaneous Libraries
import warnings
warnings.filterwarnings('ignore')
from IPython.display import HTML
import matplotlib.colors as mc
import colorsys
from random import randint
import re

# Datasets used in the notebook

1. We read the Novel-Corona-Virus-2019-dataset managed by SRK into this notebook. The dataset holds the cumulative case counts of COVID-19 across the world. It can be viewed and downloaded from Kaggle.

2. Coronavirus Capillary Data (Coronavirus capillary and liver tumor samples)

#  Understanding the Rise of COVID-19 

In [None]:
#Reading the cumulative cases dataset
covid_cases = pd.read_csv('../input/novel-corona-virus-2019-dataset/covid_19_data.csv')

#Viewing the dataset
covid_cases.head()

In [None]:
#Grouping the rows of each country together, keeping their successive dates in order.

country_list = covid_cases['Country/Region'].unique()

#Collecting the rows of each country and concatenating once.
#(Seeding the result with covid_cases[0:1] would duplicate the first row.)
country_frames = []

for country in country_list:
    test_data = covid_cases['Country/Region'] == country
    country_frames.append(covid_cases[test_data])

#reset_index returns a new frame, so the result has to be assigned.
country_grouped_covid = pd.concat(country_frames, axis=0).reset_index(drop=True)
country_grouped_covid.head()

#Dropping the column Last Update
country_grouped_covid.drop('Last Update', axis=1, inplace=True)

#Replacing NaN values in Province/State with the string "Not Reported"
#(assigning the result back, because an inplace replace on a selected column may not update the frame)
country_grouped_covid['Province/State'] = country_grouped_covid['Province/State'].fillna("Not Reported")

#Creating a dataset to analyze the cases country wise - As of 05/17/2020

latest_data = country_grouped_covid['ObservationDate'] == '05/17/2020'
country_data = country_grouped_covid[latest_data]

#Summing confirmed cases, recoveries and deaths across the world for each date.

unique_dates = country_grouped_covid['ObservationDate'].unique()
confirmed_cases = []
recovered = []
deaths = []

for date in unique_dates:
    date_wise = country_grouped_covid['ObservationDate'] == date  
    test_data = country_grouped_covid[date_wise]
    
    confirmed_cases.append(test_data['Confirmed'].sum())
    deaths.append(test_data['Deaths'].sum())
    recovered.append(test_data['Recovered'].sum())
    
#Converting the lists to a pandas dataframe.

country_dataset = {'Date' : unique_dates, 'Confirmed' : confirmed_cases, 'Recovered' : recovered, 'Deaths' : deaths}
country_dataset = pd.DataFrame(country_dataset)

#Plotting the Graph of Cases vs Deaths Globally.

fig = go.Figure()
fig.add_trace(go.Bar(x=country_dataset['Date'], y=country_dataset['Confirmed'], name='Confirmed Cases of COVID-19', marker_color='rgb(55, 83, 109)'))
fig.add_trace(go.Bar(x=country_dataset['Date'],y=country_dataset['Deaths'],name='Total Deaths because of COVID-19',marker_color='rgb(26, 118, 255)'))

fig.update_layout(title='Confirmed Cases and Deaths from COVID-19',xaxis_tickfont_size=14,
                  yaxis=dict(title=dict(text='Reported Numbers', font=dict(size=16)), tickfont_size=14),
    legend=dict(x=0,y=1.0,bgcolor='rgba(255, 255, 255, 0)',bordercolor='rgba(255, 255, 255, 0)'),barmode='group',bargap=0.15, bargroupgap=0.1)
fig.show()


fig = go.Figure()
fig.add_trace(go.Bar(x=country_dataset['Date'], y=country_dataset['Confirmed'], name='Confirmed Cases of COVID-19', marker_color='rgb(55, 83, 109)'))
fig.add_trace(go.Bar(x=country_dataset['Date'],y=country_dataset['Recovered'],name='Total Recoveries from COVID-19',marker_color='rgb(26, 118, 255)'))

fig.update_layout(title='Confirmed Cases and Recoveries from COVID-19',xaxis_tickfont_size=14,
                  yaxis=dict(title=dict(text='Reported Numbers', font=dict(size=16)), tickfont_size=14),
    legend=dict(x=0,y=1.0,bgcolor='rgba(255, 255, 255, 0)',bordercolor='rgba(255, 255, 255, 0)'),
    barmode='group',bargap=0.15, bargroupgap=0.1)
fig.show()
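The date-wise totals built with the loop above can also be computed in one step with `groupby`. The sketch below is only an illustration: it uses a tiny made-up frame with the same column names as `covid_19_data.csv` (the values are not real counts), and the helper name `daily_totals` is hypothetical.

```python
import pandas as pd

def daily_totals(cases: pd.DataFrame) -> pd.DataFrame:
    """Sum Confirmed, Recovered and Deaths over all regions for each date."""
    out = cases.copy()
    # Parse the MM/DD/YYYY strings so that the dates sort chronologically.
    out['ObservationDate'] = pd.to_datetime(out['ObservationDate'], format='%m/%d/%Y')
    totals = (out.groupby('ObservationDate')[['Confirmed', 'Recovered', 'Deaths']]
                 .sum()
                 .reset_index()
                 .rename(columns={'ObservationDate': 'Date'}))
    return totals

# Illustrative rows only; these are not real case counts.
sample = pd.DataFrame({
    'ObservationDate': ['01/22/2020', '01/22/2020', '01/23/2020'],
    'Country/Region': ['A', 'B', 'A'],
    'Confirmed': [10, 5, 20],
    'Recovered': [1, 0, 2],
    'Deaths': [0, 1, 1],
})
totals = daily_totals(sample)
print(totals)
```

In the notebook, `daily_totals(country_grouped_covid)` would be expected to give the same columns as `country_dataset`, with the dates parsed rather than kept as strings.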

From the graph of confirmed cases vs deaths, we observe the following trends.

1. On March 17th 2020, 56 days after the first confirmed case of COVID-19, the global count of confirmed COVID-19 cases crossed the 200k mark.
2. Within 7 days, on March 24th 2020, the global confirmed case count went beyond the 400k mark.
3. It took 3 days, from March 24th 2020 to March 27th 2020, for the global confirmed case count to reach the 600k mark.
4. The same 3-day pace continued, and on April 2nd 2020 the 1m mark of confirmed COVID-19 cases was crossed.
5. Within the next 2 days, 200k more confirmed cases were added.

The total number of recovered cases was far lower than the number of confirmed cases. As of April 6th 2020, 20.55% of all confirmed cases had recovered. The rapid rise in cases led to more testing and more research on curing COVID-19, which directly affected ongoing research and clinical trials in oncology.
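Milestone dates and the recovery share quoted above can be read off a cumulative series programmatically. The following is a minimal sketch with made-up numbers; the function names `first_crossings` and `recovery_share` are hypothetical and are not part of the original notebook.

```python
from typing import Dict, List, Optional

def first_crossings(dates: List[str], cumulative: List[int],
                    thresholds: List[int]) -> Dict[int, Optional[str]]:
    """Return, for each threshold, the first date whose cumulative count reaches it."""
    result: Dict[int, Optional[str]] = {}
    for threshold in thresholds:
        # next() yields the first matching date, or None if the threshold is never reached.
        result[threshold] = next(
            (d for d, c in zip(dates, cumulative) if c >= threshold), None)
    return result

def recovery_share(recovered: int, confirmed: int) -> float:
    """Recovered cases as a percentage of confirmed cases."""
    return 100.0 * recovered / confirmed if confirmed else 0.0

# Made-up cumulative series, for illustration only.
dates = ['d1', 'd2', 'd3', 'd4']
confirmed = [150, 210, 390, 405]
crossings = first_crossings(dates, confirmed, [200, 400, 600])
print(crossings)
print(round(recovery_share(50, 400), 2))
```

With the notebook's data, the inputs would be `list(country_dataset['Date'])` and `list(country_dataset['Confirmed'])`, with thresholds such as `200_000` and `400_000`.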

# What do we know about the Testing Process of COVID-19?

The content of this section is taken from NPR - [See here](https://www.npr.org/sections/health-shots/2020/03/28/822869504/why-it-takes-so-long-to-get-most-covid-19-test-results)

First, a sample is taken from a patient's nose or throat, using a special swab. That swab goes into a tube and is sent to a lab. Some large hospitals have on-site molecular test labs, but most samples are sent to outside laboratories for processing. More on that later.

That transit time usually runs about 24 hours, but it could be longer, depending on how far the hospital is from the processing laboratory. Once at the lab, the specimen is processed, which means lab workers extract the virus's RNA, the molecule that helps regulate genes. "That step of cleaning — the RNA extraction step — is one limiting factor," says Cathie Klapperich, vice chair of the department of biomedical engineering at Boston University. "Only the very biggest labs have automated ways of extracting RNA from a sample and doing it quickly."

After the RNA is extracted, technicians also must carefully mix special chemicals with each sample and run those combinations in a machine for analysis, a process called polymerase chain reaction, which can detect whether the sample is positive or negative for COVID.

"Typically, a PCR test takes six hours from start to finish to complete," says Kelly Wroblewski, director of infectious disease programs at the Association of Public Health Laboratories. Some labs have larger staffs and more machines, so they can process more tests at a time than others. But even for those labs, as demand grows, so does the backlog.

<h4> Capacity is expanding, but not fast enough </h4>

Initially, only a few public health labs and the federal Centers for Disease Control and Prevention processed COVID-19 tests. Problems with the first CDC test kits also led to delays. Now the CDC has a better kit, and 94 public health labs across the country do COVID-19 testing, says Wroblewski. But those labs can't possibly do all that's needed. In normal times, their main function is regular public health surveillance — detecting more common threats such as outbreaks of measles or monitoring seasonal influenza — "but not to do diagnostic testing of the magnitude that is required in this response," she says.

"A chief medical officer on the East Coast said that, up until two days ago, on average, it was taking 72 hours to get results," says Susan Van Meter, executive director of AdvaMedDx, a division of the Advanced Medical Technology Association, a device and diagnostics industry trade group. "That will get better as our member companies come on the market." Even so, supply is not keeping up with demand, Roche CEO Severin Schwan told CNBC on Monday. Roche won the first approval from the FDA for a test kit under emergency rules, and it has delivered more than 400,000 kits so far. "Demand continues to be much higher than supply," Schwan told CNBC. "So we are glad that overall capacity is increasing, but the reality is that broad-based testing is not yet possible."

<img src="https://www.statista.com/graphic/1/1028731/covid19-tests-select-countries-worldwide.jpg" alt="Statistic: Number of coronavirus (COVID-19) tests performed in the most impacted countries worldwide as of May 19, 2020* | Statista" style="width: 100%; height: auto !important; max-width:1000px;-ms-interpolation-mode: bicubic;"/>

Testing and research on COVID-19 may directly contribute to delays in turnaround times and in clinical trials for other diseases.
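To make the turnaround arithmetic concrete, the steps described above (about 24 hours of transit, about six hours for a PCR run, and a growing backlog) can be combined in a rough estimate. This is only a sketch: the linear queue model, the function name `estimated_turnaround_hours` and the backlog and capacity figures are assumptions for illustration, not numbers from the article.

```python
def estimated_turnaround_hours(backlog_samples: int,
                               daily_capacity: int,
                               transit_hours: float = 24.0,
                               pcr_hours: float = 6.0) -> float:
    """Transit time + time spent waiting behind the backlog + PCR run time."""
    if daily_capacity <= 0:
        raise ValueError("daily_capacity must be positive")
    # Assumption: the lab works through its backlog at a constant daily rate.
    queue_hours = 24.0 * backlog_samples / daily_capacity
    return transit_hours + queue_hours + pcr_hours

# Hypothetical lab: no backlog vs. a backlog worth two days of capacity.
print(estimated_turnaround_hours(0, 500))
print(estimated_turnaround_hours(1000, 500))
```

Under these assumptions, a backlog of two days' worth of samples is enough to push the estimate past the 72 hours quoted above.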



The integrity of more than 330,000 clinical trials listed on ClinicalTrials.gov remains threatened as the COVID-19 outbreak continues to spread globally. Further, as of March 26, at least 18 biotech or pharma companies have reported a disruption to a clinical trial as a result of this pandemic.

The biopharma industry’s shift in focus on developing vaccines and therapies in response to COVID-19 — along with the burden this crisis is placing on medical centres worldwide — is having the unintended consequence of potentially disrupting clinical trials for other diseases. 

<img src="https://www.statista.com/graphic/1/1106306/coronavirus-clinical-trials-worldwide.jpg" alt="Statistic: Number of coronavirus (COVID-19) studies registered worldwide as of May 19, 2020, by region* | Statista" style="width: 100%; height: auto !important; max-width:1000px;-ms-interpolation-mode: bicubic;"/>

# Potential biochemical markers to identify severe cases among COVID-19 patients

JIANLIN XIANG, JING WEN, XIAOQING YUAN, Shun Xiong, XUE ZHOU, CHANGJIN LIU, XUN MIN doi: https://doi.org/10.1101/2020.03.19.20034447

Abstract: There is a high mortality and long hospitalization period for severe cases with 2019 novel coronavirus disease (COVID-19) pneumonia. Therefore, it makes sense to search for a potential biomarker that could rapidly and effectively identify severe cases early. Clinical samples from 28 cases of COVID-19 (8 severe cases, 20 mild cases) in Zunyi District from January 29, 2020 to February 21, 2020 were collected and otherwise statistically analysed for biochemical markers. 

Serum urea, creatinine (CREA) and cystatin C (CysC) concentrations in severe COVID-19 patients were significantly higher than those in mild COVID-19 patients (P<0.001), and there were also significant differences in serum direct bilirubin (DBIL), cholinesterase (CHE) and lactate dehydrogenase (LDH) concentrations between severe and mild COVID-19 patients (P<0.05). 

Serum urea, CREA, CysC, DBIL, CHE and LDH could be used to distinguish severe COVID-19 cases from mild COVID-19 cases. In particular, serum biomarkers, including urea, CREA, CysC, which reflect glomerular filtration function, may have some significance as potential indicators for the early diagnosis of severe COVID-19 and to distinguish it from mild COVID-19. Glomerular filtration function injury in severe COVID-19 patients should also be considered by clinicians.

In [None]:
df1 = pd.read_csv("../input/corona-virus-capillary-and-liver-tumor-samples/both_clean_liver_capillary_CoV.csv")
df1.head().style.background_gradient(cmap='PuBuGn')

# Let's Wrangle the Web

The details below best describe and answer the problem statement posed for this task under the UNCOVER COVID-19 Challenge. They are taken from BioPharma Dive's website. The link is here - [Link](https://www.biopharmadive.com/news/coronavirus-clinical-trial-disruption-biotech-pharma/574609/)



Since the start of March, nearly 100 companies have reported some sort of disruption to a clinical trial as a result of the coronavirus pandemic. Such news is now common, as medical centers across the world have focused their precious resources on treating people infected with the coronavirus. But there are some signs of a cautious restart: Pfizer said recently it would resume enrollment in trials where permitted by conditions.

We broke down drugmakers into three categories, defined by market value. 

1. Small biotech indicates companies worth less than 1 billion dollars as of April 6.
2. Mid-sized biotech refers to those valued between 1 billion dollars and 10 billion dollars.
3. Large biotech or pharma means drugmakers with a market capitalization of 10 billion dollars or more. 
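The three size categories above can be expressed as a small helper. This is a sketch only: the function name `size_category` is hypothetical, and treating a company worth exactly 1 billion dollars as mid-sized is an assumption, since the text does not say which side that boundary falls on.

```python
BILLION = 1_000_000_000

def size_category(market_cap_usd: float) -> str:
    """Map a market value in US dollars to one of the three categories."""
    if market_cap_usd < BILLION:
        return 'Small biotech'
    if market_cap_usd < 10 * BILLION:
        # Assumption: exactly 1 billion dollars counts as mid-sized.
        return 'Mid-sized biotech'
    return 'Large biotech or pharma'

# Illustrative market values, not real company data.
for cap in (0.4 * BILLION, 3 * BILLION, 10 * BILLION):
    print(cap, size_category(cap))
```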

For counting trials by study phase, we only included those trials specifically mentioned by companies. This has the effect of likely undercounting trials run by major drugmakers, which more often disclosed COVID-19 effects in broad terms.

# How are the Clinical Trials affected w.r.t. companies and their size?

<img src="https://i.ibb.co/85TkxyT/Trials.png" alt="Trials" border="0">

Out of the 99 companies covered in this analysis, 54 (54.55%) were small biotech companies. We observe that small biotech companies are hit the hardest by COVID-19 and have delayed clinical trials because of the COVID-19 situation.

# Which phase of Clinical Trials is most affected by the COVID-19 Situation?
<img src="https://i.ibb.co/Wy3D5Vf/Trials1.png" alt="Trials1" border="0">

We see that Phase 1 trials are the most affected by the COVID-19 situation. This is a potential marker: because the initial phases of trials are affected, new trials and research cannot be started.

# The affected Clinical Trials for the top Biotech Companies (For Oncology Patients)

<h4> 1. Argenx </h4>Three clinical trials  
Disease: Acute myeloid leukemia and atopic dermatitis  
Drug: Cusatuzumab and LP0145  
Action: Paused enrollment. Argenx partners Janssen and LEO Pharma have paused trials of the two drugs. New enrollment will "depend on the trajectory of COVID-19 infection rates."

<h4> 2. INmune Bio </h4>Four clinical trials  
Disease: NASH, MDS, breast and ovarian cancers  
Drug: INB03, LIVNate and INKmune  
Action: Delayed start of new studies. The pandemic has delayed the start of multiple trials for INmune, including mid-stage studies for NASH and breast cancer.

<h4> 3. Ideaya Biosciences </h4>  Disease: Solid tumors with GNAQ/11 mutations  
Drug: IDE-196  
Action: Delayed enrollment. The delay concerns a Phase 2 expansion arm to the ongoing study of IDE-196. Two of the four trial sites have suspended enrollment, which may delay Ideaya's timelines.

<h4> 4. Sun BioPharma </h4> Disease: Pancreatic cancer  
Drug: SBP-101  
Action: Paused new patient enrollment. Sun stopped enrolling new patients in the study in April, but expects to ramp up again during the second quarter.

<h4> 5. Merus </h4> Disease: NRG1 fusion cancers  
Drug: zenocutuzumab  
Action: Paused new site activation and slowdown in patient enrollment. Despite the disruption, Merus still intends to report results from the eNRGy trial by the end of the year.

<h4> 6. Macrogenics </h4>Two trials  
Disease: Advanced head and neck cancer, acute myeloid leukemia  
Drug: enoblituzumab and flotetuzumab  
Action: Delayed enrollment and new study start. Macrogenics has delayed the start of a planned Phase 2 study of enoblituzumab, and will provide an update in the second half of the year. The company has stopped enrolling patients in an early-stage trial of flotetuzumab in AML.

<h4> 7. Aduro Biotech </h4> Three clinical trials  
Disease: Nephropathy, squamous cell carcinoma and bladder cancer  
Drug: BION-1301 and ADU-S100  
Action: Delayed enrollment. Aduro now plans to report data from its study of BION-1301 next year, and the start of the first human test of ADU-S100 was also pushed back. The biotech still aims to disclose results from its Phase 2 trial in squamous cell carcinoma later this year, even though the pandemic has affected the study.


<h4> 8. Novartis </h4> PARAGLIDE-HF, PARACHUTE-HF and three other trials.  
Disease: Heart failure, Sjogren's syndrome, Type 1 diabetes and cancer  
Drug: Entresto, CFZ533 and a cancer radiotherapeutic  
Action: Paused recruitment and, in one study, treatment. Several trials run by Novartis, one of the largest drugmakers worldwide, are now marked as suspended on clinicaltrials.gov, hinting at the widening impact being felt by even the biggest pharmas. Most notable is an 800-patient Phase 3 study of Novartis' heart failure medicine Entresto.

<h4> 9. Agios Pharmaceuticals </h4> Multiple trials  
Disease: Pyruvate kinase deficiency, thalassemia, leukemia, myelodysplastic syndrome, cholangiocarcinoma, low-grade glioma and lung cancer  
Drug: Tibsovo, Idhifa, mitapivat and AG-636  
Action: Delayed enrollment in some studies and paused enrollment in others. The pandemic affects a slate of ongoing and planned studies for Agios, including the ACTIVATE and ACTIVATE-T late-stage trials for the most advanced experimental drug in its pipeline, mitapivat.


<h4> 10. Johnson & Johnson </h4> Disease: Non-small cell lung cancer  
Drug: YH25448  
Action: Suspended study. The early-stage trial is meant to test a targeted therapy for EGFR-positive lung cancer patients.


<h4> 11. Calithera Biosciences </h4> Disease: Lung cancer, kidney cancer, breast cancer and cystic fibrosis  
Drug: telaglenastat and CB-280  
Action: Delayed start of two studies, paused new patient enrollment in two others. The delays impact four Calithera trials. Dose escalation has been suspended in two combination trials testing telaglenastat with different Pfizer drugs. And Calithera won't start a Phase 2 trial in lung cancer or a Phase 1 study of CB-280 in cystic fibrosis until the third quarter.

<h4> 12. NextCure </h4> Two early-stage studies  
Disease: Lung cancer  
Drug: NC318 and NC410  
Action: Postponed start of two trials. NextCure will delay starting a combination trial of its lead lung cancer treatment NC318, as well as a Phase 1 study of its second drug candidate, NC410.

<h4> 13. Boehringer Ingelheim </h4>  Disease: Diabetes, kidney, lung and eye diseases, cancer, cystic fibrosis  
Action: Suspended patient screening and enrollment. The German pharma has stopped screening and enrollment in more than four dozen studies, including trials of its lung disease drug Ofev and of experimental cancer therapies like xentuzumab. Many of those affected are early stage, testing healthy volunteers, but three are in Phase 3.

<h4> 14. MaxCyte </h4>  Disease: Ovarian cancer and mesothelioma  
Drug: MCY-M11  
Action: Delayed completion of a Phase 1 trial. Due to a deprioritization of non-COVID-19 studies at institutions, MaxCyte said "timelines may be impacted" for MCY-M11, a cell therapy in testing for ovarian cancer and a form of mesothelioma.

<h4> 15. Trillium Therapeutics </h4>  Two Phase 1 clinical trials  
Disease: Cancer  
Drug: TTI-621 and TTI-622  
Action: Slowdown or pause in new patient enrollment. Trillium is expecting enrollment in the two studies to slow or pause altogether as "many clinical sites are putting enrollment of new patients on hold." The affected drugs, TTI-621 and TTI-622, are the only two Trillium has in human testing.


<h4> 16. Fate Therapeutics </h4>  Disease: Cancer  
Action: Delayed enrollment. Fate said the timelines for many of its ongoing trials will be impacted by "potential delays or disruptions in patient enrollment and site initiation." The drugmaker didn't say which studies will likely be affected, though.

<h4> 17. Novocure </h4>  Disease: Cancer  
Action: Delayed enrollment and completion of several studies. Novocure didn't provide specifics, but said the pandemic will delay the timing of enrollment and completion of its trials "by multiple quarters." The biotech currently has six studies ongoing.

<h4> 18. BioNTech </h4>  Disease: Cancer, rare and infectious diseases  
Drug: Eight experimental candidates  
Action: Postponed start of several trials. BioNTech's COVID-19 vaccine efforts have moved to the forefront during the pandemic. But the outbreak has led to delays for several of its other programs for cancer, rare diseases, and influenza.

<h4> 19. CytomX Therapeutics </h4>  Disease: Breast cancer and melanoma  
Drug: CX-2009 and CX-072  
Action: Paused new patient enrollment and new site activation; terminated trial. CytomX intends to resume the study of breast cancer drug CX-2009 as soon as possible. But COVID-19 has led CytomX to end its combination study of CX-072 and the immunotherapy Yervoy in melanoma. It will invest resources elsewhere.


<h4> 20. Synlogic </h4>  Two clinical trials  
Disease: Phenylketonuria and cancer  
Drug: SYNB1618, SYNB1891  
Action: Delayed enrollment in two clinical trials. Synlogic doesn't expect to enroll patients in a planned Phase 2 trial of its PKU drug SYNB1618 until "it is safe for patients to enter clinical trial sites." Its Phase 1 study of cancer drug SYNB1891 is already underway, but it'll be tough to recruit new patients for the trial, which makes it less likely that Synlogic will report data this year.

<h4> 21. Moderna </h4>   Disease: Rare diseases, cancer and infectious diseases  
Drug: mRNA-3704, mRNA-3927, mRNA-1653 and mRNA-1944  
Action: Paused new patient enrollment in several clinical trials. The pandemic has affected several of Moderna's trials. Enrollment in four studies has been suspended, and other ongoing trials may be disrupted because of recruitment delays or problems getting patients their next dose on time during the outbreak.
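The entries above are free text, but they can be put into a structured form so that the kinds of disruption can be counted. The sketch below copies five of the entries listed above; the `Disruption` class and the `action_kind` helper are hypothetical names, and using the first word of the action text as a label is a simplifying assumption.

```python
from collections import Counter
from dataclasses import dataclass
from typing import List

@dataclass
class Disruption:
    company: str
    drugs: List[str]
    action: str

# A subset of the entries listed above, copied as written.
records = [
    Disruption('Ideaya Biosciences', ['IDE-196'], 'Delayed enrollment'),
    Disruption('Sun BioPharma', ['SBP-101'], 'Paused new patient enrollment'),
    Disruption('Johnson & Johnson', ['YH25448'], 'Suspended study'),
    Disruption('NextCure', ['NC318', 'NC410'], 'Postponed start of two trials'),
    Disruption('MaxCyte', ['MCY-M11'], 'Delayed completion of a Phase 1 trial'),
]

def action_kind(action: str) -> str:
    """Use the first word of the action text as a coarse label."""
    return action.split()[0].lower()

counts = Counter(action_kind(r.action) for r in records)
print(counts)
```

Entries whose action combines several steps (for example "Paused new patient enrollment and new site activation; terminated trial") would need a richer label than the first word.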


# The Conclusions

The details above are taken from BioPharma Dive's website. COVID-19 has impacted clinical trials in many important areas of oncology research by a large margin, and the companies hit the hardest are listed above.

# The next big steps
I will update this notebook with newer and more diverse data to analyze more trends.

Contact LinkedIn - https://www.linkedin.com/in/amankumar01/

Do upvote and comment if you like or wish to suggest something.