#Correlation of Chest CT and RT-PCR Testing in Coronavirus Disease 2019 (COVID-19) in China: A Report of 1014 Cases

#Tao Ai*, Zhenlu Yang*, Hongyan Hou, Chenao Zhan, Chong Chen, Wenzhi Lv, Qian Tao, Ziyong Sun, Liming Xia
Tao Ai and Zhenlu Yang contributed equally to this work.

Published Online: Feb 26 2020 https://doi.org/10.1148/radiol.2020200642

Since December 2019, a number of cases of “unknown viral pneumonia” related to a local Seafood Wholesale Market were reported in Wuhan City, Hubei Province, China (1). A novel coronavirus (SARS-CoV-2) was suspected to be the etiology, with the Rhinolophus bat as the alleged origin.

In the absence of specific therapeutic drugs or vaccines for 2019 novel coronavirus disease (COVID-19), it is essential to detect the disease at an early stage and immediately isolate the infected person from the healthy population. According to the latest `guideline of Diagnosis and Treatment of Pneumonitis Caused by 2019-nCoV` (trial sixth version) published by the Chinese government, the diagnosis of COVID-19 must be confirmed by reverse transcription polymerase chain reaction (RT-PCR) or gene sequencing of respiratory or blood specimens, as the key indicator for hospitalization. However, with limitations of sample collection and transportation, and kit performance, the total positive rate of RT-PCR for throat swab samples was reported to be about 30% to 60% at initial presentation. In the current emergency, the low sensitivity of RT-PCR implies that many COVID-19 `patients may not be identified` and may not receive appropriate treatment in time; such patients constitute a risk for infecting a larger population given the highly contagious nature of the virus. `Chest CT, as a routine imaging tool for pneumonia diagnosis`, is relatively easy to perform and can produce a fast diagnosis. In this context, chest CT may provide benefit for diagnosis of COVID-19. As recently reported, chest CT demonstrates typical radiographic features in almost all COVID-19 patients, including ground-glass opacities, multifocal patchy consolidation, and/or interstitial changes with a peripheral distribution. Those typical features were also observed in patients with negative RT-PCR results but clinical symptoms. It has been noted in small-scale studies that the current RT-PCR testing has limited sensitivity, while chest CT may reveal pulmonary abnormalities consistent with COVID-19 in patients with initial negative RT-PCR results.
To better understand the diagnostic value of chest CT compared with RT-PCR testing, the authors report the results of chest CT in comparison to the initial and serial RT-PCR results in 1014 patients with suspected COVID-19.

In conclusion, chest CT imaging has high sensitivity for diagnosis of COVID-19. Our data and analysis suggest that chest CT should be considered for the COVID-19 screening, comprehensive evaluation, and following-up, especially in epidemic areas with high pre-test probability for disease.
https://pubs.rsna.org/doi/10.1148/radiol.2020200642
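
The remark about pre-test probability can be made concrete with Bayes' rule. The sketch below is only an illustration: the function name and the sensitivity, specificity, and pre-test probability values are hypothetical and are not taken from the study.

```python
def positive_predictive_value(sensitivity, specificity, pretest_probability):
    """Probability that a positive test result is a true positive (Bayes' rule)."""
    true_positive = sensitivity * pretest_probability
    false_positive = (1.0 - specificity) * (1.0 - pretest_probability)
    return true_positive / (true_positive + false_positive)


# Hypothetical test characteristics, chosen only for illustration.
SENSITIVITY = 0.9
SPECIFICITY = 0.8

# Same test, two settings: low and high pre-test probability of disease.
ppv_low = positive_predictive_value(SENSITIVITY, SPECIFICITY, 0.1)
ppv_high = positive_predictive_value(SENSITIVITY, SPECIFICITY, 0.6)

print(f"PPV at 10% pre-test probability: {ppv_low:.2f}")
print(f"PPV at 60% pre-test probability: {ppv_high:.2f}")
```

With these made-up numbers, a positive result is far more trustworthy in the high pre-test setting, which is one way to read the authors' emphasis on epidemic areas.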

In [None]:
#codes from Rodrigo Lima  @rodrigolima82
from IPython.display import Image
Image(url = 'https://encrypted-tbn0.gstatic.com/images?q=tbn%3AANd9GcTpnmOck-xZlKTvUcwp4rywVsgr34amgR_3AVCyLU3w7wRT7I-A',width=400,height=400)

youtube.com

In [None]:
# This Python 3 environment comes with many helpful analytics libraries installed
# It is defined by the kaggle/python docker image: https://github.com/kaggle/docker-python
# For example, here's several helpful packages to load in 

import numpy as np # linear algebra
import pandas as pd # data processing, CSV file I/O (e.g. pd.read_csv)
import json
import matplotlib.pyplot as plt
import seaborn as sns
import plotly.offline as py
import cv2
from sklearn import feature_extraction, linear_model, model_selection, preprocessing
import plotly.graph_objs as go
import plotly.express as px

# Input data files are available in the "../input/" directory.
# For example, running this (by clicking run or pressing Shift+Enter) will list all files under the input directory

import os
for dirname, _, filenames in os.walk('/kaggle/input'):
    for filename in filenames:
        print(os.path.join(dirname, filename))

# Any results you write to the current directory are saved as output.

In [None]:
#codes from Rodrigo Lima  @rodrigolima82
from IPython.display import Image
Image(url = 'https://www.mobihealthnews.com/sites/default/files/SPHCC%20and%20Yitu%20Healthcare%20AI%20software_Mobi.jpg',width=400,height=400)

#SPHCC and Yitu develop `AI-powered Intelligent Evaluation System` of Chest CT for COVID-19

Since the outbreak of COVID-19 virus in China, Shanghai Public Health Clinical Center (SPHCC) has been actively mobilizing other parties to use the latest tech in the fight against the virus. Through the joint effort of SPHCC and Yitu Healthcare, a Shanghai-based AI startup, the Intelligent Evaluation System of Chest CT for COVID-19 was officially launched on 28 January. 

The system utilizes intelligent diagnosis and quantitative evaluation of CT images of COVID-19 through industry-leading image algorithms, and grades the severity of various pneumonia diseases of local lesions, diffuse lesions, and whole lung involvement. 

In addition, it accurately quantifies the cumulative pneumonia load of the disease through quantitative and omics analysis of key image features such as the morphology, range, and density of the lesion. The system can also render a dynamic 4D contrast of the whole lung lesions on CT, helping in clinical judgment of the condition, evaluation of the efficacy, and prediction of the prognosis. The system can complete a quantitative analysis of lung lesions in 2-3 seconds.

Above photo: A doctor in SPHCC using the Intelligent Evaluation System of Chest CT for COVID-19 for accurate diagnosis. Credit: Business Wire

https://www.mobihealthnews.com/news/asia-pacific/sphcc-and-yitu-develop-ai-powered-intelligent-evaluation-system-chest-ct-covid-19

#Codes from Anthony https://www.kaggle.com/anthony358/cord-19-simple-parsing-to-dataframes/comments

In [None]:
with open('/kaggle/input/CORD-19-research-challenge/2020-03-13/biorxiv_medrxiv/biorxiv_medrxiv/07e833d0917cace550853f72923856d0fe1a7120.json', 'r') as f:
    test = json.load(f)
test.keys()

Define utilities for parsing the data

In [None]:
def affiliation_parsing(x: dict) -> str:
    """Parse affiliation into string."""
    current = []
    for key in ['laboratory', 'institution']:
        if x['affiliation'].get(key):  # could also use try, except
            current.append(x['affiliation'][key])
        else:
            current.append('')
    for key in ['addrLine', 'settlement', 'region', 'country', 'postCode']:
        if x['affiliation'].get('location'):
            if x['affiliation']['location'].get(key):
                current.append(x['affiliation']['location'][key])
        else:
            current.append('')
    return ', '.join(current)

extract_key = lambda x, key: [[i[key] for i in x]]
extract_func = lambda x, func: [[func(i) for i in x]]
format_authors = lambda x: f"{x['first']} {x['last']}"
format_full_authors = lambda x: f"{x['first']} {''.join(x['middle'])} {x['last']} {x['suffix']}"
format_abstract = lambda x: "{}\n{}".format(x['section'], x['text'])
all_keys = lambda x, key: [[i[key] for i in x.values()]]
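
To see what these helpers return, here is a small usage sketch. The author record is hypothetical (invented names and institution) and only mimics the shape of a CORD-19 `metadata.authors` entry; the helper definitions are copied from the cell above so that the snippet runs on its own.

```python
def affiliation_parsing(x: dict) -> str:
    """Parse affiliation into string (copied from the cell above)."""
    current = []
    for key in ['laboratory', 'institution']:
        if x['affiliation'].get(key):
            current.append(x['affiliation'][key])
        else:
            current.append('')
    for key in ['addrLine', 'settlement', 'region', 'country', 'postCode']:
        if x['affiliation'].get('location'):
            if x['affiliation']['location'].get(key):
                current.append(x['affiliation']['location'][key])
        else:
            current.append('')
    return ', '.join(current)


extract_func = lambda x, func: [[func(i) for i in x]]
format_authors = lambda x: f"{x['first']} {x['last']}"

# Hypothetical author record, for illustration only.
author = {
    'first': 'Ada', 'middle': [], 'last': 'Lovelace', 'suffix': '',
    'email': 'ada@example.org',
    'affiliation': {
        'laboratory': '',
        'institution': 'Example University',
        'location': {'settlement': 'London', 'country': 'UK'},
    },
}

names = extract_func([author], format_authors)
affiliation = affiliation_parsing(author)
print(names)        # nested list, so it fits into a single DataFrame cell
print(affiliation)  # the empty laboratory leaves a leading ', '
```

The extra level of nesting in `extract_func` is what lets a whole list of authors sit in one row of the per-paper dataframe.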

In [None]:
#codes from Rodrigo Lima  @rodrigolima82
from IPython.display import Image
Image(url = 'https://pbs.twimg.com/media/ESnTl26XkAY_RCj?format=jpg&name=small',width=400,height=400)

rebelem.com - Emergency Medicine blog. 

None of our current dx tools are great
RT-PCR assay has long turnaround time (currently 1 – 7d)
Availability (we simply don’t have enough tests to test everyone)
This (figure above) seems to be what makes the most sense

Parse all jsons into dataframes

In [None]:
for path in ['biorxiv_medrxiv', 'comm_use_subset', 'noncomm_use_subset', 'pmc_custom_license']:
    json_files = [file for file in os.listdir(f'/kaggle/input/CORD-19-research-challenge/2020-03-13/{path}/{path}') if file.endswith('.json')]
    df_list = []

    for js in json_files:
        with open(os.path.join(f'/kaggle/input/CORD-19-research-challenge/2020-03-13/{path}/{path}', js)) as json_file:
            paper = json.load(json_file)
        paper_df = pd.DataFrame({
            'paper_id': paper['paper_id'],
            'title': paper['metadata']['title'],
            'authors': extract_func(paper['metadata']['authors'], format_authors),
            'full_authors': extract_func(paper['metadata']['authors'], format_full_authors),
            'affiliations': extract_func(paper['metadata']['authors'], affiliation_parsing),
            'emails': extract_key(paper['metadata']['authors'], 'email'),
            'raw_authors': [paper['metadata']['authors']],
            'abstract': extract_func(paper['abstract'], format_abstract),
            'abstract_cite_spans': extract_key(paper['abstract'], 'cite_spans'),
            'abstract_ref_spans': extract_key(paper['abstract'], 'ref_spans'),
            'body': extract_func(paper['body_text'], format_abstract),
            'body_cite_spans': extract_key(paper['body_text'], 'cite_spans'),
            'body_ref_spans': extract_key(paper['body_text'], 'ref_spans'),
            'bib_titles': all_keys(paper['bib_entries'], 'title'),
            'raw_bib_entries': [paper['bib_entries']],
            'ref_captions': all_keys(paper['ref_entries'], 'text'),
            'raw_ref_entries': [paper['ref_entries']],
            'back_matter': [paper['back_matter']]
        })
        df_list.append(paper_df)
    temp_df = pd.concat(df_list)
    temp_df.to_csv(f'/kaggle/working/{path}.csv', index=False)

Chest CT plays an important role in the diagnosis of patients with suspected coronavirus disease-19 (COVID-19) infection. In a new study, researchers from Icahn School of Medicine at Mount Sinai, New York, and their counterparts in China, reviewed chest CTs of 121 COVID-19 patients for common CT findings in relationship to the time between symptom onset and the initial CT scan.

Their findings show that a pattern of ground-glass and consolidative pulmonary opacities, often with a bilateral and peripheral lung distribution, is emerging as the chest CT hallmark of COVID-19 infection. Below is an axial CT scan of the lung.

https://healthmanagement.org/c/imaging/news/chest-ct-scans-probe-covid-19-symptoms

In [None]:
#codes from Rodrigo Lima  @rodrigolima82
from IPython.display import Image
Image(url = 'https://healthmanagement.org/uploads/from_cloud/cw/00116374_cw_image_wi_4b727520e6ffc2fb5088f4be9576f7b9.jpg.pagespeed.ce.oZjkUhglUi.jpg',width=400,height=400)

Create stacked dataframe

In [None]:
#codes from Rodrigo Lima  @rodrigolima82
from IPython.display import Image
Image(url = 'https://encrypted-tbn0.gstatic.com/images?q=tbn%3AANd9GcS6kGQzHbJKLbb5ZLG1ezKfpmUlcfUFAlp0O-ARn9cYcyxAiN3F',width=400,height=400)

Chest CT scan - apkpure.com

In [None]:
df_list = []
for path in ['biorxiv_medrxiv', 'comm_use_subset', 'noncomm_use_subset', 'pmc_custom_license']:
    temp_df = pd.read_csv(f'/kaggle/working/{path}.csv')
    temp_df['dataset'] = path
    df_list.append(temp_df)
    
aggregate_df = pd.concat(df_list)
aggregate_df.to_csv(f'/kaggle/working/all_df.csv', index=False)
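
One thing to keep in mind when stacking from CSV: columns that held Python lists before `to_csv` (for example `authors`) come back from `read_csv` as plain strings. The sketch below shows the effect on a tiny hypothetical frame and one possible way to restore the lists with `ast.literal_eval`; it uses an in-memory buffer instead of the Kaggle paths.

```python
import ast
import io

import pandas as pd

# Tiny hypothetical frame with one list-valued cell, like the 'authors' column.
original = pd.DataFrame({'paper_id': ['abc'], 'authors': [['A B', 'C D']]})

# Round-trip through CSV using an in-memory buffer.
buffer = io.StringIO()
original.to_csv(buffer, index=False)
buffer.seek(0)
loaded = pd.read_csv(buffer)

raw_value = loaded.loc[0, 'authors']        # now the string "['A B', 'C D']"
restored = ast.literal_eval(raw_value)      # back to a real Python list

print(type(raw_value).__name__, '->', type(restored).__name__)
```

`ast.literal_eval` only evaluates Python literals, so it is a safer choice here than `eval`.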

#Seven measures that made Singapore an example in the fight against the disease:

ONE OF THE BEST HEALTH SYSTEMS IN THE WORLD 

Singapore was named by the Bloomberg Health-Efficiency Index as the 2nd most efficient healthcare system in the world. In 2000, WHO ranked the city-state's health system as the 6th best in the world.

THERMOMETERS, MANY THERMOMETERS 

Airports and tourist attractions have cameras sensitive to body heat. Schools, restaurants and residential buildings measure the temperature of all entrants.

BORDER CONTROL 

Travelers who have been in China, South Korea, Iran and, most recently, Italy, France, Germany and Spain for the past 14 days are prohibited from entering the country.

SMALL DIMENSIONS 

A city-state of only 5.6 million: the government has a very small area of activity.

NO CROWDS 

The government discourages crowds, especially religious gatherings, one of the major sources of recent contagion in Asia. Mosques are closed after an outbreak that followed a religious service in Malaysia, and Catholic churches follow the guidance to avoid crowds.
https://oglobo.globo.com/sociedade/coronavirus-sete-medidas-que-tornaram-cingapura-exemplo-na-luta-contra-doenca-1-24314038

In [None]:
#codes from Rodrigo Lima  @rodrigolima82
from IPython.display import Image
Image(url = 'https://encrypted-tbn0.gstatic.com/images?q=tbn%3AANd9GcRGKG7NakimAmL2UNPbjwMiR6c0xBzMkgg4jnrG1WmCrBt4YdhZ',width=400,height=400)

linkresearcher.com - Coronavirus: definition of cure challenges doctors who discharge patients

Infected persons discharged from hospitals are instructed to prolong isolation, and there have been cases where symptoms have reappeared.
Amid announcements of "cure", "discharge" and "recovery", however, doctors warn that these words leave room for doubt. The science of the new coronavirus is only ten weeks old, and the body of knowledge needed to state with confidence what being cured means has not yet been built.

As a rule, doctors have discharged patients when they stop showing a series of symptoms, but continue to follow them from a distance. The "discharge", technically, occurs after an extra period of observation of another 14 days, in addition to the two weeks of quarantine.

"The most important criterion for discharge is generic: it is the conviction, from the clinical evaluation, that the disease has stopped progressing, mainly in the respiratory system," explains Luis Fernando Aranha Camargo, a doctor who treats patients with Covid-19 at Hospital Israelita Albert Einstein, an institution that received 98 patients with a confirmed diagnosis of virus infection.
"When the patient stops having a fever and has no evidence of progression in the respiratory condition, it is time to go home and follow up there," he says.

The criterion for considering patients able to return to normal life is purely clinical, since laboratory tests (the PCR, which detects genetic material from the virus) are still expensive and in limited stock.

"Only patients recruited for scientific studies are being tested by PCR again," says Naime, "especially because we still don't know whether a negative PCR result effectively means the disease is cured."
One of the concerns expressed by doctors is that there are already informal reports of patients who are discharged without symptoms and then present them again.
Few specialists expect, however, that previous coronavirus infection will generate lasting immunity that protects the person from reinfection indefinitely.
https://oglobo.globo.com/sociedade/coronavirus-servico/coronavirus-definicao-de-cura-desafia-medicos-que-concedem-alta-pacientes-24304792

#Artificial Intelligence Identifies High-Risk COVID-19 Patients

March 16, 2020 - Medical Home Network, an organization serving patients in the Chicago area, is using artificial intelligence to identify individuals who have a heightened vulnerability to severe complications from COVID-19.

For Medicaid beneficiaries who face challenges such as homelessness or lack of transportation access, it can be difficult to take measures to protect against or receive treatment for COVID-19. Medical Home Network is leveraging an AI-based predictive analytics model to prioritize care management outreach to patients most at risk from the virus.
https://healthitanalytics.com/news/artificial-intelligence-identifies-high-risk-covid-19-patients


COVID-19 and the Risk to Health Care Workers: A Case Report https://annals.org/aim/fullarticle/2763329/covid-19-risk-health-care-workers-case-report

New analysis breaks down age-group risk for coronavirus — and shows millennials are not invincible https://www.statnews.com/2020/03/18/coronavirus-new-age-analysis-of-risk-confirms-young-adults-not-invincible/
That approach is interesting since data shows that seniors are the most vulnerable. The CDC (Centers for Disease Control and Prevention) data highlights that the young are not immune to getting seriously ill, with 38 percent of hospitalized patients between the ages of 20 and 54.

Risk Factors for Death From COVID-19 Identified in Wuhan Patients
Patients who did not survive hospitalization for COVID-19 in Wuhan were more likely to be older, to have comorbidities, and to have elevated D-dimer, according to the first study to examine risk factors associated with death among adults hospitalized with COVID-19. https://www.medscape.com/viewarticle/926504

Older age, high Sequential Organ Failure Assessment (SOFA) score, and blood d-dimer levels >1 μg/mL on admission are significant early stage risk factors for poor prognosis and in-hospital mortality in patients with COVID-19, according to study findings published in The Lancet.
Approximately 48% (n=91) of patients in the overall cohort had a comorbidity. The most common comorbidities in this patient population were hypertension (30%), diabetes (19%), and coronary heart disease (8%).
https://www.pulmonologyadvisor.com/home/topics/lung-infection/covid-19-risk-factors-identified-for-poor-prognosis-in-hospital-mortality/

Zhou F, Yu T, Du R, et al. Clinical course and risk factors for mortality of adult inpatients with COVID-19 in Wuhan, China: a retrospective cohort study [published online March 11, 2020]. Lancet. doi:10.1016/S0140-6736(20)30566-3


In [None]:
aggregate_df  # view the aggregated data

OTHER DEVELOPMENTS IN CHINA TO COMBAT THE SPREAD OF COVID-19

SPHCC also worked with VivaLNK, a Santa Clara, California-based connected health startup, to use the latter’s continuous temperature sensor to combat the spread of coronavirus in China, MobiHealthNews reported.

In mid-February, the Chinese government released a public app to gauge potential coronavirus exposures – the tool acts as a way to collect data as well as to educate citizens on what to do if they have been in close contact with the virus — which is to stay at home and get advice from health authorities.

Baidu, a Chinese multinational tech company, made its online doctor consultation platform free for users who want to consult a doctor about COVID-19. According to the company, as of 11 Feb the platform had handled over 4.2 million inquiries from users about COVID-19, with over 300,000 inquiries per day.
https://www.mobihealthnews.com/news/asia-pacific/sphcc-and-yitu-develop-ai-powered-intelligent-evaluation-system-chest-ct-covid-19

In [None]:
aggregate_df.dtypes

#codes from Shivam Ralli @hoshi7

In [None]:
# Necessary Functions: 
def pie_plot(labels, values, colors, title):
    fig = {
      "data": [
        {
          "values": values,
          "labels": labels,
          "domain": {"x": [0, .48]},
          "name": "Job Type",
          "sort": False,
          "marker": {'colors': colors},
          "textinfo":"percent+label+value",
          "textfont": {'color': '#FFFFFF', 'size': 10},
          "hole": .6,
          "type": "pie"
        } ],
        "layout": {
            "title":title,
            "annotations": [
                {
                    "font": {
                        "size": 25,

                    },
                    "showarrow": False,
                    "text": ""

                }
            ]
        }
    }
    return fig

In [None]:
from collections import Counter
import json
from IPython.display import HTML
import altair as alt
from  altair.vega import v5

In [None]:
##-----------------------------------------------------------
# This whole section 
vega_url = 'https://cdn.jsdelivr.net/npm/vega@' + v5.SCHEMA_VERSION
vega_lib_url = 'https://cdn.jsdelivr.net/npm/vega-lib'
vega_lite_url = 'https://cdn.jsdelivr.net/npm/vega-lite@' + alt.SCHEMA_VERSION
vega_embed_url = 'https://cdn.jsdelivr.net/npm/vega-embed@3'
noext = "?noext"

paths = {
    'vega': vega_url + noext,
    'vega-lib': vega_lib_url + noext,
    'vega-lite': vega_lite_url + noext,
    'vega-embed': vega_embed_url + noext
}

workaround = """
requirejs.config({{
    baseUrl: 'https://cdn.jsdelivr.net/npm/',
    paths: {}
}});
"""

#------------------------------------------------ Defs for future rendering
def add_autoincrement(render_func):
    # Keep track of unique <div/> IDs
    cache = {}
    def wrapped(chart, id="vega-chart", autoincrement=True):
        if autoincrement:
            if id in cache:
                counter = 1 + cache[id]
                cache[id] = counter
            else:
                cache[id] = 0
            actual_id = id if cache[id] == 0 else id + '-' + str(cache[id])
        else:
            if id not in cache:
                cache[id] = 0
            actual_id = id
        return render_func(chart, id=actual_id)
    # Cache will stay outside and 
    return wrapped

@add_autoincrement
def render(chart, id="vega-chart"):
    chart_str = """
    <div id="{id}"></div><script>
    require(["vega-embed"], function(vg_embed) {{
        const spec = {chart};     
        vg_embed("#{id}", spec, {{defaultStyle: true}}).catch(console.warn);
        console.log("anything?");
    }});
    console.log("really...anything?");
    </script>
    """
    return HTML(
        chart_str.format(
            id=id,
            chart=json.dumps(chart) if isinstance(chart, dict) else chart.to_json(indent=None)
        )
    )



HTML("".join((
    "<script>",
    workaround.format(json.dumps(paths)),
    "</script>")))
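
The `add_autoincrement` decorator is easier to follow with a stand-in renderer. In the sketch below the decorator is copied from the cell above, while `fake_render` is a hypothetical stub that simply returns the `<div/>` id it was given instead of building HTML.

```python
def add_autoincrement(render_func):
    # Keep track of unique <div/> IDs (copied from the cell above)
    cache = {}
    def wrapped(chart, id="vega-chart", autoincrement=True):
        if autoincrement:
            if id in cache:
                counter = 1 + cache[id]
                cache[id] = counter
            else:
                cache[id] = 0
            actual_id = id if cache[id] == 0 else id + '-' + str(cache[id])
        else:
            if id not in cache:
                cache[id] = 0
            actual_id = id
        return render_func(chart, id=actual_id)
    return wrapped


@add_autoincrement
def fake_render(chart, id="vega-chart"):
    # Hypothetical stub: return the id instead of an HTML object.
    return id


ids = [fake_render({}) for _ in range(3)]
fixed = fake_render({}, id="custom", autoincrement=False)
print(ids)    # each call gets a fresh id so charts do not overwrite each other
print(fixed)  # with autoincrement disabled the id is used as given
```

Because `cache` lives in the closure, the counter persists across calls for as long as the decorated function exists.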

In [None]:
def word_cloud(df, pixwidth=6000, pixheight=350, column="index", counts="count"):
    data= [dict(name="dataset", values=df.to_dict(orient="records"))]
    wordcloud = {
        "$schema": "https://vega.github.io/schema/vega/v5.json",
        "width": pixwidth,
        "height": pixheight,
        "padding": 0,
        "title": "Hover to see number of occurrences from all the sequences",
        "data": data
    }
    scale = dict(
        name="color",
        type="ordinal",
        range=["cadetblue", "royalblue", "steelblue", "navy", "teal"]
    )
    mark = {
        "type":"text",
        "from":dict(data="dataset"),
        "encode":dict(
            enter=dict(
                text=dict(field=column),
                align=dict(value="center"),
                baseline=dict(value="alphabetic"),
                fill=dict(scale="color", field=column),
                tooltip=dict(signal="datum.count + ' occurrences'")
            )
        ),
            "transform": [{
            "type": "wordcloud",
            "text": dict(field=column),
            "size": [pixwidth, pixheight],
            "font": "Helvetica Neue, Arial",
            "fontSize": dict(field="datum.{}".format(counts)),
            "fontSizeRange": [10, 60],
            "padding": 2
        }]
    }
    wordcloud["scales"] = [scale]
    wordcloud["marks"] = [mark]
    
    return wordcloud

from collections import defaultdict

def wordcloud_create(df):
    # Use the dataframe passed in rather than the global aggregate_df,
    # and skip missing affiliations so that split() never sees a NaN
    corpus = df.affiliations.dropna().astype(str).tolist()
    final = defaultdict(int)  # word -> number of occurrences
    for words in corpus:
        for word in words.split():
            final[word] += 1
    # Keep only the 200 most frequent words
    ult = dict(Counter(final).most_common(200))
    corpus = pd.Series(ult)  # Series indexed by word, holding the counts
    return render(word_cloud(corpus.to_frame(name="count").reset_index(), pixheight=600, pixwidth=900))

In [None]:
wordcloud_create(aggregate_df)
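
The counting step behind the word cloud can be tried on its own. This is a minimal sketch with hypothetical affiliation strings; `top_words` is an illustrative helper, not part of the notebook.

```python
from collections import Counter


def top_words(texts, n=3):
    """Count whitespace-separated tokens and return the n most common."""
    counts = Counter()
    for text in texts:
        if isinstance(text, str):  # skip missing values such as None or NaN
            counts.update(text.split())
    return counts.most_common(n)


# Hypothetical affiliation strings, for illustration only.
sample = ["Wuhan University, Wuhan", "Fudan University", None]
most_common = top_words(sample, n=1)
print(most_common)
```

Splitting on whitespace keeps punctuation attached, so "University," and "University" are counted as different tokens; stripping punctuation first would merge them.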

#Codes from Bulent Siyah https://www.kaggle.com/bulentsiyah/learn-opencv-by-examples-with-python

In [None]:
image = cv2.imread('/kaggle/input/medical-masks-dataset/images/000_1ov3n5_0.jpeg')
image = cv2.cvtColor(image, cv2.COLOR_BGR2RGB)

plt.figure(figsize=(20, 20))
plt.subplot(1, 2, 1)
plt.title("Original")
plt.imshow(image)

# Create our sharpening kernel; we don't normalize since
# the values in the matrix sum to 1
kernel_sharpening = np.array([[-1,-1,-1], 
                              [-1,9,-1], 
                              [-1,-1,-1]])

# applying different kernels to the input image
sharpened = cv2.filter2D(image, -1, kernel_sharpening)


plt.subplot(1, 2, 2)
plt.title("Image Sharpening")
plt.imshow(sharpened)

plt.show()
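
Why this kernel sharpens can be checked by hand on small patches. The sketch below uses NumPy only; the two 3x3 patches are made-up pixel values, and `sharpen_center` is a hypothetical helper that computes the filtered value of the centre pixel and clips it to the 8-bit range.

```python
import numpy as np

kernel_sharpening = np.array([[-1, -1, -1],
                              [-1,  9, -1],
                              [-1, -1, -1]])


def sharpen_center(patch):
    """Filtered value of the centre pixel of a 3x3 patch, clipped to 0..255."""
    value = int((patch * kernel_sharpening).sum())
    return int(np.clip(value, 0, 255))


# Made-up patches: a flat region and a vertical edge (dark left, bright right).
flat = np.full((3, 3), 100)
edge = np.array([[100, 100, 200],
                 [100, 100, 200],
                 [100, 100, 200]])

kernel_sum = int(kernel_sharpening.sum())
flat_result = sharpen_center(flat)
edge_result = sharpen_center(edge)
print(kernel_sum, flat_result, edge_result)
```

Since the weights sum to 1, flat regions keep their brightness, while a pixel on the dark side of an edge is pushed darker, which increases local contrast.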

#Health Promotion

Now that COVID-19 has been declared a global pandemic, the All of Government (AoG) response has kicked in to centralise the public information work. This means the focus is not just on health, but on all areas, e.g. border, economics and gatherings.

Looking after your mental health during the COVID-19 pandemic

For children and young people:
Have open and honest conversations
Relay the facts, in a way that is appropriate for their age and temperament. 
Listen to their questions.
Let them know that they are okay and it’s normal to feel concerned.

For older parents, grandparents or friends:
Check in on them and stay in touch.
Help them with their physical and medical needs, if they need it, (with consideration to the latest advice from Health authorities)

Planning for self-isolation:
Make sure your wider health needs are being looked after such as having enough prescription medicines available to you.

If you are in self-isolation:
Develop a routine that suits you. Think about your needs such as meals, exercise, sleep, medication and how you will organise your day and home to create your new routine. Writing it out can help.
If you are well enough, find activities to keep your mind and body stimulated, including getting creative, exercising in your home, or self-care.
Connect with your friends and family via phone, email or social media
Stay informed with good quality factual information from credible sources (see above), but also switch off and watch your favourite TV or films, or read books.
Don’t be afraid to rely on others to deliver medication, food or essential supplies to you to comply with your self-isolation requirements. Remember that you are helping others and potentially saving lives.

A huge thanks to those who have shared the initial messaging that got people washing their hands with soap and water often and using cough and sneeze etiquette. Messaging has expanded to include self-isolation, staying home if you’re sick, and being kind to others through this. Together we can slow the spread.

https://www.mhc.wa.gov.au/about-us/news-and-media/news-updates/looking-after-your-mental-health-during-the-covid-19-pandemic/


In [None]:
# Load our new image
image = cv2.imread('/kaggle/input/medical-masks-dataset/images/003_1024.jpeg', 0)

plt.figure(figsize=(30, 30))
plt.subplot(3, 2, 1)
plt.title("Original")
plt.imshow(image)

# Values at or below 127 go to 0 (black); everything above goes to 255 (white)
ret,thresh1 = cv2.threshold(image, 127, 255, cv2.THRESH_BINARY)

plt.subplot(3, 2, 2)
plt.title("Threshold Binary")
plt.imshow(thresh1)

# It's good practice to blur images as it removes noise
image = cv2.GaussianBlur(image, (3, 3), 0)

# Using adaptiveThreshold
thresh = cv2.adaptiveThreshold(image, 255, cv2.ADAPTIVE_THRESH_MEAN_C, cv2.THRESH_BINARY, 3, 5) 

plt.subplot(3, 2, 3)
plt.title("Adaptive Mean Thresholding")
plt.imshow(thresh)


_, th2 = cv2.threshold(image, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)

plt.subplot(3, 2, 4)
plt.title("Otsu's Thresholding")
plt.imshow(th2)


plt.subplot(3, 2, 5)
# Otsu's thresholding after Gaussian filtering
blur = cv2.GaussianBlur(image, (5,5), 0)
_, th3 = cv2.threshold(blur, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)
plt.title("Gaussian Otsu's Thresholding")
plt.imshow(th3)
plt.show()
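
Otsu's method picks the threshold automatically by maximising the between-class variance of the grey-level histogram. The following is a minimal NumPy sketch of that idea, not OpenCV's implementation; `otsu_threshold` and the tiny test image are hypothetical and only meant to show the principle.

```python
import numpy as np


def otsu_threshold(gray):
    """Return the grey level t that maximises between-class variance.

    Pixels with value <= t form one class, pixels > t the other.
    """
    hist = np.bincount(gray.ravel(), minlength=256).astype(np.float64)
    levels = np.arange(256)
    w0 = np.cumsum(hist)                # pixel count of the "dark" class
    w1 = hist.sum() - w0                # pixel count of the "bright" class
    sum0 = np.cumsum(hist * levels)     # intensity sum of the dark class
    sum1 = sum0[-1] - sum0              # intensity sum of the bright class
    valid = (w0 > 0) & (w1 > 0)         # both classes must be non-empty
    mu0 = sum0 / np.maximum(w0, 1)
    mu1 = sum1 / np.maximum(w1, 1)
    between = np.where(valid, w0 * w1 * (mu0 - mu1) ** 2, 0.0)
    return int(np.argmax(between))


# Made-up bimodal image: left half dark (50), right half bright (200).
img = np.full((4, 4), 50, dtype=np.uint8)
img[:, 2:] = 200

t = otsu_threshold(img)
binary = np.where(img > t, 255, 0).astype(np.uint8)
print(t)
print(binary)
```

On real photographs the histogram is rarely this clean, which is why blurring before thresholding, as in the cell above, often gives a more stable result.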

In [None]:
image = cv2.imread('/kaggle/input/medical-masks-dataset/images/002_1024.jpeg')
image = cv2.cvtColor(image, cv2.COLOR_BGR2RGB)

plt.figure(figsize=(20, 20))
plt.subplot(3, 2, 1)
plt.title("Original")
plt.imshow(image)

# Let's define our kernel size
kernel = np.ones((5,5), np.uint8)

# Now we erode
erosion = cv2.erode(image, kernel, iterations = 1)

plt.subplot(3, 2, 2)
plt.title("Erosion")
plt.imshow(erosion)

# Now we dilate
dilation = cv2.dilate(image, kernel, iterations = 1)
plt.subplot(3, 2, 3)
plt.title("Dilation")
plt.imshow(dilation)


# Opening - Good for removing noise
opening = cv2.morphologyEx(image, cv2.MORPH_OPEN, kernel)
plt.subplot(3, 2, 4)
plt.title("Opening")
plt.imshow(opening)

# Closing - Good for filling small holes inside bright regions
closing = cv2.morphologyEx(image, cv2.MORPH_CLOSE, kernel)
plt.subplot(3, 2, 5)
plt.title("Closing")
plt.imshow(closing)
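
What erosion, dilation, and opening do is easiest to see on a tiny binary image. The sketch below re-implements the operations with NumPy for a square kernel; `erode` and `dilate` are simplified, hypothetical stand-ins for the OpenCV calls (border handling in particular is an assumption), and the 7x7 image is made up.

```python
import numpy as np


def erode(img, k=3):
    """Binary erosion: a pixel stays 1 only if its whole k x k window is 1."""
    pad = k // 2
    padded = np.pad(img, pad, mode="constant", constant_values=1)
    out = np.ones_like(img)
    for dy in range(k):
        for dx in range(k):
            window = padded[dy:dy + img.shape[0], dx:dx + img.shape[1]]
            out = np.minimum(out, window)
    return out


def dilate(img, k=3):
    """Binary dilation: a pixel becomes 1 if any pixel in its window is 1."""
    pad = k // 2
    padded = np.pad(img, pad, mode="constant", constant_values=0)
    out = np.zeros_like(img)
    for dy in range(k):
        for dx in range(k):
            window = padded[dy:dy + img.shape[0], dx:dx + img.shape[1]]
            out = np.maximum(out, window)
    return out


# Made-up image: a 3x3 block of foreground plus one isolated noise pixel.
img = np.zeros((7, 7), dtype=np.uint8)
img[2:5, 2:5] = 1
img[0, 6] = 1

opened = dilate(erode(img))  # opening = erosion followed by dilation
print(opened)
```

Opening removes the isolated pixel while restoring the block to its original size; closing applies the two steps in the opposite order and fills small gaps instead.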

In [None]:
# Load an image from the medical masks dataset
image = cv2.imread('/kaggle/input/medical-masks-dataset/images/so(19).jpg')
image = cv2.cvtColor(image, cv2.COLOR_BGR2RGB)


plt.figure(figsize=(20, 20))

plt.subplot(2, 2, 1)
plt.title("Original")
plt.imshow(image)


# Grayscale
gray = cv2.cvtColor(image, cv2.COLOR_RGB2GRAY)  # image was already converted to RGB above

# Find Canny edges
edged = cv2.Canny(gray, 30, 200)

plt.subplot(2, 2, 2)
plt.title("Canny Edges")
plt.imshow(edged)

# Finding Contours
# Use a copy of your image e.g. edged.copy(), since findContours alters the image
contours, hierarchy = cv2.findContours(edged, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_NONE)

plt.subplot(2, 2, 3)
plt.title("Canny Edges After Contouring")
plt.imshow(edged)

print("Number of Contours found = " + str(len(contours)))

# Draw all contours
# Use '-1' as the 3rd parameter to draw all
cv2.drawContours(image, contours, -1, (0,255,0), 3)

plt.subplot(2, 2, 4)
plt.title("Contours")
plt.imshow(image)

#Codes from Paul Mooney https://www.kaggle.com/paultimothymooney/collections-of-paintings-from-50-artists/data

In [None]:
import numpy as np
import pandas as pd 
import cv2
from fastai.vision import *
from wordcloud import WordCloud, STOPWORDS
from collections import Counter
from nltk.corpus import stopwords
import matplotlib.pyplot as plt
import seaborn as sns
import os
import shutil
from glob import glob
%matplotlib inline
!pip freeze > '../working/dockerimage_snapshot.txt'

In [None]:
def makeWordCloud(df,column,numWords):
    topic_words = [ z.lower() for y in
                       [ x.split() for x in df[column] if isinstance(x, str)]
                       for z in y]
    word_count_dict = dict(Counter(topic_words))
    popular_words = sorted(word_count_dict, key = word_count_dict.get, reverse = True)
    popular_words_nonstop = [w for w in popular_words if w not in stopwords.words("english")]
    word_string=str(popular_words_nonstop)
    wordcloud = WordCloud(stopwords=STOPWORDS,
                          background_color='white',
                          max_words=numWords,
                          width=1000,height=1000,
                         ).generate(word_string)
    plt.clf()
    plt.imshow(wordcloud)
    plt.axis('off')
    plt.show()

def plotImages(artist,directory):
    print(artist)
    multipleImages = glob(directory)
    plt.rcParams['figure.figsize'] = (15, 15)
    plt.subplots_adjust(wspace=0, hspace=0)
    i_ = 0
    for l in multipleImages[:25]:
        im = cv2.imread(l)
        im = cv2.resize(im, (128, 128)) 
        plt.subplot(5, 5, i_+1) #.set_title(l)
        plt.imshow(cv2.cvtColor(im, cv2.COLOR_BGR2RGB)); plt.axis('off')
        i_ += 1

np.random.seed(7)

In [None]:
print(os.listdir("../input/medical-masks-dataset/images/"))

In [None]:
img_dir='../input/medical-masks-dataset/images'
path=Path(img_dir)
data = ImageDataBunch.from_folder(path, train=".", 
                                  valid_pct=0.2,
                                  ds_tfms=get_transforms(do_flip=False,flip_vert=False, max_rotate=0,max_lighting=0.3),
                                  size=299,bs=64, 
                                  num_workers=0).normalize(imagenet_stats)
print(f'Classes: \n {data.classes}')
data.show_batch(rows=8, figsize=(40,40))

In [None]:
cnt_srs = aggregate_df['dataset'].value_counts().head()
trace = go.Bar(
    y=cnt_srs.index[::-1],
    x=cnt_srs.values[::-1],
    orientation = 'h',
    marker=dict(
        color=cnt_srs.values[::-1],
        colorscale = 'Blues',
        reversescale = True
    ),
)

layout = dict(
    title='Dataset distribution',
    )
data = [trace]
fig = go.Figure(data=data, layout=layout)
py.iplot(fig, filename="dataset")

In [None]:
cnt_srs = aggregate_df['abstract'].value_counts().head()
trace = go.Bar(
    y=cnt_srs.index[::-1],
    x=cnt_srs.values[::-1],
    orientation = 'h',
    marker=dict(
        color=cnt_srs.values[::-1],
        colorscale = 'Reds',
        reversescale = True
    ),
)

layout = dict(
    title='Abstracts distribution',
    )
data = [trace]
fig = go.Figure(data=data, layout=layout)
py.iplot(fig, filename="abstract")

In [None]:
fig = px.pie( values=aggregate_df.groupby(['dataset']).size().values,names=aggregate_df.groupby(['dataset']).size().index)
fig.update_layout(
    title = "dataset",
    font=dict(
        family="Arial, monospace",
        size=15,
        color="#7f7f7f"
    )
    )   
    
py.iplot(fig)

In [None]:
fig = px.histogram(aggregate_df[aggregate_df.dataset.notna()],x="dataset",marginal="box",nbins=10)
fig.update_layout(
    title = "dataset",
    xaxis_title="dataset",
    yaxis_title="Number of papers",
    barmode="group",
    bargap=0.1,
    xaxis = dict(
        tickmode = 'linear',
        tick0 = 0,
        dtick = 10),
    font=dict(
        family="Arial, monospace",
        size=15,
        color="#7f7f7f"
    )
    )
py.iplot(fig)

In [None]:
#codes from Rodrigo Lima  @rodrigolima82
from IPython.display import Image
Image(url = 'https://encrypted-tbn0.gstatic.com/images?q=tbn%3AANd9GcQ2WW7WwEFT0lnnLSu9j4XAGpgW7N-gYJODifJ02D0j6Q3z2MrK',width=400,height=400)

In [None]:
#codes from Rodrigo Lima  @rodrigolima82
from IPython.display import Image
Image(url = 'https://encrypted-tbn0.gstatic.com/images?q=tbn%3AANd9GcT72jGRpDdXhCyNI7k28qbxadRkMeKMMC0-5wZjblpDj35loLuX',width=400,height=400)

#Health for all and all for health

#Codes from Anthony https://www.kaggle.com/anthony358/cord-19-simple-parsing-to-dataframes/comments
#codes from Shivam Ralli @hoshi7
#codes from Paul Mooney @paultimothymooney 
#codes from Helder Peixoto

Kaggle Notebook Runner: Marília Prata @mpwolke