# Representations, LDA and topics in CORD-19

In this notebook, a representation of each paper is computed from its text content.
These representations are then used to model topics with Latent Dirichlet Allocation (LDA).

Finally, the most relevant papers for each topic are determined using each paper's PageRank score.

In [None]:
!pip install -q -r requirements.txt

In [None]:
from risotto.references import load_papers_from_metadata_file, build_papers_reference_graph, paper_as_markdown
from fastprogress.fastprogress import progress_bar

from pathlib import Path

import scispacy
import en_core_sci_sm
import networkx as nx
from collections import defaultdict
import numpy as np

Load the paper dataset, then rebuild the paper reference graph and compute the corresponding PageRank scores.

In [None]:
cord19_dataset_folder = "./datasets/CORD-19-research-challenge"

In [None]:
papers, _ = load_papers_from_metadata_file(cord19_dataset_folder)

In [None]:
G = build_papers_reference_graph(papers)

In [None]:
pageranks = nx.pagerank(G)
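
As a rough illustration of what a PageRank score captures, the following is a minimal power-iteration sketch in pure Python. It is not the `networkx` implementation. The function name `pagerank_sketch`, the toy edge list, and the assumption that an edge points from a citing paper to the cited paper are all hypothetical; the direction actually used by `build_papers_reference_graph` is not stated in this notebook.

```python
def pagerank_sketch(edges, damping=0.85, iterations=100):
    """Power-iteration PageRank over a directed graph given as (source, target) pairs."""
    nodes = sorted({node for edge in edges for node in edge})
    out_links = {node: [] for node in nodes}
    for src, dst in edges:
        out_links[src].append(dst)

    n = len(nodes)
    rank = {node: 1.0 / n for node in nodes}
    for _ in range(iterations):
        # Rank held by nodes without outgoing links is spread evenly over all nodes.
        dangling = sum(rank[node] for node in nodes if not out_links[node])
        new_rank = {node: (1.0 - damping) / n + damping * dangling / n for node in nodes}
        for src in nodes:
            for dst in out_links[src]:
                new_rank[dst] += damping * rank[src] / len(out_links[src])
        rank = new_rank
    return rank


# Hypothetical citation edges: A cites C, B cites C, C cites D.
toy_edges = [("A", "C"), ("B", "C"), ("C", "D")]
toy_ranks = pagerank_sketch(toy_edges)
ranked_nodes = sorted(toy_ranks, key=toy_ranks.get, reverse=True)
print(ranked_nodes)
```

In this toy graph, nodes that receive links from well-ranked nodes end up with higher scores than nodes that receive no links at all, which is the property used later to pick the most relevant papers per topic.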

## Paper representations

In order to build a representation for each paper, the following libraries will be used:

- spaCy: https://spacy.io/
- scispaCy: https://allenai.github.io/scispacy/

The `en_core_sci_sm` language model will be used. It was trained on a corpus of biomedical text and has a vocabulary of more than 100,000 words.
If a larger vocabulary is needed, other models are available.

In [None]:
# Load the biomedical language pipeline
nlp = en_core_sci_sm.load()

In [None]:
# Select a paper to showcase spacy's features
sample_paper = list(pageranks.keys())[0]
sample_text = "\n".join([ paragraph["text"] for paragraph in sample_paper._file_contents["body_text"]])
sample_text

doc = nlp(sample_text, disable=["tagger", "parser", "ner"])
doc[17].lemma_

'be'

The sample document is tokenized by the `spacy` pipeline.
A useful property of using `spacy` with a pretrained language model is that it automatically computes vector representations for documents and tokens.
Which language model architecture is used to compute those vectors is still to be determined.

A relevant aspect that influences downstream tasks is the number of out-of-vocabulary (OOV) tokens.
The following cell quickly inspects a sample paper by counting its OOV tokens.
It iterates over the tokens to detect them.

In [None]:
num_oov = 0
for token in doc:
    if token.is_oov and token.string != "\n":
        if token.string.endswith("virus"):
            print(token, "not found")
        num_oov += 1
    else:
        if token.string.endswith("virus"):
            print(token, "found")
num_oov, 100 * num_oov / len(doc)

virus found
virus found
virus found


(336, 4.801371820520148)
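
The counting logic above can be sketched without spaCy. In this minimal example a plain set lookup stands in for the `token.is_oov` flag; the helper `oov_stats`, the toy vocabulary, and the toy tokens are assumptions made for illustration only. As in the cell above, newline tokens are not counted as OOV but still contribute to the total.

```python
def oov_stats(tokens, vocabulary):
    """Return (number of OOV tokens, percentage over all tokens)."""
    num_oov = sum(
        1 for token in tokens
        if token != "\n" and token.lower() not in vocabulary
    )
    return num_oov, 100 * num_oov / len(tokens)


# Hypothetical vocabulary and token sequence.
toy_vocabulary = {"the", "virus", "replicates", "in", "cells", "."}
toy_tokens = ["The", "nidovirus", "replicates", "in", "Vero", "cells", ".", "\n"]

toy_oov = oov_stats(toy_tokens, toy_vocabulary)
print(toy_oov)
```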

In [None]:
# This cell tests the mechanisms used to remove stopwords,
# punctuation, spaces, and extract the token's lemma.
for token in doc:
    if not token.is_stop and not token.is_punct and not token.is_space:
        print(token.lemma_)

genetic
information
RNA
virus
organize
efficiently
Practically
nucleotide
genome
utilize
protein-coding
sequence
cis-acting
signal
translation
RNA
synthesis
RNA
encapsidation
genome
expression
strategy
group
positive-strand
RNA
+
RNA
virus
produce
subgenomic
sg
mRNAs
review
Miller
Koev
2000
replication
genomic
RNA
mRNA
viral
replicase
supplement
generation
sg
transcript
express
structural
auxiliary
protein
encode
downstream
replicase
gene
genome
Sg
mRNAs
+
RNA
virus
3′-co-terminal
genomic
RNA
different
mechanism
synthesis
virus
brome
mosaic
virus
initiate
sg
mRNA
synthesis
internally
full-length
minus
strand
RNA
template
Miller
et
al.
1985
exemplify
red
clover
necrotic
mosaic
virus
RCNMV
rely
premature
termination
minus
strand
synthesis
genomic
RNA
template
follow
synthesis
sg
plus
strand
truncate
minus
strand
template
Sit
et
al.
1998
Members
order
Nidovirales
include
coronavirus
arteriviruses
evolve
unique
mechanism
employ
discontinuous
RNA
synthesis
generation
extensive
set
sg
RNAs
r

defective
interfere
DI
RNAs
replicons
carry
body
TRSs
moderate
level
sg
mRNAs
produce
presence
helper
virus
system
Joo
Makino
1992
van
der
et
al.
1994
perform
body
TRS
mutagenesis
study
murine
coronavirus
MHV
Joo
Makino
systematically
mutagenized
core
MHV
body
TRS
contrast
result
find
21
body
TRS
mutant
sg
RNA
synthesis
DI
RNA
genome
abolish
support
normal
level
sg
RNA
production
possible
MHV
TRS
study
tolerant
single-nucleotide
mismatch
EAV
sg
RNA7
TRS
similar
study
van
der
et
al.
1994
observe
U
C
substitution
position
1
3
MHV
body
TRS
maintain
duplex
change
U
base
pair
U
G
base
pair
reduce
sg
RNA
level
strongly
substitution
disrupt
duplex
van
der
et
al.
1994
imply
case
EAV
leader
body
TRS
duplex
formation
factor
determine
coronavirus
sg
RNA
synthesis
limitation
DI
RNA
system
leader
TRS
mutagenized
study
body
TRS-specific
effect
distinguish
effect
level
leader
body
duplex
formation
recent
study
arterivirus
coronavirus
sg
RNA
synthesis
van
Marle
et
al.
1999a
Baric
Yount
2000
Sawicki
et

protein
sequence-specific
manner
body
TRS
well
candidate
serve
protein
recognition
site
protein
mediate
pause
nascent
strand
synthesis
and/or
nascent
strand
transfer
resemble
DNA-dependent
RNA
polymerase
termination
system
specific
DNA-binding
terminator
protein
bind
termination
sequence
Reeder
Lang
1997
function
HIV
nucleocapsid
protein
promote
minus
strand
strong-stop
DNA
transfer
Guo
et
al.
1997
EAV
replicase
component
nsp1
recently
show
possess
sg
RNA
synthesis-specific
activity
Tijms
et
al.
2001
good
candidate
regulatory
role
Residues
predict
form
zinc
finger
structure
nsp1
show
necessary
sg
RNA
synthesis
Interestingly
zinc
finger
structure
HIV
nucleocapsid
protein
facilitate
strand
transfer
Guo
et
al.
2000
Finally
note
RNA
structure
nascent
strand
influence
pause
strand
transfer
reinitiation
illustrate
fact
stable
hairpin
structure
nascent
strand
promote
termination
transcription
Escherichia
coli
RNA
polymerase
Wilson
von
Hippel
1995
Site-directed
mutagenesis
EAV
leader
body
TRSs

The output above reports 336 OOV tokens, which is about 4.8% of the document's tokens.
Note that relevant tokens, such as `coronavirus`, are included in the language model vocabulary.

## Latent Dirichlet Allocation (LDA)

The following cells will perform topic modelling experiments using the LDA technique.
The `scikit-learn` implementation of this model will be used.
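
For reference, the standard generative model behind LDA can be written as follows, with $K$ topics, $D$ documents, and $w_{dn}$ the $n$-th token of document $d$. The symbols $\alpha$ and $\eta$ are the usual names for the two Dirichlet priors; in `scikit-learn` they correspond to the `doc_topic_prior` and `topic_word_prior` parameters.

$$
\begin{aligned}
\beta_k &\sim \mathrm{Dirichlet}(\eta), & k &= 1, \dots, K \\
\theta_d &\sim \mathrm{Dirichlet}(\alpha), & d &= 1, \dots, D \\
z_{dn} &\sim \mathrm{Categorical}(\theta_d) \\
w_{dn} &\sim \mathrm{Categorical}(\beta_{z_{dn}})
\end{aligned}
$$

Here $\beta_k$ is the word distribution of topic $k$, $\theta_d$ is the topic distribution of document $d$, and $z_{dn}$ is the topic assigned to token $w_{dn}$.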

In [None]:
# First, let's process all documents texts.
def process_papers_file_contents(papers):
    texts = []
    for paper in progress_bar(papers):
        text = " \n ".join([ paragraph["text"] for paragraph in paper._file_contents["body_text"]])
        """
        NB: to speed up development, only the title and the abstract of each
        document are used for topic modelling.
        To include the body text in other experiments, modify the following
        line to include `{paper.text}`.
        """
        texts.append(f"{paper.title} \n {paper.abstract}")
    return texts

docs = process_papers_file_contents(
    papers=list(pageranks.keys()),
)


In [None]:
for doc in docs[:5]:
    print('>>', doc)

>> Sequence requirements for RNA strand transfer during nidovirus discontinuous subgenomic RNA synthesis 
 Nidovirus subgenomic mRNAs contain a leader sequence derived from the 5′ end of the genome fused to different sequences (‘bodies’) derived from the 3′ end. Their generation involves a unique mechanism of discontinuous subgenomic RNA synthesis that resembles copy-choice RNA recombination. During this process, the nascent RNA strand is transferred from one site in the template to another, during either plus or minus strand synthesis, to yield subgenomic RNA molecules. Central to this process are transcription-regulating sequences (TRSs), which are present at both template sites and ensure the fidelity of strand transfer. Here we present results of a comprehensive co-variation mutagenesis study of equine arteritis virus TRSs, demonstrating that discontinuous RNA synthesis depends not only on base pairing between sense leader TRS and antisense body TRS, but also on the primary sequenc

Vectors storing token occurrence counts will be used as document representations.
`tf-idf` vectors are deliberately not used: LDA models word counts directly, so raw term counts are the appropriate input.

In [None]:
from sklearn.feature_extraction.text import CountVectorizer

def tokenizer(sentence):
    tokens = []
    for token in nlp(sentence, disable=["tagger", "parser", "ner"]):
        # Discard numbers, stop words, punctuation, whitespace, and single-character tokens
        if not (token.like_num or token.is_stop or token.is_punct or token.is_space or len(token)==1):
            tokens.append(token.lemma_)
    return tokens

count_vectorizer = CountVectorizer(
    tokenizer=tokenizer,
    lowercase=True,
)
vectorized_docs = count_vectorizer.fit_transform(docs)

In [None]:
vectorized_docs

<44648x164664 sparse matrix of type '<class 'numpy.int64'>'
	with 3168863 stored elements in Compressed Sparse Row format>

In [None]:
count_vectorizer.vocabulary_

{'sequence': 137903,
 'requirement': 130614,
 'rna': 132466,
 'strand': 143531,
 'transfer': 151049,
 'nidovirus': 105209,
 'discontinuous': 51900,
 'subgenomic': 144172,
 'synthesis': 145952,
 'mrnas': 100208,
 'contain': 44122,
 'leader': 89237,
 'derive': 50233,
 '5′': 14413,
 'end': 57299,
 'genome': 67110,
 'fuse': 65319,
 'different': 51336,
 'body': 33356,
 '3′': 12213,
 'generation': 67003,
 'involve': 83901,
 'unique': 154307,
 'mechanism': 95218,
 'resemble': 130672,
 'copy-choice': 44580,
 'recombination': 129362,
 'process': 124103,
 'nascent': 103366,
 'site': 140083,
 'template': 147705,
 'plus': 120219,
 'minus': 98084,
 'yield': 161147,
 'molecule': 99191,
 'central': 38846,
 'transcription-regulating': 150990,
 'trss': 152146,
 'present': 123413,
 'ensure': 57825,
 'fidelity': 62774,
 'result': 131035,
 'comprehensive': 43284,
 'co-variation': 42097,
 'mutagenesis': 101415,
 'study': 143846,
 'equine': 58803,
 'arteritis': 27290,
 'virus': 157408,
 'demonstrate': 49856

A sparse matrix is built with 44,648 rows, one for each document, and 164,664 columns, one for each token.
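
To make the shape of this matrix concrete, here is a minimal pure-Python sketch of a document-term count matrix. It uses dense lists rather than a sparse format, and the helper `count_matrix` and the toy documents are hypothetical; it is only meant to show how rows map to documents and columns map to tokens, not to reproduce `CountVectorizer`.

```python
from collections import Counter


def count_matrix(tokenized_docs):
    """Build a vocabulary (token -> column index) and one row of counts per document."""
    all_tokens = sorted({token for tokens in tokenized_docs for token in tokens})
    vocabulary = {token: index for index, token in enumerate(all_tokens)}

    rows = []
    for tokens in tokenized_docs:
        row = [0] * len(vocabulary)
        for token, count in Counter(tokens).items():
            row[vocabulary[token]] = count
        rows.append(row)
    return vocabulary, rows


# Hypothetical, already tokenized and lemmatized documents.
toy_docs = [["rna", "virus", "rna"], ["virus", "protein"]]
toy_vocabulary, toy_rows = count_matrix(toy_docs)
print(toy_vocabulary)
print(toy_rows)
```

Most entries of such a matrix are zero for a real corpus, which is why a sparse representation is used in practice.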

In [None]:
from sklearn.decomposition import LatentDirichletAllocation

# Fit an LDA model to identify topics
lda = LatentDirichletAllocation(
    n_components=10,  # number of topics
    verbose=2, n_jobs=-1)
lda = lda.fit(vectorized_docs)

[Parallel(n_jobs=4)]: Using backend LokyBackend with 4 concurrent workers.
[Parallel(n_jobs=4)]: Done   2 out of   4 | elapsed:   49.8s remaining:   49.8s
[Parallel(n_jobs=4)]: Done   4 out of   4 | elapsed:   52.8s finished
[Parallel(n_jobs=4)]: Using backend LokyBackend with 4 concurrent workers.


iteration: 1 of max_iter: 10


[Parallel(n_jobs=4)]: Done   2 out of   4 | elapsed:   32.2s remaining:   32.2s
[Parallel(n_jobs=4)]: Done   4 out of   4 | elapsed:   33.8s finished
[Parallel(n_jobs=4)]: Using backend LokyBackend with 4 concurrent workers.


iteration: 2 of max_iter: 10


[Parallel(n_jobs=4)]: Done   2 out of   4 | elapsed:   26.6s remaining:   26.6s
[Parallel(n_jobs=4)]: Done   4 out of   4 | elapsed:   27.9s finished
[Parallel(n_jobs=4)]: Using backend LokyBackend with 4 concurrent workers.


iteration: 3 of max_iter: 10


[Parallel(n_jobs=4)]: Done   2 out of   4 | elapsed:   22.3s remaining:   22.3s
[Parallel(n_jobs=4)]: Done   4 out of   4 | elapsed:   23.4s finished
[Parallel(n_jobs=4)]: Using backend LokyBackend with 4 concurrent workers.


iteration: 4 of max_iter: 10


[Parallel(n_jobs=4)]: Done   2 out of   4 | elapsed:   19.6s remaining:   19.6s
[Parallel(n_jobs=4)]: Done   4 out of   4 | elapsed:   20.7s finished
[Parallel(n_jobs=4)]: Using backend LokyBackend with 4 concurrent workers.


iteration: 5 of max_iter: 10


[Parallel(n_jobs=4)]: Done   2 out of   4 | elapsed:   17.7s remaining:   17.7s
[Parallel(n_jobs=4)]: Done   4 out of   4 | elapsed:   18.8s finished
[Parallel(n_jobs=4)]: Using backend LokyBackend with 4 concurrent workers.


iteration: 6 of max_iter: 10


[Parallel(n_jobs=4)]: Done   2 out of   4 | elapsed:   16.9s remaining:   16.9s
[Parallel(n_jobs=4)]: Done   4 out of   4 | elapsed:   17.8s finished
[Parallel(n_jobs=4)]: Using backend LokyBackend with 4 concurrent workers.


iteration: 7 of max_iter: 10


[Parallel(n_jobs=4)]: Done   2 out of   4 | elapsed:   16.1s remaining:   16.1s
[Parallel(n_jobs=4)]: Done   4 out of   4 | elapsed:   17.1s finished
[Parallel(n_jobs=4)]: Using backend LokyBackend with 4 concurrent workers.


iteration: 8 of max_iter: 10


[Parallel(n_jobs=4)]: Done   2 out of   4 | elapsed:   17.7s remaining:   17.7s
[Parallel(n_jobs=4)]: Done   4 out of   4 | elapsed:   18.6s finished
[Parallel(n_jobs=4)]: Using backend LokyBackend with 4 concurrent workers.


iteration: 9 of max_iter: 10


[Parallel(n_jobs=4)]: Done   2 out of   4 | elapsed:   16.3s remaining:   16.3s
[Parallel(n_jobs=4)]: Done   4 out of   4 | elapsed:   17.0s finished
[Parallel(n_jobs=4)]: Using backend LokyBackend with 4 concurrent workers.


iteration: 10 of max_iter: 10


[Parallel(n_jobs=4)]: Done   2 out of   4 | elapsed:   12.8s remaining:   12.8s
[Parallel(n_jobs=4)]: Done   4 out of   4 | elapsed:   13.5s finished


In [None]:
lda

LatentDirichletAllocation(batch_size=128, doc_topic_prior=None,
                          evaluate_every=-1, learning_decay=0.7,
                          learning_method='batch', learning_offset=10.0,
                          max_doc_update_iter=100, max_iter=10,
                          mean_change_tol=0.001, n_components=10, n_jobs=-1,
                          perp_tol=0.1, random_state=None,
                          topic_word_prior=None, total_samples=1000000.0,
                          verbose=2)

The following cells display the most relevant tokens for each identified topic.

In [None]:
def print_topic_words(topic_model, vectorizer, num_words):
    feature_names = vectorizer.get_feature_names()
    for topic_idx, topic in enumerate(topic_model.components_):
        message = "\nTopic #%d: " % topic_idx
        message += " ".join(
            [feature_names[i] for i in topic.argsort()[:-num_words - 1:-1]])
        print(message)
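
The slice `topic.argsort()[:-num_words - 1:-1]` used above can be hard to read. The sketch below reproduces the same indexing with plain Python lists: sort the indices by ascending weight, then walk backwards from the end to take the `num_words` largest. The helper `top_word_indices`, the toy feature names, and the toy weights are made up for illustration.

```python
def top_word_indices(topic_weights, num_words):
    """Indices of the `num_words` largest weights, largest first."""
    ascending = sorted(range(len(topic_weights)), key=lambda i: topic_weights[i])
    # Same slice as in print_topic_words: start at the last index and step
    # backwards, stopping before position -num_words - 1.
    return ascending[:-num_words - 1:-1]


# Hypothetical vocabulary and topic-word weights for a single topic.
toy_feature_names = ["cell", "patient", "virus", "protein"]
toy_topic = [0.5, 0.1, 3.0, 1.2]

toy_top_words = [toy_feature_names[i] for i in top_word_indices(toy_topic, 2)]
print(toy_top_words)
```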

In [None]:
print_topic_words(lda, count_vectorizer, 20)


Topic #0: cell response mouse immune infection expression antibody induce human disease study antigen result gene cytokine increase effect role level protein

Topic #1: patient group calve treatment study day disease clinical increase level associate effect high blood control age result lung significantly serum

Topic #2: virus protein cell viral infection rna replication host coronavirus gene antiviral activity expression sars-cov human study bind membrane domain role

Topic #3: virus strain sequence sample gene assay isolate detect coronavirus detection analysis study porcine pig test result pedv disease genome infectious

Topic #4: de nan la en et des les el los las le un se chapter que infection con une del patient

Topic #5: respiratory virus patient infection influenza child viral clinical study pneumonia acute human test result case detect sample age severe associate

Topic #6: air study effect concentration temperature result method test reduce mask aerosol activity surface tr

The papers in the dataset are now classified into the previously modelled topics.

In [None]:
docs_classified = lda.transform(vectorized_docs)
docs_classified[:5]

[Parallel(n_jobs=4)]: Using backend LokyBackend with 4 concurrent workers.
[Parallel(n_jobs=4)]: Done   2 out of   4 | elapsed:   12.9s remaining:   12.9s
[Parallel(n_jobs=4)]: Done   4 out of   4 | elapsed:   13.5s finished


array([[1.49689264e-02, 8.19863592e-04, 9.44715587e-01, 3.45768806e-02,
        8.19740956e-04, 8.19741833e-04, 8.19767263e-04, 8.19944548e-04,
        8.19776267e-04, 8.19771824e-04],
       [3.33557314e-01, 1.25001908e-03, 5.17972921e-01, 1.25009814e-03,
        1.25053607e-03, 4.61212638e-02, 1.25003355e-03, 9.48476030e-02,
        1.25011700e-03, 1.25009486e-03],
       [1.09907509e-03, 5.29271005e-02, 8.23947488e-01, 1.09928554e-03,
        1.09895796e-03, 1.09928650e-03, 1.09932241e-03, 1.02005968e-01,
        1.09919643e-03, 1.45243196e-02],
       [8.92929571e-04, 8.92916485e-04, 5.86757920e-01, 8.93033253e-04,
        8.92889038e-04, 8.92955232e-04, 5.82827871e-02, 3.48708643e-01,
        8.92947900e-04, 8.92978955e-04],
       [2.49737177e-02, 8.40394684e-04, 8.77864410e-01, 9.12787598e-02,
        8.40361663e-04, 8.40461005e-04, 8.40389795e-04, 8.40540520e-04,
        8.40479144e-04, 8.40485879e-04]])
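
Each row of this array can be read as a distribution over the 10 topics for one paper. As a small check, the sketch below copies the first row from the output above, verifies that it sums to approximately 1, and picks the index of the largest value, which is what `argmax` does in the next cell. The variable names are illustrative.

```python
# First row of the `docs_classified` output shown above.
first_row = [
    1.49689264e-02, 8.19863592e-04, 9.44715587e-01, 3.45768806e-02,
    8.19740956e-04, 8.19741833e-04, 8.19767263e-04, 8.19944548e-04,
    8.19776267e-04, 8.19771824e-04,
]

row_total = sum(first_row)
# Index of the largest weight, i.e. the dominant topic for this paper.
dominant_topic = max(range(len(first_row)), key=lambda i: first_row[i])
print(round(row_total, 6), dominant_topic)
```

For this paper, almost all of the weight falls on a single topic, so assigning it to its dominant topic loses little information. Rows with flatter distributions, such as the second one, are less clear-cut.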

Finally, for each identified topic, the top 5 papers belonging to it are displayed, sorted by PageRank.

In [None]:
docs_topics = docs_classified.argmax(1)
topic_papers = defaultdict(list)
all_papers = list(pageranks.keys())
for idx, topic_id in enumerate(docs_topics):
    topic_papers[topic_id].append(all_papers[idx])
    
for topic_id, papers in sorted(topic_papers.items(), key=lambda t: t[0]):
    print(f"Topic ID #{topic_id}")
    sorted_papers = sorted(papers, reverse=True, key=lambda p: pageranks[p])
    for paper in sorted_papers[:5]:
        paper_as_markdown(paper)
    #print("\n", end="")

Topic ID #0



- **Title:** A human in vitro model system for investigating genome-wide host responses to SARS coronavirus infection
- **Authors:** Ng, Lisa FP; Hibberd, Martin L; Ooi, Eng-Eong; Tang, Kin-Fai; Neo, Soek-Ying; Tan, Jenny; Krishna Murthy, Karuturi R; Vega, Vinsensius B; Chia, Jer-Ming; Liu, Edison T; Ren, Ee-Chee
- **Publish date/time:** 2004-09-09
- **Linked references:** 0
- **Linked referenced by:** 0
- **Abstract:** BACKGROUND: The molecular basis of severe acute respiratory syndrome (SARS) coronavirus (CoV) induced pathology is still largely unclear. Many SARS patients suffer respiratory distress brought on by interstitial infiltration and frequently show peripheral blood lymphopenia and occasional leucopenia. One possible cause of this could be interstitial inflammation, following a localized host response. In this study, we therefore examine the immune response of SARS-CoV in human peripheral blood mononuclear cells (PBMCs) over the first 24 hours. METHODS: PBMCs from normal healthy donors were inoculated in vitro with SARS-CoV and the viral replication kinetics was studied by real-time quantitative assays. SARS-CoV specific gene expression changes were examined by high-density oligonucleotide array analysis. RESULTS: We observed that SARS-CoV was capable of infecting and replicating in PBMCs and the kinetics of viral replication was variable among the donors. SARS-CoV antibody binding assays indicated that SARS specific antibodies inhibited SARS-CoV viral replication. Array data showed monocyte-macrophage cell activation, coagulation pathway upregulation and cytokine production together with lung trafficking chemokines such as IL8 and IL17, possibly activated through the TLR9 signaling pathway; that mimicked clinical features of the disease. CONCLUSIONS: The identification of human blood mononuclear cells as a direct target of SARS-CoV in the model system described here provides a new insight into disease pathology and a tool for investigating the host response and mechanisms of pathogenesis.


- **Title:** The involvement of survival signaling pathways in rubella-virus induced apoptosis
- **Authors:** Cooray, Samantha; Jin, Li; Best, Jennifer M
- **Publish date/time:** 2005-01-04
- **Linked references:** 0
- **Linked referenced by:** 0
- **Abstract:** Rubella virus (RV) causes severe congenital defects when acquired during the first trimester of pregnancy. RV cytopathic effect has been shown to be due to caspase-dependent apoptosis in a number of susceptible cell lines, and it has been suggested that this apoptotic induction could be a causal factor in the development of such defects. Often the outcome of apoptotic stimuli is dependent on apoptotic, proliferative and survival signaling mechanisms in the cell. Therefore we investigated the role of phosphoinositide 3-kinase (PI3K)-Akt survival signaling and Ras-Raf-MEK-ERK proliferative signaling during RV-induced apoptosis in RK13 cells. Increasing levels of phosphorylated ERK, Akt and GSK3β were detected from 24–96 hours post-infection, concomitant with RV-induced apoptotic signals. Inhibition of PI3K-Akt signaling reduced cell viability, and increased the speed and magnitude of RV-induced apoptosis, suggesting that this pathway contributes to cell survival during RV infection. In contrast, inhibition of the Ras-Raf-MEK-ERK pathway impaired RV replication and growth and reduced RV-induced apoptosis, suggesting that the normal cellular growth is required for efficient virus production.


- **Title:** Protein secretion in Lactococcus lactis : an efficient way to increase the overall heterologous protein production
- **Authors:** Le Loir, Yves; Azevedo, Vasco; Oliveira, Sergio C; Freitas, Daniela A; Miyoshi, Anderson; Bermúdez-Humarán, Luis G; Nouaille, Sébastien; Ribeiro, Luciana A; Leclercq, Sophie; Gabriel, Jane E; Guimaraes, Valeria D; Oliveira, Maricê N; Charlier, Cathy; Gautier, Michel; Langella, Philippe
- **Publish date/time:** 2005-01-04
- **Linked references:** 0
- **Linked referenced by:** 0
- **Abstract:** Lactococcus lactis, the model lactic acid bacterium (LAB), is a food grade and well-characterized Gram positive bacterium. It is a good candidate for heterologous protein delivery in foodstuff or in the digestive tract. L. lactis can also be used as a protein producer in fermentor. Many heterologous proteins have already been produced in L. lactis but only few reports allow comparing production yields for a given protein either produced intracellularly or secreted in the medium. Here, we review several works evaluating the influence of the localization on the production yields of several heterologous proteins produced in L. lactis. The questions of size limits, conformation, and proteolysis are addressed and discussed with regard to protein yields. These data show that i) secretion is preferable to cytoplasmic production; ii) secretion enhancement (by signal peptide and propeptide optimization) results in increased production yield; iii) protein conformation rather than protein size can impair secretion and thus alter production yields; and iv) fusion of a stable protein can stabilize labile proteins. The role of intracellular proteolysis on heterologous cytoplasmic proteins and precursors is discussed. The new challenges now are the development of food grade systems and the identification and optimization of host factors affecting heterologous protein production not only in L. lactis, but also in other LAB species.


- **Title:** Expression profile of immune response genes in patients with Severe Acute Respiratory Syndrome
- **Authors:** Reghunathan, Renji; Jayapal, Manikandan; Hsu, Li-Yang; Chng, Hiok-Hee; Tai, Dessmon; Leung, Bernard P; Melendez, Alirio J
- **Publish date/time:** 2005-01-18
- **Linked references:** 0
- **Linked referenced by:** 0
- **Abstract:** BACKGROUND: Severe acute respiratory syndrome (SARS) emerged in later February 2003, as a new epidemic form of life-threatening infection caused by a novel coronavirus. However, the immune-pathogenesis of SARS is poorly understood. To understand the host response to this pathogen, we investigated the gene expression profiles of peripheral blood mononuclear cells (PBMCs) derived from SARS patients, and compared with healthy controls. RESULTS: The number of differentially expressed genes was found to be 186 under stringent filtering criteria of microarray data analysis. Several genes were highly up-regulated in patients with SARS, such as, the genes coding for Lactoferrin, S100A9 and Lipocalin 2. The real-time PCR method verified the results of the gene array analysis and showed that those genes that were up-regulated as determined by microarray analysis were also found to be comparatively up-regulated by real-time PCR analysis. CONCLUSIONS: This differential gene expression profiling of PBMCs from patients with SARS strongly suggests that the response of SARS affected patients seems to be mainly an innate inflammatory response, rather than a specific immune response against a viral infection, as we observed a complete lack of cytokine genes usually triggered during a viral infection. Our study shows for the first time how the immune system responds to the SARS infection, and opens new possibilities for designing new diagnostics and treatments for this new life-threatening disease.


- **Title:** Synergistic inhibition of human cytomegalovirus replication by interferon-alpha/beta and interferon-gamma
- **Authors:** Sainz, Bruno; LaMarca, Heather L; Garry, Robert F; Morris, Cindy A
- **Publish date/time:** 2005-02-23
- **Linked references:** 0
- **Linked referenced by:** 0
- **Abstract:** BACKGROUND: Recent studies have shown that gamma interferon (IFN-γ) synergizes with the innate IFNs (IFN-α and IFN-β) to inhibit herpes simplex virus type 1 (HSV-1) replication in vitro. To determine whether this phenomenon is shared by other herpesviruses, we investigated the effects of IFNs on human cytomegalovirus (HCMV) replication. RESULTS: We have found that as with HSV-1, IFN-γ synergizes with the innate IFNs (IFN-α/β) to potently inhibit HCMV replication in vitro. While pre-treatment of human foreskin fibroblasts (HFFs) with IFN-α, IFN-β or IFN-γ alone inhibited HCMV plaque formation by ~30 to 40-fold, treatment with IFN-α and IFN-γ or IFN-β and IFN-γ inhibited HCMV plaque formation by 163- and 662-fold, respectively. The generation of isobole plots verified that the observed inhibition of HCMV plaque formation and replication in HFFs by IFN-α/β and IFN-γ was a synergistic interaction. Additionally, real-time PCR analyses of the HCMV immediate early (IE) genes (IE1 and IE2) revealed that IE mRNA expression was profoundly decreased in cells stimulated with IFN-α/β and IFN-γ (~5-11-fold) as compared to vehicle-treated cells. Furthermore, decreased IE mRNA expression was accompanied by a decrease in IE protein expression, as demonstrated by western blotting and immunofluorescence. CONCLUSION: These findings suggest that IFN-α/β and IFN-γ synergistically inhibit HCMV replication through a mechanism that may involve the regulation of IE gene expression. We hypothesize that IFN-γ produced by activated cells of the adaptive immune response may potentially synergize with endogenous type I IFNs to inhibit HCMV dissemination in vivo.

Topic ID #1



- **Title:** Dynamic changes of serum SARS-Coronavirus IgG, pulmonary function and radiography in patients recovering from SARS after hospital discharge
- **Authors:** Xie, Lixin; Liu, Youning; Fan, Baoxing; Xiao, Yueyong; Tian, Qing; Chen, Liangan; Zhao, Hong; Chen, Weijun
- **Publish date/time:** 2005-01-08
- **Linked references:** 0
- **Linked referenced by:** 0
- **Abstract:** OBJECTIVE: The intent of this study was to examine the recovery of individuals who had been hospitalized for severe acute respiratory syndrome (SARS) in the year following their discharge from the hospital. Parameters studied included serum levels of SARS coronavirus (SARS-CoV) IgG antibody, tests of lung function, and imaging data to evaluate changes in lung fibrosis. In addition, we explored the incidence of femoral head necrosis in some of the individuals recovering from SARS. METHODS: The subjects of this study were 383 clinically diagnosed SARS patients in Beijing, China. They were tested regularly for serum levels of SARS-CoV IgG antibody and lung function and were given chest X-rays and/or high resolution computerized tomography (HRCT) examinations at the Chinese PLA General Hospital during the 12 months that followed their release from the hospital. Those individuals who were found to have lung diffusion abnormities (transfer coefficient for carbon monoxide [D(L)CO] < 80% of predicted value [pred]) received regular lung function tests and HRCT examinations in the follow-up phase in order to document the changes in their lung condition. Some patients who complained of joint pain were given magnetic resonance imaging (MRI) examinations of their femoral heads. FINDINGS: Of all the subjects, 81.2% (311 of 383 patients) tested positive for serum SARS-CoV IgG. Of those testing positive, 27.3% (85 of 311 patients) were suffering from lung diffusion abnormities (D(L)CO < 80% pred) and 21.5% (67 of 311 patients) exhibited lung fibrotic changes. In the 12 month duration of this study, all of the 40 patients with lung diffusion abnormities who were examined exhibited some improvement of lung function and fibrosis detected by radiography. Of the individuals receiving MRI examinations, 23.1% (18 of 78 patients) showed signs of femoral head necrosis. 
INTERPRETATION: The lack of sero-positive SARS-CoV in some individuals suggests that there may have been some misdiagnosed cases among the subjects included in this study. Of those testing positive, the serum levels of SARS-CoV IgG antibody decreased significantly during the 12 months after hospital discharge. Additionally, we found that the individuals who had lung fibrosis showed some spontaneous recovery. Finally, some of the subjects developed femoral head necrosis.


- **Title:** Absence of association between angiotensin converting enzyme polymorphism and development of adult respiratory distress syndrome in patients with severe acute respiratory syndrome: a case control study
- **Authors:** Chan, KC Allen; Tang, Nelson LS; Hui, David SC; Chung, Grace TY; Wu, Alan KL; Chim, Stephen SC; Chiu, Rossa WK; Lee, Nelson; Choi, KW; Sung, YM; Chan, Paul KS; Tong, YK; Lai, ST; Yu, WC; Tsang, Owen; Lo, YM Dennis
- **Publish date/time:** 2005-04-09
- **Linked references:** 0
- **Linked referenced by:** 0
- **Abstract:** BACKGROUND: It has been postulated that genetic predisposition may influence the susceptibility to SARS-coronavirus infection and disease outcomes. A recent study has suggested that the deletion allele (D allele) of the angiotensin converting enzyme (ACE) gene is associated with hypoxemia in SARS patients. Moreover, the ACE D allele has been shown to be more prevalent in patients suffering from adult respiratory distress syndrome (ARDS) in a previous study. Thus, we have investigated the association between ACE insertion/deletion (I/D) polymorphism and the progression to ARDS or requirement of intensive care in SARS patients. METHOD: One hundred and forty genetically unrelated Chinese SARS patients and 326 healthy volunteers were recruited. The ACE I/D genotypes were determined by polymerase chain reaction and agarose gel electrophoresis. RESULTS: There is no significant difference in the genotypic distributions and the allelic frequencies of the ACE I/D polymorphism between the SARS patients and the healthy control subjects. Moreover, there is also no evidence that ACE I/D polymorphism is associated with the progression to ARDS or the requirement of intensive care in the SARS patients. In multivariate logistic analysis, age is the only factor associated with the development of ARDS while age and male sex are independent factors associated with the requirement of intensive care. CONCLUSION: The ACE I/D polymorphism is not directly related to increased susceptibility to SARS-coronavirus infection and is not associated with poor outcomes after SARS-coronavirus infection.


- **Title:** Persistence of lung inflammation and lung cytokines with high-resolution CT abnormalities during recovery from SARS
- **Authors:** Wang, Chun-Hua; Liu, Chien-Ying; Wan, Yung-Liang; Chou, Chun-Liang; Huang, Kuo-Hsiung; Lin, Horng-Chyuan; Lin, Shu-Min; Lin, Tzou-Yien; Chung, Kian Fan; Kuo, Han-Pin
- **Publish date/time:** 2005-05-11
- **Linked references:** 0
- **Linked referenced by:** 0
- **Abstract:** BACKGROUND: During the acute phase of severe acute respiratory syndrome (SARS), mononuclear cells infiltration, alveolar cell desquamation and hyaline membrane formation have been described, together with dysregulation of plasma cytokine levels. Persistent high-resolution computed tomography (HRCT) abnormalities occur in SARS patients up to 40 days after recovery. METHODS: To determine further the time course of recovery of lung inflammation, we investigated the HRCT and inflammatory profiles, and coronavirus persistence in bronchoalveolar lavage fluid (BALF) of 12 patients at recovery at 60 and 90 days. RESULTS: At 60 days, compared to normal controls, SARS patients had increased cellularity of BALF with increased alveolar macrophages (AM) and CD8 cells. HRCT scores were increased and correlated with T-cell numbers and their subpopulations, and inversely with CD4/CD8 ratio. TNF-α, IL-6, IL-8, RANTES and MCP-1 levels were increased. Viral particles in AM were detected by electron microscopy in 7 of 12 SARS patients with high HRCT score. On day 90, HRCT scores improved significantly in 10 of 12 patients, with normalization of BALF cell counts in 6 of 12 patients with repeat bronchoscopy. Pulse steroid therapy and prolonged fever were two independent factors associated with delayed resolution of pneumonitis, in this non-randomized, retrospective analysis. CONCLUSION: Resolution of pneumonitis is delayed in some patients during SARS recovery and may be associated with delayed clearance of coronavirus. Complete resolution may occur by 90 days or later.


- **Title:** The effects of injection of bovine vaccine into a human digit: a case report
- **Authors:** O'Neill, Jennifer K; Richards, Simon W; Ricketts, David M; Patterson, Marc H
- **Publish date/time:** 2005-10-11
- **Linked references:** 0
- **Linked referenced by:** 0
- **Abstract:** BACKGROUND: The incidence of needlestick injuries in farmers and veterinary surgeons is significant and the consequences of such an injection can be serious. CASE PRESENTATION: We report accidental injection of bovine vaccine into the base of the little finger. This resulted in increased pressure in the flexor sheath causing signs and symptoms of ischemia. Amputation of the digit was required despite repeated surgical debridement and decompression. CONCLUSION: There have been previous reports of injection of oil-based vaccines into the human hand resulting in granulomatous inflammation or sterile abscess and causing morbidity and tissue loss. Self-injection with veterinary vaccines is an occupational hazard for farmers and veterinary surgeons. Injection of vaccine into a closed compartment such as the human finger can have serious sequelae including loss of the injected digit. These injuries are not to be underestimated. Early debridement and irrigation of the injected area with decompression is likely to give the best outcome. Frequent review is necessary after the first procedure because repeat operations may be required.


- **Title:** Pneumothorax and mortality in the mechanically ventilated SARS patients: a prospective clinical study
- **Authors:** Kao, Hsin-Kuo; Wang, Jia-Horng; Sung, Chun-Sung; Huang, Ying-Che; Lien, Te-Cheng
- **Publish date/time:** 2005-06-22
- **Linked references:** 0
- **Linked referenced by:** 0
- **Abstract:** INTRODUCTION: Pneumothorax often complicates the management of mechanically ventilated severe acute respiratory syndrome (SARS) patients in the isolation intensive care unit (ICU). We sought to determine whether pneumothoraces are induced by high ventilatory pressure or volume and if they are associated with mortality in mechanically ventilated SARS patients. METHODS: We conducted a prospective, clinical study. Forty-one mechanically ventilated SARS patients were included in our study. All SARS patients were sedated and received mechanical ventilation in the isolation ICU. RESULTS: The mechanically ventilated SARS patients were divided into two groups either with or without pneumothorax. Their demographic data, clinical characteristics, ventilatory variables such as positive end-expiratory pressure, peak inspiratory pressure, mean airway pressure, tidal volume, tidal volume per kilogram, respiratory rate and minute ventilation and the accumulated mortality rate at 30 days after mechanical ventilation were analyzed. There were no statistically significant differences in the pressures and volumes between the two groups, and the mortality was also similar between the groups. However, patients developing pneumothorax during mechanical ventilation frequently expressed higher respiratory rates on admission, and a lower PaO(2)/FiO(2) ratio and higher PaCO(2) level during hospitalization compared with those without pneumothorax. CONCLUSION: In our study, the SARS patients who suffered pneumothorax presented as more tachypnic on admission, and more pronounced hypoxemic and hypercapnic during hospitalization. These variables signaled a deterioration in respiratory function and could be indicators of developing pneumothorax during mechanical ventilation in the SARS patients. Meanwhile, meticulous respiratory therapy and monitoring were mandatory in these patients.

Topic ID #2



- **Title:** Sequence requirements for RNA strand transfer during nidovirus discontinuous subgenomic RNA synthesis
- **Authors:** Pasternak, Alexander O.; van den Born, Erwin; Spaan, Willy J.M.; Snijder, Eric J.
- **Publish date/time:** 2001-12-17
- **Linked references:** 0
- **Linked referenced by:** 0
- **Abstract:** Nidovirus subgenomic mRNAs contain a leader sequence derived from the 5′ end of the genome fused to different sequences (‘bodies’) derived from the 3′ end. Their generation involves a unique mechanism of discontinuous subgenomic RNA synthesis that resembles copy-choice RNA recombination. During this process, the nascent RNA strand is transferred from one site in the template to another, during either plus or minus strand synthesis, to yield subgenomic RNA molecules. Central to this process are transcription-regulating sequences (TRSs), which are present at both template sites and ensure the fidelity of strand transfer. Here we present results of a comprehensive co-variation mutagenesis study of equine arteritis virus TRSs, demonstrating that discontinuous RNA synthesis depends not only on base pairing between sense leader TRS and antisense body TRS, but also on the primary sequence of the body TRS. While the leader TRS merely plays a targeting role for strand transfer, the body TRS fulfils multiple functions. The sequences of mRNA leader–body junctions of TRS mutants strongly suggested that the discontinuous step occurs during minus strand synthesis.


- **Title:** Crystal structure of murine sCEACAM1a[1,4]: a coronavirus receptor in the CEA family
- **Authors:** Tan, Kemin; Zelus, Bruce D.; Meijers, Rob; Liu, Jin-huan; Bergelson, Jeffrey M.; Duke, Norma; Zhang, Rongguang; Joachimiak, Andrzej; Holmes, Kathryn V.; Wang, Jia-huai
- **Publish date/time:** 2002-05-01
- **Linked references:** 0
- **Linked referenced by:** 0
- **Abstract:** CEACAM1 is a member of the carcinoembryonic antigen (CEA) family. Isoforms of murine CEACAM1 serve as receptors for mouse hepatitis virus (MHV), a murine coronavirus. Here we report the crystal structure of soluble murine sCEACAM1a[1,4], which is composed of two Ig-like domains and has MHV neutralizing activity. Its N-terminal domain has a uniquely folded CC′ loop that encompasses key virus-binding residues. This is the first atomic structure of any member of the CEA family, and provides a prototypic architecture for functional exploration of CEA family members. We discuss the structural basis of virus receptor activities of murine CEACAM1 proteins, binding of Neisseria to human CEACAM1, and other homophilic and heterophilic interactions of CEA family members.


- **Title:** Synthesis of a novel hepatitis C virus protein by ribosomal frameshift
- **Authors:** Xu, Zhenming; Choi, Jinah; Yen, T.S.Benedict; Lu, Wen; Strohecker, Anne; Govindarajan, Sugantha; Chien, David; Selby, Mark J.; Ou, Jing‐hsiung
- **Publish date/time:** 2001-07-16
- **Linked references:** 0
- **Linked referenced by:** 0
- **Abstract:** Hepatitis C virus (HCV) is an important human pathogen that affects ∼100 million people worldwide. Its RNA genome codes for a polyprotein, which is cleaved by viral and cellular proteases to produce at least 10 mature viral protein products. We report here the discovery of a novel HCV protein synthesized by ribosomal frameshift. This protein, which we named the F protein, is synthesized from the initiation codon of the polyprotein sequence followed by ribosomal frameshift into the −2/+1 reading frame. This ribosomal frameshift requires only codons 8–14 of the core protein‐coding sequence, and the shift junction is located at or near codon 11. An F protein analog synthesized in vitro reacted with the sera of HCV patients but not with the sera of hepatitis B patients, indicating the expression of the F protein during natural HCV infection. This unexpected finding may open new avenues for the development of anti‐HCV drugs.


- **Title:** Structure of coronavirus main proteinase reveals combination of a chymotrypsin fold with an extra α-helical domain
- **Authors:** Anand, Kanchan; Palm, Gottfried J.; Mesters, Jeroen R.; Siddell, Stuart G.; Ziebuhr, John; Hilgenfeld, Rolf
- **Publish date/time:** 2002-07-01
- **Linked references:** 0
- **Linked referenced by:** 0
- **Abstract:** The key enzyme in coronavirus polyprotein processing is the viral main proteinase, M(pro), a protein with extremely low sequence similarity to other viral and cellular proteinases. Here, the crystal structure of the 33.1 kDa transmissible gastroenteritis (corona)virus M(pro) is reported. The structure was refined to 1.96 Å resolution and revealed three dimers in the asymmetric unit. The mutual arrangement of the protomers in each of the dimers suggests that M(pro) self-processing occurs in trans. The active site, comprised of Cys144 and His41, is part of a chymotrypsin-like fold that is connected by a 16 residue loop to an extra domain featuring a novel α-helical fold. Molecular modelling and mutagenesis data implicate the loop in substrate binding and elucidate S1 and S2 subsites suitable to accommodate the side chains of the P1 glutamine and P2 leucine residues of M(pro) substrates. Interactions involving the N-terminus and the α-helical domain stabilize the loop in the orientation required for trans-cleavage activity. The study illustrates that RNA viruses have evolved unprecedented variations of the classical chymotrypsin fold.


- **Title:** Discontinuous and non-discontinuous subgenomic RNA transcription in a nidovirus
- **Authors:** van Vliet, A.L.W.; Smits, S.L.; Rottier, P.J.M.; de Groot, R.J.
- **Publish date/time:** 2002-12-01
- **Linked references:** 0
- **Linked referenced by:** 0
- **Abstract:** Arteri-, corona-, toro- and roniviruses are evolutionarily related positive-strand RNA viruses, united in the order Nidovirales. The best studied nidoviruses, the corona- and arteriviruses, employ a unique transcription mechanism, which involves discontinuous RNA synthesis, a process resembling similarity-assisted copy-choice RNA recombination. During infection, multiple subgenomic (sg) mRNAs are transcribed from a mirror set of sg negative-strand RNA templates. The sg mRNAs all possess a short 5′ common leader sequence, derived from the 5′ end of the genomic RNA. The joining of the non-contiguous ‘leader’ and ‘body’ sequences presumably occurs during minus-strand synthesis. To study whether toroviruses use a similar transcription mechanism, we characterized the 5′ termini of the genome and the four sg mRNAs of Berne virus (BEV). We show that BEV mRNAs 3–5 lack a leader sequence. Surprisingly, however, RNA 2 does contain a leader, identical to the 5′-terminal 18 residues of the genome. Apparently, BEV combines discontinuous and non-discontinous RNA synthesis to produce its sg mRNAs. Our findings have important implications for the understanding of the mechanism and evolution of nidovirus transcription.

Topic ID #3



- **Title:** Relationship of SARS-CoV to other pathogenic RNA viruses explored by tetranucleotide usage profiling
- **Authors:** Yap, Yee Leng; Zhang, Xue Wu; Danchin, Antoine
- **Publish date/time:** 2003-09-20
- **Linked references:** 0
- **Linked referenced by:** 0
- **Abstract:** BACKGROUND: The exact origin of the cause of the Severe Acute Respiratory Syndrome (SARS) is still an open question. The genomic sequence relationship of SARS-CoV with 30 different single-stranded RNA (ssRNA) viruses of various families was studied using two non-standard approaches. Both approaches began with the vectorial profiling of the tetra-nucleotide usage pattern V for each virus. In approach one, a distance measure of a vector V, based on correlation coefficient was devised to construct a relationship tree by the neighbor-joining algorithm. In approach two, a multivariate factor analysis was performed to derive the embedded tetra-nucleotide usage patterns. These patterns were subsequently used to classify the selected viruses. RESULTS: Both approaches yielded relationship outcomes that are consistent with the known virus classification. They also indicated that the genome of RNA viruses from the same family conform to a specific pattern of word usage. Based on the correlation of the overall tetra-nucleotide usage patterns, the Transmissible Gastroenteritis Virus (TGV) and the Feline CoronaVirus (FCoV) are closest to SARS-CoV. Surprisingly also, the RNA viruses that do not go through a DNA stage displayed a remarkable discrimination against the CpG and UpA di-nucleotide (z = -77.31, -52.48 respectively) and selection for UpG and CpA (z = 65.79,49.99 respectively). Potential factors influencing these biases are discussed. CONCLUSION: The study of genomic word usage is a powerful method to classify RNA viruses. The congruence of the relationship outcomes with the known classification indicates that there exist phylogenetic signals in the tetra-nucleotide usage patterns, that is most prominent in the replicase open reading frames.


- **Title:** Bioinformatics analysis of SARS coronavirus genome polymorphism
- **Authors:** Pavlović-Lažetić, Gordana M; Mitić, Nenad S; Beljanski, Miloš V
- **Publish date/time:** 2004-05-25
- **Linked references:** 0
- **Linked referenced by:** 0
- **Abstract:** BACKGROUND: We have compared 38 isolates of the SARS-CoV complete genome. The main goal was twofold: first, to analyze and compare nucleotide sequences and to identify positions of single nucleotide polymorphism (SNP), insertions and deletions, and second, to group them according to sequence similarity, eventually pointing to phylogeny of SARS-CoV isolates. The comparison is based on genome polymorphism such as insertions or deletions and the number and positions of SNPs. RESULTS: The nucleotide structure of all 38 isolates is presented. Based on insertions and deletions and dissimilarity due to SNPs, the dataset of all the isolates has been qualitatively classified into three groups each having their own subgroups. These are the A-group with "regular" isolates (no insertions / deletions except for 5' and 3' ends), the B-group of isolates with "long insertions", and the C-group of isolates with "many individual" insertions and deletions. The isolate with the smallest average number of SNPs, compared to other isolates, has been identified (TWH). The density distribution of SNPs, insertions and deletions for each group or subgroup, as well as cumulatively for all the isolates is also presented, along with the gene map for TWH. Since individual SNPs may have occurred at random, positions corresponding to multiple SNPs (occurring in two or more isolates) are identified and presented. This result revises some previous results of a similar type. Amino acid changes caused by multiple SNPs are also identified (for the annotated sequences, as well as presupposed amino acid changes for non-annotated ones). Exact SNP positions for the isolates in each group or subgroup are presented. Finally, a phylogenetic tree for the SARS-CoV isolates has been produced using the CLUSTALW program, showing high compatibility with former qualitative classification. CONCLUSIONS: The comparative study of SARS-CoV isolates provides essential information for genome polymorphism, indication of strain differences and variants evolution. It may help with the development of effective treatment.


- **Title:** Mutational dynamics of the SARS coronavirus in cell culture and human populations isolated in 2003
- **Authors:** Vega, Vinsensius B; Ruan, Yijun; Liu, Jianjun; Lee, Wah Heng; Wei, Chia Lin; Se-Thoe, Su Yun; Tang, Kin Fai; Zhang, Tao; Kolatkar, Prasanna R; Ooi, Eng Eong; Ling, Ai Ee; Stanton, Lawrence W; Long, Philip M; Liu, Edison T
- **Publish date/time:** 2004-09-06
- **Linked references:** 0
- **Linked referenced by:** 0
- **Abstract:** BACKGROUND: The SARS coronavirus is the etiologic agent for the epidemic of the Severe Acute Respiratory Syndrome. The recent emergence of this new pathogen, the careful tracing of its transmission patterns, and the ability to propagate in culture allows the exploration of the mutational dynamics of the SARS-CoV in human populations. METHODS: We sequenced complete SARS-CoV genomes taken from primary human tissues (SIN3408, SIN3725V, SIN3765V), cultured isolates (SIN848, SIN846, SIN842, SIN845, SIN847, SIN849, SIN850, SIN852, SIN3408L), and five consecutive Vero cell passages (SIN2774_P1, SIN2774_P2, SIN2774_P3, SIN2774_P4, SIN2774_P5) arising from SIN2774 isolate. These represented individual patient samples, serial in vitro passages in cell culture, and paired human and cell culture isolates. Employing a refined mutation filtering scheme and constant mutation rate model, the mutation rates were estimated and the possible date of emergence was calculated. Phylogenetic analysis was used to uncover molecular relationships between the isolates. RESULTS: Close examination of whole genome sequence of 54 SARS-CoV isolates identified before 14(th) October 2003, including 22 from patients in Singapore, revealed the mutations engendered during human-to-Vero and Vero-to-human transmission as well as in multiple Vero cell passages in order to refine our analysis of human-to-human transmission. Though co-infection by different quasipecies in individual tissue samples is observed, the in vitro mutation rate of the SARS-CoV in Vero cell passage is negligible. The in vivo mutation rate, however, is consistent with estimates of other RNA viruses at approximately 5.7 × 10(-6) nucleotide substitutions per site per day (0.17 mutations per genome per day), or two mutations per human passage (adjusted R-square = 0.4014). Using the immediate Hotel M contact isolates as roots, we observed that the SARS epidemic has generated four major genetic groups that are geographically associated: two Singapore isolates, one Taiwan isolate, and one North China isolate which appears most closely related to the putative SARS-CoV isolated from a palm civet. Non-synonymous mutations are centered in non-essential ORFs especially in structural and antigenic genes such as the S and M proteins, but these mutations did not distinguish the geographical groupings. However, no non-synonymous mutations were found in the 3CLpro and the polymerase genes. CONCLUSIONS: Our results show that the SARS-CoV is well adapted to growth in culture and did not appear to undergo specific selection in human populations. We further assessed that the putative origin of the SARS epidemic was in late October 2002 which is consistent with a recent estimate using cases from China. The greater sequence divergence in the structural and antigenic proteins and consistent deletions in the 3' – most portion of the viral genome suggest that certain selection pressures are interacting with the functional nature of these validated and putative ORFs.


- **Title:** SARS Transmission Pattern in Singapore Reassessed by Viral Sequence Variation Analysis
- **Authors:** Liu, Jianjun; Lim, Siew Lan; Ruan, Yijun; Ling, Ai Ee; Ng, Lisa F. P; Drosten, Christian; Liu, Edison T; Stanton, Lawrence W; Hibberd, Martin L
- **Publish date/time:** 2005-02-22
- **Linked references:** 0
- **Linked referenced by:** 0
- **Abstract:** BACKGROUND: Epidemiological investigations of infectious disease are mainly dependent on indirect contact information and only occasionally assisted by characterization of pathogen sequence variation from clinical isolates. Direct sequence analysis of the pathogen, particularly at a population level, is generally thought to be too cumbersome, technically difficult, and expensive. We present here a novel application of mass spectrometry (MS)–based technology in characterizing viral sequence variations that overcomes these problems, and we apply it retrospectively to the severe acute respiratory syndrome (SARS) outbreak in Singapore. METHODS AND FINDINGS: The success rate of the MS-based analysis for detecting SARS coronavirus (SARS-CoV) sequence variations was determined to be 95% with 75 copies of viral RNA per reaction, which is sufficient to directly analyze both clinical and cultured samples. Analysis of 13 SARS-CoV isolates from the different stages of the Singapore outbreak identified nine sequence variations that could define the molecular relationship between them and pointed to a new, previously unidentified, primary route of introduction of SARS-CoV into the Singapore population. Our direct determination of viral sequence variation from a clinical sample also clarified an unresolved epidemiological link regarding the acquisition of SARS in a German patient. We were also able to detect heterogeneous viral sequences in primary lung tissues, suggesting a possible coevolution of quasispecies of virus within a single host. CONCLUSION: This study has further demonstrated the importance of improving clinical and epidemiological studies of pathogen transmission through the use of genetic analysis and has revealed the MS-based analysis to be a sensitive and accurate method for characterizing SARS-CoV genetic variations in clinical samples. We suggest that this approach should be used routinely during outbreaks of a wide variety of agents, in order to allow the most effective control.


- **Title:** Recombination Every Day: Abundant Recombination in a Virus during a Single Multi-Cellular Host Infection
- **Authors:** Froissart, Remy; Roze, Denis; Uzest, Marilyne; Galibert, Lionel; Blanc, Stephane; Michalakis, Yannis
- **Publish date/time:** 2005-03-01
- **Linked references:** 0
- **Linked referenced by:** 0
- **Abstract:** Viral recombination can dramatically impact evolution and epidemiology. In viruses, the recombination rate depends on the frequency of genetic exchange between different viral genomes within an infected host cell and on the frequency at which such co-infections occur. While the recombination rate has been recently evaluated in experimentally co-infected cell cultures for several viruses, direct quantification at the most biologically significant level, that of a host infection, is still lacking. This study fills this gap using the cauliflower mosaic virus as a model. We distributed four neutral markers along the viral genome, and co-inoculated host plants with marker-containing and wild-type viruses. The frequency of recombinant genomes was evaluated 21 d post-inoculation. On average, over 50% of viral genomes recovered after a single host infection were recombinants, clearly indicating that recombination is very frequent in this virus. Estimates of the recombination rate show that all regions of the genome are equally affected by this process. Assuming that ten viral replication cycles occurred during our experiment—based on data on the timing of coat protein detection—the per base and replication cycle recombination rate was on the order of 2 × 10(−5) to 4 × 10(−5). This first determination of a virus recombination rate during a single multi-cellular host infection indicates that recombination is very frequent in the everyday life of this virus.

Topic ID #4



- **Title:** SARS, Mars and chocolate bars
- **Authors:** Gannon, Frank
- **Publish date/time:** 2005-01-01
- **Linked references:** 0
- **Linked referenced by:** 0
- **Abstract:** nan


- **Title:** What does the peripheral blood tell you in SARS?
- **Authors:** OPENSHAW, P J M
- **Publish date/time:** 2004-04-01
- **Linked references:** 0
- **Linked referenced by:** 0
- **Abstract:** nan


- **Title:** Ménage à trois of bacterial and viral pulmonary pathogens delivers coup de grace to the lung
- **Authors:** HUSSELL, T; WILLIAMS, A
- **Publish date/time:** 2004-06-15
- **Linked references:** 0
- **Linked referenced by:** 0
- **Abstract:** nan


- **Title:** The Cuyahoga Is Still Burning
- **Authors:** Silbergeld, Ellen K.; Graham, Jay P.
- **Publish date/time:** 2008-04-23
- **Linked references:** 0
- **Linked referenced by:** 0
- **Abstract:** nan


- **Title:** Dengue and Relative Bradycardia
- **Authors:** Lateef, Aisha; Fisher, Dale Andrew; Tambyah, Paul Ananth
- **Publish date/time:** 2007-04-23
- **Linked references:** 0
- **Linked referenced by:** 0
- **Abstract:** nan

Topic ID #5



- **Title:** Detection of hepatitis C virus in the nasal secretions of an intranasal drug-user
- **Authors:** McMahon, James M; Simm, Malgorzata; Milano, Danielle; Clatts, Michael
- **Publish date/time:** 2004-05-07
- **Linked references:** 0
- **Linked referenced by:** 0
- **Abstract:** BACKGROUND: One controversial source of infection for hepatitis C virus (HCV) involves the sharing of contaminated implements, such as straws or spoons, used to nasally inhale cocaine and other powdered drugs. An essential precondition for this mode of transmission is the presence of HCV in the nasal secretions of intranasal drug users. METHODS: Blood and nasal secretion samples were collected from five plasma-positive chronic intranasal drug users and tested for HCV RNA using RT-PCR. RESULTS: HCV was detected in all five blood samples and in the nasal secretions of the subject with the highest serum viral load. CONCLUSIONS: This study is the first to demonstrate the presence of HCV in nasal secretions. This finding has implications for potential transmission of HCV through contact with contaminated nasal secretions.


- **Title:** A touchdown nucleic acid amplification protocol as an alternative to culture backup for immunofluorescence in the routine diagnosis of acute viral respiratory tract infections
- **Authors:** Coyle, Peter V; Ong, Grace M; O'Neill, Hugh J; McCaughey, Conall; De Ornellas, Dennis; Mitchell, Frederick; Mitchell, Suzanne J; Feeney, Susan A; Wyatt, Dorothy E; Forde, Marian; Stockton, Joanne
- **Publish date/time:** 2004-10-25
- **Linked references:** 0
- **Linked referenced by:** 0
- **Abstract:** BACKGROUND: Immunofluorescence and virus culture are the main methods used to diagnose acute respiratory virus infections. Diagnosing these infections using nucleic acid amplification presents technical challenges, one of which is facilitating the different optimal annealing temperatures needed for each virus. To overcome this problem we developed a diagnostic molecular strip which combined a generic nested touchdown protocol with in-house primer master-mixes that could recognise 12 common respiratory viruses. RESULTS: Over an 18 month period a total of 222 specimens were tested by both immunofluorescence and the molecular strip. The specimens came from 103 males (median age 3.5 y), 80 females (median age 9 y) and 5 quality assurance scheme specimens. Viruses were recovered from a number of specimen types including broncho-alveolar lavage, nasopharyngeal secretions, sputa, post-mortem lung tissue and combined throat and nasal swabs. Viral detection by IF was poor in sputa and respiratory swabs. A total of 99 viruses were detected in the study from 79 patients and 4 quality control specimens: 31 by immunofluorescence and 99 using the molecular strip. The strip consistently out-performed immunofluorescence with no loss of diagnostic specificity. CONCLUSIONS: The touchdown protocol with pre-dispensed primer master-mixes was suitable for replacing virus culture for the diagnosis of respiratory viruses which were negative by immunofluorescence. Results by immunofluorescence were available after an average of 4–12 hours while molecular strip results were available within 24 hours, considerably faster than viral culture. The combined strip and touchdown protocol proved to be a convenient and reliable method of testing for multiple viruses in a routine setting.


- **Title:** A novel pancoronavirus RT-PCR assay: frequent detection of human coronavirus NL63 in children hospitalized with respiratory tract infections in Belgium
- **Authors:** Moës, Elien; Vijgen, Leen; Keyaerts, Els; Zlateva, Kalina; Li, Sandra; Maes, Piet; Pyrc, Krzysztof; Berkhout, Ben; van der Hoek, Lia; Van Ranst, Marc
- **Publish date/time:** 2005-02-01
- **Linked references:** 0
- **Linked referenced by:** 0
- **Abstract:** BACKGROUND: Four human coronaviruses are currently known to infect the respiratory tract: human coronaviruses OC43 (HCoV-OC43) and 229E (HCoV-229E), SARS associated coronavirus (SARS-CoV) and the recently identified human coronavirus NL63 (HCoV-NL63). In this study we explored the incidence of HCoV-NL63 infection in children diagnosed with respiratory tract infections in Belgium. METHODS: Samples from children hospitalized with respiratory diseases during the winter seasons of 2003 and 2004 were evaluated for the presence of HCoV-NL63 using a optimized pancoronavirus RT-PCR assay. RESULTS: Seven HCoV-NL63 positive samples were identified, six were collected during January/February 2003 and one at the end of February 2004. CONCLUSIONS: Our results support the notation that HCoV-NL63 can cause serious respiratory symptoms in children. Sequence analysis of the S gene showed that our isolates could be classified into two subtypes corresponding to the two prototype HCoV-NL63 sequences isolated in The Netherlands in 1988 and 2003, indicating that these two subtypes may currently be cocirculating.


- **Title:** Croup Is Associated with the Novel Coronavirus NL63
- **Authors:** van der Hoek, Lia; Sure, Klaus; Ihorst, Gabriele; Stang, Alexander; Pyrc, Krzysztof; Jebbink, Maarten F; Petersen, Gudula; Forster, Johannes; Berkhout, Ben; Überla, Klaus
- **Publish date/time:** 2005-08-23
- **Linked references:** 0
- **Linked referenced by:** 0
- **Abstract:** BACKGROUND: The clinical relevance of infections with the novel human coronavirus NL63 (HCoV-NL63) has not been investigated systematically. We therefore determined its association with disease in young children with lower respiratory tract infection (LRTI). METHODS AND FINDINGS: Nine hundred forty-nine samples of nasopharyngeal secretions from children under 3 y of age with LRTIs were analysed by a quantitative HCoV-NL63-specific real-time PCR. The samples had been collected from hospitalised patients and outpatients from December 1999 to October 2001 in four different regions in Germany as part of the prospective population-based PRI.DE study and analysed for RNA from respiratory viruses. Forty-nine samples (5.2%), mainly derived from the winter season, were positive for HCoV-NL63 RNA. The viral RNA was more prevalent in samples from outpatients (7.9%) than from hospitalised patients (3.2%, p = 0.003), and co-infection with either respiratory syncytial virus or parainfluenza virus 3 was observed frequently. Samples in which only HCoV-NL63 RNA could be detected had a significantly higher viral load than samples containing additional respiratory viruses (median 2.1 × 10(6) versus 2.7 × 10(2) copies/ml, p = 0.0006). A strong association with croup was apparent: 43% of the HCoV-NL63-positive patients with high HCoV-NL63 load and absence of co-infection suffered from croup, compared to 6% in the HCoV-NL63-negative group, p < 0.0001. A significantly higher fraction (17.4%) of samples from croup patients than from non-croup patients (4.2%) contained HCoV-NL63 RNA. CONCLUSION: HCoV-NL63 infections occur frequently in young children with LRTI and show a strong association with croup, suggesting a causal relationship.


- **Title:** Time Course and Cellular Localization of SARS-CoV Nucleoprotein and RNA in Lungs from Fatal Cases of SARS
- **Authors:** Nicholls, John M; Butany, Jagdish; Poon, Leo L. M; Chan, Kwok H; Beh, Swan Lip; Poutanen, Susan; Peiris, J. S. Malik; Wong, Maria
- **Publish date/time:** 2006-01-03
- **Linked references:** 0
- **Linked referenced by:** 0
- **Abstract:** BACKGROUND: Cellular localization of severe acute respiratory syndrome coronavirus (SARS-CoV) in the lungs of patients with SARS is important in confirming the etiological association of the virus with disease as well as in understanding the pathogenesis of the disease. To our knowledge, there have been no comprehensive studies investigating viral infection at the cellular level in humans. METHODS AND FINDINGS: We collected the largest series of fatal cases of SARS with autopsy material to date by merging the pathological material from two regions involved in the 2003 worldwide SARS outbreak in Hong Kong, China, and Toronto, Canada. We developed a monoclonal antibody against the SARS-CoV nucleoprotein and used it together with in situ hybridization (ISH) to analyze the autopsy lung tissues of 32 patients with SARS from Hong Kong and Toronto. We compared the results of these assays with the pulmonary pathologies and the clinical course of illness for each patient. SARS-CoV nucleoprotein and RNA were detected by immunohistochemistry and ISH, respectively, primarily in alveolar pneumocytes and, less frequently, in macrophages. Such localization was detected in four of the seven patients who died within two weeks of illness onset, and in none of the 25 patients who died later than two weeks after symptom onset. CONCLUSIONS: The pulmonary alveolar epithelium is the chief target of SARS-CoV, with macrophages infected subsequently. Viral replication appears to be limited to the first two weeks after symptom onset, with little evidence of continued widespread replication after this period. If antiviral therapy is considered for future treatment, it should be focused on this two-week period of acute clinical disease.

Topic ID #6



- **Title:** Airborne rhinovirus detection and effect of ultraviolet irradiation on detection by a semi-nested RT-PCR assay
- **Authors:** Myatt, Theodore A; Johnston, Sebastian L; Rudnick, Stephen; Milton, Donald K
- **Publish date/time:** 2003-01-13
- **Linked references:** 0
- **Linked referenced by:** 0
- **Abstract:** BACKGROUND: Rhinovirus, the most common cause of upper respiratory tract infections, has been implicated in asthma exacerbations and possibly asthma deaths. Although the method of transmission of rhinoviruses is disputed, several studies have demonstrated that aerosol transmission is a likely method of transmission among adults. As a first step in studies of possible airborne rhinovirus transmission, we developed methods to detect aerosolized rhinovirus by extending existing technology for detecting infectious agents in nasal specimens. METHODS: We aerosolized rhinovirus in a small aerosol chamber. Experiments were conducted with decreasing concentrations of rhinovirus. To determine the effect of UV irradiation on detection of rhinoviral aerosols, we also conducted experiments in which we exposed aerosols to a UV dose of 684 mJ/m(2). Aerosols were collected on Teflon filters and rhinovirus recovered in Qiagen AVL buffer using the Qiagen QIAamp Viral RNA Kit (Qiagen Corp., Valencia, California) followed by semi-nested RT-PCR and detection by gel electrophoresis. RESULTS: We obtained positive results from filter samples that had collected at least 1.3 TCID(50) of aerosolized rhinovirus. Ultraviolet irradiation of airborne virus at doses much greater than those used in upper-room UV germicidal irradiation applications did not inhibit subsequent detection with the RT-PCR assay. CONCLUSION: The air sampling and extraction methodology developed in this study should be applicable to the detection of rhinovirus and other airborne viruses in the indoor air of offices and schools. This method, however, cannot distinguish UV inactivated virus from infectious viral particles.


- **Title:** A systematic review of the effectiveness of antimicrobial rinse-free hand sanitizers for prevention of illness-related absenteeism in elementary school children
- **Authors:** Meadows, Emily; Le Saux, Nicole
- **Publish date/time:** 2004-11-01
- **Linked references:** 0
- **Linked referenced by:** 0
- **Abstract:** BACKGROUND: Absenteeism due to communicable illness is a major problem encountered by North American elementary school children. Although handwashing is a proven infection control measure, barriers exist in the school environment, which hinder compliance to this routine. Currently, alternative hand hygiene techniques are being considered, and one such technique is the use of antimicrobial rinse-free hand sanitizers. METHODS: A systematic review was conducted to examine the effectiveness of antimicrobial rinse-free hand sanitizer interventions in the elementary school setting. MEDLINE, EMBASE, Biological Abstract, CINAHL, HealthSTAR and Cochrane Controlled Trials Register were searched for both randomized and non-randomized controlled trials. Absenteeism due to communicable illness was the primary outcome variable. RESULTS: Six eligible studies, two of which were randomized, were identified (5 published studies, 1 published abstract). The quality of reporting was low. Due to a large amount of heterogeneity and low quality of reporting, no pooled estimates were calculated. There was a significant difference reported in favor of the intervention in all 5 published studies. CONCLUSIONS: The available evidence for the effectiveness of antimicrobial rinse-free hand sanitizer in the school environment is of low quality. The results suggest that the strength of the benefit should be interpreted with caution. Given the potential to reduce student absenteeism, teacher absenteeism, school operating costs, healthcare costs and parental absenteeism, a well-designed and analyzed trial is needed to optimize this hand hygiene technique.


- **Title:** Accuracy of parents in measuring body temperature with a tympanic thermometer
- **Authors:** Robinson, Joan L; Jou, Hsing; Spady, Donald W
- **Publish date/time:** 2005-01-11
- **Linked references:** 0
- **Linked referenced by:** 0
- **Abstract:** BACKGROUND: It is now common for parents to measure tympanic temperatures in children. The objective of this study was to assess the diagnostic accuracy of these measurements. METHODS: Parents and then nurses measured the temperature of 60 children with a tympanic thermometer designed for home use (home thermometer). The reference standard was a temperature measured by a nurse with a model of tympanic thermometer commonly used in hospitals (hospital thermometer). A difference of ≥ 0.5 °C was considered clinically significant. A fever was defined as a temperature ≥ 38.5 °C. RESULTS: The mean absolute difference between the readings done by the parent and the nurse with the home thermometer was 0.44 ± 0.61 °C, and 33% of the readings differed by ≥ 0.5 °C. The mean absolute difference between the readings done by the parent with the home thermometer and the nurse with the hospital thermometer was 0.51 ± 0.63 °C, and 72 % of the readings differed by ≥ 0.5 °C. Using the home thermometer, parents detected fever with a sensitivity of 76% (95% CI 50–93%), a specificity of 95% (95% CI 84–99%), a positive predictive value of 87% (95% CI 60–98%), and a negative predictive value of 91% (95% CI 79–98 %). In comparing the readings the nurse obtained from the two different tympanic thermometers, the mean absolute difference was 0.24 ± 0.22 °C. Nurses detected fever with a sensitivity of 94% (95 % CI 71–100 %), a specificity of 88% (95% CI 75–96 %), a positive predictive value of 76% (95% CI 53–92%), and a negative predictive value of 97% (95%CI 87–100 %) using the home thermometer. The intraclass correlation coefficient for the three sets of readings was 0.80, and the consistency of readings was not affected by the body temperature. CONCLUSIONS: The readings done by parents with a tympanic thermometer designed for home use differed a clinically significant amount from the reference standard (readings done by nurses with a model of tympanic thermometer commonly used in hospitals) the majority of the time, and parents failed to detect fever about one-quarter of the time. Tympanic readings reported by parents should be interpreted with great caution.


- **Title:** A new paradigm in respiratory hygiene: increasing the cohesivity of airway secretions to improve cough interaction and reduce aerosol dispersion
- **Authors:** Zayas, Gustavo; Dimitry, John; Zayas, Ana; O'Brien, Darryl; King, Malcolm
- **Publish date/time:** 2005-09-02
- **Linked references:** 0
- **Linked referenced by:** 0
- **Abstract:** BACKGROUND: Infectious respiratory diseases are transmitted to non-infected subjects when an infected person expels pathogenic microorganisms to the surrounding environment when coughing or sneezing. When the airway mucus layer interacts with high-speed airflow, droplets are expelled as aerosol; their concentration and size distribution may each play an important role in disease transmission. Our goal is to reduce the aerosolizability of respiratory secretions while interfering only minimally with normal mucus clearance using agents capable of increasing crosslinking in the mucin glycoprotein network. METHODS: We exposed mucus simulants (MS) to airflow in a simulated cough machine (SCM). The MS ranged from non-viscous, non-elastic substances (water) to MS of varying degrees of viscosity and elasticity. Mucociliary clearance of the MS was assessed on the frog palate, elasticity in the Filancemeter and the aerosol pattern in a "bulls-eye" target. The sample loaded was weighed before and after each cough maneuver. We tested two mucomodulators: sodium tetraborate (XL"B") and calcium chloride (XL "C"). RESULTS: Mucociliary transport was close to normal speed in viscoelastic samples compared to non-elastic, non-viscous or viscous-only samples. Spinnability ranged from 2.5 ± 0.6 to 50.9 ± 6.9 cm, and the amount of MS expelled from the SCM increased from 47 % to 96 % adding 1.5 μL to 150 μL of XL "B". Concurrently, particles were inversely reduced to almost disappear from the aerosolization pattern. CONCLUSION: The aerosolizability of MS was modified by increasing its cohesivity, thereby reducing the number of particles expelled from the SCM while interfering minimally with its clearance on the frog palate. An unexpected finding is that MS crosslinking increased "expectoration".


- **Title:** How long do nosocomial pathogens persist on inanimate surfaces? A systematic review
- **Authors:** Kramer, Axel; Schwebke, Ingeborg; Kampf, Günter
- **Publish date/time:** 2006-08-16
- **Linked references:** 0
- **Linked referenced by:** 0
- **Abstract:** BACKGROUND: Inanimate surfaces have often been described as the source for outbreaks of nosocomial infections. The aim of this review is to summarize data on the persistence of different nosocomial pathogens on inanimate surfaces. METHODS: The literature was systematically reviewed in MedLine without language restrictions. In addition, cited articles in a report were assessed and standard textbooks on the topic were reviewed. All reports with experimental evidence on the duration of persistence of a nosocomial pathogen on any type of surface were included. RESULTS: Most gram-positive bacteria, such as Enterococcus spp. (including VRE), Staphylococcus aureus (including MRSA), or Streptococcus pyogenes, survive for months on dry surfaces. Many gram-negative species, such as Acinetobacter spp., Escherichia coli, Klebsiella spp., Pseudomonas aeruginosa, Serratia marcescens, or Shigella spp., can also survive for months. A few others, such as Bordetella pertussis, Haemophilus influenzae, Proteus vulgaris, or Vibrio cholerae, however, persist only for days. Mycobacteria, including Mycobacterium tuberculosis, and spore-forming bacteria, including Clostridium difficile, can also survive for months on surfaces. Candida albicans as the most important nosocomial fungal pathogen can survive up to 4 months on surfaces. Persistence of other yeasts, such as Torulopsis glabrata, was described to be similar (5 months) or shorter (Candida parapsilosis, 14 days). Most viruses from the respiratory tract, such as corona, coxsackie, influenza, SARS or rhino virus, can persist on surfaces for a few days. Viruses from the gastrointestinal tract, such as astrovirus, HAV, polio- or rota virus, persist for approximately 2 months. Blood-borne viruses, such as HBV or HIV, can persist for more than one week. Herpes viruses, such as CMV or HSV type 1 and 2, have been shown to persist from only a few hours up to 7 days. CONCLUSION: The most common nosocomial pathogens may well survive or persist on surfaces for months and can thereby be a continuous source of transmission if no regular preventive surface disinfection is performed.

Topic ID #7



- **Title:** Discovering human history from stomach bacteria
- **Authors:** Disotell, Todd R
- **Publish date/time:** 2003-04-28
- **Linked references:** 0
- **Linked referenced by:** 0
- **Abstract:** Recent analyses of human pathogens have revealed that their evolutionary histories are congruent with the hypothesized pattern of ancient and modern human population migrations. Phylogenetic trees of strains of the bacterium Helicobacter pylori and the polyoma JC virus taken from geographically diverse groups of human beings correlate closely with relationships of the populations in which they are found.


- **Title:** Viral Discovery and Sequence Recovery Using DNA Microarrays
- **Authors:** Wang, David; Urisman, Anatoly; Liu, Yu-Tsueng; Springer, Michael; Ksiazek, Thomas G; Erdman, Dean D; Mardis, Elaine R; Hickenbotham, Matthew; Magrini, Vincent; Eldred, James; Latreille, J. Phillipe; Wilson, Richard K; Ganem, Don; DeRisi, Joseph L
- **Publish date/time:** 2003-11-17
- **Linked references:** 0
- **Linked referenced by:** 0
- **Abstract:** Because of the constant threat posed by emerging infectious diseases and the limitations of existing approaches used to identify new pathogens, there is a great demand for new technological methods for viral discovery. We describe herein a DNA microarray-based platform for novel virus identification and characterization. Central to this approach was a DNA microarray designed to detect a wide range of known viruses as well as novel members of existing viral families; this microarray contained the most highly conserved 70mer sequences from every fully sequenced reference viral genome in GenBank. During an outbreak of severe acute respiratory syndrome (SARS) in March 2003, hybridization to this microarray revealed the presence of a previously uncharacterized coronavirus in a viral isolate cultivated from a SARS patient. To further characterize this new virus, approximately 1 kb of the unknown virus genome was cloned by physically recovering viral sequences hybridized to individual array elements. Sequencing of these fragments confirmed that the virus was indeed a new member of the coronavirus family. This combination of array hybridization followed by direct viral sequence recovery should prove to be a general strategy for the rapid identification and characterization of novel viruses and emerging infectious disease.


- **Title:** A model of tripeptidyl-peptidase I (CLN2), a ubiquitous and highly conserved member of the sedolisin family of serine-carboxyl peptidases
- **Authors:** Wlodawer, Alexander; Durell, Stewart R; Li, Mi; Oyama, Hiroshi; Oda, Kohei; Dunn, Ben M
- **Publish date/time:** 2003-11-11
- **Linked references:** 0
- **Linked referenced by:** 0
- **Abstract:** BACKGROUND: Tripeptidyl-peptidase I, also known as CLN2, is a member of the family of sedolisins (serine-carboxyl peptidases). In humans, defects in expression of this enzyme lead to a fatal neurodegenerative disease, classical late-infantile neuronal ceroid lipofuscinosis. Similar enzymes have been found in the genomic sequences of several species, but neither systematic analyses of their distribution nor modeling of their structures have been previously attempted. RESULTS: We have analyzed the presence of orthologs of human CLN2 in the genomic sequences of a number of eukaryotic species. Enzymes with sequences sharing over 80% identity have been found in the genomes of macaque, mouse, rat, dog, and cow. Closely related, although clearly distinct, enzymes are present in fish (fugu and zebra), as well as in frogs (Xenopus tropicalis). A three-dimensional model of human CLN2 was built based mainly on the homology with Pseudomonas sp. 101 sedolisin. CONCLUSION: CLN2 is very highly conserved and widely distributed among higher organisms and may play an important role in their life cycles. The model presented here indicates a very open and accessible active site that is almost completely conserved among all known CLN2 enzymes. This result is somehow surprising for a tripeptidase where the presence of a more constrained binding pocket was anticipated. This structural model should be useful in the search for the physiological substrates of these enzymes and in the design of more specific inhibitors of CLN2.


- **Title:** Base-By-Base: Single nucleotide-level analysis of whole viral genome alignments
- **Authors:** Brodie, Ryan; Smith, Alex J; Roper, Rachel L; Tcherepanov, Vasily; Upton, Chris
- **Publish date/time:** 2004-07-14
- **Linked references:** 0
- **Linked referenced by:** 0
- **Abstract:** BACKGROUND: With ever increasing numbers of closely related virus genomes being sequenced, it has become desirable to be able to compare two genomes at a level more detailed than gene content because two strains of an organism may share the same set of predicted genes but still differ in their pathogenicity profiles. For example, detailed comparison of multiple isolates of the smallpox virus genome (each approximately 200 kb, with 200 genes) is not feasible without new bioinformatics tools. RESULTS: A software package, Base-By-Base, has been developed that provides visualization tools to enable researchers to 1) rapidly identify and correct alignment errors in large, multiple genome alignments; and 2) generate tabular and graphical output of differences between the genomes at the nucleotide level. Base-By-Base uses detailed annotation information about the aligned genomes and can list each predicted gene with nucleotide differences, display whether variations occur within promoter regions or coding regions and whether these changes result in amino acid substitutions. Base-By-Base can connect to our mySQL database (Virus Orthologous Clusters; VOCs) to retrieve detailed annotation information about the aligned genomes or use information from text files. CONCLUSION: Base-By-Base enables users to quickly and easily compare large viral genomes; it highlights small differences that may be responsible for important phenotypic differences such as virulence. It is available via the Internet using Java Web Start and runs on Macintosh, PC and Linux operating systems with the Java 1.4 virtual machine.


- **Title:** Virology on the Internet: the time is right for a new journal
- **Authors:** Garry, Robert F
- **Publish date/time:** 2004-08-26
- **Linked references:** 0
- **Linked referenced by:** 0
- **Abstract:** Virology Journal is an exclusively on-line, Open Access journal devoted to the presentation of high-quality original research concerning human, animal, plant, insect bacterial, and fungal viruses. Virology Journal will establish a strategic alternative to the traditional virology communication process.

Topic ID #8



- **Title:** A new recruit for the army of the men of death
- **Authors:** Petsko, Gregory A
- **Publish date/time:** 2003-06-27
- **Linked references:** 0
- **Linked referenced by:** 0
- **Abstract:** The army of the men of death, in John Bunyan's memorable phrase, has a new recruit, and fear has a new face: a face wearing a surgical mask.


- **Title:** Association of HLA class I with severe acute respiratory syndrome coronavirus infection
- **Authors:** Lin, Marie; Tseng, Hsiang-Kuang; Trejaut, Jean A; Lee, Hui-Lin; Loo, Jun-Hun; Chu, Chen-Chung; Chen, Pei-Jan; Su, Ying-Wen; Lim, Ken Hong; Tsai, Zen-Uong; Lin, Ruey-Yi; Lin, Ruey-Shiung; Huang, Chun-Hsiung
- **Publish date/time:** 2003-09-12
- **Linked references:** 0
- **Linked referenced by:** 0
- **Abstract:** BACKGROUND: The human leukocyte antigen (HLA) system is widely used as a strategy in the search for the etiology of infectious diseases and autoimmune disorders. During the Taiwan epidemic of severe acute respiratory syndrome (SARS), many health care workers were infected. In an effort to establish a screening program for high risk personal, the distribution of HLA class I and II alleles in case and control groups was examined for the presence of an association to a genetic susceptibly or resistance to SARS coronavirus infection. METHODS: HLA-class I and II allele typing by PCR-SSOP was performed on 37 cases of probable SARS, 28 fever patients excluded later as probable SARS, and 101 non-infected health care workers who were exposed or possibly exposed to SARS coronavirus. An additional control set of 190 normal healthy unrelated Taiwanese was also used in the analysis. RESULTS: Woolf and Haldane Odds ratio (OR) and corrected P-value (Pc) obtained from two tails Fisher exact test were used to show susceptibility of HLA class I or class II alleles with coronavirus infection. At first, when analyzing infected SARS patients and high risk health care workers groups, HLA-B*4601 (OR = 2.08, P = 0.04, Pc = n.s.) and HLA-B*5401 (OR = 5.44, P = 0.02, Pc = n.s.) appeared as the most probable elements that may be favoring SARS coronavirus infection. After selecting only a "severe cases" patient group from the infected "probable SARS" patient group and comparing them with the high risk health care workers group, the severity of SARS was shown to be significantly associated with HLA-B*4601 (P = 0.0008 or Pc = 0.0279). CONCLUSIONS: Densely populated regions with genetically related southern Asian populations appear to be more affected by the spreading of SARS infection. Up until recently, no probable SARS patients were reported among Taiwan indigenous peoples who are genetically distinct from the Taiwanese general population, have no HLA-B* 4601 and have high frequency of HLA-B* 1301. While increase of HLA-B* 4601 allele frequency was observed in the "Probable SARS infected" patient group, a further significant increase of the allele was seen in the "Severe cases" patient group. These results appeared to indicate association of HLA-B* 4601 with the severity of SARS infection in Asian populations. Independent studies are needed to test these results.


- **Title:** The Virus That Changed My World
- **Authors:** Poh Ng, Lisa Fong
- **Publish date/time:** 2003-12-22
- **Linked references:** 0
- **Linked referenced by:** 0
- **Abstract:** Personal account of a young virologist working in Singapore at the height of the 2003 SARS pandemic


- **Title:** Pro/con clinical debate: Steroids are a key component in the treatment of SARS
- **Authors:** Gomersall, Charles D; Kargel, Marcus J; Lapinsky, Stephen E
- **Publish date/time:** 2004-01-26
- **Linked references:** 0
- **Linked referenced by:** 0
- **Abstract:** SARS (severe acute respiratory syndrome) proved an enormous physical and emotional challenge to frontline health care workers throughout the world in late 2002 through to mid 2003. A large percentage of patients (many being health care workers themselves) became critically ill. Unfortunately, clinicians caring for these individuals did not have the advantage of previous experience or research data on which to base treatment decisions. As a result, at least early in the outbreak, a 'best guess approach' and/or anecdotes drove therapy. In many centres systemic steroids, which carry many potential downsides, became a mainstay of therapy. In this issue of Critical Care, two groups that have frontline experience of SARS debate the role of steroids. Let us hope and pray together that we never have the patient population needed to resolve the questions the two sides raise.


- **Title:** 8th Annual Toronto Critical Care Medicine Symposium, 30 October–1 November 2003, Toronto, Ontario, Canada
- **Authors:** Granton, Jeff; Granton, John
- **Publish date/time:** 2004-01-02
- **Linked references:** 0
- **Linked referenced by:** 0
- **Abstract:** Not available

Topic ID #9



- **Title:** A double epidemic model for the SARS propagation
- **Authors:** Ng, Tuen Wai; Turinici, Gabriel; Danchin, Antoine
- **Publish date/time:** 2003-09-10
- **Linked references:** 0
- **Linked referenced by:** 0
- **Abstract:** BACKGROUND: An epidemic of a Severe Acute Respiratory Syndrome (SARS) caused by a new coronavirus has spread from the Guangdong province to the rest of China and to the world, with a puzzling contagion behavior. It is important both for predicting the future of the present outbreak and for implementing effective prophylactic measures, to identify the causes of this behavior. RESULTS: In this report, we show first that the standard Susceptible-Infected-Removed (SIR) model cannot account for the patterns observed in various regions where the disease spread. We develop a model involving two superimposed epidemics to study the recent spread of the SARS in Hong Kong and in the region. We explore the situation where these epidemics may be caused either by a virus and one or several mutants that changed its tropism, or by two unrelated viruses. This has important consequences for the future: the innocuous epidemic might still be there and generate, from time to time, variants that would have properties similar to those of SARS. CONCLUSION: We find that, in order to reconcile the existing data and the spread of the disease, it is convenient to suggest that a first milder outbreak protected against the SARS. Regions that had not seen the first epidemic, or that were affected simultaneously with the SARS suffered much more, with a very high percentage of persons affected. We also find regions where the data appear to be inconsistent, suggesting that they are incomplete or do not reflect an appropriate identification of SARS patients. Finally, we could, within the framework of the model, fix limits to the future development of the epidemic, allowing us to identify landmarks that may be useful to set up a monitoring system to follow the evolution of the epidemic. The model also suggests that there might exist a SARS precursor in a large reservoir, prompting for implementation of precautionary measures when the weather cools down.


- **Title:** Air pollution and case fatality of SARS in the People's Republic of China: an ecologic study
- **Authors:** Cui, Yan; Zhang, Zuo-Feng; Froines, John; Zhao, Jinkou; Wang, Hua; Yu, Shun-Zhang; Detels, Roger
- **Publish date/time:** 2003-11-20
- **Linked references:** 0
- **Linked referenced by:** 0
- **Abstract:** BACKGROUND: Severe acute respiratory syndrome (SARS) has claimed 349 lives with 5,327 probable cases reported in mainland China since November 2002. SARS case fatality has varied across geographical areas, which might be partially explained by air pollution level. METHODS: Publicly accessible data on SARS morbidity and mortality were utilized in the data analysis. Air pollution was evaluated by air pollution index (API) derived from the concentrations of particulate matter, sulfur dioxide, nitrogen dioxide, carbon monoxide and ground-level ozone. Ecologic analysis was conducted to explore the association and correlation between air pollution and SARS case fatality via model fitting. Partially ecologic studies were performed to assess the effects of long-term and short-term exposures on the risk of dying from SARS. RESULTS: Ecologic analysis conducted among 5 regions with 100 or more SARS cases showed that case fatality rate increased with the increment of API (case fatality = - 0.063 + 0.001 * API). Partially ecologic study based on short-term exposure demonstrated that SARS patients from regions with moderate APIs had an 84% increased risk of dying from SARS compared to those from regions with low APIs (RR = 1.84, 95% CI: 1.41–2.40). Similarly, SARS patients from regions with high APIs were twice as likely to die from SARS compared to those from regions with low APIs. (RR = 2.18, 95% CI: 1.31–3.65). Partially ecologic analysis based on long-term exposure to ambient air pollution showed the similar association. CONCLUSION: Our studies demonstrated a positive association between air pollution and SARS case fatality in Chinese population by utilizing publicly accessible data on SARS statistics and air pollution indices. Although ecologic fallacy and uncontrolled confounding effect might have biased the results, the possibility of a detrimental effect of air pollution on the prognosis of SARS patients deserves further investigation.


- **Title:** Towards evidence-based, GIS-driven national spatial health information infrastructure and surveillance services in the United Kingdom
- **Authors:** Boulos, Maged N Kamel
- **Publish date/time:** 2004-01-28
- **Linked references:** 0
- **Linked referenced by:** 0
- **Abstract:** The term "Geographic Information Systems" (GIS) has been added to MeSH in 2003, a step reflecting the importance and growing use of GIS in health and healthcare research and practices. GIS have much more to offer than the obvious digital cartography (map) functions. From a community health perspective, GIS could potentially act as powerful evidence-based practice tools for early problem detection and solving. When properly used, GIS can: inform and educate (professionals and the public); empower decision-making at all levels; help in planning and tweaking clinically and cost-effective actions, in predicting outcomes before making any financial commitments and ascribing priorities in a climate of finite resources; change practices; and continually monitor and analyse changes, as well as sentinel events. Yet despite all these potentials for GIS, they remain under-utilised in the UK National Health Service (NHS). This paper has the following objectives: (1) to illustrate with practical, real-world scenarios and examples from the literature the different GIS methods and uses to improve community health and healthcare practices, e.g., for improving hospital bed availability, in community health and bioterrorism surveillance services, and in the latest SARS outbreak; (2) to discuss challenges and problems currently hindering the wide-scale adoption of GIS across the NHS; and (3) to identify the most important requirements and ingredients for addressing these challenges, and realising GIS potential within the NHS, guided by related initiatives worldwide. The ultimate goal is to illuminate the road towards implementing a comprehensive national, multi-agency spatio-temporal health information infrastructure functioning proactively in real time. The concepts and principles presented in this paper can be also applied in other countries, and on regional (e.g., European Union) and global levels.


- **Title:** Descriptive review of geographic mapping of severe acute respiratory syndrome (SARS) on the Internet
- **Authors:** Boulos, Maged N Kamel
- **Publish date/time:** 2004-01-28
- **Linked references:** 0
- **Linked referenced by:** 0
- **Abstract:** From geographic mapping at different scales to location-based alerting services, geoinformatics plays an important role in the study and control of global outbreaks like severe acute respiratory syndrome (SARS). This paper reviews several geographic mapping efforts of SARS on the Internet that employ a variety of techniques like choropleth rendering, graduated circles, graduated pie charts, buffering, overlay analysis and animation. The aim of these mapping services is to educate the public (especially travellers to potentially at-risk areas) and assist public health authorities in analysing the spatial and temporal trends and patterns of SARS and in assessing/revising current control measures.


- **Title:** Recently published papers: all the usual suspects and carbon dioxide
- **Authors:** Ball, Jonathan
- **Publish date/time:** 2004-01-02
- **Linked references:** 0
- **Linked referenced by:** 0
- **Abstract:** Not available