First you need to link your Google Drive to the notebook in order to access the files needed for this module.

Run the cell below and follow instructions to mount the drive.

In [2]:
from google.colab import drive
drive.mount('/content/drive')

Mounted at /content/drive
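
Before continuing, it can help to confirm that the project folder is visible from the notebook. The cell below is a minimal sketch of such a check. The helper name `missing_files` is hypothetical, and the folder path is the one used in later cells of this module, so change it if your Drive is organized differently.

```python
import os

def missing_files(folder, expected_names):
    """Return the names in expected_names that are not present in folder."""
    if not os.path.isdir(folder):
        # Folder itself is absent (e.g. Drive not mounted): everything is missing
        return list(expected_names)
    present = set(os.listdir(folder))
    return [name for name in expected_names if name not in present]

# Path used later in this module; change it to match your own Drive layout
project_dir = '/content/drive/MyDrive/Colab_Notebooks/atp_project_files'
print(os.path.isdir(project_dir))
```

If the check prints `False`, re-run the mount cell or correct the path before moving on.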


## Installing Biopython

At the beginning of each module, we will install **Biopython**. Biopython is a large open-source application programming interface (API) for Python, used both in bioinformatics software development and in everyday scripts for common bioinformatics tasks. It contains several packages that you will import to run the analyses required for this project.

REF:
* Cock, P. J., Antao, T., Chang, J. T., Chapman, B. A., Cox, C. J., Dalke, A., Friedberg, I., Hamelryck, T., Kauff, F., Wilczynski, B., & de Hoon, M. J. (2009). Biopython: freely available Python tools for computational molecular biology and bioinformatics. Bioinformatics (Oxford, England), 25(11), 1422–1423. https://doi.org/10.1093/bioinformatics/btp163


In [3]:
!pip install biopython

Collecting biopython
  Downloading biopython-1.79-cp37-cp37m-manylinux_2_5_x86_64.manylinux1_x86_64.whl (2.3 MB)

# Investigating the biological impact of the mutation and its possible role in human disease
For this section, your research will focus on investigating the biological impact of the mutation you are studying. To do this, you will use the OMIM and KEGG databases.

## OMIM Search for information on genetic diseases

The **OMIM** (Online Mendelian Inheritance in Man) database contains short, referenced reviews about genetic loci and genetic diseases. It can be a very useful resource for finding out what research has been done on a gene or a disease.

REF:
* http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?db=OMIM

## Install and import the necessary packages:

The **romim package** was created to query the OMIM database but it runs in R. 

**R** is a different programming language from Python, so you will need to load the **rpy2** extension to run R code in Google Colab.

**Methods** and **remotes** are R packages that help us both install the package and use the functions in the code.

The **XML** package will be used to read the results that you obtain from your database searches. 

REF:
* https://github.com/davetang/romim

In [4]:
# Load the rpy2 extension so that notebook cells can run R code
%load_ext rpy2.ipython

In [5]:
%%R # This must precede all R code in Colab, to allow R code to run 

# Installing the main package
# Note how different it is from Python code
remotes::install_github('davetang/romim')

# Import the library associated with the package
library(romim)

# Installing several packages
install.packages('XML')
install.packages('methods')
install.packages("remotes")

# Press 1 and ENTER when prompted

R[write to console]: Downloading GitHub repo davetang/romim@HEAD



These packages have more recent versions available.
It is recommended to update all of them.
Which would you like to update?

1: All                         
2: CRAN packages only          
3: None                        
4: xml2 (1.3.2 -> 1.3.3) [CRAN]

Enter one or more numbers, or an empty line to skip updates: 1
xml2 (1.3.2 -> 1.3.3   ) [CRAN]
XML  (NA    -> 3.99-0.8) [CRAN]


R[write to console]: Installing 2 packages: xml2, XML

R[write to console]: Installing packages into ‘/usr/local/lib/R/site-library’
(as ‘lib’ is unspecified)

R[write to console]: trying URL 'https://cran.rstudio.com/src/contrib/xml2_1.3.3.tar.gz'

R[write to console]: Content type 'application/x-gzip'
R[write to console]:  length 283965 bytes (277 KB)


✔  checking for file ‘/tmp/RtmpTkQPXi/remotes3b57378b95/davetang-romim-594c6b0/DESCRIPTION’
─  preparing ‘romim’:
✔  checking DESCRIPTION meta-information
─  checking for LF line-endings in source and make files and shell scripts
─  checking for empty or unneeded directories
   Omitted ‘LazyData’ from DESCRIPTION
─  building ‘romim_1.0.1.tar.gz’
   


R[write to console]: Installing package into ‘/usr/local/lib/R/site-library’
(as ‘lib’ is unspecified)

R[write to console]: Loading required package: XML

R[write to console]: 
Attaching package: ‘XML’


R[write to console]: The following object is masked from ‘package:tools’:

    toHTML


R[write to console]: Loading required package: magrittr

R[write to console]: Loading required package: xml2

R[write to console]: Installing package into ‘/usr/local/lib/R/site-library’
(as ‘lib’ is unspecified)

R[write to console]: trying URL 'https://cran.rstudio.com/src/contrib/XML_3.99-0.8.tar.gz'

R[write to console]: Content type 'application/x-gzip'
R[write to console]:  length 970694 bytes (947 KB)


## Obtaining the ID number (called mim number) associated with the ATP synthase gene entry in OMIM

In [5]:
%%R # This must precede all R code in Colab, to allow R code to run 

# To access OMIM, we will use this key which will work as our password to access the database
set_key('4PUvWRqSSD2BuprIVAP_VQ') 

# Write MTATP6 in the parentheses below to obtain the entry for MTATP6
gene_to_omim('MTATP6', show_query=TRUE)


R[write to console]: https://api.omim.org/api/entry/search?search=gene_symbol:MTATP6&include=geneMap&apiKey=4PUvWRqSSD2BuprIVAP_VQ



NULL


The URL printed above returns an XML file containing the search results. Let's download and parse it.
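
To see how that URL is put together, here is a minimal Python sketch that assembles it from its parts. The parameter names (`search`, `include`, `apiKey`) are read off the URL that `gene_to_omim` printed; the function name `build_gene_search_url` and the placeholder key are hypothetical and not part of romim.

```python
from urllib.parse import urlencode

OMIM_SEARCH_ENDPOINT = 'https://api.omim.org/api/entry/search'

def build_gene_search_url(gene_symbol, api_key):
    """Assemble an OMIM gene-symbol search URL like the one printed above."""
    params = {
        'search': f'gene_symbol:{gene_symbol}',
        'include': 'geneMap',
        'apiKey': api_key,
    }
    # safe=':' keeps the colon in 'gene_symbol:MTATP6' unescaped
    return OMIM_SEARCH_ENDPOINT + '?' + urlencode(params, safe=':')

# Placeholder key for illustration; use the key given in this module
print(build_gene_search_url('MTATP6', 'YOUR_API_KEY'))
```

Changing the gene symbol is the only thing needed to search for a different gene.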

In [6]:
%%R
# Saving the file:

# Paste the url between the quotes below
url <- 'https://api.omim.org/api/entry/search?search=gene_symbol:MTATP6&include=geneMap&apiKey=4PUvWRqSSD2BuprIVAP_VQ'

# Set destination (write the file path and a file name followed by .xml)
destfile <- '/content/drive/MyDrive/Colab_Notebooks/atp_project_files/file_atp.xml'

# Download the file
download.file(url, destfile)


R[write to console]: trying URL 'https://api.omim.org/api/entry/search?search=gene_symbol:MTATP6&include=geneMap&apiKey=4PUvWRqSSD2BuprIVAP_VQ'

R[write to console]: downloaded 774 bytes




In [7]:
%%R
# Parse the file for readability
result <- xmlParse(file = '/content/drive/MyDrive/Colab_Notebooks/atp_project_files/file_atp.xml')

# Read the file
read <- read_xml('/content/drive/MyDrive/Colab_Notebooks/atp_project_files/file_atp.xml' )

# Find the mim number
num <- xml_find_all(read, ".//mimNumber")

# Display the number
num

{xml_nodeset (1)}
[1] <mimNumber>516060</mimNumber>
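
The same lookup can be sketched in Python with the standard library, which may help if you are more comfortable reading Python than R. The XML below is a made-up, minimal stand-in for the downloaded file (the real OMIM response contains more elements), and `find_mim_numbers` is a hypothetical helper name.

```python
import xml.etree.ElementTree as ET

# Made-up, minimal XML for illustration; the real OMIM response has more elements
sample_xml = """
<omim>
  <searchResponse>
    <entryList>
      <entry>
        <mimNumber>516060</mimNumber>
      </entry>
    </entryList>
  </searchResponse>
</omim>
"""

def find_mim_numbers(xml_text):
    """Return the text of every <mimNumber> element, at any depth."""
    root = ET.fromstring(xml_text.strip())
    return [node.text for node in root.iter('mimNumber')]

print(find_mim_numbers(sample_xml))
```

`root.iter('mimNumber')` plays the same role here as the `.//mimNumber` search in the R cell: it finds matching elements anywhere in the tree.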


Write down the mim number for the entry from the results above.

Answer here

## Using OMIM to obtain more information about the gene
This time you will search the OMIM 'MTATP6' entry for information.

The function 'get_omim' helps you do just that since we can set certain arguments to 'TRUE' and obtain specific information about the entry.

Run the next cell to see a list of Arguments that you can access.

In [None]:
%%R
help(get_omim)

# Obtain information from the entry

### Read information about allelic variants 
An allele is one of two or more versions of a gene that differ in their DNA sequence.

Allelic variation describes the presence or number of different allele forms at a particular locus (locus or loci = place) on a chromosome.

REF:  
* https://warwick.ac.uk/fac/sci/lifesci/research/vegin/geneticimprovement/diversitycollection/allelicvariation/
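
As a small illustration of what "differs in DNA sequence" means, the sketch below compares two alleles base by base and reports where they disagree. The sequences are made up for the example and are not real MTATP6 data; `differing_positions` is a hypothetical helper name.

```python
def differing_positions(seq_a, seq_b):
    """Return (position, base_a, base_b) for each mismatch; positions are 1-based."""
    if len(seq_a) != len(seq_b):
        raise ValueError('sequences must be the same length')
    return [
        (i, a, b)
        for i, (a, b) in enumerate(zip(seq_a, seq_b), start=1)
        if a != b
    ]

# Made-up sequences, not real MTATP6 data
allele_1 = 'ATGCTGACC'
allele_2 = 'ATGCGGACC'
print(differing_positions(allele_1, allele_2))
```

A single-base difference like this one is the kind of change that OMIM records in names such as `8993T-G`.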


In [8]:
%%R
# Search OMIM again, this time with the MTATP6 MIM number, to obtain the allelic variants
set_key('4PUvWRqSSD2BuprIVAP_VQ')

# Set allelicVariantList to TRUE
omim_result <- get_omim(516060, allelicVariant = TRUE)

saveXML(omim_result, file='all.xml')

[1] "all.xml"


### Display the results in the form of a table

In [10]:
#@title Load the results by providing the file name in this form (include file extension .xml)

# Parse the OMIM XML results and display the allelic variants as a table
import xml.etree.ElementTree as ET
import csv
import pandas as pd

file_name = "all.xml" #@param {type:"string"}

tree = ET.parse(file_name)
root = tree.getroot()

# Write one row per allelic variant: the mutation name and its description text
with open('refdata4.csv', 'w', newline='') as ref_data4:
    csvwriter = csv.writer(ref_data4)
    csvwriter.writerow(['mutations', 'text'])  # header row

    for member in root.findall('.//allelicVariant'):
        # getattr(..., 'text', None) yields None when the element is missing,
        # which leaves that cell empty instead of raising an error
        mutations = getattr(member.find('.//mutations'), 'text', None)
        description = getattr(member.find('.//text'), 'text', None)
        csvwriter.writerow([mutations, description])

# Load the CSV into a DataFrame and show the full text of each cell
data4 = pd.read_csv("refdata4.csv")
pd.set_option('display.max_colwidth', 10000)

data4



Unnamed: 0,mutations,text
0,"MTATP6, 8993T-G, LEU156ARG","{24:Holt et al. (1990)} found a heteroplasmic T-to-G transversion at nucleotide pair 8993 in a maternal pedigree which resulted in the change of a hydrophobic leucine to a hydrophilic arginine at position 156 in subunit 6 of mitochondrial H(+)-ATPase. The clinical symptoms varied in proportion to the percentage of mutant mtDNAs but the most common clinical presentation included neurogenic muscle weakness, ataxia, and retinitis pigmentosa, leading to the designation of NARP syndrome ({551500}). The insertion of an arginine in the hydrophobic sequence of ATPase 6 probably interferes with the hydrogen ion channel formed by subunits 6 and 9 of the ATPase, thus causing failure of ATP synthesis. {22:Harding et al. (1992)} demonstrated that prenatal diagnosis was possible, although the approach was hampered by incomplete knowledge concerning the proportion of mutant mtDNA and its relationship to disease severity, how it may change during fetal and postnatal development, and its tissue distribution.\n\nIn families with mitochondrial complex V (ATP synthase) deficiency mitochondrial type 1 (MC5DM1; {500015}) resulting in Leigh syndrome (see {256000}), {46:Tatuch et al. (1992)} and {43:Shoffner et al. (1992)} identified a nucleotide 8993 mutation in the MTAPT6 gene. {46:Tatuch et al. (1992)} found the heteroplasmic mtDNA mutation in a female infant showing lactic acidemia, hypotonia, and neurodegenerative disease leading to death at the age of 7 months. Autopsy revealed lesions typical of Leigh disease, both in the basal ganglia and in the brainstem. A maternal uncle and aunt died 5 months and 1 year, respectively, after a similar clinical course, while another maternal uncle, 33 years of age, had retinitis pigmentosa, ataxia, and mental retardation. The index patient had more than 95% abnormal mtDNA in her skin fibroblasts, brain, kidney, and liver tissues, as measured by laser densitometry. 
The maternal aunt who died at 1 year likewise had more than 95% abnormal mtDNA in her lymphoblasts. The uncle with retinitis pigmentosa had 78% and 79% abnormal mtDNA in his skin fibroblasts and lymphoblasts, respectively, while an asymptomatic maternal aunt and her son had no trace of the mutation. The mother of the index case had 71% and 39% abnormal mtDNA in her skin fibroblasts and lymphoblasts, respectively. {43:Shoffner et al. (1992)} reported a family which was heteroplasmic for the ATPase 6 nucleotide 8993 mutation in which 2 daughters died at ages 2.5 years and 14 months. Pathologic analyses showed classic basal ganglial lesions, vascular proliferation, and glioses. Two brothers manifested psychomotor retardation, ataxia, hypotonia, and retinal degeneration. The mother had retinal degeneration and experienced migraine headaches. The mother's 2 sisters were normal. The 4 affected children had high levels of mutant mtDNA, in excess of 95% by Southern blot. The mother had a 78% level of mutant mtDNA while her 2 sisters had 100% normal mtDNA.\n\n{9:Ciafaloni et al. (1993)} described 2 sisters with Leigh syndrome who had a T-to-G transversion at nucleotide 8993 in the MTATP6 gene. The asymptomatic mother had the same mutation. All 3 were heteroplasmic. The proportion of mutant genomes was lower in the mother's blood than in the blood of the more mildly affected sister, whereas all tissues from the other sister were almost homoplasmic for the mutation.\n\n{40:Santorelli et al. (1993)} found the T-to-G point mutation at nucleotide 8993 in 12 patients with Leigh syndrome from 10 unrelated families.\n\n{36:Pastores et al. (1994)} expanded the clinical phenotype of the nucleotide 8993 mtDNA mutations to include hypertrophic cardiomyopathy and confirmed its role in producing Leigh syndrome. The patient was a boy of Chinese descent who presented at the age of 6 months with a history of developmental delay and hypotonia and who had recurrent lactic acidosis. 
The mother's first pregnancy resulted in the birth of a stillborn female; an apparently healthy older brother had died suddenly at age 2 months. The 8993T-G mutation was heteroplasmic in the patient's skeletal muscle (90%) and fibroblasts (90%). The identical mutation was present in leukocytes (38%) isolated from the mother, but not from the father or maternal grandmother.\n\n{16:Degoul et al. (1995)} found the 8993T-G in a family with Leigh syndrome. The proband, who died at 9 years of age, developed hypotonia in the first 6 months of life and developmental retardation was noted. At 3 years of age he showed ataxia, dysmetria, myopathic weakness, nystagmus and ptosis. The electroretinogram was altered. She became deaf and developed progressive spasticity. Blood lactate concentration was normal. In contrast, lactate concentration in the CSF was always elevated. Two brothers died with acute apnea during infectious episodes, before the end of their first year. An older sister was mentally retarded, with ataxia, dysarthria, dystonia, and pes cavus, and had retinal degeneration. The mother's brother was mentally retarded and severely handicapped. Except for the father, all members of the family showed the mutation in all tissues studied, with high percentages in the 2 symptomatic sisters and even in 1 asymptomatic boy.\n\n{19:Ferlin et al. (1997)} reported a child with Leigh syndrome who died at age 14 months. Genetic analysis identified the 8993T-G mutation in 3 generations of the family and showed that the percentage of mutant mtDNA increased through each generation. The maternal grandmother of the proband, the mother, and the eldest aunt had 10%, 52%, and 50% mutant mtDNA in lymphocytes, respectively. The proband's mother and the proband had 84% and 90% mutant mtDNA in skin fibroblasts, respectively. The eldest aunt terminated a pregnancy when the 8993T-G mutation was identified in chorionic villi. In fetal tissues, the mutation load ranged from 91 to 96%. 
{19:Ferlin et al. (1997)} concluded that the findings in this family were consistent with a threshold effect, in which over 90% mutant mtDNA load results in clinical disease, and noted that prenatal diagnosis is feasible.\n\n{2:Blok et al. (1997)} analyzed mtDNA in oocytes from an asymptomatic mother of 3 children exhibiting heteroplasmic expression of the 8993T-G mutation associated with Leigh syndrome. The mother had 50% mutant mtDNA in her blood. One of the 7 oocytes analyzed showed no evidence of the mutation, while the remaining 6 had a mutant load of more than 95%. {2:Blok et al. (1997)} suggested that this observation reflected preferential amplification of the mtDNA variant during oogenesis. During formation of the zygote, mtDNA is derived exclusively from the oocyte; thus, it is possible that a de novo mutation may arise during oogenesis. A first carrier of a de novo mutation may be a mother who exhibits mosaicism for the mutation restricted to oocytes. However, the usual finding is that mothers of patients with Leigh syndrome and the 8993T-G mutation have substantial levels of the mutant mtDNA (38 to 76%). {45:Takahashi et al. (1998)} reported the case of a 1-year-old boy with Leigh syndrome associated with the 8993T-G mutation whose mother did not have the mutant mtDNA in her blood or urine sediment cells. Thus, a de novo mutation had occurred at a high level in oocytes, thereby causing Leigh syndrome in the boy. Generalized hypotonia was noted at birth. He developed apnea attacks and altered consciousness after upper respiratory infections at the ages of 2 and 4 months. At the age of 7 months, he showed symptoms of brainstem dysfunction, such as irregular respiration and swallowing difficulty. At the age of 9 months, growth retardation and microcephaly were obvious. 
Laboratory examinations showed increased lactate and pyruvate levels in blood and cerebrospinal fluid.\n\nIn plants, cytoplasmic male sterility (CMS) is a mitochondrially inherited inability to produce viable pollen, and has been observed in more than 150 different plant species. {26:Kempken et al. (1998)} pointed out that in sorghum RNA editing is required to generate codons that encode leucine residues at positions equivalent to human 156 and 217. Loss of ATP6 RNA editing, as it occurs in sorghum, thus mimics mutations in human mitochondrial diseases. In all ATP6 protein sequences found in databases, including protists, plants (edited sequence), fungi, and animals, both amino acid positions are completely conserved.\n\n{54:White et al. (1999)} performed prenatal diagnosis in 2 mothers at risk of having affected children. One was the sister of a severely affected individual, and had previously had an unaffected child and a stillborn child. The other mother had 2 unaffected children and 2 affected children. The 8993T-G transversion was not found in the chorionic villus sample from 1 fetus or in the amniocytes from the other fetus. Both pregnancies were continued, and the resulting children were healthy at 2 years and 5 years of age.\n\nIn 3 patients from 2 unrelated families, {1:Baracca et al. (2000)} investigated the biochemical phenotype associated with the 8993T-G mutation in the MTATP6 gene. All 3 carried more than 80% mutant genome in platelets and were manifesting clinically various degrees of the NARP syndrome phenotype. Their results suggested that the 8993T-G mutation induces a structural defect in F1F0-ATPase that causes a severe impairment of ATP synthesis.\n\n{27:Kerrison et al. (2000)} described the progression of retinopathy in NARP syndrome due to the T-to-G point mutation at the mtDNA nucleotide position 8993 in the MTATP6 gene. Prior to the onset of visual field constriction, ophthalmoscopy revealed salt-and-pepper retinopathy. 
After the visual fields had become constricted, fundus examination showed diffuse peripheral bone spicule formation, optic nerve pallor, and arteriolar attenuation consistent with retinitis pigmentosa. The authors stressed that mild mottling of the peripheral retinal pigment epithelium (salt-and-pepper retinopath..."
1,"MTATP6, 8993T-C, LEU156PRO","In 4 sibs with mitochondrial complex V (ATP synthase) deficiency mitochondrial type 1 (MC5DM1; {500015}) resulting in Leigh syndrome (see {256000}), originally reported by {50:van Erven et al. (1987)}, {14:de Vries et al. (1993)} found a T-to-C transition at nucleotide 8993 in the gene for ATPase 6. The mutation was predicted to cause a substitution of proline for leucine. The 4 sibs were severely affected and 1 of them died at the age of 17 years. The possibility of a mitochondrial basis was suggested by the fact that all children were affected; furthermore, beginning at the age of 56, the mother complained of weakness in her left leg, easy fatigability, and sensory disturbances in her feet. Neurologic examination demonstrated pyramidal signs in both legs, and ancillary investigations yielded results compatible with the diagnosis of Leigh syndrome. All patients, including the mother, were heteroplasmic for the mutation. The oldest of the 4 sibs died at age 17 after a progressive neurologic deterioration for 8 to 10 years. The other 3 sibs were living at ages 25, 23, and 20 years. Thus, the clinical picture did not agree strictly with that of infantile subacute necrotizing encephalopathy of Leigh. In the 3 living sibs there was no abnormality of pyruvate metabolism detected by study of serum and urine, but all 3 had marked elevation of CSF pyruvate and lactate concentration. Furthermore, pyruvate oxidation rates were normal in fibroblasts and leukocytes. A defect restricted to brain was suggested.\n\n{8:Chakrapani et al. (1998)} described another family with the 8993T-C mutation causing Leigh syndrome. A brother and sister were found to be homoplasmic for the 8993T-C mutation; the asymptomatic mother was heteroplasmic. The features in the boy were those of NARP ({551500}). 
Beside delayed motor development, ataxia, and raised CSF lactate, developmental regression followed acute illnesses in early childhood, with slow reacquisition of skills and pronounced ataxia thereafter.\n\n{20:Fujii et al. (1998)} reported a patient with Leigh syndrome with the 8993T-C mutation. They reviewed 9 other Leigh syndrome patients with the 8993T-C mutation and compared them with 18 reported cases with Leigh syndrome caused by the 8993T-G mutation ({516060.0001}). Leigh syndrome with the 8993T-C mutation was characterized by a significantly higher frequency of ataxia (P less than 0.01). None of the reviewed 8993T-C Leigh syndrome patients had retinitis pigmentosa, which is one of the characteristic findings in Leigh syndrome caused by the 8993T-G mutation. The milder symptoms of 8993T-C Leigh syndrome may be explained by the milder complex V dysfunction; however, the higher frequency of ataxia in association with 8993T-C requires more study.\n\n{51:Vilarinho et al. (2001)} reported 4 new 8993T-C patients. One was a 17-year-old girl with remitting-relapsing neurodegenerative disease since age 16 months which worsened during fevers or infectious disease. She had elevated CSF lactate and brain MRI was compatible with Leigh syndrome. Proton magnetic resonance spectroscopy showed slight elevation of lactate in the basal ganglia. Her mother and maternal aunt showed a progressive cerebellar ataxia. The second case was of a 16 year-old boy who experienced episodes of loss of consciousness and awkward gait during febrile illness in childhood with a slow recovery. Blood and CSF lactate concentrations were elevated. Brain MRI showed basal ganglia involvement. The third case was of a 21 year-old girl who experienced her first episode of lethargy and hypotonia at age 5 months during a fever. Similar episodes reappeared in her first 10 years. At age 11 years, examination showed mental deficiency, severe dysarthria, and vertical gaze palsy. Blood lactate was elevated. 
Brain MRI showed hyperlucencies in the putamen and head caudate nucleus. Two older sisters had peripheral neuropathy with normal MRI and blood lactate. The fourth case was of a 16 year-old cousin of case 3 who had a subacute episode of leg weakness, ataxia and dysarthria during a fever at age 3 years. She improved but had permanent motor disability. Similar episodes recurred and always had a slow recovery. At age 9 she had elevated blood and CSF lactate and brain MRI was compatible with maternally inherited Leigh syndrome.\n\n{15:Debray et al. (2007)} reported long-term follow-up on a patient who met the stringent criteria for Leigh syndrome established by {38:Rahman et al. (1996)}. At age 4 years, the patient presented with respiratory distress, unexplained tachypnea, and a 2-day history of ptosis. On day 5 of hospitalization, he deteriorated with apnea and severe hypercapnia and required mechanical ventilation for 5 days. Ophthalmologic examination revealed nystagmus and supranuclear ophthalmoplegia. CT scan showed bilateral basal ganglia hypodensities. Blood lactate was 1.7 mmol/L (normal less than 2.2) and CSF lactate was 4.1 mmol/L (normal less than 1.8). He recovered without sequelae and functioned normally throughout childhood and early adolescence. Follow-up at age 18 revealed a slight cognitive decline in nonverbal tasks. The patient's leukocyte DNA revealed a greater than 95% 8993T-C mutant DNA; in contrast, the mutation was undetectable in his mother. {15:Debray et al. (2007)} reviewed 20 Leigh syndrome patients with the 8993T-C mutation. Only half (10/20) of the patients fulfilled the criteria of {38:Rahman et al. (1996)} for typical Leigh syndrome. Eighty-five percent (17/20) survived a median follow-up time of 16 years and 41% (7/20) did not have mental retardation. {15:Debray et al. (2007)} concluded that a favorable outcome can be observed in a significant percentage of Leigh syndrome patients with the 8993T-C mtDNA mutation.\n\n{39:Rantamaki et al. 
(2005)} reported 4 sibs with adult-onset ataxia and polyneuropathy ({500010}) and a heteroplasmic 8993T-C mutation. One of the sibs had early-onset severe ataxia and moderate mental impairment and died at age 22 years. The remaining 3 sibs had adult-onset of variable gait abnormalities, axonal sensorimotor polyneuropathy, abnormal eye movements, and dysarthria. Genetic analysis of the 3 surviving sibs showed mutant mtDNA ranging from 64 to 89%. {39:Rantamaki et al. (2005)} emphasized the unique phenotypic presentation in this family.\n\n{10:Craig et al. (2007)} identified a 3-generation family with slowly progressive adult-onset ataxia associated with the heteroplasmic 8993T-C mutation. A mother, daughter, and granddaughter were affected, with 86%, 82%, and 83% mutation heteroplasmy, respectively, in the blood. Other features included cerebellar dysarthria, axonal sensory neuropathy, and gaze-evoked horizontal nystagmus. The daughter and granddaughter reported intermittent exacerbations of ataxia, associated with migraine in 1 case. The daughter had optic atrophy without retinal degeneration. The 8993T-C mutation was not identified in 191 additional patients with episodic ataxia, 307 patients with ataxia, or 96 patients with suspected Charcot-Marie-Tooth disease (see, e.g., CMT1A; {118220}) suggesting that it is not a common finding in these phenotypic conditions."
2,"MTATP6, 9101T-C, ILE192THR","In 1 of 24 Finnish Leber hereditary optic atrophy ({535000}) families, {29:Lamminen et al. (1995)} found a single affected male with a typical acute stage with peripapillary microangiopathy; onset was at age 21. A T-to-C base substitution at nucleotide 9101 in the MTATP6 gene was found that resulted in the replacement of an isoleucine by a threonine at residue 192. Using restriction site changes resulting from the base substitution, the mutation was detected in all maternal members of the proband's family but not in other individuals tested and was not found in any of the other Finnish LHON families or in 100 unrelated control individuals of Finnish origin."
3,,
4,"MTATP6, 9176T-C","In 2 Jewish brothers with mitochondrially inherited bilateral striatal necrosis ({500003}), {48:Thyagarajan et al. (1995)} identified a 9176T-C transition in the MTATP6 gene. In the more severely affected patient, the mutation was homoplasmic in muscle, leukocytes, and fibroblasts; 98% of mtDNA was mutant in leukocytes from his affected brother. The mother and 2 other sibs were asymptomatic, with varying degrees of heteroplasmy for the mutation.\n\nIn an Italian family, {17:Dionisi-Vici et al. (1998)} found that the 9176T-C mutation in the MTATP6 gene was associated with mitochondrial complex V (ATP synthase) deficiency mitochondrial type 1 (MC5DM1; {500015}), resulting in early-onset fulminant Leigh syndrome (see {256000}) and with sudden unexpected death in 2 sibs, respectively. PCR-SSCP analysis and direct sequencing showed that the mutation was homoplasmic in the mitochondrial DNA of the proband. The 9176T-C mutation changed the highly conserved leucine to proline in the MTATP6 gene and was maternally inherited, but maternal relatives were asymptomatic.\n\nAmong 80 patients with clinical and brain imaging characteristics of Leigh syndrome, {31:Makino et al. (1998)} found that 11 had the well-known 8993T-G mutation in the mitochondrial DNA ({516060.0001}). In addition, 3 patients had the 9176T-C mutation. In the 3 patients reported by {31:Makino et al. (1998)}, 1 had the typical clinical characteristics of Leigh syndrome from early infancy, and 2 had later onset of neurologic deficits. All had a slowly progressive course and basal ganglia abnormalities by neuroimaging. As both nucleotide 8993 and nucleotide 9176 are located in the ATPase 6 coding region, altered ATPase function may be one of the enzyme abnormalities in Leigh syndrome and other similar conditions with bilateral striatal necrosis."
5,"MTATP6, 8851T-C","In a boy with bilateral striatal necrosis ({500003}), {13:De Meirleir et al. (1995)} identified an 8851T-C transition in the MTATP6 gene. The patient had less than 3% normal mtDNA in fibroblasts and his unaffected mother had 15% normal mtDNA. The mtDNA of the grandmother had no trace of the mutation."
6,"MTATP6, 2-BP DEL, 9205TA","{41:Seneca et al. (1996)} described a female newborn who presented with seizures and episodic lactic acidemia, symptoms consistent with mitochondrial dysfunction. They found a 2-bp deletion at positions 9204/5 or 9205/6 at the junction between the 2 genes MTATP6 and MTCO3 ({516050}) that removed the termination codon for RNA14, the ATPase 8- and 6-encoding bicistronic mRNA unit. The deletion removed the termination codon for MTATP6 and set MTCO3 immediately in-frame, generating a predicted ATPase6/COX3 fusion protein. {47:Temperley et al. (2003)} showed that accurate processing at this site still occurred, but there was a markedly decreased steady-state level of RNA14. The majority of mutated RNA14 terminated with short poly(A) extensions, and a second, partially truncated population was also present. Initial maturation of mutated RNA14 was unaffected, but deadenylation occurred rapidly. Inhibition of mitochondrial protein synthesis showed that the deadenylation was dependent on translation; deadenylation also enhanced mRNA decay. {47:Temperley et al. (2003)} referred to the deletion as mu-delta-9205."
7,"MTATP6, 9185T-C, LEU220PRO","In a patient with mitochondrial complex V (ATP synthase) deficiency mitochondrial type 1 (MC5DM1; {500015}), resulting in Leigh syndrome (see {256000}), {7:Castagna et al. (2007)} identified a heteroplasmic 9185T-C transition in the MTATP6 gene, resulting in a leu220-to-pro (L220P) substitution in the fifth transmembrane helix at the inner surface of the outer mitochondrial membrane. After a normal development, the boy presented at age 8.5 years with a 3-month history of frequent falls, ataxia, slowed speech, poor concentration, bilateral pes cavus, and absent ankle reflexes. Three months later, he developed saccadic paresis and nystagmus and rapidly deteriorated into a comatose state, followed by death. Brain MRI showed symmetric hyperintense signals in the basal ganglia with prominent cerebellar involvement, consistent with Leigh syndrome. The proband had a similarly affected brother, and both boys had greater than 90% mutant DNA levels. The mother and a maternal uncle had isolated peripheral neuropathy and ataxia with 86% and 85% heteroplasmy for the mutation, respectively. Family history revealed 4 additional maternal relatives with the mutation: 2 had Leigh syndrome, and 2 had isolated ataxia. Percentage of heteroplasmy correlated with the severity of the phenotype. Studies of the proband's mitochondria showed a 30% decrease in ATPase activity, although the overall process of ATP synthesis was not affected."
8,"MTATP6, 1-BP INS, 8618T","In a man with NARP syndrome ({551500}), {30:Lopez-Gallardo et al. (2009)} identified a 1-bp insertion (8618insT) in the MTATP6 gene, resulting in a frameshift and a truncated protein of 63 amino acids instead of the 227 residues of the mature wildtype protein. The mutation was heteroplasmic, present in 26% and 85% of blood and muscle, respectively. Western blot analysis showed decreased levels of MTATP6 protein but no truncated protein. The patient had delayed development, psychomotor retardation, and irritability in childhood, and later developed other neurologic signs, including hearing loss, blindness due to optic atrophy and retinitis pigmentosa, ataxia, and clonic spasms."
9,"MTATP6, MET1THR","In 4 unrelated infants who presented with hypertrophic cardiomyopathy and congestive heart failure ({500006}), {52:Ware et al. (2009)} identified a heteroplasmic 8528T-C transition, resulting in concurrent substitutions in the overlapping MTATP6 and MTATP8 genes: a met1-to-thr (M1T) substitution in MTATP6, predicted to abrogate the start of translation, and a trp55-to-arg (W55R) substitution at a highly conserved residue in MTATP8 ({516070.0003}). The alteration appeared homoplasmic on sequence analysis; however, tissue analysis of 1 patient and her asymptomatic mother and maternal aunt revealed that the patient carried a high degree of heteroplasmic mutation (92 to 98%) in all 5 tissues examined, whereas her mother carried the heteroplasmic mutation at a much lower level (15 to 25%), and the mutation was not detected in her maternal aunt. Functional analysis in skin fibroblasts from this patient and her mother indicated a significant decrease in ATP synthesis in the patient."


## Answer the following questions:
Input your answer in the cell below each question and press SHIFT+ENTER.

1. What hypothesis do researchers have for the loss of function in ATP Synthase 6 with the L156R mutation?


Answer here

2. Does the hypothesis that you formed from studying the protein’s structure relate to your answer in question 1?  Please explain.


Answer here

Read the “Allelic Variants” text associated with “Takahashi et al. 1998”.


3. Why would increased levels of pyruvate be found in patients with the L156R mutation?




Answer here

## Using OMIM to obtain information about Leigh Syndrome
This time you will search the OMIM 'LEIGH SYNDROME' entry (MIM number 256000) for information.

The function 'get_omim' lets you do this: setting an argument to 'TRUE' returns that specific part of the entry.

Run the next cell, which sets the 'text' argument to 'TRUE', to retrieve the text sections of the entry.

In [6]:
%%R
# Search OMIM again to obtain description information
set_key('4PUvWRqSSD2BuprIVAP_VQ')

# Using mim number to get the entry, set 'text' argument to true
# Use mim number 256000
omim_result <- get_omim(256000, text = TRUE)

# Save the xml to a file called molgen.xml
saveXML(omim_result, file="molgen.xml")

[1] "molgen.xml"


In [7]:
#@title Load the results by providing the file name in this form (include file extension .xml)

# MAKING RESULTS LOOK GOOD
# Convert the text sections of the OMIM XML into a CSV file, then show them as a table
import xml.etree.ElementTree as ET
import csv
import pandas as pd

file_name = "molgen.xml" #@param {type:"string"}

tree = ET.parse(file_name)
root = tree.getroot()

# newline='' avoids blank rows on some platforms; 'with' closes the file automatically
with open('refdata4.csv', 'w', newline='') as ref_data4:
    csvwriter = csv.writer(ref_data4)

    # Header row: the XML tag names become the column names
    csvwriter.writerow(['textSectionTitle', 'textSectionContent'])

    # One row per text section: its title, then its content
    for member in root.findall('.//textSection'):
        title = member.find('.//textSectionTitle').text
        content = member.find('.//textSectionContent').text
        csvwriter.writerow([title, content])

data4 = pd.read_csv("refdata4.csv")
# Show the full section text instead of truncating long cells
pd.set_option('display.max_colwidth', 10000)

data4

Unnamed: 0,textSectionTitle,textSectionContent
0,Text,"A number sign (#) is used with this entry because Leigh syndrome, which is a clinical diagnosis based primarily on characteristic brain imaging findings, can be caused by mutation in multiple different genes, indicating genetic heterogeneity. Leigh syndrome is a presentation of numerous genetic disorders resulting from defects in the mitochondrial OXPHOS complex. Accordingly, the genes implicated in Leigh syndrome most commonly encode structural subunits of the OXPHOS complex or proteins required for their assembly, stability, and activity. Mutations in both nuclear and mitochondrial genes have been identified."
1,Description,"Leigh syndrome is a clinically and genetically heterogeneous disorder resulting from defective mitochondrial energy generation. It most commonly presents as a progressive and severe neurodegenerative disorder with onset within the first months or years of life, and may result in early death. Affected individuals usually show global developmental delay or developmental regression, hypotonia, ataxia, dystonia, and ophthalmologic abnormalities, such as nystagmus or optic atrophy. The neurologic features are associated with the classic findings of T2-weighted hyperintensities in the basal ganglia and/or brainstem on brain imaging. Leigh syndrome can also have detrimental multisystemic affects on the cardiac, hepatic, gastrointestinal, and renal organs. Biochemical studies in patients with Leigh syndrome tend to show increased lactate and abnormalities of mitochondrial oxidative phosphorylation (summary by {16:Lake et al., 2015}).\n\n<Subhead> Genetic Heterogeneity of Leigh Syndrome\n\nLeigh syndrome may be a clinical presentation of a primary deficiency caused by genes in any of the mitochondrial respiratory chain complexes: complex I deficiency (see {252010}), complex II deficiency (see {252011}), complex III deficiency (see {124000}), complex IV deficiency (cytochrome c oxidase; see {220110}), and complex V deficiency (see {604273}) (summary by {16:Lake et al., 2015}).\n\nMutations in genes encoding mitochondrial tRNA proteins have also been identified in patients with Leigh syndrome: see MTTV ({590105}), MTTK ({590060}), MTTW ({590095}), and MTTL1 ({590050}).\n\nLeigh syndrome may also be caused by mutations in components of the pyruvate dehydrogenase complex (e.g., DLD, {238331} and PDHA1, {300502}).\n\nDeficiency of coenzyme Q10 ({607426}) can present as Leigh syndrome.\n\nSome forms of combined oxidative phosphorylation deficiency can present as Leigh syndrome (see, e.g., {617664})."
2,Clinical Features,"This condition was first described by {17:Leigh (1951)} in a patient with foci of necrosis and capillary proliferation in the brainstem. {9:Feigin and Wolf (1954)} observed 2 affected sibs from a consanguineous mating. Because of similarity to Wernicke encephalopathy ({277730}), they suggested that a genetic defect in some way related to thiamine was present (see HISTORY). {10:Ford (1960)} referred to 2 affected sibs, and {1:Clark (1964)} pictured the histopathology of 1 of them. The main biochemical findings were high pyruvate and lactate in the blood and slightly low glucose levels in blood and cerebrospinal fluid. {12:Hommes et al. (1968)}, who studied a family with 3 affected sibs, found absence of pyruvate carboxylase in the liver and concluded that gluconeogenesis was impaired. {2:Clayton et al. (1967)} demonstrated therapeutic benefit of lipoic acid. {20:Montpetit et al. (1971)} pointed out similarity in the distribution and histology of the lesions of SNE to those of Wernicke disease. They tabulated instances of affected sibs and consanguineous parents. {13:Kohlschutter et al. (1978)} reported 2 sisters and a brother born of consanguineous parents.\n\n{11:Gordon et al. (1974)} noted that since oxidation of pyruvate is dependent on a multienzyme complex (the pyruvate dehydrogenase complex), it is likely that a number of apoenzyme and coenzyme deficiencies could lead to this disorder. Whereas {15:Kustermann-Kuhn et al. (1984)} had found that activity of the pyruvate dehydrogenase complex was not deficient in the brain of 3 autopsied cases of Leigh disease, {14:Kretzschmar et al. (1987)} reported a patient with well-documented clinical and biochemical pyruvate dehydrogenase complex deficiency who at postmortem examination was found to have the specific CNS pathologic changes of Leigh disease.\n\n{29:Rutledge et al. (1981)} pointed out that hypertrophic cardiomyopathy (CMH; see {192600}) is a frequent associated finding in Leigh syndrome. 
Of 12 autopsy cases, 7 (including a pair of sibs) had hypertrophic cardiomyopathy, and 4 of these had asymmetric septal hypertrophy. The authors suggested that this feature may be useful in premortem diagnosis.\n\n{33:Van Erven et al. (1987)} reported 4 sibs (1 male, 3 female) of unrelated parents with what the authors considered to be an autosomal recessive juvenile form of Leigh syndrome. They detected no abnormalities of pyruvate metabolism in urine and serum, but all patients had marked elevations of CSF pyruvate and lactate concentrations. Although the affected sibs lived to adulthood, they were severely affected and 1 of them died at age 17 years. The mother had the onset of neurologic signs and symptoms at age 56 years. The authors suggested a defect restricted to the brain."
3,Molecular Genetics,"{7:DiMauro and De Vivo (1996)} reviewed the genetic heterogeneity of Leigh syndrome and noted that multiple defects had been described in association with Leigh syndrome, including mutations in PDHA1, mutations in the mitochondrial MTATP6 gene, and defects in complex IV. Thus, there are at least 3 major causes of Leigh syndrome, each transmitted by a different mode of inheritance: X-linked recessive, mitochondrial, and autosomal recessive.\n\n{27:Rahman et al. (1996)} investigated Leigh syndrome in 67 Australian cases from 56 pedigrees, 35 with a firm diagnosis and 32 with some atypical features. Biochemical or DNA defects were determined in both groups: in 80% of the tightly defined group and 41% of the 'Leigh-like' group. Enzyme defects were found in 29 patients: in respiratory chain complex I in 13, in complex IV in 9, and in the pyruvate dehydrogenase complex (PDHC) in 7. Complex I deficiency (see {252010}) was more common than had previously been recognized. Eleven patients had mitochondrial mutations, including point mutations in the MTATP6 gene (e.g., {516060.0001}) a mutation in the gene encoding mitochondrial transfer RNA-lysine (MTTK) ({590060.0001}), which is common in MERRF syndrome ({545000}), and a mitochondrial deletion. In 6 of the 7 PDHC-deficient patients, mutations were identified in the X-linked E1-alpha subunit of PDHC (PDHA1; {300502}). {27:Rahman et al. (1996)} found no strong correlation between the clinical features and basic defects. Parental consanguinity suggested autosomal recessive inheritance in 2 complex IV-deficient sibships. An assumption of autosomal recessive inheritance would have been wrong in nearly one-half of those in whom a cause was found: 11 of 28 tightly defined and 18 of 41 total patients. The experience illustrated that a specific defect must be identified if reliable genetic counseling is to be provided.\n\n{21:Morris et al. 
(1996)} reviewed the clinical features and biochemical cause of Leigh disease in 66 patients from 60 pedigrees. Biochemical or molecular defects were identified in 50% of the pedigrees, and in 74% of the 19 pedigrees with pathologically confirmed Leigh disease. Mutation in the MTATP6 gene ({516060.0001}) was found in only 2 patients. No correlation was found between the clinical features and etiologies. No defects were identified in the 8 patients with normal lactate concentrations in the cerebrospinal fluid.\n\n{21:Morris et al. (1996)} described complex I deficiency (see {252010}) as an important cause of Leigh syndrome. Identified in 7 of 25 patients, it was the second most common biochemical abnormality after complex IV deficiency.\n\n{5:Dahl (1998)} reviewed mutations of respiratory chain-enzyme genes that cause Leigh syndrome.\n\nIn a review of the mechanisms of mitochondrial respiratory chain diseases, {8:DiMauro and Schon (2003)} diagrammed the defects resulting from mutations in complexes I, II, III, IV, and V, all of which had Leigh syndrome as one of their pathologic consequences.\n\n<Subhead> Associations Pending Confirmation\n\nFor discussion of a possible association between a neurodegenerative disorder with clinical features of Leigh syndrome and variation in the GYG2 gene, see {300198.0001}.\n\nFor discussion of a possible association between Leigh syndrome and variation in the IARS2 gene, see {612801.0002}."
4,Clinical Management,"{18:Ma et al. (2015)} generated genetically corrected pluripotent stem cells (PSCs) from patients with mtDNA disease. Multiple induced pluripotent stem (iPS) cell lines were derived from patients with common heteroplasmic mutations including 3243A-G ({590050.0001}), causing MELAS, and 8993T-G ({516060.0001}) and 13513G-A, implicated in Leigh syndrome. Isogenic MELAS and Leigh syndrome iPS cell lines were generated containing exclusively wildtype or mutant mtDNA through spontaneous segregation of heteroplasmic mtDNA in proliferating fibroblasts. Furthermore, somatic cell nuclear transfer (SCNT) enabled replacement of mutant mtDNA from homoplasmic 8993T-G fibroblasts to generate corrected Leigh-NT1 PSCs. Although Leigh-NT1 PSCs contained donor oocyte wildtype mtDNA (human haplotype D4a) that differed from Leigh syndrome patient haplotype (F1a) at a total of 47 nucleotide sites, Leigh-NT1 cells displayed transcriptomic profiles similar to those in embryo-derived PSCs carrying wildtype mtDNA, indicative of normal nuclear-to-mitochondrial interactions. Moreover, genetically rescued patient PSCs displayed normal metabolic function compared to impaired oxygen consumption and ATP production observed in mutant cells. {18:Ma et al. (2015)} concluded that both reprogramming approaches offer complementary strategies for derivation of PSCs containing exclusively wildtype mtDNA, through spontaneous segregation of heteroplasmic mtDNA in individual iPS cell lines or mitochondrial replacement by SCNT in homoplasmic mtDNA-based disease."
5,History,"Denis Leigh was a registrar in the Department of Neuropathology, Institute of Psychiatry, Maudsley Hospital, London, at the time he described this condition and named it subacute necrotizing encephalomyelopathy, or SNE ({17:Leigh, 1951}). He pronounced his name 'Lee,' not 'Lay'({19:McHugh, 1993}).\n\nIt was originally suggested that the biochemical defect in Leigh syndrome was a block in thiamine metabolism. {3,4:Cooper et al. (1969, 1970)} found that patients with SNE elaborate a factor, found in the blood and urine, that inhibits the synthesis of thiamine triphosphate (TTP) in brain tissue. The enzyme responsible for TTP synthesis is called thiamine pyrophosphate-adenosine triphosphate phosphoryl transferase. TTP was completely absent in postmortem brain. They suggested that an assay for the inhibitor of TTP synthesis could be performed on urine or blood for diagnostic purposes. In the urine of obligatory or presumptive heterozygotes, {23:Murphy (1973)} found an inhibitor of thiamine triphosphate synthesis in vitro. {25:Pincus et al. (1969)} had described the inhibitor in untreated patients. Thiamine derivatives in therapy were studied by {24:Pincus et al. (1973)}. By direct examination of amniotic fluid for the inhibitor of TTP synthesis, {22:Murphy et al. (1975)} suggested that Leigh syndrome could probably be diagnosed antenatally.\n\n{26:Plaitakis et al. (1980)} studied the family of a patient who died at age 21 years. The patient came from an isolated Greek island with a population of 1,200. Studies of the family showed inhibitor of adenosine triphosphate-thiamine diphosphate phosphoryltransferase in several members of the family and many of these had a chronic neurologic illness compatible with Leigh disease. Several sibships had more than 1 affected member and the parents were demonstrably consanguineous in several instances."
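
The section text in the table above still carries OMIM's inline markup: numbered citations such as `{16:Lake et al., 2015}`, MIM numbers such as `{252010}`, and `<Subhead>` markers. If that makes the text hard to read, a small helper can strip it. The sketch below is only one possible approach; the function name `clean_omim_text` and the choice to prefix MIM numbers with "MIM" are assumptions made for illustration and are not part of the notebook or of any OMIM library.

```python
import re


def clean_omim_text(text: str) -> str:
    """Strip OMIM inline markup from section text (hypothetical helper)."""
    # {16:Lake et al., 2015} or {3,4:Cooper et al. (1969, 1970)} -> keep only the citation
    text = re.sub(r"\{[\d,]+:([^{}]*)\}", r"\1", text)
    # {252010} or {516060.0001} -> "MIM 252010" / "MIM 516060.0001"
    text = re.sub(r"\{(\d+(?:\.\d+)?)\}", r"MIM \1", text)
    # Drop the <Subhead> marker but keep the subheading text that follows it
    text = re.sub(r"<Subhead>\s*", "", text)
    return text


# Short sample built from phrases in the Description section
sample = (
    "increased lactate (summary by {16:Lake et al., 2015}); "
    "see complex I deficiency ({252010})."
)
print(clean_omim_text(sample))
```

In the notebook, the same helper could be applied to a whole column, for example `data4["textSectionContent"].map(clean_omim_text)`, assuming `data4` has been created by the cell above.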


Read the **Description** section. 
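
If scrolling through the full table is awkward, one section can be pulled out and printed on its own. The following is a minimal sketch, assuming a table with the same two column names as `refdata4.csv`; the helper `get_section` and the stand-in table `demo` are hypothetical and exist only so the example runs by itself.

```python
import textwrap

import pandas as pd


def get_section(table: pd.DataFrame, title: str) -> str:
    """Return the content of the first section whose title equals `title`."""
    match = table.loc[table["textSectionTitle"] == title, "textSectionContent"]
    if match.empty:
        raise KeyError(f"No section titled {title!r}")
    return match.iloc[0]


# Stand-in table with one sentence per section, copied from the output above.
# In the notebook you would pass data4 instead of demo.
demo = pd.DataFrame({
    "textSectionTitle": ["Text", "Description"],
    "textSectionContent": [
        "Leigh syndrome is a presentation of numerous genetic disorders "
        "resulting from defects in the mitochondrial OXPHOS complex.",
        "Leigh syndrome is a clinically and genetically heterogeneous disorder "
        "resulting from defective mitochondrial energy generation.",
    ],
})

# Wrap the text at 80 characters so it is easier to read
print(textwrap.fill(get_section(demo, "Description"), width=80))
```

Looking a section up by title rather than by row number keeps the call valid even if OMIM adds or reorders sections in the entry.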

## Answer the following questions:
Input your answer in the cell below each question and press SHIFT+ENTER.

4. Besides having high levels of pyruvate, what observations are important for diagnosing Leigh Syndrome?


Answer here

5. How can the same disease be caused by different mutations?




Answer here