# Convert Jama Glossary to LaTeX

1. In Jama, go to the glossary and choose *Export* $\to$ *Excel* to write out `CTA-Glossary.xls`

2. The ID column is not read correctly from the XLS format, so open the file in Excel or Numbers and re-export it as XLSX

3. Read it into a Pandas DataFrame:

Note that `header=3` is a zero-based row index, so the column names are taken from the fourth row of the sheet (the rows before that will be ignored)

In [1]:
import pandas as pd
import numpy as np

In [2]:
glossary = pd.read_excel(
    "CTA-Glossary.xlsx",
    header=3,
    sheet_name='Sheet1',
    usecols=[0, 1, 2, 3, 4, 5],
    converters={'ID': str},  # keep the ID column as text
)

In [3]:
glossary.head()

Unnamed: 0,Modified Date,Last Activity Date,Name,Description,ID,Status
0,09/06/2017,09/06/2017,CTA Constituents,,CTA_-FLD-4,
1,16/05/2018,16/05/2018,CTAO,"The Cherenkov Telescope Array Observatory, an ...",CTA_-GLOS-206,Stable
2,16/05/2018,16/05/2018,CTA North,CTA Observation site hosting an Array of Chere...,CTA_-GLOS-207,Stable
3,16/05/2018,16/05/2018,CTA South,CTA Observation site hosting an Array of Chere...,CTA_-GLOS-208,Stable
4,16/05/2018,16/05/2018,Headquarters,"The primary centre for CTAO governance, admini...",CTA_-GLOS-209,Stable


In [4]:
glossary = glossary.dropna(subset=['Description']) # get rid of undefined terms

In [5]:
glossary_rec = """
\\newglossaryentry{{{name}}}{{
    name={{{name}}}, 
    description={{{description} (\\emph{{{ident}}})}}
}}
"""

acronym_rec = """
\\newacronym{{{abbrev}}}{{{name}}}{{{description} (\\emph{{{ident}}})}}
"""
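
The doubled braces in these templates are `str.format` escapes: `{{` and `}}` produce literal braces, so `{{{name}}}` renders as the value of `name` wrapped in one pair of LaTeX braces. A minimal sketch of that behaviour, using a made-up one-line template and made-up values (neither is taken from the glossary):

```python
# Hypothetical template, only to illustrate the brace escaping.
demo_rec = "\\newacronym{{{abbrev}}}{{{name}}}{{{description}}}"

demo_out = demo_rec.format(
    abbrev="ABC", name="A Basic Concept", description="An illustrative entry."
)
print(demo_out)  # \newacronym{ABC}{A Basic Concept}{An illustrative entry.}
```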

In [7]:
import re

def convert_to_glossary(fullname, description, ident):
    """
    Handle both acronym-like definitions and normal glossary entries. If the
    name contains parentheses, their content is assumed to be the acronym.
    """
    fullname = fullname.strip()
    description = description.strip()
    ident = ident.strip()

    # raw strings keep the regex backslashes from being read as string escapes
    is_acro = re.match(pattern=r'.*(\(.*\)).*', string=fullname)
    if is_acro:  # if it's an acronym
        abbrev = is_acro.group(1)[1:-1]  # strip the surrounding parentheses
        name = re.sub(pattern=r'\(.*\)', repl='', string=fullname).strip()
        return acronym_rec.format(
            name=name, abbrev=abbrev, description=description, ident=ident
        )
    
    # otherwise regular glossary entry
    return glossary_rec.format(name=fullname, description=description, ident=ident)
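
The Jama identifiers contain underscores (e.g. `CTA_-GLOS-346`), and descriptions may contain other characters that LaTeX treats specially, which can cause errors when the generated file is compiled. One possible guard, which is not part of the original notebook, is to escape such characters before formatting. The helper name `escape_latex` and the replacement table below are assumptions made for this sketch:

```python
import re

# Characters with special meaning in LaTeX text mode and a common replacement.
LATEX_SPECIALS = {
    "\\": r"\textbackslash{}",
    "&": r"\&",
    "%": r"\%",
    "$": r"\$",
    "#": r"\#",
    "_": r"\_",
    "{": r"\{",
    "}": r"\}",
    "~": r"\textasciitilde{}",
    "^": r"\textasciicircum{}",
}
_SPECIALS_RE = re.compile("|".join(re.escape(c) for c in LATEX_SPECIALS))

def escape_latex(text):
    """Replace each LaTeX special character in a single pass."""
    # A single regex pass avoids re-escaping the backslashes we insert.
    return _SPECIALS_RE.sub(lambda m: LATEX_SPECIALS[m.group(0)], text)

print(escape_latex("CTA_-GLOS-346"))  # CTA\_-GLOS-346
```

It could be applied to `description` and `ident` inside `convert_to_glossary` before calling `.format`; the output shown at the end of this notebook was generated without it.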
    

In [8]:
with open("cta-glossary-defs.inc", 'w') as outfile:
    for name, description, ident in zip(glossary.Name, glossary.Description, glossary.ID):
        outfile.write(convert_to_glossary(name, description, ident))
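
The `glossaries` package reports an error when an entry label is defined twice, so it may be worth checking for repeated labels before including the file. A small sketch of such a check; `find_duplicates` is a hypothetical helper and the sample labels are made up:

```python
from collections import Counter

def find_duplicates(labels):
    """Return the labels that occur more than once, in first-seen order."""
    counts = Counter(labels)
    return [label for label, n in counts.items() if n > 1]

print(find_duplicates(["OCC", "DET", "OCC", "E-Stop"]))  # ['OCC']
```

Here the labels would be the abbreviation for acronym entries and the full name for regular entries, since that is what the two templates use as the first argument.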


In [11]:
! tail -n 40 cta-glossary-defs.inc

\newacronym{OCC}{Occurrence Ranking}{OCC is a relative numerical scale estimating the probability that the cause, if it occurs, will produce the failure mode and its particular effect. (\emph{CTA_-GLOS-346})}

\newacronym{DET}{Detection Ranking}{DET is a relative numerical scale estimating the effectiveness of the controls to prevent or detect the cause or failure mode before the failure reaches the customer. The assumption is that the cause has occurred. (\emph{CTA_-GLOS-347})}

\newglossaryentry{Emergency situation}{
    name={Emergency situation}, 
    description={An immediately hazardous situation that needs to be ended or averted quickly in order to prevent injury or damage. (\emph{CTA_-GLOS-348})}
}

\newacronym{E-Stop}{Emergency stop}{An E-Stop is a function that is intended to avert harm or to reduce existing hazards to persons, machinery, or work in progress. E-Stop is not a safeguard but is considered to be a complementary protective measure. (\emph{CTA_-GLOS-349})}