# Convert table of questions to Markdown

This notebook can be used to transform a CSV file containing questions and answers into a Markdown file. The table should have four columns: ```Category name```, ```Question```, ```Answer```, ```Similar questions```. The questions and answers will be grouped according to their category. If several questions are similar, the additional questions can be added in the ```Similar questions``` column, separated by ```; ``` (don't forget the space after ```;```).

To generate the CSV file, you can export a Google Sheet using File->Download->Comma-separated values (.csv). An example of such a file can be found [here](https://docs.google.com/spreadsheets/d/1aL77bMOsQeYoWBA8_jZx_mu13KeL_bgJHNVMOWPHHHA/edit?usp=sharing).
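
For a quick local test without Google Sheets, a small input file with the expected layout can also be built directly with pandas. This is only a sketch: the file name `questions_sample.csv` and the rows are made up for illustration, and only the four column names come from the description above.

```python
import pandas as pd

# Hypothetical sample rows; only the column names are taken from the notebook.
sample = pd.DataFrame({
    'Category name': ['Category1', 'Category2'],
    'Question': ['Is this my first question?', 'Is this my third question?'],
    'Answer': ['Yes it is.', 'Yes it is.'],
    # several similar questions are separated by '; ' (semicolon + space)
    'Similar questions': [None, 'I have a similar question?; I also have a similar question?'],
})

# index=False keeps the row index out of the file, so it has exactly four columns
sample.to_csv('questions_sample.csv', index=False)

# reading it back: the empty 'Similar questions' cell comes back as NaN
check = pd.read_csv('questions_sample.csv')
```

Pointing `pd.read_csv` in the cells below at this file instead of `questions.csv` should be enough to try the notebook end to end.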

In [1]:
import pandas as pd
import numpy as np
from IPython.display import display, Markdown

In [2]:
# import CSV. Use the appropriate path
questions = pd.read_csv('questions.csv')

# group questions by topic 
groups = questions.groupby('Category name')

In [3]:
# create a list of topics that should come first, in this order
ordered_topics = ['Category3', 'Category1']

# add any missing topic at the end of the list
final_list = ordered_topics + [x for x in questions['Category name'].dropna().unique() if x not in ordered_topics]

# keep only headers that are actually used
final_list = [f for f in final_list if f in groups.groups.keys()]

In [4]:
# add a specific suffix to tags, in case the same headers are used in multiple posts
suffix = '_2'
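
The anchors built in the next cell are simply the lower-cased category name plus `suffix`, so a category containing spaces or punctuation would produce an awkward anchor. One possible way to normalise them is sketched below. `make_anchor` is a hypothetical helper, not part of this notebook, and its rule (every run of characters other than letters and digits becomes a single hyphen) is an assumption.

```python
import re

def make_anchor(title, suffix=''):
    """Hypothetical helper: turn a category name into an anchor id."""
    # lower-case, then collapse every run of characters other than a-z and 0-9 into '-'
    slug = re.sub(r'[^a-z0-9]+', '-', title.lower()).strip('-')
    return slug + suffix

simple = make_anchor('Category3', '_2')                  # same as 'Category3'.lower() + '_2'
spaced = make_anchor('Getting started / Setup', '_2')    # spaces and '/' become hyphens
```

For plain names such as `Category3` the result matches what the notebook already generates, so swapping it in would only change anchors for names with spaces or punctuation.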

In [5]:
# create content
mdstring = '## Table of contents  \n'
for g in final_list:
    mdstring +="* <a href=\"#" + g.lower() + suffix + "\">" + g + "</a>  \n"

mdstring += '  \n'
for g in final_list:
    
    g2 = groups.get_group(g)
    mdstring +="## <a name=\"" + g.lower()+ suffix + "\" /></a>"+ g +"  \n"
        
    for q in range(len(g2)):
        row = g2.iloc[q]
        # skip rows without a question or an answer (empty CSV cells are read as NaN)
        if pd.notna(row['Question']) and pd.notna(row['Answer']):
            mdstring = mdstring + '**' + row['Question'] + '**  \n  \n'
            
            if not pd.isnull(row['Similar questions']):
                # similar questions are separated by '; '
                additional_q = row['Similar questions'].split('; ')
                for ad in additional_q:
                    mdstring = mdstring + '**' + ad + '**  \n  \n'
                
            mdstring = mdstring + row['Answer'] + '  \n  \n'
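
Splitting on the exact string `'; '` is why the space after the semicolon matters: an entry written as `a;b` stays in one piece. If the input sheet cannot be trusted to follow that rule, a more forgiving split could look like the sketch below. `split_similar` is a hypothetical name, and treating whitespace around `;` as optional is an assumption rather than this notebook's behaviour.

```python
import re

def split_similar(cell):
    """Hypothetical helper: split a 'Similar questions' cell on ';'."""
    # allow any amount of whitespace around the semicolon, then drop empty pieces
    parts = re.split(r'\s*;\s*', cell.strip())
    return [p for p in parts if p]

# the strict split used in the notebook leaves this cell as a single entry
strict = 'I have a similar question?;I also have a similar question?'.split('; ')

# the tolerant split copes with a missing space and a trailing semicolon
tolerant = split_similar('I have a similar question?;I also have a similar question? ;')
```

The trade-off is that a semicolon inside a question would now always be treated as a separator.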
        

In [6]:
display(Markdown(mdstring))

## Table of contents  
* <a href="#category3_2">Category3</a>  
* <a href="#category1_2">Category1</a>  
* <a href="#category2_2">Category2</a>  
  
## <a name="category3_2" /></a>Category3  
**Is this my fourth question?**  
  
Yes it is.  
  
**Is this my seventh question?**  
  
Yes it is.  
  
## <a name="category1_2" /></a>Category1  
**Is this my first question?**  
  
Yes it is.  
  
**Is this my second question?**  
  
Yes it is.  
  
**Is this my fifth question?**  
  
Yes it is.  
  
## <a name="category2_2" /></a>Category2  
**Is this my third question?**  
  
**I have a similar question?**  
  
**I also have a similar question?**  
  
Yes it is.  
  
**Is this my sixth question?**  
  
Yes it is.  
  


In [7]:
# write the generated Markdown to a file
with open('QA.md', 'w') as f:
    f.write(mdstring)