#  Conversion of spreadsheet of abstracts to abstracts booklet

## Philip Machanick
### 18 June 2020

Free to use under terms of the MIT license: https://opensource.org/licenses/MIT – copyright &copy; Philip Machanick 2020

## Input

The data for the abstract booklet is assumed to be in a spreadsheet structured as follows (“|” separates fields):

`# | track # | track name | title | authors | submitted | last updated | form fields | keywords | decision | notified | reviews sent | abstract | day | time`

This is the layout created by EasyChair, with the addition of fields for conference `day` and `time`. You also need to sort the rows in order of presentation.

Only abstracts with the exact text “`accept`” in the `decision` field are included.
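
Because the match is exact, variants such as `Accept` or `accept ` with a trailing space are skipped. The sketch below illustrates that behaviour on made-up rows; the sample data and the helper name `is_accepted` are hypothetical and not part of the notebook.

```python
# Hypothetical sample rows; only the 'decision' field matters here.
rows = [
    {'title': 'Paper A', 'decision': 'accept'},
    {'title': 'Paper B', 'decision': 'Accept'},   # different case: skipped
    {'title': 'Paper C', 'decision': 'accept '},  # trailing space: skipped
    {'title': 'Paper D', 'decision': 'reject'},
]

def is_accepted(row):
    """Return True only when the decision is exactly the text 'accept'."""
    return row['decision'] == 'accept'

accepted_titles = [row['title'] for row in rows if is_accepted(row)]
print(accepted_titles)
```

If abstracts you expected are missing from the booklet, stray capitals or spaces in the decision column are worth checking first.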

## Output
Output is a LaTeX file; to access it, use the **File** menu and select *Open* – the file name is as set in the initialization code in the cell that starts with the comment
`# setup for this conference`
(variable `texfile`).

You can also send the output to the screen by making the file name the empty string. See the comment line starting with
`# if you want the final LaTeX to go to a file`

Save the file locally if running off the repository: binders time out relatively quickly.

## Usage hints
The time field is kept as text to avoid converting it to a NumPy time format. Sort the spreadsheet by date and time to get the abstracts in the correct order; this code does not do so.
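
If sorting in the spreadsheet is inconvenient, the ordering could be sketched in Python instead. This is an illustration only, not something the notebook does: it assumes the day is a number stored as text (possibly with a trailing `.0`) and the time is text in 24-hour `HH:MM` form, and the name `presentation_key` and the sample rows are hypothetical.

```python
from datetime import datetime

def presentation_key(row):
    """Sort key: day of month first, then start time (assumed 'HH:MM' text)."""
    day = int(float(row['day']))  # accepts '6' as well as '6.0'
    start = datetime.strptime(row['time'].strip(), '%H:%M').time()
    return (day, start)

# Hypothetical rows in arbitrary order.
rows = [
    {'title': 'C', 'day': '7.0', 'time': '09:00'},
    {'title': 'A', 'day': '6', 'time': '14:30'},
    {'title': 'B', 'day': '6', 'time': '9:15'},
]

ordered = sorted(rows, key=presentation_key)
titles = [row['title'] for row in ordered]
print(titles)
```

Parsing the time rather than comparing the raw text matters because, as plain strings, `'9:15'` would sort after `'14:30'`.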

It is worth checking for stray hyphens as some authors copy and paste from the PDF of their paper’s abstract and don’t notice that they have left in hyphens that should only be there for a line break.
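
A rough way to find candidates for review is to search for a hyphen followed by whitespace, which is how a line-break hyphen usually looks after pasting. The pattern and the name `find_stray_hyphens` below are assumptions made for illustration; the sketch only flags text for a human to check and does not change anything.

```python
import re

# A word fragment, a hyphen, whitespace, then a lowercase continuation.
STRAY_HYPHEN = re.compile(r'[A-Za-z]+-\s+[a-z]+')

def find_stray_hyphens(text):
    """Return fragments that look like words split by a line-break hyphen."""
    return STRAY_HYPHEN.findall(text)

sample = 'We evaluate the implemen- tation on well-known benchmarks.'
print(find_stray_hyphens(sample))
```

Legitimate constructions such as “pre- and post-processing” are flagged too, so each hit needs a manual decision rather than an automatic fix.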

Note also that LaTeX in Unicode (UTF-8) mode should be able to handle extended character sets, but something can be lost in translation; for example, saving a text file may not reliably record the character encoding. To be safe, in the Excel file systematically change all:

* “ –> `` (open double quote)
* ” –> '' (close double quote)
* ‘ –> ` (open single quote)
* ’ –> ' (close single quote)
* – –> -- (en dash)
* — –> --- (em dash)
* ﬁ –> fi (fi ligature)
* ﬂ –> fl (fl ligature)
* ﬀ –> ff (ff ligature) … and others like that

Accented characters should also be searched for and replaced: a table of them can be found here: https://en.wikibooks.org/wiki/LaTeX/Special_Characters (the enclosing { } is not strictly necessary in the main body of the document but is in BibTeX; for example, you can write \'e to get é and \'{e} should give the same result). You may also need to search for special symbols (degrees, Greek letters, mathematical notation, etc.).

This could be coded easily in Python (see the use of `replace` in the code below), but the benefit of doing it in the spreadsheet is that you are more inclined to check the result. To be sure you do this right, copy and paste the before and after characters into the Excel search and replace box.
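
For anyone who prefers the Python route, a minimal sketch of the substitutions listed above follows. The names `REPLACEMENTS`, `to_latex_ascii` and `remaining_non_ascii` are hypothetical, and the table covers only the characters in the list; accented letters and special symbols are reported rather than converted, so they still get a manual check.

```python
# Typographic characters from the list above and their LaTeX-friendly forms.
REPLACEMENTS = {
    '\u201c': '``',   # open double quote
    '\u201d': "''",   # close double quote
    '\u2018': '`',    # open single quote
    '\u2019': "'",    # close single quote
    '\u2013': '--',   # en dash
    '\u2014': '---',  # em dash
    '\ufb01': 'fi',   # fi ligature
    '\ufb02': 'fl',   # fl ligature
    '\ufb00': 'ff',   # ff ligature
}

def to_latex_ascii(text):
    """Apply every substitution in REPLACEMENTS to the text."""
    for original, replacement in REPLACEMENTS.items():
        text = text.replace(original, replacement)
    return text

def remaining_non_ascii(text):
    """List the non-ASCII characters still present, for manual review."""
    return sorted({ch for ch in text if ord(ch) > 127})

cleaned = to_latex_ascii('\u201c\ufb01ne\u201d \u2013 caf\u00e9')
print(cleaned)
print(remaining_non_ascii(cleaned))
```

Printing what remains keeps some of the checking benefit of working in the spreadsheet: anything left over still needs a decision.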

Note: if the position of the times on the abstracts is garbled, run LaTeX again (up to three runs may be required).

### Source
LaTeX template based on https://www.overleaf.com/latex/examples/a-basic-conference-abstract-booklet/tkjfcvzgjrnd

In [7]:
%%capture
## initialize state for this notebook
import os
owd = os.getcwd()
os.chdir('../')
%run setup.py install
os.chdir(owd)

In [8]:
import xlrd
import numpy as np
from datetime import time

In [9]:
# setup for this conference -- all changes for a new event go here
filename     = 'EasyChairDownload_columns_added_sort.xlsx' # EDIT TO NAME OF YOUR ABSTRACT SPREADSHEET

# if you want the final LaTeX to go to a file, you can access it via Open in the File menu once done
# comment out one of the next 2 lines depending whether you want to save the LaTeX file or send output to the notebook
texfile      = 'conf2020Abstracts.tex'  # change this to the correct conference file name
#texfile      = '' # empty string: nothing is saved and the output appears at the end of the page

conferencename = "49th Annual Conference of the Latin Nostalgics Association"
conferenceauthor = 'LNA 2020' # needed for LaTeX
conferencedate = '6--9 July 2020'
conferencemonth = 'July'        # empty string if month included in spreadsheet entries

#################### generic setup – things to change end here ####################

pathseparator = '/'             # change if not a Unix-style path
filepath     = 'data'  # path to spreadsheet relative to root directory of notebook
preamblefile = 'AbstractsOpening.tex'

In [10]:
# Extract each abstract and build LaTeX
# Check here what the field names are and that they match code later

# ideas from https://www.geeksforgeeks.org/reading-excel-file-using-python/
# load abstract data

# note: xlrd 2.0 and later read only .xls files, so an .xlsx spreadsheet needs xlrd < 2.0
my_abstracts = xlrd.open_workbook(filepath + pathseparator + filename)
sheet = my_abstracts.sheet_by_index(0)

# read the LaTeX preamble; the with block closes the file afterwards
with open(filepath + pathseparator + preamblefile, 'r') as f:
    preamble = f.read()
preamble = preamble.replace('##CONFERENCENAME##', conferencename)
preamble = preamble.replace('##AUTHORNAME##', '')
preamble = preamble.replace('##DATE##', conferencedate)

# the replace below is commented out so the booklet stays one-sided: works best for computer display
# if you want to print the booklet two-sided, uncomment the following line

# preamble = preamble.replace(',oneside', '')


# https://stackoverflow.com/questions/26951538/python-using-xlrd-to-obtain-column-heading-and-using-a-loop-to-create-variables
headers = [str(cell.value) for cell in sheet.row(0)]

# check the Excel column header names as they must match field names below
print (headers)

arr = []
for rowind in range(1, sheet.nrows):  # skip row 0, which holds the headers
    arr.append([cell.value for cell in sheet.row(rowind)])

# https://machinelearningmastery.com/index-slice-reshape-numpy-arrays-machine-learning-python/
data = np.rec.fromrecords(arr, names=headers)


['#', 'track #', 'track name', 'title', 'authors', 'submitted', 'last updated', 'form fields', 'keywords', 'decision', 'notified', 'reviews sent', 'abstract', 'day', 'time', 'timesort']


In [11]:
openabstract  = '\\begin{conf-abstract}['
closeabstract = '\\end{conf-abstract}'

endabstracts = '\\end{document}'

In [12]:
# print preamble
# note field names used here as indexes -- must match Excel column headings
# field names used: decision day title authors abstract

# if you have affiliation as well in the spreadsheet add that to the placeholder
# otherwise it must be output as empty in { }

outputstring = preamble

# extract each abstract

for a in range(data.shape[0]):
    row = data[a]
    if row['decision'] == 'accept':  # the value accept is assumed to flag abstracts to be used
        dayval = row['day'].replace('.0', '')  # remove '.0' from day values such as '6.0'
        outputstring += openabstract
        outputstring += dayval + ' ' + conferencemonth + '\\\\' + row['time'] + ']' + '\n'
        outputstring += '{' + row['title'] + '}' + '\n'
        outputstring += '{' + row['authors'] + '}' + '\n'
        outputstring += '{' + '}' + '\n'  # placeholder for affiliation: row['affiliation']
        outputstring += '{' + row['abstract'] + '}' + '\n'
        outputstring += closeabstract + '\n'

# complete the LaTeX file
outputstring = outputstring + endabstracts + '\n'

# if you don’t want an output file, the output will appear next
if texfile == '':
    print(outputstring)
else: # if you produce an output file, choose Open from the File menu to find it
    with open(texfile, 'w') as tex_file:  # closes the file even if the write fails
        tex_file.write(outputstring)