# Meeting planner
This notebook is used by the SICB Program Officer (PO) to manage the development of an annual meeting.
It can be run on Google Colab or on the PO's personal machine, provided that machine is configured to run Python code.

## System configuration
Execute the following cell the first time you use the notebook to install the required packages and libraries.

In [None]:
# The following libraries are needed for preprocessing
import nltk
nltk.download('stopwords')
nltk.download('punkt')
nltk.download('wordnet')
nltk.download('averaged_perceptron_tagger')

# The following is needed to run GPT-4
# ! pip install openai # This version works on my local machine
! pip install openai==0.28 # This version works with Google Colab

## Preliminary items

You'll have to execute this cell each time you work with the notebook because it defines the paths and imports the essential packages.
The first time you run it, adjust the paths for data_root and code_root to match your system.
Also, make sure that the abstracts downloaded from X-CD are stored in the data_root directory.
The data_root directory should also contain [keywords.xlsx](), which lists the initial keywords to be used for the GPT ratings.
A template for [keywords.xlsx]() should be included in the conference_planner repository.
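
Before running the cell, it can help to confirm that both input files are where the notebook expects them. The following is a minimal sketch, assuming the abstracts file is named `<abstract_filename>.xlsx` and sits directly in data_root alongside keywords.xlsx; the helper `missing_inputs` is hypothetical and not part of the conference_planner code.

```python
from pathlib import Path


def missing_inputs(data_root, abstract_filename):
    """Return the names of expected input files that are absent from data_root.

    Hypothetical helper: assumes both files live directly inside data_root.
    """
    root = Path(data_root)
    expected = [root / f"{abstract_filename}.xlsx", root / "keywords.xlsx"]
    return [p.name for p in expected if not p.is_file()]
```

An empty list from `missing_inputs(data_root, abstract_filename)` suggests the folder is ready; otherwise the returned names show which files still need to be copied in.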

In [None]:
# Import outside packages
import os, sys

# Mount Google Drive, if running on Google Colab
if 'COLAB_GPU' in os.environ:
    # Mount Google Drive
    from google.colab import drive
    drive.mount('/content/drive')

    # Set the data root to a Google Drive folder
    data_root = '/content/drive/MyDrive/meeting_planning_2024'
    code_root = '/content/drive/MyDrive/Colab Notebooks/conference_planner'

    # Add code to path
    sys.path.append(code_root)

# If running locally, set the data root
else:
    data_root = '/Users/mmchenry/Documents/Projects/meeting_planner_test'

# Import conference_planner code
import make_sessions as ms
import preprocessing as pp

# Abstract data file, without its extension, saved at data_root (an xlsx file, downloaded from X-CD)
abstract_filename = 'abstracts_123852'

# Check the directory structure and create if necessary
pp.setup_directories(data_root, abstract_filename)

## Preprocess the abstracts
This section flags and filters out duplicate and otherwise problematic abstract submissions.
It only needs to be run once, when the abstracts come in.
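
To illustrate what duplicate flagging of this kind can look like, here is a minimal sketch in plain Python. It is not the implementation in `preprocessing`; the function `flag_repeats` and the record keys `'id'` and `'title'` are assumptions made for the example.

```python
def normalize_title(title):
    """Lowercase and collapse whitespace so near-identical titles compare equal."""
    return " ".join(title.lower().split())


def flag_repeats(records):
    """Return copies of the records with an 'exclude' flag added.

    The first record with a given ID or title is kept; any later record that
    repeats either one is flagged. The keys 'id' and 'title' are hypothetical.
    """
    seen_ids, seen_titles = set(), set()
    flagged = []
    for rec in records:
        title = normalize_title(rec["title"])
        is_repeat = rec["id"] in seen_ids or title in seen_titles
        seen_ids.add(rec["id"])
        seen_titles.add(title)
        flagged.append({**rec, "exclude": is_repeat})
    return flagged
```

Keeping the first occurrence and flagging later ones is only one possible policy; a real workflow might instead keep the most recent submission.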

In [None]:
# Adds columns to the abstracts data, renames some columns, and saves the result as abstracts_revised.xlsx
pp.process_abstracts(data_root, abstract_filename)

# Flag abstracts to exclude, due to duplicate titles or IDs.
# Save abstracts_revised.xlsx with the column 'exclude' added.
# Also save abstracts_excluded.xlsx with the excluded abstracts.
pp.flag_duplicates(data_root)

# Identify authors that submitted multiple abstracts. Save list to 'duplicate_primary_contacts.xlsx'
pp.find_duplicate_authors(data_root)

## Create divisional files
This distributes the abstract and keyword files to the divisional directories.
Be sure to set the edit permissions of each directory so that the corresponding DPOs can edit these files.
The DPOs can then edit the keywords, if they like, before GPT generates its ratings.
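
The idea of distributing files to divisional directories can be sketched as follows. This is an illustration only, not the code in `preprocessing`: the function `distribute_by_division`, the `'division'` key, and the output filename are all assumptions.

```python
import csv
from collections import defaultdict
from pathlib import Path


def distribute_by_division(rows, out_root, filename="abstracts.csv"):
    """Write each division's rows to out_root/<division>/<filename>.

    Hypothetical helper: 'rows' is a list of dicts that share the same keys,
    one of which is 'division'. Returns a dict mapping division to file path.
    """
    groups = defaultdict(list)
    for row in rows:
        groups[row["division"]].append(row)

    written = {}
    for division, members in groups.items():
        folder = Path(out_root) / division
        folder.mkdir(parents=True, exist_ok=True)
        path = folder / filename
        with open(path, "w", newline="") as f:
            writer = csv.DictWriter(f, fieldnames=list(members[0].keys()))
            writer.writeheader()
            writer.writerows(members)
        written[division] = path
    return written
```

One folder per division makes it straightforward to grant each DPO edit access to only their own files.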

In [None]:
# Load each type of abstract, save each type of presentation in separate csv files
pp.distribute_abstracts(data_root)

# Creates a csv file for recording the keyword ratings produced by GPT
pp.setup_ratings(data_root)

# Creates XLSX files for each division to adjust the weights for each keyword
pp.setup_weights(data_root)

## Running GPT-4

Once the DPOs have approved their keywords, the cell below uses GPT-4 to rate how well each keyword characterizes each abstract.
The ratings are provided on a scale from 0 to 1.
Note that this step costs money, so you will ideally run this only once.

This cell uses the [OpenAI API](https://openai.com/blog/openai-api) and so requires an account.
The API key for the account should be stored in a text file that is kept private, as anyone who has the key could submit queries to the OpenAI servers, which incurs charges on the account.
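
As a minimal sketch of handling the key file, the helpers below read a key from a text file and restrict the file so that only its owner can read it. Both `read_api_key` and `restrict_to_owner` are hypothetical names, not functions from `gpt_work`, and the permission change is only meaningful on macOS and Linux.

```python
import os
from pathlib import Path


def read_api_key(path):
    """Read a key from a text file, ignoring surrounding whitespace."""
    key = Path(path).expanduser().read_text(encoding="utf-8").strip()
    if not key:
        raise ValueError(f"No key found in {path}")
    return key


def restrict_to_owner(path):
    """Make the file readable and writable by its owner only (POSIX systems)."""
    os.chmod(Path(path).expanduser(), 0o600)
```

Keeping the key in a file outside the shared Google Drive folders and outside the code repository reduces the chance of it being shared by accident.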

Note that if the code fails (e.g., the OpenAI account runs out of money or the servers are down), then you can simply restart it and it will pick up from where it left off.
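
The restart behavior described above can be sketched as a loop that records each result as soon as it arrives and skips anything already recorded. This is an illustration under stated assumptions, not the implementation in `gpt_work`: the names `load_done` and `rate_all`, the two-column CSV layout, and the `rate_fn` callback (a stand-in for the actual GPT request) are all hypothetical.

```python
import csv
import os


def load_done(ratings_path):
    """Return the set of abstract IDs already recorded in the ratings file."""
    if not os.path.exists(ratings_path):
        return set()
    with open(ratings_path, newline="") as f:
        return {row["id"] for row in csv.DictReader(f)}


def rate_all(abstracts, ratings_path, rate_fn, max_attempts=5):
    """Rate each (id, text) pair not yet in ratings_path, saving as it goes.

    rate_fn is a stand-in for the request to the model. Each abstract is
    retried up to max_attempts times before the error is re-raised.
    """
    done = load_done(ratings_path)
    is_new = not os.path.exists(ratings_path) or os.path.getsize(ratings_path) == 0
    with open(ratings_path, "a", newline="") as f:
        writer = csv.writer(f)
        if is_new:
            writer.writerow(["id", "rating"])
        for abstract_id, text in abstracts:
            if str(abstract_id) in done:
                continue  # Already rated in an earlier run
            for attempt in range(1, max_attempts + 1):
                try:
                    rating = rate_fn(text)
                    break
                except Exception:
                    if attempt == max_attempts:
                        raise
            writer.writerow([abstract_id, rating])
            f.flush()  # Persist each result so a crash loses nothing
```

Because results are appended and flushed one at a time, rerunning `rate_all` after a failure only pays for the abstracts that were not yet rated.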

In [None]:
import gpt_work as gpt

# Path to the text file containing the OpenAI API key
path_to_API_key = '/Users/mmchenry/Documents/code/openai_api_key.txt'

# Run GPT to generate keyword ratings
gpt.analyze_abstracts(data_root, path_to_API_key, max_attempts=5)

After GPT has generated keyword ratings for each abstract, the DPOs can use [meeting_planner](meeting_planner.ipynb) to adjust their sessions.
It would be a good idea to download a copy of all folders and files as a backup. 
If any of the GPT ratings files are accidentally deleted, then you will want to be able to restore them without having to re-run GPT-4.
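
One way to make such a backup is to zip the whole data folder into a dated archive. The following is a minimal sketch using the standard library; `backup_folder` is a hypothetical helper, and the backup location should be outside data_root so that the archive does not end up inside the folder being archived.

```python
import shutil
from datetime import date
from pathlib import Path


def backup_folder(data_root, backup_dir):
    """Zip everything under data_root into backup_dir and return the archive path.

    Hypothetical helper: backup_dir should not be inside data_root.
    """
    data_root = Path(data_root)
    backup_dir = Path(backup_dir)
    backup_dir.mkdir(parents=True, exist_ok=True)
    # Archive name combines the folder name and today's date, e.g. name_20240131.zip
    base = backup_dir / f"{data_root.name}_{date.today():%Y%m%d}"
    return shutil.make_archive(str(base), "zip", root_dir=str(data_root))
```

Running this right after the GPT ratings finish gives a restore point that does not depend on Google Drive's own version history.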