# Meeting planner
This notebook is used by the SICB Program Officer to manage the development of an annual meeting.

## Preliminary items

You'll have to execute this cell each time you work with the notebook because it defines the paths and imports the essential packages.
The first time you run it, make sure that the abstracts downloaded from X-CD are stored in the `data_root` directory.
The `data_root` directory should also contain [keywords.xlsx](), which lists the initial keywords to be used for the GPT ratings.

In [None]:
# Import outside packages
import os, sys

# Mount Google Drive, if running on Google Colab
if 'COLAB_GPU' in os.environ:
    # Mount Google Drive
    from google.colab import drive
    drive.mount('/content/drive')

    # Set the data root to a Google Drive folder
    data_root = '/content/drive/MyDrive/meeting_planning_2024'
    code_root = '/content/drive/MyDrive/Colab Notebooks/conference_planner'

    # Add code to path
    sys.path.append(code_root)
    # os.chdir(data_root)

# If running locally, set the data root
else:
    data_root = '/Users/mmchenry/Documents/Projects/meeting_planner_2024'

# Import conference_planner code
import make_sessions as ms
import preprocessing as pp

# Abstract data file, without its extension, saved at data_root (an xlsx file, downloaded from X-CD)
abstract_filename = 'abstracts_123852'

# Check the directory structure and create if necessary
pp.setup_directories(data_root, abstract_filename)
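
Before moving on, it can help to confirm that the input files described above are where the notebook expects them. The following is a minimal sketch and not part of the `preprocessing` module: `missing_inputs` is a hypothetical helper, and it assumes that the abstracts file is named `<abstract_filename>.xlsx` and that `keywords.xlsx` sits directly inside `data_root`.

```python
from pathlib import Path


def missing_inputs(data_root, abstract_filename):
    """Return the expected input files that are absent from data_root.

    Hypothetical helper: the file names and locations are assumptions
    based on the notes above, not on the preprocessing module itself.
    """
    root = Path(data_root)
    expected = [root / f"{abstract_filename}.xlsx", root / "keywords.xlsx"]
    # Keep only the paths that do not point to an existing file
    return [path for path in expected if not path.is_file()]
```

Calling `missing_inputs(data_root, abstract_filename)` returns an empty list when both files are present, and otherwise the paths that still need to be added.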

## Preprocess the abstracts
This section flags and filters out duplicate and otherwise problematic abstract submissions.
It only needs to be run once, when the abstracts come in.

In [None]:
# Adds columns to the abstracts data, renames some columns, and saves the result as abstracts_revised.xlsx
pp.process_abstracts(data_root, abstract_filename)

# Flag abstracts to exclude, due to duplicate titles or IDs. Save abstracts_revised.xlsx with the column 'exclude' added. Also save abstracts_excluded.xlsx with the excluded abstracts.
pp.flag_duplicates(data_root)

# Identify authors that submitted multiple abstracts. Save list to 'duplicate_primary_contacts.xlsx'
pp.find_duplicate_authors(data_root)
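
To illustrate the idea behind flagging duplicates by title or ID, here is a minimal sketch. It is not the code in `pp.flag_duplicates`: `flag_duplicate_rows` is a hypothetical function, the `id` and `title` column names are assumptions, and keeping the first occurrence while flagging later ones is one possible policy among several.

```python
def flag_duplicate_rows(rows, keys=("id", "title")):
    """Return one boolean per row; True marks a repeat of an earlier row.

    Hypothetical sketch: a row is flagged when any of its key values
    (assumed column names) has already been seen in an earlier row.
    """
    seen = {key: set() for key in keys}
    flags = []
    for row in rows:
        # Normalize so case and extra whitespace do not hide a duplicate
        values = {key: " ".join(str(row[key]).lower().split()) for key in keys}
        flags.append(any(values[key] in seen[key] for key in keys))
        for key in keys:
            seen[key].add(values[key])
    return flags
```

Each flag could then be stored in an `exclude` column, which matches the column name mentioned in the comments above.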

## Create divisional files
This distributes the abstract and keyword files to the divisional directories.
Be sure to set the edit permissions of each directory so that the corresponding DPOs can edit these files.
The DPOs can then edit the keywords, if they like, before the GPT ratings are generated.

In [None]:
# Load each type of abstract and save each presentation type in a separate csv file
pp.distribute_abstracts(data_root)

# Create a csv file for recording the keyword ratings generated by GPT
pp.setup_ratings(data_root)
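
The distribution step can be pictured as grouping rows by division and presentation type, then writing one csv file per group. The sketch below is only an illustration of that idea, not the implementation of `pp.distribute_abstracts`: `split_by_division`, the `division` and `type` column names, and the `<division>/<type>.csv` layout are all assumptions.

```python
import csv
from collections import defaultdict
from pathlib import Path


def split_by_division(rows, out_root, division_key="division", type_key="type"):
    """Write one csv per (division, presentation type); return the paths.

    Hypothetical sketch: assumes every row is a dict with the same keys.
    """
    groups = defaultdict(list)
    for row in rows:
        groups[(row[division_key], row[type_key])].append(row)

    written = []
    for (division, ptype), members in sorted(groups.items()):
        div_dir = Path(out_root) / division
        div_dir.mkdir(parents=True, exist_ok=True)
        path = div_dir / f"{ptype}.csv"
        with open(path, "w", newline="", encoding="utf-8") as f:
            writer = csv.DictWriter(f, fieldnames=list(members[0].keys()))
            writer.writeheader()
            writer.writerows(members)
        written.append(path)
    return written
```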

## Setting up GPT-4
Once the DPOs have approved their keywords, this cell uses GPT-4 to rate how well each keyword characterizes each abstract.
The ratings are provided on a scale from 0 to 1.
Note that this step costs money, so ideally you will run it only once.

This cell uses the [OpenAI API](https://openai.com/blog/openai-api) and so requires an account. The API key should be protected, because anyone who has it can send queries to the OpenAI servers that are charged to your account.
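
One common way to protect the key is to keep it in a plain text file outside the notebook and any shared folder, and to read it at run time. The sketch below shows that pattern under stated assumptions: `read_api_key` is a hypothetical helper, and how `gpt_work` actually loads the key is not shown in this notebook.

```python
from pathlib import Path


def read_api_key(path):
    """Read an API key from a text file, ignoring surrounding whitespace.

    Hypothetical helper: assumes the file holds only the key.
    """
    key = Path(path).read_text(encoding="utf-8").strip()
    if not key:
        raise ValueError(f"No API key found in {path}")
    return key
```

Keeping the key in a separate file means the notebook can be shared or backed up without exposing it.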

You will also need to install the OpenAI package to connect with GPT-4.
You need to run the following cell only once:

In [None]:
! pip install openai

## Running GPT-4
Next, you'll run the following cell to get GPT-4's ratings of keywords for each abstract. 
If the code fails (e.g., the OpenAI account runs out of money or the servers are down), then you can simply restart it and it will pick up from where it left off.

In [4]:
import gpt_work as gpt

# Path to the text file containing the OpenAI API key
path_to_API_key = '/Users/mmchenry/Documents/code/openai_api_key.txt'

# Run GPT to generate keyword ratings
gpt.analyze_abstracts(data_root, path_to_API_key, max_attempts=5)

Processing talks_ratings for division edu ...
   abstract 1 completed. 23 remaining.
   abstract 2 completed. 22 remaining.
   abstract 3 completed. 21 remaining.
   abstract 4 completed. 20 remaining.
   abstract 5 completed. 19 remaining.
   abstract 6 completed. 18 remaining.
   abstract 7 completed. 17 remaining.
   abstract 8 completed. 16 remaining.
   abstract 9 completed. 15 remaining.
   abstract 10 completed. 14 remaining.
   abstract 11 completed. 13 remaining.
   abstract 12 completed. 12 remaining.
   abstract 13 completed. 11 remaining.
   abstract 14 completed. 10 remaining.
   abstract 15 completed. 9 remaining.
   abstract 16 completed. 8 remaining.
   abstract 17 completed. 7 remaining.
   abstract 18 completed. 6 remaining.
   abstract 19 completed. 5 remaining.
   abstract 20 completed. 4 remaining.
   abstract 21 completed. 3 remaining.
   abstract 22 completed. 2 remaining.
   abstract 23 completed. 1 remaining.
   abstract 24 completed. 0 remaining.
Processing ta
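
The restart behavior described above can be pictured as a loop that skips abstracts whose ratings were already saved and retries failed requests a limited number of times. This is a minimal sketch of that pattern, not the code in `gpt_work`: `rate_with_resume` and `rate_fn` are hypothetical names, and storing ratings in a JSON file keyed by abstract ID (as strings) is an assumption made for illustration.

```python
import json
from pathlib import Path


def rate_with_resume(abstracts, rate_fn, ratings_path, max_attempts=5):
    """Rate each abstract once, resuming from ratings saved on disk.

    Hypothetical sketch: abstracts maps string IDs to abstract text,
    and rate_fn stands in for the call that queries the model.
    """
    path = Path(ratings_path)
    ratings = json.loads(path.read_text()) if path.exists() else {}

    for abstract_id, text in abstracts.items():
        if abstract_id in ratings:
            continue  # already rated in an earlier run
        for attempt in range(1, max_attempts + 1):
            try:
                ratings[abstract_id] = rate_fn(text)
                break
            except Exception:
                if attempt == max_attempts:
                    raise  # give up; earlier results are already saved
        # Save after every abstract so a crash loses at most one result
        path.write_text(json.dumps(ratings))
    return ratings
```

Because results are written after each abstract, rerunning the function after a failure repeats only the abstracts that were not yet rated, which avoids paying twice for the same query.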

After GPT has generated keyword ratings for each abstract, the DPOs can use [meeting_planner](meeting_planner.ipynb) to adjust their sessions.
It would be a good idea to download a copy of all folders and files as a backup. 
If any of the GPT ratings files are accidentally deleted, then you will want to be able to restore them without having to re-run GPT-4.
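
One way to make such a backup is to zip the whole data folder under a timestamped name. The following is a minimal sketch: `backup_folder` is a hypothetical helper, and it assumes that `backup_dir` lies outside `data_root` so that the archive does not end up inside the folder it is copying.

```python
import shutil
from datetime import datetime
from pathlib import Path


def backup_folder(data_root, backup_dir):
    """Zip data_root into backup_dir and return the archive path.

    Hypothetical helper: backup_dir is assumed to be outside data_root.
    """
    stamp = datetime.now().strftime("%Y%m%d_%H%M%S")
    Path(backup_dir).mkdir(parents=True, exist_ok=True)
    base = Path(backup_dir) / f"{Path(data_root).name}_{stamp}"
    # make_archive appends the .zip extension and returns the full path
    return shutil.make_archive(str(base), "zip", root_dir=data_root)
```

Restoring a deleted ratings file is then a matter of extracting it from the most recent archive.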


## Create draft program
The draft program can be generated at any time after the GPT ratings have been generated.