In the opening plenary of the ESIP Summer Meeting 2023, one of our exercises was to look at the different barriers to open science and use a sli.do poll to collect ideas for the big things we should do about them. The resulting list was long and overwhelming, so I thought I'd throw it at an LLM with a prompt asking for a synthesis of the most repeated ideas (not counting popularity via up-voting). I use LlamaIndex here with OpenAI's text-davinci-003 model.

In [1]:
import os
import getpass
from llama_index import SimpleDirectoryReader, ListIndex, LLMPredictor, ServiceContext
from langchain import OpenAI
import streamlit as st


In [6]:
# An OpenAI API key is required to call the model. This could be adapted to another framework, but this is quick and dirty.
if not os.environ.get('OPENAI_API_KEY'):
    os.environ['OPENAI_API_KEY'] = getpass.getpass(prompt="Enter your OpenAI API key: ")

In [7]:
DEFAULT_TERM_STR = (
    "Use the lines in the provided documents that do not start with either 'Anonymous' or a number "
    "to provide a synthesis of common ideas between the lines. "
    "Structure your response as a list of the top 5 ideas summarized from the text. "
    "Based on that set of synthesized ideas, show me a secondary list of 5 outliers, "
    "prioritizing those lines with the greatest length of text."
)

# Streamlit text area holding the query; its default value is the synthesis prompt above
term_extract_str = st.text_area("The query used to synthesize the poll responses.", value=DEFAULT_TERM_STR)

# temperature=0 keeps the output as repeatable as possible; max_tokens caps the response length
llm = OpenAI(
    temperature=0,
    model_name="text-davinci-003",
    max_tokens=1024,
)

For the documents, I simply copied and pasted the poll from sli.do into a text file, which I read in here. The prompt includes a filter so the model focuses only on the lines I care about: the poll responses.
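The same filter could also be applied in Python before the text reaches the model, rather than relying on the prompt alone. The sketch below is an illustration, not what this notebook does: `keep_response_lines` is a hypothetical helper, and the sample text assumes a paste layout of an author line, a vote-count line, and then the response. Like the prompt filter, it would also drop any response that happens to start with a digit.

```python
from typing import List


def keep_response_lines(text: str) -> List[str]:
    """Keep non-blank lines that start with neither 'Anonymous' nor a digit."""
    kept = []
    for line in text.splitlines():
        stripped = line.strip()
        if not stripped:
            continue  # skip blank lines
        if stripped.startswith("Anonymous") or stripped[0].isdigit():
            continue  # skip author and vote-count lines
        kept.append(stripped)
    return kept


# Made-up sample in the assumed layout: author, votes, response
sample = "Anonymous\n3\nReduce jargon\nAnonymous\n0\nShare data openly\n"
responses = keep_response_lines(sample)
print(responses)
```

Filtering up front would also shrink the text sent to the model, which matters when the poll is long.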

In [8]:
documents = SimpleDirectoryReader('./data/open_science_poll').load_data()

In [9]:
# Set the context: build a list index over the documents and a query engine that summarizes across all of them
service_context = ServiceContext.from_defaults(llm_predictor=LLMPredictor(llm=llm), chunk_size=1024)
index = ListIndex.from_documents(documents, service_context=service_context)
query_engine = index.as_query_engine(response_mode="tree_summarize")

In [10]:
# Run the prompt
sum_synth = str(query_engine.query(term_extract_str))

The results aren't bad and show some interesting dynamics from the discussion. The synthesized ideas come across as pretty simplistic, but that's maybe what we would expect at this level: lots of basically the same ideas written out quickly without much depth. I also find the outliers kind of interesting. Really, we should have this kind of capacity immediately available behind sli.do or any similar tool: use AI to summarize and synthesize what a big group of human intelligence (HI) comes up with. It would also be interesting to incorporate vote counts and see what adding that dimension does to the results.
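As a rough starting point for the vote-count idea, the counts could be paired with their responses before building the prompt. This is only a sketch under an assumption about the pasted text: that each response is preceded by a line holding its vote count, with an 'Anonymous' author line before that. The helper `parse_votes` and the sample text are hypothetical and are not taken from the actual poll data.

```python
from typing import List, Optional, Tuple


def parse_votes(text: str) -> List[Tuple[int, str]]:
    """Pair each response with the vote count on the line before it (assumed layout)."""
    pairs = []
    votes: Optional[int] = None
    for line in text.splitlines():
        stripped = line.strip()
        if not stripped or stripped.startswith("Anonymous"):
            continue  # skip blank and author lines
        if stripped.isdigit():
            votes = int(stripped)  # remember the count for the next response
        elif votes is not None:
            pairs.append((votes, stripped))
            votes = None
    # Highest vote counts first; ties keep their original order
    return sorted(pairs, key=lambda pair: pair[0], reverse=True)


# Made-up sample in the assumed layout: author, votes, response
sample = "Anonymous\n3\nReduce jargon\nAnonymous\n7\nShare data openly\n"
ranked = parse_votes(sample)
for votes, response in ranked:
    print(f"[{votes} votes] {response}")
```

Lines tagged this way could then be passed to the model with a prompt that asks it to weigh the vote counts alongside repetition.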

In [11]:
sum_synth.split("\n")

['',
 'Top 5 Ideas:',
 '1. Increase access to data, tools, and resources for all.',
 '2. Improve communication and understanding between different categories of science.',
 '3. Provide training and empowerment opportunities for underrepresented and excluded backgrounds.',
 '4. Reduce jargon and make language more succinct and understood.',
 '5. Create community standards within disciplines.',
 '',
 'Outliers:',
 '1. Demonstrate how an open science researcher can sustainably foster competitive, productive leading-edge research while still being regarded as an expert in their field throughout their career.',
 '2. Equity (not the same as equal) in access to build knowledge and skills and access to tools must be a baseline standard.',
 '3. Do wider outreach outside the science community — bring science to schools, libraries, etc. Make data accessible in ways a non-scientist can understand and appreciate, especially younger people who might pursue it in the future.',
 '4. Create interfaces,