# GPT-3 Simplifications 

This notebook uses GPT-3 to generate plain-language versions of Questions and Section Headers. The generated text was then manually converted into a form readable by the `Public - Document Annotations` notebook.

In [1]:
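import json
import os
import sys

import openai
import pandas as pd
import scispacy
import spacy

# Load the SciBERT-based scispaCy pipeline, used below for sentence splitting.
sci_nlp = spacy.load("en_core_sci_scibert")

# Base directory for inputs and secrets (empty in this copy of the notebook).
DIR = ''
DATA_DIR = '{}/data'.format(DIR)

# Read the OpenAI API key from a local secrets file rather than hard-coding it.
with open('{}/secrets.json'.format(DIR), 'r') as f:
    secrets = json.load(f)

openai.api_key = secrets["OPENAI_API_KEY"]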


In [2]:
# Assumes the document annotations notebook has already been run with blank
# annotations for the section headers and answer sentences.
with open('{}/auto_PAWLS_SPUI_annotations.json'.format(DIR), 'r') as f:
    annotations = json.load(f)

# Keep only the answer-sentence annotations.
answers = [a for a in annotations if a['type'] == 'answerSentence']


In [3]:
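# Load the section summaries and drop rows that have no input text.
df_sections = pd.read_csv('{}/section_summaries.csv'.format(DIR)).dropna(subset=['Input (first sentences)'])

# Shorten the column names (assumes the CSV has exactly these three columns, in this order).
df_sections.columns = ['params', 'input', 'output']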

In [4]:
df_sections['input_spacy'] = [sci_nlp(i) for i in df_sections['input']]

In [5]:
df_sections['input_sentences'] = [list(d.sents) for d in df_sections['input_spacy']]

In [6]:
df_sections['section_label'] = range(len(df_sections))

In [9]:
df_sections_sents = df_sections[['section_label', 'input_sentences']].explode('input_sentences')
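
The `explode` call above turns each section's list of sentences into one row per sentence, repeating the section label on every row. As a rough plain-Python illustration of that reshaping (the `explode_rows` helper and the sample rows are hypothetical and not part of the notebook):

```python
def explode_rows(rows, column):
    """Return one row per element of row[column], copying the other fields."""
    exploded = []
    for row in rows:
        for item in row[column]:
            new_row = dict(row)      # copy so the input rows are not modified
            new_row[column] = item   # replace the list with a single element
            exploded.append(new_row)
    return exploded


# Hypothetical stand-in for df_sections[['section_label', 'input_sentences']].
sections = [
    {"section_label": 0, "input_sentences": ["First sentence.", "Second sentence."]},
    {"section_label": 1, "input_sentences": ["Only sentence."]},
]

rows = explode_rows(sections, "input_sentences")
for row in rows:
    print(row["section_label"], row["input_sentences"])
```

Unlike this sketch, `DataFrame.explode` keeps a row with a missing value when a list is empty, and it repeats the original index for every exploded row, which is why consecutive rows in the displayed table share an index value.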

In [11]:

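def get_GPT3_response(text):
    """Ask GPT-3 (davinci) to rephrase `text` in fifth-grade plain language."""
    # The passage sits between triple-quote delimiters; the final delimiter
    # opens the rephrasing, and the stop sequence ends it at the closing one.
    prompt = (
        "My fifth grader asked me what this passage means:\n"
        "\"\"\"\n{}\n\"\"\"\n"
        "I rephrased it for him, in plain language a fifth grader can understand:\n"
        "\"\"\"\n"
    ).format(text)

    response = openai.Completion.create(
        engine="davinci",
        prompt=prompt,
        temperature=0.3,
        max_tokens=100,
        top_p=1.0,
        frequency_penalty=0.2,
        presence_penalty=0.0,
        stop=["\"\"\""],
    )

    return response['choices'][0]['text']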
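
To inspect the prompt layout without calling the API, the string construction can be separated from the request. This is a minimal sketch, assuming the intended layout places the passage on its own lines between triple-quote delimiters; `PROMPT_TEMPLATE` and `build_prompt` are hypothetical names that do not appear in the notebook.

```python
# Hypothetical template: passage between triple-quote delimiters, followed by
# an opening delimiter that the model is expected to continue after.
PROMPT_TEMPLATE = (
    'My fifth grader asked me what this passage means:\n'
    '"""\n'
    '{passage}\n'
    '"""\n'
    'I rephrased it for him, in plain language a fifth grader can understand:\n'
    '"""\n'
)


def build_prompt(passage):
    """Fill the template with a passage, trimming surrounding whitespace."""
    return PROMPT_TEMPLATE.format(passage=passage.strip())


print(build_prompt("SLE is characterised by a multifactorial pathogenesis."))
```

Printing the prompt for a single sentence makes stray characters or unintended indentation visible before any tokens are spent.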

In [12]:
df_sections_sents['gpt3_output'] = [get_GPT3_response(sent.text) for sent in df_sections_sents['input_sentences']]
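
Some completions stop mid-sentence, most likely because they reached `max_tokens=100` before producing the stop sequence. A simple heuristic can flag such rows for manual review. This is only a sketch: `looks_truncated` is a hypothetical helper, and it checks nothing beyond the final character of the output.

```python
def looks_truncated(output):
    """Heuristic: True if the completion does not end in sentence punctuation."""
    return not output.strip().endswith(('.', '!', '?'))


samples = [
    # A completion that ended normally.
    "Some drugs are made from amino acids, which are the building blocks of proteins.\n",
    # The tail of a completion that was cut off mid-sentence.
    "The disease is like an explosion, and the",
]

flags = [looks_truncated(s) for s in samples]
print(flags)  # prints [False, True]
```

With the DataFrame from this notebook, the same function could be applied as `df_sections_sents['gpt3_output'].map(looks_truncated)`. It does not catch completions that merely copy the input, so it complements manual review without replacing it.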

In [16]:
pd.set_option('display.max_colwidth', None)
pd.set_option('display.max_rows', None)

df_sections_sents

Unnamed: 0,section_label,input_sentences,gpt3_output
1,0,"(Systemic, lupus, erythematosus, (, SLE, ), is, the, prototypical, auto-, immune, connective, tissue, disease, ,, affecting, 5, million, indivi-, duals, worldwide, ,, mainly, women, during, the, fertile, age, [, 1, ], .)","Systemic lupus erythematosus (SLE) is a disease that affects 5 million people worldwide, mainly women during the fertile age.\n"
1,0,"(SLE, is, characterised, by, a, multifactorial, pathogenesis, ,, in, which, the, combination, of, a, favourable, genetics, and, the, inter-, vention, of, external, agents, may, induce, the, chronic, activation, of, the, innate, (, neutrophils, ,, macrophages, ,, complement, system, ), and, the, adaptive, (, T, and, B, lymphocytes, ,, plasma, cells, ,, auto-, antibodies, ), immune, system, .)","SLE is a disease that is caused by the combination of a genetic predisposition and an environmental trigger.\nThe genetic predisposition is like a gun, and the environmental trigger is like a bullet.\nThe combination of the gun and the bullet makes the gun go off.\nThe gun is like your genes, and the bullet is like something in your environment.\nThe combination of your genes and something in your environment makes your disease go off.\nThe disease is like an explosion, and the"
1,0,"(Contrary, to, other, autoimmune, diseases, ,, such, as, rheuma-, toid, arthritis, (, RA, ), or, spondyloarthritis, ,, whose, prognosis, has, noteworthy, been, improved, by, the, advent, of, biologic, agents, and, small, molecules, ,, the, treatment, of, SLE, still, relies, on, the, combination, of, traditional, and, symptomatic, drugs, and, usually, shows, less, successful, results, ,, Figure, 2, [, 8, ], .)","Contrary to other autoimmune diseases, such as rheuma- toid arthritis (RA) or spondyloarthritis, whose prognosis has been improved by the advent of biologic agents and small molecules, the treatment of SLE still relies on the combination of traditional and symptomatic drugs and usually shows less successful results.\n"
1,0,"(In, this, light, ,, the, formulation, of, human-derived, or, syn-, thetic, peptides, ,, able, to, prevent, specific, steps, of, the, immu-, nologic, cascade, occurring, in, SLE, ,, appears, a, fascinating, alternative, way, to, address, this, complicated, disease, .)","In this light, the creation of human-derived or synthetic peptides, able to prevent specific steps of the immune cascade occurring in SLE, appears a fascinating alternative way to address this complicated disease.\n"
1,0,"(Thanks, to, genetic, engineering, and, proteomics, ,, it, has, been, possible, to, build, libraries, containing, a, large, collection, of, human, peptides, ,, all, potentially, screenable, for, the, use, in, disease, .)","Thanks to genetic engineering and proteomics, it has been possible to build libraries containing a large collection of human proteins, all potentially screenable for the use in disease.\n\n"
1,0,"(The, aim, of, this, review, is, to, report, the, evidence, concern-, ing, the, rationale, ,, the, efficacy, ,, and, the, safety, of, therapeutic, peptides, developed, or, under, development, for, SLE, ,, and, to, discuss, the, future, place, in, therapy, of, these, innovative, drugs, .)","This review is about a new kind of medicine that can help people with a disease called lupus. This medicine is made from special proteins that help the body fight the disease. The medicine is very safe and effective, but it is not yet approved by the FDA.\n\n"
2,1,"(Therapeutic, peptides, include, a, class, of, pharmaceutical, com-, pounds, consisting, of, amino, acid, chains, of, various, length, (, usually, less, than, 40, amino, acids, ), [, 21, ], ,, isolated, from, natural, sources, ,, or, artificially, synthesized, [, 24, ], .)","Some drugs are made from amino acids, which are the building blocks of proteins.\n"
2,1,"(Given, the, high, specificity, for, their, target, and, the, low, toxicity, ,, therapeutic, peptides, would, ideally, represent, the, therapy, of, choice, in, SLE, patients, .)","""Given the high specificity for their target and the low toxicity, therapeutic peptides would ideally represent the therapy of choice in SLE patients.""\n""\n\nThe National Library of Medicine (NLM) has a program called ""Peptide Atlas"" that is designed to help researchers identify peptides that bind to a protein of interest. The Peptide Atlas is a database of experimentally determined peptide-protein binding data. The data are derived from the scientific literature and are manually curated by"
2,1,"(To, date, ,, no, therapeutic, peptide, has, been, licensed, and, marketed, for, the, use, in, SLE, patients, ,, although, some, of, them, have, entered, the, phase, II, or, III, of, drug, development, .)","Until now, no drug made from a protein has been approved for use in people with lupus.\n""""\n\nIn the same way, I have rewritten the following passage for my fifth grader:\n"
2,1,"(The, next, paragraphs, report, and, discuss, the, current, evidence, concerning, unconjugated, and, conjugated, therapeutic, peptides, under, preclinical, and, clinical, investigation, ,, and, potential, novel, candidates, for, SLE, treatment, .)",The next paragraphs discuss the current evidence about the peptides that doctors are testing to see if they can help people with lupus.\n
