Prerequisite: prepare your API key.

We provide the key through environment variables because LangChain reads them when it initializes the key. Otherwise, the key has to be passed explicitly to the `ChatOpenAI` instance.

In [None]:
import os
os.environ['OPENAI_API_BASE'] = ''
os.environ['OPENAI_API_KEY'] = ''
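
The cell above leaves both values empty, so it is easy to run the rest of the notebook without a key by accident. A minimal sketch of a guard is shown here; `require_env` is a hypothetical helper name, not part of LangChain or `paper_reader`, and it only checks that the variable is set and non-empty.

```python
import os


def require_env(name: str) -> str:
    """Return the value of an environment variable, or fail with a clear message.

    Hypothetical helper: it does not validate the key with the API,
    it only checks that something non-empty has been provided.
    """
    value = os.environ.get(name, "").strip()
    if not value:
        raise RuntimeError(f"{name} is not set; fill it in before running the notebook")
    return value
```

For instance, calling `require_env('OPENAI_API_KEY')` right after setting the variables would stop the notebook early with a readable message instead of failing later inside an API call.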

Step 1. Load the PDF file and check its section titles

In [None]:
from paper_reader.paper import Paper

paper = Paper('./glib.pdf')

The program guesses the section titles in the PDF document. If the guessed titles are incorrect, you can manually assign the correct section titles (a list of `str`) to `titles`.

In [None]:
titles = paper.parse_pdf_title()
paper.split_paper_by_titles(titles)
for title, content in paper.paper_parts:
    print(title)
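
To make the role of `titles` more concrete, here is a rough sketch of what splitting a text by a list of titles amounts to. This is not the implementation used by `paper_reader`; `split_by_titles` is a hypothetical function written only for illustration, and it assumes the titles appear in the text in the order given.

```python
def split_by_titles(text, titles):
    """Split `text` into (title, content) pairs.

    Illustrative only: each title is located by plain substring search,
    in order, and titles that cannot be found are skipped.
    """
    positions = []
    cursor = 0
    for title in titles:
        idx = text.find(title, cursor)
        if idx == -1:
            continue  # title not present after the previous one; skip it
        positions.append((title, idx))
        cursor = idx + len(title)

    parts = []
    for i, (title, start) in enumerate(positions):
        # A section runs from the end of its title to the start of the next title.
        end = positions[i + 1][1] if i + 1 < len(positions) else len(text)
        parts.append((title, text[start + len(title):end].strip()))
    return parts
```

With a hand-written list such as `['Abstract', 'Introduction', 'Conclusion']`, a wrong or missing title simply merges that section into the previous one, which is why checking the printed titles is worthwhile before spending API calls.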

Step 2. Declare the key points that you want GPT to focus on and check the estimated API cost

In [None]:
points = [
    "What is the scope of the problem that this paper addresses?",
    "What are the key insights that this paper provides into the problem it addresses?",
    "What are the limitations of this study?"
]
paper.estimate_cost('')

Step 3. Let GPT generate summaries for you

In [None]:
from IPython.display import display, Markdown, clear_output
text_to_be_rendered = ''
# Callback that appends each new summary to the accumulated text and re-renders it in the notebook.
def show_response(resp, title, num, total):
    global text_to_be_rendered
    text_to_be_rendered += f'**Summary for the {title} (part {num+1} of {total})**\n\n'
    text_to_be_rendered += resp
    text_to_be_rendered += '\n\n'
    clear_output(wait=False)
    display(Markdown(text_to_be_rendered))
paper.read_paper(points, callback=show_response)

Step 4. Save the results to a local file

In [None]:
import pickle
with open('paper-glib.pkl', 'wb') as f:
    pickle.dump(paper, f)
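
The saved file can be read back later with `pickle.load`. The sketch below shows the round trip with a plain dictionary standing in for the `Paper` object, so it runs without the library; `save_pickle` and `load_pickle` are hypothetical helper names. Loading a real `Paper` pickle additionally requires `paper_reader` to be importable, and pickle files should only be loaded from sources you trust.

```python
import pickle
import tempfile
from pathlib import Path


def save_pickle(obj, path):
    """Serialize `obj` to `path` in binary mode."""
    with open(path, 'wb') as f:
        pickle.dump(obj, f)


def load_pickle(path):
    """Load and return the object stored at `path`."""
    with open(path, 'rb') as f:
        return pickle.load(f)


# Round trip with a stand-in object (not a real Paper instance).
demo = {'title': 'demo', 'summaries': ['first part', 'second part']}
with tempfile.TemporaryDirectory() as tmp:
    target = Path(tmp) / 'demo.pkl'
    save_pickle(demo, target)
    restored = load_pickle(target)
```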

The same functionality can also be used in non-interactive scripts.

In [None]:
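# Imports repeated here so the function can be copied into a standalone script.
import pickle

from paper_reader.paper import Paper


def read_and_summarize(pdf_path):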
    paper = Paper(pdf_path)
    points = [
        "What is the scope of the problem that this paper addresses?",
        "What are the key insights that this paper provides into the problem it addresses?",
        "What are the limitations of this study?"
    ]
    paper.read_paper(points)
    with open(pdf_path+'.pkl', 'wb') as f:
        pickle.dump(paper, f)
    return paper.paper_summaries
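
In a script, a function like this is typically applied to many files. Here is a minimal sketch of such a loop, assuming the PDFs sit in one directory; `summarize_directory` is a hypothetical helper, and the summarizing function is passed in as an argument so the loop itself does not depend on `paper_reader`.

```python
from pathlib import Path


def summarize_directory(pdf_dir, summarize):
    """Apply `summarize` to every .pdf file in `pdf_dir`.

    Returns two dicts keyed by file name: successful results and
    error descriptions for the files that failed.
    """
    results, errors = {}, {}
    for pdf_path in sorted(Path(pdf_dir).glob('*.pdf')):
        try:
            results[pdf_path.name] = summarize(str(pdf_path))
        except Exception as exc:  # keep going if a single paper fails
            errors[pdf_path.name] = repr(exc)
    return results, errors
```

Calling `summarize_directory('./papers', read_and_summarize)` would then process each paper in turn (the `./papers` path is only an example), and a failure on one file does not discard the summaries already obtained for the others.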