# AI PDF Reader Assistant

<h1>Table of Contents<span class="tocSkip"></span></h1>
<div class="toc"><ul class="toc-item"><li><span><a href="#1.-OpenAI-API-KEY" data-toc-modified-id="1.-OpenAI-API-KEY-1">1. OpenAI API KEY</a></span></li><li><span><a href="#2.-Testing-GPT4-from-LangChain" data-toc-modified-id="2.-Testing-GPT4-from-LangChain-2">2. Testing GPT4 from LangChain</a></span></li><li><span><a href="#3.-Loading-PDF-file" data-toc-modified-id="3.-Loading-PDF-file-3">3. Loading PDF file</a></span></li></ul></div>

## 1. OpenAI API KEY

To carry out this project, we need an OpenAI API key to use the GPT-4 Turbo model. The key can be created at https://platform.openai.com/api-keys, which requires an OpenAI account. It is displayed only once, so it must be saved at the moment it is created.

We store the API key in a `.env` file, load it with the dotenv library, and read it as an environment variable. The file is listed in `.gitignore` so that the key is not exposed if the code is pushed to GitHub, for example.

In [1]:
# import API KEY

import os                           # operating system library
from dotenv import load_dotenv      # load environment variables  


load_dotenv()


OPENAI_API_KEY = os.getenv("OPENAI_API_KEY")
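
If the `.env` file is missing or the variable name is misspelled, `os.getenv` returns `None` without raising, and the problem only surfaces later when the model is called. A minimal sketch of a fail-fast check follows; `require_env` is a hypothetical helper written for illustration, not part of dotenv or LangChain.

```python
import os


def require_env(name: str) -> str:
    """Return the value of an environment variable or raise a clear error.

    Hypothetical helper for illustration only.
    """
    value = os.getenv(name)
    if not value:
        # Covers both an unset variable (None) and an empty string.
        raise RuntimeError(f"{name} is not set; check the .env file")
    return value
```

After `load_dotenv()`, the key could then be read with `OPENAI_API_KEY = require_env("OPENAI_API_KEY")`, which stops the notebook at this cell if the key was not loaded.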

## 2. Testing GPT4 from LangChain

We are going to test the connection from LangChain to the GPT-4 model.

In [4]:
from langchain_openai.chat_models import ChatOpenAI   # LangChain connection to OpenAI

model = ChatOpenAI(openai_api_key=OPENAI_API_KEY, model="gpt-4-turbo")

response = model.invoke("Who is Apple's CEO?")

response.content

"As of my last update in 2023, Apple's CEO is Tim Cook. He has been in this position since August 2011, following the resignation of Apple's co-founder Steve Jobs."

## 3. Loading PDF file

In [11]:
os.listdir("../pdfs")

['_10-K-Q4-2023-As-Filed.pdf']

In [5]:
from langchain_community.document_loaders import PyPDFDirectoryLoader

In [12]:
# load every PDF in the directory, one Document per page

loader = PyPDFDirectoryLoader("../pdfs/")

pages = loader.load()
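
Text extracted from PDFs often contains non-breaking spaces (`\xa0`) and long runs of padding. The sketch below shows one possible cleanup step; `normalize_whitespace` is a hypothetical helper, and whether this cleanup is wanted depends on how the text is used afterwards.

```python
import re


def normalize_whitespace(text: str) -> str:
    """Replace non-breaking spaces and collapse runs of spaces and tabs."""
    text = text.replace("\xa0", " ")
    # Collapse horizontal whitespace only, so line breaks are preserved.
    return re.sub(r"[ \t]+", " ", text)


sample = "For the fiscal year ended September\xa030, 2023"
print(normalize_whitespace(sample))
```

The same function could be applied to the `page_content` of each loaded page before any further processing.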

In [13]:
len(pages)

80
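
Because the loader reads a whole directory, this count is the total number of pages across all PDFs it found. A minimal sketch of a per-file breakdown follows. It assumes each page's `metadata` carries a `"source"` entry, as the PyPDF-based loaders set, and it uses stand-in objects instead of real `Document` instances so that it runs on its own; `pages_per_source` and `pages_demo` are hypothetical names.

```python
from collections import Counter
from types import SimpleNamespace

# Stand-ins for LangChain Document objects (assumption: each page has a
# "source" key in its metadata). File names here are made up.
pages_demo = [
    SimpleNamespace(page_content="...", metadata={"source": "a.pdf", "page": 0}),
    SimpleNamespace(page_content="...", metadata={"source": "a.pdf", "page": 1}),
    SimpleNamespace(page_content="...", metadata={"source": "b.pdf", "page": 0}),
]


def pages_per_source(docs):
    """Count how many page-level documents came from each source file."""
    return Counter(doc.metadata["source"] for doc in docs)


print(pages_per_source(pages_demo))
```

With the notebook's own data, `pages_per_source(pages)` would be expected to report a single entry, since the directory holds one PDF.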

In [16]:
pages[0]  # first page

Document(page_content='UNITED STATES\nSECURITIES AND EXCHANGE COMMISSION\nWashington, D.C. 20549\nFORM 10-K\n(Mark One)\n☒    ANNUAL REPORT PURSUANT TO SECTION 13 OR 15(d) OF THE SECURITIES EXCHANGE ACT OF 1934\nFor the fiscal year ended September\xa030, 2023\nor\n☐    TRANSITION REPORT PURSUANT TO SECTION 13 OR 15(d) OF THE SECURITIES EXCHANGE ACT OF 1934\nFor the transition period from \xa0\xa0\xa0\xa0\xa0\xa0\xa0\xa0\xa0\xa0\xa0\xa0  to \xa0\xa0\xa0\xa0\xa0\xa0\xa0\xa0\xa0\xa0\xa0\xa0 .\nCommission File Number: 001-36743\nApple Inc.\n(Exact name of Registrant as specified in its charter)\nCalifornia 94-2404110\n(State or other jurisdiction\nof incorporation or organization)(I.R.S. Employer Identification No.)\nOne Apple Park Way\nCupertino , California 95014\n(Address of principal executive offices) (Zip Code)\n(408) 996-1010\n(Registrant’s telephone number, including area code)\nSecurities registered pursuant to Section 12(b) of the Act:\nTitle of each classTrading \nsymbol(s) Name