## 🧠 Summarize & Ask PDF using OpenAI API

### 📌 What does this app do?
This Streamlit app allows users to:
1. Upload any PDF file
2. Read and extract text from it
3. Ask questions or request a summary
4. Get real-time responses from OpenAI's LLMs

### 🧰 Tools & Technologies
- Python
- Streamlit
- OpenAI API
- PyPDF2 / pdfplumber
- dotenv

### 🧪 How it works (Summary)
- Load the PDF using pdfplumber
- Display the extracted text in the Streamlit interface
- Let the user request a summary or ask a question about the text
- Send the text (and the question, if any) to the OpenAI Chat Completions API
- Show the result as a summary or an answer

### 🖼 Screenshot
![App Screenshot](Screenshot.png)

### 🔗 Resources
- [GitHub Repo](https://github.com/elbazhazem/summarize_ask_pdf)
- [Video I learned from](https://www.youtube.com/watch?v=yq803m5ESXI)

### ✍️ What I Learned
This was my first hands-on project using the OpenAI API. I learned:
- How to create structured prompts
- How to use Streamlit for quick UI
- How to connect backend logic with OpenAI to process dynamic inputs

---

*Stay tuned for my next post, where I apply what I learned here to processing cybersecurity logs with LLMs.*


## Code Building and Description 

### Step 01: Prepare our libraries 
- openai
- PyPDF2
- pdfplumber
- streamlit
- plotly
- io (part of the Python standard library, so it needs no installation)

In your terminal, you can install these libraries with pip as follows:

In [1]:
#pip install openai pyPDF2 pdfplumber
#pip install --upgrade openai
#pip install streamlit
#pip install --upgrade plotly

After that, you can import them as shown:

In [None]:
#Import our libraries
import os
import openai
import pdfplumber
from openai import OpenAI
from PIL import Image
from io import BytesIO
import streamlit as st

### Step 02: Prepare OpenAI API key
For this step you need your own OpenAI API key. If you don't have one yet, you can create one at [http://platform.openai.com/api-keys](http://platform.openai.com/api-keys) (note: you need an account on the OpenAI site). Once you have your API key, use it in the following code.

In [None]:
# Set your OpenAI API key (paste your key between the quotes)
os.environ["OPENAI_API_KEY"] = ""

# Initialize OpenAI client
client = OpenAI(
    api_key=os.environ.get("OPENAI_API_KEY"),
)
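
Assigning an empty string to `os.environ["OPENAI_API_KEY"]` replaces any key already exported in your shell, so the client would start without a usable key. A minimal sketch of an alternative, assuming the key was exported before launching the app, is shown below. The helper name `load_api_key` is hypothetical and not part of the original app.

```python
import os


def load_api_key(env=None, name="OPENAI_API_KEY"):
    """Return the API key stored under `name`, or raise if it is missing or blank."""
    # `env` defaults to the real process environment; a plain dict can be
    # passed instead, which makes the helper easy to try out.
    env = os.environ if env is None else env
    key = (env.get(name) or "").strip()
    if not key:
        raise RuntimeError(f"{name} is not set; export it before starting the app.")
    return key


# Usage in the app (assumes OPENAI_API_KEY was exported in the shell):
# client = OpenAI(api_key=load_api_key())
```

Failing early with a clear message tends to be easier to debug than an authentication error raised later by the first API call.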

### Step 03: Extract text from PDF file
In this step you write the code that extracts the text from a PDF file, so that you can process it later and use the resulting text as needed.
The code is shown here.

In [None]:
# Extract text from a PDF file
# (pdfplumber.open accepts a file path or a file-like object such as a Streamlit upload)
def extract_text_from_pdf(pdf_path):
    text = ""
    with pdfplumber.open(pdf_path) as pdf:
        for page in pdf.pages:
            page_text = page.extract_text()
            if page_text:
                text += page_text + "\n"
    return text
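
The per-page loop above can be tried without a real PDF. The sketch below is an illustrative variant, not the app's own code: it collects page texts in a list and joins them once, and it uses a hypothetical `FakePage` class as a stand-in for a pdfplumber page. Unlike the original function, it does not add a trailing newline after the last page.

```python
def collect_page_text(pages):
    """Join the text of every page that yields any text, separated by newlines."""
    parts = []
    for page in pages:
        page_text = page.extract_text()
        if page_text:  # skips None and "" (for example, image-only pages)
            parts.append(page_text)
    return "\n".join(parts)


class FakePage:
    """Hypothetical stand-in for a pdfplumber page, used only for this demo."""

    def __init__(self, text):
        self._text = text

    def extract_text(self):
        return self._text


demo = collect_page_text([FakePage("Page one"), FakePage(None), FakePage("Page three")])
print(demo)
```

With a real file, the same helper could be called as `collect_page_text(pdf.pages)` inside the `with pdfplumber.open(...) as pdf:` block.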

### Step 04: Summarize the text using OpenAI's API
The following code takes the text returned by the previous function as input and produces a summary of it as output. The code is shown here.

In [None]:
# Summarize the text using OpenAI's GPT-3.5 Turbo model
def summarize_text(text):
    """Summarizes the text using the OpenAI Chat Completions API."""
    response = client.chat.completions.create(
        model="gpt-3.5-turbo",
        messages=[
            {"role": "system", "content": "You are a helpful assistant that summarizes text."},
            {"role": "user", "content": f"Summarize the following text: {text}"}
        ],
        max_tokens=300,
        temperature=0.7
    )
    # summary = response['choices'][0]['message']['content']
    summary = response.choices[0].message.content
    return summary
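
Note that `max_tokens=300` limits only the length of the reply. The whole PDF text is still sent in one request, so a long document can exceed the model's context window. One common workaround is to split the text into pieces and summarize each piece separately. The sketch below is one possible splitter, not part of the original app. The name `chunk_text` and the `max_chars=8000` default are assumptions, and a character count is only a rough proxy for tokens.

```python
def chunk_text(text, max_chars=8000):
    """Split text into pieces of at most max_chars, breaking on line boundaries when possible."""
    if max_chars <= 0:
        raise ValueError("max_chars must be positive")
    chunks, current = [], ""
    for line in text.splitlines(keepends=True):
        # A single line longer than the limit is cut into fixed-size slices.
        while len(line) > max_chars:
            if current:
                chunks.append(current)
                current = ""
            chunks.append(line[:max_chars])
            line = line[max_chars:]
        # Start a new chunk when adding this line would exceed the limit.
        if len(current) + len(line) > max_chars:
            chunks.append(current)
            current = ""
        current += line
    if current:
        chunks.append(current)
    return chunks


# Possible usage with the function above (each call costs one API request):
# partial = [summarize_text(chunk) for chunk in chunk_text(text)]
# summary = summarize_text("\n".join(partial))
```

Summarizing the partial summaries in a final call keeps the result short, at the price of one extra request and some loss of detail.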

### Step 05: Ask a question about text using OpenAI's API
Here the app lets you ask a question about the PDF file and returns an answer based on its text. The code is shown here.

In [None]:
# Ask a question about the PDF file
def ask_question(text, question):
    response = client.chat.completions.create(
        model="gpt-3.5-turbo",
        messages=[
            {"role": "system", "content": "You are a helpful assistant that answers questions about the provided text."},
            {"role": "user", "content": f"Please answer the following question based on the text:\n\n{text}\n\nQuestion: {question}"}
        ],
        max_tokens=300,
        temperature=0.7,
    )
    # answer = response['choices'][0]['message']['content']
    answer = response.choices[0].message.content.strip()
    return answer
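
As written, `ask_question` sends a request even when the question is empty, for example when the button is clicked before anything is typed. A small sketch of how the prompt could be built and validated separately is shown below. The helper name `build_qa_messages` is hypothetical, and the message wording is copied from the function above.

```python
def build_qa_messages(text, question):
    """Return the chat messages for a question about `text`; reject blank questions."""
    question = question.strip()
    if not question:
        raise ValueError("Please enter a question first.")
    return [
        {
            "role": "system",
            "content": "You are a helpful assistant that answers questions about the provided text.",
        },
        {
            "role": "user",
            "content": f"Please answer the following question based on the text:\n\n{text}\n\nQuestion: {question}",
        },
    ]


messages = build_qa_messages("The sky is blue.", "  What colour is the sky? ")
print(messages[1]["content"])
```

The returned list could be passed as the `messages` argument of `client.chat.completions.create`, and the `ValueError` could be caught in the UI and shown with `st.warning`.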

### Step 06: Create main function for the application
The main function calls all of the previous functions and ties them together so that they run as one application. The code is shown here.

In [None]:
# Main function to process the PDF and answer questions
def main():
    st.title("ChatGPT AI - PDF Chatbot: Summarize and Ask")
    uploaded_file = st.file_uploader("Upload a PDF", type="pdf")
    
    if uploaded_file is not None:
        text = extract_text_from_pdf(uploaded_file)
        if text:
            st.write("Text extracted from the PDF:")
            st.text_area("Extracted Text", text, height=150)
            
            # Show a summary of the text
            if st.button("Summarize Text"):
                summary = summarize_text(text)
                st.subheader("Summary:")
                st.write(summary)

            # Ask questions about the text
            question = st.text_input("Ask a question based on the text")
            if st.button("Get Answer"):
                answer = ask_question(text, question)
                st.subheader("Answer:")
                st.write(answer)
        else:
            st.error("No text could be extracted from this PDF.")

if __name__ == "__main__":
    main()

## Final Result in one file 

In [None]:
#Import our libraries
import os
import openai
import pdfplumber
from openai import OpenAI
from PIL import Image
from io import BytesIO
import streamlit as st


# Set your OpenAI API key (paste your key between the quotes)
os.environ["OPENAI_API_KEY"] = ""

# Initialize OpenAI client
client = OpenAI(
    api_key=os.environ.get("OPENAI_API_KEY"),
)

# Extract text from a PDF file
def extract_text_from_pdf(pdf_path):
    text = ""
    with pdfplumber.open(pdf_path) as pdf:
        for page in pdf.pages:
            page_text = page.extract_text()
            if page_text:
                text += page_text + "\n"
    return text

# Summarize the text using OpenAI's GPT-3.5 Turbo model
def summarize_text(text):
    """Summarizes the text using the OpenAI Chat Completions API."""
    response = client.chat.completions.create(
        model="gpt-3.5-turbo",
        messages=[
            {"role": "system", "content": "You are a helpful assistant that summarizes text."},
            {"role": "user", "content": f"Summarize the following text: {text}"}
        ],
        max_tokens=300,
        temperature=0.7
    )
    # summary = response['choices'][0]['message']['content']
    summary = response.choices[0].message.content
    return summary

# Ask a question about the PDF file
def ask_question(text, question):
    response = client.chat.completions.create(
        model="gpt-3.5-turbo",
        messages=[
            {"role": "system", "content": "You are a helpful assistant that answers questions about the provided text."},
            {"role": "user", "content": f"Please answer the following question based on the text:\n\n{text}\n\nQuestion: {question}"}
        ],
        max_tokens=300,
        temperature=0.7,
    )
    # answer = response['choices'][0]['message']['content']
    answer = response.choices[0].message.content.strip()
    return answer

# Main function to process the PDF and answer questions
def main():
    st.title("Chat GPT AI - PDF Chatbot")
    uploaded_file = st.file_uploader("Upload a PDF", type="pdf")
    
    if uploaded_file is not None:
        text = extract_text_from_pdf(uploaded_file)
        if text:
            st.write("Text extracted from the PDF:")
            st.text_area("Extracted Text", text, height=150)
            
            # Show a summary of the text
            if st.button("Summarize Text"):
                summary = summarize_text(text)
                st.subheader("Summary:")
                st.write(summary)

            # Ask questions about the text
            question = st.text_input("Ask a question based on the text")
            if st.button("Get Answer"):
                answer = ask_question(text, question)
                st.subheader("Answer:")
                st.write(answer)
        else:
            st.error("No text could be extracted from this PDF.")

if __name__ == "__main__":
    main()