# Problem Statement
__You have been tasked with developing an AI system that can analyze a PDF file and generate a summary of its contents. Additionally, the system should be integrated with the ChatGPT API to allow users to ask questions related to the PDF file and receive relevant answers__.

# Objective:
__The objective of this assignment is to develop an AI system that can__:
1. Analyze a PDF file and generate a summary of its contents.
2. Integrate with the ChatGPT API to answer user questions related to the PDF file.
3. Suggest questions based on the PDF file content.
4. Provide successive questions based on user questions.
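
For objective 1, a long PDF will usually not fit into a single prompt. One common approach is to split the extracted text into bounded chunks, summarize each chunk, and then combine the partial summaries. The sketch below covers only the splitting step. The name `chunk_text` and the `max_chars` budget are illustrative assumptions, not part of the original notebook.

```python
def chunk_text(text, max_chars=1000):
    """Split text into chunks of at most max_chars characters.

    Chunks break on whitespace, so runs of spaces and newlines are
    collapsed into single spaces. Illustrative helper, not from the notebook.
    """
    chunks = []
    current = ""
    for word in text.split():
        # Hard-split any single word that is longer than the budget
        while len(word) > max_chars:
            if current:
                chunks.append(current)
                current = ""
            chunks.append(word[:max_chars])
            word = word[max_chars:]
        candidate = f"{current} {word}" if current else word
        if len(candidate) <= max_chars:
            current = candidate
        else:
            # The word does not fit: close the current chunk and start a new one
            chunks.append(current)
            current = word
    if current:
        chunks.append(current)
    return chunks
```

Each chunk could then be passed to a summarization call in turn, with the partial summaries joined and summarized once more if a single overall summary is needed.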

# Requirements:
__The following are the requirements for the AI system__:
1. The system should be able to analyze a PDF file and generate a summary of its contents using NLP techniques.
2. The system should be integrated with the ChatGPT API to allow users to ask questions related to the PDF file and receive relevant answers.
3. The system should be able to suggest questions based on the content of the PDF file.
4. The system should provide successive questions based on user questions to allow for a more natural conversation flow.
5. The system should be able to handle multiple users concurrently.
6. The system should be able to handle errors and exceptions gracefully.
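
Requirements 5 and 6 can be approached together. The sketch below is one possible arrangement, assuming a thread pool is acceptable for the workload: each request runs in its own worker thread, and any exception is converted into an error record shaped like the `{'file_path', 'error'}` dictionary that `process_pdf` returns. The names `handle_request`, `safe_handle`, and `handle_many` are hypothetical, and `handle_request` is only a stand-in for the real per-user work.

```python
from concurrent.futures import ThreadPoolExecutor


def handle_request(file_path):
    """Hypothetical stand-in for the per-user work (extract, summarize, ...)."""
    if not file_path.endswith(".pdf"):
        raise ValueError(f"not a PDF file: {file_path}")
    return {"file_path": file_path, "summary": f"(summary of {file_path})"}


def safe_handle(file_path):
    """Run handle_request and turn any exception into an error record."""
    try:
        return handle_request(file_path)
    except Exception as exc:
        return {"file_path": file_path, "error": f"An error occurred: {exc}"}


def handle_many(file_paths, max_workers=4):
    """Process several requests in worker threads; results keep input order."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(safe_handle, file_paths))


# Example: the second request fails, but the other two still complete
for result in handle_many(["a.pdf", "notes.txt", "b.pdf"]):
    print(result)
```

Because one failing request only produces an error record, the remaining requests are unaffected. Interactive prompts such as `input()` do not fit this pattern and would need to be replaced by per-user request data.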

# Steps to Run:
1. __Install the modules__.
2. __Replace the OpenAI key on line 3 (`openai.api_key`)__.
3. __Replace the PDF file paths on line 80 (the `pdf_paths` list)__.
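
As an alternative to editing the key and paths in the source, they can be read from the environment so the key never ends up in the notebook. This is a minimal sketch under stated assumptions: `OPENAI_API_KEY` is the variable name conventionally used with the OpenAI library, while `PDF_PATHS` and the helper `load_settings` are hypothetical names introduced here for illustration.

```python
import os


def load_settings(env=os.environ):
    """Read the API key and PDF paths from environment variables."""
    api_key = env.get("OPENAI_API_KEY", "")
    if not api_key:
        raise RuntimeError("Set the OPENAI_API_KEY environment variable first")
    # PDF_PATHS is a hypothetical variable: paths separated by os.pathsep
    raw_paths = env.get("PDF_PATHS", "")
    pdf_paths = [path for path in raw_paths.split(os.pathsep) if path]
    return api_key, pdf_paths
```

With this in place, the notebook could start with `openai.api_key, pdf_paths = load_settings()` instead of hard-coded values.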

# Module Installation

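In [4]:
# The code below uses the legacy openai.Completion interface, which was removed in openai 1.0
! pip install "openai<1"
# pdfplumber is imported below, so it must be installed as well
! pip install pdfplumber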



In [6]:
! pip install pip --upgrade
! pip install pyopenssl --upgrade

Collecting pyopenssl
  Downloading pyOpenSSL-23.1.1-py3-none-any.whl (57 kB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 57.9/57.9 kB 1.7 MB/s eta 0:00:00
Installing collected packages: pyopenssl
  Attempting uninstall: pyopenssl
    Found existing installation: pyOpenSSL 20.0.1
    Uninstalling pyOpenSSL-20.0.1:
      Successfully uninstalled pyOpenSSL-20.0.1
Successfully installed pyopenssl-23.1.1


# Importing Modules

In [7]:
import openai
import pdfplumber
openai.api_key = 'Your_OpenAI_key'
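
The notebook collects user questions but does not send them to the model. One way to cover objectives 2 and 4 is to build each prompt from the document text, the earlier question/answer turns, and the new question, so that follow-up questions keep their context. The helper below is only a sketch: `build_qa_prompt`, its parameters, and the prompt wording are assumptions, and it makes no API call itself.

```python
def build_qa_prompt(context, history, question, max_context_chars=1000):
    """Assemble a prompt from document text, earlier Q/A turns, and a new question.

    history is a list of (question, answer) tuples from earlier turns.
    Illustrative helper, not part of the original notebook.
    """
    lines = [
        "Answer the question using only the document below.",
        "",
        "Document:",
        context[:max_context_chars],  # keep the prompt within a fixed budget
        "",
    ]
    # Replay earlier turns so follow-up questions keep their context
    for past_question, past_answer in history:
        lines.append(f"Q: {past_question}")
        lines.append(f"A: {past_answer}")
    lines.append(f"Q: {question}")
    lines.append("A:")
    return "\n".join(lines)
```

The returned string could be passed as the `prompt` of a completion call, and the model's reply appended to `history` as a `(question, answer)` tuple before the next turn.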

In [3]:
# Extracts the text content of a PDF file and returns it as a single string
def extract_pdf_text(file_path):
    with pdfplumber.open(file_path) as pdf:
        text = ""
        for page in pdf.pages:
            # extract_text() may return None for a page with no extractable text
            text += (page.extract_text() or "") + "\n"
        return text
    
# Generates a summary of the first 1,000 characters of the input text using OpenAI's text-davinci-003 engine
def generate_summary(text):
    prompt = f"Summarize:\n\n{text[:1000]}"
    response = openai.Completion.create(
        engine='text-davinci-003',
        prompt=prompt,
        max_tokens=300,
        temperature=0.5,
        n=1,
        stop=None
    )
    summary = response.choices[0].text.strip()
    return summary
    
# Generates a specified number of questions based on the given context using OpenAI's text-davinci-003 engine
def generate_questions(context, num_questions):
    response = openai.Completion.create(
        engine='text-davinci-003',
        prompt=context + "\nQ:",
        max_tokens=200,
        temperature=0.5,
        n=num_questions,
        stop=None
    )
    questions = [choice.text.strip() for choice in response.choices]
    return questions

# Summarizes one PDF, prints suggested questions, and collects the user's own questions
def process_pdf(file_path):
    try:
        # Extract the text from the PDF file
        pdf_text = extract_pdf_text(file_path)

        # Generate the summary of the extracted text
        summary = generate_summary(pdf_text)

        # Generate initial questions
        num_initial_questions = 5
        initial_questions = generate_questions(pdf_text, num_initial_questions)

        # Prepare output
        output = {
            'file_path': file_path,
            'summary': summary,
            'initial_questions': initial_questions
        }

        # Print the summary and initial questions
        print(f"\nSummary for file '{file_path}':\n{summary}")

        print(f"\nInitial Questions for file '{file_path}':")
        for i, question in enumerate(initial_questions):
            print(f"Question {i+1}: {question}")

        # Ask the user for additional questions
        user_questions = []
        while True:
            user_input = input("\nDo you have any questions? (Y/N): ")
            if user_input.lower() == 'n':
                break
            elif user_input.lower() == 'y':
                question = input("Your Question: ")
                user_questions.append(question)
            else:
                print("Invalid input. Please enter 'Y' for yes or 'N' for no.")

        output['user_questions'] = user_questions

        return output

    except Exception as e:
        error_message = f"An error occurred: {str(e)}"
        return {'file_path': file_path, 'error': error_message}

# Specify the paths to the PDF files
pdf_paths = ['FrenchInput.pdf', 'Analysis Report_ Comparison of Different Models for Dispense Amount Prediction.pdf', 'hindiinput.pdf']

# Process the PDF files
results = []
for pdf_path in pdf_paths:
    result = process_pdf(pdf_path)
    results.append(result)

# Print the final results
for result in results:
    file_path = result['file_path']
    if 'error' in result:
        print(f"Error processing file '{file_path}': {result['error']}")
    else:
        print(f"\nFile: '{file_path}'")
        print("\nUser Questions:")
        for i, question in enumerate(result['user_questions']):
            print(f"Question {i+1}: {question}")



Summary for file 'FrenchInput.pdf':
ées pour entraîner le
modèle, tandis que les données de test sont utilisées pour évaluer la précision du modèle.
4. Entraînement du modèle : Une fois que les données sont prêtes, le modèle est entraîné.
5. Évaluation du modèle : Enfin, le modèle est évalué en utilisant des métriques spécifiques pour
mesurer sa précision et sa performance.

La modélisation prédictive est une technique de science des données qui consiste à identifier des modèles et des relations dans des données afin de faire des prédictions. Le processus de modélisation prédictive comprend la compréhension des données, la sélection des caractéristiques pertinentes, la division des données en ensembles d'entraînement et de test, l'entraînement du modèle et l'évaluation du modèle.

Initial Questions for file 'FrenchInput.pdf':
Question 1: Quels sont les avantages de la modélisation prédictive ?

R: Les avantages de la modélisation prédictive comprennent : 
1. Prédire le comportement et