## Initial Prompt

In [1]:
import pdfplumber
from transformers import pipeline

# Function to extract text from PDF
def extract_text_from_pdf(pdf_path):
    text = ""
    with pdfplumber.open(pdf_path) as pdf:
        for page in pdf.pages:
            page_text = page.extract_text()
            if page_text:
                text += page_text
    return text

# Function to summarize text using Hugging Face transformer models (PyTorch backend)
def summarize_text_based_on_prompt(text, prompt, model="facebook/bart-large-cnn", max_length=200):
    summarizer = pipeline("summarization", model=model, framework="pt")  # Use PyTorch
    max_input_length = 1024  # Chunk size in characters (not tokens); keep each chunk within the model's input limit
    text_chunks = [text[i:i+max_input_length] for i in range(0, len(text), max_input_length)]
    summaries = []
    for chunk in text_chunks:
        full_text_with_prompt = prompt + " " + chunk
        try:
            summary = summarizer(full_text_with_prompt, max_length=max_length, min_length=30, do_sample=False)
            if summary:
                summaries.append(summary[0]['summary_text'])
        except IndexError:
            summaries.append("Error: Summary index out of range.")
    return " ".join(summaries)

# Example usage
pdf_path = "/Users/ravishankar/Desktop/GenAI-Virtual-Internships/Artificial_Intelligence_in_Internet_of_Things.pdf"  # Your PDF file path
prompt = "Summarize the key findings of the paper."
text = extract_text_from_pdf(pdf_path)
print("Extracted text preview:")
print(text[:500])  # Print the first 500 characters to verify

summary = summarize_text_based_on_prompt(text, prompt)
print("Summary based on prompt:")
print(summary)


Extracted text preview:
ReView by River Valley Technologies CAAI Transactions on Intelligence Technology
This article has been accepted for publication in a future issue of this journal, but has not been fully edited.
Content may change prior to final publication in an issue of the journal. To cite the paper please use the doi provided on the Digital Library page.
IETResearchJournals
Artificial Intelligence in Internet of Things ISSN1751-8644
doi:0000000000
www.ietdl.org
AshishGhosh1 ∗,DebasritaChakraborty2,AnweshaLaw3


Hardware accelerator e.g. GPU is available in the environment, but no `device` argument is passed to the `Pipeline` object. Model will be on CPU.
Your max_length is set to 200, but your input_length is only 87. Since this is a summarization task, where outputs shorter than the input are typically wanted, you might consider decreasing max_length manually, e.g. summarizer('...', max_length=43)


Summary based on prompt:
This article has been accepted for publication in a future issue of this journal, but has not been fully edited. To cite the paper please use thedoi provided on the Digital Library page. Summarize the key findings of the paper. IoT with AI canbecomeahugebreakthrough. Thisis not justaboutsavingmoney,smartthings,reducinghumaneffortoranytrendinghype. Thisismuchmorethanthat-easinghumanlife. Thereare,however,someseriousissueslikethesecurityconcernsandethicalissueswhichwillgoonplaguingIo T. This articlemainlyrevolves autoimmunepitalnearby. Summarize the key findings of the paper. virtualobjectisconnectedtoeach other. ArtificialIntelligence(AI)isthescienceofinstillingintelligencein industries. ArtificialIntelligence can make“smartdecisions’ can make. “smart” decisions. AIbasedsystemsareevolvingrapidly. Humansintelligence is actually ‘taking’a                emulating humanlearningaswell asadataanalysis(DA)[2]mod-reprehensibledecisionattheappropriatetime. Mostoftheongo
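
The slice `text[i:i+max_input_length]` cuts at fixed character offsets, so a chunk can start or end in the middle of a word. The sketch below is one possible alternative that breaks only at whitespace. The helper name `chunk_by_words` and its `max_chars` parameter are hypothetical and not part of the cell above. The budget is still counted in characters, not model tokens, and newlines are collapsed to single spaces.

```python
def chunk_by_words(text, max_chars=1024):
    """Split text into chunks of at most max_chars characters without cutting words.

    Assumption: words are whitespace-separated. A single word longer than
    max_chars is kept whole in its own chunk rather than being split.
    """
    chunks, current, current_len = [], [], 0
    for word in text.split():
        # The extra 1 accounts for the space joining this word to the previous one.
        added = len(word) + (1 if current else 0)
        if current and current_len + added > max_chars:
            # The current chunk is full: emit it and start a new one with this word.
            chunks.append(" ".join(current))
            current, current_len = [word], len(word)
        else:
            current.append(word)
            current_len += added
    if current:
        chunks.append(" ".join(current))
    return chunks


if __name__ == "__main__":
    for piece in chunk_by_words("IoT with AI can become a huge breakthrough", max_chars=16):
        print(repr(piece))
```

Swapping this in for the list comprehension would change the chunk boundaries, so the summaries would differ from the recorded output.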

## Second Iteration

In [2]:
import pdfplumber
from transformers import pipeline

# Function to extract text from PDF
def extract_text_from_pdf(pdf_path):
    text = ""
    with pdfplumber.open(pdf_path) as pdf:
        for page in pdf.pages:
            page_text = page.extract_text()
            if page_text:
                text += page_text
    return text

# Function to summarize text using Hugging Face transformer models (PyTorch backend)
def summarize_text_based_on_prompt(text, prompt, model="facebook/bart-large-cnn", max_length=200):
    summarizer = pipeline("summarization", model=model, framework="pt")  # Use PyTorch
    max_input_length = 1024  # Chunk size in characters (not tokens); keep each chunk within the model's input limit
    text_chunks = [text[i:i+max_input_length] for i in range(0, len(text), max_input_length)]
    summaries = []
    for chunk in text_chunks:
        full_text_with_prompt = prompt + " " + chunk
        try:
            summary = summarizer(full_text_with_prompt, max_length=max_length, min_length=30, do_sample=False)
            if summary:
                summaries.append(summary[0]['summary_text'])
        except IndexError:
            summaries.append("Error: Summary index out of range.")
    return " ".join(summaries)

# Example usage
pdf_path = "/Users/ravishankar/Desktop/GenAI-Virtual-Internships/Artificial_Intelligence_in_Internet_of_Things.pdf"  # Your PDF file path
prompt = "Provide a concise summary of the methodology used in this research."
text = extract_text_from_pdf(pdf_path)
print("Extracted text preview:")
print(text[:500])  # Print the first 500 characters to verify

summary = summarize_text_based_on_prompt(text, prompt)
print("Summary based on prompt:")
print(summary)


Extracted text preview:
ReView by River Valley Technologies CAAI Transactions on Intelligence Technology
This article has been accepted for publication in a future issue of this journal, but has not been fully edited.
Content may change prior to final publication in an issue of the journal. To cite the paper please use the doi provided on the Digital Library page.
IETResearchJournals
Artificial Intelligence in Internet of Things ISSN1751-8644
doi:0000000000
www.ietdl.org
AshishGhosh1 ∗,DebasritaChakraborty2,AnweshaLaw3


Hardware accelerator e.g. GPU is available in the environment, but no `device` argument is passed to the `Pipeline` object. Model will be on CPU.
Your max_length is set to 200, but your input_length is only 89. Since this is a summarization task, where outputs shorter than the input are typically wanted, you might consider decreasing max_length manually, e.g. summarizer('...', max_length=44)


Summary based on prompt:
This article has been accepted for publication in a future issue of this journal. To cite the paper please use thedoi provided on the Digital Library page. Provide a concise summary of the methodology used in this research. IoT with AI canbecomeahugebreakthrough. Thisismuchmorethanthat-easinghumanlife. Thereare,however,someseriousissueslikethesecurityconcernsandethicalissueswhich willgoonplaguingIo T. Thisarticlemainlyrevolves with the concept that virtual objects are connected to each other. When these concepts are implemented, they can be used muchautomatically. Suchaworld wouldbedatawealthy,usingwhichdriving. ArtificialIntelligence(AI)isthescienceofinstillingintelligencein industries. AIbasedsystemsareevolvingrapidly.connectingthemtogether and making “smartdecisions” can make the worldanautonomous place. Humansintelligence is actually ‘taking’a                emulating humanlearningaswell asadataanalysis(DA)[2]mod-perfectdecisionattheappropriatetime. Mostoft
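
Both runs so far print the same two warnings: no `device` is passed to the pipeline, and `max_length` (200) exceeds the length of a short input chunk. A minimal sketch of how both could be addressed follows. The helper names `pick_device` and `length_bounds` are hypothetical. Halving the input length mirrors the values the warning itself suggests (43 for 87 tokens, 44 for 89). Passing a device string to `pipeline` is assumed to be supported by the installed `transformers` version.

```python
def pick_device():
    """Return a device string: "cuda:0", "mps", or "cpu".

    Falls back to "cpu" when PyTorch is not installed.
    """
    try:
        import torch
    except ImportError:
        return "cpu"
    if torch.cuda.is_available():
        return "cuda:0"
    # torch.backends.mps is missing in older PyTorch releases, hence getattr.
    mps = getattr(torch.backends, "mps", None)
    if mps is not None and mps.is_available():
        return "mps"
    return "cpu"


def length_bounds(input_length, max_length=200, min_length=30):
    """Clamp the summary limits so max_length is at most half the input length."""
    capped_max = max(1, min(max_length, input_length // 2))
    capped_min = min(min_length, capped_max)
    return capped_min, capped_max


# Possible use inside the summarization loop (not executed here):
#   summarizer = pipeline("summarization", model=model, device=pick_device())
#   n_tokens = len(summarizer.tokenizer(chunk)["input_ids"])
#   lo, hi = length_bounds(n_tokens, max_length=max_length)
#   summary = summarizer(chunk, max_length=hi, min_length=lo, do_sample=False)

if __name__ == "__main__":
    print(pick_device())
    print(length_bounds(87))
```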

## Final Prompt

In [3]:
import pdfplumber
from transformers import pipeline

# Function to extract text from PDF
def extract_text_from_pdf(pdf_path):
    text = ""
    with pdfplumber.open(pdf_path) as pdf:
        for page in pdf.pages:
            page_text = page.extract_text()
            if page_text:
                text += page_text
    return text

# Function to summarize text using Hugging Face transformer models (PyTorch backend)
def summarize_text_based_on_prompt(text, prompt, model="facebook/bart-large-cnn", max_length=200):
    summarizer = pipeline("summarization", model=model, framework="pt")  # Use PyTorch
    max_input_length = 1024  # Chunk size in characters (not tokens); keep each chunk within the model's input limit
    text_chunks = [text[i:i+max_input_length] for i in range(0, len(text), max_input_length)]
    summaries = []
    for chunk in text_chunks:
        full_text_with_prompt = prompt + " " + chunk
        try:
            summary = summarizer(full_text_with_prompt, max_length=max_length, min_length=30, do_sample=False)
            if summary:
                summaries.append(summary[0]['summary_text'])
        except IndexError:
            summaries.append("Error: Summary index out of range.")
    return " ".join(summaries)

# Example usage
pdf_path = "/Users/ravishankar/Desktop/GenAI-Virtual-Internships/Artificial_Intelligence_in_Internet_of_Things.pdf"  # Your PDF file path
prompt = "Summaries and Analyze Insights from the research paper. "
text = extract_text_from_pdf(pdf_path)
print("Extracted text preview:")
print(text[:500])  # Print the first 500 characters to verify

summary = summarize_text_based_on_prompt(text, prompt)
print("Summary based on prompt:")
print(summary)


Extracted text preview:
ReView by River Valley Technologies CAAI Transactions on Intelligence Technology
This article has been accepted for publication in a future issue of this journal, but has not been fully edited.
Content may change prior to final publication in an issue of the journal. To cite the paper please use the doi provided on the Digital Library page.
IETResearchJournals
Artificial Intelligence in Internet of Things ISSN1751-8644
doi:0000000000
www.ietdl.org
AshishGhosh1 ∗,DebasritaChakraborty2,AnweshaLaw3


Hardware accelerator e.g. GPU is available in the environment, but no `device` argument is passed to the `Pipeline` object. Model will be on CPU.
Your max_length is set to 200, but your input_length is only 91. Since this is a summarization task, where outputs shorter than the input are typically wanted, you might consider decreasing max_length manually, e.g. summarizer('...', max_length=45)


Summary based on prompt:
This article has been accepted for publication in a future issue of this journal. To cite the paper please use thedoi provided on the Digital Library page. Summaries and Analyze Insights from the research paper. IoT with AI canbecomeahugebreakthrough. Thisismuchmorethanthat-easinghumanlife. There are,however,someseriousissueslikethesecurityconcernsandethicalissueswhich willgoonplaguingIo T. Thisarticlemainlyrevolvespitalnearby. Itwillagainneedcertainconnectionsandinformation around in. It wouldbesmarterifitcoulditcouldatleastreducedistractions knowledge could be extracted. ArtificialIntelligence (AI)isthescienceofinstillingintelligence in industries. AIbasedsystemsareevolvingrapidlyconnectingthemtogether and making “smartdecisions’ Humansintelligence is actually ‘taking’a ‘trulyemulating’ humanlearning as well asadataanalysis. ML would createetechniquestofacilitatelearning andDA wouldevaluate/analyse all the data. The Internet of Things (IoT) is a growing field
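
All three summaries contain the prompt sentence verbatim. That is consistent with `facebook/bart-large-cnn` treating the prepended prompt as part of the source text rather than as an instruction, since it is a summarization model fine-tuned on CNN/DailyMail and is not instruction-tuned. As a minimal sketch, assuming the goal is only to clean the printed result, the echoed prompt can be removed afterwards. The helper `strip_prompt_echo` is hypothetical. It hides the echo but does not make the model follow the prompt.

```python
def strip_prompt_echo(summary, prompt):
    """Remove verbatim copies of the prompt from a summary and normalize spacing."""
    needle = prompt.strip()
    if not needle:
        return summary
    # Replace each echoed prompt with a space, then collapse repeated whitespace.
    cleaned = summary.replace(needle, " ")
    return " ".join(cleaned.split())


if __name__ == "__main__":
    prompt = "Summarize the key findings of the paper."
    raw = "To cite the paper use the doi. Summarize the key findings of the paper. IoT with AI."
    print(strip_prompt_echo(raw, prompt))
```

Dropping the `prompt + " " + chunk` concatenation and passing each chunk on its own is another option, though it would change the recorded summaries.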