<a href="https://colab.research.google.com/github/MubasharAli2020/Text-Summarization---NLP-Project/blob/main/Text-Summarization---NLP-Project.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

In [7]:
import nltk

# Fetch the tokenizer models and stop-word list used by the preprocessing cell
nltk.download('punkt')
nltk.download('stopwords')

sample_text = """
Climate Change: A Comprehensive Overview

Climate change refers to significant, long-term changes in the global climate. The term is often used interchangeably with global warming, but climate change encompasses a broader range of changes beyond just rising temperatures. These changes include shifts in weather patterns, precipitation, and more frequent extreme weather events. The primary driver of recent climate change is human activity, particularly the burning of fossil fuels, which increases the concentration of greenhouse gases in the atmosphere.

The Science Behind Climate Change

The Earth’s climate system is influenced by various factors, including solar radiation, volcanic activity, and greenhouse gases. Greenhouse gases, such as carbon dioxide (CO2), methane (CH4), and nitrous oxide (N2O), trap heat in the atmosphere, creating a “greenhouse effect” that warms the planet. While this effect is natural and necessary for life on Earth, human activities have significantly increased the concentration of these gases, enhancing the greenhouse effect and leading to global warming.

Since the Industrial Revolution, the concentration of CO2 in the atmosphere has increased by more than 40%, primarily due to the burning of fossil fuels like coal, oil, and natural gas. Deforestation and land-use changes also contribute to rising CO2 levels by reducing the number of trees that can absorb CO2. Methane, another potent greenhouse gas, is released during the production and transport of coal, oil, and natural gas, as well as from livestock and other agricultural practices.

Impacts of Climate Change

The impacts of climate change are already being felt around the world and are expected to intensify in the coming decades. Some of the most significant impacts include:

Rising Temperatures: Global temperatures have increased by about 1.2°C (2.2°F) since the late 19th century. This warming has led to more frequent and intense heatwaves, which can have severe health impacts, particularly for vulnerable populations.
Melting Ice and Rising Sea Levels: The warming climate is causing glaciers and ice sheets to melt, contributing to rising sea levels. Since 1900, global sea levels have risen by about 20 centimeters (8 inches), and the rate of rise is accelerating. Rising sea levels threaten coastal communities and ecosystems, increasing the risk of flooding and erosion.
Changing Precipitation Patterns: Climate change is altering precipitation patterns, leading to more intense and frequent storms in some regions and prolonged droughts in others. These changes can disrupt water supplies, agriculture, and natural ecosystems.
Ocean Acidification: The oceans absorb about 30% of the CO2 emitted by human activities, which leads to ocean acidification. This process reduces the pH of seawater, affecting marine life, particularly organisms with calcium carbonate shells or skeletons, such as corals and shellfish.
Ecosystem Disruption: Climate change is causing shifts in the distribution and behavior of many species. Some species may be able to adapt or migrate to new areas, but others may face extinction if they cannot cope with the changing conditions. These disruptions can have cascading effects on ecosystems and the services they provide to humans.
Mitigation and Adaptation

Addressing climate change requires both mitigation and adaptation strategies. Mitigation involves reducing or preventing the emission of greenhouse gases, while adaptation involves adjusting to the changes that are already occurring or are expected to occur.

Mitigation Strategies:

Transition to Renewable Energy: Shifting from fossil fuels to renewable energy sources, such as solar, wind, and hydropower, can significantly reduce greenhouse gas emissions. Investing in energy efficiency and conservation measures can also help reduce energy demand.
Carbon Pricing: Implementing carbon pricing mechanisms, such as carbon taxes or cap-and-trade systems, can create economic incentives for reducing emissions. These mechanisms put a price on carbon emissions, encouraging businesses and individuals to adopt cleaner technologies and practices.
Reforestation and Afforestation: Planting trees and restoring forests can help absorb CO2 from the atmosphere. Protecting existing forests and reducing deforestation are also crucial for maintaining carbon sinks.
Sustainable Agriculture: Adopting sustainable agricultural practices, such as agroforestry, conservation tillage, and improved livestock management, can reduce emissions from the agricultural sector. These practices can also enhance soil health and resilience to climate impacts.
Adaptation Strategies:

Infrastructure Resilience: Building resilient infrastructure, such as flood defenses, stormwater management systems, and climate-resilient buildings, can help communities withstand the impacts of climate change. Upgrading existing infrastructure to withstand extreme weather events is also essential.
Water Management: Implementing efficient water management practices, such as rainwater harvesting, water recycling, and improved irrigation techniques, can help address water scarcity and ensure a reliable water supply.
Ecosystem-Based Adaptation: Protecting and restoring natural ecosystems, such as wetlands, mangroves, and coral reefs, can provide natural buffers against climate impacts. These ecosystems can help reduce the risk of flooding, erosion, and storm surges while providing habitat for wildlife.
Community Engagement: Engaging communities in climate adaptation planning and decision-making can ensure that adaptation strategies are locally relevant and effective. Providing education and resources to communities can also enhance their capacity to respond to climate impacts.
The Role of Policy and International Cooperation

Effective climate action requires strong policy frameworks and international cooperation. Governments play a crucial role in setting targets, implementing regulations, and providing funding for climate initiatives. International agreements, such as the Paris Agreement, aim to unite countries in the effort to limit global warming to well below 2°C above pre-industrial levels and pursue efforts to limit the temperature increase to 1.5°C.

Conclusion

Climate change is one of the most pressing challenges of our time, with far-reaching impacts on the environment, economy, and society. Addressing this challenge requires a comprehensive approach that includes both mitigation and adaptation strategies. By transitioning to renewable energy, protecting natural ecosystems, and building resilient communities, we can work towards a sustainable and climate-resilient future.
"""



[nltk_data] Downloading package punkt to /root/nltk_data...
[nltk_data]   Package punkt is already up-to-date!
[nltk_data] Downloading package stopwords to /root/nltk_data...
[nltk_data]   Package stopwords is already up-to-date!


In [8]:
import re
from nltk.corpus import stopwords
from nltk.tokenize import word_tokenize

# Download stopwords if you haven't already
nltk.download('stopwords')
nltk.download('punkt')

stop_words = set(stopwords.words('english'))

def preprocess(text):
    # Remove special characters and extra spaces
    text = re.sub(r'\W', ' ', text)
    text = re.sub(r'\s+', ' ', text)
    # Convert to lowercase
    text = text.lower()
    # Tokenize the text
    tokens = word_tokenize(text)
    # Remove stop words
    tokens = [word for word in tokens if word not in stop_words]
    return ' '.join(tokens)

# Apply preprocessing to your sample text
cleaned_text = preprocess(sample_text)
print(cleaned_text)


climate change comprehensive overview climate change refers significant long term changes global climate term often used interchangeably global warming climate change encompasses broader range changes beyond rising temperatures changes include shifts weather patterns precipitation frequent extreme weather events primary driver recent climate change human activity particularly burning fossil fuels increases concentration greenhouse gases atmosphere science behind climate change earth climate system influenced various factors including solar radiation volcanic activity greenhouse gases greenhouse gases carbon dioxide co2 methane ch4 nitrous oxide n2o trap heat atmosphere creating greenhouse effect warms planet effect natural necessary life earth human activities significantly increased concentration gases enhancing greenhouse effect leading global warming since industrial revolution concentration co2 atmosphere increased 40 primarily due burning fossil fuels like coal oil natural gas def

[nltk_data] Downloading package stopwords to /root/nltk_data...
[nltk_data]   Package stopwords is already up-to-date!
[nltk_data] Downloading package punkt to /root/nltk_data...
[nltk_data]   Package punkt is already up-to-date!
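
To make the effect of each preprocessing step visible, here is a minimal standard-library sketch that mirrors `preprocess` on a one-line input. The small `DEMO_STOP_WORDS` set and the use of `str.split` in place of `word_tokenize` are assumptions made so the snippet runs without NLTK data; the cell above uses the full NLTK stop-word list and tokenizer.

```python
import re

# Hypothetical stand-in for NLTK's English stop-word list (assumption for this demo)
DEMO_STOP_WORDS = {"the", "is", "a", "of", "in", "by"}

def preprocess_demo(text):
    """Mirror the steps of preprocess() using only the standard library."""
    text = re.sub(r"\W", " ", text)   # every non-word character becomes a space
    text = re.sub(r"\s+", " ", text)  # collapse runs of whitespace
    tokens = text.lower().split()     # lowercase, then split on whitespace
    return " ".join(t for t in tokens if t not in DEMO_STOP_WORDS)

print(preprocess_demo("The concentration of CO2 is rising (by 40%) in the atmosphere!"))
# prints: concentration co2 rising 40 atmosphere
```

Note that punctuation such as `%` and the parentheses disappears, which is why the cleaned output above contains bare numbers like `40`.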


In [11]:
!pip install transformers




In [13]:
from transformers import pipeline

# Create a summarization pipeline using BART
bart_summarizer = pipeline("summarization", model="facebook/bart-large-cnn")

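# Summarize long text by splitting it into fixed-size character chunks
def summarize_large_text(text, summarizer, chunk_size=512):
    # chunk_size counts characters, not tokens; plain slicing can cut a chunk
    # mid-word or mid-sentence, which shows up as broken phrases in the output
    text_chunks = [text[i:i+chunk_size] for i in range(0, len(text), chunk_size)]
    summaries = []
    for chunk in text_chunks:
        # max_length and min_length are generation limits measured in tokens
        summary = summarizer(chunk, max_length=150, min_length=50, do_sample=False)
        summaries.append(summary[0]['summary_text'])
    return ' '.join(summaries)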

# Apply the summarization function to your sample text
bart_summary = summarize_large_text(sample_text, bart_summarizer)
print("BART Summary:", bart_summary)





Your max_length is set to 150, but your input_length is only 91. Since this is a summarization task, where outputs shorter than the input are typically wanted, you might consider decreasing max_length manually, e.g. summarizer('...', max_length=45)
Your max_length is set to 150, but your input_length is only 111. Since this is a summarization task, where outputs shorter than the input are typically wanted, you might consider decreasing max_length manually, e.g. summarizer('...', max_length=55)
Your max_length is set to 150, but your input_length is only 107. Since this is a summari

BART Summary: Climate change refers to significant, long-term changes in the global climate. The term is often used interchangeably with global warming, but climate change encompasses a broader range of changes beyond just rising temperatures. These changes include shifts in weather patterns, precipitation, and more frequent extreme weather events. The Earth’s climate system is influenced by various factors, including solar radiation, volcanic activity, and greenhouse gases. Greenhouse gases trap heat in the atmosphere, creating a “greenhouse effect” that warms the planet. While this effect is natural and necessary for life on Earth, human activities have significantly increased the concentration of t. Since the Industrial Revolution, the concentration of CO2 in the atmosphere has increased by more than 40%. Deforestation and land-use changes also contribute to rising CO2 levels by reducing the number of trees that can absorb CO2. Methane, another potent greenhouse gas, is released dur
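
The BART output above contains a fragment ending in "the concentration of t.", which is consistent with a 512-character slice ending in the middle of a word. One possible alternative, sketched here under stated assumptions, is to group whole sentences into chunks. The names `split_sentences` and `chunk_by_sentence` are hypothetical, and the regex splitter is a deliberately naive stand-in; `nltk.sent_tokenize` could be swapped in since `punkt` is already downloaded.

```python
import re

def split_sentences(text):
    """Naive sentence splitter: break after '.', '!' or '?' followed by whitespace."""
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [p for p in parts if p]

def chunk_by_sentence(text, chunk_size=512):
    """Group whole sentences into chunks of at most chunk_size characters.

    A single sentence longer than chunk_size is kept intact as its own chunk.
    """
    chunks, current = [], ""
    for sentence in split_sentences(text):
        candidate = f"{current} {sentence}".strip()
        if current and len(candidate) > chunk_size:
            # Adding this sentence would overflow, so close the current chunk
            chunks.append(current)
            current = sentence
        else:
            current = candidate
    if current:
        chunks.append(current)
    return chunks

demo = "Greenhouse gases trap heat. Human activity adds more of them. The planet warms."
for chunk in chunk_by_sentence(demo, chunk_size=60):
    print(repr(chunk))
# 'Greenhouse gases trap heat.'
# 'Human activity adds more of them. The planet warms.'
```

Headings without terminal punctuation (such as "Impacts of Climate Change") would be merged into the following sentence by this splitter, so it is a starting point rather than a drop-in replacement.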

In [15]:
# Create a summarization pipeline using T5
t5_summarizer = pipeline("summarization", model="t5-base")

# Reuse the generic chunking helper defined alongside the BART pipeline
t5_summary = summarize_large_text(sample_text, t5_summarizer)
print("T5 Summary:", t5_summary)


Your max_length is set to 150, but your input_length is only 97. Since this is a summarization task, where outputs shorter than the input are typically wanted, you might consider decreasing max_length manually, e.g. summarizer('...', max_length=48)
Your max_length is set to 150, but your input_length is only 112. Since this is a summarization task, where outputs shorter than the input are typically wanted, you might consider decreasing max_length manually, e.g. summarizer('...', max_length=56)
Your max_length is set to 150, but your input_length is only 116. Since this is a summarization task, where outputs shorter than the input are typically wanted, you might consider decreasing max_length manually, e.g. summarizer('...', max_length=58)
Your max_length is set to 150, but your input_length is only 104. Since this is a summarization task, where outputs shorter than the input are typically wanted, you might consider decreasing max_length manually, e.g. summarizer('...', max_length=52)
Y

T5 Summary: climate change encompasses a broader range of changes beyond just rising temperatures . primary driver of recent climate change is human activity, particularly the burning of fossil fuels . human activity increases the concentration of carbon dioxide (CO2) into the atmosphere . greenhouse gases, such as carbon dioxide (CO2), methane (CH4), and nitrous oxide (N2O), trap heat in the atmosphere, creating a "greenhouse effect" that warms the planet . human activities have significantly increased the concentration of tetracycline (tc) since the Industrial Revolution, the concentration of CO2 in the atmosphere has increased by more than 40% . deforestation and land-use changes also contribute to rising CO2 levels . methane, another potent greenhouse gas, is released during the production and transport of coal, oil, and natural gas . global temperatures have increased by about 1.2°C (2.2°F) since the late 19th century . this warming has led to more frequent and intense heatwaves, 
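
The repeated "Your max_length is set to 150" warnings arise because each chunk is only around 90 to 115 tokens long while `max_length` is fixed at 150. A minimal sketch of scaling the limits to the input follows; `length_bounds` and its `ratio` and `min_ratio` parameters are hypothetical names, and the 0.5 default simply follows the values the warnings themselves suggest (45 for 91 tokens, 55 for 111).

```python
def length_bounds(input_tokens, ratio=0.5, min_ratio=0.2):
    """Return (min_length, max_length) scaled to the input length in tokens."""
    max_len = max(2, int(input_tokens * ratio))
    # Keep min_length strictly below max_length and at least 1
    min_len = max(1, min(int(input_tokens * min_ratio), max_len - 1))
    return min_len, max_len

for n in (91, 111, 107):
    print(n, length_bounds(n))
# 91 (18, 45)
# 111 (22, 55)
# 107 (21, 53)

# Assumed usage inside the chunk loop (not run here):
#   n_tokens = len(summarizer.tokenizer(chunk)["input_ids"])
#   min_len, max_len = length_bounds(n_tokens)
#   summarizer(chunk, max_length=max_len, min_length=min_len, do_sample=False)
```

Shorter limits would also change the summaries and the ROUGE scores reported later, so the figures in this notebook apply only to the fixed 150/50 setting.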

In [16]:
# Create a summarization pipeline using Pegasus
pegasus_summarizer = pipeline("summarization", model="google/pegasus-xsum")

# Reuse the generic chunking helper defined alongside the BART pipeline.
# Note: google/pegasus-xsum is fine-tuned for one-sentence summaries, so the
# helper's min_length=50 on short chunks may push it to add content that is
# not in the source (compare the output below with the input text).
pegasus_summary = summarize_large_text(sample_text, pegasus_summarizer)
print("Pegasus Summary:", pegasus_summary)


Some weights of PegasusForConditionalGeneration were not initialized from the model checkpoint at google/pegasus-xsum and are newly initialized: ['model.decoder.embed_positions.weight', 'model.encoder.embed_positions.weight']
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.


Your max_length is set to 150, but your input_length is only 86. Since this is a summarization task, where outputs shorter than the input are typically wanted, you might consider decreasing max_length manually, e.g. summarizer('...', max_length=43)
Your max_length is set to 150, but your input_length is only 101. Since this is a summarization task, where outputs shorter than the input are typically wanted, you might consider decreasing max_length manually, e.g. summarizer('...', max_length=50)
Your max_length is set to 150, but your input_length is only 105. Since this is a summarization task, where outputs shorter than the input are typically wanted, you might consider decreasing max_length manually, e.g. summarizer('...', max_length=52)
Your max_length is set to 150, but your input_length is only 96. Since this is a summarization task, where outputs shorter than the input are typically wanted, you might consider decreasing max_length manually, e.g. summarizer('...', max_length=48)
Yo

Pegasus Summary: Research and Markets has announced the addition of the "Climate Change: A Comprehensive Overview" report to their collection of energy and natural resources market reports and data products. The term climate change is often used interchangeably with global warming, but climate change encompasses a broader range of changes beyond just rising temperatures. Climate change refers to the change in the Earth’s climate caused by human activities, such as the burning of coal, oil, and gas for energy, and the greenhouse effect, in which carbon dioxide is released into the atmosphere to trap heat. Carbon dioxide (CO2) is a greenhouse gas that is released into the atmosphere when the burning of fossil fuels such as coal, oil, and natural gas causes the amount of CO2 in the atmosphere to increase. Methane, another potent greenhouse gas, is released during the production and transport of coal, oil, and natural gas, as wel The United Nations Framework Convention on Climate Change (U

In [17]:
# Summaries from different models
bart_summary = summarize_large_text(sample_text, bart_summarizer)
pegasus_summary = summarize_large_text(sample_text, pegasus_summarizer)
t5_summary = summarize_large_text(sample_text, t5_summarizer)


Your max_length is set to 150, but your input_length is only 91. Since this is a summarization task, where outputs shorter than the input are typically wanted, you might consider decreasing max_length manually, e.g. summarizer('...', max_length=45)
Your max_length is set to 150, but your input_length is only 111. Since this is a summarization task, where outputs shorter than the input are typically wanted, you might consider decreasing max_length manually, e.g. summarizer('...', max_length=55)
Your max_length is set to 150, but your input_length is only 107. Since this is a summarization task, where outputs shorter than the input are typically wanted, you might consider decreasing max_length manually, e.g. summarizer('...', max_length=53)
Your max_length is set to 150, but your input_length is only 111. Since this is a summarization task, where outputs shorter than the input are typically wanted, you might consider decreasing max_length manually, e.g. summarizer('...', max_length=55)
Y
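
Since all three models go through the same helper, the per-model calls can also be driven from a dictionary. The sketch below illustrates the pattern with hypothetical stand-in summarizers (`fake_summarizer`) that only mimic the `[{'summary_text': ...}]` return shape of a pipeline, so it runs without downloading any model; `summarize_chunks` is likewise an illustrative name, not part of the notebook.

```python
def summarize_chunks(chunks, summarizer, **generate_kwargs):
    """Summarize each chunk and join the partial summaries with spaces."""
    parts = [summarizer(chunk, **generate_kwargs)[0]["summary_text"] for chunk in chunks]
    return " ".join(parts)

def fake_summarizer(prefix):
    """Hypothetical stand-in that mimics the pipeline's return shape."""
    def run(text, **kwargs):
        # "Summarize" by echoing the first word, tagged with the model prefix
        return [{"summary_text": f"{prefix}:{text.split()[0]}"}]
    return run

summarizers = {"BART": fake_summarizer("b"), "T5": fake_summarizer("t")}
chunks = ["Climate change is real.", "Mitigation reduces emissions."]

summaries = {
    name: summarize_chunks(chunks, model, do_sample=False)
    for name, model in summarizers.items()
}
print(summaries)
# {'BART': 'b:Climate b:Mitigation', 'T5': 't:Climate t:Mitigation'}
```

With real pipelines, the dictionary values would be `bart_summarizer`, `t5_summarizer`, and `pegasus_summarizer`, and adding a fourth model would need only one more entry.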

In [18]:
!pip install rouge-score



In [19]:
from rouge_score import rouge_scorer

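def evaluate_summary(reference, summary):
    # ROUGE-1 measures unigram overlap; ROUGE-L uses the longest common subsequence
    scorer = rouge_scorer.RougeScorer(['rouge1', 'rougeL'], use_stemmer=True)
    # score() takes the reference text first, then the candidate summary
    scores = scorer.score(reference, summary)
    return scores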


In [20]:
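# Evaluate BART summary. The source text stands in for a reference summary,
# so these scores measure overlap with the input, not agreement with a
# human-written summary.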
bart_scores = evaluate_summary(sample_text, bart_summary)
print("BART ROUGE Scores:", bart_scores)

# Evaluate Pegasus summary
pegasus_scores = evaluate_summary(sample_text, pegasus_summary)
print("Pegasus ROUGE Scores:", pegasus_scores)

# Evaluate T5 summary
t5_scores = evaluate_summary(sample_text, t5_summary)
print("T5 ROUGE Scores:", t5_scores)


BART ROUGE Scores: {'rouge1': Score(precision=0.9435975609756098, recall=0.6522655426765016, fmeasure=0.7713395638629283), 'rougeL': Score(precision=0.8704268292682927, recall=0.6016859852476291, fmeasure=0.7115264797507789)}
Pegasus ROUGE Scores: {'rouge1': Score(precision=0.5552367288378766, recall=0.4077976817702845, fmeasure=0.47023086269744835), 'rougeL': Score(precision=0.30272596843615496, recall=0.22233930453108536, fmeasure=0.2563791008505468)}
T5 ROUGE Scores: {'rouge1': Score(precision=0.9408502772643254, recall=0.5363540569020021, fmeasure=0.6832214765100671), 'rougeL': Score(precision=0.8724584103512015, recall=0.49736564805057953, fmeasure=0.6335570469798657)}
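
To see why BART and T5 reach precision near 0.94 while Pegasus sits near 0.56, it helps to compute ROUGE-1 by hand. The sketch below is a simplified version that assumes lowercased whitespace tokens and no stemming, so its numbers will not match `rouge_score` exactly; the three example strings are invented for illustration.

```python
from collections import Counter

def rouge1(reference, summary):
    """Unigram-overlap precision, recall and F1 on lowercased whitespace tokens."""
    ref = Counter(reference.lower().split())
    hyp = Counter(summary.lower().split())
    overlap = sum((ref & hyp).values())  # clipped count of shared unigrams
    precision = overlap / max(sum(hyp.values()), 1)
    recall = overlap / max(sum(ref.values()), 1)
    f1 = 0.0 if overlap == 0 else 2 * precision * recall / (precision + recall)
    return precision, recall, f1

source = "greenhouse gases trap heat in the atmosphere and warm the planet"
extractive = "greenhouse gases trap heat"        # copied from the source
abstractive = "carbon emissions heat the earth"  # reworded

print(rouge1(source, extractive))   # precision 1.0, recall 4/11
print(rouge1(source, abstractive))  # precision 0.4, recall 2/11
```

When the reference is the source document itself, a summary that copies sentences scores high precision by construction, while a summary that paraphrases is penalized even if it is faithful. The gap between models in the scores above is therefore best read as a measure of how extractive each output is.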
