**TEXT SUMMARIZATION**

In [4]:
text = """Renewable energy comes from natural sources that are constantly replenished, such as sunlight, wind, rain, tides, waves, and geothermal heat. Unlike fossil fuels, which are limited and polluting, renewable energy is sustainable and eco-friendly.

Renewable energy helps protect the environment by reducing greenhouse gas emissions and air pollution, both major causes of climate change and health issues. It also enhances energy security by reducing dependence on imported fuels and diversifying energy sources. This shift can protect countries from geopolitical risks and price fluctuations related to fossil fuels.

Investing in renewable energy technologies boosts economic growth by creating jobs in various sectors, such as manufacturing, installation, and maintenance. Moreover, renewable energy sources are inexhaustible, unlike fossil fuels, making them crucial for a sustainable energy system in the future.However, the transition to renewable energy faces challenges, like the intermittency of sources such as solar and wind, which don't produce energy continuously. Advancements in energy storage technologies and updating the grid infrastructure are necessary to address these issues. Investment in research, supportive policies, and international cooperation are essential to ensure a successful shift to renewable energy, leading to a cleaner, healthier, and more sustainable world."""

In [5]:
len(text)

1396

**IMPORTING THE LIBRARIES AND LOADING THE MODEL**

In [7]:
import spacy
from spacy.lang.en.stop_words import STOP_WORDS
from string import punctuation

In [8]:
nlp = spacy.load("en_core_web_sm")

In [9]:
doc = nlp(text)

In [10]:
tokens = [token.text for token in doc]
print(tokens)

['Renewable', 'energy', 'comes', 'from', 'natural', 'sources', 'that', 'are', 'constantly', 'replenished', ',', 'such', 'as', 'sunlight', ',', 'wind', ',', 'rain', ',', 'tides', ',', 'waves', ',', 'and', 'geothermal', 'heat', '.', 'Unlike', 'fossil', 'fuels', ',', 'which', 'are', 'limited', 'and', 'polluting', ',', 'renewable', 'energy', 'is', 'sustainable', 'and', 'eco', '-', 'friendly', '.', '\n\n', 'Renewable', 'energy', 'helps', 'protect', 'the', 'environment', 'by', 'reducing', 'greenhouse', 'gas', 'emissions', 'and', 'air', 'pollution', ',', 'both', 'major', 'causes', 'of', 'climate', 'change', 'and', 'health', 'issues', '.', 'It', 'also', 'enhances', 'energy', 'security', 'by', 'reducing', 'dependence', 'on', 'imported', 'fuels', 'and', 'diversifying', 'energy', 'sources', '.', 'This', 'shift', 'can', 'protect', 'countries', 'from', 'geopolitical', 'risks', 'and', 'price', 'fluctuations', 'related', 'to', 'fossil', 'fuels', '.', '\n\n', 'Investing', 'in', 'renewable', 'energy', 

In [11]:
punctuation

'!"#$%&\'()*+,-./:;<=>?@[\\]^_`{|}~'

In [12]:
punctuation = punctuation + '\n'

In [13]:
punctuation

'!"#$%&\'()*+,-./:;<=>?@[\\]^_`{|}~\n'
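One thing to keep in mind before using this string as a filter: `in` on a Python string is a substring test, not a per-character test. A small sketch of the difference, using only the standard library (the helper name `is_punct_or_space` is illustrative and not part of the notebook):

```python
from string import punctuation

punct = punctuation + "\n"

# Substring test: a single newline is found, a double newline is not,
# because "\n\n" never occurs as a contiguous run inside `punct`.
print("\n" in punct)     # True
print("\n\n" in punct)   # False

# Hypothetical helper: treat a token as noise when every character
# is punctuation or whitespace.
punct_chars = set(punctuation)

def is_punct_or_space(token_text):
    return all(ch in punct_chars or ch.isspace() for ch in token_text)

print(is_punct_or_space("\n\n"))  # True
print(is_punct_or_space("eco"))   # False
```

This is why a paragraph-break token such as `'\n\n'` can pass a substring-based punctuation check.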

**TEXT CLEANING**

**Build a word-frequency table: how many times each word appears**

In [19]:
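word_freq = {}
stop_words = list(STOP_WORDS)
for word in doc:
  # Skip stop words (compared in lowercase).
  if word.text.lower() not in stop_words:
    # Skip punctuation; this is a substring test against the punctuation string.
    if word.text.lower() not in punctuation:
      # Counts are keyed on the original token text, so case is preserved.
      if word.text not in word_freq.keys():
        word_freq[word.text] = 1
      else:
        word_freq[word.text] += 1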



In [24]:
print(word_freq)

{'Renewable': 2, 'energy': 12, 'comes': 1, 'natural': 1, 'sources': 4, 'constantly': 1, 'replenished': 1, 'sunlight': 1, 'wind': 2, 'rain': 1, 'tides': 1, 'waves': 1, 'geothermal': 1, 'heat': 1, 'Unlike': 1, 'fossil': 3, 'fuels': 4, 'limited': 1, 'polluting': 1, 'renewable': 5, 'sustainable': 3, 'eco': 1, 'friendly': 1, '\n\n': 2, 'helps': 1, 'protect': 2, 'environment': 1, 'reducing': 2, 'greenhouse': 1, 'gas': 1, 'emissions': 1, 'air': 1, 'pollution': 1, 'major': 1, 'causes': 1, 'climate': 1, 'change': 1, 'health': 1, 'issues': 2, 'enhances': 1, 'security': 1, 'dependence': 1, 'imported': 1, 'diversifying': 1, 'shift': 2, 'countries': 1, 'geopolitical': 1, 'risks': 1, 'price': 1, 'fluctuations': 1, 'related': 1, 'Investing': 1, 'technologies': 2, 'boosts': 1, 'economic': 1, 'growth': 1, 'creating': 1, 'jobs': 1, 'sectors': 1, 'manufacturing': 1, 'installation': 1, 'maintenance': 1, 'inexhaustible': 1, 'unlike': 1, 'making': 1, 'crucial': 1, 'system': 1, 'future': 1, 'transition': 1, 
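The output above lists `'Renewable'` and `'renewable'` as separate keys, and `'\n\n'` as a word, because the counts are keyed on the original token text. A minimal sketch of a case-insensitive variant follows; it is an alternative, not the notebook's method. It works on plain strings so it runs without a spaCy model, and the function name `count_words` and the tiny stop-word set are assumptions for illustration.

```python
from collections import Counter
from string import punctuation

def count_words(tokens, stop_words):
    """Count lowercased tokens, skipping stop words and punctuation/whitespace."""
    noise = set(punctuation)
    counts = Counter()
    for tok in tokens:
        key = tok.lower()
        if key in stop_words:
            continue
        # Skip tokens made up only of punctuation or whitespace (e.g. "," or "\n\n").
        if all(ch in noise or ch.isspace() for ch in key):
            continue
        counts[key] += 1
    return counts

tokens = ["Renewable", "energy", "is", "sustainable", ".", "\n\n",
          "renewable", "energy", "helps"]
demo_stop_words = {"is"}  # stand-in for spaCy's STOP_WORDS
freq = count_words(tokens, demo_stop_words)
print(freq["renewable"], freq["energy"])  # 2 2
```

With a spaCy `doc`, the same function could be fed `[token.text for token in doc]` and `STOP_WORDS`.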

In [27]:
max_freq = max(word_freq.values())

**Normalize the scores by the maximum frequency, which is 12 here**
Raw counts depend on the length of the text, which makes them hard to compare directly. Dividing each count by the maximum rescales them to the range (0, 1], with the most frequent word scoring exactly 1.0.

In [29]:
for word in word_freq.keys():
  word_freq[word] = word_freq[word] / max_freq

In [30]:
print(word_freq)

{'Renewable': 0.16666666666666666, 'energy': 1.0, 'comes': 0.08333333333333333, 'natural': 0.08333333333333333, 'sources': 0.3333333333333333, 'constantly': 0.08333333333333333, 'replenished': 0.08333333333333333, 'sunlight': 0.08333333333333333, 'wind': 0.16666666666666666, 'rain': 0.08333333333333333, 'tides': 0.08333333333333333, 'waves': 0.08333333333333333, 'geothermal': 0.08333333333333333, 'heat': 0.08333333333333333, 'Unlike': 0.08333333333333333, 'fossil': 0.25, 'fuels': 0.3333333333333333, 'limited': 0.08333333333333333, 'polluting': 0.08333333333333333, 'renewable': 0.4166666666666667, 'sustainable': 0.25, 'eco': 0.08333333333333333, 'friendly': 0.08333333333333333, '\n\n': 0.16666666666666666, 'helps': 0.08333333333333333, 'protect': 0.16666666666666666, 'environment': 0.08333333333333333, 'reducing': 0.16666666666666666, 'greenhouse': 0.08333333333333333, 'gas': 0.08333333333333333, 'emissions': 0.08333333333333333, 'air': 0.08333333333333333, 'pollution': 0.08333333333333
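As a quick check of the arithmetic, here is a minimal sketch using three of the raw counts shown earlier (`energy`: 12, `renewable`: 5, `fossil`: 3). The dictionary names are illustrative; building a new dict rather than updating in place keeps the raw counts available.

```python
raw_counts = {"energy": 12, "renewable": 5, "fossil": 3}

# Divide every count by the largest one so the top word scores 1.0.
max_count = max(raw_counts.values())
normalized = {word: count / max_count for word, count in raw_counts.items()}

print(normalized["energy"])  # 1.0
print(normalized["fossil"])  # 0.25
```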

**SENTENCE TOKENIZATION**

In [31]:
sent_tokens = [sent for sent in doc.sents]
print(sent_tokens)

[Renewable energy comes from natural sources that are constantly replenished, such as sunlight, wind, rain, tides, waves, and geothermal heat., Unlike fossil fuels, which are limited and polluting, renewable energy is sustainable and eco-friendly.

, Renewable energy helps protect the environment by reducing greenhouse gas emissions and air pollution, both major causes of climate change and health issues., It also enhances energy security by reducing dependence on imported fuels and diversifying energy sources., This shift can protect countries from geopolitical risks and price fluctuations related to fossil fuels.

, Investing in renewable energy technologies boosts economic growth by creating jobs in various sectors, such as manufacturing, installation, and maintenance., Moreover, renewable energy sources are inexhaustible, unlike fossil fuels, making them crucial for a sustainable energy system in the future., However, the transition to renewable energy faces challenges, like the in

In [32]:
sent_score = {}

In [34]:
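for sent in sent_tokens:
  for word in sent:
    # The lookup is lowercased, so word_freq keys that contain capitals never match here.
    if word.text.lower() in word_freq.keys():
      # A sentence's score is the sum of the normalized frequencies of its words.
      if sent not in sent_score.keys():
        sent_score[sent] = word_freq[word.text.lower()]
      else:
        sent_score[sent] += word_freq[word.text.lower()]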



In [35]:
print(sent_score)

{Renewable energy comes from natural sources that are constantly replenished, such as sunlight, wind, rain, tides, waves, and geothermal heat.: 2.7500000000000004, Unlike fossil fuels, which are limited and polluting, renewable energy is sustainable and eco-friendly.

: 2.8333333333333335, Renewable energy helps protect the environment by reducing greenhouse gas emissions and air pollution, both major causes of climate change and health issues.: 2.916666666666668, It also enhances energy security by reducing dependence on imported fuels and diversifying energy sources.: 3.2499999999999996, This shift can protect countries from geopolitical risks and price fluctuations related to fossil fuels.

: 1.5833333333333335, Investing in renewable energy technologies boosts economic growth by creating jobs in various sectors, such as manufacturing, installation, and maintenance.: 2.3333333333333335, Moreover, renewable energy sources are inexhaustible, unlike fossil fuels, making them crucial fo
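The scoring loop adds up the normalized frequency of every token whose lowercase form is a key in `word_freq`. A minimal sketch of the same idea on plain token lists, so it runs without spaCy; `score_sentence` and the sample values are assumptions for illustration. Unlike the loop above, it returns 0 for a sentence with no matching words instead of omitting it.

```python
def score_sentence(tokens, word_freq):
    """Sum the frequency weights of the tokens that appear in word_freq."""
    return sum(word_freq.get(tok.lower(), 0.0) for tok in tokens)

word_freq = {"energy": 1.0, "renewable": 0.5, "fossil": 0.25}

sentences = [
    ["Renewable", "energy", "is", "sustainable", "."],
    ["Fossil", "fuels", "are", "limited", "."],
]
scores = [score_sentence(s, word_freq) for s in sentences]
print(scores)  # [1.5, 0.25]
```

Because the score is a plain sum, longer sentences tend to score higher. Dividing by the number of scored words is one possible variant, not something the notebook does.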

**SELECT THE TOP 30% OF SENTENCES BY SCORE**

In [36]:
from heapq import nlargest

In [38]:
len(sent_score) * 0.3

3.0
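`len(sent_score) * 0.3` returns a float, while `nlargest` expects an integer count. A small sketch of deriving the count from a ratio; the helper name `summary_size` and the choice to round and keep at least one sentence are assumptions, not the notebook's rule.

```python
def summary_size(n_sentences, ratio=0.3):
    """Number of sentences to keep: a ratio of the total, at least one."""
    return max(1, round(n_sentences * ratio))

print(summary_size(10))  # 3
print(summary_size(2))   # 1
```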

In [40]:
summary = nlargest(n=3, iterable=sent_score, key=sent_score.get)

In [41]:
print(summary)

[Moreover, renewable energy sources are inexhaustible, unlike fossil fuels, making them crucial for a sustainable energy system in the future., However, the transition to renewable energy faces challenges, like the intermittency of sources such as solar and wind, which don't produce energy continuously., It also enhances energy security by reducing dependence on imported fuels and diversifying energy sources.]


**COMBINE ALL THE TEXT**

In [43]:
final_summary = [sent.text for sent in summary]
print(final_summary)

['Moreover, renewable energy sources are inexhaustible, unlike fossil fuels, making them crucial for a sustainable energy system in the future.', "However, the transition to renewable energy faces challenges, like the intermittency of sources such as solar and wind, which don't produce energy continuously.", 'It also enhances energy security by reducing dependence on imported fuels and diversifying energy sources.']


In [44]:
summary = " ".join(final_summary)
print(summary)

Moreover, renewable energy sources are inexhaustible, unlike fossil fuels, making them crucial for a sustainable energy system in the future. However, the transition to renewable energy faces challenges, like the intermittency of sources such as solar and wind, which don't produce energy continuously. It also enhances energy security by reducing dependence on imported fuels and diversifying energy sources.
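`nlargest` returns sentences in descending score order, so the joined summary above does not follow the order of the source text: the "It also enhances..." sentence comes last although it appears earlier in the original. A sketch of picking the top sentences and then restoring their original order; it uses plain strings and made-up scores, and `top_sentences_in_order` is a hypothetical helper.

```python
from heapq import nlargest

def top_sentences_in_order(sentences, scores, n):
    """Pick the n highest-scoring sentences, returned in source order."""
    best = nlargest(n, range(len(sentences)), key=lambda i: scores[i])
    return [sentences[i] for i in sorted(best)]

sentences = ["First point.", "Second point.", "Third point.", "Fourth point."]
scores = [0.4, 2.0, 0.1, 3.5]  # illustrative values only

print(" ".join(top_sentences_in_order(sentences, scores, 2)))
# Second point. Fourth point.
```

With spaCy spans, a similar effect could be had by sorting the selected sentences on `sent.start` before joining them.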


In [45]:
len(summary)

409

In [47]:
len(summary) / len(text)

0.29297994269340977