# <p style="text-align: center; color: purple; font-weight: bold;">🕌✨ Text Summarization of Tourist Place - Jaipur ✨🕌</p>


![image.png](attachment:image.png)

# <span style="color:red">Key Terms </span>

## **NLTK**
* **Natural Language Toolkit (NLTK)** is a powerful Python library used to work with human language data (also known as text data). It helps us process and analyze large amounts of text easily.

## **Corpus**
* A **corpus** is a large collection of text data. NLTK's `nltk.corpus` module gives access to many such collections, including books, news articles, and speeches, which are used for training and testing natural language processing (NLP) models. This project uses its English stopword list.

## **Heapq**
* **Heapq** is a module in Python's standard library that implements a heap-based priority queue. Its `nlargest` function returns the highest-ranked items from a collection, and in this project we use it to get the sentences with the highest scores for the summary.
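
As a small illustration of how `heapq.nlargest` can pick the top-scoring keys from a dictionary, here is a sketch with made-up scores. The sentence labels and numbers are hypothetical and are not taken from the Jaipur text.

```python
import heapq

# Hypothetical sentence scores, invented for illustration only
toy_scores = {"sentence A": 1.2, "sentence B": 3.4, "sentence C": 0.7, "sentence D": 2.5}

# Iterating over a dict yields its keys; key=toy_scores.get ranks each key by its value
top_two = heapq.nlargest(2, toy_scores, key=toy_scores.get)
print(top_two)  # ['sentence B', 'sentence D']
```

The result is ordered from highest to lowest score, which is why the summary later in this notebook lists sentences by importance rather than by their position in the article.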

## **Punkt**
* **Punkt** is a pre-trained model in NLTK that is used for **sentence splitting**. It helps us break a large block of text into individual sentences.

## **Stopwords**
* **Stopwords** are common words like "the," "is," "in," and "at" that usually don’t carry much meaning in a sentence. We ignore them while processing the text so that the scoring focuses on the important words instead.

## **Tokenization**
* **Tokenization** is the process of splitting a piece of text into smaller units like **words** or **sentences**. For example, splitting the sentence “Jaipur is a city” into individual words: [‘Jaipur’, ‘is’, ‘a’, ‘city’].
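
To make the idea concrete without requiring any NLTK downloads, the sketch below uses a simple regular expression as a stand-in tokenizer. This is an assumption for illustration only: it is not how `nltk.word_tokenize` works internally, and the helper name `simple_word_tokens` is hypothetical.

```python
import re

def simple_word_tokens(text):
    """Rough stand-in for word tokenization: runs of word characters, or single punctuation marks."""
    return re.findall(r"\w+|[^\w\s]", text)

tokens = simple_word_tokens("Jaipur is a city.")
print(tokens)  # ['Jaipur', 'is', 'a', 'city', '.']
```

Note that the final period becomes its own token. Tokenizers commonly treat punctuation this way, which matters later when word frequencies are counted.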

## **Word Frequency**
* **Word frequency** refers to how often a word appears in a given text. Words that appear more frequently are considered more important in this project.

## **Sentence Scoring**
* **Sentence scoring** is the process of giving each sentence a score based on the importance of the words in that sentence. Higher scores mean the sentence is more important for our summary.

## **Summarization**
* **Summarization** means creating a shorter version of the original text while keeping the main points. We use sentence scores to decide which sentences to include in the summary.


In [1]:
# Requirements 
!pip install nltk

# Download the necessary nltk datasets (punkt for sentence splitting, stopwords for removing common words)
import nltk
nltk.download('punkt')
nltk.download('stopwords')

[nltk_data] Downloading package punkt to
[nltk_data]     C:\Users\MADHAVI\AppData\Roaming\nltk_data...
[nltk_data]   Unzipping tokenizers\punkt.zip.
[nltk_data] Downloading package stopwords to
[nltk_data]     C:\Users\MADHAVI\AppData\Roaming\nltk_data...
[nltk_data]   Package stopwords is already up-to-date!


True

## <span style="color:red">Step 1: Installing the required library</span>

In [None]:
# Install the necessary libraries
!pip install nltk


## <span style="color:red">Step 2: Importing libraries and downloading resources</span>

In [3]:
# Importing the required libraries
import nltk   # NLTK is a powerful library for working with human language data
import heapq  # heapq.nlargest retrieves the highest-scoring sentences

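# Downloading necessary NLTK resources
# Note: newer NLTK releases may ask for 'punkt_tab' instead of 'punkt';
# if sent_tokenize raises a LookupError, run nltk.download('punkt_tab') as well
nltk.download('punkt')   # Punkt is used to split sentences
nltk.download('stopwords')  # Stopwords are common words like 'the', 'is' which we ignore in summarization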


[nltk_data] Downloading package punkt to
[nltk_data]     C:\Users\MADHAVI\AppData\Roaming\nltk_data...
[nltk_data]   Package punkt is already up-to-date!
[nltk_data] Downloading package stopwords to
[nltk_data]     C:\Users\MADHAVI\AppData\Roaming\nltk_data...
[nltk_data]   Package stopwords is already up-to-date!


True

## <span style="color:red">Step 3: Text to summarize</span>


In [None]:
# The text we want to summarize
text = '''Jaipur, the capital of Rajasthan, is often called the "Pink City" because of the distinct color of its buildings. Known for its rich history, royal palaces, and vibrant culture, Jaipur is a major tourist attraction in India. The city is part of the famous Golden Triangle tourist circuit, which also includes Delhi and Agra.

One of the main attractions in Jaipur is the Amber Fort, a magnificent fort built with red sandstone and marble. Visitors can explore its grand halls, courtyards, and even take an elephant ride up to the fort, experiencing the royal lifestyle of the past. The Hawa Mahal, or the "Palace of Winds," is another iconic landmark of Jaipur. Its unique honeycomb-shaped facade with numerous small windows was built so the royal women could observe street life while remaining unseen.

Jaipur is also home to the City Palace, which is a blend of Mughal and Rajasthani architecture. This palace complex includes museums that showcase royal costumes, armory, and art. Nearby is the Jantar Mantar, a UNESCO World Heritage site that houses an impressive collection of astronomical instruments built in the early 18th century.

Beyond its architectural wonders, Jaipur is known for its lively markets filled with colorful textiles, jewelry, and handcrafted items. Tourists flock to places like Johari Bazaar and Bapu Bazaar to buy traditional Rajasthani souvenirs. The city’s rich culinary traditions, featuring dishes like dal baati churma and ghewar, offer a delightful experience for food lovers.

Jaipur also hosts several cultural festivals throughout the year, including the famous Jaipur Literature Festival and Teej. These festivals provide a glimpse into the vibrant traditions and artistic heritage of Rajasthan. With its royal charm, architectural marvels, and cultural richness, Jaipur is a must-visit destination for anyone exploring India.'''

## <span style="color:red">Step 4: Preparing stopwords and sentences</span>

In [None]:
# List of common words (stopwords) that we will ignore during summarization
stopwords = nltk.corpus.stopwords.words('english')

# Splitting the text into a list of sentences
sentence_list = nltk.sent_tokenize(text)


## <span style="color:red">Step 5: Calculating word frequencies</span>

In [None]:
# Making a dictionary to store the frequency of each word (how often it appears)
frequency_map = {}

# Splitting the article into individual words
word_list = nltk.word_tokenize(text)

# Calculating frequency of each word (excluding stopwords)
for word in word_list:
    if word.lower() not in stopwords:  # Ignoring common words
        if word not in frequency_map:  # If the word is new, add it to the dictionary with count 1
            frequency_map[word] = 1
        else:
            frequency_map[word] += 1  # If the word exists, increase its count
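
The loop above stores each word under its original casing and also counts punctuation tokens such as commas. Because the sentence-scoring code in this notebook looks words up with `word.lower()`, a capitalized word like "Jaipur" is only matched if it also appears in lowercase somewhere in the text. One possible variant, sketched below with `collections.Counter`, lowercases every key and skips tokens that contain no letters or digits. The token list and the tiny stopword set are made up so that the snippet runs without NLTK data; in the notebook they would correspond to `word_list` and `stopwords`.

```python
from collections import Counter

# Hypothetical stand-ins so the snippet runs without NLTK data
toy_tokens = ["Jaipur", "is", "the", "Pink", "City", ",", "and", "Jaipur", "is", "royal", "."]
toy_stopwords = {"is", "the", "and"}

# Lowercase each token, drop stopwords, and skip tokens made only of punctuation
toy_frequency_map = Counter(
    token.lower()
    for token in toy_tokens
    if token.lower() not in toy_stopwords and any(ch.isalnum() for ch in token)
)
print(toy_frequency_map)
```

Adopting a variant like this would likely change which sentences rank highest, since commas would no longer contribute to the scores.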


## <span style="color:red">Step 6: Normalizing word frequencies</span>

In [None]:
# Finding the maximum frequency of any word
max_frequency = max(frequency_map.values())

# Adjusting the frequency of each word (scaling it so that the highest frequency is 1)
for word in frequency_map:
    frequency_map[word] = frequency_map[word] / max_frequency
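
As a quick check of the scaling, here is a sketch with invented counts (the words and numbers are hypothetical). Dividing by the maximum keeps every value between 0 and 1, with the most frequent word at exactly 1.

```python
# Hypothetical raw counts, invented for illustration
toy_counts = {"jaipur": 10, "royal": 5, "fort": 2}

toy_max = max(toy_counts.values())

# A dict comprehension builds a new normalized map instead of updating in place
toy_normalized = {word: count / toy_max for word, count in toy_counts.items()}
print(toy_normalized)  # {'jaipur': 1.0, 'royal': 0.5, 'fort': 0.2}
```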


## <span style="color:red">Step 7: Scoring sentences based on word frequency</span>

In [None]:
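# Assigning scores to sentences based on word frequencies
sent_scores = {}

for sent in sentence_list:
    for word in nltk.word_tokenize(sent):
        if word.lower() in frequency_map and len(sent.split(' ')) < 35:  # Only sentences with fewer than 35 space-separated words are scored
            if sent not in sent_scores:  # If the sentence is new, assign its score
                sent_scores[sent] = frequency_map[word.lower()]
            else:
                sent_scores[sent] += frequency_map[word.lower()]  # Add word frequency to sentence score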
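
Sentence scoring adds up the normalized frequency of every known word in a sentence. The sketch below shows that idea on toy data. The function name `score_sentences`, its `max_words` parameter, and the inputs are all hypothetical, and a plain whitespace split stands in for NLTK tokenization so that the snippet runs on its own.

```python
def score_sentences(sentences, frequency_map, max_words=35):
    """Sum the frequency of each known word; skip sentences with max_words or more words."""
    scores = {}
    for sentence in sentences:
        words = sentence.lower().split()
        if len(words) >= max_words:
            continue
        # Strip surrounding punctuation so that "fort." matches "fort"
        total = sum(frequency_map.get(word.strip('.,!?"'), 0) for word in words)
        if total > 0:
            scores[sentence] = total
    return scores

# Hypothetical inputs, invented for illustration
toy_frequencies = {"jaipur": 1.0, "fort": 0.5, "royal": 0.25}
toy_sentences = ["Jaipur has a royal fort.", "The weather is warm."]

toy_sentence_scores = score_sentences(toy_sentences, toy_frequencies)
print(toy_sentence_scores)  # {'Jaipur has a royal fort.': 1.75}
```

Because the score is a plain sum, longer sentences tend to score higher. The length cutoff limits that effect without removing it.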


## <span style="color:red">Step 8: Getting the summary</span>

In [None]:
# Getting the top 5 sentences with the highest scores for the summary
summary_sentences = heapq.nlargest(5, sent_scores, key=sent_scores.get)

# Printing the final summary as key points
print("Summary in Key Points:\n")
for sentence in summary_sentences:
    print(f"- {sentence}")
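
`heapq.nlargest` returns sentences ordered by score, so the key points may appear in a different order than in the article. If reading order matters, one option is to sort the selected sentences by their position in the sentence list. The sketch below uses made-up sentences and scores; the `toy_` names are hypothetical.

```python
import heapq

# Hypothetical sentences in document order, with invented scores
toy_sentence_list = ["First point.", "Second point.", "Third point.", "Fourth point."]
toy_sent_scores = {"First point.": 0.4, "Second point.": 1.8, "Third point.": 0.9, "Fourth point.": 1.2}

# Pick the top 3 by score, then restore the order in which they appear in the text
top_sentences = heapq.nlargest(3, toy_sent_scores, key=toy_sent_scores.get)
ordered_summary = sorted(top_sentences, key=toy_sentence_list.index)

print(ordered_summary)  # ['Second point.', 'Third point.', 'Fourth point.']
```

Using `list.index` as the sort key is fine for a handful of sentences. For long documents, a precomputed position dictionary would avoid repeated scans.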


# <p style="text-align: center; color: red; font-weight: bold;">Congratulations!</p>

![image.png](attachment:image.png)

### We have successfully created a program that summarizes an article about Jaipur, focusing on its amazing attractions, rich history, and vibrant culture. This helps us understand how to extract key information from a larger text using Python.


In [None]:
# The complete code in a single cell


In [1]:
#pip install nltk
import nltk   # importing the nltk library to work with text data
import heapq  # heapq is used to get the highest scores for our summary

# nltk.download('punkt')   # punkt is used for splitting sentences
# nltk.download('stopwords')  # stopwords contains common words like 'the', 'is', which we ignore

text = ''' Jaipur, the capital of Rajasthan, is often called the "Pink City" because of the distinct color of its buildings. Known for its rich history, royal palaces, and vibrant culture, Jaipur is a major tourist attraction in India. The city is part of the famous Golden Triangle tourist circuit, which also includes Delhi and Agra.

One of the main attractions in Jaipur is the Amber Fort, a magnificent fort built with red sandstone and marble. Visitors can explore its grand halls, courtyards, and even take an elephant ride up to the fort, experiencing the royal lifestyle of the past. The Hawa Mahal, or the "Palace of Winds," is another iconic landmark of Jaipur. Its unique honeycomb-shaped facade with numerous small windows was built so the royal women could observe street life while remaining unseen.

Jaipur is also home to the City Palace, which is a blend of Mughal and Rajasthani architecture. This palace complex includes museums that showcase royal costumes, armory, and art. Nearby is the Jantar Mantar, a UNESCO World Heritage site that houses an impressive collection of astronomical instruments built in the early 18th century.

Beyond its architectural wonders, Jaipur is known for its lively markets filled with colorful textiles, jewelry, and handcrafted items. Tourists flock to places like Johari Bazaar and Bapu Bazaar to buy traditional Rajasthani souvenirs. The city’s rich culinary traditions, featuring dishes like dal baati churma and ghewar, offer a delightful experience for food lovers.

Jaipur also hosts several cultural festivals throughout the year, including the famous Jaipur Literature Festival and Teej. These festivals provide a glimpse into the vibrant traditions and artistic heritage of Rajasthan. With its royal charm, architectural marvels, and cultural richness, Jaipur is a must-visit destination for anyone exploring India.'''

# List of common words (stopwords) that we will ignore during summarization
stopwords = nltk.corpus.stopwords.words('english')

# Splitting the text into a list of sentences
sentence_list = nltk.sent_tokenize(text)

# Making a dictionary to store the frequency of each word (how often they appear)
frequency_map = {}

# Splitting the article into individual words
word_list = nltk.word_tokenize(text)

# Calculating frequency of each word (excluding stopwords)
for word in word_list:
    if word.lower() not in stopwords:  # Ignoring common words
        if word not in frequency_map:  # If the word is new, add it to the dictionary with count 1
            frequency_map[word] = 1
        else:
            frequency_map[word] += 1  # If the word exists, increase its count

# Finding the maximum frequency of any word
max_frequency = max(frequency_map.values())

# Adjusting the frequency of each word (scaling it so that the highest frequency is 1)
for word in frequency_map:
    frequency_map[word] = frequency_map[word] / max_frequency

# Assigning scores to sentences based on word frequencies
sent_scores = {}

for sent in sentence_list:
    for word in nltk.word_tokenize(sent):
        if word.lower() in frequency_map and len(sent.split(' ')) < 35:  # Only sentences with fewer than 35 space-separated words are scored
            if sent not in sent_scores:  # If the sentence is new, assign its score
                sent_scores[sent] = frequency_map[word.lower()]
            else:
                sent_scores[sent] += frequency_map[word.lower()]  # Add word frequency to sentence score

# Getting the top 5 sentences with the highest scores for the summary
summary_sentences = heapq.nlargest(5, sent_scores, key=sent_scores.get)

# Printing the final summary as key points
print("Summary in Key Points:\n")
for sentence in summary_sentences:
    print(f"- {sentence}")



Summary in Key Points:

- Visitors can explore its grand halls, courtyards, and even take an elephant ride up to the fort, experiencing the royal lifestyle of the past.
- Known for its rich history, royal palaces, and vibrant culture, Jaipur is a major tourist attraction in India.
- With its royal charm, architectural marvels, and cultural richness, Jaipur is a must-visit destination for anyone exploring India.
- Beyond its architectural wonders, Jaipur is known for its lively markets filled with colorful textiles, jewelry, and handcrafted items.
- The city’s rich culinary traditions, featuring dishes like dal baati churma and ghewar, offer a delightful experience for food lovers.
