<a href="https://colab.research.google.com/github/MarMarhoun/freelance_work/blob/main/side_projects/NLP_projs/eda_streamlit/book_sum.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Book summarization using Streamlit and Hugging Face Transformers

This notebook walks through a user-friendly web application that summarizes a book: users upload a text file, the app preprocesses the text, and a pre-trained summarization model generates the summary. The app is built with Streamlit, and the model is T5 (t5-base) loaded through the Hugging Face transformers library; the code below runs it on PyTorch. Here's a step-by-step guide on how to create such an application:

Install the required libraries:

First, make sure the necessary libraries are installed. The code below imports torch, so the command installs PyTorch; sentencepiece is included because some versions of the T5 tokenizer may need it:

In [None]:
!pip install streamlit torch transformers sentencepiece


Create a Streamlit application:

Create a new Python file (e.g., app.py) with the following code, which imports the required libraries, loads the model, and defines the app:

In [None]:
import streamlit as st
import torch
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

# Use the GPU when one is available, otherwise fall back to the CPU
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

# Load the pre-trained model and tokenizer for text summarization
tokenizer = AutoTokenizer.from_pretrained("t5-base")
model = AutoModelForSeq2SeqLM.from_pretrained("t5-base").to(device)

# Define a function for summarizing text
def summarize_text(text, max_length=160, min_length=30, num_beams=4, early_stopping=True):
    # T5 expects a task prefix. Input beyond 512 tokens is truncated,
    # so only the beginning of a long book is summarized here.
    inputs = tokenizer("summarize: " + text, return_tensors="pt", max_length=512, truncation=True)
    input_ids = inputs["input_ids"].to(device)

    summary_ids = model.generate(
        input_ids,
        max_length=max_length,
        min_length=min_length,
        num_beams=num_beams,
        early_stopping=early_stopping,
    )

    # Drop special tokens such as <pad> and </s> from the decoded text
    summary = tokenizer.decode(summary_ids[0], skip_special_tokens=True)
    return summary

# Define a function for loading the book
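def load_book(file):
    # Streamlit passes an in-memory UploadedFile, not a path on disk,
    # so read its bytes directly and decode them as UTF-8 text
    book = file.getvalue().decode("utf-8")
    return book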

# Define a function for preprocessing the book text
def preprocess_book(book):
    # Add your preprocessing steps here
    return book

# Define a function for generating the summary
def generate_summary(book):
    summary = summarize_text(book)
    return summary

# Define the main function
def main():
    # Set up the Streamlit app
    st.set_page_config(page_title="Book Summarizer", page_icon=":books:", layout="wide")

    st.title("Book Summarizer")
    st.subheader("Upload a book and get a summary")

    # Add a file uploader
    uploaded_file = st.file_uploader("Upload a book", type="txt")

    if uploaded_file is not None:
        # Load the book
        book = load_book(uploaded_file)

        # Preprocess the book text
        preprocessed_book = preprocess_book(book)

        # Generate the summary
        summary = generate_summary(preprocessed_book)

        # Display the summary
        st.subheader("Summary")
        st.write(summary)

# Run the main function
if __name__ == "__main__":
    main()

To run the app, execute the following command (the leading ! runs it as a shell command from a notebook cell; omit it in a terminal):

In [None]:
!streamlit run app.py

This code provides a basic structure for a Streamlit app that allows users to upload a book, preprocess the text data, and generate a summary using a trained summarization model. You can further customize the preprocessing and summarization functions to improve the results.
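
As one possible way to customize preprocessing, the sketch below replaces the placeholder preprocess_book with a version whose cleaning steps can be switched on and off. The flag names strip_citations and collapse_whitespace are hypothetical and chosen only for illustration, and the two cleaning rules are assumptions about what a typical book text needs rather than a complete pipeline.

```python
import re

def preprocess_book(book, strip_citations=True, collapse_whitespace=True):
    """Clean raw book text; each step can be switched off independently."""
    if strip_citations:
        # Remove bracketed numeric markers such as [12]
        book = re.sub(r"\[[0-9]+\]", "", book)
    if collapse_whitespace:
        # Replace runs of spaces, tabs, and newlines with a single space
        book = re.sub(r"\s+", " ", book).strip()
    return book

sample = "Call me Ishmael.[1]\n\n  Some years ago,\tnever mind how long."
print(preprocess_book(sample))
# prints: Call me Ishmael. Some years ago, never mind how long.
```

In the Streamlit app, each flag could be driven by an `st.checkbox` so that users decide which steps to apply before summarizing.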



To extend the app with more advanced features, consider the following improvements:

Add user interface elements for customizing the summary, such as adjusting the length or enabling/disabling certain preprocessing steps.

Summarize long books in chunks: split the book into smaller chunks, summarize each chunk separately, and then combine the chunk summaries into a summary of the entire book. This matters because models such as t5-base are normally run on inputs of about 512 tokens, far less than a full book.

Incorporate a user authentication system to save and load summaries for specific users.

Implement a real-time text editor for users to edit the book text before summarizing.

Add a progress bar to show the status of the summary generation process.

Implement a feedback system to allow users to rate the quality of the summary and provide suggestions for improvement.

Use a more advanced summarization model, such as BART or Pegasus, to improve the quality of the summary.

Implement a caching system to speed up the summary generation process for large books.

Add a search functionality to allow users to search for specific keywords or phrases in the book and summarize the results.

Implement a recommendation system to suggest similar books based on the summary.
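
The chunk-and-combine idea from the list above can be sketched in plain Python. The function names split_into_chunks and summarize_long_text are hypothetical, the chunk size is counted in words as a rough stand-in for tokens, and summarize_fn is a placeholder for whatever summarizer the app uses (for example the summarize_text function defined earlier). The demo uses a trivial stand-in summarizer so that it runs without downloading a model.

```python
def split_into_chunks(text, max_words=400):
    """Split text into consecutive chunks of at most max_words words."""
    words = text.split()
    return [" ".join(words[i:i + max_words]) for i in range(0, len(words), max_words)]

def summarize_long_text(text, summarize_fn, max_words=400):
    """Summarize each chunk with summarize_fn and join the partial summaries."""
    chunks = split_into_chunks(text, max_words)
    partial_summaries = [summarize_fn(chunk) for chunk in chunks]
    return " ".join(partial_summaries)

# Stand-in summarizer for demonstration: keeps the first three words of a chunk
def first_three_words(chunk):
    return " ".join(chunk.split()[:3])

demo_text = " ".join(f"w{i}" for i in range(10))
print(split_into_chunks(demo_text, max_words=4))
# prints: ['w0 w1 w2 w3', 'w4 w5 w6 w7', 'w8 w9']
print(summarize_long_text(demo_text, first_three_words, max_words=4))
# prints: w0 w1 w2 w4 w5 w6 w8 w9
```

Splitting on word boundaries is the simplest choice; splitting on sentence or chapter boundaries would likely give more coherent chunk summaries, and the joined result could itself be summarized once more if it is still too long.
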

The updated app.py below takes a first step: it cleans the book text with regular expressions and shows the cleaned text above the summary.

app.py:

In [None]:
import re

import streamlit as st
import torch
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

# Use the GPU when one is available, otherwise fall back to the CPU
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

# Load the pre-trained model and tokenizer for text summarization
tokenizer = AutoTokenizer.from_pretrained("t5-base")
model = AutoModelForSeq2SeqLM.from_pretrained("t5-base").to(device)

# Define a function for summarizing text
def summarize_text(text, max_length=160, min_length=30, num_beams=4, early_stopping=True):
    # T5 expects a task prefix. Input beyond 512 tokens is truncated,
    # so only the beginning of a long book is summarized here.
    inputs = tokenizer("summarize: " + text, return_tensors="pt", max_length=512, truncation=True)
    input_ids = inputs["input_ids"].to(device)

    summary_ids = model.generate(
        input_ids,
        max_length=max_length,
        min_length=min_length,
        num_beams=num_beams,
        early_stopping=early_stopping,
    )

    # Drop special tokens such as <pad> and </s> from the decoded text
    summary = tokenizer.decode(summary_ids[0], skip_special_tokens=True)
    return summary

# Define a function for loading the book
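def load_book(file):
    # Streamlit passes an in-memory UploadedFile, not a path on disk,
    # so read its bytes directly and decode them as UTF-8 text
    book = file.getvalue().decode("utf-8")
    return book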

# Define a function for preprocessing the book text
def preprocess_book(book):
    # Remove bracketed reference markers such as [12]
    book = re.sub(r'\[[0-9]*\]', '', book)
    # Collapse runs of whitespace into single spaces and trim the ends
    book = re.sub(r'\s+', ' ', book)
    book = book.strip()
    return book

# Define a function for generating the summary
def generate_summary(book):
    summary = summarize_text(book)
    return summary

# Define the main function
def main():
    # Set up the Streamlit app
    st.set_page_config(page_title="Book Summarizer", page_icon=":books:", layout="wide")

    st.title("Book Summarizer")
    st.subheader("Upload a book and get a summary")

    # Add a file uploader
    uploaded_file = st.file_uploader("Upload a book", type="txt")

    if uploaded_file is not None:
        # Load the book
        book = load_book(uploaded_file)

        # Preprocess the book text
        preprocessed_book = preprocess_book(book)

        # Display the preprocessed book text
        st.subheader("Preprocessed Book Text")
        st.write(preprocessed_book)

        # Generate the summary
        summary = generate_summary(preprocessed_book)

        # Display the summary
        st.subheader("Summary")
        st.write(summary)

# Run the main function
if __name__ == "__main__":
    main()

Compared with the first version, this code cleans the book text with regular expressions and displays the cleaned text before the summary. You can further customize the preprocessing and summarization functions to improve the results.
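
One customization with a visible payoff is caching, so that summarizing the same text with the same settings twice does not run the model again. The sketch below shows the idea in plain Python with a hypothetical helper, cached_summary, and a stand-in summarizer. It is an illustration under assumptions rather than drop-in app code: a Streamlit script reruns from top to bottom on every interaction, so a module-level dictionary like this one would be reset there, and Streamlit's own `st.cache_data` (for results) and `st.cache_resource` (for the loaded model) decorators are the more likely fit inside the app.

```python
import hashlib

# In-memory store mapping a hash of (text, parameters) to a finished summary
_summary_cache = {}

def cached_summary(text, summarize_fn, **params):
    """Return a stored summary when the same text and parameters were seen before."""
    key_source = text + "|" + repr(sorted(params.items()))
    key = hashlib.sha256(key_source.encode("utf-8")).hexdigest()
    if key not in _summary_cache:
        _summary_cache[key] = summarize_fn(text, **params)
    return _summary_cache[key]

# Stand-in summarizer that records each call so cache hits are observable
calls = []
def fake_summarize(text, max_length=160):
    calls.append(text)
    return text[:max_length]

first = cached_summary("A long book text.", fake_summarize, max_length=6)
second = cached_summary("A long book text.", fake_summarize, max_length=6)
print(first, len(calls))
# prints: A long 1
```

Including the generation parameters in the cache key means that changing, say, the summary length produces a fresh summary instead of returning a stale one.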

