<a href="https://colab.research.google.com/github/Utkarsha-906/Content-summarizer/blob/main/content_summatizer.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

In [1]:
import sqlite3
from datetime import datetime
from transformers import pipeline

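# Initialize the summarization pipeline.
# No model is named here, so transformers falls back to its default
# summarization checkpoint; pass model= and revision= to pin a specific one.
summarizer = pipeline("summarization")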

# Database setup
DB_NAME = "summaries.db"

def initialize_database():
    """Create the summaries table if it does not already exist."""
    conn = sqlite3.connect(DB_NAME)
    try:
        conn.execute("""
            CREATE TABLE IF NOT EXISTS summaries (
                id INTEGER PRIMARY KEY AUTOINCREMENT,
                original_text TEXT,
                summary TEXT,
                timestamp TEXT
            )
        """)
        conn.commit()
    finally:
        # Close the connection even if the statement fails.
        conn.close()

def save_to_database(original_text, summary):
    """Save the original text, summary, and timestamp to the database."""
    timestamp = datetime.now().strftime("%Y-%m-%d %H:%M:%S")
    conn = sqlite3.connect(DB_NAME)
    try:
        # Placeholders (?) let sqlite3 escape the values safely.
        conn.execute("""
            INSERT INTO summaries (original_text, summary, timestamp)
            VALUES (?, ?, ?)
        """, (original_text, summary, timestamp))
        conn.commit()
    finally:
        # Close the connection even if the insert fails.
        conn.close()

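def summarize_text(text):
    """Summarize the input text; return None if it is empty or summarization fails."""
    if not text.strip():
        # Nothing to summarize; the caller reports the failure.
        return None
    try:
        # truncation=True clips inputs longer than the model accepts
        # rather than letting them fail.
        summary = summarizer(text, max_length=150, min_length=30,
                             do_sample=False, truncation=True)
        return summary[0]['summary_text']
    except Exception as e:
        print("Error summarizing text:", e)
        return None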

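def get_all_summaries():
    """Retrieve all saved rows as (id, original_text, summary, timestamp) tuples."""
    conn = sqlite3.connect(DB_NAME)
    try:
        # Name the columns explicitly so the tuple order that main() indexes
        # into does not depend on the table definition.
        cursor = conn.execute(
            "SELECT id, original_text, summary, timestamp FROM summaries ORDER BY id"
        )
        return cursor.fetchall()
    finally:
        conn.close()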

def main():
    """Main program loop for summarizing text and saving to database."""
    initialize_database()
    print("Welcome to the Content Summarizer!")

    while True:
        print("\nOptions:")
        print("1. Summarize text")
        print("2. View saved summaries")
        print("3. Exit")
        # Strip stray whitespace so an entry like "1 " is still recognized.
        choice = input("Enter your choice: ").strip()

        if choice == "1":
            text = input("Enter the text to summarize: ")
            summary = summarize_text(text)
            if summary:
                print("\nSummary:\n", summary)
                save_to_database(text, summary)
                print("Summary saved to database.")
            else:
                print("Failed to generate summary.")

        elif choice == "2":
            summaries = get_all_summaries()
            if summaries:
                for row in summaries:
                    print("\nID:", row[0])
                    print("Original Text:", row[1])
                    print("Summary:", row[2])
                    print("Timestamp:", row[3])
                    print("-" * 50)
            else:
                print("No summaries found in the database.")

        elif choice == "3":
            print("Exiting the program. Goodbye!")
            break

        else:
            print("Invalid choice. Please try again.")

if __name__ == "__main__":
    main()
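
The storage helpers above can be exercised without touching `summaries.db`. The following is a minimal sketch, not the notebook's own code: `open_db`, `save_summary`, `load_summaries`, and `demo` are hypothetical names, and the in-memory database (`:memory:`) is an assumption made so the example leaves no file behind. It uses `contextlib.closing` to close the connection on any exit path and `sqlite3.Row` so that columns can be read by name instead of by position.

```python
import sqlite3
from contextlib import closing
from datetime import datetime

SCHEMA = """
    CREATE TABLE IF NOT EXISTS summaries (
        id INTEGER PRIMARY KEY AUTOINCREMENT,
        original_text TEXT,
        summary TEXT,
        timestamp TEXT
    )
"""

def open_db(path=":memory:"):
    """Hypothetical helper: open a connection and make sure the table exists."""
    conn = sqlite3.connect(path)
    conn.row_factory = sqlite3.Row  # rows support access by column name
    conn.execute(SCHEMA)
    conn.commit()
    return conn

def save_summary(conn, original_text, summary):
    """Insert one row and return its id."""
    timestamp = datetime.now().strftime("%Y-%m-%d %H:%M:%S")
    cursor = conn.execute(
        "INSERT INTO summaries (original_text, summary, timestamp) VALUES (?, ?, ?)",
        (original_text, summary, timestamp),
    )
    conn.commit()
    return cursor.lastrowid

def load_summaries(conn):
    """Return every saved row, oldest first."""
    return conn.execute(
        "SELECT id, original_text, summary, timestamp FROM summaries ORDER BY id"
    ).fetchall()

def demo():
    """Round-trip one row through an in-memory database."""
    # closing() guarantees conn.close() runs even if a statement raises.
    with closing(open_db()) as conn:
        first_id = save_summary(conn, "some long text", "short text")
        rows = load_summaries(conn)
        return first_id, [(row["id"], row["summary"]) for row in rows]
```

Passing the connection in as an argument, rather than reopening `DB_NAME` in every function, is a design choice of this sketch: it makes the helpers easy to test against a throwaway database.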


No model was supplied, defaulted to sshleifer/distilbart-cnn-12-6 and revision a4f8f3e (https://huggingface.co/sshleifer/distilbart-cnn-12-6).
Using a pipeline without specifying a model name and revision in production is not recommended.


Device set to use cuda:0


Welcome to the Content Summarizer!

Options:
1. Summarize text
2. View saved summaries
3. Exit
Enter your choice: 1
Enter the text to summarize: Summarize the following text: "Artificial intelligence is a field of computer science that focuses on creating systems capable of performing tasks that require human intelligence..."


Your max_length is set to 150, but your input_length is only 33. Since this is a summarization task, where outputs shorter than the input are typically wanted, you might consider decreasing max_length manually, e.g. summarizer('...', max_length=16)



Summary:
  Artificial Intelligence is a field of computer science that focuses on creating systems capable of performing tasks that require human intelligence . Artificial intelligence is the focus of computer scientists that focus on creating computer systems that perform tasks that need human intelligence...
Summary saved to database.

Options:
1. Summarize text
2. View saved summaries
3. Exit
Enter your choice: 2

ID: 1
Original Text: Summarize the following text: "Artificial intelligence is a field of computer science that focuses on creating systems capable of performing tasks that require human intelligence..."
Summary:  Artificial Intelligence is a field of computer science that focuses on creating systems capable of performing tasks that require human intelligence . Artificial intelligence is the focus of computer scientists that focus on creating computer systems that perform tasks that need human intelligence...
Timestamp: 2024-12-30 09:32:11
--------------------------------
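
The warning in the session above appears because `max_length=150` is larger than the 33-token input, and the summary that comes back restates the input instead of shortening it. One possible way to avoid this is to scale the length bounds to the input. The following is a minimal sketch under stated assumptions: `pick_length_bounds` is a hypothetical helper, and the 0.5 ratio is an illustrative choice, not something the notebook or the library prescribes.

```python
def pick_length_bounds(input_tokens, max_cap=150, min_floor=30, ratio=0.5):
    """Hypothetical helper: choose (min_length, max_length) for a summary.

    max_length is limited to `ratio` of the input length (and to max_cap),
    so the summary stays shorter than the input. min_length is limited to
    half of max_length so the two bounds never conflict.
    """
    if input_tokens < 2:
        raise ValueError("input is too short to summarize")
    max_length = max(1, min(max_cap, int(input_tokens * ratio)))
    min_length = min(min_floor, max_length // 2)
    return min_length, max_length

# Possible use with the pipeline defined in the cell above (assumes the
# token count is taken from the pipeline's own tokenizer):
#   n = len(summarizer.tokenizer(text)["input_ids"])
#   min_len, max_len = pick_length_bounds(n)
#   summarizer(text, min_length=min_len, max_length=max_len, do_sample=False)
```

For the 33-token input in the session, this yields a `max_length` of 16, the same value the warning suggests; for inputs of 300 tokens or more it falls back to the original bounds of 30 and 150.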