# ParaGlow Text Transformation Pipeline (Colab Edition)

**By [Bogoju Bhavana]**

This notebook demonstrates the core NLP pipeline functions (summarization and paraphrasing) using refactored custom classes.

### 🚀 Features:
* **Summarization** → Uses BART Large CNN (via Hugging Face API)
* **Paraphrase Generation** → Uses Groq (LPU) for high-speed inference
* **Modular Code Structure** → Custom classes for clear pipeline steps
* **GPU Detection** → Auto-detects CUDA via PyTorch (model inference itself runs on the hosted APIs)

In [26]:
import os
from google.colab import userdata

# Load HF_API_KEY
try:
    os.environ['HF_API_KEY'] = userdata.get('HF_API_KEY')
    print("✅ Hugging Face Key loaded directly.")
except Exception:  # avoid a bare except, which would also swallow KeyboardInterrupt
    print("❌ HF_API_KEY not found in Colab Secrets.")

# Load GROQ_API_KEY
try:
    os.environ['GROQ_API_KEY'] = userdata.get('GROQ_API_KEY')
    print("✅ Groq Key loaded directly.")
except Exception:  # avoid a bare except, which would also swallow KeyboardInterrupt
    print("❌ GROQ_API_KEY not found in Colab Secrets.")

✅ Hugging Face Key loaded directly.
✅ Groq Key loaded directly.
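
The two key-loading blocks above follow the same pattern, so they could be folded into one helper. The sketch below is only an illustration: `load_secret` and its `getter` parameter are hypothetical names, not part of the notebook, and a stand-in dictionary replaces `userdata.get` so the example also runs outside Colab.

```python
import os
from typing import Callable, Optional


def load_secret(name: str, getter: Optional[Callable[[str], str]] = None) -> bool:
    """Copy a secret into os.environ and report whether it is available.

    `getter` is a hypothetical injection point: in Colab it would be
    `userdata.get`. When omitted, only the existing environment is checked.
    """
    if getter is not None:
        try:
            value = getter(name)
        except Exception:  # a missing secret raises; treat it as "not found"
            value = None
        if value:
            os.environ[name] = value
    return bool(os.environ.get(name))


# Stand-in secret store for illustration (use `userdata.get` in Colab).
fake_store = {"HF_API_KEY": "hf_example"}
for key in ("HF_API_KEY", "GROQ_API_KEY"):
    status = "loaded" if load_secret(key, fake_store.__getitem__) else "missing"
    print(f"{key}: {status}")
```

Returning a boolean instead of printing inside the helper lets the caller decide how to report a missing key.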


In [27]:
!pip install -q python-dotenv requests groq torch tqdm

In [28]:
from groq import Groq
from typing import List
import os
import sys
import requests
import torch
from tqdm import tqdm

# Check GPU availability
device = "cuda" if torch.cuda.is_available() else "cpu"
print(f"✅ Running on: {device.upper()}")

# ====================================================================
# CORE PROCESSOR CLASSES (Minimal Colab Version)
# ====================================================================

# 1.1 GROQ Rewriter (Paraphraser)
class GroqRewriter:
    def __init__(self, api_key):
        if not api_key:
            raise ValueError("GROQ API key is missing.")
        self.client = Groq(api_key=api_key)
        # --- FIXED MODEL NAME ---
        self.model = "llama-3.1-8b-instant"
        # ------------------------

    def paraphrase(self, text: str, num_return_sequences: int = 3) -> List[str]:
        prompt = f"Rewrite the following text in {num_return_sequences} distinct, natural English variations:\n\nTEXT: {text}"

        response = self.client.chat.completions.create(
            model=self.model,
            messages=[
                {"role": "system", "content": "You are a helpful text rephrasing assistant."},
                {"role": "user", "content": prompt}
            ],
            temperature=0.8,
            n=1
        )

        full_text = response.choices[0].message.content
        return [line.strip() for line in full_text.split('\n') if line.strip() and (len(line.split()) > 5)]

# 1.2 HUGGING FACE Summarizer (Abstractive)
class HFSummarizer:
    def __init__(self, api_key):
        if not api_key:
            raise ValueError("HF API key is missing.")
        self.api_url = "https://api-inference.huggingface.co/models/facebook/bart-large-cnn"
        self.headers = {"Authorization": f"Bearer {api_key}"}

    def summarize(self, text: str, length='medium') -> str:
        length_map = {'medium': {"max_length": 130, "min_length": 60}}
        params = length_map.get(length, length_map['medium'])

        payload = {"inputs": text, "parameters": {**params, "do_sample": False}}

        response = requests.post(self.api_url, headers=self.headers, json=payload, timeout=60)

        if response.status_code != 200:
            # Raise instead of returning the error text, so a failure is never mistaken for a summary.
            raise RuntimeError(f"HF API Error: {response.status_code} - {response.text}")

        result = response.json()
        return result[0]['summary_text']

# 1.3 End-to-End Processor
class ParaGlowProcessor:
    def __init__(self, hf_key, groq_key):
        self.hf_key = hf_key
        self.groq_key = groq_key
        self.summarizer = HFSummarizer(hf_key)
        self.paraphraser = GroqRewriter(groq_key)
        print("✨ ParaGlow Processor initialized successfully!")

# Load API Keys (Assuming they are set in Colab Secrets)
HF_API_KEY = os.environ.get("HF_API_KEY")
GROQ_API_KEY = os.environ.get("GROQ_API_KEY")

if not HF_API_KEY or not GROQ_API_KEY:
    print("\n\n⚠️ WARNING: API Keys are not loaded.")
    print("Please use the 'Secrets' tab (key icon) on the left sidebar to securely enter your HF_API_KEY and GROQ_API_KEY.")
else:
    print(f"✅ Keys Loaded. HF Key: {HF_API_KEY[:4]}... Groq Key: {GROQ_API_KEY[:4]}...")

✅ Running on: CPU
✅ Keys Loaded. HF Key: hf_b... Groq Key: gsk_...
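
`GroqRewriter.paraphrase` asks the model for several variations in one reply and then splits that reply on newlines, keeping lines longer than five words. Chat models often number or bullet such lists, so those prefixes end up inside the returned strings. A minimal sketch of a stricter parser follows; `parse_variations` and its parameters are hypothetical, and the prefix pattern is an assumption about typical reply formats rather than something Groq guarantees.

```python
import re
from typing import List

# Matches common list prefixes such as "1.", "2)", "-", "*", or "•".
_PREFIX = re.compile(r"^\s*(?:\d+[.)]|[-*•])\s+")


def parse_variations(raw: str, expected: int = 3, min_words: int = 6) -> List[str]:
    """Extract up to `expected` variants from a raw model reply."""
    variants = []
    for line in raw.splitlines():
        # Remove the list marker and any surrounding quotes before counting words.
        cleaned = _PREFIX.sub("", line).strip().strip('"')
        if len(cleaned.split()) >= min_words:
            variants.append(cleaned)
    return variants[:expected]


raw_reply = (
    "Here are three variations:\n\n"
    "1. AI is reshaping many industries around the world today.\n"
    "2) Across the globe, AI is changing how industries operate.\n"
    "- Short one.\n"
)
for variant in parse_variations(raw_reply):
    print(variant)
```

Unlike the notebook's filter, the word count here is taken after the prefix is stripped, and the result is capped at the number of variants requested.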


In [29]:
def summarize_text(processor, text, max_len=130, min_len=40):
    """
    Generates an abstractive summary using the initialized processor.

    Note: max_len and min_len are currently unused; the summarizer applies
    its 'medium' preset (max_length=130, min_length=60).
    """
    print("📘 Generating Summary...")
    try:
        summary = processor.summarizer.summarize(text, length='medium')
        return summary
    except Exception as e:
        return f"❌ Summarization Error: {e}"


def generate_paraphrases(processor, text, num_return_sequences=3):
    """Generates multiple paraphrased versions of the given text."""
    return processor.paraphraser.paraphrase(text, num_return_sequences=num_return_sequences)


def process_text(processor, input_text):
    """
    End-to-End pipeline for Summarization and Paraphrasing.
    """
    summary = summarize_text(processor, input_text)

    # Failures come back as strings containing "Error:"; do not paraphrase them.
    if "Error:" in summary:
        return {"summary": summary, "paraphrases": ["Error during summarization."]}

    paraphrases = generate_paraphrases(processor, summary, num_return_sequences=3)

    return {
        "summary": summary,
        "paraphrases": paraphrases
    }
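
`summarize_text` accepts `max_len` and `min_len`, yet the call it makes only passes `length='medium'`, so those two arguments never reach the request. One way to honor them would be to build the request payload from the arguments directly. The sketch below assumes a hypothetical helper, `build_summary_payload`, which is not part of the notebook; the `max_length`, `min_length`, and `do_sample` keys are the ones `HFSummarizer` already sends.

```python
from typing import Any, Dict


def build_summary_payload(text: str, max_len: int = 130, min_len: int = 40) -> Dict[str, Any]:
    """Build a summarization request body from explicit length bounds."""
    if min_len < 1 or max_len <= min_len:
        raise ValueError("expected 1 <= min_len < max_len")
    return {
        "inputs": text,
        "parameters": {
            "max_length": max_len,
            "min_length": min_len,
            "do_sample": False,  # deterministic decoding, as in the notebook
        },
    }


payload = build_summary_payload("Some long article text.", max_len=100, min_len=30)
print(payload["parameters"])
```

A summarizer method could accept such a payload in place of the fixed `length_map` lookup, though that change is left as a design choice.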

In [30]:
# Initialize Processor
try:
    # Ensure keys are loaded from the environment before initializing the processor
    HF_API_KEY = os.environ.get("HF_API_KEY")
    GROQ_API_KEY = os.environ.get("GROQ_API_KEY")
    processor = ParaGlowProcessor(HF_API_KEY, GROQ_API_KEY)
except ValueError as e:
    print(f"FATAL ERROR: {e}")
    processor = None

if processor:
    SAMPLE_INPUT = """
    Artificial Intelligence (AI) is transforming industries worldwide. From healthcare to finance,
    AI technologies like natural language processing and computer vision are improving efficiency,
    reducing costs, and enabling new innovations. However, challenges such as data privacy,
    algorithmic bias, and ethical concerns remain at the forefront of public discussion.
    Solving these complex issues will be key to ensuring the responsible and equitable deployment of AI.
    """

    print("\n\n" + "#"*60)
    print("RUNNING PARA GLOW END-TO-END PIPELINE")
    print("#"*60)

    # Run Pipeline
    result = process_text(processor, SAMPLE_INPUT)

    # --- FINAL OUTPUT DISPLAY ---
    if result:
        print("\n" + "="*60)
        print("✅ FINAL SUMMARY:")
        print(result["summary"])

        print("\n" + "="*60)
        print("✅ PARAPHRASES:")
        for i, p in enumerate(result["paraphrases"], 1):
            print(f"  {i}. {p}")
else:
    print("\n[PIPELINE ABORTED] Please check API key setup in the Secrets panel.")

✨ ParaGlow Processor initialized successfully!


############################################################
RUNNING PARA GLOW END-TO-END PIPELINE
############################################################
📘 Generating Summary...

✅ FINAL SUMMARY:
Artificial Intelligence (AI) is transforming industries worldwide. From healthcare to finance, AI technologies are improving efficiency. But challenges such as data privacy, algorithmic bias, and ethical concerns remain. Solving these complex issues will be key to ensuring the responsible and equitable deployment of AI.

✅ PARAPHRASES:
  1. The global impact of Artificial Intelligence (AI) is revolutionizing industries across the board. By streamlining processes, AI technologies are enhancing efficiency in sectors such as healthcare and finance. However, critical issues like data protection, algorithmic bias, and ethics still need to be addressed. Overcoming these complex challenges is essential for the responsible and equitabl