# 🚀 Paraphraser App - Google Colab Edition

[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/)

This notebook covers:
- ✅ Automatic setup for Google Colab
- ✅ Paraphrasing with a T5 model
- ✅ Model performance evaluation (BLEU, ROUGE, inference time)
- ✅ Automatic generation of the `web_app.py` file
- ✅ Streamlit deployment on Colab with ngrok

---

## 📦 Step 1: Install Dependencies

Install all packages required on Colab.

In [None]:
# Install dependencies. `-q` keeps pip quiet; `%%capture` is avoided here
# because it would also hide the confirmation message printed below.
!pip install -q transformers torch nltk streamlit pyngrok
!pip install -q rouge-score sacrebleu

print("✅ All dependencies installed successfully!")
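
To confirm the install worked, it can help to list which versions were actually resolved. The sketch below is an illustrative addition rather than part of the original notebook: `REQUIRED` and `installed_versions` are hypothetical names, and the lookup relies only on the standard-library `importlib.metadata`.

```python
from importlib.metadata import PackageNotFoundError, version

# Distribution names as passed to pip in the install cell (assumed list)
REQUIRED = ["transformers", "torch", "nltk", "streamlit", "pyngrok", "rouge-score", "sacrebleu"]

def installed_versions(packages):
    """Map each distribution name to its installed version, or None if missing."""
    found = {}
    for name in packages:
        try:
            found[name] = version(name)
        except PackageNotFoundError:
            found[name] = None
    return found

versions = installed_versions(REQUIRED)
for name, ver in versions.items():
    print(f"{name:<14} {ver or 'NOT INSTALLED'}")
```

Any package reported as `NOT INSTALLED` points to a failed or skipped pip step.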

## 📚 Step 2: Import Libraries

In [None]:
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM
import torch
import nltk
from nltk.tokenize import sent_tokenize
import time
from IPython.display import display, HTML

# For evaluation
from rouge_score import rouge_scorer
from sacrebleu.metrics import BLEU

# Download NLTK data
try:
    nltk.data.find('tokenizers/punkt_tab')
    print("✅ NLTK punkt_tab already downloaded")
except LookupError:
    print("📥 Downloading NLTK punkt_tab...")
    nltk.download('punkt_tab', quiet=True)
    print("✅ Download complete!")

# Check GPU availability
device = "cuda" if torch.cuda.is_available() else "cpu"
print(f"\n🖥️  Using device: {device.upper()}")
if device == "cuda":
    print(f"   GPU: {torch.cuda.get_device_name(0)}")
    print(f"   Memory: {torch.cuda.get_device_properties(0).total_memory / 1e9:.2f} GB")

## 🤖 Step 3: Load Model and Tokenizer

Load the `humarin/chatgpt_paraphraser_on_T5_base` model from Hugging Face.

In [None]:
print("📥 Loading model and tokenizer...")
start_time = time.time()

tokenizer = AutoTokenizer.from_pretrained("humarin/chatgpt_paraphraser_on_T5_base")
model = AutoModelForSeq2SeqLM.from_pretrained("humarin/chatgpt_paraphraser_on_T5_base").to(device)

load_time = time.time() - start_time
print(f"\n✅ Model loaded successfully in {load_time:.2f} seconds!")

# Model info
total_params = sum(p.numel() for p in model.parameters())
trainable_params = sum(p.numel() for p in model.parameters() if p.requires_grad)
print("📊 Model Parameters:")
print(f"   Total: {total_params:,}")
print(f"   Trainable: {trainable_params:,}")
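
The parameter count also gives a rough idea of how much memory the weights occupy. This is a minimal sketch, assuming 4 bytes per parameter for fp32 and 2 for fp16. `estimate_weight_memory_gb` is a hypothetical helper, and the count of 220 million is a placeholder, not the value reported by the cell above.

```python
def estimate_weight_memory_gb(num_params, bytes_per_param=4):
    """Approximate memory for the weights alone (no activations or caches)."""
    return num_params * bytes_per_param / 1e9

# Hypothetical parameter count for illustration; substitute total_params
example_params = 220_000_000

fp32_gb = estimate_weight_memory_gb(example_params)                     # 4 bytes per weight
fp16_gb = estimate_weight_memory_gb(example_params, bytes_per_param=2)  # 2 bytes per weight
print(f"fp32: {fp32_gb:.2f} GB, fp16: {fp16_gb:.2f} GB")
```

Actual usage during generation is higher, since activations and beam-search state also take memory.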

## ⚙️ Step 4: Define Paraphrase Functions

In [None]:
def paraphrase_sentence(
    sentence,
    num_beams=5,
    num_return_sequences=1,
    repetition_penalty=2.0,
    no_repeat_ngram_size=2,
    temperature=1.0,
    max_length=128
):
    """
    Paraphrase a single sentence using T5 model.
    
    Returns:
        list: List of paraphrased sentences
        float: Inference time in seconds
    """
    start_time = time.time()
    
    input_ids = tokenizer(
        f'paraphrase: {sentence}',
        return_tensors="pt",
        padding="longest",
        max_length=max_length,
        truncation=True,
    ).input_ids.to(device)
    
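    # Beam search is deterministic: `temperature` only takes effect when
    # sampling is enabled (do_sample=True), so it is a no-op at these settings
    outputs = model.generate(
        input_ids,
        temperature=temperature,
        repetition_penalty=repetition_penalty,
        num_return_sequences=num_return_sequences,
        no_repeat_ngram_size=no_repeat_ngram_size,
        num_beams=num_beams,
        max_length=max_length,
        early_stopping=True
    )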

    res = tokenizer.batch_decode(outputs, skip_special_tokens=True)
    inference_time = time.time() - start_time
    
    return res, inference_time


def paraphrase_paragraph(paragraph, verbose=True):
    """
    Paraphrase an entire paragraph by splitting into sentences.
    
    Returns:
        str: Paraphrased paragraph
        float: Total inference time
    """
    sentences = sent_tokenize(paragraph)
    if verbose:
        print(f"📝 Processing {len(sentences)} sentences...")

    paraphrased_sentences = []
    total_time = 0

    for i, sentence in enumerate(sentences, 1):
        if verbose:
            print(f"   {i}/{len(sentences)}: Paraphrasing...", end=" ")
        
        paraphrased_result, inf_time = paraphrase_sentence(
            sentence,
            num_beams=3,
            num_return_sequences=1,
            max_length=128
        )
        paraphrased_sentences.append(paraphrased_result[0])
        total_time += inf_time
        
        if verbose:
            print(f"✅ ({inf_time:.3f}s)")

    paraphrased_paragraph = " ".join(paraphrased_sentences)
    return paraphrased_paragraph, total_time


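def calculate_bleu(reference, hypothesis):
    """Calculate sentence-level BLEU (0-100) of hypothesis against one reference."""
    # sacreBLEU recommends effective_order=True for sentence-level scoring
    bleu = BLEU(effective_order=True)
    score = bleu.sentence_score(hypothesis, [reference])
    return score.score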


def calculate_rouge(reference, hypothesis):
    """Calculate ROUGE scores"""
    scorer = rouge_scorer.RougeScorer(['rouge1', 'rouge2', 'rougeL'], use_stemmer=True)
    scores = scorer.score(reference, hypothesis)
    return {
        'rouge1': scores['rouge1'].fmeasure,
        'rouge2': scores['rouge2'].fmeasure,
        'rougeL': scores['rougeL'].fmeasure
    }

print("✅ Functions defined successfully!")
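
To make the ROUGE numbers less opaque, here is a simplified ROUGE-1 F1 computed by hand. It is a sketch that assumes lowercased whitespace tokens and no stemming, so its values will differ slightly from `rouge_scorer`. `rouge1_f1` is a hypothetical name used only for illustration.

```python
from collections import Counter

def rouge1_f1(reference, hypothesis):
    """Unigram-overlap F1 on lowercased whitespace tokens (no stemming)."""
    ref_counts = Counter(reference.lower().split())
    hyp_counts = Counter(hypothesis.lower().split())
    overlap = sum((ref_counts & hyp_counts).values())  # clipped matching unigrams
    if overlap == 0:
        return 0.0
    precision = overlap / sum(hyp_counts.values())
    recall = overlap / sum(ref_counts.values())
    return 2 * precision * recall / (precision + recall)

score = rouge1_f1("the cat sat on the mat", "a cat sat on a mat")
print(f"{score:.4f}")  # 4 of 6 unigrams match on each side -> 0.6667
```

Swapping "the" for "a" changes two of six tokens, so precision and recall are both 4/6. This is the kind of partial overlap a paraphrase is expected to produce.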

## 🧪 Step 5: Test with Simple Sentence

In [None]:
test_sentence = "Artificial Intelligence is transforming the world of technology."

print(f"📌 Original: {test_sentence}")
print("\n🔄 Generating paraphrases...\n")

results, inf_time = paraphrase_sentence(test_sentence, num_beams=5, num_return_sequences=3)

print("✨ Paraphrased versions:")
for i, result in enumerate(results, 1):
    print(f"   {i}. {result}")

print(f"\n⏱️  Inference time: {inf_time:.3f} seconds")
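
When several candidates are requested, some may closely resemble each other or the input. One possible way to pick the most varied one is to rank candidates by word overlap with the source. The sketch assumes Jaccard similarity on lowercased word sets, a stand-in chosen for simplicity. `rank_by_novelty` and the candidate strings are hypothetical, not model output.

```python
def jaccard_similarity(a, b):
    """Word-set overlap between two strings, in [0, 1]."""
    set_a, set_b = set(a.lower().split()), set(b.lower().split())
    if not set_a and not set_b:
        return 1.0
    return len(set_a & set_b) / len(set_a | set_b)

def rank_by_novelty(source, candidates):
    """Drop exact duplicates, then sort from least to most similar to the source."""
    unique = list(dict.fromkeys(candidates))  # keeps first-seen order
    return sorted(unique, key=lambda c: jaccard_similarity(source, c))

source = "AI is changing technology"
# Hypothetical candidates standing in for the `results` list
candidates = [
    "AI is changing technology",
    "Technology is being reshaped by AI",
    "AI is changing technology",
]
ranked = rank_by_novelty(source, candidates)
print(ranked[0])
```

A very low overlap can also mean the meaning drifted, so the top-ranked candidate still deserves a manual read.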

## 📄 Step 6: Test with Full Paragraph

In [None]:
paragraph = """Hugging Face is an innovative AI company that has become a leading platform for natural language processing (NLP) and machine learning tools. Founded in 2016, it started as a chatbot app but soon pivoted to focus on developing open-source models and libraries for NLP tasks. Hugging Face is best known for its Transformers library, which provides pre-trained models for tasks like text classification, translation, summarization, and question-answering."""

print("="*80)
print("📌 ORIGINAL PARAGRAPH:")
print("="*80)
print(paragraph)
print("\n" + "="*80)

paraphrased, total_time = paraphrase_paragraph(paragraph)

print("\n" + "="*80)
print("✨ PARAPHRASED PARAGRAPH:")
print("="*80)
print(paraphrased)
print("="*80)
print(f"\n⏱️  Total inference time: {total_time:.3f} seconds")
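
The paragraph pipeline is a split, rewrite, join loop. The sketch below isolates that pattern so it can be tried without loading the model: a naive regex splitter stands in for `sent_tokenize`, and any callable can stand in for the paraphraser. Both `split_sentences` and `rewrite_by_sentence` are hypothetical helpers.

```python
import re

def split_sentences(text):
    """Naive splitter: break after ., ! or ? when followed by whitespace."""
    return [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]

def rewrite_by_sentence(text, rewrite_fn):
    """Apply rewrite_fn to each sentence and join the results with spaces."""
    return " ".join(rewrite_fn(sentence) for sentence in split_sentences(text))

# str.upper is a trivial stand-in for the real paraphrase function
result = rewrite_by_sentence("First point. Second point! Third?", str.upper)
print(result)  # FIRST POINT. SECOND POINT! THIRD?
```

The regex splitter would wrongly break on abbreviations such as "e.g.", which is one reason the notebook relies on NLTK instead. Note also that joining with single spaces discards any line breaks in the input.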

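## 📊 Step 7: Model Performance Evaluation

Evaluate the model using BLEU and ROUGE scores computed between each original sentence and its paraphrase. Because the original serves as the reference, these scores measure lexical overlap with the source rather than agreement with a human-written paraphrase.

In [None]:
# Test cases for evaluation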
test_cases = [
    "Machine learning models can learn patterns from data without being explicitly programmed.",
    "Deep learning is a subset of machine learning that uses neural networks.",
    "Natural language processing enables computers to understand human language.",
    "Cloud computing provides on-demand access to computing resources.",
    "Cybersecurity protects systems and networks from digital attacks."
]

print("🎯 Evaluating Model Performance")
print("="*80)

total_bleu = 0
total_rouge1 = 0
total_rouge2 = 0
total_rougeL = 0
total_inf_time = 0

for i, original in enumerate(test_cases, 1):
    print(f"\n📝 Test Case {i}:")
    print(f"   Original: {original}")
    
    paraphrased, inf_time = paraphrase_sentence(original, num_beams=5, num_return_sequences=1)
    paraphrased = paraphrased[0]
    
    print(f"   Paraphrased: {paraphrased}")
    
    # Calculate metrics
    bleu = calculate_bleu(original, paraphrased)
    rouge = calculate_rouge(original, paraphrased)
    
    total_bleu += bleu
    total_rouge1 += rouge['rouge1']
    total_rouge2 += rouge['rouge2']
    total_rougeL += rouge['rougeL']
    total_inf_time += inf_time
    
    print("   📊 Metrics:")
    print(f"      BLEU: {bleu:.2f}")
    print(f"      ROUGE-1: {rouge['rouge1']:.4f}")
    print(f"      ROUGE-2: {rouge['rouge2']:.4f}")
    print(f"      ROUGE-L: {rouge['rougeL']:.4f}")
    print(f"      Inference Time: {inf_time:.3f}s")

n = len(test_cases)
print("\n" + "="*80)
print("📈 AVERAGE PERFORMANCE METRICS")
print("="*80)
print(f"   Average BLEU Score: {total_bleu/n:.2f}")
print(f"   Average ROUGE-1: {total_rouge1/n:.4f}")
print(f"   Average ROUGE-2: {total_rouge2/n:.4f}")
print(f"   Average ROUGE-L: {total_rougeL/n:.4f}")
print(f"   Average Inference Time: {total_inf_time/n:.3f}s")
print(f"   Total Processing Time: {total_inf_time:.3f}s")
print("="*80)

# Display summary
display(HTML(f"""
<div style='background-color: #e8f4f8; padding: 20px; border-radius: 10px; margin-top: 20px;'>
    <h3 style='color: #0066cc; margin-top: 0;'>🎯 Model Performance Summary</h3>
    <table style='width: 100%; border-collapse: collapse;'>
        <tr style='background-color: #d0e8f0;'>
            <th style='padding: 10px; text-align: left; border: 1px solid #0066cc;'>Metric</th>
            <th style='padding: 10px; text-align: center; border: 1px solid #0066cc;'>Score</th>
        </tr>
        <tr>
            <td style='padding: 10px; border: 1px solid #ccc;'>BLEU Score</td>
            <td style='padding: 10px; text-align: center; border: 1px solid #ccc; font-weight: bold;'>{total_bleu/n:.2f}</td>
        </tr>
        <tr style='background-color: #f9f9f9;'>
            <td style='padding: 10px; border: 1px solid #ccc;'>ROUGE-1 F1</td>
            <td style='padding: 10px; text-align: center; border: 1px solid #ccc; font-weight: bold;'>{total_rouge1/n:.4f}</td>
        </tr>
        <tr>
            <td style='padding: 10px; border: 1px solid #ccc;'>ROUGE-2 F1</td>
            <td style='padding: 10px; text-align: center; border: 1px solid #ccc; font-weight: bold;'>{total_rouge2/n:.4f}</td>
        </tr>
        <tr style='background-color: #f9f9f9;'>
            <td style='padding: 10px; border: 1px solid #ccc;'>ROUGE-L F1</td>
            <td style='padding: 10px; text-align: center; border: 1px solid #ccc; font-weight: bold;'>{total_rougeL/n:.4f}</td>
        </tr>
        <tr>
            <td style='padding: 10px; border: 1px solid #ccc;'>Avg Inference Time</td>
            <td style='padding: 10px; text-align: center; border: 1px solid #ccc; font-weight: bold;'>{total_inf_time/n:.3f}s</td>
        </tr>
    </table>
</div>
"""))
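
Averages alone can hide how much scores vary between test cases. As an optional extension, the standard-library `statistics` module can report spread as well. The scores below are made-up placeholders, not measured results, and `summarize` is a hypothetical helper.

```python
from statistics import mean, stdev

def summarize(values):
    """Return mean, sample standard deviation, min, and max for a list of scores."""
    spread = stdev(values) if len(values) > 1 else 0.0
    return {"mean": mean(values), "stdev": spread, "min": min(values), "max": max(values)}

# Hypothetical per-sentence ROUGE-1 scores, for illustration only
rouge1_scores = [0.50, 0.60, 0.40, 0.70, 0.55]
summary = summarize(rouge1_scores)
print({key: round(value, 4) for key, value in summary.items()})
```

With only five test sentences the spread is a rough indicator at best. Collecting per-case scores in lists, rather than running totals, makes this kind of summary straightforward.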

## ✏️ Step 8: Try Your Own Text!

In [None]:
# Replace this variable with your own text
your_text = "The advancement of artificial intelligence has revolutionized many industries."

print(f"📌 Original:\n{your_text}")
print("\n" + "-"*80 + "\n")

your_paraphrase, inf_time = paraphrase_paragraph(your_text)

print(f"\n✨ Paraphrased:\n{your_paraphrase}")
print(f"\n⏱️  Time: {inf_time:.3f}s")

## 🌐 Step 9: Create Streamlit Web App File

Generate the `web_app.py` file automatically in Colab.

In [None]:
webapp_code = '''from transformers import AutoTokenizer, AutoModelForSeq2SeqLM
import torch
from nltk.tokenize import sent_tokenize
import streamlit as st
import time
import nltk

# For evaluation metrics
from rouge_score import rouge_scorer
from sacrebleu.metrics import BLEU

# Download NLTK data if not exists
try:
    nltk.data.find('tokenizers/punkt_tab')
except LookupError:
    nltk.download('punkt_tab')

# Streamlit UI configuration. set_page_config must be the first Streamlit
# command, so it runs before the cached model loader and its spinner.
st.set_page_config(
    page_title="AI Paraphraser Pro",
    page_icon="🔄",
    layout="wide",
    initial_sidebar_state="expanded"
)

# Setup device and load model (cached so it loads once per server process)
@st.cache_resource
def load_model():
    device = "cuda" if torch.cuda.is_available() else "cpu"
    tokenizer = AutoTokenizer.from_pretrained("humarin/chatgpt_paraphraser_on_T5_base")
    model = AutoModelForSeq2SeqLM.from_pretrained("humarin/chatgpt_paraphraser_on_T5_base").to(device)
    return tokenizer, model, device

tokenizer, model, device = load_model()

def paraphrase_one_sentence(
    question,
    num_beams=5,
    num_return_sequences=1,
    repetition_penalty=2.0,
    no_repeat_ngram_size=2,
    temperature=1.0,
    max_length=128
):
    input_ids = tokenizer(
        f'paraphrase: {question}',
        return_tensors="pt",
        padding="longest",
        max_length=max_length,
        truncation=True,
    ).input_ids.to(device)

    outputs = model.generate(
        input_ids,
        temperature=temperature,
        repetition_penalty=repetition_penalty,
        num_return_sequences=num_return_sequences,
        no_repeat_ngram_size=no_repeat_ngram_size,
        num_beams=num_beams,
        max_length=max_length,
        early_stopping=True
    )

    res = tokenizer.batch_decode(outputs, skip_special_tokens=True)
    return res

def paraphrase(paragraph):
    sentences = sent_tokenize(paragraph)
    paraphrased_sentences = []

    for sentence in sentences:
        paraphrased_result = paraphrase_one_sentence(
            sentence,
            num_beams=3,
            num_return_sequences=1,
            max_length=128
        )
        paraphrased_sentences.append(paraphrased_result[0])

    paraphrased_paragraph = " ".join(paraphrased_sentences)
    return paraphrased_paragraph

def calculate_bleu(reference, hypothesis):
    """Calculate sentence-level BLEU (0-100) of hypothesis against one reference."""
    # sacreBLEU recommends effective_order=True for sentence-level scoring
    bleu = BLEU(effective_order=True)
    score = bleu.sentence_score(hypothesis, [reference])
    return score.score

def calculate_rouge(reference, hypothesis):
    """Calculate ROUGE scores"""
    scorer = rouge_scorer.RougeScorer(['rouge1', 'rouge2', 'rougeL'], use_stemmer=True)
    scores = scorer.score(reference, hypothesis)
    return {
        'rouge1': scores['rouge1'].fmeasure,
        'rouge2': scores['rouge2'].fmeasure,
        'rougeL': scores['rougeL'].fmeasure
    }

# Custom CSS for modern UI
st.markdown("""
    <style>
        .main {
            background: linear-gradient(135deg, #667eea 0%, #764ba2 100%);
        }
        .stTextArea textarea {
            font-size: 16px;
            border-radius: 10px;
            border: 2px solid #667eea;
        }
        .stButton>button {
            background: linear-gradient(90deg, #00c6ff 0%, #0072ff 100%);
            color: white;
            font-size: 18px;
            font-weight: bold;
            border-radius: 10px;
            padding: 15px 30px;
            border: none;
            transition: all 0.3s ease;
            width: 100%;
        }
        .stButton>button:hover {
            transform: scale(1.05);
            box-shadow: 0 5px 15px rgba(0,114,255,0.4);
        }
        h1 {
            color: white;
            text-align: center;
            font-size: 3em;
            text-shadow: 2px 2px 4px rgba(0,0,0,0.3);
            margin-bottom: 10px;
        }
        h2 {
            color: white;
            text-shadow: 1px 1px 2px rgba(0,0,0,0.2);
        }
        h3 {
            color: #333;
        }
        .metric-card {
            background: white;
            padding: 20px;
            border-radius: 15px;
            box-shadow: 0 4px 6px rgba(0,0,0,0.1);
            margin: 10px 0;
            text-align: center;
        }
        .metric-title {
            font-size: 14px;
            color: #666;
            font-weight: 600;
            margin-bottom: 8px;
        }
        .metric-value {
            font-size: 28px;
            font-weight: bold;
            color: #0072ff;
        }
        .metric-subtitle {
            font-size: 12px;
            color: #999;
            margin-top: 5px;
        }
        .performance-header {
            background: linear-gradient(90deg, #00c6ff 0%, #0072ff 100%);
            color: white;
            padding: 15px;
            border-radius: 10px;
            text-align: center;
            font-size: 20px;
            font-weight: bold;
            margin: 20px 0 10px 0;
        }
        .score-excellent {
            color: #10b981;
            font-weight: bold;
        }
        .score-good {
            color: #3b82f6;
            font-weight: bold;
        }
        .score-fair {
            color: #f59e0b;
            font-weight: bold;
        }
        .score-poor {
            color: #ef4444;
            font-weight: bold;
        }
    </style>
""", unsafe_allow_html=True)

st.title("🔄 AI Paraphraser Pro")
st.markdown(
    "<p style='text-align: center; color: white; font-size: 1.2em; margin-top: -20px;'>"
    "Transform your text with AI-powered paraphrasing & real-time performance metrics"
    "</p>", 
    unsafe_allow_html=True
)

# Sidebar
with st.sidebar:
    st.header("ℹ️ About")
    st.info(
        "This app uses the **T5 transformer model** to paraphrase text with "
        "real-time performance evaluation using BLEU and ROUGE metrics."
    )
    
    st.header("⚙️ Model Info")
    st.write(f"**Device:** {device.upper()}")
    if device == "cuda":
        st.write(f"**GPU:** {torch.cuda.get_device_name(0)}")
    st.write("**Model:** humarin/chatgpt_paraphraser_on_T5_base")
    st.write("**Base Architecture:** T5-base")
    
    st.header("📊 Metrics Explained")
    with st.expander("BLEU Score"):
        st.write("""
        **BLEU (Bilingual Evaluation Understudy)**
        - Originally designed for machine translation
        - Measures exact n-gram matches between texts
        - Range: 0-100
        - **For paraphrasing: 10-35 is EXCELLENT!**
        - Why? Because good paraphrase = lots of synonyms = low exact matches
        - ⚠️ High BLEU (>75) = barely paraphrased
        """)
    
    with st.expander("ROUGE Scores"):
        st.write("""
        **ROUGE (Recall-Oriented Understudy for Gisting Evaluation)**
        - **ROUGE-1:** Unigram (single word) overlap
        - **ROUGE-2:** Bigram (2-word phrase) overlap  
        - **ROUGE-L:** Longest common subsequence
        - Range: 0-1
        - **For paraphrasing: 0.4-0.7 is ideal**
        - This ensures meaning is preserved while allowing variation
        """)
    
    with st.expander("How to Interpret"):
        st.write("""
        ✅ **Excellent Paraphrase:**
        - BLEU: 10-35 (high word variation!)
        - ROUGE-1: 0.4-0.7 (meaning preserved)
        - ROUGE-L: 0.35-0.7 (structure maintained)
        
        ✅ **Very Good Paraphrase:**
        - BLEU: 20-50
        - ROUGE-1: 0.5-0.8
        
        ⚠️ **Too Similar (barely paraphrased):**
        - BLEU: >75
        - ROUGE-1: >0.9
        
        ⚠️ **Needs Review (verify meaning):**
        - ROUGE-1: <0.25
        - ROUGE-L: <0.2
        
        **Remember:** Low BLEU + Moderate ROUGE = Perfect! 🎯
        """)

# Main content
st.markdown("<br>", unsafe_allow_html=True)

col1, col2 = st.columns(2)

with col1:
    st.header("📝 Original Text")
    input_text = st.text_area(
        "Enter your text here:", 
        height=350,
        placeholder="Type or paste your text here...",
        help="Enter the text you want to paraphrase"
    )

with col2:
    st.header("✨ Paraphrased Text")
    
    if st.button("🚀 Paraphrase Now", use_container_width=True):
        if input_text:
            with st.spinner("🔄 Paraphrasing in progress..."):
                start_time = time.time()
                paraphrased_text = paraphrase(input_text)
                elapsed_time = time.time() - start_time
                bleu_score = calculate_bleu(input_text, paraphrased_text)
                rouge_scores = calculate_rouge(input_text, paraphrased_text)
            
            st.success("✅ Paraphrasing complete!")
            st.text_area(
                "Result:", 
                value=paraphrased_text, 
                height=350,
                help="Your paraphrased text"
            )
            
            st.session_state.last_original = input_text
            st.session_state.last_paraphrased = paraphrased_text
            st.session_state.last_metrics = {
                'time': elapsed_time,
                'bleu': bleu_score,
                'rouge': rouge_scores,
                'original_words': len(input_text.split()),
                'paraphrased_words': len(paraphrased_text.split())
            }
        else:
            st.warning("⚠️ Please enter text in the left column.")

# Display performance metrics
if 'last_metrics' in st.session_state:
    metrics = st.session_state.last_metrics
    
    st.markdown(
        '<div class="performance-header">📊 Performance Metrics & Quality Analysis</div>', 
        unsafe_allow_html=True
    )
    
    # Row 1: Basic metrics
    col_a, col_b, col_c, col_d = st.columns(4)
    
    with col_a:
        st.markdown(f"""
        <div class="metric-card">
            <div class="metric-title">⏱️ INFERENCE TIME</div>
            <div class="metric-value">{metrics['time']:.2f}s</div>
            <div class="metric-subtitle">Processing time</div>
        </div>
        """, unsafe_allow_html=True)
    
    with col_b:
        st.markdown(f"""
        <div class="metric-card">
            <div class="metric-title">📝 ORIGINAL WORDS</div>
            <div class="metric-value">{metrics['original_words']}</div>
            <div class="metric-subtitle">Word count</div>
        </div>
        """, unsafe_allow_html=True)
    
    with col_c:
        st.markdown(f"""
        <div class="metric-card">
            <div class="metric-title">✨ PARAPHRASED WORDS</div>
            <div class="metric-value">{metrics['paraphrased_words']}</div>
            <div class="metric-subtitle">Word count</div>
        </div>
        """, unsafe_allow_html=True)
    
    with col_d:
        word_diff = metrics['paraphrased_words'] - metrics['original_words']
        diff_percent = (word_diff / metrics['original_words'] * 100) if metrics['original_words'] > 0 else 0
        st.markdown(f"""
        <div class="metric-card">
            <div class="metric-title">📊 WORD CHANGE</div>
            <div class="metric-value">{diff_percent:+.1f}%</div>
            <div class="metric-subtitle">Length variation</div>
        </div>
        """, unsafe_allow_html=True)
    
    st.markdown("<br>", unsafe_allow_html=True)
    
    # Row 2: Quality metrics
    st.subheader("🎯 Quality Metrics")
    
    col_e, col_f, col_g, col_h = st.columns(4)
    
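    def get_bleu_class(score):
        # For paraphrasing, low-to-moderate BLEU is desirable, so the colors
        # follow the same bands as the detailed analysis further down
        if score > 75: return "score-poor"
        elif score > 35: return "score-fair"
        elif score >= 20: return "score-good"
        elif score >= 10: return "score-excellent"
        else: return "score-fair"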
    
    def get_rouge_class(score):
        if score > 0.9: return "score-poor"
        elif score >= 0.5: return "score-excellent"
        elif score >= 0.3: return "score-good"
        else: return "score-fair"
    
    bleu_class = get_bleu_class(metrics['bleu'])
    rouge1_class = get_rouge_class(metrics['rouge']['rouge1'])
    rouge2_class = get_rouge_class(metrics['rouge']['rouge2'])
    rougeL_class = get_rouge_class(metrics['rouge']['rougeL'])
    
    with col_e:
        st.markdown(f"""
        <div class="metric-card">
            <div class="metric-title">🎯 BLEU SCORE</div>
            <div class="metric-value {bleu_class}">{metrics['bleu']:.2f}</div>
            <div class="metric-subtitle">Similarity score</div>
        </div>
        """, unsafe_allow_html=True)
    
    with col_f:
        st.markdown(f"""
        <div class="metric-card">
            <div class="metric-title">📈 ROUGE-1</div>
            <div class="metric-value {rouge1_class}">{metrics['rouge']['rouge1']:.4f}</div>
            <div class="metric-subtitle">Unigram overlap</div>
        </div>
        """, unsafe_allow_html=True)
    
    with col_g:
        st.markdown(f"""
        <div class="metric-card">
            <div class="metric-title">📈 ROUGE-2</div>
            <div class="metric-value {rouge2_class}">{metrics['rouge']['rouge2']:.4f}</div>
            <div class="metric-subtitle">Bigram overlap</div>
        </div>
        """, unsafe_allow_html=True)
    
    with col_h:
        st.markdown(f"""
        <div class="metric-card">
            <div class="metric-title">📈 ROUGE-L</div>
            <div class="metric-value {rougeL_class}">{metrics['rouge']['rougeL']:.4f}</div>
            <div class="metric-subtitle">Longest common seq</div>
        </div>
        """, unsafe_allow_html=True)
    
    st.markdown("<br>", unsafe_allow_html=True)
    
    # Quality assessment with improved logic
    st.subheader("✅ Quality Assessment")
    
    bleu = metrics['bleu']
    rouge1 = metrics['rouge']['rouge1']
    rouge2 = metrics['rouge']['rouge2']
    rougeL = metrics['rouge']['rougeL']
    
    # NEW LOGIC: Low BLEU + Moderate ROUGE = Excellent!
    if bleu > 75 or rouge1 > 0.9:
        quality = "Too Similar"
        quality_icon = "⚠️"
        quality_color = "#f59e0b"
        quality_msg = "The paraphrased text is very similar to the original. More word variation would improve quality."
    elif rouge1 < 0.25 or rougeL < 0.2:
        quality = "Needs Review"
        quality_icon = "⚠️"
        quality_color = "#ef4444"
        quality_msg = "Low lexical overlap detected. Please verify that the paraphrased text preserves the original meaning."
    elif 10 <= bleu <= 35 and 0.4 <= rouge1 <= 0.7 and rougeL >= 0.35:
        quality = "Excellent"
        quality_icon = "🌟"
        quality_color = "#10b981"
        quality_msg = "Outstanding paraphrase! High word variation while preserving meaning and structure. Professional quality."
    elif 20 <= bleu <= 50 and 0.5 <= rouge1 <= 0.8:
        quality = "Very Good"
        quality_icon = "✅"
        quality_color = "#059669"
        quality_msg = "Very good paraphrase! Effective rephrasing with strong meaning preservation."
    elif rouge1 >= 0.4 and rougeL >= 0.3:
        quality = "Good"
        quality_icon = "👍"
        quality_color = "#3b82f6"
        quality_msg = "Good paraphrase! The text has been rephrased while preserving the original meaning."
    else:
        quality = "Fair"
        quality_icon = "ℹ️"
        quality_color = "#6366f1"
        quality_msg = "Acceptable paraphrase. Consider reviewing to ensure meaning is preserved."
    
    st.markdown(f"""
    <div style="background: {quality_color}; color: white; padding: 20px; border-radius: 10px; text-align: center;">
        <div style="font-size: 48px; margin-bottom: 10px;">{quality_icon}</div>
        <div style="font-size: 24px; font-weight: bold; margin-bottom: 10px;">Quality: {quality}</div>
        <div style="font-size: 16px;">{quality_msg}</div>
    </div>
    """, unsafe_allow_html=True)
    
    # Detailed breakdown
    with st.expander("📋 Detailed Analysis"):
        st.write("**Interpretation:**")
        st.write("*Note: For paraphrasing, low BLEU with moderate ROUGE is actually IDEAL!*")
        st.write("")
        
        # BLEU interpretation
        if bleu > 75:
            st.write(f"- 🔴 **BLEU ({bleu:.2f}):** Too high - paraphrase is almost identical to original (barely rephrased)")
        elif 35 < bleu <= 75:
            st.write(f"- 🟡 **BLEU ({bleu:.2f}):** Moderate-high - some variation but could use more rephrasing")
        elif 20 <= bleu <= 35:
            st.write(f"- 🟢 **BLEU ({bleu:.2f}):** Good range - balanced similarity and variation")
        elif 10 <= bleu < 20:
            st.write(f"- 🟢 **BLEU ({bleu:.2f}):** Excellent - high word variation (this is GOOD for paraphrasing!)")
        else:
            st.write(f"- 🟡 **BLEU ({bleu:.2f}):** Very low - verify meaning is preserved, but variation is great")
        
        # ROUGE-1 interpretation
        if rouge1 > 0.9:
            st.write(f"- 🔴 **ROUGE-1 ({rouge1:.4f}):** Too high - very little word variation")
        elif rouge1 >= 0.7:
            st.write(f"- 🟡 **ROUGE-1 ({rouge1:.4f}):** High - meaning preserved but limited variation")
        elif rouge1 >= 0.4:
            st.write(f"- 🟢 **ROUGE-1 ({rouge1:.4f}):** Optimal - excellent balance of preservation and variation")
        elif rouge1 >= 0.25:
            st.write(f"- 🟡 **ROUGE-1 ({rouge1:.4f}):** Moderate - verify meaning is preserved")
        else:
            st.write(f"- 🔴 **ROUGE-1 ({rouge1:.4f}):** Low - meaning might be significantly changed")
        
        # ROUGE-L interpretation
        if rougeL >= 0.5:
            st.write(f"- 🟢 **ROUGE-L ({rougeL:.4f}):** Excellent structural preservation")
        elif rougeL >= 0.35:
            st.write(f"- 🟢 **ROUGE-L ({rougeL:.4f}):** Good structural similarity")
        elif rougeL >= 0.2:
            st.write(f"- 🟡 **ROUGE-L ({rougeL:.4f}):** Moderate structural changes")
        else:
            st.write(f"- 🔴 **ROUGE-L ({rougeL:.4f}):** Significant structural changes")
        
        st.write("")
        st.write("**Key Insight:**")
        st.write("🎯 *A good paraphrase should have LOW BLEU (10-35) and MODERATE ROUGE-1 (0.4-0.7)*")
        st.write("   → This means: lots of synonyms and restructuring, but meaning preserved!")
        
        # Performance
        st.write("")
        if metrics['time'] < 1:
            st.write(f"- ⚡ **Speed:** Excellent ({metrics['time']:.2f}s)")
        elif metrics['time'] < 3:
            st.write(f"- 🟢 **Speed:** Good ({metrics['time']:.2f}s)")
        else:
            st.write(f"- 🟡 **Speed:** Slow ({metrics['time']:.2f}s) - consider optimizing")

# Footer
st.markdown("<br><br>", unsafe_allow_html=True)
st.markdown(
    "<p style='text-align: center; color: white; font-size: 0.9em;'>"
    "Made with ❤️ using Streamlit & Hugging Face Transformers"
    "</p>", 
    unsafe_allow_html=True
)
'''

# Write to file
with open('web_app.py', 'w', encoding='utf-8') as f:
    f.write(webapp_code)

print("✅ File 'web_app.py' created successfully!")
print("📁 Location: /content/web_app.py")
print("\n💡 Web app features:")
print("   ✅ BLEU interpreted for paraphrasing: low BLEU (10-35) + moderate ROUGE (0.4-0.7) = Excellent")
print("   ✅ Real-time BLEU & ROUGE metrics with color coding")
print("   ✅ Quality badges: Excellent / Very Good / Good / Fair / Too Similar / Needs Review")
print("   ✅ Detailed analysis with metric explanations")
print("   ✅ Gradient UI with metric cards")
print("\n🎯 Example of a paraphrase the app is designed to rate highly:")
print("   Original: 'Regular physical activity is essential for maintaining good health'")
print("   Paraphrased: 'Maintaining a healthy lifestyle requires regular physical activity'")
print("   Exact scores depend on the model output and library versions.")
print("\n🚀 Ready to run in the next cell!")
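
The quality badge in `web_app.py` comes from a chain of threshold checks. Pulling that chain into a pure function, as sketched here, makes it possible to test the rating logic without launching Streamlit. `rate_paraphrase` is a hypothetical name; the thresholds are copied from the app code above, and the sample scores are illustrative rather than measured.

```python
def rate_paraphrase(bleu, rouge1, rouge_l):
    """Return the quality label for a BLEU (0-100) and ROUGE (0-1) triple."""
    if bleu > 75 or rouge1 > 0.9:
        return "Too Similar"      # output is nearly a copy of the input
    if rouge1 < 0.25 or rouge_l < 0.2:
        return "Needs Review"     # so little overlap that meaning may have drifted
    if 10 <= bleu <= 35 and 0.4 <= rouge1 <= 0.7 and rouge_l >= 0.35:
        return "Excellent"
    if 20 <= bleu <= 50 and 0.5 <= rouge1 <= 0.8:
        return "Very Good"
    if rouge1 >= 0.4 and rouge_l >= 0.3:
        return "Good"
    return "Fair"

# Illustrative score triples (bleu, rouge1, rouge_l)
for scores in [(18.0, 0.50, 0.45), (45.0, 0.75, 0.60), (90.0, 0.95, 0.95)]:
    print(scores, "->", rate_paraphrase(*scores))
```

The order of the checks matters: the two warning cases run first, so a score in the 20-35 BLEU overlap between "Excellent" and "Very Good" is resolved in favor of "Excellent" whenever the ROUGE conditions also hold.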


## 🚀 Step 10: Run Streamlit with Ngrok (Optional)

Run Streamlit on Colab and expose it through a public URL.

In [None]:
# Set up ngrok for a public URL.
# You need an authtoken from https://dashboard.ngrok.com/get-started/your-authtoken

import subprocess
import time
from pyngrok import ngrok

# Set your ngrok authtoken (get it from https://dashboard.ngrok.com/)
# Uncomment and fill in your token:
# ngrok.set_auth_token("YOUR_NGROK_AUTH_TOKEN")

# Kill existing streamlit processes
!pkill -9 streamlit

# Start Streamlit as a background process so the rest of the cell can run
streamlit_process = subprocess.Popen(
    ["streamlit", "run", "web_app.py", "--server.port", "8501", "--server.headless", "true"]
)

# Wait for Streamlit to start
time.sleep(5)

# Create ngrok tunnel to the Streamlit port
tunnel = ngrok.connect(8501)
print("\n" + "="*80)
print("🎉 STREAMLIT APP IS RUNNING!")
print("="*80)
print(f"\n🌐 Public URL: {tunnel.public_url}")
print("\n💡 Click the URL above to access your Streamlit app!")
print("\n⚠️  Note: Keep the Colab runtime connected to maintain the tunnel.")
print("    To stop the app, restart the runtime.")
print("="*80)
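
A fixed `time.sleep(5)` may be too short on a cold runtime and needlessly long on a warm one. One alternative is to poll the port until it accepts connections. This is a sketch using only the standard library: `wait_for_port` is a hypothetical helper, and the demo uses a throwaway local listener rather than a real Streamlit server.

```python
import socket
import time

def wait_for_port(port, host="127.0.0.1", timeout=30.0, interval=0.5):
    """Return True once a TCP connection to host:port succeeds, False on timeout."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            time.sleep(interval)  # not listening yet; retry shortly
    return False

# Demo against a temporary local listener instead of a real Streamlit server
listener = socket.socket()
listener.bind(("127.0.0.1", 0))  # port 0 lets the OS pick a free port
listener.listen(1)
demo_port = listener.getsockname()[1]
print(wait_for_port(demo_port, timeout=2.0))
listener.close()
```

In the notebook, a call such as `wait_for_port(8501)` before `ngrok.connect(8501)` would play the role of the fixed sleep.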

## 📌 Alternative: Run Streamlit Without Ngrok

To run without ngrok (only for testing inside Colab):

In [None]:
# Run Streamlit in the background without ngrok.
# The app will not be reachable from outside Colab.
# nohup plus output redirection lets the cell return instead of blocking.

!nohup streamlit run web_app.py --server.headless true > streamlit.log 2>&1 &

print("✅ Streamlit is running in the background (logs: streamlit.log)")
print("⚠️  Note: This won't create a public URL")
print("💡 Use the ngrok method in Step 10 to get a public URL")

## 📥 Step 11: Download web_app.py

Download the file to run it on your local machine.

In [None]:
from google.colab import files

print("📥 Downloading web_app.py...")
files.download('web_app.py')
print("\n✅ Download complete!")
print("\n💡 To run locally:")
print("   1. Install dependencies: pip install transformers torch nltk streamlit rouge-score sacrebleu")
print("   2. Run: streamlit run web_app.py")

---
## 🎯 Summary

### ✅ What We've Done:
1. ✅ Installed all dependencies for Google Colab
2. ✅ Loaded T5 paraphraser model
3. ✅ Tested paraphrasing on sentences and paragraphs
4. ✅ Evaluated model performance (BLEU, ROUGE, Inference Time)
5. ✅ Generated `web_app.py` file automatically
6. ✅ Set up Streamlit deployment with Ngrok

### 📊 Model Performance:
- Device: GPU (if available) or CPU
- Metrics: BLEU, ROUGE-1, ROUGE-2, ROUGE-L
- Average Inference Time per sentence

### 🚀 Next Steps:
1. Run Step 10 with your ngrok token to get a public URL
2. Or download `web_app.py` and run it locally
3. Share the app with others!

---

**Made with ❤️ for Google Colab**