# StyleShift: Text Formalization System

This notebook demonstrates StyleShift, a text formalization system built on TinyLlama and fine-tuned with QLoRA (Quantized Low-Rank Adaptation). It converts casual or informal English into formal, professional language while preserving the original meaning.

## System Overview

- **Base Model:** TinyLlama-1.1B-Chat (quantized to 4-bit for efficient inference)
- **Adapter:** Fine-tuned LoRA adapter for text formalization
- **Functionality:** Converts informal text to formal English while preserving its meaning
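
To make the "efficient inference" point concrete, here is a rough back-of-the-envelope sketch of weight storage at 16-bit versus 4-bit precision. It assumes the nominal 1.1B parameter count from the model name and ignores quantization metadata as well as any layers that stay unquantized, so treat the numbers as an approximation rather than a measurement. The helper name `weight_memory_gib` is hypothetical.

```python
def weight_memory_gib(num_params: float, bits_per_param: float) -> float:
    """Approximate weight storage in GiB, ignoring quantization metadata."""
    return num_params * bits_per_param / 8 / 2**30


NUM_PARAMS = 1.1e9  # assumption: nominal parameter count taken from the model name

fp16_gib = weight_memory_gib(NUM_PARAMS, 16)
nf4_gib = weight_memory_gib(NUM_PARAMS, 4)

print(f"fp16: {fp16_gib:.2f} GiB, 4-bit: {nf4_gib:.2f} GiB")
# prints "fp16: 2.05 GiB, 4-bit: 0.51 GiB"
```

Actual GPU usage will be higher, since activations, the KV cache, and the LoRA adapter weights also take memory.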

## Workflow

1. Load the pre-trained TinyLlama model with 4-bit quantization
2. Load the fine-tuned LoRA adapter for formalization tasks
3. Generate formal versions of input text using the instruction-based prompt format
4. Clean and extract the converted text
5. Display the original and formal versions
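
Steps 3 and 4 are plain string handling, so they can be sketched and tested without a GPU. The helpers below are hypothetical (`build_prompt` and `clean_output` are illustrative names, not part of the notebook). `build_prompt` reproduces the prompt layout used in the code cell. `clean_output` is a slightly stricter variant of the notebook's cleanup: it cuts the text at the first echoed `###` marker and strips a leading label, instead of deleting substrings everywhere.

```python
import re

# Prompt layout copied from the notebook's code cell.
PROMPT_TEMPLATE = (
    "### Instruction:\n"
    "Rewrite the following text in formal English. Preserve the original "
    "meaning precisely, and modify only the wording and style by using "
    "appropriate synonyms to achieve a more professional tone.\n"
    "### Input:\n"
    "{text}\n"
    "\n"
    "### Response:\n"
)

# Labels the model may prepend to its answer (assumed from the notebook's cleanup).
_LABEL = re.compile(
    r"^(?:Formal response|Formal version|Response)\s*:\s*", re.IGNORECASE
)


def build_prompt(text: str) -> str:
    """Wrap the input text in the instruction prompt."""
    return PROMPT_TEMPLATE.format(text=text)


def clean_output(raw: str) -> str:
    """Return the first non-empty line before any echoed '###' marker."""
    before_marker = raw.split("###", 1)[0]
    for line in before_marker.splitlines():
        line = line.strip()
        if line:
            return _LABEL.sub("", line).strip()
    return ""


print(clean_output("Formal version: I am well.\n### Input: something else"))
```

Because the label is only removed at the start of the line, a legitimate word such as "Comment" inside the sentence is left alone.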

In [None]:
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM, BitsAndBytesConfig
from peft import PeftModel

model_name = "TinyLlama/TinyLlama-1.1B-Chat-v1.0"

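# 4-bit quantization settings for loading the base model
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,                     # store weights in 4-bit precision
    bnb_4bit_quant_type="nf4",             # 4-bit NormalFloat data type
    bnb_4bit_compute_dtype=torch.float16,  # run matrix multiplications in fp16
    bnb_4bit_use_double_quant=True         # also quantize the quantization constants
)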

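# The tokenizer is loaded from the adapter directory, which is assumed to
# contain the tokenizer files saved alongside the fine-tuned adapter.
tokenizer = AutoTokenizer.from_pretrained("adapters")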

base_model = AutoModelForCausalLM.from_pretrained(
    model_name,
    quantization_config=bnb_config,
    device_map="cuda"
)

# Attach the fine-tuned LoRA adapter and switch to inference mode
model = PeftModel.from_pretrained(base_model, "adapters")
model.eval()

# Example input to formalize
test_input = "I love programming in Python because it is fun and easy to learn."

prompt = f"""### Instruction:
Rewrite the following text in formal English. Preserve the original meaning precisely, and modify only the wording and style by using appropriate synonyms to achieve a more professional tone.
### Input:
{test_input}

### Response:
"""

print(f"Original: {test_input}")
print()

inputs = tokenizer(prompt, return_tensors="pt").to("cuda")
input_length = inputs["input_ids"].shape[1]

# Greedy decoding: deterministic output, no sampling or beam search
with torch.no_grad():
    outputs = model.generate(
        input_ids=inputs["input_ids"],
        attention_mask=inputs["attention_mask"],
        max_new_tokens=100,
        do_sample=False,
        num_beams=1
    )

# Keep only the newly generated tokens, dropping the echoed prompt
generated_tokens = outputs[0][input_length:]
converted_text = tokenizer.decode(generated_tokens, skip_special_tokens=True).strip()

# Strip prompt markers and labels the model sometimes echoes back.
# These are plain substring removals, so a legitimate occurrence of a
# word such as "Comment" in the output is removed as well.
artifacts = (
    "###", "Instruction:", "Input:", "Response:",
    "Formal response:", "Comment", "Formal version:",
)
for artifact in artifacts:
    converted_text = converted_text.replace(artifact, "")
converted_text = converted_text.strip()

# Keep only the first line if the model continued past the answer
if "\n" in converted_text:
    converted_text = converted_text.split("\n")[0].strip()

print(f"Formal: {converted_text}")

Original: I love programming in Python because it is fun and easy to learn.

Formal: I appreciate the ease of learning Python programming.
