# TildeOpen-30b on Google Colab

[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/YOUR_USERNAME/tilde-test/blob/main/TildeOpen_30b_Colab.ipynb)

This notebook lets you run the **TildeOpen-30b** model on a Google Colab GPU runtime, loading it with 4-bit quantization to reduce memory use.

## About TildeOpen-30b
- 🌍 **30B-parameter model** supporting **34 European languages**
- 🔓 **Open Source** (CC-BY-4.0 license)
- ⚡ **Optimized** for Nordic and Eastern European languages
- 🏗️ **Enterprise Ready** - trained on the LUMI supercomputer

## Setup Instructions
1. **Enable GPU**: Go to `Runtime` > `Change runtime type` > Select `GPU` (T4 or better)
2. **Run all cells** in order
3. **Start chatting** with the model!
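
As a rough guide to which GPU step 1 calls for, the memory taken by the weights of a 30B-parameter model can be estimated from the bytes stored per parameter. The sketch below is back-of-the-envelope arithmetic only: the bytes-per-parameter figures are nominal assumptions, the helper name `estimate_weight_memory_gib` is hypothetical, and real usage will be higher once quantization metadata, activations, and the KV cache are counted.

```python
def estimate_weight_memory_gib(num_params: float, bytes_per_param: float) -> float:
    """Return the approximate size of the model weights in GiB."""
    return num_params * bytes_per_param / 1024**3


NUM_PARAMS = 30e9  # 30 billion parameters, as stated in this notebook

# Nominal storage cost per parameter (assumed; ignores quantization overhead)
PRECISIONS = {
    "bfloat16": 2.0,
    "int8": 1.0,
    "4-bit (nf4)": 0.5,
}

for name, bytes_per_param in PRECISIONS.items():
    gib = estimate_weight_memory_gib(NUM_PARAMS, bytes_per_param)
    print(f"{name:>12}: ~{gib:.1f} GiB")
```

Under these assumptions the 4-bit weights alone come to roughly 14 GiB, which leaves little headroom on a 16 GB T4, so a GPU with more memory gives a safer margin.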


In [None]:
# @title 🚀 Setup Environment
# @markdown Run this cell first to install dependencies and check GPU availability

import subprocess
import sys

print("🔧 Installing dependencies...")

# Install required packages
packages = [
    "torch>=2.0.0",
    "transformers>=4.35.0",
    "accelerate>=0.20.0",
    "safetensors>=0.3.0",
    "sentencepiece>=0.1.99",
    "protobuf>=3.20.0",
    "bitsandbytes>=0.41.0"
]

for package in packages:
    print(f"Installing {package}...")
    subprocess.check_call([sys.executable, "-m", "pip", "install", "-q", package])

print("✅ Dependencies installed!")

# Check GPU availability
import torch
print(f"\n🎮 CUDA Available: {torch.cuda.is_available()}")
if torch.cuda.is_available():
    print(f"📱 GPU: {torch.cuda.get_device_name()}")
    print(f"💾 GPU Memory: {torch.cuda.get_device_properties(0).total_memory / 1024**3:.1f}GB")
else:
    print("⚠️ No GPU detected. Please enable GPU in Runtime > Change runtime type")

print("\n🎉 Setup complete!")


In [None]:
# @title 📥 Load TildeOpen-30b Model
# @markdown This will load the model with optimizations for Colab's GPU

import torch
from transformers import AutoTokenizer, AutoModelForCausalLM, BitsAndBytesConfig
import warnings
import gc

# Suppress warnings for cleaner output
warnings.filterwarnings("ignore", category=UserWarning)

class ColabTildeModel:
    def __init__(self):
        self.model_name = "TildeAI/TildeOpen-30b"
        self.tokenizer = None
        self.model = None
        self.device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
        
    def load_model(self, use_4bit=True):
        """Load the model with Colab-optimized settings."""
        try:
            print("📥 Loading tokenizer...")
            self.tokenizer = AutoTokenizer.from_pretrained(
                self.model_name,
                use_fast=False  # Required for this model
            )
            print("✅ Tokenizer loaded!")

            print("📥 Loading model with quantization...")

            # Configure quantization for Colab's limited memory
            if use_4bit and torch.cuda.is_available():
                print("🔧 Using 4-bit quantization for optimal memory usage")
                quantization_config = BitsAndBytesConfig(
                    load_in_4bit=True,
                    bnb_4bit_compute_dtype=torch.bfloat16,
                    bnb_4bit_use_double_quant=True,
                    bnb_4bit_quant_type="nf4"
                )
                
                self.model = AutoModelForCausalLM.from_pretrained(
                    self.model_name,
                    quantization_config=quantization_config,
                    device_map="auto",
                    trust_remote_code=True,
                    low_cpu_mem_usage=True
                )
            else:
                print("🔧 Loading without quantization")
                self.model = AutoModelForCausalLM.from_pretrained(
                    self.model_name,
                    torch_dtype=torch.bfloat16,
                    device_map="auto",
                    trust_remote_code=True,
                    low_cpu_mem_usage=True
                )
            
            print("✅ Model loaded successfully!")

            # Show memory usage
            if torch.cuda.is_available():
                memory_used = torch.cuda.memory_allocated() / 1024**3
                memory_total = torch.cuda.get_device_properties(0).total_memory / 1024**3
                print(f"📊 GPU Memory: {memory_used:.1f}GB / {memory_total:.1f}GB used")

            return True

        except Exception as e:
            print(f"❌ Error loading model: {e}")
            return False
    
    def generate_text(self, prompt, max_new_tokens=256, temperature=0.7):
        """Generate text using the loaded model."""
        if self.model is None or self.tokenizer is None:
            raise RuntimeError("Model not loaded!")

        try:
            # Send the inputs to the device that holds the model's first layers
            inputs = self.tokenizer(prompt, return_tensors="pt").to(self.model.device)
            prompt_length = inputs["input_ids"].shape[1]

            with torch.no_grad():
                outputs = self.model.generate(
                    **inputs,
                    max_new_tokens=max_new_tokens,
                    temperature=temperature,
                    do_sample=True,
                    repetition_penalty=1.1,
                    pad_token_id=self.tokenizer.eos_token_id,
                    use_cache=True
                )

            # Decode only the newly generated tokens. Slicing the decoded string
            # by len(prompt) can misalign when decoding does not reproduce the
            # prompt text exactly.
            new_tokens = outputs[0][prompt_length:]
            response = self.tokenizer.decode(new_tokens, skip_special_tokens=True).strip()

            # Clean up GPU memory
            del outputs
            if torch.cuda.is_available():
                torch.cuda.empty_cache()

            return response

        except Exception as e:
            print(f"❌ Error during generation: {e}")
            return None

# Initialize and load the model
print("🚀 Initializing TildeOpen-30b for Google Colab...")
model = ColabTildeModel()

if model.load_model():
    print("\n🎉 Model ready! You can now generate text in the next cell.")
else:
    print("\n❌ Failed to load model. Check GPU availability and try again.")


In [None]:
# @title 🧪 Test the Model
# @markdown Run this cell to test the model with sample prompts

test_prompts = [
    "Hello, I am TildeOpen and I can help you with",
    "The future of artificial intelligence is",
    "Bonjour, je suis un modèle de langage qui peut",  # French
    "Здравствуйте, я искусственный интеллект, который может"  # Russian
]

print("🧪 Testing TildeOpen-30b with multilingual prompts...\n")

for i, prompt in enumerate(test_prompts, 1):
    print(f"Test {i}: {prompt}")
    response = model.generate_text(prompt, max_new_tokens=50)
    # generate_text returns None on error; an empty string is still a valid result
    if response is not None:
        print(f"Response: {response}")
    else:
        print("❌ Failed to generate response")
    print("-" * 80)

print("✅ Model testing complete!")


In [None]:
# @title 💬 Interactive Chat
# @markdown Chat with TildeOpen-30b! Try different languages.

def chat_with_model():
    """Interactive chat function for Colab."""
    print("🔄 Starting chat with TildeOpen-30b")
    print("💡 Try different languages - this model supports 34 European languages!")
    print("💡 Type 'quit' to end the conversation or 'clear' to reset it")
    print("💡 Examples:")
    print("   - How are you today?")
    print("   - Comment allez-vous? (French)")
    print("   - Как дела? (Russian)")
    print("   - ¿Cómo estás? (Spanish)")
    print("-" * 60)

    conversation_history = ""

    while True:
        try:
            user_input = input("\n👤 You: ").strip()

            if user_input.lower() in ['quit', 'exit', 'q']:
                print("👋 Goodbye!")
                break

            if user_input.lower() == 'clear':
                conversation_history = ""
                print("🧹 Conversation cleared!")
                continue

            if not user_input:
                continue

            # Build prompt with conversation context
            if conversation_history:
                prompt = f"{conversation_history}\nHuman: {user_input}\nAssistant:"
            else:
                prompt = f"Human: {user_input}\nAssistant:"

            # Keep conversation manageable: past 1000 characters, keep the last 6 lines
            if len(prompt) > 1000:
                lines = prompt.split('\n')
                prompt = '\n'.join(lines[-6:])

            print("🤖 Generating response...")
            response = model.generate_text(prompt, max_new_tokens=200, temperature=0.8)

            if response:
                # The model may keep writing the next "Human:" turn; keep only its first reply
                response = response.split("\nHuman:")[0].strip()
                print(f"\n🤖 TildeOpen: {response}")
                conversation_history = f"{prompt} {response}"
            else:
                print("❌ Failed to generate response")

        except KeyboardInterrupt:
            print("\n\n👋 Chat interrupted. Goodbye!")
            break
        except Exception as e:
            print(f"❌ Error: {e}")

# Start the interactive chat
chat_with_model()
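
The chat loop above keeps the conversation manageable by cutting the prompt to its last six lines once it passes 1000 characters, which can split a multi-line reply in the middle. One possible alternative, sketched here under the assumption that the history is stored as a list of `(speaker, text)` pairs rather than one string, is to drop whole turns from the oldest end until the rendered text fits a character budget. The helper name `trim_history` and the `max_chars` parameter are hypothetical and not part of the notebook.

```python
def trim_history(turns, max_chars=1000):
    """Keep the most recent (speaker, text) turns whose rendered text fits max_chars.

    The newest turn is always kept, even if it alone exceeds the budget.
    """
    kept = []
    total = 0
    for speaker, text in reversed(turns):
        line = f"{speaker}: {text}"
        cost = len(line) + 1  # +1 for the newline that joins the lines
        if kept and total + cost > max_chars:
            break
        kept.append(line)
        total += cost
    return "\n".join(reversed(kept))


# Example with a deliberately small budget so the oldest turn is dropped
turns = [
    ("Human", "Hi"),
    ("Assistant", "Hello!"),
    ("Human", "How are you?"),
]
prompt = trim_history(turns, max_chars=40) + "\nAssistant:"
print(prompt)
```

Counting tokens with the tokenizer instead of characters would track the model's context window more closely; characters are used here only to mirror the original 1000-character check.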


## 📚 Additional Information

### Supported Languages (34 total)
Bulgarian, Croatian, Czech, Danish, Dutch, English, Estonian, Finnish, French, German, Hungarian, Icelandic, Irish, Italian, Latvian, Lithuanian, Macedonian, Maltese, Norwegian, Polish, Portuguese, Romanian, Russian, Serbian, Slovak, Slovene, Spanish, Swedish, Turkish, Ukrainian, and more.

### Model Details
- **Parameters**: 30 billion
- **Training Data**: 2 trillion tokens
- **License**: CC-BY-4.0
- **Specialty**: Nordic and Eastern European languages

### Performance Tips
- Use shorter prompts for faster responses
- The model performs best on European languages
- Adjust `temperature` (0.1-1.0) to control creativity
- Use `max_new_tokens` to control response length
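
To make the `temperature` tip concrete: when sampling, the next-token scores (logits) are divided by the temperature before being turned into probabilities, so low values concentrate probability on the top token and higher values spread it out. The sketch below illustrates that effect in plain Python. The logits are made-up numbers for three candidate tokens, and `softmax_with_temperature` is a hypothetical helper written for illustration, not the library's own code.

```python
import math


def softmax_with_temperature(logits, temperature=1.0):
    """Convert raw scores to probabilities after dividing by the temperature."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


toy_logits = [2.0, 1.0, 0.5]  # made-up scores for three candidate tokens

for t in (0.1, 0.7, 1.0):
    probs = softmax_with_temperature(toy_logits, t)
    print(f"temperature={t}: " + ", ".join(f"{p:.3f}" for p in probs))
```

At `temperature=0.1` almost all of the probability lands on the highest-scoring token, while at `1.0` the lower-scoring tokens keep a meaningful share, which is why higher settings produce more varied text.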

### Links
- [Model on Hugging Face](https://huggingface.co/TildeAI/TildeOpen-30b)
- [Tilde.ai](https://tilde.ai/tildeopen-llm/)
- [GitHub Repository](https://github.com/YOUR_USERNAME/tilde-test)

---
**Happy experimenting with TildeOpen-30b! 🚀**
