# GreenCode AI - Remote StarCoder 15B Optimization

This notebook sets up the remote StarCoder 15B model for code optimization. It creates a simple API endpoint that can be used by the local backend when more advanced code optimization is needed.

## Installation of Required Packages

First, we need to install the necessary packages.

In [None]:
!pip install transformers==4.28.1 accelerate bitsandbytes flask flask-cors pyngrok

## Authentication with Hugging Face

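You'll need a Hugging Face account and an access token to download StarCoder. Create an account at https://huggingface.co/, accept the license agreement on the bigcode/starcoder model page (the model is gated), and generate a token in your account settings.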

In [None]:
from huggingface_hub import notebook_login
notebook_login()

## Load StarCoder 15B Model

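Now we'll load the full StarCoder 15B model with 8-bit quantization. This requires a GPU runtime, and even when quantized the weights occupy roughly 15 GB of GPU memory, so loading may fail on smaller Colab GPUs.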

In [None]:
from transformers import AutoModelForCausalLM, AutoTokenizer
import torch

# Define model name
model_name = "bigcode/starcoder"

# Load tokenizer
print("Loading tokenizer...")
tokenizer = AutoTokenizer.from_pretrained(model_name)

# Load model with low precision for efficiency
print("Loading StarCoder 15B model... This may take a few minutes...")
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype=torch.float16,  # Use half precision
    device_map="auto",          # Let the library handle device placement
    load_in_8bit=True           # Use 8-bit quantization for more efficiency
)

print("Model loaded successfully!")
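
As a rough sanity check on the memory budget, the weight footprint can be estimated from the parameter count. The sketch below assumes about 15.5 billion parameters (the figure given on the StarCoder model card) and counts weights only; activations, the key/value cache, and quantization overhead come on top, so the numbers are best read as lower bounds.

```python
# Back-of-the-envelope estimate of weight memory for a ~15.5B-parameter model.
# PARAM_COUNT is an assumption taken from the model card; runtime overhead is ignored.
PARAM_COUNT = 15.5e9

BYTES_PER_PARAM = {
    "float32": 4,
    "float16": 2,
    "int8": 1,
}


def weight_memory_gib(param_count: float, dtype: str) -> float:
    """Return the approximate size of the weights in GiB for the given dtype."""
    return param_count * BYTES_PER_PARAM[dtype] / 1024**3


for dtype in BYTES_PER_PARAM:
    print(f"{dtype:>8}: ~{weight_memory_gib(PARAM_COUNT, dtype):.1f} GiB")
```

Under these assumptions the 8-bit weights alone come to about 14 GiB, which is why a 16 GB GPU is a tight fit and half precision without quantization does not fit on a single such GPU.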

## Create Flask API

We'll create a simple Flask API that can receive code and return optimized versions.

In [None]:
from flask import Flask, request, jsonify
from flask_cors import CORS
import re
import logging

# Configure logging
logging.basicConfig(level=logging.INFO, format='%(asctime)s - %(name)s - %(levelname)s - %(message)s')
logger = logging.getLogger(__name__)

app = Flask(__name__)
CORS(app)  # Enable CORS for all routes

def generate_optimized_code(code, context="energy_efficiency"):
    """Use StarCoder 15B to generate optimized code"""
    try:
        # Different prompts based on optimization context
        if context == "energy_efficiency":
            prompt = f"""# Original Python code:
```python
{code}
```

# Task: Optimize the above code for maximum energy efficiency and sustainability.
# Make it use less CPU, memory, and energy without changing functionality.

# Optimized code for energy efficiency:
```python
"""
        elif context == "performance":
            prompt = f"""# Original Python code:
```python
{code}
```

# Task: Optimize the above code for maximum speed and performance.
# Make it run faster without changing functionality.

# Speed-optimized code:
```python
"""
        elif context == "memory_efficiency":
            prompt = f"""# Original Python code:
```python
{code}
```

# Task: Optimize the above code for minimum memory usage.
# Make it use less RAM without changing functionality.

# Memory-optimized code:
```python
"""
        else:  # readability or default
            prompt = f"""# Original Python code:
```python
{code}
```

# Task: Refactor the above code to be more readable and maintainable without changing functionality.
# Use Python best practices and PEP 8 style guide.

# Refactored code:
```python
"""
        
        # Tokenize the prompt and move the tensors to the model's device
        inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
        
        # Generate optimized code
        with torch.no_grad():
            outputs = model.generate(
                inputs.input_ids,
                attention_mask=inputs.attention_mask,  # Explicit mask, since pad and EOS share a token id
                max_new_tokens=800,                    # Cap on generated tokens, excluding the prompt
                temperature=0.3,                       # Lower temperature for more focused output
                top_p=0.95,                            # Nucleus sampling for some variety
                num_return_sequences=1,
                pad_token_id=tokenizer.eos_token_id,
                do_sample=True
            )
        
        # Decode only the newly generated tokens, not the echoed prompt
        prompt_length = inputs.input_ids.shape[1]
        generated_text = tokenizer.decode(outputs[0][prompt_length:], skip_special_tokens=True)
        
        # The prompt ends with an opening ```python fence, so the optimized code
        # is everything up to the closing fence (or the end of the output).
        # This avoids returning the original code when the model generates nothing.
        optimized_code = generated_text.split("```", 1)[0].strip()
        
        if optimized_code:
            return optimized_code
        
        logger.error("Could not extract optimized code from model output")
        return None
        
    except Exception as e:
        logger.error(f"Error generating optimized code: {str(e)}")
        return None

@app.route('/optimize', methods=['POST'])
def optimize():
    try:
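        # silent=True returns None for a missing or malformed JSON body,
        # so the client gets a 400 below instead of a 500
        data = request.get_json(silent=True)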
        if not data or 'code' not in data:
            return jsonify({'error': 'No code provided'}), 400
        
        code = data['code']
        context = data.get('context', 'energy_efficiency')
        
        logger.info(f"Received optimization request with context: {context}")
        
        # Generate optimized code
        optimized_code = generate_optimized_code(code, context)
        
        if optimized_code:
            return jsonify({
                'optimized_code': optimized_code,
                'success': True
            })
        else:
            return jsonify({
                'error': 'Failed to generate optimized code',
                'success': False
            }), 500
    
    except Exception as e:
        logger.error(f"Error processing request: {str(e)}")
        return jsonify({
            'error': f'Error: {str(e)}',
            'success': False
        }), 500

@app.route('/health', methods=['GET'])
def health_check():
    return jsonify({
        'status': 'healthy',
        'model': 'StarCoder 15B',
        'device': str(model.device)
    })
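
The four prompt branches in generate_optimized_code differ only in the task description and the final heading, so they could be driven by a lookup table. Likewise, the extraction step can be isolated as a pure function that is testable without loading the model. The following is a minimal sketch along those lines; `PROMPT_TASKS`, `build_prompt`, and `extract_completion` are hypothetical names that are not part of the notebook, and `extract_completion` assumes it receives only the text generated after the prompt.

```python
# Hypothetical helpers: a table-driven prompt builder and a fence-aware extractor.
FENCE = "`" * 3  # Three backticks, built this way to avoid nesting literal fences

# Maps each optimization context to (task description, heading for the answer).
PROMPT_TASKS = {
    "energy_efficiency": (
        "# Task: Optimize the above code for maximum energy efficiency and sustainability.\n"
        "# Make it use less CPU, memory, and energy without changing functionality.",
        "# Optimized code for energy efficiency:",
    ),
    "performance": (
        "# Task: Optimize the above code for maximum speed and performance.\n"
        "# Make it run faster without changing functionality.",
        "# Speed-optimized code:",
    ),
    "memory_efficiency": (
        "# Task: Optimize the above code for minimum memory usage.\n"
        "# Make it use less RAM without changing functionality.",
        "# Memory-optimized code:",
    ),
    "readability": (
        "# Task: Refactor the above code to be more readable and maintainable "
        "without changing functionality.\n"
        "# Use Python best practices and PEP 8 style guide.",
        "# Refactored code:",
    ),
}


def build_prompt(code: str, context: str = "energy_efficiency") -> str:
    """Build the prompt; unknown contexts fall back to the readability prompt."""
    task, heading = PROMPT_TASKS.get(context, PROMPT_TASKS["readability"])
    return (
        f"# Original Python code:\n{FENCE}python\n{code}\n{FENCE}\n\n"
        f"{task}\n\n{heading}\n{FENCE}python\n"
    )


def extract_completion(completion: str):
    """Return the code before the first closing fence, or None if it is empty."""
    code = completion.split(FENCE, 1)[0].strip()
    return code or None
```

Keeping the templates in one table makes it harder for the branches to drift apart, and the fallback mirrors the notebook's behavior of treating any unrecognized context as a readability request.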

## Create Public URL with ngrok

Now we'll use ngrok to create a public URL that can be accessed by the local backend.

In [None]:
from pyngrok import ngrok

# Start an ngrok tunnel to the Flask port
# Note: ngrok may require an auth token; if so, call ngrok.set_auth_token("<your_token>") first
ngrok_tunnel = ngrok.connect(5000)
print(f"Public URL: {ngrok_tunnel.public_url}")

# This is the URL you'll need to set in your local backend
remote_api_url = f"{ngrok_tunnel.public_url}/optimize"
print(f"\nFull API endpoint: {remote_api_url}")
print("\nIMPORTANT: Copy the full API endpoint and set it as the COLAB_URL environment variable in your local backend!")
print(f"For example: export COLAB_URL={remote_api_url}")
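
On the local side, the backend needs to read COLAB_URL and can derive the health-check URL from it. Here is a minimal sketch of that lookup, assuming the variable holds the full /optimize endpoint. The function name is hypothetical, and the notebook does not specify how the local backend is actually implemented.

```python
import os
from urllib.parse import urlsplit, urlunsplit


def get_remote_endpoints(env=os.environ):
    """Return (optimize_url, health_url) from COLAB_URL, or None if it is unset."""
    optimize_url = env.get("COLAB_URL", "").strip()
    if not optimize_url:
        return None  # Remote optimization is not configured

    parts = urlsplit(optimize_url)
    if parts.scheme not in ("http", "https") or not parts.netloc:
        raise ValueError(f"COLAB_URL is not a valid HTTP(S) URL: {optimize_url!r}")

    # /health is served from the root of the same host as /optimize
    health_url = urlunsplit((parts.scheme, parts.netloc, "/health", "", ""))
    return optimize_url, health_url
```

Returning None when the variable is unset lets the caller skip remote optimization instead of failing at startup.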

## Start the Flask Server

Finally, we'll start the Flask server to handle API requests. This cell blocks for as long as the server is running, so run it last.

In [None]:
# Start Flask app
print("Starting Flask server...")
app.run(host='0.0.0.0', port=5000)

## Testing the Server

You can test the server by sending a POST request to the API endpoint. Because the server cell blocks the notebook, run the test from your local machine, for example with Python requests:

```python
import requests

url = "<your_ngrok_url>/optimize"
data = {
    "code": "def calculate_values(data):\n    result = []\n    for item in data:\n        if item > 0:\n            result.append(item * 2)\n    total = 0\n    for r in result:\n        total += r\n    return result, total"
}

response = requests.post(url, json=data)
print(response.json())
```

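If the request succeeds, the response is a JSON object with success set to true and the generated code in the optimized_code field. On failure, it contains an error message and success set to false.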
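
For use from a backend rather than a quick manual test, the call benefits from a timeout and explicit error handling, and it can pass the optional context field that the endpoint accepts. The sketch below uses only the standard library (urllib instead of requests) to stay dependency-free. `build_request` and `optimize_remote` are hypothetical names, and the 300-second default timeout is an assumption, chosen because generating up to 800 tokens can be slow.

```python
import json
import urllib.error
import urllib.request


def build_request(url: str, code: str, context: str = "energy_efficiency") -> urllib.request.Request:
    """Build the JSON POST request expected by the /optimize endpoint."""
    payload = json.dumps({"code": code, "context": context}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def optimize_remote(url: str, code: str, context: str = "energy_efficiency", timeout: float = 300.0):
    """Return the optimized code, or None if the remote call fails."""
    request = build_request(url, code, context)
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            body = json.load(response)
    except urllib.error.HTTPError as err:
        # The endpoint answers 400/500 with a JSON error body
        print(f"Remote API returned HTTP {err.code}")
        return None
    except (OSError, ValueError) as err:
        # Covers connection failures, timeouts, and invalid JSON
        print(f"Remote API unreachable or returned invalid data: {err}")
        return None
    return body.get("optimized_code") if body.get("success") else None
```

Returning None on any failure gives the caller a single signal for falling back to whatever local optimization it already has.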

## Important Notes

1. This notebook needs to be running for the remote API to work. Keep it open in your browser.
2. Colab sessions have time limits. If your session disconnects, you'll need to rerun all cells, including the model load.
3. The ngrok URL will change each time you restart the notebook, so update COLAB_URL in your local backend afterwards.
4. For a more permanent solution, consider deploying to a cloud provider.