# Hugging Face Authentication Troubleshooting

This notebook helps you troubleshoot authentication issues when accessing gated Hugging Face repositories like `bigcode/starcoderbase-3b`. It provides step-by-step guidance to:

1. **Check token validity** - Verify your Hugging Face token is working
2. **Handle access errors** - Properly catch and handle gated repository errors 
3. **Request access** - Get instructions for requesting access to gated models
4. **Implement fallbacks** - Automatically use public models when gated models aren't accessible

## Problem Overview

The original error, an `OSError` triggered by a 403 response, occurs when loading `bigcode/starcoderbase-3b`. This is a **gated repository**, so access requires:
- A valid Hugging Face account
- An authenticated token
- Approval to access the specific model
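
Whether approval is instant or reviewed by hand depends on how the repository is gated. As a rough sketch, the hypothetical helper `describe_gating` below turns the `gated` field of a model's Hub metadata into a readable description. It assumes that field holds `"auto"`, `"manual"`, or a falsy value, and that the metadata of a gated repository can be read without having been granted access.

```python
from typing import Union


def describe_gating(gated: Union[str, bool, None]) -> str:
    """Map a model's `gated` metadata value to a human-readable description."""
    if gated == "auto":
        return "gated: access is granted automatically once you accept the terms"
    if gated == "manual":
        return "gated: the repository owners review each request"
    if not gated:
        return "not gated: no access request is needed"
    # Anything else is a mode this sketch does not know about.
    return f"gated: unrecognized mode {gated!r}"


# Usage (needs network access and the huggingface_hub package):
#   from huggingface_hub import HfApi
#   info = HfApi().model_info("bigcode/starcoderbase-3b")
#   print(describe_gating(info.gated))
```

Running the commented usage lines before requesting access shows whether to expect an immediate grant or a manual review.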

Let's walk through the solution step by step.

In [None]:
# Import Required Libraries
import os
import toml
import warnings
from transformers import AutoModelForCausalLM, AutoTokenizer
from huggingface_hub import login, HfApi, hf_hub_download
from langchain_community.llms import HuggingFaceEndpoint
from peft import PeftModel, PeftConfig

# Suppress warnings for cleaner output
warnings.filterwarnings("ignore")

print("✅ All required libraries imported successfully")

## Step 1: Load Secrets and Authenticate

First, let's load your Hugging Face token from the secrets file and authenticate.

In [None]:
# Load secrets from the .streamlit/secrets.toml file
try:
    secrets_file_path = os.path.abspath(os.path.join('..', '.streamlit', 'secrets.toml'))
    secrets = toml.load(secrets_file_path)
    hf_token = secrets.get("huggingface_token")
    
    if hf_token and hf_token.strip():
        print("✅ Hugging Face token found in secrets.toml")
        print(f"Token length: {len(hf_token)} characters")
        # Show only a short prefix so the token cannot be recovered from saved notebook output
        print(f"Token preview: {hf_token[:4]}...")
    else:
        print("❌ No Hugging Face token found or token is empty")
        print("Please add your token to .streamlit/secrets.toml")
        hf_token = None
        
except FileNotFoundError:
    print("❌ secrets.toml file not found")
    print("Please create .streamlit/secrets.toml with your huggingface_token")
    hf_token = None
except Exception as e:
    print(f"❌ Error loading secrets: {e}")
    hf_token = None

In [None]:
# Test authentication with Hugging Face
if hf_token:
    try:
        # Login to Hugging Face
        login(hf_token, add_to_git_credential=True)
        print("✅ Successfully logged in to Hugging Face")
        
        # Test API access
        api = HfApi()
        user_info = api.whoami(token=hf_token)
        print(f"✅ Authenticated as: {user_info['name']}")
        
    except Exception as e:
        print(f"❌ Authentication failed: {e}")
        print("Please check your token is valid:")
        print("1. Go to https://huggingface.co/settings/tokens")
        print("2. Generate a new token if needed")
        print("3. Update your .streamlit/secrets.toml file")
else:
    print("⚠️ Skipping authentication - no token available")

## Step 2: Test Access to Gated Repository

Now let's test access to the gated `bigcode/starcoderbase-3b` model and handle the OSError properly.

In [None]:
# Test access to the gated StarCoder model
gated_model_name = "bigcode/starcoderbase-3b"
access_granted = False

print(f"🔍 Testing access to gated model: {gated_model_name}")
print("-" * 60)

try:
    # Fetch only the model configuration: a lightweight access test that
    # avoids downloading the full model weights
    from transformers import AutoConfig
    print("Attempting to access model config...")
    
    # Without access, a gated repository raises OSError here, just as the
    # full model load did in the original error
    config = AutoConfig.from_pretrained(
        gated_model_name,
        token=hf_token
    )
    
    print("✅ SUCCESS: Access granted to gated repository!")
    print("You can now use the Fine-tuned StarCoder model.")
    access_granted = True
    
except OSError as e:
    if "403 Client Error" in str(e) or "gated repo" in str(e):
        print("❌ ACCESS DENIED: You don't have access to this gated repository")
        print("\n🔧 To request access:")
        print("1. Visit: https://huggingface.co/bigcode/starcoderbase-3b")
        print("2. Click 'Request Access' button")
        print("3. Fill out the form explaining your use case")
        print("4. Wait for approval (timing depends on the repository; some grant access immediately)")
        print("\n💡 In the meantime, we'll use a public model as fallback")
        
    elif "token" in str(e).lower():
        print("❌ TOKEN ERROR: Invalid or missing authentication token")
        print("\n🔧 To fix this:")
        print("1. Go to https://huggingface.co/settings/tokens")
        print("2. Create a new token with 'read' permissions")
        print("3. Add it to your .streamlit/secrets.toml file")
        
    else:
        print(f"❌ UNEXPECTED ERROR: {e}")
        
except Exception as e:
    print(f"❌ OTHER ERROR: {e}")

print(f"\nAccess Status: {'✅ GRANTED' if access_granted else '❌ DENIED'}")
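
Matching on the error message works, but it breaks if the wording changes. A possibly sturdier approach is to walk the exception chain and look for the underlying Hub error. The sketch below assumes the `OSError` is chained to the original `huggingface_hub` exception (class names `GatedRepoError` and `RepositoryNotFoundError`), and that HTTP errors carry a `response` with a `status_code`. The function name `classify_hub_error` is hypothetical. Classes are matched by name so the example runs without extra imports.

```python
from typing import Optional


def classify_hub_error(exc: BaseException) -> str:
    """Return 'gated', 'not_found', 'unauthorized', or 'unknown' for a load failure."""
    seen = set()
    current: Optional[BaseException] = exc
    # Walk the chain of causes, guarding against cycles.
    while current is not None and id(current) not in seen:
        seen.add(id(current))
        name = type(current).__name__
        if name == "GatedRepoError":
            return "gated"
        if name == "RepositoryNotFoundError":
            return "not_found"
        # requests-style HTTP errors expose the response; use its status if present.
        status = getattr(getattr(current, "response", None), "status_code", None)
        if status == 401:
            return "unauthorized"
        if status == 403:
            return "gated"
        current = current.__cause__ or current.__context__
    # Last resort: the same message check used in the cell above.
    text = str(exc).lower()
    if "gated repo" in text or "403" in text:
        return "gated"
    return "unknown"
```

Inside `except OSError as e:`, branching on `classify_hub_error(e)` could replace the substring tests, with the message check kept as a fallback.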

## Step 3: Implement Robust Error Handling

Here's the improved version of your model loading function with proper error handling and fallbacks.

In [None]:
def setup_model_with_fallback(model_name: str, hf_token: str = None):
    """
    Improved model setup function with proper error handling and fallbacks
    """
    print(f"🤖 Setting up model: {model_name}")
    print("-" * 50)
    
    if model_name == "Fine-tuned StarCoder":
        try:
            print("🔄 Attempting to load Fine-tuned StarCoder...")
            
            # Load the adapter configuration first (lightweight test)
            config = PeftConfig.from_pretrained(
                "ArneKreuz/starcoderbase-finetuned-thestack", 
                token=hf_token
            )
            print("✅ Adapter config loaded successfully")
            
            # Load the base model
            base_model = AutoModelForCausalLM.from_pretrained(
                "bigcode/starcoderbase-3b", 
                token=hf_token
            )
            print("✅ Base model loaded successfully")
            
            # Load the fine-tuned model
            model = PeftModel.from_pretrained(
                base_model, 
                "ArneKreuz/starcoderbase-finetuned-thestack", 
                token=hf_token
            )
            print("✅ Fine-tuned StarCoder model loaded successfully!")
            return model
            
        except OSError as e:
            if "403" in str(e) or "gated" in str(e):
                print("❌ Access denied to gated repository")
                print("🔄 Falling back to public Zephyr-7b model...")
            else:
                print(f"❌ OSError: {e}")
                print("🔄 Falling back to public model...")
                
        except Exception as e:
            print(f"❌ Unexpected error: {e}")
            print("🔄 Falling back to public model...")
    
    # Fallback to public model (Zephyr-7b via API endpoint)
    try:
        print("🔄 Loading Zephyr-7b via Hugging Face Inference API...")
        endpoint_url = "https://api-inference.huggingface.co/models/HuggingFaceH4/zephyr-7b-beta"
        
        model = HuggingFaceEndpoint(
            endpoint_url=endpoint_url,
            task="text-generation",
            max_new_tokens=512,
            top_k=50,
            temperature=0.1,
            repetition_penalty=1.03,
            huggingfacehub_api_token=hf_token
        )
        print("✅ Zephyr-7b endpoint configured successfully as fallback!")
        return model
        
    except Exception as e:
        print(f"❌ Failed to load fallback model: {e}")
        # Chain the original exception so the root cause stays visible in the traceback
        raise RuntimeError("Could not load any model - check your internet connection and token") from e

# Test the improved function
try:
    test_model = setup_model_with_fallback("Fine-tuned StarCoder", hf_token)
    print(f"\n🎉 Model setup completed successfully!")
    print(f"Model type: {type(test_model).__name__}")
except Exception as e:
    print(f"\n💥 Model setup failed: {e}")
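
The try-then-fall-back logic above can also be written as a generic loop over candidate loaders, which makes it easier to add a third option later. This is a minimal sketch with a hypothetical helper, `load_first_available`. Each loader is assumed to be a zero-argument callable that either returns a model or raises.

```python
from typing import Any, Callable, Iterable, Tuple


def load_first_available(
    loaders: Iterable[Tuple[str, Callable[[], Any]]],
) -> Tuple[str, Any]:
    """Try each (name, loader) pair in order and return the first that succeeds."""
    errors = []
    for name, loader in loaders:
        try:
            return name, loader()
        except Exception as exc:  # broad on purpose: any failure moves to the next loader
            errors.append(f"{name}: {type(exc).__name__}: {exc}")
    # Every loader failed: report all collected errors together.
    raise RuntimeError("No model could be loaded:\n" + "\n".join(errors))
```

Each loader would wrap one of the loading paths from the function above in a `lambda` or small function. The returned name tells the caller which model it actually received.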

## Step 4: Test Fallback Model Inference

Let's test that the fallback model works correctly for text generation.

In [None]:
# Test inference with the loaded model
test_prompt = """Generate G-code for a simple milling operation:
- Material: Aluminum
- Operation: Face milling
- Workpiece size: 50mm x 50mm x 10mm
- Tool: 10mm end mill"""

print("🧪 Testing model inference...")
print("Prompt:", test_prompt)
print("-" * 60)

try:
    if hasattr(test_model, 'invoke'):
        # For HuggingFaceEndpoint models
        response = test_model.invoke(test_prompt)
    else:
        # For transformers models (if we successfully loaded StarCoder)
        tokenizer = AutoTokenizer.from_pretrained("bigcode/starcoderbase-3b", token=hf_token)
        inputs = tokenizer(test_prompt, return_tensors="pt")
        # temperature only takes effect when sampling is enabled
        outputs = test_model.generate(**inputs, max_length=200, do_sample=True, temperature=0.7)
        response = tokenizer.decode(outputs[0], skip_special_tokens=True)
    
    print("✅ Model inference successful!")
    print("Response preview:")
    print("-" * 40)
    print(response[:500] + "..." if len(response) > 500 else response)
    
except Exception as e:
    print(f"❌ Inference failed: {e}")
    print("This might be due to:")
    print("1. API rate limiting")
    print("2. Model loading issues")
    print("3. Network connectivity problems")
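
Rate limiting and brief network failures are often transient, so retrying with a growing delay can help before giving up. The helper below is a sketch of exponential backoff. The name `retry_with_backoff` and its defaults are assumptions, and it retries on any exception, which may be too broad for real use.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def retry_with_backoff(
    fn: Callable[[], T],
    attempts: int = 3,
    base_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call fn until it succeeds, doubling the wait after each failure."""
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(attempts):
        try:
            return fn()
        except Exception:
            if attempt == attempts - 1:
                raise  # out of retries: surface the last error
            sleep(base_delay * 2 ** attempt)
    raise AssertionError("unreachable")
```

With the fallback endpoint, this could be used as `response = retry_with_backoff(lambda: test_model.invoke(test_prompt))`. The `sleep` parameter exists only so the delay can be replaced in tests.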

## Step 5: Apply Fix to Your Main Application

Now that we've tested the authentication and fallback mechanism, let's apply this fix to your main application.

### ✅ Summary of Issues Fixed:

1. **Added proper authentication** - Login with token before accessing models
2. **Implemented error handling** - Catch OSError for gated repositories  
3. **Added fallback mechanism** - Use public Zephyr-7b when StarCoder isn't accessible
4. **Improved user messaging** - Clear instructions for requesting access

### 🔧 Next Steps:

1. **Update your `model_utils.py`** - The improved version has already been applied
2. **Request access to gated models**: 
   - Visit https://huggingface.co/bigcode/starcoderbase-3b
   - Click "Request Access" and fill out the form
   - Wait for approval (timing depends on the repository; some grant access immediately)
3. **Verify your token has correct permissions**:
   - Go to https://huggingface.co/settings/tokens
   - Ensure token has "Read repositories" permission
4. **Test your application** - It should now fall back gracefully to Zephyr-7b

### 🎯 Expected Behavior:

- ✅ If you have access: Uses Fine-tuned StarCoder as intended
- ✅ If access denied: Automatically falls back to Zephyr-7b with informative message
- ✅ No more crashes: Application continues working regardless of gated model access

In [None]:
# Final verification checklist
print("🔍 FINAL VERIFICATION CHECKLIST")
print("=" * 50)

# Check 1: Token availability
token_status = "✅ Available" if hf_token else "❌ Missing"
print(f"1. Hugging Face Token: {token_status}")

# Check 2: Authentication status  
try:
    api = HfApi()
    user_info = api.whoami(token=hf_token) if hf_token else None
    auth_status = f"✅ Authenticated as {user_info['name']}" if user_info else "❌ Not authenticated"
except Exception:  # avoid a bare except, which would also swallow KeyboardInterrupt
    auth_status = "❌ Authentication failed"
print(f"2. Authentication: {auth_status}")

# Check 3: Gated model access
gated_access = "✅ Access granted" if access_granted else "❌ Access denied (using fallback)"
print(f"3. StarCoder Access: {gated_access}")

# Check 4: Fallback model 
fallback_status = "✅ Working" if 'test_model' in locals() else "❌ Failed to load"
print(f"4. Fallback Model: {fallback_status}")

print("\n🎯 RECOMMENDED ACTIONS:")
if not hf_token:
    print("• Add your Hugging Face token to .streamlit/secrets.toml")
if not access_granted:
    print("• Request access to https://huggingface.co/bigcode/starcoderbase-3b") 
    print("• Your application will use Zephyr-7b until access is granted")
if auth_status.startswith("❌"):
    print("• Verify your token is valid and has correct permissions")

print("\n✅ Your application should now run without crashes!")
print("The fixes have been applied to your model_utils.py file.")