<a href="https://colab.research.google.com/github/soberbichler/DHd_Workshop_2026/blob/main/Inference_API_Huggingface.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Running LLMs via Hugging Face




## Requirements for Hugging Face Inference API

* Free Hugging Face account - No paid subscription required (free tier available)
* Read access token - Generate a token from your account settings
* No payment method required - The free tier includes a limited usage allowance

## Authentication Setup

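* Create your access token at huggingface.co/settings/tokens (you will be given an API token as part of the workshop)
* The token only needs "Read" permissions (write access is not needed for inference)
* Store the token under the name HF_TOKEN. In Colab, add it as a secret named HF_TOKEN; the notebook code reads it with `userdata.get("HF_TOKEN")`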
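
Outside Colab, one option is to keep the token in an environment variable. The sketch below is an illustration under that assumption, not part of the workshop notebook: `load_hf_token` is a hypothetical helper that reads `HF_TOKEN` from the environment and fails early with a clear message if it is missing.

```python
import os


def load_hf_token(env_var="HF_TOKEN"):
    """Return the token stored in the given environment variable.

    Hypothetical helper for illustration; raises if the variable
    is unset or contains only whitespace.
    """
    token = os.environ.get(env_var, "").strip()
    if not token:
        raise RuntimeError(
            f"{env_var} is not set. Export it in your shell or add it as a Colab secret."
        )
    return token
```

With the variable exported, `HF_TOKEN = load_hf_token()` could stand in for the `userdata.get("HF_TOKEN")` line in the code cell. Avoid pasting the token directly into a notebook you intend to share.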


## Get Started

1.   Paste the Wikipedia article on Machine Learning under "Article" and ask the model to explain what Machine Learning is, to test whether this notebook works as intended. Compare your answer with your neighbor's to see if the output is consistent.

2.   Add the narrative event detection prompt and a newspaper article (German, French, or English), all available in our folder. Compare the models against each other, and compare each model's output against the "ideal" results.

3.   What are your conclusions? Can you improve the results by changing the prompt or model parameters?
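
For the comparison in step 2, a simple score can make differences between models easier to discuss. The sketch below is one possible approach, not the workshop's evaluation method: `compare_to_ideal` is a hypothetical helper that assumes both the model output and the "ideal" result are JSON arrays, and counts exact matches between their items. The example data and its `event`/`year` fields are made up; the real schema depends on your prompt.

```python
import json


def compare_to_ideal(model_output, ideal_output):
    """Compare two JSON arrays item by item using exact matching.

    Both arguments are JSON strings. Items are serialized with sorted keys
    so that key order inside an object does not affect the comparison.
    """
    def canonical_items(text):
        return {
            json.dumps(item, sort_keys=True, ensure_ascii=False)
            for item in json.loads(text)
        }

    predicted = canonical_items(model_output)
    ideal = canonical_items(ideal_output)
    matched = predicted & ideal

    # Precision: share of predicted items that are in the ideal result.
    # Recall: share of ideal items that the model found.
    precision = len(matched) / len(predicted) if predicted else 0.0
    recall = len(matched) / len(ideal) if ideal else 0.0
    return {"matched": len(matched), "precision": precision, "recall": recall}


# Made-up example data for illustration only.
model_json = '[{"event": "strike", "year": 1912}, {"event": "fire", "year": 1913}]'
ideal_json = '[{"year": 1912, "event": "strike"}, {"event": "flood", "year": 1914}]'
print(compare_to_ideal(model_json, ideal_json))
```

Exact matching is strict: two descriptions of the same event that differ by a single word count as a miss. Treat the numbers as a rough first signal and read the outputs side by side before drawing conclusions.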


In [None]:
from huggingface_hub import InferenceClient
import json
import time

# CONFIGURATION
from google.colab import userdata
HF_TOKEN = userdata.get("HF_TOKEN")
# Model to query. Alternatives to try:
#   meta-llama/Llama-3.3-70B-Instruct
#   swiss-ai/Apertus-70B-Instruct-2509
MODEL = "allenai/Olmo-3.1-32B-Instruct"

# PROMPT: replace the placeholder with your instructions for the model
SYSTEM_PROMPT = """Test"""

# ARTICLE: paste the text to analyze here
ARTICLE = """Add historical newspaper article here"""


# MAIN FUNCTION
def extract_events(article_text, model_name=MODEL):
    """
    Extract events from article using HF Inference API

    Args:
        article_text: The newspaper article text
        model_name: HuggingFace model to use

    Returns:
        Extracted events as string (JSON format)
    """
    print("Starting event extraction...")
    print(f"Article length: {len(article_text)} characters")
    print(f"Using model: {model_name}")

    # Initialize client
    client = InferenceClient(token=HF_TOKEN)

    # Prepare messages with proper system prompt
    messages = [
        {
            "role": "system",
            "content": SYSTEM_PROMPT
        },
        {
            "role": "user",
            "content": f"## EXTRACT FROM THIS TEXT:\n\n{article_text}\n\nReturn only JSON array."
        }
    ]

    # Call API
    print("Calling Inference API...")
    start_time = time.time()

    try:
        response = client.chat_completion(
            messages=messages,
            model=model_name,
            max_tokens=4000,
            temperature=0.0
        )

        elapsed = time.time() - start_time
        print(f"API call completed in {elapsed:.2f} seconds")

        # Extract result
        result = response.choices[0].message.content
        return result

    except Exception as e:
        print(f"Error: {e}")
        return None


def main():
    # Extract events
    events = extract_events(ARTICLE)

    if events:
        print(events)

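        # Try to parse the output as JSON and count the top-level items
        try:
            # Strip Markdown code fences the model may have added
            events_clean = events.replace("```json", "").replace("```", "").strip()
            events_list = json.loads(events_clean)
            print(f"\nParsed JSON with {len(events_list)} top-level items.")
        except (json.JSONDecodeError, TypeError):
            print("\nCould not parse as JSON, but extraction completed.")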
    else:
        print("Extraction failed")


if __name__ == "__main__":
    main()