diff --git a/.gitignore b/.gitignore index ef81b1e..f67f53a 100644 --- a/.gitignore +++ b/.gitignore @@ -1 +1,2 @@ /.venv/ +.DS_Store \ No newline at end of file diff --git a/notebooks/Agentic-Weekly-AI-News-TLDR.ipynb b/notebooks/Agentic-Weekly-AI-News-TLDR.ipynb new file mode 100644 index 0000000..51fac48 --- /dev/null +++ b/notebooks/Agentic-Weekly-AI-News-TLDR.ipynb @@ -0,0 +1,4541 @@ +{ + "cells": [ + { + "cell_type": "markdown", + "id": "09d88f53", + "metadata": {}, + "source": [ + "# Building an AI Weekly Newsletter Pipeline\n", + "\n", + "The AI industry moves fast. Every week brings new research papers, blog posts, product announcements, and technical breakthroughs. Keeping up with developments from ArXiv, OpenAI, Anthropic, Hugging Face, DeepLearning.AI, and other sources can be overwhelming. How do you stay informed without spending hours reading through dozens of publications?\n", + "\n", + "## The Challenge\n", + "\n", + "AI news comes in many formats—research papers (PDFs), blog posts (HTML), newsletters, and articles. Manually tracking and summarizing content from multiple sources is time-consuming and often incomplete. What busy professionals need is an automated system that collects relevant AI content and generates a concise weekly summary of what matters.\n", + "\n", + "## The Solution\n", + "\n", + "This notebook demonstrates an end-to-end pipeline for collecting, processing, and summarizing AI industry content into a weekly newsletter. We use:\n", + "- **Automated scraping** to collect recent AI papers and blog posts\n", + "- **Unstructured's hi_res processing** to extract clean text from PDFs and HTML\n", + "- **AI-powered summarization** to create concise, actionable summaries\n", + "- **Customizable prompts** so you can tailor the newsletter to your audience\n", + "\n", + "## What We'll Build\n", + "\n", + "A complete weekly AI newsletter system that scrapes the last 7 days of content from ArXiv and leading AI blogs, processes the documents through Unstructured's API, and generates both detailed summaries and an executive brief.\n", + "\n", + "```\n", + "┌──────────────────────────────────────────┐\n", + "│ WEEKLY DATA COLLECTION (Last 7 Days) │\n", + "├──────────────────────────────────────────┤\n", + "│ • ArXiv Papers (PDFs) │\n", + "│ • Hugging Face Blog (HTML) │\n", + "│ • OpenAI News (HTML) │\n", + "│ • DeepLearning.AI Batch (HTML) │\n", + "│ • Anthropic Research (HTML) │\n", + "└────────────────┬─────────────────────────┘\n", + " │\n", + " ▼\n", + "┌──────────────────────────────────────────┐\n", + "│ S3 Storage (Collected Content) │\n", + "└────────────────┬─────────────────────────┘\n", + " │\n", + " ▼\n", + "┌──────────────────────────────────────────┐\n", + "│ Unstructured API Processing │\n", + "│ • Hi-Res PDF Partitioning │\n", + "│ • HTML Text Extraction │\n", + "│ • Page-Based Chunking │\n", + "└────────────────┬─────────────────────────┘\n", + " │\n", + " ▼\n", + "┌──────────────────────────────────────────┐\n", + "│ MongoDB (Structured Content) │\n", + "└────────────────┬─────────────────────────┘\n", + " │\n", + " ▼\n", + "┌──────────────────────────────────────────┐\n", + "│ AI Summarization & Newsletter Gen │\n", + "│ • Detailed Publication Summaries │\n", + "│ • Executive Brief (~700 words) │\n", + "└──────────────────────────────────────────┘\n", + "```\n", + "\n", + "**Note**: In production, you would run the scraping daily via cron job. 
For this demo, we simulate a week's worth of data collection by scraping 7 days of content in one batch.\n", + "\n", + "By the end, you'll have a working system that can automatically generate weekly AI newsletters tailored to your needs." + ] + }, + { + "cell_type": "markdown", + "id": "063114d2", + "metadata": {}, + "source": [ + "## Getting Started: Your Unstructured API Key\n", + "\n", + "You'll need an Unstructured API key to access the document processing platform.\n", + "\n", + "### Sign Up and Get Your API Key\n", + "\n", + "Visit https://platform.unstructured.io to sign up for a free account, navigate to API Keys in the sidebar, and generate your API key. For Team or Enterprise accounts, select the correct organizational workspace before creating your key.\n", + "\n", + "**Need help?** Contact Unstructured Support at support@unstructured.io" + ] + }, + { + "cell_type": "markdown", + "id": "8d21c92f", + "metadata": {}, + "source": [ + "## Configuration: Setting Up Your Environment\n", + "\n", + "We'll configure your environment with the necessary API keys and credentials to connect to data sources and AI services." + ] + }, + { + "cell_type": "markdown", + "id": "f353fb74", + "metadata": {}, + "source": [ + "### Creating a .env File in Google Colab\n", + "\n", + "For better security and organization, we'll create a `.env` file directly in your Colab environment. Run the code cell below to create the file with placeholder values, then edit it with your actual credentials.\n", + "\n", + "After running the code cell, you'll need to replace each placeholder value (like `your-unstructured-api-key`) with your actual API keys and credentials." + ] + }, + { + "cell_type": "code", + "execution_count": 1, + "id": "d09f538b", + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "📝 .env file already exists - skipping creation\n", + "💡 Using existing .env file with current configuration\n" + ] + } + ], + "source": [ + "import os\n", + "\n", + "def create_dotenv_file():\n", + " \"\"\"Create a .env file with placeholder values for the user to fill in, only if it doesn't already exist.\"\"\"\n", + " \n", + " # Check if .env file already exists\n", + " if os.path.exists('.env'):\n", + " print(\"📝 .env file already exists - skipping creation\")\n", + " print(\"💡 Using existing .env file with current configuration\")\n", + " return\n", + " \n", + " env_content = \"\"\"# AI Newsletter Pipeline Environment Configuration\n", + "# Fill in your actual values below\n", + "# Configuration - Set these explicitly\n", + "\n", + "# ===================================================================\n", + "# AWS CONFIGURATION\n", + "# ===================================================================\n", + "AWS_ACCESS_KEY_ID=\"your-aws-access-key-id\"\n", + "AWS_SECRET_ACCESS_KEY=\"your-aws-secret-access-key\"\n", + "AWS_REGION=\"us-east-1\"\n", + "\n", + "# ===================================================================\n", + "# UNSTRUCTURED API CONFIGURATION \n", + "# ===================================================================\n", + "UNSTRUCTURED_API_KEY=\"your-unstructured-api-key\"\n", + "UNSTRUCTURED_API_URL=\"https://platform.unstructuredapp.io/api/v1\"\n", + "\n", + "# ===================================================================\n", + "# MONGODB CONFIGURATION\n", + "# ===================================================================\n", + "MONGODB_URI=\"mongodb+srv://your-username:your-password@your-cluster-host/?retryWrites=true&w=majority\"\n", +
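"# Firecrawl key - the blog-scraping step below reads FIRECRAWL_API_KEY, so it belongs in this file too\n", + "FIRECRAWL_API_KEY=\"your-firecrawl-api-key\"\n", +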
"MONGODB_DATABASE=\"scraped_publications\"\n", + "MONGODB_COLLECTION=\"documents\"\n", + "\n", + "# ===================================================================\n", + "# PIPELINE DATA SOURCES\n", + "# ===================================================================\n", + "S3_SOURCE_BUCKET=\"your-s3-bucket-name\"\n", + "\n", + "# ===================================================================\n", + "# OPENAI API CONFIGURATION \n", + "# ===================================================================\n", + "OPENAI_API_KEY=\"your-openai-api-key\"\n", + "\"\"\"\n", + " \n", + " with open('.env', 'w') as f:\n", + " f.write(env_content)\n", + " \n", + " print(\"✅ Created .env file with placeholder values\")\n", + " print(\"📝 Please edit the .env file and replace the placeholder values with your actual credentials\")\n", + " print(\"🔑 Required: UNSTRUCTURED_API_KEY, AWS credentials, MongoDB credentials, Firecrawl API key\")\n", + " print(\"📁 S3_SOURCE_BUCKET should point to your AI content storage bucket\")\n", + " print(\"🤖 OPENAI_API_KEY needed for AI-powered summarization and newsletter generation\")\n", + "\n", + "create_dotenv_file()" + ] + }, + { + "cell_type": "markdown", + "id": "82fd2988", + "metadata": {}, + "source": [ + "### Installing Required Dependencies\n", + "\n", + "Installing the Python packages needed: Unstructured client, MongoDB connector, AWS SDK, OpenAI integration, and document processing dependencies." + ] + }, + { + "cell_type": "code", + "execution_count": 2, + "id": "45877702", + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "✅ Configuration loaded successfully\n" + ] + }, + { + "name": "stderr", + "output_type": "stream", + "text": [ + "\n", + "\u001b[1m[\u001b[0m\u001b[34;49mnotice\u001b[0m\u001b[1;39;49m]\u001b[0m\u001b[39;49m A new release of pip is available: \u001b[0m\u001b[31;49m25.1.1\u001b[0m\u001b[39;49m -> \u001b[0m\u001b[32;49m25.2\u001b[0m\n", + "\u001b[1m[\u001b[0m\u001b[34;49mnotice\u001b[0m\u001b[1;39;49m]\u001b[0m\u001b[39;49m To update, run: \u001b[0m\u001b[32;49mpip install --upgrade pip\u001b[0m\n" + ] + } + ], + "source": [ + "import sys, subprocess\n", + "\n", + "def ensure_notebook_deps() -> None:\n", + " packages = [\n", + " \"jupytext\",\n", + " \"python-dotenv\", \n", + " \"unstructured-client\",\n", + " \"boto3\",\n", + " \"PyYAML\",\n", + " \"langchain\",\n", + " \"langchain-openai\",\n", + " \"pymongo\",\n", + " \"firecrawl-py\",\n", + " \"arxiv\",\n", + " \"python-dateutil\"\n", + " ]\n", + " try:\n", + " subprocess.check_call([sys.executable, \"-m\", \"pip\", \"install\", \"-q\", *packages])\n", + " except Exception:\n", + " # If install fails, continue; imports below will surface actionable errors\n", + " pass\n", + "\n", + "# Install notebook dependencies (safe no-op if present)\n", + "ensure_notebook_deps()\n", + "\n", + "import os\n", + "import time\n", + "import json\n", + "import zipfile\n", + "import tempfile\n", + "import requests\n", + "from pathlib import Path\n", + "from dotenv import load_dotenv\n", + "from urllib.parse import urlparse\n", + "\n", + "import boto3\n", + "from botocore.exceptions import ClientError, NoCredentialsError\n", + "\n", + "from unstructured_client import UnstructuredClient\n", + "from unstructured_client.models.operations import (\n", + " CreateSourceRequest,\n", + " CreateDestinationRequest,\n", + " CreateWorkflowRequest\n", + ")\n", + "from unstructured_client.models.shared import (\n", + " CreateSourceConnector,\n", + " 
CreateDestinationConnector,\n", + " WorkflowNode,\n", + " WorkflowType,\n", + " CreateWorkflow\n", + ")\n", + "\n", + "# =============================================================================\n", + "# ENVIRONMENT CONFIGURATION\n", + "# =============================================================================\n", + "# Load from .env file if it exists\n", + "load_dotenv()\n", + "\n", + "# Configuration constants\n", + "SKIPPED = \"SKIPPED\"\n", + "UNSTRUCTURED_API_URL = os.getenv(\"UNSTRUCTURED_API_URL\", \"https://platform.unstructuredapp.io/api/v1\")\n", + "\n", + "# Get environment variables\n", + "UNSTRUCTURED_API_KEY = os.getenv(\"UNSTRUCTURED_API_KEY\")\n", + "AWS_ACCESS_KEY_ID = os.getenv(\"AWS_ACCESS_KEY_ID\")\n", + "AWS_SECRET_ACCESS_KEY = os.getenv(\"AWS_SECRET_ACCESS_KEY\")\n", + "AWS_REGION = os.getenv(\"AWS_REGION\")\n", + "S3_SOURCE_BUCKET = os.getenv(\"S3_SOURCE_BUCKET\")\n", + "S3_DESTINATION_BUCKET = os.getenv(\"S3_DESTINATION_BUCKET\")\n", + "S3_OUTPUT_PREFIX = os.getenv(\"S3_OUTPUT_PREFIX\", \"\")\n", + "MONGODB_URI = os.getenv(\"MONGODB_URI\")\n", + "MONGODB_DATABASE = os.getenv(\"MONGODB_DATABASE\")\n", + "MONGODB_COLLECTION = os.getenv(\"MONGODB_COLLECTION\")\n", + "OPENAI_API_KEY = os.getenv(\"OPENAI_API_KEY\")\n", + "FIRECRAWL_API_KEY = os.getenv(\"FIRECRAWL_API_KEY\")\n", + "\n", + "# Validation\n", + "REQUIRED_VARS = {\n", + " \"UNSTRUCTURED_API_KEY\": UNSTRUCTURED_API_KEY,\n", + " \"AWS_ACCESS_KEY_ID\": AWS_ACCESS_KEY_ID,\n", + " \"AWS_SECRET_ACCESS_KEY\": AWS_SECRET_ACCESS_KEY,\n", + " \"AWS_REGION\": AWS_REGION,\n", + " \"MONGODB_URI\": MONGODB_URI,\n", + " \"MONGODB_DATABASE\": MONGODB_DATABASE,\n", + " \"MONGODB_COLLECTION\": MONGODB_COLLECTION,\n", + " \"S3_SOURCE_BUCKET\": S3_SOURCE_BUCKET,\n", + "}\n", + "\n", + "missing_vars = [key for key, value in REQUIRED_VARS.items() if not value]\n", + "if missing_vars:\n", + " print(f\"❌ Missing required environment variables: {', '.join(missing_vars)}\")\n", + " print(\"Please set these environment variables or create a .env file with your credentials.\")\n", + " raise ValueError(f\"Missing required environment variables: {missing_vars}\")\n", + "\n", + "print(\"✅ Configuration loaded successfully\")" + ] + }, + { + "cell_type": "markdown", + "id": "0067222b", + "metadata": { + "lines_to_next_cell": 0 + }, + "source": [] + }, + { + "cell_type": "markdown", + "id": "25e01ce8", + "metadata": {}, + "source": [ + "## AWS S3: Your Content Collection Repository\n", + "\n", + "Now that we have our environment configured, let's set up S3 as the central repository for collected AI content. The scraping pipeline will deposit PDFs (ArXiv papers) and HTML files (blog posts) into your S3 bucket, where they'll be ready for processing by the Unstructured API.\n", + "\n", + "### What You Need\n", + "\n", + "**An existing S3 bucket** to store scraped AI content. The following sections will automatically populate this bucket with:\n", + "- Recent AI/ML research papers from ArXiv (PDF format)\n", + "- Blog posts from Hugging Face, OpenAI, DeepLearning.AI, and Anthropic (HTML format)\n", + "\n", + "> **Note**: You'll need an AWS account with S3 access, an IAM user with read/write permissions, and your access keys (Access Key ID and Secret Access Key). 
For detailed S3 setup instructions, see the [Unstructured S3 source connector documentation](https://docs.unstructured.io/api-reference/api-services/source-connectors/s3).\n", + "\n", + "**Adaptable to Other Use Cases**: This same approach can be adapted for competitor tracking, industry news monitoring, internal document aggregation, or any scenario where you need to collect and summarize content from multiple sources regularly." + ] + }, + { + "cell_type": "markdown", + "id": "462b776a", + "metadata": {}, + "source": [ + "## Automated Content Scraping: Gathering AI Industry Intelligence\n", + "\n", + "The first step in building a weekly AI newsletter is collecting content from multiple sources. This section demonstrates automated scraping that gathers recent AI research papers and blog posts.\n", + "\n", + "**Data Sources:**\n", + "1. **ArXiv** - Recent AI/ML research papers from cs.AI, cs.LG, cs.CL, cs.CV, and cs.NE categories\n", + "2. **AI Company Blogs** - Blog posts from Hugging Face, OpenAI, DeepLearning.AI, and Anthropic\n", + "\n", + "**Process Flow:**\n", + "```\n", + "ArXiv API → PDFs → S3\n", + "Firecrawl API → Blog HTML → S3\n", + " ↓\n", + " Unstructured Processing → MongoDB → AI Summarization\n", + "```" + ] + }, + { + "cell_type": "markdown", + "id": "e49fcf9a", + "metadata": {}, + "source": [ + "### Scraping ArXiv Research Papers\n", + "\n", + "This cell scrapes recent AI/ML papers from ArXiv, filters them by category, and uploads PDFs directly to your S3 bucket. The cell searches ArXiv for papers matching your criteria, downloads PDFs, and uploads them to S3 under `arxiv/papers/`.\n", + "\n", + "**Demo Configuration**: For this demo, we've capped the results at 5 articles to keep notebook runtime under 2 minutes. You can increase `MAX_RESULTS` in the code below to collect more papers for production use. Customize the `SEARCH_QUERY`, `ARXIV_CATEGORIES`, and `DAYS_BACK` parameters to focus on specific topics or adjust the date range." 
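+ "\n",
+ "For example, a narrower weekly digest could use ArXiv's field syntax (`cat:`, `abs:`, `AND`/`OR`) to target specific topics. A sketch with illustrative values (not the demo defaults):\n",
+ "\n",
+ "```python\n",
+ "# Illustrative configuration - tune to your own interests\n",
+ "SEARCH_QUERY = 'cat:cs.CL AND (abs:\"large language model\" OR abs:\"retrieval augmented generation\")'\n",
+ "MAX_RESULTS = 25  # raise the demo cap for a fuller newsletter\n",
+ "DAYS_BACK = 7\n",
+ "ARXIV_CATEGORIES = [\"cs.CL\", \"cs.AI\"]\n",
+ "```\n"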
+ ] + }, + { + "cell_type": "code", + "execution_count": 3, + "id": "14ecf856", + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "============================================================\n", + "📚 ARXIV PAPER SCRAPING\n", + "============================================================\n", + "\n", + "🔍 Searching for papers from the last 7 days\n", + " Query: artificial intelligence OR machine learning\n", + " Max results: 5\n", + " Categories: cs.AI, cs.LG, cs.CL, cs.CV, cs.NE\n", + "\n", + "📥 Searching ArXiv...\n", + "✅ Found 5 papers\n", + "\n", + "📄 Processing: Temporal Prompting Matters: Rethinking Referring Video Objec...\n", + " ArXiv ID: 2510.07319v1\n", + " Published: 2025-10-08\n", + " Categories: cs.CV\n", + "\n", + "📄 Processing: Artificial Hippocampus Networks for Efficient Long-Context M...\n", + " ArXiv ID: 2510.07318v1\n", + " Published: 2025-10-08\n", + " Categories: cs.CL, cs.AI, cs.LG\n", + "\n", + "📄 Processing: Quantum-enhanced Computer Vision: Going Beyond Classical Alg...\n", + " ArXiv ID: 2510.07317v1\n", + " Published: 2025-10-08\n", + " Categories: cs.CV\n", + "\n", + "📄 Processing: Vibe Checker: Aligning Code Evaluation with Human Preference...\n", + " ArXiv ID: 2510.07315v1\n", + " Published: 2025-10-08\n", + " Categories: cs.CL, cs.AI, cs.LG\n", + "\n", + "📄 Processing: GyroSwin: 5D Surrogates for Gyrokinetic Plasma Turbulence Si...\n", + " ArXiv ID: 2510.07314v1\n", + " Published: 2025-10-08\n", + " Categories: physics.plasm-ph, cs.AI, stat.ML\n", + "\n", + "============================================================\n", + "✅ ARXIV SCRAPING COMPLETE\n", + "============================================================\n", + " 📥 Papers scraped: 5\n", + " ⏭️ Papers skipped: 0\n", + " 📦 S3 Bucket: ai-papers-and-blogs-notebook\n", + " 📁 S3 Prefix: arxiv/papers/\n" + ] + } + ], + "source": [ + "# ============================================================\n", + "# CONFIGURATION - Customize these parameters\n", + "# ============================================================\n", + "\n", + "# Search configuration\n", + "SEARCH_QUERY = \"artificial intelligence OR machine learning\"\n", + "MAX_RESULTS = 5 # Number of papers to retrieve (capped for demo - increase for production)\n", + "DAYS_BACK = 7 # How many days back to search\n", + "ARXIV_CATEGORIES = [\"cs.AI\", \"cs.LG\", \"cs.CL\", \"cs.CV\", \"cs.NE\"] # AI/ML categories\n", + "\n", + "# ============================================================\n", + "# ArXiv Scraping Logic\n", + "# ============================================================\n", + "\n", + "import arxiv\n", + "from datetime import datetime, timedelta\n", + "from io import BytesIO\n", + "\n", + "print(\"=\"*60)\n", + "print(\"📚 ARXIV PAPER SCRAPING\")\n", + "print(\"=\"*60)\n", + "\n", + "# Calculate date threshold (timezone-aware to match arxiv library)\n", + "from datetime import timezone\n", + "date_threshold = datetime.now(timezone.utc) - timedelta(days=DAYS_BACK)\n", + "print(f\"\\n🔍 Searching for papers from the last {DAYS_BACK} days\")\n", + "print(f\" Query: {SEARCH_QUERY}\")\n", + "print(f\" Max results: {MAX_RESULTS}\")\n", + "print(f\" Categories: {', '.join(ARXIV_CATEGORIES)}\")\n", + "\n", + "# Initialize S3 client\n", + "s3 = boto3.client(\n", + " 's3',\n", + " aws_access_key_id=AWS_ACCESS_KEY_ID,\n", + " aws_secret_access_key=AWS_SECRET_ACCESS_KEY,\n", + " region_name=AWS_REGION\n", + ")\n", + "\n", + "# Search ArXiv\n", + "print(f\"\\n📥 Searching ArXiv...\")\n", + "client = 
arxiv.Client()\n", + "search = arxiv.Search(\n", + " query=SEARCH_QUERY,\n", + " max_results=MAX_RESULTS,\n", + " sort_by=arxiv.SortCriterion.SubmittedDate\n", + ")\n", + "\n", + "results = list(client.results(search))\n", + "print(f\"✅ Found {len(results)} papers\")\n", + "\n", + "# Filter and upload papers\n", + "scraped_count = 0\n", + "skipped_count = 0\n", + "\n", + "for paper in results:\n", + " # Check if paper is in desired categories\n", + " categories = [cat.split('.')[-1] for cat in paper.categories]\n", + " if not any(cat in ARXIV_CATEGORIES for cat in paper.categories):\n", + " skipped_count += 1\n", + " continue\n", + " \n", + " # Check if paper is recent enough (both datetimes are now timezone-aware)\n", + " if paper.published < date_threshold:\n", + " skipped_count += 1\n", + " continue\n", + " \n", + " print(f\"\\n📄 Processing: {paper.title[:60]}...\")\n", + " print(f\" ArXiv ID: {paper.entry_id.split('/')[-1]}\")\n", + " print(f\" Published: {paper.published.strftime('%Y-%m-%d')}\")\n", + " print(f\" Categories: {', '.join(paper.categories[:3])}\")\n", + " \n", + " try:\n", + " # Download PDF\n", + " pdf_url = paper.pdf_url\n", + " pdf_response = requests.get(pdf_url, timeout=30)\n", + " pdf_content = pdf_response.content\n", + " \n", + " # Generate S3 key\n", + " arxiv_id = paper.entry_id.split('/')[-1].replace('.', 'v')\n", + " s3_key = f\"arxiv/papers/{arxiv_id}.pdf\"\n", + " \n", + " # Upload to S3\n", + " s3.put_object(\n", + " Bucket=S3_SOURCE_BUCKET,\n", + " Key=s3_key,\n", + " Body=pdf_content,\n", + " ContentType='application/pdf',\n", + " Metadata={\n", + " 'title': paper.title[:1000], # S3 metadata has size limits\n", + " 'published': paper.published.isoformat(),\n", + " 'arxiv_id': arxiv_id,\n", + " 'source': 'arxiv'\n", + " }\n", + " )\n", + " \n", + " scraped_count += 1\n", + " \n", + " except Exception as e:\n", + " print(f\" ❌ Error: {str(e)[:100]}\")\n", + " skipped_count += 1\n", + "\n", + "# Summary\n", + "print(f\"\\n{'='*60}\")\n", + "print(f\"✅ ARXIV SCRAPING COMPLETE\")\n", + "print(f\"{'='*60}\")\n", + "print(f\" 📥 Papers scraped: {scraped_count}\")\n", + "print(f\" ⏭️ Papers skipped: {skipped_count}\")\n", + "print(f\" 📦 S3 Bucket: {S3_SOURCE_BUCKET}\")\n", + "print(f\" 📁 S3 Prefix: arxiv/papers/\") " + ] + }, + { + "cell_type": "markdown", + "id": "ffc3f185", + "metadata": {}, + "source": [ + "### Scraping AI Company Blogs with Firecrawl\n", + "\n", + "This cell uses Firecrawl to scrape recent blog posts from AI companies, extracting clean HTML content. Firecrawl handles JavaScript-rendered content and provides clean HTML output, making it ideal for scraping modern AI company blogs.\n", + "\n", + "**Demo Configuration**: For this demo, we've commented out all blog sources except Hugging Face to keep notebook runtime under 2 minutes. You can uncomment the other sources in the code below (OpenAI, DeepLearning.AI, and Anthropic) to experiment with collecting data from those sources. Customize the `DAYS_BACK` parameter or modify the `BLOG_SOURCES` dictionary to add your own sources." 
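+ "\n",
+ "Adding another publication is mostly a matter of appending an entry to `BLOG_SOURCES`; because `filter_blog_links` applies source-specific URL rules, a new source also needs its own small branch there. The entry below is purely illustrative:\n",
+ "\n",
+ "```python\n",
+ "# Illustrative only - not part of the demo configuration\n",
+ "BLOG_SOURCES[\"google_research\"] = {\n",
+ "    \"name\": \"Google Research\",\n",
+ "    \"directory_url\": \"https://research.google/blog/\",\n",
+ "    \"icon\": \"🧪\"\n",
+ "}\n",
+ "```\n"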
+ ] + }, + { + "cell_type": "code", + "execution_count": 4, + "id": "0a3d5820", + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "============================================================\n", + "🌐 BLOG SCRAPING WITH FIRECRAWL\n", + "============================================================\n", + "\n", + "🔍 Scraping posts from the last 7 days\n", + " Sources: 1\n", + "\n", + "🤗 Hugging Face\n", + " ──────────────────────────────────────────────────\n", + " 📍 https://huggingface.co/blog\n", + " 🔄 Scraping directory...\n", + " ✅ Found 35 blog post links\n", + " 📥 Scraping: https://huggingface.co/blog/AdamF92/reactive-transformer-int...\n", + " 📥 Scraping: https://huggingface.co/blog/JohnsonZheng03/ml-agent-trick-au...\n", + " 📥 Scraping: https://huggingface.co/blog/NormalUhr/grpo...\n", + " 📥 Scraping: https://huggingface.co/blog/NormalUhr/grpo-to-dapo-and-gspo...\n", + " 📥 Scraping: https://huggingface.co/blog/NormalUhr/rlhf-pipeline...\n", + " 📥 Scraping: https://huggingface.co/blog/bigcode/arena...\n", + " 📥 Scraping: https://huggingface.co/blog/catherinearnett/in-defense-of-to...\n", + " 📥 Scraping: https://huggingface.co/blog/dots-ocr-ne...\n", + " 📥 Scraping: https://huggingface.co/blog/driaforall/mem-agent-blog...\n", + " 📥 Scraping: https://huggingface.co/blog/faster-transformers...\n", + " 📊 Scraped 10 posts from Hugging Face\n", + "\n", + "============================================================\n", + "✅ BLOG SCRAPING COMPLETE\n", + "============================================================\n", + " 📥 Total posts scraped: 10\n", + " 📦 S3 Bucket: ai-papers-and-blogs-notebook\n", + " 📁 S3 Prefix: blog-posts/\n", + "\n", + "💡 Note: Posts are now ready for Unstructured processing!\n" + ] + } + ], + "source": [ + "# ============================================================\n", + "# CONFIGURATION - Customize these parameters\n", + "# ============================================================\n", + "\n", + "# Scraping configuration\n", + "DAYS_BACK = 7 # How many days of recent posts to retrieve\n", + "\n", + "# Blog source URLs (pre-configured)\n", + "BLOG_SOURCES = {\n", + " \"huggingface\": {\n", + " \"name\": \"Hugging Face\",\n", + " \"directory_url\": \"https://huggingface.co/blog\",\n", + " \"icon\": \"🤗\"\n", + " },\n", + " # \"openai\": {\n", + " # \"name\": \"OpenAI\",\n", + " # \"directory_url\": \"https://openai.com/news/\",\n", + " # \"icon\": \"🚀\"\n", + " # },\n", + " # \"deeplearning\": {\n", + " # \"name\": \"DeepLearning.AI\",\n", + " # \"directory_url\": \"https://www.deeplearning.ai/the-batch/\",\n", + " # \"icon\": \"📚\"\n", + " # },\n", + " # \"anthropic\": {\n", + " # \"name\": \"Anthropic\",\n", + " # \"directory_url\": \"https://www.anthropic.com/research\",\n", + " # \"icon\": \"🔬\"\n", + " # }\n", + "}\n", + "\n", + "# ============================================================\n", + "# Blog Scraping Logic with Firecrawl\n", + "# ============================================================\n", + "\n", + "from firecrawl import Firecrawl\n", + "from datetime import datetime, timedelta\n", + "from urllib.parse import urlparse\n", + "import re\n", + "\n", + "print(\"=\"*60)\n", + "print(\"🌐 BLOG SCRAPING WITH FIRECRAWL\")\n", + "print(\"=\"*60)\n", + "\n", + "# Helper function to convert Firecrawl Document objects to dictionaries\n", + "def convert_document_to_dict(doc):\n", + " \"\"\"Convert Firecrawl Document object to dictionary format.\"\"\"\n", + " if isinstance(doc, dict):\n", + " return doc\n", 
+ " \n", + " # Handle Document object from newer firecrawl-py versions\n", + " result_dict = {}\n", + " \n", + " # Get attributes from the Document object\n", + " if hasattr(doc, 'markdown'):\n", + " result_dict['markdown'] = doc.markdown\n", + " if hasattr(doc, 'html'):\n", + " result_dict['html'] = doc.html\n", + " if hasattr(doc, 'links'):\n", + " result_dict['links'] = doc.links if doc.links else []\n", + " if hasattr(doc, 'metadata'):\n", + " # metadata is also an object, convert to dict\n", + " metadata_obj = doc.metadata\n", + " if metadata_obj:\n", + " if isinstance(metadata_obj, dict):\n", + " result_dict['metadata'] = metadata_obj\n", + " else:\n", + " # Convert metadata object to dict using __dict__ or vars()\n", + " result_dict['metadata'] = vars(metadata_obj) if hasattr(metadata_obj, '__dict__') else {}\n", + " else:\n", + " result_dict['metadata'] = {}\n", + " if hasattr(doc, 'extract'):\n", + " result_dict['json'] = doc.extract\n", + " \n", + " return result_dict\n", + "\n", + "# Filter blog links to exclude non-blog content\n", + "def filter_blog_links(links, source_key, directory_url):\n", + " \"\"\"Filter links to find actual blog posts, excluding images, profiles, etc.\"\"\"\n", + " # Blacklist of specific URLs to exclude\n", + " EXCLUDED_URLS = [\n", + " 'https://huggingface.co/blog/community',\n", + " 'https://anthropic.com/press-kit',\n", + " ]\n", + " \n", + " # Extract domain from directory URL\n", + " directory_domain = urlparse(directory_url).netloc\n", + " \n", + " blog_links = []\n", + " \n", + " for link in links:\n", + " if not isinstance(link, str):\n", + " continue\n", + " \n", + " # Skip non-HTTP protocols\n", + " if not link.startswith('http'):\n", + " continue\n", + " \n", + " # Skip image files\n", + " if any(link.lower().endswith(ext) for ext in ['.png', '.jpg', '.jpeg', '.gif', '.svg', '.webp']):\n", + " continue\n", + " \n", + " # Skip CDN and avatar URLs\n", + " if 'cdn-avatars' in link or '/assets/' in link:\n", + " continue\n", + " \n", + " # Only include links from the same domain\n", + " link_domain = urlparse(link).netloc\n", + " if link_domain != directory_domain:\n", + " continue\n", + " \n", + " # Source-specific filtering\n", + " if source_key == 'huggingface':\n", + " # Must have /blog/ and content after it (not just directory or community)\n", + " if '/blog/' in link:\n", + " blog_parts = link.split('/blog/')\n", + " if len(blog_parts) > 1 and blog_parts[1].strip('/'):\n", + " # Exclude community page\n", + " if link not in EXCLUDED_URLS:\n", + " blog_links.append(link)\n", + " \n", + " elif source_key == 'deeplearning':\n", + " # Must have /the-batch/ but NOT /tag/ (tag pages are navigation)\n", + " if '/the-batch/' in link and '/tag/' not in link:\n", + " blog_links.append(link)\n", + " \n", + " elif source_key == 'anthropic':\n", + " # Include both /news/ and /research/ posts\n", + " if '/news/' in link or '/research/' in link:\n", + " if link not in EXCLUDED_URLS:\n", + " blog_links.append(link)\n", + " \n", + " elif source_key == 'openai':\n", + " # OpenAI uses /index/ for actual articles\n", + " if '/index/' in link:\n", + " # Exclude category pages that end with these paths\n", + " category_pages = ['/product-releases/', '/research/', '/safety-alignment/', '/news/']\n", + " is_category = any(link.endswith(cat) for cat in category_pages)\n", + " if not is_category:\n", + " blog_links.append(link)\n", + " \n", + " # Remove duplicates and sort\n", + " return sorted(list(set(blog_links)))\n", + "\n", + "# Initialize Firecrawl and S3\n", 
+ "firecrawl_client = Firecrawl(api_key=FIRECRAWL_API_KEY)\n", + "s3 = boto3.client(\n", + " 's3',\n", + " aws_access_key_id=AWS_ACCESS_KEY_ID,\n", + " aws_secret_access_key=AWS_SECRET_ACCESS_KEY,\n", + " region_name=AWS_REGION\n", + ")\n", + "\n", + "date_threshold = datetime.now() - timedelta(days=DAYS_BACK)\n", + "print(f\"\\n🔍 Scraping posts from the last {DAYS_BACK} days\")\n", + "print(f\" Sources: {len(BLOG_SOURCES)}\")\n", + "\n", + "total_scraped = 0\n", + "\n", + "for source_key, source_info in BLOG_SOURCES.items():\n", + " icon = source_info[\"icon\"]\n", + " name = source_info[\"name\"]\n", + " directory_url = source_info[\"directory_url\"]\n", + " \n", + " print(f\"\\n{icon} {name}\")\n", + " print(f\" {'─'*50}\")\n", + " print(f\" 📍 {directory_url}\")\n", + " \n", + " try:\n", + " # Scrape directory page with link extraction\n", + " print(f\" 🔄 Scraping directory...\")\n", + " directory_result_raw = firecrawl_client.scrape(\n", + " url=directory_url,\n", + " formats=[\"markdown\", \"html\", \"links\"],\n", + " only_main_content=True\n", + " )\n", + " \n", + " # Convert Document to dict\n", + " directory_result = convert_document_to_dict(directory_result_raw)\n", + " \n", + " if not directory_result:\n", + " print(f\" ❌ Failed to scrape directory\")\n", + " continue\n", + " \n", + " # Extract and filter blog links\n", + " all_links = directory_result.get('links', [])\n", + " blog_links = filter_blog_links(all_links, source_key, directory_url)\n", + " \n", + " print(f\" ✅ Found {len(blog_links)} blog post links\")\n", + " \n", + " # Limit to 10 posts per source for demo\n", + " post_urls = blog_links[:10]\n", + " \n", + " # Scrape individual posts\n", + " scraped_count = 0\n", + " for post_url in post_urls:\n", + " try:\n", + " # Add delay to be respectful\n", + " import time\n", + " time.sleep(1)\n", + " \n", + " print(f\" 📥 Scraping: {post_url[:60]}...\")\n", + " \n", + " # Scrape post with HTML format\n", + " post_result_raw = firecrawl_client.scrape(\n", + " url=post_url,\n", + " formats=[\"html\"],\n", + " only_main_content=True\n", + " )\n", + " \n", + " # Convert Document to dict\n", + " post_result = convert_document_to_dict(post_result_raw)\n", + " \n", + " if not post_result or not post_result.get('html'):\n", + " print(f\" ⚠️ No HTML returned\")\n", + " continue\n", + " \n", + " html_content = post_result['html']\n", + " \n", + " # Generate S3 key\n", + " url_path = urlparse(post_url).path.strip('/').replace('/', '_')\n", + " timestamp = datetime.now().strftime(\"%Y%m%d_%H%M%S\")\n", + " s3_key = f\"blog-posts/{source_key}/{url_path}_{timestamp}.html\"\n", + " \n", + " # Upload to S3\n", + " s3.put_object(\n", + " Bucket=S3_SOURCE_BUCKET,\n", + " Key=s3_key,\n", + " Body=html_content.encode('utf-8'),\n", + " ContentType='text/html',\n", + " Metadata={\n", + " 'url': post_url[:1000],\n", + " 'source': source_key,\n", + " 'scraped_at': datetime.now().isoformat()\n", + " }\n", + " )\n", + " \n", + " scraped_count += 1\n", + " total_scraped += 1\n", + " \n", + " except Exception as e:\n", + " print(f\" ❌ Error: {str(e)[:100]}\")\n", + " \n", + " print(f\" 📊 Scraped {scraped_count} posts from {name}\")\n", + " \n", + " except Exception as e:\n", + " print(f\" ❌ Error scraping {name}: {str(e)[:100]}\")\n", + "\n", + "# Summary\n", + "print(f\"\\n{'='*60}\")\n", + "print(f\"✅ BLOG SCRAPING COMPLETE\")\n", + "print(f\"{'='*60}\")\n", + "print(f\" 📥 Total posts scraped: {total_scraped}\")\n", + "print(f\" 📦 S3 Bucket: {S3_SOURCE_BUCKET}\")\n", + "print(f\" 📁 S3 Prefix: 
blog-posts/\")\n", + "print(f\"\\n💡 Note: Posts are now ready for Unstructured processing!\")" + ] + }, + { + "cell_type": "markdown", + "id": "e5bfbd2c", + "metadata": {}, + "source": [ + "## S3 Source Connector\n", + "\n", + "Creating the connection to your S3 document repository. This connector will authenticate with your bucket, discover PDF files, and stream them to the processing pipeline.\n", + "\n", + "**Recursive Processing**: The connector is configured with `recursive: true` to access files within nested folder structures, ensuring comprehensive document discovery across your entire S3 bucket hierarchy.\n", + "\n", + "> **Note**: For detailed S3 source connector setup instructions, see the [Unstructured S3 source connector documentation](https://docs.unstructured.io/api-reference/workflow/sources/s3)." + ] + }, + { + "cell_type": "code", + "execution_count": 5, + "id": "0dfc7863", + "metadata": { + "lines_to_next_cell": 1 + }, + "outputs": [ + { + "name": "stderr", + "output_type": "stream", + "text": [ + "/Users/nvannest/Documents/GitHub/newsletter-scraping-and-summarization/venv/lib/python3.13/site-packages/pydantic/functional_validators.py:218: UserWarning: Pydantic serializer warnings:\n", + " PydanticSerializationUnexpectedValue(Expected `enum` - serialized value may not be as expected [input_value='s3', input_type=str])\n", + " function=lambda v, h: h(v),\n", + "/Users/nvannest/Documents/GitHub/newsletter-scraping-and-summarization/venv/lib/python3.13/site-packages/pydantic/main.py:463: UserWarning: Pydantic serializer warnings:\n", + " PydanticSerializationUnexpectedValue(Expected `enum` - serialized value may not be as expected [input_value='s3', input_type=str])\n", + " return self.__pydantic_serializer__.to_python(\n", + "INFO: HTTP Request: POST https://platform.unstructuredapp.io/api/v1/sources/ \"HTTP/1.1 200 OK\"\n" + ] + }, + { + "name": "stdout", + "output_type": "stream", + "text": [ + "✅ Created S3 PDF source connector: f10667f2-3430-4d20-8edb-e7a3d379bb66 -> s3://ai-papers-and-blogs-notebook/\n", + "📁 S3 source connector ready to read PDF documents from: ai-papers-and-blogs-notebook\n" + ] + } + ], + "source": [ + "def create_s3_source_connector():\n", + " \"\"\"Create an S3 source connector for PDF documents.\"\"\"\n", + " try:\n", + " if not S3_SOURCE_BUCKET:\n", + " raise ValueError(\"S3_SOURCE_BUCKET is required (bucket name, s3:// URL, or https:// URL)\")\n", + " value = S3_SOURCE_BUCKET.strip()\n", + "\n", + " if value.startswith(\"s3://\"):\n", + " s3_style = value if value.endswith(\"/\") else value + \"/\"\n", + " elif value.startswith(\"http://\") or value.startswith(\"https://\"):\n", + " parsed = urlparse(value)\n", + " host = parsed.netloc\n", + " path = parsed.path or \"/\"\n", + " bucket = host.split(\".s3.\")[0]\n", + " s3_style = f\"s3://{bucket}{path if path.endswith('/') else path + '/'}\"\n", + " else:\n", + " s3_style = f\"s3://{value if value.endswith('/') else value + '/'}\"\n", + " \n", + " with UnstructuredClient(api_key_auth=UNSTRUCTURED_API_KEY) as client:\n", + " response = client.sources.create_source(\n", + " request=CreateSourceRequest(\n", + " create_source_connector=CreateSourceConnector(\n", + " name=\"\",\n", + " type=\"s3\",\n", + " config={\n", + " \"remote_url\": s3_style,\n", + " \"recursive\": True, \n", + " \"key\": AWS_ACCESS_KEY_ID,\n", + " \"secret\": AWS_SECRET_ACCESS_KEY,\n", + " }\n", + " )\n", + " )\n", + " )\n", + " \n", + " source_id = response.source_connector_information.id\n", + " print(f\"✅ Created S3 PDF 
source connector: {source_id} -> {s3_style}\")\n", + " return source_id\n", + " \n", + " except Exception as e:\n", + " print(f\"❌ Error creating S3 source connector: {e}\")\n", + " return None\n", + "\n", + "# Create S3 source connector\n", + "source_id = create_s3_source_connector()\n", + "\n", + "if source_id:\n", + " print(f\"📁 S3 source connector ready to read PDF documents from: {S3_SOURCE_BUCKET}\")\n", + "else:\n", + " print(\"❌ Failed to create S3 source connector - check your credentials and bucket configuration\") " + ] + }, + { + "cell_type": "markdown", + "id": "b2baf926", + "metadata": {}, + "source": [ + "## MongoDB: Your Document Database\n", + "\n", + "MongoDB Atlas stores processed content from your AI papers and blog posts. The pipeline uses page-based chunking (up to 6k characters per chunk) to create structured, manageable documents for downstream summarization.\n", + "\n", + "### Requirements\n", + "\n", + "- **MongoDB Atlas cluster** (M10+ for production, M0 free tier for testing)\n", + "- **Network access** configured for your application IP\n", + "- **Database user** with read/write permissions\n", + "- **Connection string** in format: `mongodb+srv://:@/...`\n", + "\n", + "### Document Structure\n", + "\n", + "Each document represents one page-level chunk:\n", + "```json\n", + "{\n", + " \"type\": \"CompositeElement\",\n", + " \"text\": \"Full text content from this page/chunk...\",\n", + " \"metadata\": {\n", + " \"filename\": \"arxiv_2501.12345.pdf\",\n", + " \"page_number\": 1,\n", + " \"languages\": [\"eng\"]\n", + " }\n", + "}\n", + "```\n", + "\n", + "The collection is cleared before each processing run to ensure fresh data for newsletter generation." + ] + }, + { + "cell_type": "markdown", + "id": "ac8d0ec4", + "metadata": {}, + "source": [ + "## MongoDB Configuration and Collection Setup\n", + "\n", + "This cell validates your MongoDB connection and prepares the collection for processing. It confirms environment variables (`MONGODB_URI`, `MONGODB_DATABASE`, `MONGODB_COLLECTION`), creates the database and collection if needed, and clears any existing documents for a fresh run.\n", + "\n", + "> **Note**: If you're running this in Google Colab, you'll need to whitelist your notebook's IP address in MongoDB Network Access. Run `!curl ifconfig.me` in a cell to get your IP address, then add it to the \"Network Access\" section of your MongoDB Atlas cluster settings." 
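+ "\n",
+ "Before running the full setup, a quick `pymongo` ping is a handy sanity check - it fails fast with a clear error if the IP isn't whitelisted or the credentials are wrong (a minimal sketch using the same environment variables):\n",
+ "\n",
+ "```python\n",
+ "from pymongo import MongoClient\n",
+ "\n",
+ "# Short timeout so a blocked connection fails in seconds rather than minutes\n",
+ "client = MongoClient(MONGODB_URI, serverSelectionTimeoutMS=5000)\n",
+ "client.admin.command(\"ping\")\n",
+ "print(\"✅ MongoDB connection OK\")\n",
+ "```\n"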
+ ] + }, + { + "cell_type": "code", + "execution_count": 6, + "id": "7349b053", + "metadata": { + "lines_to_next_cell": 1 + }, + "outputs": [], + "source": [ + "def verify_collection_exists():\n", + " \"\"\"Verify that the MongoDB collection exists and is properly configured.\"\"\"\n", + " print(f\"🔍 Verifying collection '{MONGODB_COLLECTION}' exists...\")\n", + " \n", + " try:\n", + " from pymongo import MongoClient\n", + " \n", + " # Initialize MongoDB client\n", + " client = MongoClient(MONGODB_URI)\n", + " db = client[MONGODB_DATABASE]\n", + " \n", + " # Check if collection exists\n", + " existing_collections = db.list_collection_names()\n", + " \n", + " if MONGODB_COLLECTION not in existing_collections:\n", + " print(f\"❌ Collection '{MONGODB_COLLECTION}' does not exist!\")\n", + " return False\n", + " \n", + " # Get collection info to verify configuration\n", + " try:\n", + " collection = db[MONGODB_COLLECTION]\n", + " \n", + " # Count documents (optional check)\n", + " doc_count = collection.count_documents({})\n", + " print(f\"✅ Collection '{MONGODB_COLLECTION}' exists and is accessible\")\n", + " print(f\"📄 Current document count: {doc_count}\")\n", + " \n", + " return True\n", + " \n", + " except Exception as collection_error:\n", + " print(f\"⚠️ Collection exists but may have access issues: {collection_error}\")\n", + " return True # Don't fail if we can't get detailed info\n", + " \n", + " except ImportError:\n", + " print(\"⚠️ MongoDB client not available - collection verification skipped\")\n", + " return True\n", + " \n", + " except Exception as e:\n", + " print(f\"⚠️ Warning: Could not verify collection: {e}\")\n", + " return True # Don't fail the pipeline for verification issues\n", + "\n", + "def initialize_mongodb_collection():\n", + " \"\"\"Initialize MongoDB collection - create database and collection if needed, then clear existing data for fresh start.\"\"\"\n", + " print(\"🏗️ Initializing MongoDB collection...\")\n", + " \n", + " try:\n", + " from pymongo import MongoClient\n", + " \n", + " # Initialize client\n", + " client = MongoClient(MONGODB_URI)\n", + " \n", + " # Access database (will be created automatically if it doesn't exist)\n", + " db = client[MONGODB_DATABASE]\n", + " print(f\"✅ Connected to database '{MONGODB_DATABASE}'\")\n", + " \n", + " # List existing collections\n", + " existing_collections = db.list_collection_names()\n", + " \n", + " # Step 1: Ensure collection exists (create if needed)\n", + " if MONGODB_COLLECTION not in existing_collections:\n", + " print(f\"📝 Creating collection '{MONGODB_COLLECTION}'...\")\n", + " \n", + " # Create the collection (MongoDB creates it automatically on first write)\n", + " db.create_collection(MONGODB_COLLECTION)\n", + " print(f\"✅ Created collection '{MONGODB_COLLECTION}'\")\n", + " else:\n", + " print(f\"✅ Collection '{MONGODB_COLLECTION}' already exists\")\n", + " \n", + " # Step 2: Clear existing data\n", + " collection = db[MONGODB_COLLECTION]\n", + " delete_result = collection.delete_many({})\n", + " \n", + " deleted_count = delete_result.deleted_count\n", + " print(f\"🗑️ Cleared {deleted_count} existing documents\")\n", + " \n", + " print(f\"✅ Collection '{MONGODB_COLLECTION}' is ready for document processing\")\n", + " return True\n", + " \n", + " except ImportError:\n", + " print(\"⚠️ MongoDB client not available - install with: pip install pymongo\")\n", + " return False\n", + " \n", + " except Exception as e:\n", + " print(f\"❌ Error initializing MongoDB collection: {e}\")\n", + " print(\"💡 
Troubleshooting:\")\n", + " print(\" 1. Verify your MONGODB_URI connection string is correct\")\n", + " print(\" 2. Ensure your MongoDB cluster allows connections from your IP\")\n", + " print(\" 3. Check that your database user has appropriate permissions\")\n", + " print(f\" 4. Verify database name '{MONGODB_DATABASE}' and collection '{MONGODB_COLLECTION}'\")\n", + " return False\n", + "\n", + "def run_mongodb_preprocessing():\n", + " \"\"\"Validate MongoDB configuration and initialize collection for fresh processing.\"\"\"\n", + " print(\"🔧 Running MongoDB preprocessing...\")\n", + " \n", + " try:\n", + " # Validate required environment variables\n", + " required_vars = [\n", + " (\"MONGODB_URI\", MONGODB_URI),\n", + " (\"MONGODB_DATABASE\", MONGODB_DATABASE),\n", + " (\"MONGODB_COLLECTION\", MONGODB_COLLECTION)\n", + " ]\n", + " \n", + " for var_name, var_value in required_vars:\n", + " if not var_value:\n", + " raise ValueError(f\"{var_name} is required\")\n", + " \n", + " # Basic URI validation\n", + " if not MONGODB_URI.startswith(\"mongodb\"):\n", + " raise ValueError(\"MONGODB_URI must be a valid MongoDB connection string (mongodb:// or mongodb+srv://)\")\n", + " \n", + " print(f\"🔍 MongoDB Configuration:\")\n", + " print(f\" • Database: {MONGODB_DATABASE}\")\n", + " print(f\" • Collection: {MONGODB_COLLECTION}\")\n", + " print(\"✅ MongoDB configuration validation completed successfully\")\n", + " \n", + " # Initialize collection (create if needed + clear existing data)\n", + " if not initialize_mongodb_collection():\n", + " raise Exception(\"Failed to initialize MongoDB collection\")\n", + " \n", + " return True\n", + " \n", + " except Exception as e:\n", + " print(f\"❌ Error during MongoDB preprocessing: {e}\")\n", + " return False" + ] + }, + { + "cell_type": "markdown", + "id": "0b86c63d", + "metadata": {}, + "source": [ + "## MongoDB Destination Connector\n", + "\n", + "Creating the destination where processed documents will be stored. Your configured MongoDB collection will receive the extracted text content, metadata, and document structure ready for newsletter generation.\n", + "\n", + "> **Note**: For detailed MongoDB destination connector setup instructions, including cluster configuration and authentication requirements, see the [Unstructured MongoDB destination connector documentation](https://docs.unstructured.io/api-reference/workflow/destinations/mongodb)." 
+ ] + }, + { + "cell_type": "code", + "execution_count": 7, + "id": "fb1707e4", + "metadata": {}, + "outputs": [ + { + "name": "stderr", + "output_type": "stream", + "text": [ + "/Users/nvannest/Documents/GitHub/newsletter-scraping-and-summarization/venv/lib/python3.13/site-packages/pydantic/functional_validators.py:218: UserWarning: Pydantic serializer warnings:\n", + " PydanticSerializationUnexpectedValue(Expected `enum` - serialized value may not be as expected [input_value='mongodb', input_type=str])\n", + " function=lambda v, h: h(v),\n", + "/Users/nvannest/Documents/GitHub/newsletter-scraping-and-summarization/venv/lib/python3.13/site-packages/pydantic/main.py:463: UserWarning: Pydantic serializer warnings:\n", + " PydanticSerializationUnexpectedValue(Expected `enum` - serialized value may not be as expected [input_value='mongodb', input_type=str])\n", + " return self.__pydantic_serializer__.to_python(\n" + ] + }, + { + "name": "stdout", + "output_type": "stream", + "text": [ + "📊 Input variables to create_mongodb_destination_connector:\n", + " • Database: scraped_publications\n", + " • Collection: documents\n", + " • Batch Size: 20\n", + " • Flatten Metadata: False\n", + "\n" + ] + }, + { + "name": "stderr", + "output_type": "stream", + "text": [ + "INFO: HTTP Request: POST https://platform.unstructuredapp.io/api/v1/destinations/ \"HTTP/1.1 200 OK\"\n" + ] + }, + { + "name": "stdout", + "output_type": "stream", + "text": [ + "✅ Created MongoDB destination connector: 9156515b-1d7d-48ff-8884-2ccfd56a38b7\n", + "🗄️ Database: scraped_publications\n", + "📁 Collection: documents\n", + "🔍 MongoDB destination connector ready to store processed documents\n", + "🗄️ Database: scraped_publications\n", + "📁 Collection: documents\n" + ] + } + ], + "source": [ + "def create_mongodb_destination_connector():\n", + " \"\"\"Create a MongoDB destination connector for processed results.\"\"\"\n", + " try:\n", + " # Debug: Print all input variables\n", + " print(f\"📊 Input variables to create_mongodb_destination_connector:\")\n", + " print(f\" • Database: {MONGODB_DATABASE}\")\n", + " print(f\" • Collection: {MONGODB_COLLECTION}\")\n", + " print(f\" • Batch Size: 20\")\n", + " print(f\" • Flatten Metadata: False\")\n", + " print()\n", + " with UnstructuredClient(api_key_auth=UNSTRUCTURED_API_KEY) as client:\n", + " response = client.destinations.create_destination(\n", + " request=CreateDestinationRequest(\n", + " create_destination_connector=CreateDestinationConnector(\n", + " name=f\"mongodb_newsletter_pipeline_destination_{int(time.time())}\",\n", + " type=\"mongodb\",\n", + " config={\n", + " \"uri\": MONGODB_URI,\n", + " \"database\": MONGODB_DATABASE,\n", + " \"collection\": MONGODB_COLLECTION,\n", + " \"batch_size\": 20,\n", + " \"flatten_metadata\": False\n", + " }\n", + " )\n", + " )\n", + " )\n", + "\n", + " destination_id = response.destination_connector_information.id\n", + " print(f\"✅ Created MongoDB destination connector: {destination_id}\")\n", + " print(f\"🗄️ Database: {MONGODB_DATABASE}\")\n", + " print(f\"📁 Collection: {MONGODB_COLLECTION}\")\n", + " return destination_id\n", + " \n", + " except Exception as e:\n", + " print(f\"❌ Error creating MongoDB destination connector: {e}\")\n", + " return None\n", + "\n", + "def test_mongodb_destination_connector(destination_id):\n", + " \"\"\"Test the MongoDB destination connector.\"\"\"\n", + " if destination_id and destination_id != SKIPPED:\n", + " print(f\"🔍 MongoDB destination connector ready to store processed documents\")\n", + " 
print(f\"🗄️ Database: {MONGODB_DATABASE}\")\n", + " print(f\"📁 Collection: {MONGODB_COLLECTION}\")\n", + " else:\n", + " print(\"❌ Failed to create MongoDB destination connector - check your credentials and configuration\")\n", + "\n", + "# Create MongoDB destination connector\n", + "destination_id = create_mongodb_destination_connector()\n", + "\n", + "test_mongodb_destination_connector(destination_id) " + ] + }, + { + "cell_type": "markdown", + "id": "02a8ca1d", + "metadata": {}, + "source": [ + "## Document Processing Pipeline\n", + "\n", + "Configuring the two-stage pipeline: Hi-Res Partitioning → Page Chunking.\n", + "\n", + "The pipeline uses Unstructured's hi_res strategy for detailed document analysis with advanced table detection, then chunks content by page to preserve document structure for downstream summarization and newsletter generation.\n", + "\n", + "**Stage 1 - High-Resolution Partitioning:**\n", + "- **Strategy**: `hi_res` for detailed document processing\n", + "- **Table Detection**: `pdf_infer_table_structure=True` for accurate table extraction\n", + "- **Page Breaks**: `include_page_breaks=True` to maintain document structure\n", + "- **Text-Focused**: Excludes images, page numbers, and formatting elements\n", + "- **Output**: Individual elements (Title, NarrativeText, Table, etc.) with metadata\n", + "\n", + "**Stage 2 - Page-Based Chunking:**\n", + "- **Strategy**: `chunk_by_page` to maintain natural page boundaries\n", + "- **Original Elements**: `include_orig_elements=False` (not used in downstream workflows)\n", + "- **Max Characters**: `max_characters=6000` for manageable chunk sizes\n", + "- **Output**: Page-level chunks (up to 6k characters) ideal for summarization and newsletter generation" + ] + }, + { + "cell_type": "markdown", + "id": "28643708", + "metadata": {}, + "source": [ + "## Orchestrating Your Complete Document Processing Pipeline\n", + "\n", + "We'll now execute the pipeline in distinct steps, allowing you to monitor progress at each stage: preprocessing, workflow creation, execution, and results validation." + ] + }, + { + "cell_type": "markdown", + "id": "e6adbe2d", + "metadata": {}, + "source": [ + "### Step 1: MongoDB Preprocessing\n", + "\n", + "First, we validate the MongoDB connection and prepare the collection for processing.\n", + "\n", + "> **Note**: If you're running this in Google Colab, you'll need to whitelist your notebook's IP address in MongoDB Network Access. Run `!curl ifconfig.me` in a cell to get your IP address, then add it to the \"Network Access\" section of your MongoDB Atlas cluster settings." 
+ ] + }, + { + "cell_type": "code", + "execution_count": 8, + "id": "05a8f04b", + "metadata": { + "lines_to_next_cell": 2 + }, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "🚀 Starting Newsletter Document Processing Pipeline\n", + "\n", + "🔧 Step 1: MongoDB preprocessing\n", + "--------------------------------------------------\n", + "🔧 Running MongoDB preprocessing...\n", + "🔍 MongoDB Configuration:\n", + " • Database: scraped_publications\n", + " • Collection: documents\n", + "✅ MongoDB configuration validation completed successfully\n", + "🏗️ Initializing MongoDB collection...\n", + "✅ Connected to database 'scraped_publications'\n", + "✅ Collection 'documents' already exists\n", + "🗑️ Cleared 64 existing documents\n", + "✅ Collection 'documents' is ready for document processing\n", + "✅ MongoDB preprocessing completed successfully\n" + ] + } + ], + "source": [ + "# Step 1: MongoDB preprocessing\n", + "print(\"🚀 Starting Newsletter Document Processing Pipeline\")\n", + "print(\"\\n🔧 Step 1: MongoDB preprocessing\")\n", + "print(\"-\" * 50)\n", + "\n", + "preprocessing_success = run_mongodb_preprocessing()\n", + "\n", + "if preprocessing_success:\n", + " print(\"✅ MongoDB preprocessing completed successfully\")\n", + "else:\n", + " print(\"❌ Failed to complete MongoDB preprocessing\")" + ] + }, + { + "cell_type": "markdown", + "id": "6cc6e94b", + "metadata": {}, + "source": [ + "### Step 2: Create Processing Workflow\n", + "\n", + "Now we'll create the document processing workflow with high-resolution partitioning and page-based chunking." + ] + }, + { + "cell_type": "code", + "execution_count": 9, + "id": "ae53a349", + "metadata": { + "lines_to_next_cell": 2 + }, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "\n", + "⚙️ Step 2: Creating document processing workflow\n", + "--------------------------------------------------\n" + ] + }, + { + "name": "stderr", + "output_type": "stream", + "text": [ + "INFO: HTTP Request: POST https://platform.unstructuredapp.io/api/v1/workflows/ \"HTTP/1.1 200 OK\"\n" + ] + }, + { + "name": "stdout", + "output_type": "stream", + "text": [ + "✅ Created S3 document processing workflow: ce36eca3-a417-49d8-b685-b4562475a6ae\n" + ] + } + ], + "source": [ + "# Step 2: Create document processing workflow\n", + "print(\"\\n⚙️ Step 2: Creating document processing workflow\")\n", + "print(\"-\" * 50)\n", + "\n", + "if source_id and destination_id:\n", + " # Create workflow nodes inline\n", + " try:\n", + " # High-res partitioner for detailed document processing\n", + " partitioner_workflow_node = WorkflowNode(\n", + " name=\"Partitioner\",\n", + " subtype=\"unstructured_api\",\n", + " type=\"partition\",\n", + " settings={\n", + " \"strategy\": \"hi_res\",\n", + " \"include_page_breaks\": True,\n", + " \"pdf_infer_table_structure\": True,\n", + " \"exclude_elements\": [\n", + " \"Address\",\n", + " \"PageBreak\",\n", + " \"Formula\",\n", + " \"EmailAddress\",\n", + " \"PageNumber\",\n", + " \"Image\"\n", + " ]\n", + " }\n", + " )\n", + "\n", + " # Chunk by page - keeps page boundaries intact\n", + " chunker_node = WorkflowNode(\n", + " name=\"Chunker\",\n", + " subtype=\"chunk_by_page\",\n", + " type=\"chunk\",\n", + " settings={\n", + " \"include_orig_elements\": False,\n", + " \"max_characters\": 6000 # Maximum 6k characters per chunk\n", + " }\n", + " )\n", + "\n", + " # Create the workflow\n", + " with UnstructuredClient(api_key_auth=UNSTRUCTURED_API_KEY) as client:\n", + " s3_workflow = 
CreateWorkflow(\n", + " name=f\"S3-Document-Processing-Workflow_{int(time.time())}\",\n", + " source_id=source_id,\n", + " destination_id=destination_id,\n", + " workflow_type=WorkflowType.CUSTOM,\n", + " workflow_nodes=[\n", + " partitioner_workflow_node,\n", + " chunker_node\n", + " ]\n", + " )\n", + "\n", + " s3_response = client.workflows.create_workflow(\n", + " request=CreateWorkflowRequest(\n", + " create_workflow=s3_workflow\n", + " )\n", + " )\n", + "\n", + " workflow_id = s3_response.workflow_information.id\n", + " print(f\"✅ Created S3 document processing workflow: {workflow_id}\")\n", + "\n", + " except Exception as e:\n", + " print(f\"❌ Error creating document processing workflow: {e}\")\n", + " workflow_id = None\n", + "else:\n", + " print(\"⚠️ Skipping workflow creation - connectors not available\")\n", + " workflow_id = None" + ] + }, + { + "cell_type": "markdown", + "id": "9d9b2fac", + "metadata": {}, + "source": [ + "### Step 3: Execute Workflow\n", + "\n", + "Run the workflow to start processing your documents." + ] + }, + { + "cell_type": "code", + "execution_count": 10, + "id": "c39f30c1", + "metadata": { + "lines_to_next_cell": 2 + }, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "\n", + "🚀 Step 3: Running workflow\n", + "--------------------------------------------------\n" + ] + }, + { + "name": "stderr", + "output_type": "stream", + "text": [ + "INFO: HTTP Request: POST https://platform.unstructuredapp.io/api/v1/workflows/ce36eca3-a417-49d8-b685-b4562475a6ae/run \"HTTP/1.1 202 Accepted\"\n" + ] + }, + { + "name": "stdout", + "output_type": "stream", + "text": [ + "✅ Started S3 Document Processing job: 08b8e104-f2cd-42ef-9a93-7890560b489b\n" + ] + } + ], + "source": [ + "# Step 3: Run the workflow\n", + "print(\"\\n🚀 Step 3: Running workflow\")\n", + "print(\"-\" * 50)\n", + "\n", + "if workflow_id:\n", + " # Run the workflow inline\n", + " try:\n", + " with UnstructuredClient(api_key_auth=UNSTRUCTURED_API_KEY) as client:\n", + " response = client.workflows.run_workflow(\n", + " request={\"workflow_id\": workflow_id}\n", + " )\n", + " \n", + " job_id = response.job_information.id\n", + " print(f\"✅ Started S3 Document Processing job: {job_id}\")\n", + " \n", + " except Exception as e:\n", + " print(f\"❌ Error running S3 Document Processing workflow: {e}\")\n", + " job_id = None\n", + "else:\n", + " print(\"⚠️ Skipping workflow execution - workflow not created\")\n", + " job_id = None" + ] + }, + { + "cell_type": "markdown", + "id": "c1d01ef4", + "metadata": {}, + "source": [ + "---\n", + "\n", + "## 🤖 Orchestrator Agent: Autonomous Pipeline Management\n", + "\n", + "Now that you've seen how to run this process manually, let's wrap these pipeline steps in an agentic system that can orchestrate the entire workflow autonomously.\n", + "\n", + "**Orchestrator Agent** - Manages the complete pipeline from S3 → MongoDB:\n", + "- Checks S3 for documents\n", + "- Gets initial MongoDB count\n", + "- **Creates workflow** (connectors + processing nodes)\n", + "- Triggers the workflow\n", + "- Waits for completion\n", + "- Verifies MongoDB (with before/after comparison)\n", + "- Cleans up S3\n", + "\n", + "The agent uses self-contained tools that directly call the Unstructured API, demonstrating how to build fully autonomous document processing systems." 
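+ "\n",
+ "The waiting step, for example, reduces to polling the job status until it leaves the in-progress states. A minimal sketch of such a loop, assuming the jobs endpoint exposed by `unstructured-client` (`client.jobs.get_job`) and the status names reported by the platform:\n",
+ "\n",
+ "```python\n",
+ "import time\n",
+ "\n",
+ "def wait_for_job(job_id: str, poll_seconds: int = 30) -> str:\n",
+ "    \"\"\"Poll the Unstructured API until the job finishes; returns the final status string.\"\"\"\n",
+ "    with UnstructuredClient(api_key_auth=UNSTRUCTURED_API_KEY) as client:\n",
+ "        while True:\n",
+ "            job = client.jobs.get_job(request={\"job_id\": job_id})\n",
+ "            status = str(job.job_information.status).lower()\n",
+ "            if \"progress\" not in status and \"scheduled\" not in status:\n",
+ "                return status\n",
+ "            time.sleep(poll_seconds)\n",
+ "```\n"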
+ ] + }, + { + "cell_type": "markdown", + "id": "84d9094b", + "metadata": {}, + "source": [ + "### Orchestrator Agent Setup\n", + "\n", + "The Orchestrator Agent uses LangChain to autonomously manage the document processing pipeline." + ] + }, + { + "cell_type": "code", + "execution_count": 11, + "id": "2f07a1a4", + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "✅ Orchestrator Agent ready!\n", + "📋 Available tools: check_s3_documents, get_mongodb_count, create_workflow, trigger_workflow, wait_for_completion, verify_mongodb, clear_s3\n" + ] + } + ], + "source": [ + "\"\"\"\n", + "ORCHESTRATOR AGENT\n", + "Autonomous pipeline management with self-contained tools\n", + "\"\"\"\n", + "\n", + "from langchain.tools import Tool\n", + "from langchain.agents import AgentExecutor, create_openai_functions_agent\n", + "from langchain_openai import ChatOpenAI\n", + "from langchain.prompts import ChatPromptTemplate, MessagesPlaceholder\n", + "\n", + "# ============================================================\n", + "# Self-Contained Tool Functions\n", + "# ============================================================\n", + "\n", + "def check_s3_documents(bucket_name: str) -> dict:\n", + " \"\"\"List and count documents in S3 bucket.\"\"\"\n", + " try:\n", + " s3 = boto3.client(\n", + " 's3',\n", + " aws_access_key_id=AWS_ACCESS_KEY_ID,\n", + " aws_secret_access_key=AWS_SECRET_ACCESS_KEY,\n", + " region_name=AWS_REGION\n", + " )\n", + " \n", + " response = s3.list_objects_v2(Bucket=bucket_name)\n", + " \n", + " if 'Contents' not in response:\n", + " return {\n", + " \"status\": \"empty\",\n", + " \"total_files\": 0,\n", + " \"message\": f\"Bucket {bucket_name} is empty\"\n", + " }\n", + " \n", + " files = response['Contents']\n", + " total_files = len(files)\n", + " \n", + " # Count by type\n", + " pdf_count = sum(1 for f in files if f['Key'].endswith('.pdf'))\n", + " html_count = sum(1 for f in files if f['Key'].endswith('.html'))\n", + " \n", + " return {\n", + " \"status\": \"success\",\n", + " \"total_files\": total_files,\n", + " \"pdf_files\": pdf_count,\n", + " \"html_files\": html_count,\n", + " \"message\": f\"Found {total_files} files in S3 ({pdf_count} PDFs, {html_count} HTML)\"\n", + " }\n", + " except Exception as e:\n", + " return {\n", + " \"status\": \"error\",\n", + " \"error\": str(e),\n", + " \"message\": f\"Error checking S3: {str(e)}\"\n", + " }\n", + "\n", + "def get_mongodb_count_tool(_: str = \"\") -> dict:\n", + " \"\"\"Get current document count in MongoDB.\"\"\"\n", + " try:\n", + " from pymongo import MongoClient\n", + " \n", + " client = MongoClient(MONGODB_URI)\n", + " db = client[MONGODB_DATABASE]\n", + " collection = db[MONGODB_COLLECTION]\n", + " \n", + " doc_count = collection.count_documents({})\n", + " composite_count = collection.count_documents({\"type\": \"CompositeElement\"})\n", + " \n", + " return {\n", + " \"status\": \"success\",\n", + " \"total_documents\": doc_count,\n", + " \"composite_elements\": composite_count,\n", + " \"message\": f\"Current MongoDB count: {doc_count} total documents ({composite_count} CompositeElements)\"\n", + " }\n", + " except Exception as e:\n", + " return {\n", + " \"status\": \"error\",\n", + " \"error\": str(e),\n", + " \"message\": f\"Error counting MongoDB documents: {str(e)}\"\n", + " }\n", + "\n", + "def create_workflow_tool(bucket_name: str) -> dict:\n", + " \"\"\"Create complete workflow: connectors + workflow. 
Returns workflow_id.\"\"\"\n", + " try:\n", + " print(\"⚙️ Creating S3 source connector...\")\n", + " \n", + " # Create S3 source connector (EXACT COPY from manual code)\n", + " value = bucket_name.strip()\n", + " if value.startswith(\"s3://\"):\n", + " s3_style = value if value.endswith(\"/\") else value + \"/\"\n", + " elif value.startswith(\"http://\") or value.startswith(\"https://\"):\n", + " from urllib.parse import urlparse\n", + " parsed = urlparse(value)\n", + " host = parsed.netloc\n", + " path = parsed.path or \"/\"\n", + " bucket = host.split(\".s3.\")[0]\n", + " s3_style = f\"s3://{bucket}{path if path.endswith('/') else path + '/'}\"\n", + " else:\n", + " s3_style = f\"s3://{value if value.endswith('/') else value + '/'}\"\n", + " \n", + " with UnstructuredClient(api_key_auth=UNSTRUCTURED_API_KEY) as client:\n", + " response = client.sources.create_source(\n", + " request=CreateSourceRequest(\n", + " create_source_connector=CreateSourceConnector(\n", + " name=\"\",\n", + " type=\"s3\",\n", + " config={\n", + " \"remote_url\": s3_style,\n", + " \"recursive\": True, \n", + " \"key\": AWS_ACCESS_KEY_ID,\n", + " \"secret\": AWS_SECRET_ACCESS_KEY,\n", + " }\n", + " )\n", + " )\n", + " )\n", + " \n", + " s3_source_id = response.source_connector_information.id\n", + " print(f\"✅ S3 connector created: {s3_source_id}\")\n", + " \n", + " print(\"⚙️ Creating MongoDB destination connector...\")\n", + " \n", + " # Create MongoDB destination connector (EXACT COPY from manual code)\n", + " with UnstructuredClient(api_key_auth=UNSTRUCTURED_API_KEY) as client:\n", + " response = client.destinations.create_destination(\n", + " request=CreateDestinationRequest(\n", + " create_destination_connector=CreateDestinationConnector(\n", + " name=f\"mongodb_newsletter_pipeline_destination_{int(time.time())}\",\n", + " type=\"mongodb\",\n", + " config={\n", + " \"uri\": MONGODB_URI,\n", + " \"database\": MONGODB_DATABASE,\n", + " \"collection\": MONGODB_COLLECTION,\n", + " \"batch_size\": 20,\n", + " \"flatten_metadata\": False\n", + " }\n", + " )\n", + " )\n", + " )\n", + "\n", + " destination_id = response.destination_connector_information.id\n", + " print(f\"✅ MongoDB connector created: {destination_id}\")\n", + " \n", + " print(\"⚙️ Creating workflow with hi_res partitioning...\")\n", + " \n", + " # Create workflow with nodes (EXACT COPY from manual code)\n", + " partitioner_node = WorkflowNode(\n", + " name=\"Partitioner\",\n", + " subtype=\"unstructured_api\",\n", + " type=\"partition\",\n", + " settings={\n", + " \"strategy\": \"hi_res\",\n", + " \"include_page_breaks\": True,\n", + " \"pdf_infer_table_structure\": True,\n", + " \"exclude_elements\": [\n", + " \"Address\",\n", + " \"PageBreak\",\n", + " \"Formula\",\n", + " \"EmailAddress\",\n", + " \"PageNumber\",\n", + " \"Image\"\n", + " ]\n", + " }\n", + " )\n", + "\n", + " chunker_node = WorkflowNode(\n", + " name=\"Chunker\",\n", + " subtype=\"chunk_by_page\",\n", + " type=\"chunk\",\n", + " settings={\n", + " \"include_orig_elements\": False,\n", + " \"max_characters\": 6000\n", + " }\n", + " )\n", + "\n", + " with UnstructuredClient(api_key_auth=UNSTRUCTURED_API_KEY) as client:\n", + " s3_workflow = CreateWorkflow(\n", + " name=f\"S3-Document-Processing-Workflow_{int(time.time())}\",\n", + " source_id=s3_source_id,\n", + " destination_id=destination_id,\n", + " workflow_type=WorkflowType.CUSTOM,\n", + " workflow_nodes=[\n", + " partitioner_node,\n", + " chunker_node\n", + " ]\n", + " )\n", + "\n", + " s3_response = 
client.workflows.create_workflow(\n", + " request=CreateWorkflowRequest(\n", + " create_workflow=s3_workflow\n", + " )\n", + " )\n", + "\n", + " workflow_id = s3_response.workflow_information.id\n", + " print(f\"✅ Workflow created: {workflow_id}\")\n", + " \n", + " return {\n", + " \"status\": \"success\",\n", + " \"workflow_id\": workflow_id,\n", + " \"s3_source_id\": s3_source_id,\n", + " \"destination_id\": destination_id,\n", + " \"message\": f\"Workflow created successfully. ID: {workflow_id}\"\n", + " }\n", + " except Exception as e:\n", + " return {\n", + " \"status\": \"error\",\n", + " \"error\": str(e),\n", + " \"message\": f\"Error creating workflow: {str(e)}\"\n", + " }\n", + "\n", + "def trigger_workflow_tool(workflow_id: str) -> dict:\n", + " \"\"\"Trigger Unstructured API workflow (self-contained).\"\"\"\n", + " try:\n", + " # Direct Unstructured API call (not using external function)\n", + " with UnstructuredClient(api_key_auth=UNSTRUCTURED_API_KEY) as client:\n", + " response = client.workflows.run_workflow(\n", + " request={\"workflow_id\": workflow_id}\n", + " )\n", + " \n", + " job_id = response.job_information.id\n", + " \n", + " return {\n", + " \"status\": \"success\",\n", + " \"job_id\": job_id,\n", + " \"message\": f\"Workflow triggered successfully. Job ID: {job_id}\"\n", + " }\n", + " except Exception as e:\n", + " return {\n", + " \"status\": \"error\",\n", + " \"error\": str(e),\n", + " \"message\": f\"Error triggering workflow: {str(e)}\"\n", + " }\n", + "\n", + "def wait_for_completion_tool(job_id: str) -> dict:\n", + " \"\"\"Wait for workflow job to complete (self-contained polling).\"\"\"\n", + " try:\n", + " print(f\"⏳ Monitoring job status: {job_id}\")\n", + " \n", + " # Poll until completion (self-contained logic)\n", + " while True:\n", + " with UnstructuredClient(api_key_auth=UNSTRUCTURED_API_KEY) as client:\n", + " response = client.jobs.get_job(\n", + " request={\"job_id\": job_id}\n", + " )\n", + " \n", + " job_info = response.job_information\n", + " status = job_info.status\n", + " \n", + " if status in [\"SCHEDULED\", \"IN_PROGRESS\"]:\n", + " print(f\"⏳ Job status: {status}\")\n", + " time.sleep(30) # Wait 30 seconds\n", + " elif status == \"COMPLETED\":\n", + " print(f\"✅ Job completed successfully!\")\n", + " return {\n", + " \"status\": \"success\",\n", + " \"job_status\": \"COMPLETED\",\n", + " \"message\": \"Job completed successfully\"\n", + " }\n", + " elif status == \"FAILED\":\n", + " return {\n", + " \"status\": \"failed\",\n", + " \"job_status\": \"FAILED\",\n", + " \"message\": \"Job failed\"\n", + " }\n", + " else:\n", + " return {\n", + " \"status\": \"unknown\",\n", + " \"job_status\": str(status),\n", + " \"message\": f\"Job finished with unknown status: {status}\"\n", + " }\n", + " except Exception as e:\n", + " return {\n", + " \"status\": \"error\",\n", + " \"error\": str(e),\n", + " \"message\": f\"Error waiting for job: {str(e)}\"\n", + " }\n", + "\n", + "def verify_mongodb_tool(_: str = \"\") -> dict:\n", + " \"\"\"Verify processed documents in MongoDB.\"\"\"\n", + " try:\n", + " from pymongo import MongoClient\n", + " \n", + " client = MongoClient(MONGODB_URI)\n", + " db = client[MONGODB_DATABASE]\n", + " collection = db[MONGODB_COLLECTION]\n", + " \n", + " doc_count = collection.count_documents({})\n", + " composite_count = collection.count_documents({\"type\": \"CompositeElement\"})\n", + " \n", + " return {\n", + " \"status\": \"success\",\n", + " \"total_documents\": doc_count,\n", + " \"composite_elements\": 
composite_count,\n", + " \"message\": f\"MongoDB verified: {doc_count} total documents ({composite_count} CompositeElements)\"\n", + " }\n", + " except Exception as e:\n", + " return {\n", + " \"status\": \"error\",\n", + " \"error\": str(e),\n", + " \"message\": f\"Error verifying MongoDB: {str(e)}\"\n", + " }\n", + "\n", + "def clear_s3_bucket(bucket_name: str) -> dict:\n", + " \"\"\"Delete all objects from S3 bucket.\"\"\"\n", + " try:\n", + " s3 = boto3.client(\n", + " 's3',\n", + " aws_access_key_id=AWS_ACCESS_KEY_ID,\n", + " aws_secret_access_key=AWS_SECRET_ACCESS_KEY,\n", + " region_name=AWS_REGION\n", + " )\n", + " \n", + " # List all objects\n", + " response = s3.list_objects_v2(Bucket=bucket_name)\n", + " \n", + " if 'Contents' not in response:\n", + " return {\n", + " \"status\": \"success\",\n", + " \"files_deleted\": 0,\n", + " \"message\": f\"Bucket {bucket_name} was already empty\"\n", + " }\n", + " \n", + " # Delete all objects\n", + " objects_to_delete = [{'Key': obj['Key']} for obj in response['Contents']]\n", + " \n", + " if objects_to_delete:\n", + " s3.delete_objects(\n", + " Bucket=bucket_name,\n", + " Delete={'Objects': objects_to_delete}\n", + " )\n", + " \n", + " return {\n", + " \"status\": \"success\",\n", + " \"files_deleted\": len(objects_to_delete),\n", + " \"message\": f\"Deleted {len(objects_to_delete)} files from S3\"\n", + " }\n", + " except Exception as e:\n", + " return {\n", + " \"status\": \"error\",\n", + " \"error\": str(e),\n", + " \"message\": f\"Error clearing S3: {str(e)}\"\n", + " }\n", + "\n", + "# ============================================================\n", + "# Create LangChain Tools\n", + "# ============================================================\n", + "\n", + "orchestrator_tools = [\n", + " Tool(\n", + " name=\"check_s3_documents\",\n", + " func=check_s3_documents,\n", + " description=\"Check S3 bucket for documents. Input: bucket_name (string). Returns count of files by type (PDF/HTML).\"\n", + " ),\n", + " Tool(\n", + " name=\"get_mongodb_count\",\n", + " func=get_mongodb_count_tool,\n", + " description=\"Get current document count in MongoDB. No input needed. Returns document counts.\"\n", + " ),\n", + " Tool(\n", + " name=\"create_workflow\",\n", + " func=create_workflow_tool,\n", + " description=\"Create workflow with connectors. Input: bucket_name (string). Returns workflow_id to use for triggering.\"\n", + " ),\n", + " Tool(\n", + " name=\"trigger_workflow\",\n", + " func=trigger_workflow_tool,\n", + " description=\"Start the document processing workflow. Input: workflow_id (string). Returns job_id for monitoring.\"\n", + " ),\n", + " Tool(\n", + " name=\"wait_for_completion\",\n", + " func=wait_for_completion_tool,\n", + " description=\"Wait for workflow job to complete. Input: job_id (string). Polls every 30 seconds until done.\"\n", + " ),\n", + " Tool(\n", + " name=\"verify_mongodb\",\n", + " func=verify_mongodb_tool,\n", + " description=\"Verify processed documents are in MongoDB. No input needed. Returns document counts.\"\n", + " ),\n", + " Tool(\n", + " name=\"clear_s3\",\n", + " func=clear_s3_bucket,\n", + " description=\"Delete all files from S3 bucket after successful processing. 
Input: bucket_name (string).\"\n", + " ),\n", + "]\n", + "\n", + "# ============================================================\n", + "# Create Orchestrator Agent\n", + "# ============================================================\n", + "\n", + "orchestrator_prompt = ChatPromptTemplate.from_messages([\n", + " (\"system\", \"\"\"You are an autonomous pipeline orchestrator. You MUST EXECUTE the tools, not just describe them.\n", + "\n", + "EXECUTE these steps by CALLING the tools:\n", + "\n", + "1. CALL get_mongodb_count to get the initial count\n", + "2. CALL check_s3_documents with the bucket name to see what files exist\n", + "3. If files exist, CALL create_workflow with the bucket name to create the workflow\n", + "4. CALL trigger_workflow with the workflow_id from step 3\n", + "5. CALL wait_for_completion with the job_id from step 4\n", + "6. CALL get_mongodb_count again to get the final count\n", + "7. CALL verify_mongodb to double-check the data\n", + "8. CALL clear_s3 with the bucket name to clean up\n", + "\n", + "After each tool call, examine the result and proceed to the next step.\n", + "Report the before/after MongoDB counts at the end.\n", + "\n", + "DO NOT write pseudocode. DO NOT describe what you would do. ACTUALLY CALL THE TOOLS.\n", + "\n", + "S3 bucket: {s3_bucket}\n", + "\"\"\"),\n", + " (\"user\", \"{input}\"),\n", + " MessagesPlaceholder(variable_name=\"agent_scratchpad\")\n", + "])\n", + "\n", + "llm = ChatOpenAI(model=\"gpt-4\", temperature=0, openai_api_key=OPENAI_API_KEY)\n", + "\n", + "orchestrator_agent = create_openai_functions_agent(llm, orchestrator_tools, orchestrator_prompt)\n", + "orchestrator_executor = AgentExecutor(\n", + " agent=orchestrator_agent,\n", + " tools=orchestrator_tools,\n", + " verbose=True,\n", + " max_iterations=10,\n", + " handle_parsing_errors=True\n", + ")\n", + "\n", + "print(\"✅ Orchestrator Agent ready!\")\n", + "print(f\"📋 Available tools: {', '.join([t.name for t in orchestrator_tools])}\")" + ] + }, + { + "cell_type": "markdown", + "id": "7dff6ae7", + "metadata": {}, + "source": [ + "### Execute Orchestrator Agent\n", + "\n", + "Run the agent and watch it autonomously orchestrate the entire pipeline.\n", + "\n", + "> **Note**: If you're running this in Google Colab, you'll need to whitelist your notebook's IP address in MongoDB Network Access. Run `!curl ifconfig.me` in a cell to get your IP address, then add it to the \"Network Access\" section of your MongoDB Atlas cluster settings." 
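+ ] + }, + { + "cell_type": "markdown", + "id": "b7e41c9d", + "metadata": {}, + "source": [ + "Before launching the agent, you can optionally confirm that MongoDB is reachable from this environment. A minimal connectivity check, assuming `MONGODB_URI` was loaded from your `.env` file in the configuration section:\n", + "\n", + "```python\n", + "from pymongo import MongoClient\n", + "from pymongo.errors import PyMongoError\n", + "\n", + "try:\n", + "    # Fail fast if the cluster is unreachable or this IP is not whitelisted.\n", + "    client = MongoClient(MONGODB_URI, serverSelectionTimeoutMS=5000)\n", + "    client.admin.command(\"ping\")\n", + "    print(\"✅ MongoDB is reachable from this environment\")\n", + "except PyMongoError as e:\n", + "    print(f\"❌ MongoDB connection failed - check the URI and Network Access settings: {e}\")\n", + "```\n", + "\n", + "If the ping fails in Colab, the most common cause is the missing Network Access entry described above."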
+ ] + }, + { + "cell_type": "code", + "execution_count": 12, + "id": "df4c4636", + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "🤖 Starting Orchestrator Agent\n", + "============================================================\n", + "📋 Task: Process documents from S3 → MongoDB\n", + "📁 S3 Bucket: ai-papers-and-blogs-notebook\n", + "============================================================\n", + "\n", + "\n", + "\u001b[1m> Entering new AgentExecutor chain...\u001b[0m\n" + ] + }, + { + "name": "stderr", + "output_type": "stream", + "text": [ + "INFO: HTTP Request: POST https://api.openai.com/v1/chat/completions \"HTTP/1.1 200 OK\"\n" + ] + }, + { + "name": "stdout", + "output_type": "stream", + "text": [ + "\u001b[32;1m\u001b[1;3m\n", + "Invoking: `get_mongodb_count` with ``\n", + "\n", + "\n", + "\u001b[0m\u001b[33;1m\u001b[1;3m{'status': 'success', 'total_documents': 0, 'composite_elements': 0, 'message': 'Current MongoDB count: 0 total documents (0 CompositeElements)'}\u001b[0m" + ] + }, + { + "name": "stderr", + "output_type": "stream", + "text": [ + "INFO: HTTP Request: POST https://api.openai.com/v1/chat/completions \"HTTP/1.1 200 OK\"\n" + ] + }, + { + "name": "stdout", + "output_type": "stream", + "text": [ + "\u001b[32;1m\u001b[1;3m\n", + "Invoking: `check_s3_documents` with `ai-papers-and-blogs-notebook`\n", + "\n", + "\n", + "\u001b[0m\u001b[36;1m\u001b[1;3m{'status': 'success', 'total_files': 25, 'pdf_files': 5, 'html_files': 20, 'message': 'Found 25 files in S3 (5 PDFs, 20 HTML)'}\u001b[0m" + ] + }, + { + "name": "stderr", + "output_type": "stream", + "text": [ + "INFO: HTTP Request: POST https://api.openai.com/v1/chat/completions \"HTTP/1.1 200 OK\"\n" + ] + }, + { + "name": "stdout", + "output_type": "stream", + "text": [ + "\u001b[32;1m\u001b[1;3m\n", + "Invoking: `create_workflow` with `ai-papers-and-blogs-notebook`\n", + "\n", + "\n", + "\u001b[0m⚙️ Creating S3 source connector...\n" + ] + }, + { + "name": "stderr", + "output_type": "stream", + "text": [ + "/Users/nvannest/Documents/GitHub/newsletter-scraping-and-summarization/venv/lib/python3.13/site-packages/pydantic/functional_validators.py:218: UserWarning: Pydantic serializer warnings:\n", + " PydanticSerializationUnexpectedValue(Expected `enum` - serialized value may not be as expected [input_value='s3', input_type=str])\n", + " function=lambda v, h: h(v),\n", + "/Users/nvannest/Documents/GitHub/newsletter-scraping-and-summarization/venv/lib/python3.13/site-packages/pydantic/main.py:463: UserWarning: Pydantic serializer warnings:\n", + " PydanticSerializationUnexpectedValue(Expected `enum` - serialized value may not be as expected [input_value='s3', input_type=str])\n", + " return self.__pydantic_serializer__.to_python(\n", + "INFO: HTTP Request: POST https://platform.unstructuredapp.io/api/v1/sources/ \"HTTP/1.1 200 OK\"\n", + "/Users/nvannest/Documents/GitHub/newsletter-scraping-and-summarization/venv/lib/python3.13/site-packages/pydantic/functional_validators.py:218: UserWarning: Pydantic serializer warnings:\n", + " PydanticSerializationUnexpectedValue(Expected `enum` - serialized value may not be as expected [input_value='mongodb', input_type=str])\n", + " function=lambda v, h: h(v),\n", + "/Users/nvannest/Documents/GitHub/newsletter-scraping-and-summarization/venv/lib/python3.13/site-packages/pydantic/main.py:463: UserWarning: Pydantic serializer warnings:\n", + " PydanticSerializationUnexpectedValue(Expected `enum` - serialized value may not be as expected 
[input_value='mongodb', input_type=str])\n", + " return self.__pydantic_serializer__.to_python(\n" + ] + }, + { + "name": "stdout", + "output_type": "stream", + "text": [ + "✅ S3 connector created: d17e44c1-ff08-4465-8bad-f437e47a3805\n", + "⚙️ Creating MongoDB destination connector...\n" + ] + }, + { + "name": "stderr", + "output_type": "stream", + "text": [ + "INFO: HTTP Request: POST https://platform.unstructuredapp.io/api/v1/destinations/ \"HTTP/1.1 200 OK\"\n" + ] + }, + { + "name": "stdout", + "output_type": "stream", + "text": [ + "✅ MongoDB connector created: 9bebacec-1a4c-4ed5-ada1-a228e648eeaa\n", + "⚙️ Creating workflow with hi_res partitioning...\n" + ] + }, + { + "name": "stderr", + "output_type": "stream", + "text": [ + "INFO: HTTP Request: POST https://platform.unstructuredapp.io/api/v1/workflows/ \"HTTP/1.1 200 OK\"\n" + ] + }, + { + "name": "stdout", + "output_type": "stream", + "text": [ + "✅ Workflow created: 3df7bd1b-00c1-4016-bf8d-ded25eedccc4\n", + "\u001b[38;5;200m\u001b[1;3m{'status': 'success', 'workflow_id': '3df7bd1b-00c1-4016-bf8d-ded25eedccc4', 's3_source_id': 'd17e44c1-ff08-4465-8bad-f437e47a3805', 'destination_id': '9bebacec-1a4c-4ed5-ada1-a228e648eeaa', 'message': 'Workflow created successfully. ID: 3df7bd1b-00c1-4016-bf8d-ded25eedccc4'}\u001b[0m" + ] + }, + { + "name": "stderr", + "output_type": "stream", + "text": [ + "INFO: HTTP Request: POST https://api.openai.com/v1/chat/completions \"HTTP/1.1 200 OK\"\n" + ] + }, + { + "name": "stdout", + "output_type": "stream", + "text": [ + "\u001b[32;1m\u001b[1;3m\n", + "Invoking: `trigger_workflow` with `3df7bd1b-00c1-4016-bf8d-ded25eedccc4`\n", + "\n", + "\n", + "\u001b[0m" + ] + }, + { + "name": "stderr", + "output_type": "stream", + "text": [ + "INFO: HTTP Request: POST https://platform.unstructuredapp.io/api/v1/workflows/3df7bd1b-00c1-4016-bf8d-ded25eedccc4/run \"HTTP/1.1 202 Accepted\"\n" + ] + }, + { + "name": "stdout", + "output_type": "stream", + "text": [ + "\u001b[36;1m\u001b[1;3m{'status': 'success', 'job_id': '5321b116-5117-47f8-b8de-4b5b1c5ab3db', 'message': 'Workflow triggered successfully. 
Job ID: 5321b116-5117-47f8-b8de-4b5b1c5ab3db'}\u001b[0m" + ] + }, + { + "name": "stderr", + "output_type": "stream", + "text": [ + "INFO: HTTP Request: POST https://api.openai.com/v1/chat/completions \"HTTP/1.1 200 OK\"\n", + "INFO: HTTP Request: GET https://platform.unstructuredapp.io/api/v1/jobs/5321b116-5117-47f8-b8de-4b5b1c5ab3db \"HTTP/1.1 200 OK\"\n" + ] + }, + { + "name": "stdout", + "output_type": "stream", + "text": [ + "\u001b[32;1m\u001b[1;3m\n", + "Invoking: `wait_for_completion` with `5321b116-5117-47f8-b8de-4b5b1c5ab3db`\n", + "\n", + "\n", + "\u001b[0m⏳ Monitoring job status: 5321b116-5117-47f8-b8de-4b5b1c5ab3db\n", + "⏳ Job status: JobStatus.SCHEDULED\n" + ] + }, + { + "name": "stderr", + "output_type": "stream", + "text": [ + "INFO: HTTP Request: GET https://platform.unstructuredapp.io/api/v1/jobs/5321b116-5117-47f8-b8de-4b5b1c5ab3db \"HTTP/1.1 200 OK\"\n" + ] + }, + { + "name": "stdout", + "output_type": "stream", + "text": [ + "⏳ Job status: JobStatus.IN_PROGRESS\n" + ] + }, + { + "name": "stderr", + "output_type": "stream", + "text": [ + "INFO: HTTP Request: GET https://platform.unstructuredapp.io/api/v1/jobs/5321b116-5117-47f8-b8de-4b5b1c5ab3db \"HTTP/1.1 200 OK\"\n" + ] + }, + { + "name": "stdout", + "output_type": "stream", + "text": [ + "⏳ Job status: JobStatus.IN_PROGRESS\n" + ] + }, + { + "name": "stderr", + "output_type": "stream", + "text": [ + "INFO: HTTP Request: GET https://platform.unstructuredapp.io/api/v1/jobs/5321b116-5117-47f8-b8de-4b5b1c5ab3db \"HTTP/1.1 200 OK\"\n" + ] + }, + { + "name": "stdout", + "output_type": "stream", + "text": [ + "⏳ Job status: JobStatus.IN_PROGRESS\n" + ] + }, + { + "name": "stderr", + "output_type": "stream", + "text": [ + "INFO: HTTP Request: GET https://platform.unstructuredapp.io/api/v1/jobs/5321b116-5117-47f8-b8de-4b5b1c5ab3db \"HTTP/1.1 200 OK\"\n" + ] + }, + { + "name": "stdout", + "output_type": "stream", + "text": [ + "⏳ Job status: JobStatus.IN_PROGRESS\n" + ] + }, + { + "name": "stderr", + "output_type": "stream", + "text": [ + "INFO: HTTP Request: GET https://platform.unstructuredapp.io/api/v1/jobs/5321b116-5117-47f8-b8de-4b5b1c5ab3db \"HTTP/1.1 200 OK\"\n" + ] + }, + { + "name": "stdout", + "output_type": "stream", + "text": [ + "✅ Job completed successfully!\n", + "\u001b[33;1m\u001b[1;3m{'status': 'success', 'job_status': 'COMPLETED', 'message': 'Job completed successfully'}\u001b[0m" + ] + }, + { + "name": "stderr", + "output_type": "stream", + "text": [ + "INFO: HTTP Request: POST https://api.openai.com/v1/chat/completions \"HTTP/1.1 200 OK\"\n" + ] + }, + { + "name": "stdout", + "output_type": "stream", + "text": [ + "\u001b[32;1m\u001b[1;3m\n", + "Invoking: `get_mongodb_count` with ``\n", + "\n", + "\n", + "\u001b[0m\u001b[33;1m\u001b[1;3m{'status': 'success', 'total_documents': 503, 'composite_elements': 503, 'message': 'Current MongoDB count: 503 total documents (503 CompositeElements)'}\u001b[0m" + ] + }, + { + "name": "stderr", + "output_type": "stream", + "text": [ + "INFO: HTTP Request: POST https://api.openai.com/v1/chat/completions \"HTTP/1.1 200 OK\"\n" + ] + }, + { + "name": "stdout", + "output_type": "stream", + "text": [ + "\u001b[32;1m\u001b[1;3m\n", + "Invoking: `verify_mongodb` with ``\n", + "\n", + "\n", + "\u001b[0m\u001b[38;5;200m\u001b[1;3m{'status': 'success', 'total_documents': 503, 'composite_elements': 503, 'message': 'MongoDB verified: 503 total documents (503 CompositeElements)'}\u001b[0m" + ] + }, + { + "name": "stderr", + "output_type": "stream", + "text": [ + "INFO: HTTP Request: 
POST https://api.openai.com/v1/chat/completions \"HTTP/1.1 200 OK\"\n" + ] + }, + { + "name": "stdout", + "output_type": "stream", + "text": [ + "\u001b[32;1m\u001b[1;3m\n", + "Invoking: `clear_s3` with `ai-papers-and-blogs-notebook`\n", + "\n", + "\n", + "\u001b[0m\u001b[36;1m\u001b[1;3m{'status': 'success', 'files_deleted': 25, 'message': 'Deleted 25 files from S3'}\u001b[0m" + ] + }, + { + "name": "stderr", + "output_type": "stream", + "text": [ + "INFO: HTTP Request: POST https://api.openai.com/v1/chat/completions \"HTTP/1.1 200 OK\"\n" + ] + }, + { + "name": "stdout", + "output_type": "stream", + "text": [ + "\u001b[32;1m\u001b[1;3mThe process has been completed successfully. Here is the summary:\n", + "\n", + "1. Initial MongoDB document count was 0.\n", + "2. Found 25 files in S3 bucket 'ai-papers-and-blogs-notebook' (5 PDFs, 20 HTML).\n", + "3. Created a workflow with ID '3df7bd1b-00c1-4016-bf8d-ded25eedccc4'.\n", + "4. Triggered the workflow successfully. Job ID was '5321b116-5117-47f8-b8de-4b5b1c5ab3db'.\n", + "5. The job completed successfully.\n", + "6. Final MongoDB document count is 503.\n", + "7. Verified MongoDB: 503 total documents.\n", + "8. Deleted 25 files from S3.\n", + "\n", + "The MongoDB document count increased by 503, which matches the number of files processed from the S3 bucket.\u001b[0m\n", + "\n", + "\u001b[1m> Finished chain.\u001b[0m\n", + "\n", + "============================================================\n", + "✅ ORCHESTRATOR COMPLETE\n", + "============================================================\n", + "\n", + "The process has been completed successfully. Here is the summary:\n", + "\n", + "1. Initial MongoDB document count was 0.\n", + "2. Found 25 files in S3 bucket 'ai-papers-and-blogs-notebook' (5 PDFs, 20 HTML).\n", + "3. Created a workflow with ID '3df7bd1b-00c1-4016-bf8d-ded25eedccc4'.\n", + "4. Triggered the workflow successfully. Job ID was '5321b116-5117-47f8-b8de-4b5b1c5ab3db'.\n", + "5. The job completed successfully.\n", + "6. Final MongoDB document count is 503.\n", + "7. Verified MongoDB: 503 total documents.\n", + "8. Deleted 25 files from S3.\n", + "\n", + "The MongoDB document count increased by 503, which matches the number of files processed from the S3 bucket.\n" + ] + } + ], + "source": [ + "print(\"🤖 Starting Orchestrator Agent\")\n", + "print(\"=\" * 60)\n", + "print(f\"📋 Task: Process documents from S3 → MongoDB\")\n", + "print(f\"📁 S3 Bucket: {S3_SOURCE_BUCKET}\")\n", + "print(\"=\" * 60)\n", + "\n", + "orchestrator_response = orchestrator_executor.invoke({\n", + " \"input\": f\"\"\"Process documents from S3 bucket '{S3_SOURCE_BUCKET}' to MongoDB.\n", + "\n", + "Steps:\n", + "1. Get the INITIAL MongoDB document count\n", + "2. Check S3 for documents\n", + "3. If documents exist, CREATE the workflow (connectors + nodes)\n", + "4. Trigger the workflow you just created\n", + "5. Wait for completion\n", + "6. Get the FINAL MongoDB document count\n", + "7. Compare before/after counts and report the difference\n", + "8. 
Clean up S3 when verified\n", + "\n", + "Report status at each step with clear before/after comparison.\"\"\",\n", + " \"s3_bucket\": S3_SOURCE_BUCKET\n", + "})\n", + "\n", + "print(\"\\n\" + \"=\" * 60)\n", + "print(\"✅ ORCHESTRATOR COMPLETE\")\n", + "print(\"=\" * 60)\n", + "print(f\"\\n{orchestrator_response['output']}\")" + ] + }, + { + "cell_type": "markdown", + "id": "f7276dfa", + "metadata": {}, + "source": [ + "## Generating AI Newsletters from Processed Documents\n", + "\n", + "Now that your documents are processed and stored in MongoDB, you can generate AI-powered newsletters using the autonomous Summarizer Agent below!\n", + "\n", + "The agent will:\n", + "- Retrieve documents from MongoDB\n", + "- Generate detailed summaries for each document\n", + "- Create an executive brief highlighting the most important developments\n", + "- Handle context window limitations automatically\n", + "\n", + "You can customize the summary and executive brief prompts in the agent execution cell to control the style, length, and focus of the generated content." + ] + }, + { + "cell_type": "markdown", + "id": "e38a734b", + "metadata": {}, + "source": [ + "---\n", + "\n", + "## 🤖 Summarizer Agent: Autonomous Newsletter Generation\n", + "\n", + "Now that documents are processed and stored in MongoDB, let's use an AI agent to autonomously generate the newsletter content.\n", + "\n", + "**Summarizer Agent** - Generates newsletter from MongoDB:\n", + "- Retrieves documents from MongoDB\n", + "- Handles context window limitations\n", + "- Generates individual summaries\n", + "- Synthesizes executive brief\n", + "\n", + "Like the Orchestrator Agent, this agent uses self-contained tools that demonstrate how to build autonomous content generation systems." + ] + }, + { + "cell_type": "markdown", + "id": "7ceb9902", + "metadata": {}, + "source": [ + "## Summarizer Agent Setup\n", + "\n", + "The Summarizer Agent uses LangChain to autonomously generate newsletter content from processed documents." 
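+ ] + }, + { + "cell_type": "markdown", + "id": "c2d5e8a1", + "metadata": {}, + "source": [ + "The `count_tokens` tool defined in the next cell estimates tokens with a rough four-characters-per-token heuristic, which is usually close enough for batching decisions. If you want exact counts, one option is to swap in `tiktoken` (install it first if it isn't already available in your environment). A minimal sketch:\n", + "\n", + "```python\n", + "import tiktoken\n", + "\n", + "def count_tokens_exact(text: str, encoding_name: str = \"o200k_base\") -> int:\n", + "    \"\"\"Exact token count using tiktoken's o200k_base encoding (used by gpt-4o).\"\"\"\n", + "    encoding = tiktoken.get_encoding(encoding_name)\n", + "    return len(encoding.encode(text))\n", + "\n", + "sample = \"Unstructured turns documents into LLM-ready data.\"\n", + "print(count_tokens_exact(sample), \"tokens vs heuristic estimate:\", len(sample) // 4)\n", + "```\n", + "\n", + "Either approach works here; the agent only needs a rough signal for deciding when to batch or collapse summaries."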
+ ] + }, + { + "cell_type": "code", + "execution_count": 13, + "id": "2806a2ff", + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "✅ Summarizer Agent ready!\n", + "📋 Available tools: retrieve_documents, get_document_text, count_tokens, batch_documents, generate_summary, collapse_summaries, generate_brief\n" + ] + } + ], + "source": [ + "\"\"\"\n", + "SUMMARIZER AGENT\n", + "Autonomous newsletter generation from MongoDB\n", + "\"\"\"\n", + "\n", + "# ============================================================\n", + "# Tool Functions\n", + "# ============================================================\n", + "\n", + "def retrieve_documents_from_mongodb(_: str = \"\") -> dict:\n", + " \"\"\"Retrieve list of unique filenames from MongoDB (NOT the full content).\"\"\"\n", + " try:\n", + " from pymongo import MongoClient\n", + " from collections import defaultdict\n", + " \n", + " client = MongoClient(MONGODB_URI)\n", + " db = client[MONGODB_DATABASE]\n", + " collection = db[MONGODB_COLLECTION]\n", + " \n", + " # Query for CompositeElement documents\n", + " query = {\"type\": \"CompositeElement\"}\n", + " documents = list(collection.find(query))\n", + " \n", + " # Group by filename to get unique files\n", + " grouped = defaultdict(list)\n", + " for doc in documents:\n", + " metadata = doc.get(\"metadata\", {})\n", + " filename = metadata.get(\"filename\", \"unknown\")\n", + " grouped[filename].append(doc)\n", + " \n", + " # Return just the filenames list (NOT the full content)\n", + " filenames = list(grouped.keys())\n", + " \n", + " return {\n", + " \"status\": \"success\",\n", + " \"total_documents\": len(documents),\n", + " \"unique_files\": len(filenames),\n", + " \"filenames\": filenames, # Just the list of files\n", + " \"message\": f\"Found {len(filenames)} unique files to process (use get_document_text to retrieve content)\"\n", + " }\n", + " except Exception as e:\n", + " return {\n", + " \"status\": \"error\",\n", + " \"error\": str(e),\n", + " \"message\": f\"Error retrieving documents: {str(e)}\"\n", + " }\n", + "\n", + "def get_document_text(filename: str) -> dict:\n", + " \"\"\"Get full text for a specific document (grouped by page, sorted, concatenated).\"\"\"\n", + " try:\n", + " from pymongo import MongoClient\n", + " \n", + " client = MongoClient(MONGODB_URI)\n", + " db = client[MONGODB_DATABASE]\n", + " collection = db[MONGODB_COLLECTION]\n", + " \n", + " # Query for this specific filename\n", + " query = {\n", + " \"type\": \"CompositeElement\",\n", + " \"metadata.filename\": filename\n", + " }\n", + " documents = list(collection.find(query))\n", + " \n", + " if not documents:\n", + " return {\n", + " \"status\": \"error\",\n", + " \"message\": f\"No documents found for filename: {filename}\"\n", + " }\n", + " \n", + " # Sort by page number (same as manual code)\n", + " sorted_docs = sorted(documents, key=lambda d: d.get(\"metadata\", {}).get(\"page_number\", 0))\n", + " \n", + " # Concatenate text (same as manual code)\n", + " full_text = \"\\n\\n\".join([d.get(\"text\", \"\") for d in sorted_docs if d.get(\"text\")])\n", + " \n", + " # Truncate if too long (same as manual code)\n", + " max_chars = 100000\n", + " if len(full_text) > max_chars:\n", + " full_text = full_text[:max_chars]\n", + " \n", + " return {\n", + " \"status\": \"success\",\n", + " \"filename\": filename,\n", + " \"pages\": len(documents),\n", + " \"text\": full_text,\n", + " \"text_length\": len(full_text),\n", + " \"message\": f\"Retrieved 
{len(documents)} pages for {filename}\"\n", + " }\n", + " except Exception as e:\n", + " return {\n", + " \"status\": \"error\",\n", + " \"error\": str(e),\n", + " \"message\": f\"Error retrieving document text: {str(e)}\"\n", + " }\n", + "\n", + "def count_tokens(text: str) -> dict:\n", + " \"\"\"Estimate token count and check if it fits in context window.\"\"\"\n", + " # Simple estimation: ~4 characters per token\n", + " estimated_tokens = len(text) // 4\n", + " max_tokens = 120000 # GPT-4 context window\n", + " \n", + " fits = estimated_tokens < max_tokens\n", + " \n", + " return {\n", + " \"status\": \"success\",\n", + " \"estimated_tokens\": estimated_tokens,\n", + " \"max_tokens\": max_tokens,\n", + " \"fits_in_window\": fits,\n", + " \"message\": f\"Estimated {estimated_tokens:,} tokens. {'Fits' if fits else 'Does not fit'} in context window.\"\n", + " }\n", + "\n", + "def batch_documents(documents_json: str, max_tokens: int = 100000) -> dict:\n", + " \"\"\"Split documents into batches that fit in context window.\"\"\"\n", + " try:\n", + " import json\n", + " documents = json.loads(documents_json)\n", + " \n", + " batches = []\n", + " current_batch = []\n", + " current_tokens = 0\n", + " \n", + " for filename, docs in documents.items():\n", + " # Estimate tokens for this file\n", + " text = \"\\n\\n\".join([d.get(\"text\", \"\") for d in docs if d.get(\"text\")])\n", + " file_tokens = len(text) // 4\n", + " \n", + " if current_tokens + file_tokens > max_tokens and current_batch:\n", + " # Start new batch\n", + " batches.append(current_batch)\n", + " current_batch = [filename]\n", + " current_tokens = file_tokens\n", + " else:\n", + " current_batch.append(filename)\n", + " current_tokens += file_tokens\n", + " \n", + " if current_batch:\n", + " batches.append(current_batch)\n", + " \n", + " return {\n", + " \"status\": \"success\",\n", + " \"num_batches\": len(batches),\n", + " \"batches\": batches,\n", + " \"message\": f\"Split into {len(batches)} batches\"\n", + " }\n", + " except Exception as e:\n", + " return {\n", + " \"status\": \"error\",\n", + " \"error\": str(e),\n", + " \"message\": f\"Error batching documents: {str(e)}\"\n", + " }\n", + "\n", + "def generate_document_summary(text: str, instructions: str = None) -> dict:\n", + " \"\"\"Generate summary for document text. 
Automatically handles large documents via chunking.\"\"\"\n", + " try:\n", + " from langchain_openai import ChatOpenAI\n", + " \n", + " if not instructions:\n", + " instructions = \"\"\"Summarize this AI/ML content focusing on:\n", + " - Novel advancements or breakthroughs\n", + " - Performance improvements or benchmark results\n", + " - Practical applications and industry impact\n", + " \n", + " Keep summary focused and concise (max 12 sentences).\"\"\"\n", + " \n", + " # Check if document is too large (~20k tokens = ~80k chars)\n", + " estimated_tokens = len(text) // 4\n", + " MAX_SINGLE_CALL_TOKENS = 20000 # Conservative limit to avoid timeouts\n", + " \n", + " if estimated_tokens > MAX_SINGLE_CALL_TOKENS:\n", + " # Use chunked summarization for large documents\n", + " print(f\" 📊 Document too large ({estimated_tokens:,} tokens), using chunked summarization...\")\n", + " return generate_chunked_summary(text, instructions)\n", + " \n", + " # Normal single-pass summarization\n", + " llm = ChatOpenAI(model=\"gpt-4o\", temperature=0.3, openai_api_key=OPENAI_API_KEY)\n", + " \n", + " prompt = f\"\"\"{instructions}\n", + "\n", + "Content:\n", + "{text}\n", + "\n", + "Summary:\"\"\"\n", + " \n", + " response = llm.invoke(prompt)\n", + " summary = response.content.strip()\n", + " \n", + " return {\n", + " \"status\": \"success\",\n", + " \"summary\": summary,\n", + " \"length\": len(summary),\n", + " \"message\": f\"Generated summary ({len(summary)} characters)\"\n", + " }\n", + " except Exception as e:\n", + " return {\n", + " \"status\": \"error\",\n", + " \"error\": str(e),\n", + " \"message\": f\"Error generating summary: {str(e)}\"\n", + " }\n", + "\n", + "def generate_chunked_summary(text: str, instructions: str = None) -> dict:\n", + " \"\"\"Split large document into chunks, summarize each, then create final summary.\"\"\"\n", + " try:\n", + " from langchain_openai import ChatOpenAI\n", + " import math\n", + " \n", + " if not instructions:\n", + " instructions = \"\"\"Summarize this AI/ML content focusing on:\n", + " - Novel advancements or breakthroughs\n", + " - Performance improvements or benchmark results\n", + " - Practical applications and industry impact\n", + " \n", + " Keep summary focused and concise (max 12 sentences).\"\"\"\n", + " \n", + " llm = ChatOpenAI(model=\"gpt-4o\", temperature=0.3, openai_api_key=OPENAI_API_KEY)\n", + " \n", + " # Split into chunks (~40k chars each = ~10k tokens)\n", + " CHUNK_SIZE = 40000\n", + " chunks = []\n", + " for i in range(0, len(text), CHUNK_SIZE):\n", + " chunks.append(text[i:i+CHUNK_SIZE])\n", + " \n", + " print(f\" 📝 Splitting into {len(chunks)} chunks for sequential processing...\")\n", + " \n", + " # Summarize each chunk\n", + " chunk_summaries = []\n", + " for idx, chunk in enumerate(chunks, 1):\n", + " print(f\" 🔄 Processing chunk {idx}/{len(chunks)}...\")\n", + " \n", + " chunk_prompt = f\"\"\"This is part {idx} of {len(chunks)} of a larger document.\n", + " \n", + "{instructions}\n", + "\n", + "Content (Part {idx}/{len(chunks)}):\n", + "{chunk}\n", + "\n", + "Summary of this section:\"\"\"\n", + " \n", + " try:\n", + " response = llm.invoke(chunk_prompt)\n", + " chunk_summary = response.content.strip()\n", + " chunk_summaries.append(chunk_summary)\n", + " print(f\" ✅ Chunk {idx} summarized ({len(chunk_summary)} chars)\")\n", + " except Exception as e:\n", + " print(f\" ⚠️ Error summarizing chunk {idx}: {str(e)[:100]}\")\n", + " continue\n", + " \n", + " if not chunk_summaries:\n", + " return {\n", + " \"status\": \"error\",\n", + 
" \"message\": \"Failed to summarize any chunks\"\n", + " }\n", + " \n", + " # Combine chunk summaries into final summary\n", + " print(f\" 🔗 Combining {len(chunk_summaries)} chunk summaries...\")\n", + " combined_text = \"\\n\\n\".join([f\"Section {i+1}:\\n{summary}\" for i, summary in enumerate(chunk_summaries)])\n", + " \n", + " final_prompt = f\"\"\"{instructions}\n", + "\n", + "The following are summaries of different sections of a single document. \n", + "Please create one coherent final summary that integrates all sections:\n", + "\n", + "{combined_text}\n", + "\n", + "Final integrated summary:\"\"\"\n", + " \n", + " response = llm.invoke(final_prompt)\n", + " final_summary = response.content.strip()\n", + " \n", + " return {\n", + " \"status\": \"success\",\n", + " \"summary\": final_summary,\n", + " \"length\": len(final_summary),\n", + " \"chunks_processed\": len(chunks),\n", + " \"message\": f\"Generated chunked summary from {len(chunks)} parts ({len(final_summary)} characters)\"\n", + " }\n", + " except Exception as e:\n", + " return {\n", + " \"status\": \"error\",\n", + " \"error\": str(e),\n", + " \"message\": f\"Error in chunked summarization: {str(e)}\"\n", + " }\n", + "\n", + "def collapse_summaries_tool(summaries_json: str, max_tokens: int = 15000) -> dict:\n", + " \"\"\"Collapse multiple summaries into fewer summaries to fit context window.\n", + " \n", + " Based on LangChain map-reduce pattern. Use this when you have many summaries\n", + " that might exceed context limits. More aggressive threshold to prevent overflow.\n", + " \"\"\"\n", + " try:\n", + " import json\n", + " from langchain_openai import ChatOpenAI\n", + " \n", + " summaries = json.loads(summaries_json)\n", + " \n", + " if not isinstance(summaries, list):\n", + " return {\n", + " \"status\": \"error\",\n", + " \"message\": \"summaries_json must be a JSON array of summary objects\"\n", + " }\n", + " \n", + " # Estimate tokens (rough: ~4 chars per token)\n", + " total_text = \" \".join([s.get(\"summary\", \"\") for s in summaries])\n", + " estimated_tokens = len(total_text) // 4\n", + " \n", + " if estimated_tokens < max_tokens:\n", + " return {\n", + " \"status\": \"success\",\n", + " \"collapsed_summaries\": summaries,\n", + " \"message\": f\"Summaries already fit in context ({estimated_tokens:,} tokens). 
No collapse needed.\"\n", + " }\n", + " \n", + " # Batch summaries into groups\n", + " batch_size = max(2, len(summaries) // 3) # Collapse 3:1 ratio\n", + " batches = [summaries[i:i+batch_size] for i in range(0, len(summaries), batch_size)]\n", + " \n", + " llm = ChatOpenAI(model=\"gpt-4o\", temperature=0.3, openai_api_key=OPENAI_API_KEY)\n", + " \n", + " collapsed = []\n", + " for i, batch in enumerate(batches):\n", + " batch_text = \"\\n\\n\".join([f\"**{s.get('filename', f'Doc {j}')}**: {s.get('summary', '')}\" \n", + " for j, s in enumerate(batch)])\n", + " \n", + " prompt = f\"\"\"Consolidate these summaries into a single summary that preserves key points:\n", + "\n", + "{batch_text}\n", + "\n", + "Consolidated summary:\"\"\"\n", + " \n", + " response = llm.invoke(prompt)\n", + " collapsed.append({\n", + " \"filename\": f\"collapsed_batch_{i+1}\",\n", + " \"summary\": response.content.strip()\n", + " })\n", + " \n", + " return {\n", + " \"status\": \"success\",\n", + " \"collapsed_summaries\": collapsed,\n", + " \"original_count\": len(summaries),\n", + " \"collapsed_count\": len(collapsed),\n", + " \"message\": f\"Collapsed {len(summaries)} summaries into {len(collapsed)} batches\"\n", + " }\n", + " \n", + " except Exception as e:\n", + " return {\n", + " \"status\": \"error\",\n", + " \"error\": str(e),\n", + " \"message\": f\"Error collapsing summaries: {str(e)}\"\n", + " }\n", + "\n", + "def generate_executive_brief(summaries_json: str, instructions: str = None) -> dict:\n", + " \"\"\"Create executive brief from summaries.\"\"\"\n", + " try:\n", + " import json\n", + " from langchain_openai import ChatOpenAI\n", + " from datetime import datetime\n", + " \n", + " summaries = json.loads(summaries_json)\n", + " \n", + " if not instructions:\n", + " instructions = \"\"\"Create an executive summary (~700 words) that:\n", + " 1. Identifies the most significant industry developments\n", + " 2. Highlights practical applications\n", + " 3. Notes key performance milestones\n", + " 4. Synthesizes trends across developments\n", + " \n", + " Write for C-suite executives. Be selective - only include most relevant developments.\"\"\"\n", + " \n", + " # Build detailed content\n", + " detailed_content = f\"\"\"# AI Industry Weekly Digest\n", + "*{datetime.now().strftime(\"%B %d, %Y\")}*\n", + "\n", + "## Summaries of Recent Publications\n", + "\n", + "\"\"\"\n", + " \n", + " for i, summary_data in enumerate(summaries, 1):\n", + " filename = summary_data.get(\"filename\", f\"Document {i}\")\n", + " summary_text = summary_data.get(\"summary\", \"\")\n", + " \n", + " title = filename.replace(\".pdf\", \"\").replace(\".html\", \"\").replace(\"_\", \" \").title()\n", + " if len(title) > 80:\n", + " title = title[:77] + \"...\"\n", + " \n", + " detailed_content += f\"\\n### {i}. 
{title}\\n\\n{summary_text}\\n\\n\"\n", + " \n", + " llm = ChatOpenAI(model=\"gpt-4o\", temperature=0.3, openai_api_key=OPENAI_API_KEY)\n", + " \n", + " prompt = f\"\"\"{instructions}\n", + "\n", + "Detailed Newsletter:\n", + "{detailed_content}\n", + "\n", + "Executive Summary:\"\"\"\n", + " \n", + " response = llm.invoke(prompt)\n", + " brief = response.content.strip()\n", + " word_count = len(brief.split())\n", + " \n", + " return {\n", + " \"status\": \"success\",\n", + " \"brief\": brief,\n", + " \"word_count\": word_count,\n", + " \"message\": f\"Generated executive brief ({word_count} words)\"\n", + " }\n", + " except Exception as e:\n", + " return {\n", + " \"status\": \"error\",\n", + " \"error\": str(e),\n", + " \"message\": f\"Error generating brief: {str(e)}\"\n", + " }\n", + "\n", + "# ============================================================\n", + "# Create LangChain Tools\n", + "# ============================================================\n", + "\n", + "summarizer_tools = [\n", + " Tool(\n", + " name=\"retrieve_documents\",\n", + " func=retrieve_documents_from_mongodb,\n", + " description=\"Get list of unique filenames from MongoDB. Returns filenames list (NOT full content). No input needed.\"\n", + " ),\n", + " Tool(\n", + " name=\"get_document_text\",\n", + " func=get_document_text,\n", + " description=\"Get full text for ONE specific document by filename. Input: filename (string). Returns grouped, sorted, concatenated text.\"\n", + " ),\n", + " Tool(\n", + " name=\"count_tokens\",\n", + " func=count_tokens,\n", + " description=\"Estimate token count for text. Input: text (string). Returns whether it fits in context window.\"\n", + " ),\n", + " Tool(\n", + " name=\"batch_documents\",\n", + " func=batch_documents,\n", + " description=\"Split documents into batches. Input: documents_json (JSON string), max_tokens (int). Returns batches.\"\n", + " ),\n", + " Tool(\n", + " name=\"generate_summary\",\n", + " func=generate_document_summary,\n", + " description=\"Generate summary for document. Input: text (string), optional instructions (string).\"\n", + " ),\n", + " Tool(\n", + " name=\"collapse_summaries\",\n", + " func=collapse_summaries_tool,\n", + " description=\"Collapse many summaries into fewer summaries if approaching context limits. Input: summaries_json (JSON array). Use if you have 10+ summaries.\"\n", + " ),\n", + " Tool(\n", + " name=\"generate_brief\",\n", + " func=generate_executive_brief,\n", + " description=\"Create executive brief from summaries. Input: summaries_json (JSON array), optional instructions (string).\"\n", + " ),\n", + "]\n", + "\n", + "# ============================================================\n", + "# Create Summarizer Agent\n", + "# ============================================================\n", + "\n", + "summarizer_prompt = ChatPromptTemplate.from_messages([\n", + " (\"system\", \"\"\"You generate AI newsletter content from MongoDB documents.\n", + "\n", + "IMPORTANT WORKFLOW:\n", + "1. Call retrieve_documents to get the list of filenames\n", + "2. For EACH filename:\n", + " a. Call get_document_text(filename) to get the full text\n", + " b. Call generate_summary(text) to create a summary\n", + " c. Store the summary\n", + "3. After processing 3-4 files (or sooner if context is filling):\n", + " a. IMMEDIATELY call collapse_summaries to reduce accumulated context\n", + " b. Continue with remaining files (if any)\n", + "4. Before generating the executive brief:\n", + " a. 
Call collapse_summaries ONE MORE TIME to ensure context is minimal\n", + " b. Then call generate_brief with the fully collapsed summaries\n", + "5. Present the final newsletter\n", + "\n", + "CONTEXT WINDOW SAFETY (CRITICAL):\n", + "- Your conversation history accumulates tool outputs and can exceed limits\n", + "- Call collapse_summaries EARLY and OFTEN (every 3-4 documents)\n", + "- ALWAYS collapse before generate_brief, even if you already collapsed earlier\n", + "- This prevents context window overflow by keeping intermediate history small\n", + "\n", + "CRITICAL: Process ONE document at a time. DO NOT try to retrieve all documents at once.\n", + "Each document's chunks are already grouped, sorted by page, and concatenated by get_document_text.\n", + "\n", + "Focus summaries on AI/ML advancements. Keep executive brief ~700 words.\n", + "\n", + "MongoDB Database: {mongodb_database}\n", + "MongoDB Collection: {mongodb_collection}\n", + "\"\"\"),\n", + " (\"user\", \"{input}\"),\n", + " MessagesPlaceholder(variable_name=\"agent_scratchpad\")\n", + "])\n", + "\n", + "# Create Summarizer LLM with larger context window\n", + "summarizer_llm = ChatOpenAI(model=\"gpt-4o\", temperature=0.3, openai_api_key=OPENAI_API_KEY)\n", + "\n", + "summarizer_agent = create_openai_functions_agent(summarizer_llm, summarizer_tools, summarizer_prompt)\n", + "summarizer_executor = AgentExecutor(\n", + " agent=summarizer_agent,\n", + " tools=summarizer_tools,\n", + " verbose=True,\n", + " max_iterations=20, # Increased for multiple documents\n", + " handle_parsing_errors=True\n", + ")\n", + "\n", + "print(\"✅ Summarizer Agent ready!\")\n", + "print(f\"📋 Available tools: {', '.join([t.name for t in summarizer_tools])}\")" + ] + }, + { + "cell_type": "markdown", + "id": "d3dfc6fb", + "metadata": {}, + "source": [ + "### Execute Summarizer Agent\n", + "\n", + "Generate this week's AI newsletter autonomously." 
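+ ] + }, + { + "cell_type": "markdown", + "id": "d4f6a3b2", + "metadata": {}, + "source": [ + "If you want to tune the summary style before committing to a full agent run, you can preview custom instructions on a single document by calling the summarizer tool functions directly. A minimal sketch, assuming the tools from the setup cell above and at least one processed file in MongoDB:\n", + "\n", + "```python\n", + "# Preview a custom summary style on the first processed file.\n", + "docs = retrieve_documents_from_mongodb()\n", + "if docs[\"status\"] == \"success\" and docs.get(\"filenames\"):\n", + "    first_file = docs[\"filenames\"][0]\n", + "    text = get_document_text(first_file).get(\"text\", \"\")\n", + "\n", + "    custom_instructions = (\n", + "        \"Summarize this AI/ML content for a product-management audience in 5 bullet points, \"\n", + "        \"emphasizing practical applications over benchmark details.\"\n", + "    )\n", + "    preview = generate_document_summary(text, instructions=custom_instructions)\n", + "    print(f\"Preview for {first_file}:\\n\")\n", + "    print(preview.get(\"summary\", preview[\"message\"]))\n", + "else:\n", + "    print(\"No processed documents found - run the orchestrator step first.\")\n", + "```\n", + "\n", + "The same instruction text can then be folded into the agent's input prompt in the execution cell below to steer the full newsletter run."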
+ ] + }, + { + "cell_type": "code", + "execution_count": 14, + "id": "7fc657be", + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "📝 Starting Summarizer Agent\n", + "============================================================\n", + "📋 Task: Generate AI newsletter from MongoDB\n", + "🗄️ Database: scraped_publications\n", + "📁 Collection: documents\n", + "📄 Documents to process: 25 unique files (503 total chunks)\n", + "============================================================\n", + "\n", + "\n", + "\u001b[1m> Entering new AgentExecutor chain...\u001b[0m\n" + ] + }, + { + "name": "stderr", + "output_type": "stream", + "text": [ + "INFO: HTTP Request: POST https://api.openai.com/v1/chat/completions \"HTTP/1.1 200 OK\"\n" + ] + }, + { + "name": "stdout", + "output_type": "stream", + "text": [ + "\u001b[32;1m\u001b[1;3m\n", + "Invoking: `retrieve_documents` with `scraped_publications.documents`\n", + "\n", + "\n", + "\u001b[0m\u001b[36;1m\u001b[1;3m{'status': 'success', 'total_documents': 503, 'unique_files': 25, 'filenames': ['blog_bigcode_arena_20251009_165045.html', 'blog_dots-ocr-ne_20251009_165053.html', '2510v07317v1.pdf', 'blog_faster-transformers_20251009_165057.html', 'blog_AdamF92_reactive-transformer-intro_20251009_165200.html', 'blog_AdamF92_reactive-transformer-intro_20251009_165032.html', 'blog_JohnsonZheng03_ml-agent-trick-automind_20251009_165034.html', '2510v07315v1.pdf', 'blog_NormalUhr_grpo-to-dapo-and-gspo_20251009_165041.html', 'blog_NormalUhr_grpo_20251009_165039.html', 'blog_NormalUhr_grpo-to-dapo-and-gspo_20251009_165213.html', '2510v07314v1.pdf', 'blog_JohnsonZheng03_ml-agent-trick-automind_20251009_165203.html', 'blog_NormalUhr_rlhf-pipeline_20251009_165043.html', 'blog_NormalUhr_rlhf-pipeline_20251009_165215.html', 'blog_NormalUhr_grpo_20251009_165211.html', '2510v07318v1.pdf', '2510v07319v1.pdf', 'blog_catherinearnett_in-defense-of-tokenizers_20251009_165219.html', 'blog_catherinearnett_in-defense-of-tokenizers_20251009_165052.html', 'blog_bigcode_arena_20251009_165217.html', 'blog_driaforall_mem-agent-blog_20251009_165055.html', 'blog_dots-ocr-ne_20251009_165221.html', 'blog_driaforall_mem-agent-blog_20251009_165223.html', 'blog_faster-transformers_20251009_165225.html'], 'message': 'Found 25 unique files to process (use get_document_text to retrieve content)'}\u001b[0m" + ] + }, + { + "name": "stderr", + "output_type": "stream", + "text": [ + "INFO: HTTP Request: POST https://api.openai.com/v1/chat/completions \"HTTP/1.1 200 OK\"\n" + ] + }, + { + "name": "stdout", + "output_type": "stream", + "text": [ + "\u001b[32;1m\u001b[1;3m\n", + "Invoking: `get_document_text` with `blog_bigcode_arena_20251009_165045.html`\n", + "\n", + "\n", + "\u001b[0m\u001b[33;1m\u001b[1;3m{'status': 'success', 'filename': 'blog_bigcode_arena_20251009_165045.html', 'pages': 6, 'text': 'Back to Articles\\n\\nBigCodeArena: Judging code generations end to end with code executions\\n\\nCommunity Article Published October 7, 2025\\n\\nUpvote\\n\\n12\\n\\nTerry Yue Zhuo\\n\\nterryyz\\n\\nbigcode\\n\\nEvaluating the quality of AI-generated code is notoriously difficult. While humans can easily spot whether a piece of code \"looks right,\" determining if it actually works correctly, handles edge cases properly, and produces the intended result requires running and testing it. 
This is why today, we\\'re thrilled to announce BigCodeArena -- the first human-in-the-loop platform for evaluating code generation models through execution.\\n\\nInspired by LMArena for LLMs, we\\'ve built a platform that allows anyone to compare code generation models side-by-side, but with a crucial difference: you can actually run the code and see what it produces. Just submit a coding task, watch two different models generate solutions, execute both programs, and vote on which model produced better results. The outcomes are organized into a leaderboard that displays the community\\'s highest-rated models.\\n\\nMotivation\\n\\nThe field of code generation has long struggled with reliable evaluation methods. Traditional benchmarks like HumanEval test code against predefined test cases, but these represent only a tiny fraction of real-world programming tasks. Human evaluation platforms exist for general chatbots, but they fall short for code: reading raw source code and mentally simulating its execution is cognitively demanding and error-prone, especially for longer programs or complex UI applications.\\n\\nConsider this scenario:\\n\\nYou ask two AI models to build a responsive photo gallery website. Both generate code that looks syntactically correct. But which one is actually better? Without running the code, it\\'s nearly impossible to tell. One might produce a beautiful, functional grid layout, while the other might have subtle bugs or poor styling that only become apparent when rendered in a browser.\\n\\nThis observation led us to a key insight: execution feedback is essential for humans to judge code quality reliably. That\\'s exactly what BigCodeArena provides.\\n\\nThe BigCodeArena Platform\\n\\nBigCodeArena extends the Chatbot Arena framework with powerful features specifically designed for code evaluation:\\n\\nReal-Time Execution\\n\\nEvery code snippet generated by models is automatically executed in isolated sandbox environments. Whether it\\'s a Python script, a React web app, a PyGame game, or a C++ algorithm, you can see the actual output, not just the source code.\\n\\nMulti-Language & Framework Support\\n\\nWe currently support 10 languages (Python, JavaScript, TypeScript, HTML, C, C++, Java, Go, Rust, and Markdown) and 8 execution environments:\\n\\nWeb Frameworks: React, Vue, Core Web (vanilla HTML/CSS/JS)\\n\\nPython Frameworks: Streamlit, Gradio, PyGame\\n\\nDiagrams: Mermaid\\n\\nGeneral Purpose Interpreters: Python and JavaScript code interpreters, plus compiled language runners\\n\\nInteractive Testing\\n\\nUnlike static code comparison, you can actually interact with the generated applications:\\n\\nClick buttons and test UI elements in web apps\\n\\nPlay the games generated by models\\n\\nEdit the code and re-run it to test modifications\\n\\nView visual outputs like plots, charts, and diagrams\\n\\nMulti-Turn Conversations\\n\\nReal programming isn\\'t one-and-done. 
BigCodeArena supports multi-turn interactions, allowing you to refine requirements, ask for features to be added, or request bug fixes -- just like working with a real coding assistant.\\n\\nWhat We\\'ve Learned: 5 Months of Community Evaluation\\n\\nSince launching in February 2025, BigCodeArena has collected over 14,000 conversations from more than 500 unique users, with 4,700+ high-quality preference votes comparing 10 frontier LLMs.\\n\\nProgramming Topics in the Wild\\n\\nOur users have explored remarkably diverse coding scenarios:\\n\\nWeb Design (36%): Building responsive websites, interactive dashboards, and web applications\\n\\nProblem Solving (23%): Algorithms, data structures, and computational challenges\\n\\nGame Development (16%): Creating interactive games with physics, collision detection, and graphics\\n\\nScientific Computing (14%): Data analysis, visualization, and numerical simulations\\n\\nCreative Coding (8%): Artistic visualizations, generative art, and experimental interfaces\\n\\nDiagram Creation (3%): Flowcharts, system architectures, and data visualizations\\n\\nLanguage and Framework Popularity\\n\\nPython dominates with over 4,000 conversations, followed by JavaScript/TypeScript (3,359), HTML (1,601), and C++ (642). Among frameworks, direct Python interpreters lead usage (6,000 sessions), with React (2,729), Core Web (1,574), Streamlit (1,254), and PyGame (1,087) also seeing heavy use.\\n\\nUser Interaction Patterns\\n\\nMost interactions are focused and efficient: 76% of conversations consist of just 2 turns (one request, one response), with a mean conversation length of 4.12 messages. However, the platform supports extended multi-turn debugging sessions when needed, with some conversations exceeding 10 turns as users refine complex applications.\\n\\nModel Rankings from Community Votes\\n\\nFrom our 14K conversations, we filtered for high-quality pairwise comparisons: conversations with at least two turns and actual code execution. This yielded 4,731 voting samples, with each evaluated model receiving at least 700 votes. We aggregate these votes into Elo ratings using the Bradley-Terry model, which estimates the probability that one model beats another based on head-to-head comparisons.\\n\\nTo ensure robust rankings, we use 100 bootstrap resamples to construct 95% confidence intervals, so we can identify statistically significant performance differences between models.\\n\\nWe evaluate models under three settings to control for different factors:\\n\\nAll Data: Uses all pairwise comparisons regardless of execution environment or programming language\\n\\nEnvironment Matched: Only compares models when both were executed in the same sandbox (e.g., both in React or both in PyGame)\\n\\nLanguage Matched: Further restricts comparisons to the same programming language\\n\\nRankings remain remarkably consistent across all three settings, revealing clear performance tiers:\\n\\nTop Tier: o3-mini and o1-mini consistently lead with the highest Elo ratings and tight confidence intervals. These models maintain top performance regardless of environment or language constraints, showing strong robustness across coding scenarios. Claude-3.5-Sonnet follows closely, particularly excelling when language is controlled.\\n\\nMid Tier: GPT-4o, o1, and Gemini-2.0-Pro/Flash form a competitive middle tier. 
GPT-4o shows some sensitivity to language matching, suggesting room for improvement in multilingual consistency.\\n\\nOpen Source Models: Qwen2.5 variants and Llama-3.3-70B lag behind frontier proprietary models, highlighting the performance gap that remains between leading closed and open models.\\n\\nFigure: Overall win rate heatmaps (percentage of all pairwise comparisons won) of each model in the sessions across languages (left) and execution environments (right). For each category, we only keep models that appear in at least 3 conversation sessions.\\n\\nPerformance Across Languages\\n\\nBreaking down performance by programming language reveals interesting patterns:\\n\\nTop-tier models like o3-mini and o1-mini achieve dominant win rates in mainstream languages like Python, Java, and C++\\n\\nGemini-2.0-Pro shows particular strength in Rust, achieving the highest win rate in that category\\n\\nDifferent models exhibit distinct areas of expertise, with frontier models excelling in different niches\\n\\nOpen models like Qwen2.5 variants show inconsistent performance, particularly struggling with Rust and Go\\n\\nPerformance Across Execution Environments\\n\\nAnalyzing win rates by execution environment reveals how models handle different runtime contexts:\\n\\nRobust Performers: o3-mini maintains consistently strong performance across React, Streamlit, Gradio, Core Web, and PyGame, demonstrating excellent environmental adaptability.\\n\\nStable but Selective: Claude-3.5-Sonnet and Gemini-2.0-Flash show generally stable performance but with reduced win rates in complex UI-heavy environments like Vue and Mermaid.\\n\\nFramework-Specific Weaknesses: Qwen2.5 models, while competitive in some web frameworks (Core Web, React), struggle significantly with interactive and visualization-oriented environments like PyGame, Vue, and Mermaid. These environments often require precise handling of control flow, graphics rendering, and package dependencies.\\n\\nThese results highlight an important insight: aggregate Elo scores don\\'t tell the whole story. Some models remain brittle under specific runtime constraints, and execution environment matters significantly for real-world deployment.\\n\\nTwo New Benchmarks: BigCodeReward and AutoCodeArena\\n\\nTo advance research beyond crowdsourced evaluation, we\\'re releasing two complementary benchmarks:\\n\\nBigCodeReward: Evaluating Reward Models for Code\\n\\nBuilding on our 4,700+ preference votes, BigCodeReward tests how well LLMs can judge code quality when acting as reward models. The key finding? Execution results dramatically improve judgment accuracy.\\n\\nWhen models can see execution outputs (screenshots of web apps, game visuals, program logs), their alignment with human preferences increases substantially:\\n\\nClaude-Sonnet-4: 56.7% → 62.3% accuracy\\n\\nGPT-4o: 54.6% → 63.8% accuracy\\n\\nQwen2.5-VL-72B: 58.7% → 66.2% accuracy\\n\\nThis reinforces our core thesis: you can\\'t reliably judge code without running it -- and this applies to both humans and AI judges.\\n\\nAutoCodeArena: Automated Code Generation Benchmarks\\n\\nInspired by Arena-Hard-Auto, AutoCodeArena provides a scalable way to evaluate new models without waiting for thousands of human votes. We carefully selected 600 representative prompts from our crowdsourced data, spanning all programming topics and frameworks.\\n\\nUsing automated LLM judges (Claude-3.7-Sonnet) to evaluate code execution results against a GPT-4.1 baseline, we can rapidly benchmark new models. 
This approach enables weekly leaderboard updates as new models are released.\\n\\nOur automated benchmark evaluated 20+ cutting-edge models, including recently released systems:\\n\\nTop Performers:\\n\\nGPT-5 -- Establishes new state-of-the-art by a significant margin\\n\\nClaude-Opus-4 and Claude-Sonnet-4 -- Strong second tier, excelling in reasoning-heavy tasks\\n\\nQwen3-Coder, Kimi-K2, GLM-4.5 -- Leading open models that narrow the gap with mid-tier proprietary systems\\n\\nFigure: Win rates of recent LLMs on AutoCodeArena against a GPT-4.1 baseline, judged by Claude-3.7-Sonnet. The 50% mark represents parity with GPT-4.1. Models above this line outperform the baseline, while those below underperform. Error bars show 95% confidence intervals. Note: Claude-3.7-Sonnet is excluded from rankings to avoid self-judgment bias, and GPT-4.1 appears only as the reference baseline.\\n\\nThe results show that while proprietary models maintain an edge, open-source models are rapidly closing the gap, with some approaching GPT-4.1-level performance.\\n\\nTry It Yourself\\n\\nBigCodeArena is open to everyone -- no account required! Visit https://huggingface.co/spaces/bigcode/arena to:\\n\\nCompare code from more recent frontier LLMs (e.g., Qwen3, DeepSeek-V3.X, and other proprietary models)\\n\\nTest web apps, games, visualizations, and algorithms\\n\\nSee real execution results, not just source code\\n\\nVote on your preferences to help improve the leaderboard\\n\\nExplore multi-turn coding conversations\\n\\nWhether you\\'re building a React dashboard, creating a PyGame game, solving algorithmic challenges, or generating creative visualizations, BigCodeArena lets you see which models truly deliver.\\n\\nOpen Source Everything\\n\\nFollowing the BigCode Project\\'s commitment to transparency, we\\'re releasing:\\n\\nCodebase: Full evaluation pipelines and Gradio application source (GitHub)\\n\\nCrowdsourced Data: 14K raw conversations and 4.7K preference votes (HuggingFace Collection)\\n\\nBenchmarks: BigCodeReward and AutoCodeArena datasets\\n\\nWhat\\'s Next?\\n\\nWe envision BigCodeArena as a long-term project that evolves with the community:\\n\\nExpanded Language Support: More programming languages and frameworks.\\n\\nLive Benchmarks: Continuously refreshed evaluation prompts to prevent overfitting\\n\\nAgent-Based Evaluation: Using AI agents to interact with web apps for deeper testing\\n\\nBetter Reward Models: Advancing automated code quality assessment\\n\\nCommunity Contributions: We welcome new execution environments, evaluation criteria, and model additions. PRs are always welcome!\\n\\nConclusion\\n\\nEvaluating code isn\\'t like evaluating text -- you need to run it, test it, and interact with it. BigCodeArena makes this possible at scale, combining human judgment with real execution feedback to create the most reliable evaluation platform for code generation models.\\n\\nJoin us in building the future of code generation evaluation. Write a prompt, compare the models, and vote for your favorite. Your feedback helps the entire community understand which models truly deliver on the promise of AI-assisted programming.\\n\\nWe\\'d love to hear your feedback! 
Connect with us on GitHub, join discussions in the Hugging Face Space community tab, or reach out to the BigCode Project at contact@bigcode-project.org.\\n\\nAcknowledgements\\n\\nWe thank Leandro von Werra for his valuable suggestions and feedback on the blog.\\n\\nCitation\\n\\n@article{zhuo2025bigcodearena, title={BigCodeArena: Unveiling More Reliable Human Preferences in Code Generation via Execution}, author={Terry Yue Zhuo, Xiaolong Jin, Hange Liu, Juyong Jiang, Tianyang Liu, Chen Gong, Bhupesh Bishnoi, Vaisakhi Mishra, Marek Suppa, Noah Ziems, Saiteja Utpala, Ming Xu, Guangyu Song, Kaixin Li, Yuhan Cao, Bo Liu, Zheng Liu, Sabina Abdurakhmanova, Wenhao Yu, Mengzhao Jia, Jihan Yao, Kenneth Hamilton, Kumar Shridhar, Minh Chien Vu, Dingmin Wang, Jiawei Liu, Zijian Wang, Qian Liu, Binyuan Hui, Meg Risdal, Ahsen Khaliq, Atin Sood, Zhenchang Xing, Wasi Uddin Ahmad, John Grundy, David Lo, Banghua Zhu, Xiaoning Du, Torsten Scholak, Leandro von Werra}, year={2025} }\\n\\nTry BigCodeArena now: Hugging Face Space\\n\\nRead the paper: Download Link\\n\\nRun the code: GitHub\\n\\nExplore the collection: Hugging Face Collection\\n\\nCommunity\\n\\nSign up or log in to comment\\n\\nUpvote\\n\\n12', 'text_length': 28502, 'message': 'Retrieved 6 pages for blog_bigcode_arena_20251009_165045.html'}\u001b[0m" + ] + }, + { + "name": "stderr", + "output_type": "stream", + "text": [ + "INFO: HTTP Request: POST https://api.openai.com/v1/chat/completions \"HTTP/1.1 200 OK\"\n" + ] + }, + { + "name": "stdout", + "output_type": "stream", + "text": [ + "\u001b[32;1m\u001b[1;3m\n", + "Invoking: `generate_summary` with `Back to Articles\n", + "\n", + "BigCodeArena: Judging code generations end to end with code executions\n", + "\n", + "Community Article Published October 7, 2025\n", + "\n", + "Upvote\n", + "\n", + "12\n", + "\n", + "Terry Yue Zhuo\n", + "\n", + "terryyz\n", + "\n", + "bigcode\n", + "\n", + "Evaluating the quality of AI-generated code is notoriously difficult. While humans can easily spot whether a piece of code \"looks right,\" determining if it actually works correctly, handles edge cases properly, and produces the intended result requires running and testing it. This is why today, we're thrilled to announce BigCodeArena -- the first human-in-the-loop platform for evaluating code generation models through execution.\n", + "\n", + "Inspired by LMArena for LLMs, we've built a platform that allows anyone to compare code generation models side-by-side, but with a crucial difference: you can actually run the code and see what it produces. Just submit a coding task, watch two different models generate solutions, execute both programs, and vote on which model produced better results. The outcomes are organized into a leaderboard that displays the community's highest-rated models.\n", + "\n", + "Motivation\n", + "\n", + "The field of code generation has long struggled with reliable evaluation methods. Traditional benchmarks like HumanEval test code against predefined test cases, but these represent only a tiny fraction of real-world programming tasks. Human evaluation platforms exist for general chatbots, but they fall short for code: reading raw source code and mentally simulating its execution is cognitively demanding and error-prone, especially for longer programs or complex UI applications.\n", + "\n", + "Consider this scenario:\n", + "\n", + "You ask two AI models to build a responsive photo gallery website. Both generate code that looks syntactically correct. But which one is actually better? 
Without running the code, it's nearly impossible to tell. One might produce a beautiful, functional grid layout, while the other might have subtle bugs or poor styling that only become apparent when rendered in a browser.\n", + "\n", + "This observation led us to a key insight: execution feedback is essential for humans to judge code quality reliably. That's exactly what BigCodeArena provides.\n", + "\n", + "The BigCodeArena Platform\n", + "\n", + "BigCodeArena extends the Chatbot Arena framework with powerful features specifically designed for code evaluation:\n", + "\n", + "Real-Time Execution\n", + "\n", + "Every code snippet generated by models is automatically executed in isolated sandbox environments. Whether it's a Python script, a React web app, a PyGame game, or a C++ algorithm, you can see the actual output, not just the source code.\n", + "\n", + "Multi-Language & Framework Support\n", + "\n", + "We currently support 10 languages (Python, JavaScript, TypeScript, HTML, C, C++, Java, Go, Rust, and Markdown) and 8 execution environments:\n", + "\n", + "Web Frameworks: React, Vue, Core Web (vanilla HTML/CSS/JS)\n", + "\n", + "Python Frameworks: Streamlit, Gradio, PyGame\n", + "\n", + "Diagrams: Mermaid\n", + "\n", + "General Purpose Interpreters: Python and JavaScript code interpreters, plus compiled language runners\n", + "\n", + "Interactive Testing\n", + "\n", + "Unlike static code comparison, you can actually interact with the generated applications:\n", + "\n", + "Click buttons and test UI elements in web apps\n", + "\n", + "Play the games generated by models\n", + "\n", + "Edit the code and re-run it to test modifications\n", + "\n", + "View visual outputs like plots, charts, and diagrams\n", + "\n", + "Multi-Turn Conversations\n", + "\n", + "Real programming isn't one-and-done. BigCodeArena supports multi-turn interactions, allowing you to refine requirements, ask for features to be added, or request bug fixes -- just like working with a real coding assistant.\n", + "\n", + "What We've Learned: 5 Months of Community Evaluation\n", + "\n", + "Since launching in February 2025, BigCodeArena has collected over 14,000 conversations from more than 500 unique users, with 4,700+ high-quality preference votes comparing 10 frontier LLMs.\n", + "\n", + "Programming Topics in the Wild\n", + "\n", + "Our users have explored remarkably diverse coding scenarios:\n", + "\n", + "Web Design (36%): Building responsive websites, interactive dashboards, and web applications\n", + "\n", + "Problem Solving (23%): Algorithms, data structures, and computational challenges\n", + "\n", + "Game Development (16%): Creating interactive games with physics, collision detection, and graphics\n", + "\n", + "Scientific Computing (14%): Data analysis, visualization, and numerical simulations\n", + "\n", + "Creative Coding (8%): Artistic visualizations, generative art, and experimental interfaces\n", + "\n", + "Diagram Creation (3%): Flowcharts, system architectures, and data visualizations\n", + "\n", + "Language and Framework Popularity\n", + "\n", + "Python dominates with over 4,000 conversations, followed by JavaScript/TypeScript (3,359), HTML (1,601), and C++ (642). 
Among frameworks, direct Python interpreters lead usage (6,000 sessions), with React (2,729), Core Web (1,574), Streamlit (1,254), and PyGame (1,087) also seeing heavy use.\n", + "\n", + "User Interaction Patterns\n", + "\n", + "Most interactions are focused and efficient: 76% of conversations consist of just 2 turns (one request, one response), with a mean conversation length of 4.12 messages. However, the platform supports extended multi-turn debugging sessions when needed, with some conversations exceeding 10 turns as users refine complex applications.\n", + "\n", + "Model Rankings from Community Votes\n", + "\n", + "From our 14K conversations, we filtered for high-quality pairwise comparisons: conversations with at least two turns and actual code execution. This yielded 4,731 voting samples, with each evaluated model receiving at least 700 votes. We aggregate these votes into Elo ratings using the Bradley-Terry model, which estimates the probability that one model beats another based on head-to-head comparisons.\n", + "\n", + "To ensure robust rankings, we use 100 bootstrap resamples to construct 95% confidence intervals, so we can identify statistically significant performance differences between models.\n", + "\n", + "We evaluate models under three settings to control for different factors:\n", + "\n", + "All Data: Uses all pairwise comparisons regardless of execution environment or programming language\n", + "\n", + "Environment Matched: Only compares models when both were executed in the same sandbox (e.g., both in React or both in PyGame)\n", + "\n", + "Language Matched: Further restricts comparisons to the same programming language\n", + "\n", + "Rankings remain remarkably consistent across all three settings, revealing clear performance tiers:\n", + "\n", + "Top Tier: o3-mini and o1-mini consistently lead with the highest Elo ratings and tight confidence intervals. These models maintain top performance regardless of environment or language constraints, showing strong robustness across coding scenarios. Claude-3.5-Sonnet follows closely, particularly excelling when language is controlled.\n", + "\n", + "Mid Tier: GPT-4o, o1, and Gemini-2.0-Pro/Flash form a competitive middle tier. GPT-4o shows some sensitivity to language matching, suggesting room for improvement in multilingual consistency.\n", + "\n", + "Open Source Models: Qwen2.5 variants and Llama-3.3-70B lag behind frontier proprietary models, highlighting the performance gap that remains between leading closed and open models.\n", + "\n", + "Figure: Overall win rate heatmaps (percentage of all pairwise comparisons won) of each model in the sessions across languages (left) and execution environments (right). 
For each category, we only keep models that appear in at least 3 conversation sessions.\n", + "\n", + "Performance Across Languages\n", + "\n", + "Breaking down performance by programming language reveals interesting patterns:\n", + "\n", + "Top-tier models like o3-mini and o1-mini achieve dominant win rates in mainstream languages like Python, Java, and C++\n", + "\n", + "Gemini-2.0-Pro shows particular strength in Rust, achieving the highest win rate in that category\n", + "\n", + "Different models exhibit distinct areas of expertise, with frontier models excelling in different niches\n", + "\n", + "Open models like Qwen2.5 variants show inconsistent performance, particularly struggling with Rust and Go\n", + "\n", + "Performance Across Execution Environments\n", + "\n", + "Analyzing win rates by execution environment reveals how models handle different runtime contexts:\n", + "\n", + "Robust Performers: o3-mini maintains consistently strong performance across React, Streamlit, Gradio, Core Web, and PyGame, demonstrating excellent environmental adaptability.\n", + "\n", + "Stable but Selective: Claude-3.5-Sonnet and Gemini-2.0-Flash show generally stable performance but with reduced win rates in complex UI-heavy environments like Vue and Mermaid.\n", + "\n", + "Framework-Specific Weaknesses: Qwen2.5 models, while competitive in some web frameworks (Core Web, React), struggle significantly with interactive and visualization-oriented environments like PyGame, Vue, and Mermaid. These environments often require precise handling of control flow, graphics rendering, and package dependencies.\n", + "\n", + "These results highlight an important insight: aggregate Elo scores don't tell the whole story. Some models remain brittle under specific runtime constraints, and execution environment matters significantly for real-world deployment.\n", + "\n", + "Two New Benchmarks: BigCodeReward and AutoCodeArena\n", + "\n", + "To advance research beyond crowdsourced evaluation, we're releasing two complementary benchmarks:\n", + "\n", + "BigCodeReward: Evaluating Reward Models for Code\n", + "\n", + "Building on our 4,700+ preference votes, BigCodeReward tests how well LLMs can judge code quality when acting as reward models. The key finding? Execution results dramatically improve judgment accuracy.\n", + "\n", + "When models can see execution outputs (screenshots of web apps, game visuals, program logs), their alignment with human preferences increases substantially:\n", + "\n", + "Claude-Sonnet-4: 56.7% → 62.3% accuracy\n", + "\n", + "GPT-4o: 54.6% → 63.8% accuracy\n", + "\n", + "Qwen2.5-VL-72B: 58.7% → 66.2% accuracy\n", + "\n", + "This reinforces our core thesis: you can't reliably judge code without running it -- and this applies to both humans and AI judges.\n", + "\n", + "AutoCodeArena: Automated Code Generation Benchmarks\n", + "\n", + "Inspired by Arena-Hard-Auto, AutoCodeArena provides a scalable way to evaluate new models without waiting for thousands of human votes. We carefully selected 600 representative prompts from our crowdsourced data, spanning all programming topics and frameworks.\n", + "\n", + "Using automated LLM judges (Claude-3.7-Sonnet) to evaluate code execution results against a GPT-4.1 baseline, we can rapidly benchmark new models. 
This approach enables weekly leaderboard updates as new models are released.\n", + "\n", + "Our automated benchmark evaluated 20+ cutting-edge models, including recently released systems:\n", + "\n", + "Top Performers:\n", + "\n", + "GPT-5 -- Establishes new state-of-the-art by a significant margin\n", + "\n", + "Claude-Opus-4 and Claude-Sonnet-4 -- Strong second tier, excelling in reasoning-heavy tasks\n", + "\n", + "Qwen3-Coder, Kimi-K2, GLM-4.5 -- Leading open models that narrow the gap with mid-tier proprietary systems\n", + "\n", + "Figure: Win rates of recent LLMs on AutoCodeArena against a GPT-4.1 baseline, judged by Claude-3.7-Sonnet. The 50% mark represents parity with GPT-4.1. Models above this line outperform the baseline, while those below underperform. Error bars show 95% confidence intervals. Note: Claude-3.7-Sonnet is excluded from rankings to avoid self-judgment bias, and GPT-4.1 appears only as the reference baseline.\n", + "\n", + "The results show that while proprietary models maintain an edge, open-source models are rapidly closing the gap, with some approaching GPT-4.1-level performance.\n", + "\n", + "Try It Yourself\n", + "\n", + "BigCodeArena is open to everyone -- no account required! Visit https://huggingface.co/spaces/bigcode/arena to:\n", + "\n", + "Compare code from more recent frontier LLMs (e.g., Qwen3, DeepSeek-V3.X, and other proprietary models)\n", + "\n", + "Test web apps, games, visualizations, and algorithms\n", + "\n", + "See real execution results, not just source code\n", + "\n", + "Vote on your preferences to help improve the leaderboard\n", + "\n", + "Explore multi-turn coding conversations\n", + "\n", + "Whether you're building a React dashboard, creating a PyGame game, solving algorithmic challenges, or generating creative visualizations, BigCodeArena lets you see which models truly deliver.\n", + "\n", + "Open Source Everything\n", + "\n", + "Following the BigCode Project's commitment to transparency, we're releasing:\n", + "\n", + "Codebase: Full evaluation pipelines and Gradio application source (GitHub)\n", + "\n", + "Crowdsourced Data: 14K raw conversations and 4.7K preference votes (HuggingFace Collection)\n", + "\n", + "Benchmarks: BigCodeReward and AutoCodeArena datasets\n", + "\n", + "What's Next?\n", + "\n", + "We envision BigCodeArena as a long-term project that evolves with the community:\n", + "\n", + "Expanded Language Support: More programming languages and frameworks.\n", + "\n", + "Live Benchmarks: Continuously refreshed evaluation prompts to prevent overfitting\n", + "\n", + "Agent-Based Evaluation: Using AI agents to interact with web apps for deeper testing\n", + "\n", + "Better Reward Models: Advancing automated code quality assessment\n", + "\n", + "Community Contributions: We welcome new execution environments, evaluation criteria, and model additions. PRs are always welcome!\n", + "\n", + "Conclusion\n", + "\n", + "Evaluating code isn't like evaluating text -- you need to run it, test it, and interact with it. BigCodeArena makes this possible at scale, combining human judgment with real execution feedback to create the most reliable evaluation platform for code generation models.\n", + "\n", + "Join us in building the future of code generation evaluation. Write a prompt, compare the models, and vote for your favorite. Your feedback helps the entire community understand which models truly deliver on the promise of AI-assisted programming.\n", + "\n", + "We'd love to hear your feedback! 
Connect with us on GitHub, join discussions in the Hugging Face Space community tab, or reach out to the BigCode Project at contact@bigcode-project.org.\n", + "\n", + "Acknowledgements\n", + "\n", + "We thank Leandro von Werra for his valuable suggestions and feedback on the blog.\n", + "\n", + "Citation\n", + "\n", + "@article{zhuo2025bigcodearena, title={BigCodeArena: Unveiling More Reliable Human Preferences in Code Generation via Execution}, author={Terry Yue Zhuo, Xiaolong Jin, Hange Liu, Juyong Jiang, Tianyang Liu, Chen Gong, Bhupesh Bishnoi, Vaisakhi Mishra, Marek Suppa, Noah Ziems, Saiteja Utpala, Ming Xu, Guangyu Song, Kaixin Li, Yuhan Cao, Bo Liu, Zheng Liu, Sabina Abdurakhmanova, Wenhao Yu, Mengzhao Jia, Jihan Yao, Kenneth Hamilton, Kumar Shridhar, Minh Chien Vu, Dingmin Wang, Jiawei Liu, Zijian Wang, Qian Liu, Binyuan Hui, Meg Risdal, Ahsen Khaliq, Atin Sood, Zhenchang Xing, Wasi Uddin Ahmad, John Grundy, David Lo, Banghua Zhu, Xiaoning Du, Torsten Scholak, Leandro von Werra}, year={2025} }\n", + "\n", + "Try BigCodeArena now: Hugging Face Space\n", + "\n", + "Read the paper: Download Link\n", + "\n", + "Run the code: GitHub\n", + "\n", + "Explore the collection: Hugging Face Collection\n", + "\n", + "Community\n", + "\n", + "Sign up or log in to comment\n", + "\n", + "Upvote\n", + "\n", + "12`\n", + "\n", + "\n", + "\u001b[0m" + ] + }, + { + "name": "stderr", + "output_type": "stream", + "text": [ + "INFO: HTTP Request: POST https://api.openai.com/v1/chat/completions \"HTTP/1.1 200 OK\"\n" + ] + }, + { + "name": "stdout", + "output_type": "stream", + "text": [ + "\u001b[33;1m\u001b[1;3m{'status': 'success', 'summary': \"BigCodeArena introduces a novel advancement in AI code generation evaluation by enabling real-time execution of AI-generated code, allowing users to compare models and vote on their performance based on actual outputs. This platform addresses the limitations of traditional benchmarks by providing a human-in-the-loop system where code can be run in isolated environments, supporting multiple languages and frameworks. The platform has shown significant performance improvements, with models like o3-mini and o1-mini consistently ranking at the top across various languages and execution environments. Practical applications span web design, game development, scientific computing, and more, highlighting the platform's versatility and industry impact. BigCodeArena's community-driven approach has led to over 14,000 conversations and 4,700 preference votes, offering valuable insights into model performance across diverse coding scenarios. The introduction of benchmarks like BigCodeReward and AutoCodeArena further enhances the evaluation process, with execution feedback proving crucial for accurate code quality assessment. 
Open-source contributions and continuous updates aim to expand language support and improve automated code quality assessment, positioning BigCodeArena as a leading tool for evaluating AI-assisted programming.\", 'length': 1338, 'message': 'Generated summary (1338 characters)'}\u001b[0m" + ] + }, + { + "name": "stderr", + "output_type": "stream", + "text": [ + "INFO: HTTP Request: POST https://api.openai.com/v1/chat/completions \"HTTP/1.1 200 OK\"\n" + ] + }, + { + "name": "stdout", + "output_type": "stream", + "text": [ + "\u001b[32;1m\u001b[1;3m\n", + "Invoking: `get_document_text` with `blog_dots-ocr-ne_20251009_165053.html`\n", + "\n", + "\n", + "\u001b[0m\u001b[33;1m\u001b[1;3m{'status': 'success', 'filename': 'blog_dots-ocr-ne_20251009_165053.html', 'pages': 6, 'text': 'Back to Articles\\n\\nSOTA OCR on-device with Core ML and dots.ocr\\n\\nPublished October 2, 2025\\n\\nUpdate on GitHub\\n\\nUpvote\\n\\n28\\n\\nChristopher Fleetwood\\n\\nFL33TW00D-HF\\n\\nPedro Cuenca\\n\\npcuenq\\n\\nEvery year our hardware is a little more powerful, our models a little smarter for each parameter. In 2025, it is more feasible than ever to run truly competitive models on-device. dots.ocr, a 3B parameter OCR model from RedNote, surpasses Gemini 2.5 Pro in OmniDocBench, making OCR a truly no compromises on-device use case. Running models on-device is certainly appealing to developers: no smuggling API keys, zero cost, and no network required. However, if we want these models to run on-device, we need to be mindful of the limited compute and power budgets.\\n\\nEnter the Neural Engine, Apple\\'s custom AI accelerator that has shipped with every Apple device since 2017. This accelerator is designed for high performance whilst sipping battery power. Some of our testing has found the Neural Engine to be 12x more power efficient than CPU, and 4x more power efficient than GPU.\\n\\nWhilst this all sounds very appealing, unfortunately the Neural Engine is only accessible through Core ML, Apple\\'s closed source ML framework. Furthermore, even just converting a model from PyTorch to Core ML can present some challenges, and without a preconverted model or some knowledge of the sharp edges it can be arduous for developers. Luckily, Apple also offers MLX, a more modern and flexible ML framework that targets the GPU (not the Neural Engine), and can be used in conjunction with Core ML.\\n\\nIn this three part series, we will provide a reasoning trace of how we converted dots.ocr to run on-device, using a combination of CoreML and MLX. This process should be applicable to many other models, and we hope that this will help highlight the ideas and tools needed for developers looking to run their own models on-device.\\n\\nTo follow along, clone the repo. You\\'ll need uv and hf installed to run the setup command:\\n\\n./boostrap.sh\\n\\nIf you just want to skip ahead and use the converted model, you can download it here.\\n\\nConversion\\n\\nConverting from PyTorch to CoreML is a two step process:\\n\\nCapturing your PyTorch execution graph (via torch.jit.trace or, the more modern approach of torch.export).\\n\\nCompiling this converted graph to an .mlpackage using coremltools.\\n\\nWhilst we do have a few knobs we can tweak for step 2, most of our control is in step 1, the graph we feed to coremltools.\\n\\nFollowing the programmers litany of make it work, make it right, make it fast, we will first focus on getting the conversion working on GPU, in FLOAT32, and with static shapes. 
Once we have this working, we can dial down the precision and try and move to the Neural Engine.\\n\\nDots.OCR\\n\\nDots.OCR consists of two key components: A 1.2B parameter vision encoder trained from scratch, based on the NaViT architecture, and a Qwen2.5-1.5B backbone. We will be using CoreML to run the vision encoder, and MLX to run the LM backbone.\\n\\nStep 0: Understand and simplify the model\\n\\nIn order to convert a model, it\\'s best to understand the structure and function before getting started. Looking at the original vision modelling file here, we can see that the vision encoder is similar to the QwenVL family. Like many vision encoders, the vision encoder for dots works on a patch basis, in this case 14x14 patches. The dots vision encoder is capable of processing videos and batches of images. This gives us an opportunity to simplify by only processing a single image at a time. This approach is frequent in on-device apps, where we convert a model that provides the essential functions and iterate if we want to process multiple images.\\n\\nWhen kicking off the conversion process, it\\'s best to start with a minimal viable model. This means removing any bells and whistles that are not strictly necessary for the model to function. In our case, dots has many different attention implementations available for both the vision encoder and the LM backbone. CoreML has lots of infrastructure oriented around the scaled_dot_product_attention operator, which they introduced in iOS 18. We can simplify the model by removing all of the other attention implementations and just focusing on simple sdpa (not the memory efficient variant) for now, commit here.\\n\\nOnce we\\'ve done this, we see a scary warning message when we load the model:\\n\\nSliding Window Attention is enabled but not implemented for `sdpa`; unexpected results may be encountered.\\n\\nThe model doesn\\'t require Sliding Window Attention to function, so we can happily move on.\\n\\nStep 1: A simple harness\\n\\nUsing torch.jit.trace is still the most mature method for converting models to CoreML. We usually encapsulate this in a simple harness that allows you to modify the compute units used and the precision selected.\\n\\nYou can check out the initial harness here. If we run the following on the original code implementation:\\n\\nuv run convert.py --precision FLOAT32 --compute_units CPU_AND_GPU\\n\\nWe should bump into the first (of many) issues.\\n\\nStep 2: Bug hunting\\n\\nIt is rare that a model will convert first time. Often, you will need to progressively make changes further and further down the execution graph until you reach the final node.\\n\\nOur first issue is the following error:\\n\\nERROR - converting \\'outer\\' op (located at: \\'vision_tower/rotary_pos_emb/192\\'): In op \"matmul\", when x and y are both non-const, their dtype need to match, but got x as int32 and y as fp32\\n\\nLuckily this error gives us quite a bit of information. We can look at the VisionRotaryEmbedding layer and see the following code:\\n\\ndef forward(self, seqlen: int) -> torch.Tensor: seq = torch.arange(seqlen, device=self.inv_freq.device, dtype=self.inv_freq.dtype) freqs = torch.outer(seq, self.inv_freq) return freqs\\n\\nAlthough torch.arange has a dtype argument, coremltools ignores this for arange and always outputs int32. 
We can simply add a cast after the arange to fix this issue, commit here.\\n\\nAfter fixing this, running the conversion again leads us to our next issue at repeat_interleave:\\n\\nERROR - converting \\'repeat_interleave\\' op (located at: \\'vision_tower/204\\'): Cannot add const [None]\\n\\nWhilst this error is less informative, we only have a single call to repeat_interleave in our vision encoder:\\n\\ncu_seqlens = torch.repeat_interleave(grid_thw[:, 1] * grid_thw[:, 2], grid_thw[:, 0]).cumsum( dim=0, dtype=grid_thw.dtype if torch.jit.is_tracing() else torch.int32, )\\n\\ncu_seqlens is used for masking variable length sequences in flash_attention_2. It\\'s derived from the grid_thw tensor, which represents time, height and width. Since we are only processing a single image, we can simply remove this call, commit here.\\n\\nOnto the next! This time, we get a more cryptic error:\\n\\nERROR - converting \\'_internal_op_tensor_inplace_fill_\\' op (located at: \\'vision_tower/0/attn/301_internal_tensor_assign_1\\'): _internal_op_tensor_inplace_fill does not support dynamic index\\n\\nThis is again due to the masking logic to handle variable length sequences. Since we are only processing a single image (not a video or batch of images), we don\\'t really need attention masking at all! Therefore, we can just use a mask of all True. To prepare ourselves for the Neural Engine conversion, we also switch from using a boolean mask to a float mask of all zeros, as the Neural Engine does not support bool tensors commit here\\n\\nWith all of this done, the model should now successfully convert to CoreML! However, when we run the model, we get the following error:\\n\\nerror: \\'mps.reshape\\' op the result shape is not compatible with the input shape\\n\\nThis reshape could be in multiple places! Luckily, we can use a previous warning message to help us track down the issue:\\n\\nTracerWarning: Iterating over a tensor might cause the trace to be incorrect. Passing a tensor of different shape won\\'t change the number of iterations executed (and might lead to errors or silently give incorrect results). for t, h, w in grid_thw:\\n\\nMost ML compilers do not like dynamic control flow. Luckily for us, as we are only processing a single image, we can simply remove the loop and process the single h, w pair, commit here.\\n\\nAnd there we have it! If we run the conversion again, we should see that the model successfully converts and matches the original PyTorch precision:\\n\\nMax difference: 0.006000518798828125, Mean difference: 1.100682402466191e-05\\n\\nStep 3: Benchmarking\\n\\nNow that we\\'ve got the model working, let\\'s evaluate the size and performance. The good news is the model is working, the bad news is that it\\'s over 5GB! This is completely untenable for on device deployment! To benchmark the computation time, we can use the built in XCode tooling by calling:\\n\\nopen DotsOCR_FLOAT32.mlpackage\\n\\nwhich will launch the XCode inspector for the model. After clicking + Performance Report and launching a report on all compute devices, you should see something like the following:\\n\\nOver a second for a single forward pass of the vision encoder! We have lots of more work.\\n\\nIn the second part of this series, we will work on the integration between CoreML and MLX, to run the full model on-device. 
In the third part, we will dive deep into the optimizations required to get this model running on the Neural Engine, including quantization and dynamic shapes.\\n\\nMore Articles from our Blog\\n\\nSwift Transformers Reaches 1.0 — and Looks to the Future\\n\\nBy September 26, 2025 • 26\\n\\nWWDC 24: Running Mistral 7B with Core ML\\n\\nBy July 22, 2024 • 62\\n\\nCommunity\\n\\nstopthinking\\n\\n6 days ago\\n\\nnone of the commits links work\\n\\nFL33TW00D-HF\\n\\nArticle author 6 days ago\\n\\nThanks for the flag! Try now! (repo was private)\\n\\nbweng\\n\\n6 days ago\\n\\nJust came across this, we also tried to do a write up on converting silero-vad to Core ML but you guys did a much better job explaining it!\\n\\nhttps://inference.plus/p/teaching-ai-to-optimize-ai-models\\n\\nSign up or log in to comment\\n\\nUpvote\\n\\n28\\n\\nBack to Articles\\n\\nSOTA OCR on-device with Core ML and dots.ocr\\n\\nPublished October 2, 2025\\n\\nUpdate on GitHub\\n\\nUpvote\\n\\n28\\n\\nChristopher Fleetwood\\n\\nFL33TW00D-HF\\n\\nPedro Cuenca\\n\\npcuenq\\n\\nEvery year our hardware is a little more powerful, our models a little smarter for each parameter. In 2025, it is more feasible than ever to run truly competitive models on-device. dots.ocr, a 3B parameter OCR model from RedNote, surpasses Gemini 2.5 Pro in OmniDocBench, making OCR a truly no compromises on-device use case. Running models on-device is certainly appealing to developers: no smuggling API keys, zero cost, and no network required. However, if we want these models to run on-device, we need to be mindful of the limited compute and power budgets.\\n\\nEnter the Neural Engine, Apple\\'s custom AI accelerator that has shipped with every Apple device since 2017. This accelerator is designed for high performance whilst sipping battery power. Some of our testing has found the Neural Engine to be 12x more power efficient than CPU, and 4x more power efficient than GPU.\\n\\nWhilst this all sounds very appealing, unfortunately the Neural Engine is only accessible through Core ML, Apple\\'s closed source ML framework. Furthermore, even just converting a model from PyTorch to Core ML can present some challenges, and without a preconverted model or some knowledge of the sharp edges it can be arduous for developers. Luckily, Apple also offers MLX, a more modern and flexible ML framework that targets the GPU (not the Neural Engine), and can be used in conjunction with Core ML.\\n\\nIn this three part series, we will provide a reasoning trace of how we converted dots.ocr to run on-device, using a combination of CoreML and MLX. This process should be applicable to many other models, and we hope that this will help highlight the ideas and tools needed for developers looking to run their own models on-device.\\n\\nTo follow along, clone the repo. 
You\\'ll need uv and hf installed to run the setup command:\\n\\n./boostrap.sh\\n\\nIf you just want to skip ahead and use the converted model, you can download it here.\\n\\nConversion\\n\\nConverting from PyTorch to CoreML is a two step process:\\n\\nCapturing your PyTorch execution graph (via torch.jit.trace or, the more modern approach of torch.export).\\n\\nCompiling this converted graph to an .mlpackage using coremltools.\\n\\nWhilst we do have a few knobs we can tweak for step 2, most of our control is in step 1, the graph we feed to coremltools.\\n\\nFollowing the programmers litany of make it work, make it right, make it fast, we will first focus on getting the conversion working on GPU, in FLOAT32, and with static shapes. Once we have this working, we can dial down the precision and try and move to the Neural Engine.\\n\\nDots.OCR\\n\\nDots.OCR consists of two key components: A 1.2B parameter vision encoder trained from scratch, based on the NaViT architecture, and a Qwen2.5-1.5B backbone. We will be using CoreML to run the vision encoder, and MLX to run the LM backbone.\\n\\nStep 0: Understand and simplify the model\\n\\nIn order to convert a model, it\\'s best to understand the structure and function before getting started. Looking at the original vision modelling file here, we can see that the vision encoder is similar to the QwenVL family. Like many vision encoders, the vision encoder for dots works on a patch basis, in this case 14x14 patches. The dots vision encoder is capable of processing videos and batches of images. This gives us an opportunity to simplify by only processing a single image at a time. This approach is frequent in on-device apps, where we convert a model that provides the essential functions and iterate if we want to process multiple images.\\n\\nWhen kicking off the conversion process, it\\'s best to start with a minimal viable model. This means removing any bells and whistles that are not strictly necessary for the model to function. In our case, dots has many different attention implementations available for both the vision encoder and the LM backbone. CoreML has lots of infrastructure oriented around the scaled_dot_product_attention operator, which they introduced in iOS 18. We can simplify the model by removing all of the other attention implementations and just focusing on simple sdpa (not the memory efficient variant) for now, commit here.\\n\\nOnce we\\'ve done this, we see a scary warning message when we load the model:\\n\\nSliding Window Attention is enabled but not implemented for `sdpa`; unexpected results may be encountered.\\n\\nThe model doesn\\'t require Sliding Window Attention to function, so we can happily move on.\\n\\nStep 1: A simple harness\\n\\nUsing torch.jit.trace is still the most mature method for converting models to CoreML. We usually encapsulate this in a simple harness that allows you to modify the compute units used and the precision selected.\\n\\nYou can check out the initial harness here. If we run the following on the original code implementation:\\n\\nuv run convert.py --precision FLOAT32 --compute_units CPU_AND_GPU\\n\\nWe should bump into the first (of many) issues.\\n\\nStep 2: Bug hunting\\n\\nIt is rare that a model will convert first time. 
Often, you will need to progressively make changes further and further down the execution graph until you reach the final node.\\n\\nOur first issue is the following error:\\n\\nERROR - converting \\'outer\\' op (located at: \\'vision_tower/rotary_pos_emb/192\\'): In op \"matmul\", when x and y are both non-const, their dtype need to match, but got x as int32 and y as fp32\\n\\nLuckily this error gives us quite a bit of information. We can look at the VisionRotaryEmbedding layer and see the following code:\\n\\ndef forward(self, seqlen: int) -> torch.Tensor: seq = torch.arange(seqlen, device=self.inv_freq.device, dtype=self.inv_freq.dtype) freqs = torch.outer(seq, self.inv_freq) return freqs\\n\\nAlthough torch.arange has a dtype argument, coremltools ignores this for arange and always outputs int32. We can simply add a cast after the arange to fix this issue, commit here.\\n\\nAfter fixing this, running the conversion again leads us to our next issue at repeat_interleave:\\n\\nERROR - converting \\'repeat_interleave\\' op (located at: \\'vision_tower/204\\'): Cannot add const [None]\\n\\nWhilst this error is less informative, we only have a single call to repeat_interleave in our vision encoder:\\n\\ncu_seqlens = torch.repeat_interleave(grid_thw[:, 1] * grid_thw[:, 2], grid_thw[:, 0]).cumsum( dim=0, dtype=grid_thw.dtype if torch.jit.is_tracing() else torch.int32, )\\n\\ncu_seqlens is used for masking variable length sequences in flash_attention_2. It\\'s derived from the grid_thw tensor, which represents time, height and width. Since we are only processing a single image, we can simply remove this call, commit here.\\n\\nOnto the next! This time, we get a more cryptic error:\\n\\nERROR - converting \\'_internal_op_tensor_inplace_fill_\\' op (located at: \\'vision_tower/0/attn/301_internal_tensor_assign_1\\'): _internal_op_tensor_inplace_fill does not support dynamic index\\n\\nThis is again due to the masking logic to handle variable length sequences. Since we are only processing a single image (not a video or batch of images), we don\\'t really need attention masking at all! Therefore, we can just use a mask of all True. To prepare ourselves for the Neural Engine conversion, we also switch from using a boolean mask to a float mask of all zeros, as the Neural Engine does not support bool tensors commit here\\n\\nWith all of this done, the model should now successfully convert to CoreML! However, when we run the model, we get the following error:\\n\\nerror: \\'mps.reshape\\' op the result shape is not compatible with the input shape\\n\\nThis reshape could be in multiple places! Luckily, we can use a previous warning message to help us track down the issue:\\n\\nTracerWarning: Iterating over a tensor might cause the trace to be incorrect. Passing a tensor of different shape won\\'t change the number of iterations executed (and might lead to errors or silently give incorrect results). for t, h, w in grid_thw:\\n\\nMost ML compilers do not like dynamic control flow. Luckily for us, as we are only processing a single image, we can simply remove the loop and process the single h, w pair, commit here.\\n\\nAnd there we have it! If we run the conversion again, we should see that the model successfully converts and matches the original PyTorch precision:\\n\\nMax difference: 0.006000518798828125, Mean difference: 1.100682402466191e-05\\n\\nStep 3: Benchmarking\\n\\nNow that we\\'ve got the model working, let\\'s evaluate the size and performance. 
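As a brief aside before the size and speed numbers: the arange fix described above is a one-line change. A hedged sketch of what the patched rotary embedding could look like follows; the constructor is paraphrased for illustration and is not the repo's exact code.

```python
# Sketch of the arange-dtype workaround: coremltools traces arange as int32
# regardless of the requested dtype, so an explicit cast restores a float tensor
# before torch.outer. Constructor details are illustrative assumptions.
import torch

class VisionRotaryEmbedding(torch.nn.Module):
    def __init__(self, dim: int, theta: float = 10000.0):
        super().__init__()
        inv_freq = 1.0 / (theta ** (torch.arange(0, dim, 2).float() / dim))
        self.register_buffer("inv_freq", inv_freq)

    def forward(self, seqlen: int) -> torch.Tensor:
        seq = torch.arange(seqlen, device=self.inv_freq.device)
        seq = seq.to(self.inv_freq.dtype)  # cast added after arange for Core ML
        return torch.outer(seq, self.inv_freq)
```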
The good news is the model is working, the bad news is that it\\'s over 5GB! This is completely untenable for on device deployment! To benchmark the computation time, we can use the built in XCode tooling by calling:\\n\\nopen DotsOCR_FLOAT32.mlpackage\\n\\nwhich will launch the XCode inspector for the model. After clicking + Performance Report and launching a report on all compute devices, you should see something like the following:\\n\\nOver a second for a single forward pass of the vision encoder! We have lots of more work.\\n\\nIn the second part of this series, we will work on the integration between CoreML and MLX, to run the full model on-device. In the third part, we will dive deep into the optimizations required to get this model running on the Neural Engine, including quantization and dynamic shapes.\\n\\nMore Articles from our Blog\\n\\nSwift Transformers Reaches 1.0 — and Looks to the Future\\n\\nBy September 26, 2025 • 26\\n\\nWWDC 24: Running Mistral 7B with Core ML\\n\\nBy July 22, 2024 • 62\\n\\nCommunity\\n\\nstopthinking\\n\\n6 days ago\\n\\nnone of the commits links work\\n\\nFL33TW00D-HF\\n\\nArticle author 6 days ago\\n\\nThanks for the flag! Try now! (repo was private)\\n\\nbweng\\n\\n6 days ago\\n\\nJust came across this, we also tried to do a write up on converting silero-vad to Core ML but you guys did a much better job explaining it!\\n\\nhttps://inference.plus/p/teaching-ai-to-optimize-ai-models\\n\\nSign up or log in to comment\\n\\nUpvote\\n\\n28\\n\\nBack to Articles\\n\\nSOTA OCR on-device with Core ML and dots.ocr\\n\\nPublished October 2, 2025\\n\\nUpdate on GitHub\\n\\nUpvote\\n\\n28\\n\\nChristopher Fleetwood\\n\\nFL33TW00D-HF\\n\\nPedro Cuenca\\n\\npcuenq\\n\\nEvery year our hardware is a little more powerful, our models a little smarter for each parameter. In 2025, it is more feasible than ever to run truly competitive models on-device. dots.ocr, a 3B parameter OCR model from RedNote, surpasses Gemini 2.5 Pro in OmniDocBench, making OCR a truly no compromises on-device use case. Running models on-device is certainly appealing to developers: no smuggling API keys, zero cost, and no network required. However, if we want these models to run on-device, we need to be mindful of the limited compute and power budgets.\\n\\nEnter the Neural Engine, Apple\\'s custom AI accelerator that has shipped with every Apple device since 2017. This accelerator is designed for high performance whilst sipping battery power. Some of our testing has found the Neural Engine to be 12x more power efficient than CPU, and 4x more power efficient than GPU.\\n\\nWhilst this all sounds very appealing, unfortunately the Neural Engine is only accessible through Core ML, Apple\\'s closed source ML framework. Furthermore, even just converting a model from PyTorch to Core ML can present some challenges, and without a preconverted model or some knowledge of the sharp edges it can be arduous for developers. Luckily, Apple also offers MLX, a more modern and flexible ML framework that targets the GPU (not the Neural Engine), and can be used in conjunction with Core ML.\\n\\nIn this three part series, we will provide a reasoning trace of how we converted dots.ocr to run on-device, using a combination of CoreML and MLX. This process should be applicable to many other models, and we hope that this will help highlight the ideas and tools needed for developers looking to run their own models on-device.\\n\\nTo follow along, clone the repo. 
The good news is the model is working, the bad news is that it\\'s over 5GB! This is completely untenable for on device deployment! To benchmark the computation time, we can use the built in XCode tooling by calling:\\n\\nopen DotsOCR_FLOAT32.mlpackage\\n\\nwhich will launch the XCode inspector for the model. After clicking + Performance Report and launching a report on all compute devices, you should see something like the following:\\n\\nOver a second for a single forward pass of the vision encoder! We have lots of more work.\\n\\nIn the second part of this series, we will work on the integration between CoreML and MLX, to run the full model on-device. In the third part, we will dive deep into the optimizations required to get this model running on the Neural Engine, including quantization and dynamic shapes.\\n\\nMore Articles from our Blog\\n\\nSwift Transformers Reaches 1.0 — and Looks to the Future\\n\\nBy September 26, 2025 • 26\\n\\nWWDC 24: Running Mistral 7B with Core ML\\n\\nBy July 22, 2024 • 62\\n\\nCommunity\\n\\nstopthinking\\n\\n6 days ago\\n\\nnone of the commits links work\\n\\nFL33TW00D-HF\\n\\nArticle author 6 days ago\\n\\nThanks for the flag! Try now! (repo was private)\\n\\nbweng\\n\\n6 days ago\\n\\nJust came across this, we also tried to do a write up on converting silero-vad to Core ML but you guys did a much better job explaining it!\\n\\nhttps://inference.plus/p/teaching-ai-to-optimize-ai-models\\n\\nSign up or log in to comment\\n\\nUpvote\\n\\n28', 'text_length': 29671, 'message': 'Retrieved 6 pages for blog_dots-ocr-ne_20251009_165053.html'}\u001b[0m" + ] + }, + { + "name": "stderr", + "output_type": "stream", + "text": [ + "INFO: HTTP Request: POST https://api.openai.com/v1/chat/completions \"HTTP/1.1 200 OK\"\n" + ] + }, + { + "name": "stdout", + "output_type": "stream", + "text": [ + "\u001b[32;1m\u001b[1;3m\n", + "Invoking: `generate_summary` with `Back to Articles\n", + "\n", + "SOTA OCR on-device with Core ML and dots.ocr\n", + "\n", + "Published October 2, 2025\n", + "\n", + "Update on GitHub\n", + "\n", + "Upvote\n", + "\n", + "28\n", + "\n", + "Christopher Fleetwood\n", + "\n", + "FL33TW00D-HF\n", + "\n", + "Pedro Cuenca\n", + "\n", + "pcuenq\n", + "\n", + "Every year our hardware is a little more powerful, our models a little smarter for each parameter. In 2025, it is more feasible than ever to run truly competitive models on-device. dots.ocr, a 3B parameter OCR model from RedNote, surpasses Gemini 2.5 Pro in OmniDocBench, making OCR a truly no compromises on-device use case. Running models on-device is certainly appealing to developers: no smuggling API keys, zero cost, and no network required. However, if we want these models to run on-device, we need to be mindful of the limited compute and power budgets.\n", + "\n", + "Enter the Neural Engine, Apple's custom AI accelerator that has shipped with every Apple device since 2017. This accelerator is designed for high performance whilst sipping battery power. Some of our testing has found the Neural Engine to be 12x more power efficient than CPU, and 4x more power efficient than GPU.\n", + "\n", + "Whilst this all sounds very appealing, unfortunately the Neural Engine is only accessible through Core ML, Apple's closed source ML framework. Furthermore, even just converting a model from PyTorch to Core ML can present some challenges, and without a preconverted model or some knowledge of the sharp edges it can be arduous for developers. 
Luckily, Apple also offers MLX, a more modern and flexible ML framework that targets the GPU (not the Neural Engine), and can be used in conjunction with Core ML.\n", + "\n", + "In this three part series, we will provide a reasoning trace of how we converted dots.ocr to run on-device, using a combination of CoreML and MLX. This process should be applicable to many other models, and we hope that this will help highlight the ideas and tools needed for developers looking to run their own models on-device.\n", + "\n", + "To follow along, clone the repo. You'll need uv and hf installed to run the setup command:\n", + "\n", + "./boostrap.sh\n", + "\n", + "If you just want to skip ahead and use the converted model, you can download it here.\n", + "\n", + "Conversion\n", + "\n", + "Converting from PyTorch to CoreML is a two step process:\n", + "\n", + "Capturing your PyTorch execution graph (via torch.jit.trace or, the more modern approach of torch.export).\n", + "\n", + "Compiling this converted graph to an .mlpackage using coremltools.\n", + "\n", + "Whilst we do have a few knobs we can tweak for step 2, most of our control is in step 1, the graph we feed to coremltools.\n", + "\n", + "Following the programmers litany of make it work, make it right, make it fast, we will first focus on getting the conversion working on GPU, in FLOAT32, and with static shapes. Once we have this working, we can dial down the precision and try and move to the Neural Engine.\n", + "\n", + "Dots.OCR\n", + "\n", + "Dots.OCR consists of two key components: A 1.2B parameter vision encoder trained from scratch, based on the NaViT architecture, and a Qwen2.5-1.5B backbone. We will be using CoreML to run the vision encoder, and MLX to run the LM backbone.\n", + "\n", + "Step 0: Understand and simplify the model\n", + "\n", + "In order to convert a model, it's best to understand the structure and function before getting started. Looking at the original vision modelling file here, we can see that the vision encoder is similar to the QwenVL family. Like many vision encoders, the vision encoder for dots works on a patch basis, in this case 14x14 patches. The dots vision encoder is capable of processing videos and batches of images. This gives us an opportunity to simplify by only processing a single image at a time. This approach is frequent in on-device apps, where we convert a model that provides the essential functions and iterate if we want to process multiple images.\n", + "\n", + "When kicking off the conversion process, it's best to start with a minimal viable model. This means removing any bells and whistles that are not strictly necessary for the model to function. In our case, dots has many different attention implementations available for both the vision encoder and the LM backbone. CoreML has lots of infrastructure oriented around the scaled_dot_product_attention operator, which they introduced in iOS 18. 
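To make the sdpa-only path concrete, here is a hedged sketch of single-image attention built on `scaled_dot_product_attention` with the all-zeros additive float mask that replaces the boolean mask later in the walkthrough; the tensor shapes are illustrative assumptions, not the model's actual layout.

```python
# Hedged sketch: single-image attention via scaled_dot_product_attention with an
# additive float mask of zeros ("attend to everything"), avoiding bool tensors
# that the Neural Engine cannot consume. Shapes are illustrative assumptions.
import torch
import torch.nn.functional as F

def single_image_attention(q: torch.Tensor, k: torch.Tensor, v: torch.Tensor) -> torch.Tensor:
    # q, k, v: (batch=1, num_heads, seq_len, head_dim)
    zero_mask = torch.zeros(q.shape[0], 1, q.shape[2], k.shape[2], dtype=q.dtype)
    return F.scaled_dot_product_attention(q, k, v, attn_mask=zero_mask)
```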
We can simplify the model by removing all of the other attention implementations and just focusing on simple sdpa (not the memory efficient variant) for now, commit here.\n", + "\n", + "Once we've done this, we see a scary warning message when we load the model:\n", + "\n", + "Sliding Window Attention is enabled but not implemented for `sdpa`; unexpected results may be encountered.\n", + "\n", + "The model doesn't require Sliding Window Attention to function, so we can happily move on.\n", + "\n", + "Step 1: A simple harness\n", + "\n", + "Using torch.jit.trace is still the most mature method for converting models to CoreML. We usually encapsulate this in a simple harness that allows you to modify the compute units used and the precision selected.\n", + "\n", + "You can check out the initial harness here. If we run the following on the original code implementation:\n", + "\n", + "uv run convert.py --precision FLOAT32 --compute_units CPU_AND_GPU\n", + "\n", + "We should bump into the first (of many) issues.\n", + "\n", + "Step 2: Bug hunting\n", + "\n", + "It is rare that a model will convert first time. Often, you will need to progressively make changes further and further down the execution graph until you reach the final node.\n", + "\n", + "Our first issue is the following error:\n", + "\n", + "ERROR - converting 'outer' op (located at: 'vision_tower/rotary_pos_emb/192'): In op \"matmul\", when x and y are both non-const, their dtype need to match, but got x as int32 and y as fp32\n", + "\n", + "Luckily this error gives us quite a bit of information. We can look at the VisionRotaryEmbedding layer and see the following code:\n", + "\n", + "def forward(self, seqlen: int) -> torch.Tensor: seq = torch.arange(seqlen, device=self.inv_freq.device, dtype=self.inv_freq.dtype) freqs = torch.outer(seq, self.inv_freq) return freqs\n", + "\n", + "Although torch.arange has a dtype argument, coremltools ignores this for arange and always outputs int32. We can simply add a cast after the arange to fix this issue, commit here.\n", + "\n", + "After fixing this, running the conversion again leads us to our next issue at repeat_interleave:\n", + "\n", + "ERROR - converting 'repeat_interleave' op (located at: 'vision_tower/204'): Cannot add const [None]\n", + "\n", + "Whilst this error is less informative, we only have a single call to repeat_interleave in our vision encoder:\n", + "\n", + "cu_seqlens = torch.repeat_interleave(grid_thw[:, 1] * grid_thw[:, 2], grid_thw[:, 0]).cumsum( dim=0, dtype=grid_thw.dtype if torch.jit.is_tracing() else torch.int32, )\n", + "\n", + "cu_seqlens is used for masking variable length sequences in flash_attention_2. It's derived from the grid_thw tensor, which represents time, height and width. Since we are only processing a single image, we can simply remove this call, commit here.\n", + "\n", + "Onto the next! This time, we get a more cryptic error:\n", + "\n", + "ERROR - converting '_internal_op_tensor_inplace_fill_' op (located at: 'vision_tower/0/attn/301_internal_tensor_assign_1'): _internal_op_tensor_inplace_fill does not support dynamic index\n", + "\n", + "This is again due to the masking logic to handle variable length sequences. Since we are only processing a single image (not a video or batch of images), we don't really need attention masking at all! Therefore, we can just use a mask of all True. 
To prepare ourselves for the Neural Engine conversion, we also switch from using a boolean mask to a float mask of all zeros, as the Neural Engine does not support bool tensors commit here\n", + "\n", + "With all of this done, the model should now successfully convert to CoreML! However, when we run the model, we get the following error:\n", + "\n", + "error: 'mps.reshape' op the result shape is not compatible with the input shape\n", + "\n", + "This reshape could be in multiple places! Luckily, we can use a previous warning message to help us track down the issue:\n", + "\n", + "TracerWarning: Iterating over a tensor might cause the trace to be incorrect. Passing a tensor of different shape won't change the number of iterations executed (and might lead to errors or silently give incorrect results). for t, h, w in grid_thw:\n", + "\n", + "Most ML compilers do not like dynamic control flow. Luckily for us, as we are only processing a single image, we can simply remove the loop and process the single h, w pair, commit here.\n", + "\n", + "And there we have it! If we run the conversion again, we should see that the model successfully converts and matches the original PyTorch precision:\n", + "\n", + "Max difference: 0.006000518798828125, Mean difference: 1.100682402466191e-05\n", + "\n", + "Step 3: Benchmarking\n", + "\n", + "Now that we've got the model working, let's evaluate the size and performance. The good news is the model is working, the bad news is that it's over 5GB! This is completely untenable for on device deployment! To benchmark the computation time, we can use the built in XCode tooling by calling:\n", + "\n", + "open DotsOCR_FLOAT32.mlpackage\n", + "\n", + "which will launch the XCode inspector for the model. After clicking + Performance Report and launching a report on all compute devices, you should see something like the following:\n", + "\n", + "Over a second for a single forward pass of the vision encoder! We have lots of more work.\n", + "\n", + "In the second part of this series, we will work on the integration between CoreML and MLX, to run the full model on-device. In the third part, we will dive deep into the optimizations required to get this model running on the Neural Engine, including quantization and dynamic shapes.\n", + "\n", + "More Articles from our Blog\n", + "\n", + "Swift Transformers Reaches 1.0 — and Looks to the Future\n", + "\n", + "By September 26, 2025 • 26\n", + "\n", + "WWDC 24: Running Mistral 7B with Core ML\n", + "\n", + "By July 22, 2024 • 62\n", + "\n", + "Community\n", + "\n", + "stopthinking\n", + "\n", + "6 days ago\n", + "\n", + "none of the commits links work\n", + "\n", + "FL33TW00D-HF\n", + "\n", + "Article author 6 days ago\n", + "\n", + "Thanks for the flag! Try now! 
(repo was private)\n", + "\n", + "bweng\n", + "\n", + "6 days ago\n", + "\n", + "Just came across this, we also tried to do a write up on converting silero-vad to Core ML but you guys did a much better job explaining it!\n", + "\n", + "https://inference.plus/p/teaching-ai-to-optimize-ai-models\n", + "\n", + "Sign up or log in to comment\n", + "\n", + "Upvote\n", + "\n", + "28\n", + "\n", + "Back to Articles\n", + "\n", + "SOTA OCR on-device with Core ML and dots.ocr\n", + "\n", + "Published October 2, 2025\n", + "\n", + "Update on GitHub\n", + "\n", + "Upvote\n", + "\n", + "28\n", + "\n", + "Christopher Fleetwood\n", + "\n", + "FL33TW00D-HF\n", + "\n", + "Pedro Cuenca\n", + "\n", + "pcuenq\n", + "\n", + "Every year our hardware is a little more powerful, our models a little smarter for each parameter. In 2025, it is more feasible than ever to run truly competitive models on-device. dots.ocr, a 3B parameter OCR model from RedNote, surpasses Gemini 2.5 Pro in OmniDocBench, making OCR a truly no compromises on-device use case. Running models on-device is certainly appealing to developers: no smuggling API keys, zero cost, and no network required. However, if we want these models to run on-device, we need to be mindful of the limited compute and power budgets.\n", + "\n", + "Enter the Neural Engine, Apple's custom AI accelerator that has shipped with every Apple device since 2017. This accelerator is designed for high performance whilst sipping battery power. Some of our testing has found the Neural Engine to be 12x more power efficient than CPU, and 4x more power efficient than GPU.\n", + "\n", + "Whilst this all sounds very appealing, unfortunately the Neural Engine is only accessible through Core ML, Apple's closed source ML framework. Furthermore, even just converting a model from PyTorch to Core ML can present some challenges, and without a preconverted model or some knowledge of the sharp edges it can be arduous for developers. Luckily, Apple also offers MLX, a more modern and flexible ML framework that targets the GPU (not the Neural Engine), and can be used in conjunction with Core ML.\n", + "\n", + "In this three part series, we will provide a reasoning trace of how we converted dots.ocr to run on-device, using a combination of CoreML and MLX. This process should be applicable to many other models, and we hope that this will help highlight the ideas and tools needed for developers looking to run their own models on-device.\n", + "\n", + "To follow along, clone the repo. You'll need uv and hf installed to run the setup command:\n", + "\n", + "./boostrap.sh\n", + "\n", + "If you just want to skip ahead and use the converted model, you can download it here.\n", + "\n", + "Conversion\n", + "\n", + "Converting from PyTorch to CoreML is a two step process:\n", + "\n", + "Capturing your PyTorch execution graph (via torch.jit.trace or, the more modern approach of torch.export).\n", + "\n", + "Compiling this converted graph to an .mlpackage using coremltools.\n", + "\n", + "Whilst we do have a few knobs we can tweak for step 2, most of our control is in step 1, the graph we feed to coremltools.\n", + "\n", + "Following the programmers litany of make it work, make it right, make it fast, we will first focus on getting the conversion working on GPU, in FLOAT32, and with static shapes. 
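For readers who want a concrete picture of the harness just described, here is a minimal sketch of a trace-and-convert script under stated assumptions: the model argument, input shape, and output file name are illustrative placeholders, not the repo's actual `convert.py`.

```python
# Hedged sketch of a trace-and-convert harness (illustrative only; the model
# wrapper, input shape, and file names are assumptions, not the repo's code).
import torch
import coremltools as ct

def convert(vision_encoder: torch.nn.Module,
            precision: str = "FLOAT32",
            compute_units: str = "CPU_AND_GPU") -> ct.models.MLModel:
    # Step 1: capture the PyTorch execution graph with a single static-shape image.
    example_pixels = torch.rand(1, 3, 896, 896)
    traced = torch.jit.trace(vision_encoder.eval(), example_pixels)

    # Step 2: compile the traced graph to an .mlpackage with coremltools.
    mlmodel = ct.convert(
        traced,
        inputs=[ct.TensorType(name="pixel_values", shape=example_pixels.shape)],
        compute_precision=getattr(ct.precision, precision),
        compute_units=getattr(ct.ComputeUnit, compute_units),
        minimum_deployment_target=ct.target.iOS18,
    )
    mlmodel.save(f"DotsOCR_{precision}.mlpackage")
    return mlmodel
```

Starting with FLOAT32 on CPU_AND_GPU keeps the numerics directly comparable to PyTorch; precision and compute units only get dialled down once the conversion is known to work.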
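The removal of the dynamic `for t, h, w in grid_thw:` loop can also be sketched. The version below assumes a single image per call and computes static patch position indices; shapes and names are illustrative, not the repo's implementation.

```python
# Hedged sketch of replacing the dynamic loop over grid_thw with a single
# static (h, w) pair, since only one image is processed per call.
import torch

def rot_pos_ids(grid_thw: torch.Tensor) -> torch.Tensor:
    # grid_thw: (1, 3) tensor holding (t, h, w) for exactly one image
    h = int(grid_thw[0, 1])
    w = int(grid_thw[0, 2])
    hpos = torch.arange(h).unsqueeze(1).expand(h, w)  # row index per patch
    wpos = torch.arange(w).unsqueeze(0).expand(h, w)  # column index per patch
    return torch.stack([hpos, wpos], dim=-1).reshape(h * w, 2)
```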
In the third part, we will dive deep into the optimizations required to get this model running on the Neural Engine, including quantization and dynamic shapes.\n", + "\n", + "More Articles from our Blog\n", + "\n", + "Swift Transformers Reaches 1.0 — and Looks to the Future\n", + "\n", + "By September 26, 2025 • 26\n", + "\n", + "WWDC 24: Running Mistral 7B with Core ML\n", + "\n", + "By July 22, 2024 • 62\n", + "\n", + "Community\n", + "\n", + "stopthinking\n", + "\n", + "6 days ago\n", + "\n", + "none of the commits links work\n", + "\n", + "FL33TW00D-HF\n", + "\n", + "Article author 6 days ago\n", + "\n", + "Thanks for the flag! Try now! (repo was private)\n", + "\n", + "bweng\n", + "\n", + "6 days ago\n", + "\n", + "Just came across this, we also tried to do a write up on converting silero-vad to Core ML but you guys did a much better job explaining it!\n", + "\n", + "https://inference.plus/p/teaching-ai-to-optimize-ai-models\n", + "\n", + "Sign up or log in to comment\n", + "\n", + "Upvote\n", + "\n", + "28`\n", + "\n", + "\n", + "\u001b[0m" + ] + }, + { + "name": "stderr", + "output_type": "stream", + "text": [ + "INFO: HTTP Request: POST https://api.openai.com/v1/chat/completions \"HTTP/1.1 200 OK\"\n" + ] + }, + { + "name": "stdout", + "output_type": "stream", + "text": [ + "\u001b[33;1m\u001b[1;3m{'status': 'success', 'summary': \"In 2025, the feasibility of running competitive OCR models on-device has significantly improved, exemplified by RedNote's dots.ocr, a 3 billion parameter model that outperforms Gemini 2.5 Pro in the OmniDocBench. This advancement allows for OCR applications without network dependency, API costs, or data privacy concerns. Apple's Neural Engine, a custom AI accelerator, enhances on-device performance by being 12x more power-efficient than CPUs and 4x more than GPUs. However, the Neural Engine is only accessible via Apple's closed-source Core ML framework, which poses challenges for developers converting models from PyTorch. To address this, Apple offers MLX, a flexible ML framework targeting GPUs, which can be used alongside Core ML. The article outlines a conversion process for dots.ocr using CoreML and MLX, demonstrating the potential for broader application in on-device model deployment. Despite successful conversion, the model's initial size of over 5GB highlights the need for further optimization for practical deployment. Future parts of the series will explore integration and optimization strategies, including quantization and dynamic shapes, to enhance performance on the Neural Engine.\", 'length': 1209, 'message': 'Generated summary (1209 characters)'}\u001b[0m" + ] + }, + { + "name": "stderr", + "output_type": "stream", + "text": [ + "INFO: HTTP Request: POST https://api.openai.com/v1/chat/completions \"HTTP/1.1 200 OK\"\n" + ] + }, + { + "name": "stdout", + "output_type": "stream", + "text": [ + "\u001b[32;1m\u001b[1;3m\n", + "Invoking: `get_document_text` with `2510v07317v1.pdf`\n", + "\n", + "\n", + "\u001b[0m\u001b[33;1m\u001b[1;3m{'status': 'success', 'filename': '2510v07317v1.pdf', 'pages': 122, 'text': '5 2 0 2 t c O 8 ] V C . s c [ 1 v 7 1 3 7 0 . 
0 1 5 2 : v i X r\\n\\na\\n\\nQuantum-enhanced Computer Vision: Going Beyond Classical Algorithms\\n\\nNatacha Kuete Meli1 Tat-Jun Chin2\\n\\nTolga Birdal3\\n\\nShuteng Wang4 Marcel Seelbach Benkner1 Michele Sasdelli2\\n\\nVladislav Golyanik4\\n\\nMichael Moeller1\\n\\nniversity of Siegen\\n\\n2University of Adelaide\\n\\n3Imperial College London\\n\\nAMPI for Informatics\\n\\nAbstract—Quantum-enhanced Computer Vision (QeCV) is a new research field at the intersection of computer vision, optimisation theory, machine learning and quantum computing. It has high potential to transform how visual signals are processed and interpreted with the help of quantum computing that leverages quantum-mechanical effects in computations inaccessible to classical (i.e. non-quantum) computers. In scenarios where existing non-quantum methods cannot find a solution in a reasonable time or compute only approximate solutions, quantum computers can provide, among others, advantages in terms of better time scalability for multiple problem classes. Parametrised quantum circuits can also become, in the long term, a considerable alternative to classical neural networks in computer vision. However, specialised and fundamentally new algorithms must be developed to enable compatibility with quantum hardware and unveil the potential of quantum computational paradigms in computer vision. This survey contributes to the existing literature on QeCV with a holistic review of this research field. It is designed as a quantum computing reference for the computer vision community, targeting computer vision students, scientists and readers with related backgrounds who want to familiarise themselves with QeCV. We provide a comprehensive introduction to QeCV, its specifics, and methodologies for formulations compatible with quantum hardware and QeCV methods, leveraging two main quantum computational paradigms, i.e. gate-based quantum computing and quantum annealing. We elaborate on the operational principles of quantum computers and the available tools to access, program and simulate them in the context of QeCV. Finally, we review existing quantum computing tools and learning materials and discuss aspects related to publishing and reviewing QeCV papers, open challenges and potential social implications.\\n\\nIndex Terms—Quantum-enhanced Computer Vision, Applied Quantum Computing, Quantum Algorithms.\\n\\n✦\\n\\n1 INTRODUCTION\\n\\nIn addition, theoretical guarantees of classical solvers are only locally optimal [6], and as such, may not be sufficient for certain problems.\\n\\nHence, in many cases, progress on the algorithmic side is a valuable alternative to increasing demands in storage and computational resources. Such algorithmic improvements can be achieved in different ways, e.g. by approximative for- mulations and solutions instead of exact but computation- ally expensive or intractable ones. This survey investigates a specific type of those, namely related to how quantum computing can be leveraged in computer vision.\\n\\nQuantum Computing at a Glance. Since the proposal of quantum computers in the 1980s [7], [8], [9], substantial progress has been made in their practical experimental realisations. The new computational paradigm inspired a multitude of works on theoretical foundations of quantum computing (computer science) [10], [11] and quantum hard- ware realisations (physics) [12]. 
Researchers have been ac- tively working on various qubit technologies for the last 25 years, and quantum computers based on them are available now for research purposes. Such machines and hardware that are and will become available in the near future— with up to several hundred qubits—are often called Noisy Intermediate-scale Quantum (NISQ) computers [13]. In 2019, a quantum computer, Google’s Sycamore, was able to perform a particular calculation tens of orders of magnitude quicker than a classical computer [14].\\n\\nQuantum computation fundamentally revolves around\\n\\nC Omputer Vision (CV) studies automatic processing of visual and spatial information. Data representing such information is acquired in the form of 2D images, videos, depth maps, 3D point clouds, and different combinations of these inputs, possibly along with other sensory signals (e.g. data from inertial measurement units) [1], [2]. The long- term aim of computer vision is to come up with intelligent, high-level interpretations of the observed scenes, inspired by the capabilities of the Human Visual System (HVS). Today, CV and Machine Learning (ML) constantly inform each other: modern CV strongly relies on ML techniques while developments in ML are often driven by unsolved problems in CV [3].\\n\\nMany modern CV systems are inherently complex: They include multiple components and process large data vol- umes (e.g. during neural network training) [4], [5]. Pri- mary tools of computer vision, such as deep learning, have started to demand unreasonable and hard-to-satisfy GPU resources when classical computation is concerned. Neural architectures in many areas of CV are becoming increasingly larger, parameter-heavy and require more and more time to converge during training. Moreover, many CV problems contain combinatorial objectives that cannot be solved by an exhaustive search in a reasonable time. As a consequence, they require hardware that is able to fulfil high data pro- cessing demands (e.g. graphical processing units (GPUs)).\\n\\n5 2 0 2 t c O 8 ] V C . s c [ 1 v 7 1 3 7 0 . 0 1 5 2 : v i X r\\n\\na\\n\\nQuantum-enhanced Computer Vision: Going Beyond Classical Algorithms\\n\\nNatacha Kuete Meli1 Tat-Jun Chin2\\n\\nTolga Birdal3\\n\\nShuteng Wang4 Marcel Seelbach Benkner1 Michele Sasdelli2\\n\\nVladislav Golyanik4\\n\\nMichael Moeller1\\n\\nniversity of Siegen\\n\\n2University of Adelaide\\n\\n3Imperial College London\\n\\nAMPI for Informatics\\n\\nAbstract—Quantum-enhanced Computer Vision (QeCV) is a new research field at the intersection of computer vision, optimisation theory, machine learning and quantum computing. It has high potential to transform how visual signals are processed and interpreted with the help of quantum computing that leverages quantum-mechanical effects in computations inaccessible to classical (i.e. non-quantum) computers. In scenarios where existing non-quantum methods cannot find a solution in a reasonable time or compute only approximate solutions, quantum computers can provide, among others, advantages in terms of better time scalability for multiple problem classes. Parametrised quantum circuits can also become, in the long term, a considerable alternative to classical neural networks in computer vision. However, specialised and fundamentally new algorithms must be developed to enable compatibility with quantum hardware and unveil the potential of quantum computational paradigms in computer vision. 
Pri- mary tools of computer vision, such as deep learning, have started to demand unreasonable and hard-to-satisfy GPU resources when classical computation is concerned. Neural architectures in many areas of CV are becoming increasingly larger, parameter-heavy and require more and more time to converge during training. Moreover, many CV problems contain combinatorial objectives that cannot be solved by an exhaustive search in a reasonable time. As a consequence, they require hardware that is able to fulfil high data pro- cessing demands (e.g. graphical processing units (GPUs)).\\n\\nevolving quantum systems into distinct states using quan- tum mechanical effects. To harness these effects for prac- tical computation, two primary paradigms, governed by two physical principles, have emerged: Adiabatic Quantum Computing (AQC) [11], [15] and gate-based quantum com- puting [10], [16]. AQC relies on the smooth evolution or annealing of a so-called Hamiltonian to guide the system toward its lowest-energy state, making it naturally suited for optimisation problems, particularly those expressed in the Quadratic Unconstrained Binary Optimisation (QUBO) form. In contrast, gate-based quantum computing employs discrete unitary transformations, offering greater flexibility in algorithm design. Together, these paradigms define mod- ern quantum computation, each with distinct advantages that depend on the specific problem domain.\\n\\nBroadly speaking, Quantum Computing (QC) allows designing algorithms to solve several classes of computa- tionally challenging problems with possible computational gains depending on the type of QC and the problem. The type of computational gains can range from improvements in asymptotic complexity (compared to the best classical counterparts) [17] to the acceleration of computations in absolute terms (in the same complexity class) [18]. Quantum computers operate on qubits, counterparts of classical bits that leverage quantum effects. These qubits abstractly span a Hilbert space, where computation takes place. In theory, a quantum computer can perform everything a classical com- puter can perform and vice versa. However, the correspond- ing Hilbert space of multiple qubits is exponentially large (2n-dimensional for n qubits), due to the tensor product structure of quantum mechanics, and so-called entangled states where the qubits cannot be described separately. Classical computation of the corresponding exponentially large operators would, of course, be highly time-consuming.\\n\\nAs of 2023-2025, we observe a transition in the field thanks to the proliferation of real quantum hardware: If previously (pre-2015), quantum hardware was accessible predominantly to researchers developing it, multiple quan- tum and quantum-inspired computers (e.g. adiabatic, gate- based machines, photonic machines, quantum simulators and quantum-inspired optimisers) can nowadays be ac- cessed by researchers from different fields, and the devel- oped methods can be tested on real quantum hardware. This is in contrast to the vast majority of papers on quantum algorithms published before, including highly influential ones that have been tested on very small problems on real quantum hardware so far [19], [20], [21]. For the next two decades, experts predict a super-linear increase in the number of qubits [22] and substantial improvements in the properties of the individual qubits (such as their decoher- ence and connectivity properties). 
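As a toy illustration of the QUBO form that adiabatic machines consume, the sketch below minimises a two-variable quadratic over binary assignments by brute force; the coefficient matrix is invented for illustration and is not taken from the survey.

```python
# Toy QUBO: minimise x^T Q x over binary x by brute force. On an annealer the
# same Q would instead be embedded onto the hardware and sampled.
import itertools
import numpy as np

Q = np.array([[-1.0, 2.0],
              [ 0.0, -1.0]])  # illustrative upper-triangular QUBO coefficients

best = min(itertools.product([0, 1], repeat=2),
           key=lambda x: np.array(x) @ Q @ np.array(x))
print(best)  # -> (0, 1); (1, 0) is degenerate at the same energy of -1.0
```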
We argue that these pre- dictions should be taken seriously, because the investments in quantum computing and algorithm development are sub- stantial; many national and international initiatives related to quantum computing were brought into being within the last several years. Quantum computing power on selected algorithms scales super-linearly (up to exponentially) with the number of qubits. Almost all the big technology com- panies including Intel, IBM, Google, Microsoft, Amazon, NVIDIA and D-Wave are aiming at a steady increase in the number of qubits and are investing heavily in quantum\\n\\ntechnology, as the limits of Moore’s law is approached1. For example, Google has publicly announced its goal to build a commercial quantum computer composed of 1M qubits by 20292. The Willow chip achieving successful quantum error correction [24] constitutes an encouraging milestone for quantum computing.\\n\\nIn response to the aforementioned algorithmic chal- lenges in computer vision and the opportunities of quan- tum computing, computer vision researchers started turning their attention to this new (for the research field) com- putational paradigm. From both theoretical and practical perspectives, it is both desirable and interesting to inves- tigate new (quantum) algorithms for long-standing (vision) problems. Based on different principles than before, such algorithms can bring previously unexpected advantages and new properties to computer vision systems.\\n\\nTo make the best use of the strongly increasing quantum computing power, we need to make the best use of quantum machines, and we need QeCV algorithms to be ready and scalable when practical quantum computing arrives. In the long term, Quantum Processing Units (QPUs) promise to extend the available arsenal of reliable computer vision tools and computational accelerators (with GPUs being an example of currently widely-used technology).\\n\\nFig. 1 provides an overview of different computer vision problems and quantum computational paradigms that can be used to address them, adiabatic and gate-based quantum computing; it also highlights the common steps of every quantum-compatible computer vision approach (e.g. prob- lem embedding to the hardware, annealing or execution of quantum gates, and, finally, read-out of the solution). We discuss both quantum computational paradigms in Sec. 2.\\n\\n1.1 Computer Vision meets Quantum Computing\\n\\nIn the broad sense, Quantum-enhanced Computer Vision (QeCV) encompasses computer vision methods and systems ex- ecuted entirely or partially on quantum hardware. The term “en- hanced” refers to the fact that QeCV methods include clas- sical parts to different degrees (e.g. data encoding, weight matrix preparation, classical neural network parts in hybrid quantum-classical neural architectures) and are boosted (or enhanced) through quantum computations. Hence, modern QeCV methods are hybrid and designed as an interplay between classical and quantum parts. QeCV falls into the category of applied quantum computing, perhaps among the first such disciplines across the fields.\\n\\nThe goal of QeCV is the development of innovative computer vision techniques (improved or funda- mentally new ones) leveraging quantum computa- tional paradigms and surpassing classical methods in terms of processing speed, required resources, ac- curacy or the ability to learn patterns from complex visual data.\\n\\n1. Researchers even started to use the term Neven’s Law [23], i.e. 
referring to the doubly-exponential growth in quantum compute power.

2. https://quantumai.google/learn/map
Fig. 1: Quantum-enhanced Computer Vision. (A): First, a target problem must be formulated in a form consumable by modern quantum machines, e.g. as a QUBO problem for AQC devices or as a gate sequence for gate-based QC. This operation is performed on a host (classical CPU). (B): In AQC, the resulting QUBO defines a logical problem—binary variables that become qubits during optimisation on an idealised quantum annealer with full qubit connectivity. Alternatively, gate-based QC uses a gate sequence to drive the system into a solution-encoding state. (C): To run on a quantum computer with limited connectivity, a logical problem must be minor-embedded or transpiled. During this mapping step, each logical qubit is assigned to one or more physical qubits to match hardware constraints. (D): An AQC device performs annealing for computation, while a gate-based QC device alternatively executes the gates that describe the algorithm. Adiabatic computers leverage the quantum mechanical effects of superposition and tunnelling to find optima of QUBOs. Gate-based computers can additionally harness entanglement and interference to speed up computations, surpassing the capabilities of classical ones. (E): Measured qubit values are unembedded from the hardware and aggregated in the AQC paradigm, or directly read out in gate-based QC. The measurement is repeated several times, and a solution distribution is returned to the host. The bit-strings are processed and interpreted in terms of the original problem. Image sources, if applicable (from left to right and top to bottom in each step): (A): [25, IEEE ©2025], [26, IEEE ©2025], [27, IEEE ©2025], [28], [29, Springer Nature ©2025], [30, reproduced under the dataset's academic license], [31, Springer Nature ©2025], [32, Springer Nature ©2025], (B): [25, IEEE ©2025], [33], [34, IEEE ©2025], (C): [35, Springer Nature ©2025], [36, IonQ ©2025], (D): [37, reproduced under the CC BY-SA 4.0 license], [38, reproduced under the CC BY 3.0 license].
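The steps (A)-(E) above can be mimicked end to end on a toy scale without any quantum hardware. The sketch below is an illustration added here, not code from the survey: it formulates a small, hypothetical three-variable QUBO, enumerates all bit-strings in place of an annealer, and returns a "solution distribution" by repeated weighted sampling, standing in for steps (D)-(E).

```python
import itertools
import numpy as np

# Step (A): formulate a toy QUBO  E(x) = x^T Q x  over binary variables x.
# Q is a made-up 3-variable example, not taken from the survey.
Q = np.array([[-1.0,  2.0,  0.0],
              [ 0.0, -1.0,  2.0],
              [ 0.0,  0.0, -1.0]])

def qubo_energy(x, Q):
    """QUBO objective x^T Q x for a binary vector x."""
    x = np.asarray(x, dtype=float)
    return float(x @ Q @ x)

# Steps (B)-(D), stand-in: enumerate all bit-strings instead of annealing.
bitstrings = list(itertools.product([0, 1], repeat=Q.shape[0]))
energies = np.array([qubo_energy(x, Q) for x in bitstrings])

# Step (E), stand-in: mimic repeated reads by sampling bit-strings with a
# Boltzmann-like weighting, so that low-energy solutions dominate the histogram.
rng = np.random.default_rng(0)
weights = np.exp(-2.0 * (energies - energies.min()))
probs = weights / weights.sum()
reads = rng.choice(len(bitstrings), size=1000, p=probs)
counts = np.bincount(reads, minlength=len(bitstrings))

for x, e, c in sorted(zip(bitstrings, energies, counts), key=lambda t: t[1]):
    print(f"x = {x}, energy = {e:+.1f}, reads = {c}")
```

On real AQC hardware, the enumeration and the weighted sampling are replaced by minor embedding and physical annealing, but the interface is the same: a QUBO goes in, a histogram of bit-strings comes out.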
We emphasise that this definition is intended to be aspirational, reflecting the evolving and exploratory nature of the field. The use of quantum hardware in QeCV approaches must be justified not only from a computational standpoint but also through demonstrable benefits in solution quality or predictive accuracy. In other words, it is insufficient to dub a method quantum solely because it can be executed on quantum hardware. Recall that a universal quantum computer can execute everything a classical binary machine can, but not the other way around (in reasonable time; classical computers can accurately simulate quantum computations with exponential growth of classical resources, which becomes infeasible for large numbers of qubits [39]).

QeCV is an emerging field. The first QeCV method published at a primary computer vision conference was an approach for correspondence problems on point sets [40]; it appeared on arXiv.org in 2019. The paper provides an introduction to modern quantum annealers and proposes algorithms for transformation estimation and point set alignment that can be executed on a quantum annealer once the inputs are pre-processed to a form admissible to it. The term Quantum Computer Vision—or QeCV, as it is more often called today—was coined later in Birdal and Golyanik et al. [25] and has since been used to denote computer vision methods relying on quantum hardware.

Applications of Quantum Computers in Vision. Not many problems in computer vision can be formulated in a form consumable by modern quantum hardware. The estimated number of qubits necessary for practical computer vision problems with gate quantum computing formulations that
make use of provably advantageous quantum algorithms is typically larger than what is available on NISQ architectures. It is rare that QUBO forms are available and can be directly tested on an AQC. Often, the target problem first has to be mapped to QUBO, and the solution has to be encoded in binary form. It is, therefore, of broad scientific interest that computer vision (and machine learning) problems can be formulated for quantum computing and efficiently solved with it, while offering advantages compared to their classical method counterparts. Moreover, applying a quantum mindset to CV problems can provide a new perspective, leading to new insights for classical CV and new methods, especially in a field in which many things are empirical. We discuss these aspects in Sec. 3.

Related Research Fields. Several research fields related to QeCV can be identified in the literature, such as quantum-inspired computer vision, Quantum Image Processing (QIP), and Quantum Machine Learning (QML).

The first category simulates quantum-mechanical effects or draws inspiration from quantum mechanics [41], [42], [43], [44], [45]. These algorithms are not meant for execution on quantum hardware and are solely inspired by quantum phenomena. They should not be confused with techniques that can execute on quantum hardware. The second category, i.e. QIP, is a field of quantum information processing focusing on representing and processing images as quantum states [46], [47], [48]. QIP provides several (in theory) faster algorithms for multiple widely used linear algebra operations and common low-level operations on images [49], [50], [51].

Both method categories above can be broadly seen as sub-fields of QeCV, though distinct from this survey's focus on mid- and high-level computer vision tasks (e.g. point set or mesh alignment, object tracking, and robust fitting). In contrast, just as classical ML is deeply intertwined with CV, quantum machine learning [52], [53], [54] explores the intersection of quantum computing and machine learning to enhance learning algorithms. It holds promise for QeCV, potentially accelerating tasks such as image classification, object detection, and pattern recognition. However, current QML methods remain largely theoretical, with limited practical applications due to quantum hardware and scalability constraints. Hence, QML remains distant from this survey's focus on practical applications to CV.

1.2 Motivation and Scope of this Survey

In the broad sense, this survey shows, by reviewing the first works in the field, how quantum computations and quantum hardware can be leveraged for computer vision. It showcases recent and ongoing progress towards practical quantum computing and computer vision, discusses the current state of the art, limitations therein, expected progress and its impact on computer vision. Our goal is to provide a shared, computer-science-friendly language and mathematical formulation of quantum computing, covering its two modern paradigms, i.e. gate-based quantum computing and adiabatic quantum computation. We identify and classify computer vision problems that can be addressed by quantum computers and analyse what they have in common.

We observe that quantum technology acts more and more like a booster for algorithm development in CV.
The fact that the developed methods could run on real quantum hardware often brings us to interesting discoveries. Thus, we can often obtain methods that work better than existing classical ones.

There are several considerations regarding the gate-based model vs the adiabatic model. Gate-based QCs are currently severely limited in the number of qubits, their connectivity patterns and the accuracy of the operations (gates) that can be applied to the qubits. Additionally, decoherence poses a significant obstacle. Thus, the largest number factorised on gate-based quantum hardware using the celebrated Shor's algorithm has remained 21 for more than ten years as of the moment of writing [55], [56]. In contrast, the number of qubits in modern quantum annealers such as D-Wave is larger, which, in combination with their connectivity and qubit characteristics, allows for solving combinatorial optimisation problems in Ising encodings of sizes relevant to computer vision and real-world applications. Also, the qubit coherence times required for AQC are shorter than for gate-based machines, which partially explains the better scalability of quantum annealers compared to gate-based machines. Hence, the interest in applying AQC, in particular, in computer vision has grown substantially over the last three years.

This survey focuses on computer vision methods for gate-based quantum computers and quantum annealers that have been evaluated on real quantum hardware (in the case of adiabatic quantum annealers) or simulators of gate-based quantum computers. We include a few theoretical works on gate-based quantum computing in computer vision without experiments on a simulator, though this survey is generally structured to reflect the applied focus of QeCV. Since computer vision is an applied science, we believe that the criterion of experimentation on quantum hardware is pivotal, especially because modern quantum machines already allow solving problems of sizes encountered in practical applications. That is why we believe it is important that the methods are evaluated and that the exposition is not restricted to theoretical considerations.

Paper Selection Criteria. QeCV is a nascent field. In line with its aspirational definition, this survey presents an overview of methods designed for full or hybrid (i.e. classical-quantum) execution on quantum hardware, emphasising approaches that report theoretically grounded results with potential relevance to QeCV. Hence, the main paper selection criterion for this survey is experimental evaluation of the proposed techniques and, at least, some results obtained on real quantum hardware. Moreover, we focus on works published at computer vision conferences (CVPR, ICCV and ECCV) and other venues (perhaps interdisciplinary) that are interested in the application of quantum computers in vision. We also include several technical reports on arXiv.org if they fulfil the main paper selection criterion. The authors made efforts to provide as complete a review of the field as possible, but cannot claim completeness in every aspect, since the field is gaining momentum and growing. We also recommend interested readers to read the discussed papers for more details.

Related Surveys. Recently, Quantum Machine Learning (QML) has gained significant attention as it is now a fast-growing area. The large body of works in the QML space has
been reviewed in [57], [58], whereas [57], [59], [60] also focus on quantum approaches for tackling optimisation problems.
Out of all machine learning models, quantum deep learning requires special attention due to the impact of these learning machines in the field of AI. Massoli et al. [61] gather, compare and analyse the current state of the art concerning Quantum Neural Networks (QNN). Yarkoni et al. [62], on the other hand, look into possible industry applications of Quantum Annealing (QA); their survey is perhaps closest to our work in terms of focusing on an application area of QA.

Mohseni et al. [18] review different methods for solving Ising problems (exactly or approximately) and discuss quantum annealing as one meta-heuristic. A few surveys focus on quantum annealing for physicists [63], whereas our work is written for researchers with a computer vision background. The short survey by Larasati et al. [64] provides a summary of a few quantum computer vision methods (<5% of what we cover) and is much less comprehensive than our article. Another brief paper, by Mebtouche et al. [65], reviews the foundations of quantum computing, machine learning and multiple low-level vision and image processing techniques. In this regard, ours is the first comprehensive survey of its kind with an emphasis on QeCV.

Target Audience. This survey is written for applied computer scientists, especially computer vision researchers and practitioners, who may have little to no background in theoretical physics. Unlike previous surveys [18], [57], [58], [59], [60], [61], [65] on quantum algorithms, mostly published outside traditional computer vision venues, our goal is to make QeCV accessible and actionable for a broader audience of computer vision researchers and engineers. By consolidating multiple contributions, we aim to bridge that gap and provide a comprehensive overview of existing QeCV methods, along with insights into future directions.

We present a practical “cookbook” for computer vision practitioners to begin exploring quantum-enhanced techniques today and to prepare for the growing impact of quantum technologies. While the survey avoids deep dives into physics and quantum theory, it includes sufficient technical detail to ensure a solid foundation for the presented results and the validity thereof. Physical assumptions are clearly separated from the mathematical consequences relevant to algorithmic design. The intended readership includes:

• Computer vision researchers seeking to understand whether their work can benefit from quantum computing;

• Computer vision practitioners interested in experimenting with QeCV algorithms or integrating quantum solvers into their workflows;

• Computer science students at various levels who are curious about the intersection of computer vision and applied quantum computing.

We hope this survey will serve as a gateway for new researchers to enter the field and contribute to its development.

1.3 Structure of this Survey

This survey is structured in six sections. Sec. 2 reviews foundations of quantum computing relevant to quantum-enhanced computer vision methods, including the two main quantum computational paradigms, i.e. gate-based (or circuit-based) and adiabatic quantum computing, as well as the design and hardware of quantum computers. Moreover, as part of it, Sec. 3.1.2 discusses the encoding of classical data as quantum states suitable for processing on quantum hardware, which is one of the essential steps in quantum-enhanced computer vision. Next, Sec. 3 is devoted to algorithms and applications.
It reviews methods for different problems such as point set alignment, mesh registration, object tracking, model fitting, quantum machine learning for vision, implicit representations and generative approaches, among others. We then discuss open challenges, specifics of the field and social implications in Sec. 5, and conclude in Sec. 6.

2 OPERATIONAL PRINCIPLES OF QCS

This section reviews the foundations of quantum computing as a short introduction to (or a refresher of) the field, sufficient for understanding the published literature reviewed in Sec. 3. Sec. 2.1 introduces fundamentals such as notations and elementary operations on qubits. Secs. 2.2 and 2.3 describe the two main quantum computational paradigms—namely, gate-based quantum computing and quantum annealing, respectively—and Sec. 2.4 establishes their connections and equivalency. Sec. 2.5 examines potential advantages of quantum computing over classical computing, and Sec. 2.6 discusses hardware designs and realizations of quantum computers.

Note that we do not review the complexity classes associated with the quantum computational paradigm, as their discussion is outside the scope. In this survey, it is sufficient to mention that quantum computers can accelerate computations relevant to computer vision and provide different and favourable properties in quantum-enhanced learning models; both of these characteristics provide strong motivation for the newly established research direction of QeCV. We recommend references [66], [67] to readers wishing to learn more about quantum complexity theory.

2.1 Fundamentals and Notations

This background section introduces the fundamentals of quantum computing through both the gate-based and adiabatic paradigms. It also establishes the notation and terminology used throughout the survey. For clarity and ease of reference, a comprehensive overview of symbols and acronyms is provided in Tabs. 1 and 2; note that the acronym list excludes method names and company identifiers.

Many concepts in quantum computing have direct analogues in classical computing and optimization theory. For example, Hamiltonians represent energy functions, with eigenstates corresponding to energy levels and ground states denoting the lowest-energy configurations. Throughout the survey, we assume these physical Hamiltonians to be Hermitian operators, and unless otherwise stated, the quantum systems considered are closed—i.e., they do not exchange particles with their environment. We adopt the bra–ket notation to concisely express familiar linear algebraic constructs such as row and column vectors, inner and outer products, and tensor products. This notation streamlines the presentation of quantum algorithms and aligns with conventions in both physics and quantum information theory.
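As a reading aid (added here, not part of the survey), the correspondence between bra–ket notation and ordinary linear algebra can be made explicit in a few lines of NumPy: a ket is a column vector, a bra is its conjugate transpose, ⟨ψ₁|ψ₂⟩ is an inner product, |ψ⟩⟨ψ| is an outer product and ⊗ is the Kronecker product. The states below are chosen arbitrarily for illustration.

```python
import numpy as np

# |psi1> and |psi2> as column vectors in C^2.
psi1 = np.array([[1.0], [1.0j]]) / np.sqrt(2)   # (|0> + i|1>) / sqrt(2)
psi2 = np.array([[1.0], [0.0]], dtype=complex)  # |0>

bra_psi1 = psi1.conj().T            # <psi1| : conjugate transpose (a row vector)
inner = (bra_psi1 @ psi2).item()    # <psi1|psi2> : complex inner product
rho = psi1 @ bra_psi1               # |psi1><psi1| : density operator
tensor = np.kron(psi1, psi2)        # |psi1> (x) |psi2> : two-qubit product state

print("inner product <psi1|psi2> =", inner)
print("trace of rho =", np.trace(rho).real)                 # 1 for a normalised state
print("tensor-product state dimension =", tensor.shape[0])  # 2 * 2 = 4
```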
TABLE 1: List of symbols used in the survey.

ℕ: Set of natural numbers
ℝ: Set of real numbers
ℂ: Set of complex numbers
i: Complex unit (if not stated otherwise)
ℏ: Reduced Planck's constant
θ: Optimisable rotation angle
n: Number of qubits in the system
|ψ⟩: State vector of the system
⟨ψ|: Conjugate transpose of |ψ⟩
⟨ψ₁|ψ₂⟩: Complex inner product of |ψ₁⟩ and |ψ₂⟩
∥□∥: L2-norm of □
|□|: Absolute value of □
ρ: Density operator, i.e. ρ = |ψ⟩⟨ψ|
⊗: Kronecker product (tensor product)
G: Generator of a unitary gate
I: Identity matrix or operator (per context)
H: Hamiltonian or Hadamard gate (per context)
H_I: Initial Hamiltonian
H_P: Problem Hamiltonian
M: Measurement observable
U: Unitary operator
σ{x,y,z}: Pauli-X, -Y and -Z operators
Tr: Trace operator
λ: Eigenvalue or penalty factor (per context)
T: Total time
∆t: Time step
s: Ising variables, i.e. s ∈ {−1,1}ⁿ
x: QUBO variables, i.e. x ∈ {0,1}ⁿ
v: General binary variable, i.e. v ∈ {v₁,v₂}ⁿ
J, b: Couplings and biases of an Ising problem
Q, c: Couplings and biases of a QUBO problem
W, w: Couplings and biases of a general quadratic binary decision problem
A, b: Matrix and vector of linear constraints
f: Schedule or objective function (per context)
L: Loss function of PQCs
TABLE 2: List of frequent acronyms used in the survey.

AQC: Adiabatic quantum computing
BNN: Binary neural networks
CPU: Central processing unit
CV: Computer vision
CVPR: Computer Vision and Pattern Recognition
ECCV: European Conference on Computer Vision
GPU: Graphics processing unit
GQC: Gate-based quantum computing
ICCV: International Conference on Computer Vision
MLP: Multi-layer perceptron
ML: Machine learning
NISQ: Noisy intermediate-scale quantum
NN: Neural networks
PQC: Parameterized quantum circuit
QA: Quantum annealing
QBO: Quantum binary optimization
QC: Quantum computing
QCT: Quantum complexity theory
QCNN: Quantum convolutional neural network
QCVML: Quantum computer vision and machine learning
QDK: Quantum development kit
QeCV: Quantum-enhanced computer vision
QIP: Quantum image processing
QML: Quantum machine learning
QNN: Quantum neural networks
QPU: Quantum processing unit
QUBO: Quadratic unconstrained binary optimization
SDK: Software development kit
SSD: Sum of squared distances

Single Qubits. Let us start with the most fundamental building block and elementary information unit of a quantum computer, the qubit.

The information carrier in a quantum computing system, called a qubit, is described by a two-dimensional complex vector of length one, i.e.

$$|\psi\rangle \in \mathbb{C}^2, \qquad \lVert\,|\psi\rangle\,\rVert = 1. \tag{1}$$

We adopt the widely used bra–ket notation common in physics and quantum computing to write vectors and their conjugate transposes: |ψ⟩ (ket) denotes a column vector and ⟨ψ| = |ψ⟩* (bra) denotes its conjugate transpose. The multiplication of a bra- and a ket-vector ⟨ψ₁| and |ψ₂⟩, denoted ⟨ψ₁|ψ₂⟩, results in their inner product in the ℂ² Hilbert space³. We can write an arbitrary two-dimensional complex vector as a column vector:

$$|\psi\rangle = \begin{pmatrix} a + ib \\ c + id \end{pmatrix} \tag{2}$$

with real coefficients a, b, c, d ∈ ℝ. The normalisation condition of the qubit's state vector in Eq. (1) then yields a² + b² + c² + d² = 1.

Next, we translate a quantum-physical phenomenon, namely that particles can also be in a superposition of states. Only after measurement do they collapse to one of the classical states with a certain probability. This is modeled mathematically in the following way: let |0⟩ ∈ ℂ² and |1⟩ ∈ ℂ² form an orthonormal basis of ℂ² (whose precise form depends on the setup of the physical system). For example, one can have in column vector notation:

$$|0\rangle = \begin{pmatrix} 1 \\ 0 \end{pmatrix}, \qquad |1\rangle = \begin{pmatrix} 0 \\ 1 \end{pmatrix}. \tag{3}$$

When the state of a qubit |ψ⟩ = α|0⟩ + β|1⟩ ∈ ℂ² is measured (with respect to the basis {|0⟩, |1⟩}), it results in the state

$$|\psi\rangle \;\mapsto\; \begin{cases} |0\rangle & \text{with probability } |\alpha|^2 = |\langle 0|\psi\rangle|^2,\\[2pt] |1\rangle & \text{with probability } |\beta|^2 = |\langle 1|\psi\rangle|^2. \end{cases} \tag{4}$$

In other words, a qubit exists in a superposition of classical states as |ψ⟩ = α|0⟩ + β|1⟩, where α and β are probability amplitudes. Upon measurement, the qubit collapses into either |0⟩ or |1⟩, with the measurement probability given by the square of the absolute value of the respective amplitude: |α|² for |0⟩ and |β|² for |1⟩. This is also called the collapse of the wave function. The act of measurement w.r.t. a basis changes the state into one of the basis elements, with probabilities defined by the projections of the state onto the basis.
3. Note that in quantum mechanics, states can also have a higher dimension than qubits. For non-discrete quantities like momentum or position, these states are not finite-dimensional but functions from the Hilbert space L²(ℝ³), i.e. a so-called Lebesgue space. In this case, the bra vectors can be understood as functionals from L². Furthermore, even the formulation on L² still has shortcomings, and a full mathematical treatment would have to resort to rigged Hilbert spaces to describe distributions and deal with unbounded operators [68].
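Equation (4) translates directly into a sampling experiment. The snippet below (an illustration added here, not code from the survey) prepares a normalised single-qubit state with arbitrarily chosen amplitudes and simulates repeated measurements in the {|0⟩, |1⟩} basis, recovering |α|² and |β|² empirically.

```python
import numpy as np

rng = np.random.default_rng(42)

# A normalised qubit |psi> = alpha|0> + beta|1>; the amplitudes are arbitrary.
alpha, beta = 0.6, 0.8j           # |alpha|^2 + |beta|^2 = 0.36 + 0.64 = 1
psi = np.array([alpha, beta])

probs = np.abs(psi) ** 2          # Born rule: measurement probabilities
shots = 10_000
outcomes = rng.choice([0, 1], size=shots, p=probs)

print("theoretical P(0), P(1):", probs)
print("empirical   P(0), P(1):", np.bincount(outcomes, minlength=2) / shots)
```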
Fig. 2: Visualising an arbitrary state of a qubit |ψ⟩ on the Bloch sphere along with several of its widely encountered states. Although in the original space ℂ² the states |0⟩ and |1⟩ are orthogonal, they are visualised as opposite poles of the Bloch sphere.

Bloch Sphere. As the probabilities of obtaining certain measurement outcomes depend only on the magnitude of the coefficients, it is easy to see that a change of global phase, i.e. a multiplication by a global factor e^(iϕ) with ϕ ∈ ℝ, does not affect any measurable properties of a qubit. Due to this ambiguity in the complex phase, it is common to fix α, the coefficient for the first basis vector |0⟩, to be real, i.e. α ∈ ℝ, thereby resolving the ambiguity.

Along with the normalisation condition |α|² + |β|² = 1 as given in Eq. (1), any qubit state can be expressed as

$$|\psi\rangle = \cos\!\left(\tfrac{\theta}{2}\right)|0\rangle + e^{i\varphi}\sin\!\left(\tfrac{\theta}{2}\right)|1\rangle. \tag{5}$$

The two angles θ ∈ [0, π] and φ ∈ [0, 2π] naturally define a point on the unit sphere in three dimensions, known as the Bloch sphere. The state of a qubit |ψ⟩ is frequently visualised in this representation; see Fig. 2.
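For readers who want to check Eq. (5) numerically, the short sketch below (added for illustration, not from the survey) removes the global phase from an arbitrary normalised pair (α, β) and recovers the Bloch angles θ and φ.

```python
import numpy as np

def bloch_angles(alpha, beta):
    """Return (theta, phi) with |psi> = cos(theta/2)|0> + e^{i phi} sin(theta/2)|1>."""
    # Remove the global phase so that alpha becomes real and non-negative.
    global_phase = np.exp(-1j * np.angle(alpha)) if alpha != 0 else 1.0
    a, b = alpha * global_phase, beta * global_phase
    theta = 2.0 * np.arccos(np.clip(np.real(a), -1.0, 1.0))
    phi = float(np.angle(b)) % (2.0 * np.pi) if abs(b) > 1e-12 else 0.0
    return float(theta), phi

# Example: |+> = (|0> + |1>) / sqrt(2) lies on the equator of the Bloch sphere.
theta, phi = bloch_angles(1 / np.sqrt(2), 1 / np.sqrt(2))
print(f"theta = {theta:.4f} (pi/2 = {np.pi/2:.4f}), phi = {phi:.4f}")
```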
Multi-Qubit Systems. When multiple qubits |ψ₁⟩, |ψ₂⟩, ..., |ψₙ⟩ are considered, their overall state |ψ⟩ is described by the tensor or Kronecker product of the individual qubit states and results in a vector consisting of all combinations of products of the individual coefficients:

$$|\psi\rangle = |\psi_1\rangle \otimes |\psi_2\rangle \otimes \cdots \otimes |\psi_n\rangle. \tag{6}$$

For two qubits |ψ₁⟩ = α|0⟩ + β|1⟩ and |ψ₂⟩ = γ|0⟩ + δ|1⟩, this product state reads:

$$|\psi_1\rangle \otimes |\psi_2\rangle = \alpha\gamma\,|00\rangle + \alpha\delta\,|01\rangle + \beta\gamma\,|10\rangle + \beta\delta\,|11\rangle. \tag{7}$$

A useful shorthand notation for the Kronecker product is

$$|\psi_1\rangle \otimes |\psi_2\rangle \otimes \cdots \otimes |\psi_n\rangle = |\psi_1 \psi_2 \ldots \psi_n\rangle, \tag{8}$$

and such a system of multiple qubits is called a quantum register.

Note that although the combination of multiple qubits is represented in a 2ⁿ-dimensional complex vector space, one would expect that actual (physically meaningful) vectors occupy a low-dimensional subset of only those 2ⁿ-dimensional vectors that can be decomposed into a tensor product (6). Note, for example, that the tensor product of two vectors |ψ₁⟩ and |ψ₂⟩ is (a vectorisation of) their outer product, such that 2-qubit states of the form (7) can be identified with rank-1 matrices in the space of all complex 2×2 matrices. The extremely interesting and important concept of entanglement implies that multiple-qubit systems are not limited to so-called separable states that decompose as Eq. (6). The state

$$\tfrac{1}{\sqrt{2}}\,\bigl(|01\rangle + |10\rangle\bigr), \tag{9}$$

for example, cannot be decomposed as |ψ₁⟩ ⊗ |ψ₂⟩ from Eq. (7), since there are no α, β, γ, δ that can simultaneously satisfy the equations αγ = 0, αδ = 1/√2, βγ = 1/√2, βδ = 0. This is one of the famous Einstein-Podolsky-Rosen states [10].

A system of n qubits can evolve into any state in ℂ^(2ⁿ). States that can be represented as tensor-product states via Eq. (6) are called separable; states that do not admit such a representation are called entangled.

Note that Assumption 2 extends to multi-qubit systems, irrespective of whether the system is separable or entangled. In other words, the probability of measuring a specific multi-qubit basis state is equal to the squared magnitude of the corresponding coefficient.

Qubit Evolution. It is possible to manipulate the state of an n-qubit quantum physical system experimentally. For example, let |ψ(0)⟩ be the state of the quantum system at time t = 0. We can prepare an initial state |ψ(0)⟩ and manipulate it at any time t with external influences H(t) ∈ ℂ^(2ⁿ×2ⁿ) determining the system's energies by letting the n qubits experience a certain coupling. The way the state vector behaves under such an evolution is described by the famous Schrödinger equation.

The (time) evolution of a quantum state |ψ(t)⟩ is described by the Schrödinger equation

$$i\hbar\,\frac{\mathrm{d}}{\mathrm{d}t}\,|\psi(t)\rangle = H(t)\,|\psi(t)\rangle, \tag{10}$$

where the Hamiltonian H(t) ∈ ℂ^(2ⁿ×2ⁿ) is a Hermitian matrix determined by the experimental setup, i is the imaginary unit and ℏ is the reduced Planck constant.

A closer look at the solutions to this differential equation reveals that they follow a unitary time evolution [10]. This is, in particular, consistent with what we learned earlier, namely that quantum states have norm 1, since the length of the components is related to measurement probabilities.
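A hedged numerical sketch of Eq. (10), added here for illustration: for a time-independent Hamiltonian the solution is |ψ(t)⟩ = exp(−iHt/ℏ)|ψ(0)⟩ with a unitary matrix exponential, which can be checked in NumPy via the eigendecomposition of a random Hermitian H (ℏ is set to 1 for simplicity).

```python
import numpy as np

rng = np.random.default_rng(7)
n_qubits = 2
dim = 2 ** n_qubits

# Random Hermitian Hamiltonian H = A + A^dagger (hbar = 1).
A = rng.normal(size=(dim, dim)) + 1j * rng.normal(size=(dim, dim))
H = A + A.conj().T

def evolve(H, t):
    """U(t) = exp(-i H t), computed via the eigendecomposition of the Hermitian H."""
    evals, evecs = np.linalg.eigh(H)
    return evecs @ np.diag(np.exp(-1j * evals * t)) @ evecs.conj().T

U = evolve(H, t=0.3)
psi0 = np.zeros(dim, dtype=complex)
psi0[0] = 1.0                      # start in |00>
psi_t = U @ psi0

print("U unitary?      ", np.allclose(U.conj().T @ U, np.eye(dim)))
print("norm preserved? ", np.isclose(np.linalg.norm(psi_t), 1.0))
```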
The way the state vector behaves under such an evolution is described by the famous Schr¨odinger Equation.\\n\\nThe (time) evolution of a quantum state |ψ(t)⟩ is described by the Schr¨odinger Equation\\n\\n(10)\\n\\nwhere the Hamiltonian H(t) ∈ C2n×2n is a Hermitian matrix determined by the experimental setup, i is the imaginary unit and ℏ is the reduced Planck constant.\\n\\nA closer look at the solutions to this differential equation reveals that they follow some unitary time evolution [10]. This is in particular consistent with what we learned that quantum states have norm 1, since the length of the compo- nents is related to measurement probabilities.\\n\\n(9)\\n\\n2.2 Gate-Based Quantum Computers\\n\\nAs alluded to in Sec. 2.1, the Schr¨odinger equation (10) gov- erns the evolution of a closed quantum system over time. Moreover, it can be proven that solving the Schr¨odinger equation always leads to a unitary time evolution [10, Chap- ter 2]. Gate-based quantum computers manipulate quantum states through a controlled sequence of unitary Hamil- tonian evolutions—each abstracted as a quantum gate—to perform computations. The specific algorithm implemented depends on the particular quantum gates chosen (Sec. 2.2.1). While the earlier quantum algorithms, including the famous algorithms of Shor [19] and Grover [69], are based on “handcrafted” quantum gates, increasingly quantum gates are selected by “training” on data to optimise an objective function [70], [71], [72] (Sec. 2.2.2).\\n\\n2.2.1 An Overview of Gate-based Quantum Algorithms\\n\\nSince algorithms under the gate-based model are often defined using a sequence of quantum gates, quantum algo- rithms are often framed as quantum circuits. Fig. 3 provides an overview of the major steps in a quantum algorithm, dis- played as an n-qubit quantum circuit. First, a basic quantum state (e.g. one of the classical states) is generated. Typically, the basic state is brought into superposition as a form of initialisation. Then, a sequence of quantum gates is applied to the initial state to achieve a final quantum state. Since a sequence of non-interrupted (e.g. without intermediate measurements) unitary transformations is equivalent to a single unitary transformation, the successive quantum gates that define the algorithm can be seen as a single quan- tum gate “block”. The final quantum state is not directly usable until it is measured, at which point it collapses to yield classical (binary) information. This collapse reduces the degrees of freedom available for further computation, which is why many quantum algorithms repeat the ini- tialise–transform–measure cycle multiple times—often with intermediate feedback—to amplify the probability of suc- cess and extract reliable outcomes from probabilistic quan- tum processes. Measurement mathematically manifests in projecting the quantum state onto the eigenbasis of a Hermi- tian operator known as an observable, with the output given by the eigenvalue of one of its eigenvectors (see Eq. (4) for an elementary example). The choice of observable depends on the algorithm, while the likelihood of obtaining a particular outcome is determined by the final quantum state.\\n\\nSimilar to classical algorithm design—where we are not concerned with the underlying implementation of the basic computational units (i.e. 
logical gates)—in quantum algorithm design, details of the physical realisation of the quantum gates and measurement devices are typically also abstracted away.\\n\\nConstraining the operations to be unitary transforma- tions can seem restrictive. However, it can be shown that all (classical) logical gates can be made reversible with- out significantly adding more bits or logical operations, thereby allowing the logical gates to be interpreted as unitary transformations [10, Sec. 1.4]. It turns out that any efficient classical algorithm can also be efficiently executed on a quantum computer in a analogous way [10, Sec. 3.2]. However, significant interest in gate-based quantum com- puters draws from their ability to solve certain problems\\n\\nFig. 3: Common processing stages in a typical quantum circuit. Here, the quantum circuit operates on n qubits, which at the beginning are often initialised to a basic state, e.g. |ψ1ψ2 ...ψn⟩ = |10...0⟩. Then, the basic state is pre- pared to yield an initial (usually superimposed) state, before a sequence of M quantum gates is invoked. The specific gates employed define the algorithm implemented. The final quantum state is then measured to produce classical outcomes that correspond to the final results, which are often probabilistic. It is also common for a circuit to be repetitively invoked or iterated.\\n\\nthat are beyond the reach of classical machines [19], [69] (Sec. A in the Appendix provides a rudimentary example of an algorithm with a theoretical speedup due to quantum effects). Before reviewing some applications of gate-based quantum computing to QeCV, it is essential to elaborate deeper on quantum gates and quantum circuits.\\n\\n2.2.2 Quantum Gates and Circuits\\n\\nAs alluded above, all classical logical gates (e.g. AND, OR, NAND) can be made reversible, i.e. the inputs can be recom- puted from the outputs. This can be achieved through the Toffoli gate, which has three input and three output bits4. By hardwiring selected inputs (to 0 or 1), the Toffoli gate can simulate all logical gates in a reversible manner. Details of reversible logical gates are beyond the scope of this survey; for that, we refer the reader to [10, Sec. 1.4].\\n\\nSingle-Qubit Gates. Single-qubit gates are elementary gates acting on one single input qubit |ψ⟩. For example, the Pauli gates\\n\\n(11)\\n\\nare used in many quantum algorithms. They are often graphically displayed as follows:\\n\\nDefining quantum gates as matrices, as in Eq. (11), allows quantum operations to be performed as multiplications of\\n\\n4. A basic requirement for a reversible gate is that the number of input and output bits are the same.\\n\\n2.2 Gate-Based Quantum Computers\\n\\nAs alluded to in Sec. 2.1, the Schr¨odinger equation (10) gov- erns the evolution of a closed quantum system over time. Moreover, it can be proven that solving the Schr¨odinger equation always leads to a unitary time evolution [10, Chap- ter 2]. Gate-based quantum computers manipulate quantum states through a controlled sequence of unitary Hamil- tonian evolutions—each abstracted as a quantum gate—to perform computations. The specific algorithm implemented depends on the particular quantum gates chosen (Sec. 2.2.1). While the earlier quantum algorithms, including the famous algorithms of Shor [19] and Grover [69], are based on “handcrafted” quantum gates, increasingly quantum gates are selected by “training” on data to optimise an objective function [70], [71], [72] (Sec. 
2.2.2). 
Before reviewing some applications of gate-based quantum computing to QeCV, it is essential to elaborate deeper on quantum gates and quantum circuits.\\n\\n2.2.2 Quantum Gates and Circuits\\n\\nAs alluded above, all classical logical gates (e.g. AND, OR, NAND) can be made reversible, i.e. the inputs can be recom- puted from the outputs. This can be achieved through the Toffoli gate, which has three input and three output bits4. By hardwiring selected inputs (to 0 or 1), the Toffoli gate can simulate all logical gates in a reversible manner. Details of reversible logical gates are beyond the scope of this survey; for that, we refer the reader to [10, Sec. 1.4].\\n\\nSingle-Qubit Gates. Single-qubit gates are elementary gates acting on one single input qubit |ψ⟩. For example, the Pauli gates\\n\\n(11)\\n\\nare used in many quantum algorithms. They are often graphically displayed as follows:\\n\\nDefining quantum gates as matrices, as in Eq. (11), allows quantum operations to be performed as multiplications of\\n\\n4. A basic requirement for a reversible gate is that the number of input and output bits are the same.\\n\\nunitary matrices, which can be verified: Consider the vector form of a one-qubit state\\n\\n.\\n\\nApplying the Pauli-X on |ψ⟩ implies conducting a matrix- vector multiplication, yielding X |ψ⟩ = β |0⟩ + α|1⟩. In particular, setting α = 1 and β = 0 yields X |ψ⟩ = |1⟩, which can be interpreted as “flipping” the basic state of |0⟩ to |1⟩ (and vice versa by if α = 0 and β = 1). Indeed, the Pauli X gate is actually a (reversible) NOT gate.\\n\\nAnother basic quantum gate is the Hadamard gate\\n\\nwhich is commonly used to bring basic states into super- position (and vice versa). e.g. applying H on |1⟩ yields H |1⟩ = 1√ (|0⟩ − |1⟩), i.e. a state of uniform superposition. 2 Applying H again on the above result yields the initial basic state: H(H |1⟩) = HH |1⟩ = |1⟩. As can be deduced, apply- ing quantum gates successively is equivalent to performing a quantum operation that is defined by the multiplication of the corresponding unitary matrices. The following quantum circuits depict the application of H and HH on the input qubits |ψ⟩:\\n\\n|ψ⟩\\n\\nI2\\n\\nComposition of Single-Qubit Gates. As introduced in Sec. 2.1, the state space of a n-qubit system is the tensor product space C2n . Accordingly, single-qubit quantum gates (specifically, their matrix representations) should also be tensored to obtain the corresponding n-qubit quantum gate. An example is the 2-qubit quantum gate H⊗2, which has the following circuit representation:\\n\\nIn matrix form, the H⊗2-gate is given by\\n\\n.\\n\\nThe application of a composite gate on a separable quantum state follows the calculation rule of the Kronecker product: With appropriate dimensions, the product (A ⊗ B) · (x ⊗ y) of two Kronecker products is the Kronecker product (Ax)⊗ (By) of two products. Hence, applying for example H⊗2 on the 2-qubit quantum state |01⟩ results in H⊗2 |01⟩ = H |0⟩⊗ H |1⟩, which is a 2-qubit state in uniform superposition. It is easy to verify that sequentially applying H⊗2 twice is the same as performing the identity operation I4. Controlled Gates. Entangled quantum states cannot be constructed by applying composite single-qubit gates to a separable initial state. Those particular states are obtained with controlled gates, which condition the application of a gate to a set of qubits on the state of other qubits. 
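The gate algebra described above (Pauli-X as a reversible NOT, Hadamard for superposition, Kronecker composition of single-qubit gates, and CNOT as a controlled gate) can be checked numerically. The matrices below are the standard textbook forms and are assumptions on our part, since the excerpt's equation images (e.g. Eq. (11) for the Pauli gates) did not survive extraction.

```python
import numpy as np

ket0 = np.array([1, 0], dtype=complex)
ket1 = np.array([0, 1], dtype=complex)

# Standard single-qubit gates (textbook forms, assumed here).
X = np.array([[0, 1], [1, 0]], dtype=complex)                # Pauli-X: a reversible NOT
H = np.array([[1, 1], [1, -1]], dtype=complex) / np.sqrt(2)  # Hadamard

print(X @ ket0)                        # |1>: X flips |0> to |1>
print(H @ ket1)                        # (|0> - |1>)/sqrt(2): uniform superposition
print(np.allclose(H @ H, np.eye(2)))   # True: applying H twice is the identity

# Composite gate on two qubits: (H tensor H) acting on |01>.
H2 = np.kron(H, H)
ket01 = np.kron(ket0, ket1)
print(H2 @ ket01)                      # equals (H|0>) tensor (H|1>)

# CNOT: flip the second qubit iff the first qubit is |1>.
CNOT = np.array([[1, 0, 0, 0],
                 [0, 1, 0, 0],
                 [0, 0, 0, 1],
                 [0, 0, 1, 0]], dtype=complex)
print(CNOT @ np.kron(ket1, ket0))      # |11>
bell = CNOT @ np.kron(H @ ket0, ket0)  # (|00> + |11>)/sqrt(2): an entangled state
print(bell)
```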
A popular\\n\\n(12)\\n\\n(13)\\n\\n(14)\\n\\nrepresentative of controlled gates is the controlled-NOT (CNOT) gate. It has the following circuit representation:\\n\\nIn the above circuit, we apply the X or NOT gate on qubit |ψ2⟩ if qubit |ψ1⟩ is in the |1⟩-state and do nothing otherwise. The CNOT gate has the following matrix representation:\\n\\nExample applications of CNOT on basis states yield CNOT|00⟩ = |00⟩ and CNOT|10⟩ = |11⟩.\\n\\nSimilar to classical computers, where any logical circuit of arbitrary complexity can be composed from a small set of universal logical gates (e.g NAND by itself is a universal logical gate), there exist universal quantum gates, which are a small set of one- or two-qubit gates (e.g. the Hadamard, controlled-NOT and phase shift make up a set of universal quantum gates). Through more elaborate applications of sequential multiplication and tensor operations, a set of universal quantum gates can simulate any unitary trans- formation in C2n up to arbitrary accuracy [10, Chapter 4.5]. Hence, in theory, a gate quantum computer needs only to physically implement a small set of unique quantum gates. Parameterised Gates. Devising quantum algorithms by handcrafting quantum circuits can be non-intuitive. Increas- ingly, quantum circuits are learned or optimised from data, typically in conjunction with classical optimisation. To this end, parametrised quantum gates play a crucial role.\\n\\nAs we have seen so far, all quantum gates are unitary operators preserving the magnitude of the state vector. So, it is natural to think of them as rotations on the Bloch sphere around a specific angle and axis. Indeed, up to a global phase factor eiφ that is negligible in measurement, any single-qubit gate U can be expressed as a special case of a parameter-dependent operator\\n\\nwhere θ ∈ R is the gate parameter, I is the identity and G is a certain unitary and Hermitian operator called generator of U. The operator U(θ) is a rotation of angle θ around the axis supporting the eigenvectors of G. Common generators are Pauli-X,Y,Z matrices, cf. Eq. (11), which turn U(θ) into a rotation of angle θ and about the x,y,z axis respectively. It is easy to verify that U(θ) in Eq. (16) is unitary.\\n\\nParameterised quantum gates play an important role in variational quantum computation [70], [71], [72]. They can be used to calculate the ground state (i.e. the Hamil- tonian eigenvector which returns the lowest eigenvalue) of a certain Hermitian operator or Hamiltonian M, which cannot be efficiently calculated classically. The approach is to design a so-called Parameterised Quantum Circuit (PQC) consisting of a unitary block U(θ), made up of several parameterised and/or controlled gates, which acts on an initial state vector |ψ⟩, resulting in a parameterised eigen- vector |ψ(θ)⟩ = U(θ)|0⟩. The parameter vector θ is then\\n\\n(15)\\n\\n(16)\\n\\nunitary matrices, which can be verified: Consider the vector form of a one-qubit state\\n\\n.\\n\\nApplying the Pauli-X on |ψ⟩ implies conducting a matrix- vector multiplication, yielding X |ψ⟩ = β |0⟩ + α|1⟩. In particular, setting α = 1 and β = 0 yields X |ψ⟩ = |1⟩, which can be interpreted as “flipping” the basic state of |0⟩ to |1⟩ (and vice versa by if α = 0 and β = 1). Indeed, the Pauli X gate is actually a (reversible) NOT gate.\\n\\nAnother basic quantum gate is the Hadamard gate\\n\\nwhich is commonly used to bring basic states into super- position (and vice versa). e.g. applying H on |1⟩ yields H |1⟩ = 1√ (|0⟩ − |1⟩), i.e. 
a state of uniform superposition. 
Indeed, up to a global phase factor eiφ that is negligible in measurement, any single-qubit gate U can be expressed as a special case of a parameter-dependent operator\\n\\nwhere θ ∈ R is the gate parameter, I is the identity and G is a certain unitary and Hermitian operator called generator of U. The operator U(θ) is a rotation of angle θ around the axis supporting the eigenvectors of G. Common generators are Pauli-X,Y,Z matrices, cf. Eq. (11), which turn U(θ) into a rotation of angle θ and about the x,y,z axis respectively. It is easy to verify that U(θ) in Eq. (16) is unitary.\\n\\nParameterised quantum gates play an important role in variational quantum computation [70], [71], [72]. They can be used to calculate the ground state (i.e. the Hamil- tonian eigenvector which returns the lowest eigenvalue) of a certain Hermitian operator or Hamiltonian M, which cannot be efficiently calculated classically. The approach is to design a so-called Parameterised Quantum Circuit (PQC) consisting of a unitary block U(θ), made up of several parameterised and/or controlled gates, which acts on an initial state vector |ψ⟩, resulting in a parameterised eigen- vector |ψ(θ)⟩ = U(θ)|0⟩. The parameter vector θ is then\\n\\n(15)\\n\\n(16)\\n\\nthe unknown that needs to be tuned in a gradient-based or gradient-free manner to minimize the cost function ⟨ψ(θ)|M|ψ(θ)⟩.\\n\\nSimilar to classical learning-based paradigms, PQCs can also be trained to perform other tasks with proper M and circuit U(θ) defined. Typically, the loss function for training a PQC has the following form:\\n\\nwhere x are training samples, ρx(θ) = |ψx(θ)⟩⟨ψx(θ)| is the density operator describing the quantum state before measurement, Mx is a certain problem-dependent observ- able, Tr is the trace operator and fx is the function that pushes the quantum circuit to learn the pattern of the data. Standard f(x) include mean-squared errors, mean- averaged errors for regressions and others for classifica- tions. In simple terms, Tr(ρx(θ)Mx) is the expectation value on the observable Mx on the quantum state |ψx(θ)⟩, so Tr(ρx(θ)Mx) = ⟨ψx(θ)|Mx|ψx(θ)⟩. PQCs are differentiable and their gradient can be evaluated using the so-called parameter-shift rule [73]. Small-scale PQCs can even be simulated classically, in which case auto-differentiation and back-propagation can be used for the training. PQCs belong to a larger class of quantum algorithm named Quantum Machine Learning (QML) [53], [54], [74], [75].\\n\\n2.3 Adiabatic QC and Quantum Annealing\\n\\nAdiabatic Quantum Computing (AQC) is another quantum computational paradigm different from gate-based quan- tum computing discussed in the previous section. Instead of representing unitary transformations using a sequence of well-defined (pre-defined or learned) elementary unitary transformation blocks, AQC performs useful calculations through a continuous transition between Hamiltonians. To put it simply, in classical computing terms, Hamiltonians can be thought of as energy functions; those are mathemati- cal expressions that describe how energy is distributed in a system. A transitioning, or a time-dependent Hamiltonian, then acts as an evolving energy landscape, with the goal being to guide the quantum system toward an optimal solu- tion. The choice of this Hamiltonian is crucial and structured in a way that naturally aligns with the optimization problem being solved. 
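A hedged sketch of the parameterised-gate and PQC-cost ideas above. Since Eq. (16) is not reproduced in the extracted text, the code assumes one common convention, U(θ) = exp(−iθG/2) = cos(θ/2)I − i sin(θ/2)G for a Pauli generator G, evaluates the cost ⟨ψ(θ)|M|ψ(θ)⟩ for an assumed observable M = Pauli-Z, and compares the parameter-shift gradient (valid for Pauli-generated gates of this form) against a finite difference.

```python
import numpy as np

# Pauli matrices: common choices for the generator G and the observable M.
I2 = np.eye(2, dtype=complex)
Y = np.array([[0, -1j], [1j, 0]], dtype=complex)   # assumed generator G
Z = np.array([[1, 0], [0, -1]], dtype=complex)     # assumed observable M

def U(theta, G=Y):
    """One common convention for a parameterised rotation: exp(-i*theta*G/2).
    Because a Pauli generator satisfies G @ G = I, this equals
    cos(theta/2) * I - 1j * sin(theta/2) * G."""
    return np.cos(theta / 2) * I2 - 1j * np.sin(theta / 2) * G

def expectation(theta, M=Z):
    """PQC-style cost <psi(theta)| M |psi(theta)> with |psi(theta)> = U(theta)|0>."""
    psi = U(theta) @ np.array([1, 0], dtype=complex)
    return np.real(psi.conj() @ M @ psi)

theta = 0.7
print(np.allclose(U(theta).conj().T @ U(theta), I2))   # True: U(theta) is unitary

# Parameter-shift gradient vs. a finite difference of the cost.
shift_grad = 0.5 * (expectation(theta + np.pi / 2) - expectation(theta - np.pi / 2))
fd_grad = (expectation(theta + 1e-6) - expectation(theta - 1e-6)) / 2e-6
print(shift_grad, fd_grad)                             # both approx -sin(0.7)
```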
Next, we discuss the operational principle of AQC grounded on the adiabatic theorem of quantum mechanics [76].\\n\\nSuppose we are given a fixed initial Hamiltonian H(0) = HI and that it is physically possible to create an initial quantum state |ψ(0)⟩ that is an eigenvector to the smallest eigenvalue of the operator H(0). Moreover, it is possible (with certain system-dependent limitations) to create n2 many interactions, so-called couplings, Ji,j ∈ R between the individual qubits along with n biases bi ∈ R acting on each qubit that amount to a Hamiltonian of the form\\n\\n,\\n\\n(17)\\n\\n(18)\\n\\n(19)\\n\\n(20)\\n\\nOne can show that these n2 + n many terms lead to a Hamiltonian HP that is a diagonal 2n × 2n matrix whose diagonal entries are the enumeration of all costs obtained via s⊤Js + s⊤b for s ∈ {−1,+1}n; see Sec. B.1 for details. Thanks to this enumeration, the eigenvector |ψ⟩ to the small- est eigenvalue of HP is a unit vector whose only entry equal to one can be identified with a particular s ∈ {−1,+1}n that is the solution to\\n\\nProblem (21) is a widely encountered combinatorial opti- mization problem known as Ising problem. An equivalent formulation over binary variables, called quadratic uncon- strained binary optimisation (QUBO), is found by variable substitution xi = (si + 1)/2, yielding\\n\\nfor Q = J/4 and c = (b + 1⊤J)/2, where 1⊤J sums the columns of J.\\n\\nThe central idea of adiabatic quantum computing then merely follows from the free evolution of the state vector |ψ(t)⟩ (Assumption 3 in Sec. 2.1) under the Schr¨odinger equation (Assumption 4 in Sec. 2.1) with an experimentally constructed Hamiltonian of a form\\n\\nfor a function f : [0,T] → [0,1] that slowly ', 'text_length': 100000, 'message': 'Retrieved 122 pages for 2510v07317v1.pdf'}\u001b[0m" + ] + }, + { + "name": "stderr", + "output_type": "stream", + "text": [ + "INFO: HTTP Request: POST https://api.openai.com/v1/chat/completions \"HTTP/1.1 200 OK\"\n" + ] + }, + { + "name": "stdout", + "output_type": "stream", + "text": [ + "\u001b[32;1m\u001b[1;3mCould not parse tool input: {'arguments': '{\"__arg1\":\"Quantum-enhanced Computer Vision: Going Beyond Classical Algorithms\\\\n\\\\nNatacha Kuete Meli1 Tat-Jun Chin2\\\\n\\\\nTolga Birdal3\\\\n\\\\nShuteng Wang4 Marcel Seelbach Benkner1 Michele Sasdelli2\\\\n\\\\nVladislav Golyanik4\\\\n\\\\nMichael Moeller1\\\\n\\\\nniversity of Siegen\\\\n\\\\n2University of Adelaide\\\\n\\\\n3Imperial College London\\\\n\\\\nAMPI for Informatics\\\\n\\\\nAbstract—Quantum-enhanced Computer Vision (QeCV) is a new research field at the intersection of computer vision, optimisation theory, machine learning and quantum computing. It has high potential to transform how visual signals are processed and interpreted with the help of quantum computing that leverages quantum-mechanical effects in computations inaccessible to classical (i.e. non-quantum) computers. In scenarios where existing non-quantum methods cannot find a solution in a reasonable time or compute only approximate solutions, quantum computers can provide, among others, advantages in terms of better time scalability for multiple problem classes. Parametrised quantum circuits can also become, in the long term, a considerable alternative to classical neural networks in computer vision. However, specialised and fundamentally new algorithms must be developed to enable compatibility with quantum hardware and unveil the potential of quantum computational paradigms in computer vision. 
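Referring back to the Ising/QUBO passage in the retrieved chunk above: rather than reproducing the closed-form Q and c from the excerpt, this sketch applies the substitution x = (s + 1)/2 (equivalently s = 2x − 1) numerically on a small random instance (J and b are toy values, not from the paper) and checks by brute force that the Ising and QUBO formulations share the same minimiser.

```python
import itertools
import numpy as np

rng = np.random.default_rng(0)
n = 4
J = rng.normal(size=(n, n))          # couplings (toy example, not from the paper)
b = rng.normal(size=n)               # biases

def ising_energy(s):
    """Ising objective s^T J s + s^T b over spins s in {-1, +1}^n."""
    return s @ J @ s + s @ b

def qubo_energy(x):
    """Same objective written over binaries x in {0, 1}^n via s = 2x - 1."""
    return ising_energy(2 * np.asarray(x) - 1)

# Brute-force both formulations on this small instance.
best_s = min((np.array(s) for s in itertools.product([-1, 1], repeat=n)),
             key=ising_energy)
best_x = min((np.array(x) for x in itertools.product([0, 1], repeat=n)),
             key=qubo_energy)

print(best_s, (best_s + 1) // 2)     # Ising minimiser and its binary encoding
print(best_x)                        # matches (best_s + 1) / 2
```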
This survey contributes to the existing literature on QeCV with a holistic review of this research field. It is designed as a quantum computing reference for the computer vision community, targeting computer vision students, scientists and readers with related backgrounds who want to familiarise themselves with QeCV. We provide a comprehensive introduction to QeCV, its specifics, and methodologies for formulations compatible with quantum hardware and QeCV methods, leveraging two main quantum computational paradigms, i.e. gate-based quantum computing and quantum annealing. We elaborate on the operational principles of quantum computers and the available tools to access, program and simulate them in the context of QeCV. Finally, we review existing quantum computing tools and learning materials and discuss aspects related to publishing and reviewing QeCV papers, open challenges and potential social implications.\\\\n\\\\nIndex Terms—Quantum-enhanced Computer Vision, Applied Quantum Computing, Quantum Algorithms.\\\\n\\\\n✦\\\\n\\\\n1 INTRODUCTION\\\\n\\\\nIn addition, theoretical guarantees of classical solvers are only locally optimal [6], and as such, may not be sufficient for certain problems.\\\\n\\\\nHence, in many cases, progress on the algorithmic side is a valuable alternative to increasing demands in storage and computational resources. Such algorithmic improvements can be achieved in different ways, e.g. by approximative for- mulations and solutions instead of exact but computation- ally expensive or intractable ones. This survey investigates a specific type of those, namely related to how quantum computing can be leveraged in computer vision.\\\\n\\\\nQuantum Computing at a Glance. Since the proposal of quantum computers in the 1980s [7], [8], [9], substantial progress has been made in their practical experimental realisations. The new computational paradigm inspired a multitude of works on theoretical foundations of quantum computing (computer science) [10], [11] and quantum hard- ware realisations (physics) [12]. Researchers have been ac- tively working on various qubit technologies for the last 25 years, and quantum computers based on them are available now for research purposes. Such machines and hardware that are and will become available in the near future— with up to several hundred qubits—are often called Noisy Intermediate-scale Quantum (NISQ) computers [13]. In 2019, a quantum computer, Google’s Sycamore, was able to perform a particular calculation tens of orders of magnitude quicker than a classical computer [14].\\\\n\\\\nQuantum computation fundamentally revolves around\\\\n\\\\nC Omputer Vision (CV) studies automatic processing of visual and spatial information. Data representing such information is acquired in the form of 2D images, videos, depth maps, 3D point clouds, and different combinations of these inputs, possibly along with other sensory signals (e.g. data from inertial measurement units) [1], [2]. The long- term aim of computer vision is to come up with intelligent, high-level interpretations of the observed scenes, inspired by the capabilities of the Human Visual System (HVS). Today, CV and Machine Learning (ML) constantly inform each other: modern CV strongly relies on ML techniques while developments in ML are often driven by unsolved problems in CV [3].\\\\n\\\\nMany modern CV systems are inherently complex: They include multiple components and process large data vol- umes (e.g. during neural network training) [4], [5]. 
Pri- mary tools of computer vision, such as deep learning, have started to demand unreasonable and hard-to-satisfy GPU resources when classical computation is concerned. Neural architectures in many areas of CV are becoming increasingly larger, parameter-heavy and require more and more time to converge during training. Moreover, many CV problems contain combinatorial objectives that cannot be solved by an exhaustive search in a reasonable time. As a consequence, they require hardware that is able to fulfil high data pro- cessing demands (e.g. graphical processing units (GPUs)).\\\\n\\\\n5 2 0 2 t c O 8 ] V C . s c [ 1 v 7 1 3 7 0 . 0 1 5 2 : v i X r\\\\n\\\\na\\\\n\\\\nQuantum-enhanced Computer Vision: Going Beyond Classical Algorithms\\\\n\\\\nNatacha Kuete Meli1 Tat-Jun Chin2\\\\n\\\\nTolga Birdal3\\\\n\\\\nShuteng Wang4 Marcel Seelbach Benkner1 Michele Sasdelli2\\\\n\\\\nVladislav Golyanik4\\\\n\\\\nMichael Moeller1\\\\n\\\\nniversity of Siegen\\\\n\\\\n2University of Adelaide\\\\n\\\\n3Imperial College London\\\\n\\\\nAMPI for Informatics\\\\n\\\\nAbstract—Quantum-enhanced Computer Vision (QeCV) is a new research field at the intersection of computer vision, optimisation theory, machine learning and quantum computing. It has high potential to transform how visual signals are processed and interpreted with the help of quantum computing that leverages quantum-mechanical effects in computations inaccessible to classical (i.e. non-quantum) computers. In scenarios where existing non-quantum methods cannot find a solution in a reasonable time or compute only approximate solutions, quantum computers can provide, among others, advantages in terms of better time scalability for multiple problem classes. Parametrised quantum circuits can also become, in the long term, a considerable alternative to classical neural networks in computer vision. However, specialised and fundamentally new algorithms must be developed to enable compatibility with quantum hardware and unveil the potential of quantum computational paradigms in computer vision. This survey contributes to the existing literature on QeCV with a holistic review of this research field. It is designed as a quantum computing reference for the computer vision community, targeting computer vision students, scientists and readers with related backgrounds who want to familiarise themselves with QeCV. We provide a comprehensive introduction to QeCV, its specifics, and methodologies for formulations compatible with quantum hardware and QeCV methods, leveraging two main quantum computational paradigms, i.e. gate-based quantum computing and quantum annealing. We elaborate on the operational principles of quantum computers and the available tools to access, program and simulate them in the context of QeCV. Finally, we review existing quantum computing tools and learning materials and discuss aspects related to publishing and reviewing QeCV papers, open challenges and potential social implications.\\\\n\\\\nIndex Terms—Quantum-enhanced Computer Vision, Applied Quantum Computing, Quantum Algorithms.\\\\n\\\\n✦\\\\n\\\\n1 INTRODUCTION\\\\n\\\\nIn addition, theoretical guarantees of classical solvers are only locally optimal [6], and as such, may not be sufficient for certain problems.\\\\n\\\\nHence, in many cases, progress on the algorithmic side is a valuable alternative to increasing demands in storage and computational resources. Such algorithmic improvements can be achieved in different ways, e.g. 
by approximative for- mulations and solutions instead of exact but computation- ally expensive or intractable ones. This survey investigates a specific type of those, namely related to how quantum computing can be leveraged in computer vision.\\\\n\\\\nQuantum Computing at a Glance. Since the proposal of quantum computers in the 1980s [7], [8], [9], substantial progress has been made in their practical experimental realisations. The new computational paradigm inspired a multitude of works on theoretical foundations of quantum computing (computer science) [10], [11] and quantum hard- ware realisations (physics) [12]. Researchers have been ac- tively working on various qubit technologies for the last 25 years, and quantum computers based on them are available now for research purposes. Such machines and hardware that are and will become available in the near future— with up to several hundred qubits—are often called Noisy Intermediate-scale Quantum (NISQ) computers [13]. In 2019, a quantum computer, Google’s Sycamore, was able to perform a particular calculation tens of orders of magnitude quicker than a classical computer [14].\\\\n\\\\nQuantum computation fundamentally revolves around\\\\n\\\\nC Omputer Vision (CV) studies automatic processing of visual and spatial information. Data representing such information is acquired in the form of 2D images, videos, depth maps, 3D point clouds, and different combinations of these inputs, possibly along with other sensory signals (e.g. data from inertial measurement units) [1], [2]. The long- term aim of computer vision is to come up with intelligent, high-level interpretations of the observed scenes, inspired by the capabilities of the Human Visual System (HVS). Today, CV and Machine Learning (ML) constantly inform each other: modern CV strongly relies on ML techniques while developments in ML are often driven by unsolved problems in CV [3].\\\\n\\\\nMany modern CV systems are inherently complex: They include multiple components and process large data vol- umes (e.g. during neural network training) [4], [5]. Pri- mary tools of computer vision, such as deep learning, have started to demand unreasonable and hard-to-satisfy GPU resources when classical computation is concerned. Neural architectures in many areas of CV are becoming increasingly larger, parameter-heavy and require more and more time to converge during training. Moreover, many CV problems contain combinatorial objectives that cannot be solved by an exhaustive search in a reasonable time. As a consequence, they require hardware that is able to fulfil high data pro- cessing demands (e.g. graphical processing units (GPUs)).\\\\n\\\\nevolving quantum systems into distinct states using quan- tum mechanical effects. To harness these effects for prac- tical computation, two primary paradigms, governed by two physical principles, have emerged: Adiabatic Quantum Computing (AQC) [11], [15] and gate-based quantum com- puting [10], [16]. AQC relies on the smooth evolution or annealing of a so-called Hamiltonian to guide the system toward its lowest-energy state, making it naturally suited for optimisation problems, particularly those expressed in the Quadratic Unconstrained Binary Optimisation (QUBO) form. In contrast, gate-based quantum computing employs discrete unitary transformations, offering greater flexibility in algorithm design. 
Together, these paradigms define mod- ern quantum computation, each with distinct advantages that depend on the specific problem domain.\\\\n\\\\nBroadly speaking, Quantum Computing (QC) allows designing algorithms to solve several classes of computa- tionally challenging problems with possible computational gains depending on the type of QC and the problem. The type of computational gains can range from improvements in asymptotic complexity (compared to the best classical counterparts) [17] to the acceleration of computations in absolute terms (in the same complexity class) [18]. Quantum computers operate on qubits, counterparts of classical bits that leverage quantum effects. These qubits abstractly span a Hilbert space, where computation takes place. In theory, a quantum computer can perform everything a classical com- puter can perform and vice versa. However, the correspond- ing Hilbert space of multiple qubits is exponentially large (2n-dimensional for n qubits), due to the tensor product structure of quantum mechanics, and so-called entangled states where the qubits cannot be described separately. Classical computation of the corresponding exponentially large operators would, of course, be highly time-consuming.\\\\n\\\\nAs of 2023-2025, we observe a transition in the field thanks to the proliferation of real quantum hardware: If previously (pre-2015), quantum hardware was accessible predominantly to researchers developing it, multiple quan- tum and quantum-inspired computers (e.g. adiabatic, gate- based machines, photonic machines, quantum simulators and quantum-inspired optimisers) can nowadays be ac- cessed by researchers from different fields, and the devel- oped methods can be tested on real quantum hardware. This is in contrast to the vast majority of papers on quantum algorithms published before, including highly influential ones that have been tested on very small problems on real quantum hardware so far [19], [20], [21]. For the next two decades, experts predict a super-linear increase in the number of qubits [22] and substantial improvements in the properties of the individual qubits (such as their decoher- ence and connectivity properties). We argue that these pre- dictions should be taken seriously, because the investments in quantum computing and algorithm development are sub- stantial; many national and international initiatives related to quantum computing were brought into being within the last several years. Quantum computing power on selected algorithms scales super-linearly (up to exponentially) with the number of qubits. Almost all the big technology com- panies including Intel, IBM, Google, Microsoft, Amazon, NVIDIA and D-Wave are aiming at a steady increase in the number of qubits and are investing heavily in quantum\\\\n\\\\ntechnology, as the limits of Moore’s law is approached1. For example, Google has publicly announced its goal to build a commercial quantum computer composed of 1M qubits by 20292. The Willow chip achieving successful quantum error correction [24] constitutes an encouraging milestone for quantum computing.\\\\n\\\\nIn response to the aforementioned algorithmic chal- lenges in computer vision and the opportunities of quan- tum computing, computer vision researchers started turning their attention to this new (for the research field) com- putational paradigm. From both theoretical and practical perspectives, it is both desirable and interesting to inves- tigate new (quantum) algorithms for long-standing (vision) problems. 
Based on different principles than before, such algorithms can bring previously unexpected advantages and new properties to computer vision systems.\\\\n\\\\nTo make the best use of the strongly increasing quantum computing power, we need to make the best use of quantum machines, and we need QeCV algorithms to be ready and scalable when practical quantum computing arrives. In the long term, Quantum Processing Units (QPUs) promise to extend the available arsenal of reliable computer vision tools and computational accelerators (with GPUs being an example of currently widely-used technology).\\\\n\\\\nFig. 1 provides an overview of different computer vision problems and quantum computational paradigms that can be used to address them, adiabatic and gate-based quantum computing; it also highlights the common steps of every quantum-compatible computer vision approach (e.g. prob- lem embedding to the hardware, annealing or execution of quantum gates, and, finally, read-out of the solution). We discuss both quantum computational paradigms in Sec. 2.\\\\n\\\\n1.1 Computer Vision meets Quantum Computing\\\\n\\\\nIn the broad sense, Quantum-enhanced Computer Vision (QeCV) encompasses computer vision methods and systems ex- ecuted entirely or partially on quantum hardware. The term “en- hanced” refers to the fact that QeCV methods include clas- sical parts to different degrees (e.g. data encoding, weight matrix preparation, classical neural network parts in hybrid quantum-classical neural architectures) and are boosted (or enhanced) through quantum computations. Hence, modern QeCV methods are hybrid and designed as an interplay between classical and quantum parts. QeCV falls into the category of applied quantum computing, perhaps among the first such disciplines across the fields.\\\\n\\\\nThe goal of QeCV is the development of innovative computer vision techniques (improved or funda- mentally new ones) leveraging quantum computa- tional paradigms and surpassing classical methods in terms of processing speed, required resources, ac- curacy or the ability to learn patterns from complex visual data.\\\\n\\\\n1. Researchers even started to use the term Neven’s Law [23], i.e. re- ferring to the doubly-exponential growth in quantum compute power.\\\\n\\\\n2. https://quantumai.google/learn/map\\\\n\\\\nevolving quantum systems into distinct states using quan- tum mechanical effects. To harness these effects for prac- tical computation, two primary paradigms, governed by two physical principles, have emerged: Adiabatic Quantum Computing (AQC) [11], [15] and gate-based quantum com- puting [10], [16]. AQC relies on the smooth evolution or annealing of a so-called Hamiltonian to guide the system toward its lowest-energy state, making it naturally suited for optimisation problems, particularly those expressed in the Quadratic Unconstrained Binary Optimisation (QUBO) form. In contrast, gate-based quantum computing employs discrete unitary transformations, offering greater flexibility in algorithm design. Together, these paradigms define mod- ern quantum computation, each with distinct advantages that depend on the specific problem domain.\\\\n\\\\nBroadly speaking, Quantum Computing (QC) allows designing algorithms to solve several classes of computa- tionally challenging problems with possible computational gains depending on the type of QC and the problem. 
In the long term, Quantum Processing Units (QPUs) promise to extend the available arsenal of reliable computer vision tools and computational accelerators (with GPUs being an example of currently widely-used technology).\\\\n\\\\nFig. 1 provides an overview of different computer vision problems and quantum computational paradigms that can be used to address them, adiabatic and gate-based quantum computing; it also highlights the common steps of every quantum-compatible computer vision approach (e.g. prob- lem embedding to the hardware, annealing or execution of quantum gates, and, finally, read-out of the solution). We discuss both quantum computational paradigms in Sec. 2.\\\\n\\\\n1.1 Computer Vision meets Quantum Computing\\\\n\\\\nIn the broad sense, Quantum-enhanced Computer Vision (QeCV) encompasses computer vision methods and systems ex- ecuted entirely or partially on quantum hardware. The term “en- hanced” refers to the fact that QeCV methods include clas- sical parts to different degrees (e.g. data encoding, weight matrix preparation, classical neural network parts in hybrid quantum-classical neural architectures) and are boosted (or enhanced) through quantum computations. Hence, modern QeCV methods are hybrid and designed as an interplay between classical and quantum parts. QeCV falls into the category of applied quantum computing, perhaps among the first such disciplines across the fields.\\\\n\\\\nThe goal of QeCV is the development of innovative computer vision techniques (improved or funda- mentally new ones) leveraging quantum computa- tional paradigms and surpassing classical methods in terms of processing speed, required resources, ac- curacy or the ability to learn patterns from complex visual data.\\\\n\\\\n1. Researchers even started to use the term Neven’s Law [23], i.e. re- ferring to the doubly-exponential growth in quantum compute power.\\\\n\\\\n2. https://quantumai.google/learn/map\\\\n\\\\nFig. 1: Quantum-enhanced Computer Vision. . (A): First, a target problem must be formulated in a form consumable by modern quantum machines, e.g. as a QUBO problem for AQC devices or as a gate sequence for gate-based QC. This operation is performed on a host (classical CPU). (B): In AQC, the resulting QUBO defines a logical problem—binary variables that become qubits during optimisation on an idealised quantum annealer with full qubit connectivity. Alternatively, gate-based QC uses a gate sequence to drive the system into a solution-encoding state. (C): To run on a quantum computer with limited connectivity, a logical problem must be minor-embedded or transpiled. During this mapping step, each logical qubit is assigned to one or more physical qubits to match hardware constraints. (D): An AQC device performs annealing for computation, while a gate-based QC device alternatively executes the algorithm describing gates. Adiabatic computers leverage quantum mechanical effects of superposition and tunnelling to find optima of QUBOs. Gate-based computers can additionally harness entanglement and interference to speed up computations, surpassing the capabilities of classical ones. (E): Measured qubit values are unembedded from the hardware and aggregated in the AQC paradigm, or directly read out in gate-based QC. The measurement is repeated several times, and a solution distribution is returned to the host. The bit-strings are processed and interpreted in terms of the original problem. 
Image sources, if applicable (from left to right and top to bottom in each step): (A): [25, IEEE ©2025], [26, IEEE ©2025], [27, IEEE ©2025], [28], [29, Springer Nature ©2025], [30, reproduced under the dataset’s academic license], [31, Springer Nature ©2025], [32, Springer Nature ©2025], (B): [25, IEEE ©2025], [33], [34, IEEE ©2025], (C): [35, Springer Nature ©2025], [36, IonQ ©2025], (D): [37, reproduced under the CC BY-AS 4.0 license], [38, reproduced under the CC BY 3.0 license].\\\\n\\\\nWe emphasise that this definition is intended to be as- pirational, reflecting the evolving and exploratory nature of the field. The use of quantum hardware in QeCV approaches must be justified not only from a computational standpoint but also through demonstrable benefits in solution quality or predictive accuracy. In other words, it is insufficient to dub a method as quantum solely because it can be executed on quantum hardware. Recall that a universal quantum computer can execute everything a classical binary machine can, but not the other way around (in reasonable time; classical computers can accurately simulate quantum com- putations with exponential growth of classical resources, which becomes infeasible for large numbers of qubits [39]).\\\\n\\\\nQeCV is an emerging field. The first QeCV method published at a primary computer vision conference was\\\\n\\\\nan approach for correspondence problems on point sets [40]; it appeared on arXiv.org in 2019. The paper provides an introduction to modern quantum annealers and pro- poses algorithms for transformation estimation and point set alignment that can be executed on a quantum annealer once the inputs are pre-processed to a form admissible to it. The term Quantum Computer Vision—or QeCV as it is more often called today—was coined later in Birdal and Golyanik et al. [25] and since then, it is used to denote computer vision methods relying on quantum hardware.\\\\n\\\\nApplications of Quantum Computers in Vision. Not many problems in computer vision can be formulated in a form consumable by modern quantum hardware. The estimated number of qubits necessary for practical computer vision problems with gate quantum computing formulations that\\\\n\\\\nFig. 1: Quantum-enhanced Computer Vision. . (A): First, a target problem must be formulated in a form consumable by modern quantum machines, e.g. as a QUBO problem for AQC devices or as a gate sequence for gate-based QC. This operation is performed on a host (classical CPU). (B): In AQC, the resulting QUBO defines a logical problem—binary variables that become qubits during optimisation on an idealised quantum annealer with full qubit connectivity. Alternatively, gate-based QC uses a gate sequence to drive the system into a solution-encoding state. (C): To run on a quantum computer with limited connectivity, a logical problem must be minor-embedded or transpiled. During this mapping step, each logical qubit is assigned to one or more physical qubits to match hardware constraints. (D): An AQC device performs annealing for computation, while a gate-based QC device alternatively executes the algorithm describing gates. Adiabatic computers leverage quantum mechanical effects of superposition and tunnelling to find optima of QUBOs. Gate-based computers can additionally harness entanglement and interference to speed up computations, surpassing the capabilities of classical ones. (E): Measured qubit values are unembedded from the hardware and aggregated in the AQC paradigm, or directly read out in gate-based QC. 
The measurement is repeated several times, and a solution distribution is returned to the host. The bit-strings are processed and interpreted in terms of the original problem. Image sources, if applicable (from left to right and top to bottom in each step): (A): [25, IEEE ©2025], [26, IEEE ©2025], [27, IEEE ©2025], [28], [29, Springer Nature ©2025], [30, reproduced under the dataset’s academic license], [31, Springer Nature ©2025], [32, Springer Nature ©2025], (B): [25, IEEE ©2025], [33], [34, IEEE ©2025], (C): [35, Springer Nature ©2025], [36, IonQ ©2025], (D): [37, reproduced under the CC BY-AS 4.0 license], [38, reproduced under the CC BY 3.0 license].\\\\n\\\\nWe emphasise that this definition is intended to be as- pirational, reflecting the evolving and exploratory nature of the field. The use of quantum hardware in QeCV approaches must be justified not only from a computational standpoint but also through demonstrable benefits in solution quality or predictive accuracy. In other words, it is insufficient to dub a method as quantum solely because it can be executed on quantum hardware. Recall that a universal quantum computer can execute everything a classical binary machine can, but not the other way around (in reasonable time; classical computers can accurately simulate quantum com- putations with exponential growth of classical resources, which becomes infeasible for large numbers of qubits [39]).\\\\n\\\\nQeCV is an emerging field. The first QeCV method published at a primary computer vision conference was\\\\n\\\\nan approach for correspondence problems on point sets [40]; it appeared on arXiv.org in 2019. The paper provides an introduction to modern quantum annealers and pro- poses algorithms for transformation estimation and point set alignment that can be executed on a quantum annealer once the inputs are pre-processed to a form admissible to it. The term Quantum Computer Vision—or QeCV as it is more often called today—was coined later in Birdal and Golyanik et al. [25] and since then, it is used to denote computer vision methods relying on quantum hardware.\\\\n\\\\nApplications of Quantum Computers in Vision. Not many problems in computer vision can be formulated in a form consumable by modern quantum hardware. The estimated number of qubits necessary for practical computer vision problems with gate quantum computing formulations that\\\\n\\\\nmake use of provably advantageous quantum algorithms is typically larger than available on NISQ architectures. It is rare that QUBO forms are available and can be directly tested on an AQC. Often, it is the case that the target problem first has to be mapped to QUBO, and the solution has to be encoded in binary form. It is, therefore, of broad scientific interest that computer vision (and machine learn- ing) problems can be formulated for quantum computing and efficiently solved with it, while offering advantages compared to their classical method counterparts. Moreover, using a quantum mindset to CV problems can provide a new perspective, leading to new insights for classical CV and new methods, especially for the field in which many things are empirical. We discuss these aspects in Sec. 3. Related Research Fields. 
Several research fields related to QeCV can be identified in the literature, such as quantum-inspired computer vision, Quantum Image Pro- cessing (QIP), and Quantum Machine Learning (QML).\\\\n\\\\nThe first category simulates quantum-mechanical effects or draws inspiration from quantum mechanics [41], [42], [43], [44], [45]. These algorithms are not meant for execution on quantum hardware and are solely inspired by quantum phenomena. They should not be confused with techniques that can execute on quantum hardware. The second method category, i.e. QIP, is a field of quantum information pro- cessing focusing on representing and processing images as quantum states [46], [47], [48]. QIP provides several faster algorithms (in theory) for multiple widely-used linear alge- bra operations and common low-level operations on images [49], [50], [51].\\\\n\\\\nBoth method categories above can be broadly seen as sub-fields of QeCV, though distinct from this survey’s focus on mid- and high-level computer vision tasks (e.g. point set or mesh alignment, object tracking, and robust fitting). In contrast, just as classical ML is deeply intertwined with CV, quantum machine learning [52], [53], [54] explores the intersection of quantum computing and machine learning to enhance learning algorithms. It holds promise for QeCV, potentially accelerating tasks such as image classification, object detection, and pattern recognition. However, current QML methods remain largely theoretical, with limited prac- tical applications due to quantum hardware and scalability constraints. Hence, QML remains distant from this survey’s focus on practical applications to CV.\\\\n\\\\n1.2 Motivation and Scope of this Survey\\\\n\\\\nIn the broad sense, this survey shows by reviewing the first works in the field, how quantum computations and quantum hardware can be leveraged for computer vision. It showcases recent and ongoing progress towards practical quantum computing and computer vision, discusses the current state of the art, limitations therein, expected progress and its impact on computer vision. Our goal is to provide a shared, computer-science-friendly language and mathematical for- mulation of quantum computing, covering its two modern paradigms, i.e. gate-based quantum computing and adia- batic quantum computation. We identify and classify com- puter vision problems that can be addressed by quantum computers and analyse what they do have in common.\\\\n\\\\nWe observe that the quantum technology acts more and more like a booster for algorithm development in CV. The\\\\n\\\\nfact that the developed methods could run on real quantum hardware often brings us to interesting discoveries. Thus, we can often obtain methods that work better than existing classical ones.\\\\n\\\\nThere are several considerations regarding the gate- based model vs adiabatic model. Gate-based QCs are cur- rently severely limited in the number of qubits, their con- nectivity patterns and the accuracy of operations (gates) that can be applied to the qubits. Additionally, decoher- ence poses a significant obstacle. Thus, the largest num- ber factorised on gate-based quantum hardware using the celebrated Shor’s algorithm remains 21 for more than ten years as of the moment of writing [55], [56]. 
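
As an aside to the QUBO mapping mentioned above, the sketch below shows what a Quadratic Unconstrained Binary Optimization instance looks like in plain NumPy: a matrix `Q` over binary variables `x ∈ {0,1}^n`, with the objective `xᵀQx` (a linear term can be folded into the diagonal). This is our own toy illustration, not code from the survey; the brute-force enumeration merely stands in for the sampling a quantum annealer would perform, and the matrix values are invented.

```python
import itertools
import numpy as np

# Toy QUBO: minimise x^T Q x over binary x. The values are illustrative only.
Q = np.array([[-1.0,  2.0,  0.0],
              [ 0.0, -1.0,  2.0],
              [ 0.0,  0.0, -1.0]])

def qubo_energy(x, Q):
    """Energy of a candidate bit-string under the QUBO objective."""
    x = np.asarray(x, dtype=float)
    return float(x @ Q @ x)

# Brute-force enumeration stands in for the annealer; real hardware returns
# a distribution of low-energy bit-strings rather than a single answer.
candidates = itertools.product([0, 1], repeat=Q.shape[0])
energies = {bits: qubo_energy(bits, Q) for bits in candidates}
best = min(energies, key=energies.get)
print("lowest-energy bit-string:", best, "energy:", energies[best])
```

On real hardware the same `Q` would first be minor-embedded onto the physical qubit graph, as described in step (C) of Fig. 1, before annealing and read-out.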
In contrast, the number of qubits in modern quantum annealers such as D- Wave is larger, which in combination with their connectivity and qubit characteristics allows for solving combinatorial optimisation problems in the Ising encodings of sizes rele- vant to computer vision and real-world applications. Also, qubit coherence times required for AQC are shorter than for gate-based machines, which partially explains the better scalability of quantum annealers compared to gate-based machines. Hence, the interest in especially applying AQC in computer vision has grown substantially over the last three years.\\\\n\\\\nThis survey focuses on computer vision methods for gate-based quantum computers and quantum annealers that have been evaluated on real quantum hardware (in the case of adiabatic quantum annealers) or simulators of gate-based quantum computers. We include a few theoretical works on gate-based quantum computing in computer vision without experiments on a simulator though this survey is gener- ally structured to reflect the applied focus of QeCV. Since computer vision is an applied science, we believe that the criterion of experimentation on quantum hardware is piv- otal, especially because modern quantum machines already allow solving problems of sizes encountered in practical applications. That is why we believe it is important that the methods are evaluated and the exposition is not restricted to theoretical considerations.\\\\n\\\\nPaper Selection Criteria. QeCV is a nascent field. In line with its aspirational definition, this survey presents an overview of methods designed for full or hybrid execution (i.e. classical-quantum) on quantum hardware, emphasising approaches that report theoretically grounded results with potential relevance to QeCV. Hence, the main paper selec- tion criterion for this survey is experimental evaluation of the proposed techniques and, at least, some results obtained on real quantum hardware. Moreover, we focus on works published at computer vision conferences (CVPR, ICCV and ECCV) and other venues (perhaps interdisciplinary) that are interested in the application of quantum computers in vision. We also include several technical reports on arXiv.org if they fulfil the main paper selection criterion. The authors made efforts to provide as complete review of the field as possible, but cannot claim completeness in every aspect, since the field is getting momentum and growing. We also recommend interested readers to read the discussed papers for more details.\\\\n\\\\nRelated Surveys. Recently, Quantum Machine Learning (QML) has gained significant attention as it is now a fast- growing area. The large body of works in the QML space has\\\\n\\\\nmake use of provably advantageous quantum algorithms is typically larger than available on NISQ architectures. It is rare that QUBO forms are available and can be directly tested on an AQC. Often, it is the case that the target problem first has to be mapped to QUBO, and the solution has to be encoded in binary form. It is, therefore, of broad scientific interest that computer vision (and machine learn- ing) problems can be formulated for quantum computing and efficiently solved with it, while offering advantages compared to their classical method counterparts. Moreover, using a quantum mindset to CV problems can provide a new perspective, leading to new insights for classical CV and new methods, especially for the field in which many things are empirical. We discuss these aspects in Sec. 3. Related Research Fields. 
In contrast, the number of qubits in modern quantum annealers such as D- Wave is larger, which in combination with their connectivity and qubit characteristics allows for solving combinatorial optimisation problems in the Ising encodings of sizes rele- vant to computer vision and real-world applications. Also, qubit coherence times required for AQC are shorter than for gate-based machines, which partially explains the better scalability of quantum annealers compared to gate-based machines. Hence, the interest in especially applying AQC in computer vision has grown substantially over the last three years.\\\\n\\\\nThis survey focuses on computer vision methods for gate-based quantum computers and quantum annealers that have been evaluated on real quantum hardware (in the case of adiabatic quantum annealers) or simulators of gate-based quantum computers. We include a few theoretical works on gate-based quantum computing in computer vision without experiments on a simulator though this survey is gener- ally structured to reflect the applied focus of QeCV. Since computer vision is an applied science, we believe that the criterion of experimentation on quantum hardware is piv- otal, especially because modern quantum machines already allow solving problems of sizes encountered in practical applications. That is why we believe it is important that the methods are evaluated and the exposition is not restricted to theoretical considerations.\\\\n\\\\nPaper Selection Criteria. QeCV is a nascent field. In line with its aspirational definition, this survey presents an overview of methods designed for full or hybrid execution (i.e. classical-quantum) on quantum hardware, emphasising approaches that report theoretically grounded results with potential relevance to QeCV. Hence, the main paper selec- tion criterion for this survey is experimental evaluation of the proposed techniques and, at least, some results obtained on real quantum hardware. Moreover, we focus on works published at computer vision conferences (CVPR, ICCV and ECCV) and other venues (perhaps interdisciplinary) that are interested in the application of quantum computers in vision. We also include several technical reports on arXiv.org if they fulfil the main paper selection criterion. The authors made efforts to provide as complete review of the field as possible, but cannot claim completeness in every aspect, since the field is getting momentum and growing. We also recommend interested readers to read the discussed papers for more details.\\\\n\\\\nRelated Surveys. Recently, Quantum Machine Learning (QML) has gained significant attention as it is now a fast- growing area. The large body of works in the QML space has\\\\n\\\\nbeen reviewed in [57], [58], whereas [57], [59], [60] also focus on quantum approaches for tackling optimisation problems. Out of all machine learning models, quantum deep learning requires special attention due to the impact of these learning machines in the field of AI. Massoli et al. [61] gather, compare and analyse the current state-of-the-art concerning Quantum Neural Networks (QNN). Yarkoni et al. [62], on the other hand, look into possible industry applications of Quantum Annealing (QA); their survey is perhaps closest to our work in terms of focusing on an application area of QA.\\\\n\\\\nMohseni et al. [18] review different methods for solv- ing Ising problems (exactly or approximately) and discuss quantum annealing as one meta-heuristic. 
A few surveys focus on quantum annealing for physicists [63], whereas our work is written for researchers with a computer vision back- ground. The short survey by Larasati et al. [64] provides a summary of a few quantum computer vision methods (<5% of what we cover) and is much less comprehensive than our article. Another brief paper by Mebtouche et al. [65] reviews the foundations of quantum computing, machine learning and multiple low-level vision and image processing tech- niques. In this regard, ours is the first comprehensive survey of its kind with an emphasis on QeCV.\\\\n\\\\nTarget Audience. This survey it is written for applied com- puter scientists, especially computer vision researchers and practitioners, who may have little to no background in theoretical physics. Unlike previous surveys [18], [57], [58], [59], [60], [61], [65] on quantum algorithms, mostly published outside traditional computer vision venues, our goal is to make QeCV accessible and actionable for a broader audience of computer vision researchers and engineers. By consolidating multiple contributions, we aim to bridge that gap and provide a comprehensive overview of existing QeCV methods, along with insights into future directions.\\\\n\\\\nWe present a practical “cookbook” for computer vision practitioners to begin exploring quantum-enhanced tech- niques today and to prepare for the growing impact of quan- tum technologies. While the survey avoids deep dives into physics and quantum theory, it includes sufficient technical detail to ensure a solid foundation for the presented results and the validity thereof. Physical assumptions are clearly separated from the mathematical consequences relevant to algorithmic design. The intended readership includes:\\\\n\\\\n• Computer vision researchers seeking to understand whether their work can benefit from quantum comput- ing;\\\\n\\\\n• Computer vision practitioners interested in experiment- ing with QeCV algorithms or integrating quantum solvers into their workflows;\\\\n\\\\n• Computer science students at various levels who are curious about the intersection of computer vision and applied quantum computing.\\\\n\\\\nWe hope this survey will serve as a gateway for new researchers to enter the field and contribute to its devel- opment.\\\\n\\\\n1.3 Structure of this Survey\\\\n\\\\nThis survey is structured in six sections. Sec. 2 reviews foundations of quantum computing relevant to quantum- enhanced computer vision methods including two main\\\\n\\\\nquantum computational paradigms, i.e. gate-based (or circuit-based) and adiabatic quantum computing, as well as the design and hardware of quantum computers. Moreover, as part of it, Sec. 3.1.2 discusses the encoding of classical data as quantum states suitable for processing on quantum hardware, which is one of the essential steps in quantum- enhanced computer vision. Next, Sec. 3 is devoted to algo- rithms and applications. It reviews methods for different problems such as point set alignment, mesh registration, object tracking, model fitting, quantum machine learning for vision, implicit representations and generative approaches, among others. We then discuss open challenges, specifics of the field and social implications in Sec. 5, and conclude in Sec. 6.\\\\n\\\\n2 OPERATIONAL PRINCIPLES OF QCS\\\\n\\\\nThis section reviews the foundations of quantum computing necessary as a short introduction (or a refresher) to the field and sufficient for understanding the published literature reviewed in Sec. 3. Sec. 
2.1 introduces fundamentals such as notations and elementary operations on qubits. Secs. 2.2 and 2.3 describe the two main quantum computational paradigms—namely, gate-based quantum computing and quantum annealing, respectively—and Sec. 2.4 establishes their connections and equivalency. Sec. 2.5 examines po- tential advantages of quantum computing over classical computing and Sec. 2.6 discusses hardware designs and realizations of quantum computers.\\\\n\\\\nNote that we do not review complexity classes associated with the quantum computational paradigm, as their discus- sion is outside the scope. In this survey, it is sufficient to mention that quantum computers can accelerate computa- tions relevant to computer vision and provide different and favourable properties in quantum-enhanced learning mod- els; both of these characteristics provide strong motivation for the newly established research direction of QeCV. We recommend references [66], [67] to readers wishing to learn more about quantum complexity theory.\\\\n\\\\n2.1 Fundamentals and Notations\\\\n\\\\nThis background section introduces the fundamentals of quantum computing through both gate-based and adiabatic paradigms. It also establishes the notation and terminology used throughout the survey. For clarity and ease of refer- ence, a comprehensive overview of symbols and acronyms is provided in Tabs. 1 and 2; note that the acronym list excludes method names and company identifiers.\\\\n\\\\nMany concepts in quantum computing have direct ana- logues in classical computing and optimization theory. For example, Hamiltonians represent energy functions, with eigenstates corresponding to energy levels and ground states denoting the lowest-energy configurations. Through- out the survey, we assume these physical Hamiltonians to be Hermitian operators, and unless otherwise stated, the quantum systems considered are closed—i.e., they do not exchange particles with their environment. We adopt the bra–ket notation to concisely express familiar linear alge- braic constructs such as row and column vectors, inner and outer products, and tensor products. This notation stream- lines the presentation of quantum algorithms and aligns\\\\n\\\\nbeen reviewed in [57], [58], whereas [57], [59], [60] also focus on quantum approaches for tackling optimisation problems. Out of all machine learning models, quantum deep learning requires special attention due to the impact of these learning machines in the field of AI. Massoli et al. [61] gather, compare and analyse the current state-of-the-art concerning Quantum Neural Networks (QNN). Yarkoni et al. [62], on the other hand, look into possible industry applications of Quantum Annealing (QA); their survey is perhaps closest to our work in terms of focusing on an application area of QA.\\\\n\\\\nMohseni et al. [18] review different methods for solv- ing Ising problems (exactly or approximately) and discuss quantum annealing as one meta-heuristic. A few surveys focus on quantum annealing for physicists [63], whereas our work is written for researchers with a computer vision back- ground. The short survey by Larasati et al. [64] provides a summary of a few quantum computer vision methods (<5% of what we cover) and is much less comprehensive than our article. Another brief paper by Mebtouche et al. [65] reviews the foundations of quantum computing, machine learning and multiple low-level vision and image processing tech- niques. 
2.6 discusses hardware designs and realizations of quantum computers.\\\\n\\\\nNote that we do not review complexity classes associated with the quantum computational paradigm, as their discus- sion is outside the scope. In this survey, it is sufficient to mention that quantum computers can accelerate computa- tions relevant to computer vision and provide different and favourable properties in quantum-enhanced learning mod- els; both of these characteristics provide strong motivation for the newly established research direction of QeCV. We recommend references [66], [67] to readers wishing to learn more about quantum complexity theory.\\\\n\\\\n2.1 Fundamentals and Notations\\\\n\\\\nThis background section introduces the fundamentals of quantum computing through both gate-based and adiabatic paradigms. It also establishes the notation and terminology used throughout the survey. For clarity and ease of refer- ence, a comprehensive overview of symbols and acronyms is provided in Tabs. 1 and 2; note that the acronym list excludes method names and company identifiers.\\\\n\\\\nMany concepts in quantum computing have direct ana- logues in classical computing and optimization theory. For example, Hamiltonians represent energy functions, with eigenstates corresponding to energy levels and ground states denoting the lowest-energy configurations. Through- out the survey, we assume these physical Hamiltonians to be Hermitian operators, and unless otherwise stated, the quantum systems considered are closed—i.e., they do not exchange particles with their environment. We adopt the bra–ket notation to concisely express familiar linear alge- braic constructs such as row and column vectors, inner and outer products, and tensor products. This notation stream- lines the presentation of quantum algorithms and aligns\\\\n\\\\nTABLE 1: List of symbols used in the survey.\\\\n\\\\nSymbol Description N Set of natural numbers R Set of real numbers C Set of complex numbers i ℏ Complex unit (if not stated otherwise) Reduced Planck’s constant θ Optimisable rotation angle n Number of qubits in the system |ψ⟩ State vector of the system ⟨ψ| Conjugate transpose of |ψ⟩ ⟨ψ1|ψ2⟩ ∥□∥ |□| Complex inner product of |ψ1⟩ and |ψ2⟩ L2-norm of □ Absolute value of □ ρ Density operator, i.e. ρ = |ψ⟩⟨ψ| ⊗ Kronecker product, Tensor product G Generator of a unitary gate I Identity matrix or operator (per context) H Hamiltonian or Hadamard gate (per context) HI Initial Hamiltonian HP Problem Hamiltonian M Measurement observable U Unitary operator σ{x,y,z} Pauli-X, -Y and -Z operators Tr Trace operator λ Eigenvalue or penalty factor (per context) T Total time ∆t Time step s x v Ising variables, i.e. s ∈ {−1,1}n QUBO variables, i.e. x ∈ {0,1}n General binary variable, i.e. 
v ∈ {v1,v2}n J,b Couplings and biases of an Ising problem Q,c Couplings and biases of a QUBO problem W,w Couplings and biases of a general quadratic binary decision problem A,b Matrix and vector of linear constraints f Schedule or objective function (per context) L Loss function of PQCs\\\\n\\\\nTABLE 2: List of frequent acronyms used in the survey.\\\\n\\\\nAcronym Description AQC Adiabatic quantum computing BNN Binary neural networks CPU Central processing unit CV Computer vision CVPR Computer Vision and Pattern Recognition ECCV European Conference on Computer Vision GPU Graphics processing unit GQC Gate-based quantum computing ICCV International Conference on Computer Vision MLP Multi-layer perceptron ML Machine learning NISQ Noisy intermediate-scale quantum NN Neural networks PQC Parameterized quantum circuit QA Quantum annealing QBO Quantum binary optimization QC Quantum computing QCT Quantum complexity theory QCNN Quantum convolutional neural network QCVML Quantum computer vision and machine learning QDK Quantum development kit QeCV Quantum-enhanced computer vision QIP Quantum image processing QML Quantum machine learning QNN Quantum neural networks QPU Quantum processing unit QUBO Quadratic unconstrained binary optimization SDK Software development kit SSD Sum of squared distances\\\\n\\\\nwith conventions in both physics and quantum information theory.\\\\n\\\\nSingle Qubits. Let us start with the most fundamental building block and elementary information unit of a quan- tum computer, the qubit.\\\\n\\\\nThe information carrier in a quantum comput- ing system called qubit is described by a two- dimensional complex vector of length one, i.e.\\\\n\\\\n(1)\\\\n\\\\nWe adopt the widely-used bra–ket notation common in physics and quantum computing to write vectors and their conjugate transposes: |ψ⟩ (ket) denotes a column vector and ⟨ψ| = |ψ⟩∗ (ket) denotes its conjugate transpose. The multiplication of a bra- and a ket-vector ⟨ψ1| and |ψ2⟩, denoted ⟨ψ1|ψ2⟩, results in their inner product in the C2 Hilbert space3. We can write an arbitrary two-dimensional complex vector as a column vector:\\\\n\\\\nwith real coefficients a,b,c,d ∈ R. The normalisation con- dition of the qubit’s state vector in Eq. (1) then yields a2 + b2 + c2 + d2 = 1.\\\\n\\\\nNext, we translate a quantum-physical phenomenon, namely that particles can also be in superposition of states. Only after measurement do they collapse to one of the classical states with a certain probability. This is modeled mathematically in the following way: Let |0⟩ ∈ C2 and |1⟩ ∈ C2 form an orthonormal basis of C2 (whose precise form depends on the setup of the physical system). For example, one can have in column vector notation:\\\\n\\\\nWhen the state of a qubit |ψ⟩ = α|0⟩ + β |1⟩ ∈ C2 is measured (with respect to the basis {|0⟩,|1⟩}), it results in a state\\\\n\\\\n[0) 1)\\\\n\\\\nwith probability |α|2 = |⟨0|ψ⟩|2,\\\\n\\\\nwith probability |β|2 = |⟨1|ψ⟩|2.\\\\n\\\\n(4)\\\\n\\\\nIn other words, a qubit exists in a superposition of classi- cal states as |ψ⟩ = α|0⟩+β |1⟩, where α and β are probability amplitudes. Upon measurement, the qubit collapses into either |0⟩ or |1⟩, with the measurement probability given by the square of the absolute value of the respective amplitude: |α|2 for |0⟩ and |β|2 for |1⟩. This is also called collapse of the\\\\n\\\\n3. Note that in quantum mechanics, states can also have higher dimension than qubits. 
For non-discrete quantities like momentum or position, these states are not finite-dimensional but functions from the Hilbert space L2(R3), i.e. a so-called Lebesgue space. In this case, the bra vectors can be understood as functionals from L2. Furthermore, even the formulation on L2 still has shortcomings and a full math- ematical treatment would have to resort to rigged Hilbert spaces to describe distributions and deal with unbounded operators [68].\\\\n\\\\n(2)\\\\n\\\\n(3)\\\\n\\\\nTABLE 1: List of symbols used in the survey.\\\\n\\\\nSymbol Description N Set of natural numbers R Set of real numbers C Set of complex numbers i ℏ Complex unit (if not stated otherwise) Reduced Planck’s constant θ Optimisable rotation angle n Number of qubits in the system |ψ⟩ State vector of the system ⟨ψ| Conjugate transpose of |ψ⟩ ⟨ψ1|ψ2⟩ ∥□∥ |□| Complex inner product of |ψ1⟩ and |ψ2⟩ L2-norm of □ Absolute value of □ ρ Density operator, i.e. ρ = |ψ⟩⟨ψ| ⊗ Kronecker product, Tensor product G Generator of a unitary gate I Identity matrix or operator (per context) H Hamiltonian or Hadamard gate (per context) HI Initial Hamiltonian HP Problem Hamiltonian M Measurement observable U Unitary operator σ{x,y,z} Pauli-X, -Y and -Z operators Tr Trace operator λ Eigenvalue or penalty factor (per context) T Total time ∆t Time step s x v Ising variables, i.e. s ∈ {−1,1}n QUBO variables, i.e. x ∈ {0,1}n General binary variable, i.e. v ∈ {v1,v2}n J,b Couplings and biases of an Ising problem Q,c Couplings and biases of a QUBO problem W,w Couplings and biases of a general quadratic binary decision problem A,b Matrix and vector of linear constraints f Schedule or objective function (per context) L Loss function of PQCs\\\\n\\\\nTABLE 2: List of frequent acronyms used in the survey.\\\\n\\\\nAcronym Description AQC Adiabatic quantum computing BNN Binary neural networks CPU Central processing unit CV Computer vision CVPR Computer Vision and Pattern Recognition ECCV European Conference on Computer Vision GPU Graphics processing unit GQC Gate-based quantum computing ICCV International Conference on Computer Vision MLP Multi-layer perceptron ML Machine learning NISQ Noisy intermediate-scale quantum NN Neural networks PQC Parameterized quantum circuit QA Quantum annealing QBO Quantum binary optimization QC Quantum computing QCT Quantum complexity theory QCNN Quantum convolutional neural network QCVML Quantum computer vision and machine learning QDK Quantum development kit QeCV Quantum-enhanced computer vision QIP Quantum image processing QML Quantum machine learning QNN Quantum neural networks QPU Quantum processing unit QUBO Quadratic unconstrained binary optimization SDK Software development kit SSD Sum of squared distances\\\\n\\\\nwith conventions in both physics and quantum information theory.\\\\n\\\\nSingle Qubits. Let us start with the most fundamental building block and elementary information unit of a quan- tum computer, the qubit.\\\\n\\\\nThe information carrier in a quantum comput- ing system called qubit is described by a two- dimensional complex vector of length one, i.e.\\\\n\\\\n(1)\\\\n\\\\nWe adopt the widely-used bra–ket notation common in physics and quantum computing to write vectors and their conjugate transposes: |ψ⟩ (ket) denotes a column vector and ⟨ψ| = |ψ⟩∗ (ket) denotes its conjugate transpose. The multiplication of a bra- and a ket-vector ⟨ψ1| and |ψ2⟩, denoted ⟨ψ1|ψ2⟩, results in their inner product in the C2 Hilbert space3. 
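
To make the bra–ket bookkeeping above concrete, here is a small NumPy sketch of our own (not code from the survey): a qubit is a unit-norm vector in C², `⟨φ|ψ⟩` is the conjugate-transpose inner product, and measurement yields |0⟩ or |1⟩ with probabilities |⟨0|ψ⟩|² and |⟨1|ψ⟩|², as Eq. (4) states.

```python
import numpy as np

ket0 = np.array([1.0, 0.0], dtype=complex)   # |0>
ket1 = np.array([0.0, 1.0], dtype=complex)   # |1>

alpha, beta = 0.6, 0.8j                      # amplitudes with |alpha|^2 + |beta|^2 = 1
psi = alpha * ket0 + beta * ket1             # |psi> = alpha|0> + beta|1>
assert np.isclose(np.linalg.norm(psi), 1.0)  # normalisation condition of Eq. (1)

def braket(phi, psi):
    """<phi|psi>: conjugate transpose of |phi> applied to |psi>."""
    return np.vdot(phi, psi)

p0 = abs(braket(ket0, psi)) ** 2             # probability of collapsing to |0>
p1 = abs(braket(ket1, psi)) ** 2             # probability of collapsing to |1>

# Repeated measurement produces the distribution described in Eq. (4).
rng = np.random.default_rng(0)
samples = rng.choice([0, 1], size=10_000, p=[p0, p1])
print(p0, p1, samples.mean())                # empirical frequency of |1> approaches p1
```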
We can write an arbitrary two-dimensional complex vector as a column vector:\\\\n\\\\nwith real coefficients a,b,c,d ∈ R. The normalisation con- dition of the qubit’s state vector in Eq. (1) then yields a2 + b2 + c2 + d2 = 1.\\\\n\\\\nNext, we translate a quantum-physical phenomenon, namely that particles can also be in superposition of states. Only after measurement do they collapse to one of the classical states with a certain probability. This is modeled mathematically in the following way: Let |0⟩ ∈ C2 and |1⟩ ∈ C2 form an orthonormal basis of C2 (whose precise form depends on the setup of the physical system). For example, one can have in column vector notation:\\\\n\\\\nWhen the state of a qubit |ψ⟩ = α|0⟩ + β |1⟩ ∈ C2 is measured (with respect to the basis {|0⟩,|1⟩}), it results in a state\\\\n\\\\n[0) 1)\\\\n\\\\nwith probability |α|2 = |⟨0|ψ⟩|2,\\\\n\\\\nwith probability |β|2 = |⟨1|ψ⟩|2.\\\\n\\\\n(4)\\\\n\\\\nIn other words, a qubit exists in a superposition of classi- cal states as |ψ⟩ = α|0⟩+β |1⟩, where α and β are probability amplitudes. Upon measurement, the qubit collapses into either |0⟩ or |1⟩, with the measurement probability given by the square of the absolute value of the respective amplitude: |α|2 for |0⟩ and |β|2 for |1⟩. This is also called collapse of the\\\\n\\\\n3. Note that in quantum mechanics, states can also have higher dimension than qubits. For non-discrete quantities like momentum or position, these states are not finite-dimensional but functions from the Hilbert space L2(R3), i.e. a so-called Lebesgue space. In this case, the bra vectors can be understood as functionals from L2. Furthermore, even the formulation on L2 still has shortcomings and a full math- ematical treatment would have to resort to rigged Hilbert spaces to describe distributions and deal with unbounded operators [68].\\\\n\\\\n(2)\\\\n\\\\n(3)\\\\n\\\\nFig. 2: Visualising an arbitrary state of a qubit |ψ⟩ on the Bloch sphere along with its several widely encountered states. Although in the original space C2 the states |0⟩, |1⟩ are orthogonal, they are visualised as opposite poles on the Bloch sphere.\\\\n\\\\nwave function. The act of measurement w.r.t. a basis changes the state into one of the basis elements, with probabilities defined by the projections of the state onto the basis.\\\\n\\\\nBloch Sphere. As the probabilities of obtaining certain measurement outcomes depend only on the magnitude of the coefficients, it is easy to see that a change of global phase, i.e. a multiplication by a global factor eiϕ with ϕ ∈ R, does not affect any measurable properties of a qubit. Due to this ambiguity in the complex phase, it is common to fix α, the coefficient for the first basis vector |0⟩, to be real, i.e. α ∈ R, thereby resolving the ambiguity.\\\\n\\\\nAlong with the normalisation condition |α|2 + |β|2 = 1 as given in Eq. (1), any qubit state can be expressed as\\\\n\\\\nThe two angles θ ∈ [0,π] and φ ∈ [0,2π] naturally define a point on the unit sphere in three dimensions, known as the Bloch sphere. The state of a qubit |ψ⟩ is frequently visualised in this representation; see Fig. 2.\\\\n\\\\nMulti-Qubit Systems. 
When multiple qubits |ψ1⟩,|ψ2⟩,...,|ψn⟩ are considered, their overall state |ψ⟩ is described by the tensor or Kronecker product of the individual qubit states and results in a vector consisting of all combinations of products of the individual coefficients:\\\\n\\\\nFor two qubits |ψ1⟩ = α|0⟩ + β |1⟩ and |ψ⟩ = γ |0⟩ + δ |1⟩, this product state reads:\\\\n\\\\nA useful shorthand notation for the Kronecker product is\\\\n\\\\nand such a system of multiple qubits is called quantum register.\\\\n\\\\nNote that although the combination of multiple qubits is represented in a 2n-dimensional complex vector space, one would expect that actual (physically meaningful) vectors oc- cupy a low dimensional subset of only those 2n dimensional\\\\n\\\\n(5)\\\\n\\\\n(6)\\\\n\\\\n(7)\\\\n\\\\n(8)\\\\n\\\\nvectors that can be decomposed into a tensor product (6). Note, for example, that the tensor product of two vectors |ψ1⟩ and |ψ2⟩ is (a vectorisation of) their outer product, such that 2-qubit-states of the form (7) can be identified with rank-1 matrices in the space of all complex 2×2 matrices. The extremely interesting and important concept of entan- glement implies that multiple-qubit systems are not limited to so-called separable states that decompose as Eq. (6). The state\\\\n\\\\nfor example, cannot be decomposed as |ψ1⟩ ⊗ |ψ2⟩ from Eq. (7) since there are no α,β,γ,δ that can simultane- √ 2, βγ = ously satisfy the equations αγ = 0, αδ = 1/ √ 1/ 2, βδ = 0. This is one of the famous Einstein-Podolsky- Rosen states [10].\\\\n\\\\nA system of n many qubits can evolve into any state in C2n . States that can be represented as tensor- product states via Eq. (6) are called separable, states that do not admit such a representation are called entangled.\\\\n\\\\nNote that Assumption 2 extends to multi-qubit systems, irrespective of whether the system is separable or entangled. In other words, the probability of measuring a specific multi-qubit basis state is equal to the squared magnitude of the corresponding coefficient.\\\\n\\\\nQubit Evolution. It is possible to manipulate the state of an n-qubit quantum physical system experimentally. For example, let |ψ(0)⟩ be the state of the quantum system at time t. We can prepare an initial state |ψ(0)⟩ and manipulate it at any time t with external influences H(t) ∈ C2n×2n determining the system’s energies by letting the n-many qubits experience a certain coupling. The way the state vector behaves under such an evolution is described by the famous Schr¨odinger Equation.\\\\n\\\\nThe (time) evolution of a quantum state |ψ(t)⟩ is described by the Schr¨odinger Equation\\\\n\\\\n(10)\\\\n\\\\nwhere the Hamiltonian H(t) ∈ C2n×2n is a Hermitian matrix determined by the experimental setup, i is the imaginary unit and ℏ is the reduced Planck constant.\\\\n\\\\nA closer look at the solutions to this differential equation reveals that they follow some unitary time evolution [10]. This is in particular consistent with what we learned that quantum states have norm 1, since the length of the compo- nents is related to measurement probabilities.\\\\n\\\\n(9)\\\\n\\\\nFig. 2: Visualising an arbitrary state of a qubit |ψ⟩ on the Bloch sphere along with its several widely encountered states. Although in the original space C2 the states |0⟩, |1⟩ are orthogonal, they are visualised as opposite poles on the Bloch sphere.\\\\n\\\\nwave function. The act of measurement w.r.t. 
a basis changes the state into one of the basis elements, with probabilities defined by the projections of the state onto the basis.\\\\n\\\\nBloch Sphere. As the probabilities of obtaining certain measurement outcomes depend only on the magnitude of the coefficients, it is easy to see that a change of global phase, i.e. a multiplication by a global factor eiϕ with ϕ ∈ R, does not affect any measurable properties of a qubit. Due to this ambiguity in the complex phase, it is common to fix α, the coefficient for the first basis vector |0⟩, to be real, i.e. α ∈ R, thereby resolving the ambiguity.\\\\n\\\\nAlong with the normalisation condition |α|2 + |β|2 = 1 as given in Eq. (1), any qubit state can be expressed as\\\\n\\\\nThe two angles θ ∈ [0,π] and φ ∈ [0,2π] naturally define a point on the unit sphere in three dimensions, known as the Bloch sphere. The state of a qubit |ψ⟩ is frequently visualised in this representation; see Fig. 2.\\\\n\\\\nMulti-Qubit Systems. When multiple qubits |ψ1⟩,|ψ2⟩,...,|ψn⟩ are considered, their overall state |ψ⟩ is described by the tensor or Kronecker product of the individual qubit states and results in a vector consisting of all combinations of products of the individual coefficients:\\\\n\\\\nFor two qubits |ψ1⟩ = α|0⟩ + β |1⟩ and |ψ⟩ = γ |0⟩ + δ |1⟩, this product state reads:\\\\n\\\\nA useful shorthand notation for the Kronecker product is\\\\n\\\\nand such a system of multiple qubits is called quantum register.\\\\n\\\\nNote that although the combination of multiple qubits is represented in a 2n-dimensional complex vector space, one would expect that actual (physically meaningful) vectors oc- cupy a low dimensional subset of only those 2n dimensional\\\\n\\\\n(5)\\\\n\\\\n(6)\\\\n\\\\n(7)\\\\n\\\\n(8)\\\\n\\\\nvectors that can be decomposed into a tensor product (6). Note, for example, that the tensor product of two vectors |ψ1⟩ and |ψ2⟩ is (a vectorisation of) their outer product, such that 2-qubit-states of the form (7) can be identified with rank-1 matrices in the space of all complex 2×2 matrices. The extremely interesting and important concept of entan- glement implies that multiple-qubit systems are not limited to so-called separable states that decompose as Eq. (6). The state\\\\n\\\\nfor example, cannot be decomposed as |ψ1⟩ ⊗ |ψ2⟩ from Eq. (7) since there are no α,β,γ,δ that can simultane- √ 2, βγ = ously satisfy the equations αγ = 0, αδ = 1/ √ 1/ 2, βδ = 0. This is one of the famous Einstein-Podolsky- Rosen states [10].\\\\n\\\\nA system of n many qubits can evolve into any state in C2n . States that can be represented as tensor- product states via Eq. (6) are called separable, states that do not admit such a representation are called entangled.\\\\n\\\\nNote that Assumption 2 extends to multi-qubit systems, irrespective of whether the system is separable or entangled. In other words, the probability of measuring a specific multi-qubit basis state is equal to the squared magnitude of the corresponding coefficient.\\\\n\\\\nQubit Evolution. It is possible to manipulate the state of an n-qubit quantum physical system experimentally. For example, let |ψ(0)⟩ be the state of the quantum system at time t. We can prepare an initial state |ψ(0)⟩ and manipulate it at any time t with external influences H(t) ∈ C2n×2n determining the system’s energies by letting the n-many qubits experience a certain coupling. 
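
Before turning to the Schrödinger equation below, a quick numerical illustration of the separable-vs-entangled distinction from the multi-qubit discussion above. This is our own sketch using standard linear algebra: product states of Eq. (6)/(7) reshape into rank-1 2×2 coefficient matrices, while the Einstein–Podolsky–Rosen state (the extraction drops the equation body; the form (|01⟩+|10⟩)/√2 is assumed here, matching the quoted constraints αγ = 0, αδ = βγ = 1/√2, βδ = 0) has rank 2 and admits no decomposition |ψ1⟩⊗|ψ2⟩.

```python
import numpy as np

ket0 = np.array([1, 0], dtype=complex)
ket1 = np.array([0, 1], dtype=complex)

# Separable two-qubit state: Kronecker product of two single-qubit states.
psi1 = (ket0 + ket1) / np.sqrt(2)
psi2 = ket1
product_state = np.kron(psi1, psi2)

# EPR-type state (|01> + |10>)/sqrt(2): no choice of alpha, beta, gamma, delta
# satisfies alpha*gamma = 0, alpha*delta = beta*gamma = 1/sqrt(2), beta*delta = 0.
epr = (np.kron(ket0, ket1) + np.kron(ket1, ket0)) / np.sqrt(2)

def coefficient_rank(state):
    """Rank of the reshaped 2x2 coefficient matrix: 1 => separable, 2 => entangled."""
    return np.linalg.matrix_rank(state.reshape(2, 2))

print(coefficient_rank(product_state))  # 1 -> separable
print(coefficient_rank(epr))            # 2 -> entangled
```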
The way the state vector behaves under such an evolution is described by the famous Schr¨odinger Equation.\\\\n\\\\nThe (time) evolution of a quantum state |ψ(t)⟩ is described by the Schr¨odinger Equation\\\\n\\\\n(10)\\\\n\\\\nwhere the Hamiltonian H(t) ∈ C2n×2n is a Hermitian matrix determined by the experimental setup, i is the imaginary unit and ℏ is the reduced Planck constant.\\\\n\\\\nA closer look at the solutions to this differential equation reveals that they follow some unitary time evolution [10]. This is in particular consistent with what we learned that quantum states have norm 1, since the length of the compo- nents is related to measurement probabilities.\\\\n\\\\n(9)\\\\n\\\\n2.2 Gate-Based Quantum Computers\\\\n\\\\nAs alluded to in Sec. 2.1, the Schr¨odinger equation (10) gov', 'name': 'generate_summary'} because the `arguments` is not valid JSON.\n", + "For troubleshooting, visit: https://python.langchain.com/docs/troubleshooting/errors/OUTPUT_PARSING_FAILURE \u001b[0mInvalid or incomplete response" + ] + }, + { + "name": "stderr", + "output_type": "stream", + "text": [ + "INFO: HTTP Request: POST https://api.openai.com/v1/chat/completions \"HTTP/1.1 200 OK\"\n" + ] + }, + { + "name": "stdout", + "output_type": "stream", + "text": [ + "\u001b[32;1m\u001b[1;3m\n", + "Invoking: `collapse_summaries` with `[{\"summary\":\"BigCodeArena introduces a novel advancement in AI code generation evaluation by enabling real-time execution of AI-generated code, allowing users to compare models and vote on their performance based on actual outputs. This platform addresses the limitations of traditional benchmarks by providing a human-in-the-loop system where code can be run in isolated environments, supporting multiple languages and frameworks. The platform has shown significant performance improvements, with models like o3-mini and o1-mini consistently ranking at the top across various languages and execution environments. Practical applications span web design, game development, scientific computing, and more, highlighting the platform's versatility and industry impact. BigCodeArena's community-driven approach has led to over 14,000 conversations and 4,700 preference votes, offering valuable insights into model performance across diverse coding scenarios. The introduction of benchmarks like BigCodeReward and AutoCodeArena further enhances the evaluation process, with execution feedback proving crucial for accurate code quality assessment. Open-source contributions and continuous updates aim to expand language support and improve automated code quality assessment, positioning BigCodeArena as a leading tool for evaluating AI-assisted programming.\"},{\"summary\":\"In 2025, the feasibility of running competitive OCR models on-device has significantly improved, exemplified by RedNote's dots.ocr, a 3 billion parameter model that outperforms Gemini 2.5 Pro in the OmniDocBench. This advancement allows for OCR applications without network dependency, API costs, or data privacy concerns. Apple's Neural Engine, a custom AI accelerator, enhances on-device performance by being 12x more power-efficient than CPUs and 4x more than GPUs. However, the Neural Engine is only accessible via Apple's closed-source Core ML framework, which poses challenges for developers converting models from PyTorch. To address this, Apple offers MLX, a flexible ML framework targeting GPUs, which can be used alongside Core ML. 
The article outlines a conversion process for dots.ocr using CoreML and MLX, demonstrating the potential for broader application in on-device model deployment. Despite successful conversion, the model's initial size of over 5GB highlights the need for further optimization for practical deployment. Future parts of the series will explore integration and optimization strategies, including quantization and dynamic shapes, to enhance performance on the Neural Engine.\"}]`\n", + "\n", + "\n", + "\u001b[0m\u001b[38;5;200m\u001b[1;3m{'status': 'success', 'collapsed_summaries': [{'summary': \"BigCodeArena introduces a novel advancement in AI code generation evaluation by enabling real-time execution of AI-generated code, allowing users to compare models and vote on their performance based on actual outputs. This platform addresses the limitations of traditional benchmarks by providing a human-in-the-loop system where code can be run in isolated environments, supporting multiple languages and frameworks. The platform has shown significant performance improvements, with models like o3-mini and o1-mini consistently ranking at the top across various languages and execution environments. Practical applications span web design, game development, scientific computing, and more, highlighting the platform's versatility and industry impact. BigCodeArena's community-driven approach has led to over 14,000 conversations and 4,700 preference votes, offering valuable insights into model performance across diverse coding scenarios. The introduction of benchmarks like BigCodeReward and AutoCodeArena further enhances the evaluation process, with execution feedback proving crucial for accurate code quality assessment. Open-source contributions and continuous updates aim to expand language support and improve automated code quality assessment, positioning BigCodeArena as a leading tool for evaluating AI-assisted programming.\"}, {'summary': \"In 2025, the feasibility of running competitive OCR models on-device has significantly improved, exemplified by RedNote's dots.ocr, a 3 billion parameter model that outperforms Gemini 2.5 Pro in the OmniDocBench. This advancement allows for OCR applications without network dependency, API costs, or data privacy concerns. Apple's Neural Engine, a custom AI accelerator, enhances on-device performance by being 12x more power-efficient than CPUs and 4x more than GPUs. However, the Neural Engine is only accessible via Apple's closed-source Core ML framework, which poses challenges for developers converting models from PyTorch. To address this, Apple offers MLX, a flexible ML framework targeting GPUs, which can be used alongside Core ML. The article outlines a conversion process for dots.ocr using CoreML and MLX, demonstrating the potential for broader application in on-device model deployment. Despite successful conversion, the model's initial size of over 5GB highlights the need for further optimization for practical deployment. Future parts of the series will explore integration and optimization strategies, including quantization and dynamic shapes, to enhance performance on the Neural Engine.\"}], 'message': 'Summaries already fit in context (637 tokens). 
No collapse needed.'}\u001b[0m" + ] + }, + { + "name": "stderr", + "output_type": "stream", + "text": [ + "INFO: HTTP Request: POST https://api.openai.com/v1/chat/completions \"HTTP/1.1 200 OK\"\n" + ] + }, + { + "name": "stdout", + "output_type": "stream", + "text": [ + "\u001b[32;1m\u001b[1;3m\n", + "Invoking: `get_document_text` with `blog_faster-transformers_20251009_165057.html`\n", + "\n", + "\n", + "\u001b[0m\u001b[33;1m\u001b[1;3m{'status': 'success', 'filename': 'blog_faster-transformers_20251009_165057.html', 'pages': 15, 'text': 'Back to Articles\\n\\nTricks from OpenAI gpt-oss YOU 🫵 can use with transformers\\n\\nPublished September 11, 2025\\n\\nUpdate on GitHub\\n\\nUpvote\\n\\n152\\n\\nAritra Roy Gosthipaty\\n\\nariG23498\\n\\nSergio Paniego\\n\\nsergiopaniego\\n\\nVaibhav Srivastav\\n\\nreach-vb\\n\\nPedro Cuenca\\n\\npcuenq\\n\\nArthur Zucker\\n\\nArthurZ\\n\\nNathan Habib\\n\\nSaylorTwift\\n\\nCyril Vallez\\n\\ncyrilvallez\\n\\nOpenAI recently released their GPT-OSS series of models. The models feature some novel techniques like MXFP4 quantization, efficient kernels, a brand new chat format, and more. To enable the release of gpt-oss through transformers, we have upgraded the library considerably. The updates make it very efficient to load, run, and fine-tune the models.\\n\\nIn this blog post, we talk about all the upgrades in-depth, and how they become part of the transformers toolkit so other models (current and future) can benefit from them. Providing clean implementations of new methods in transformers also allows the community to quickly understand and adopt them. Frameworks such as MLX, llama.cpp or vLLM can use the transformers code as a reference to build their own implementations.\\n\\nFor this release, we worked on:\\n\\nZero-build Kernels, downloadable from the Hub\\n\\nMXFP4 Quantization\\n\\nTensor Parallelism\\n\\nExpert Parallelism\\n\\nDynamic Sliding Window Layer & Cache\\n\\nContinuous Batching & Paged Attention\\n\\nLoad larger models faster\\n\\nBest part: Most of these features should work across all major models within transformers!\\n\\nZero-build Kernels, downloadable from the Hub\\n\\nA kernel is a specialized, compact program that runs on accelerators to execute tasks like matrix multiplications, activations, or normalizations. In eager PyTorch, operations trigger individual kernels sequentially, which is straightforward but can incur extra memory transfers and launch overheads. PyTorch 2.0\\'s torch.compile with backends like TorchInductor addresses this by automatically fusing and optimizing kernels, delivering 2–10× performance gains.\\n\\nIn addition, the community has created custom kernels for frequent combinations of operations, not just individual PyTorch ops like matmul. For example, Flash Attention was created to optimize the critical attention block that defines the transformers architecture, and is present in many models including most LLMs. By carefully combining all the attention operations inside a single kernel, memory transfers are minimized, memory use is reduced, and speedups can be achieved.\\n\\nThe problem is that all these various kernels are available in separate libraries, which creates a dependency bloat if they were to be added to the transformers library. Furthermore, these kernels are not just Python code, they consist of low-level cuda code, glued together with C++ and exposed through a Python layer. 
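
As a brief aside (our own minimal example, not from the blog post): the eager-vs-fused contrast described above can be tried directly with `torch.compile`. In eager mode each elementwise operation below launches its own kernel, whereas the TorchInductor backend can fuse them into fewer kernels. Any speedup depends heavily on hardware, so treat this as a sketch to experiment with rather than a benchmark.

```python
import torch

def gelu_ish(x):
    # Several chained elementwise ops; eager PyTorch dispatches a kernel per op.
    return 0.5 * x * (1.0 + torch.tanh(0.79788456 * (x + 0.044715 * x ** 3)))

compiled = torch.compile(gelu_ish)  # TorchInductor may fuse these into a single kernel

device = "cuda" if torch.cuda.is_available() else "cpu"
x = torch.randn(4096, 4096, device=device)
assert torch.allclose(gelu_ish(x), compiled(x), atol=1e-5)  # same math, fewer launches
```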
This means they have to be compiled in the target system, which in turn requires whatever build system is required by each kernel library.\\n\\nThe kernels package solves this problem by downloading pre-built binaries of supported kernels from the Hub. You just indicate the kernel you want to use, and kernels will look for a version compatible with your system and download it on first use.\\n\\nCustom Kernels for GPT-OSS\\n\\nGPT-OSS, a Mixture of Experts (MoE) model, is a big user of Kernels from the Hub. It leverages several custom kernels:\\n\\nLiger RMSNorm, used as @use_kernel_forward_from_hub(\"RMSNorm\")`\\n\\nMegablocks MoE kernels: @use_kernel_forward_from_hub(\"MegaBlocksMoeMLP\")\\n\\nFlash Attention 3 with support for attention sinks.\\n\\nMXFP4 triton kernels (covered later)\\n\\nLet\\'s take a look at the first two ones.\\n\\nBehind the scenes, the decorators (1 and 2) simply point to community-contributed kernels. For example, RMSNorm comes from liger_kernels, while the MegaBlocksMoeMLP kernel comes from megablocks. Depending on your device (CUDA or ROCm) and whether you’re training or running inference, the right kernel is pulled in automatically.\\n\\nThis design is both specific and general: the RMSNorm liger kernels are already being reused across multiple models, and the MoE kernel could be applied to future MoEs as well.\\n\\nBecause kernels pulls code from the Hub, you have to opt-in to this feature by passing use_kernels=True in your model instantiation, as shown below. We enable INFO logging in the example so you can easily verify that downloadable kernels are in use.\\n\\nThese kernels are not compatible with mxfp4, so inference will happen in bfloat16 if you use them. Please, benchmark your system for the best combination in memory and throughput that suits your project!\\n\\nfrom transformers import AutoTokenizer, AutoModelForCausalLM import logging logging.basicConfig(level=logging.INFO) model_id = \"openai/gpt-oss-20b\" tokenizer = AutoTokenizer.from_pretrained(model_id) model = AutoModelForCausalLM.from_pretrained( model_id, dtype=\"auto\", device_map=\"auto\", use_kernels=True, )\\n\\nRunning a quick generation yields log messages like\\n\\nINFO:root:Using layer `LigerRMSNorm` from repo `kernels-community/liger_kernels` INFO:root:Using layer `MegaBlocksMoeMLP` from repo `kernels-community/megablocks`\\n\\nFigure 1 shows that, in the system we tested, these kernels work best for larger batch sizes. We always recommend to benchmark any performance-related changes as closely to your production conditions as possible.\\n\\nFigure 1: Benchmarking results of custom kernels\\n\\nYou can explore and play with the benchmarking script here\\n\\nFlash Attention 3\\n\\nOpenAI gpt-oss models use attention sinks, which improves quality and facilitates the use of longer contexts. The vLLM team added this feature to the latest version of Flash Attention (Flash Attention 3), and the resulting custom kernel is available on the Hub. Currently, this kernel is compatible with the Hopper architecture. If you have one, this is the way to enable it:\\n\\nmodel = AutoModelForCausalLM.from_pretrained( model_id, dtype=\"auto\", device_map=\"auto\", + # Flash Attention with Sinks + attn_implementation=\"kernels-community/vllm-flash-attn3\", )\\n\\nMXFP4 Quantization\\n\\nLarge language models are memory-hungry. Quantization reduces memory footprint by storing weights (and sometimes activations) in lower-precision formats. For reference, FP32 uses 32 bits per number and BF16 uses 16. 
By reducing bit width, we trade some precision for smaller models and faster memory movement.\\n\\nIf you want a visual primer on quantization trade-offs, Maarten Grootendorst’s article is excellent: A Visual Guide to Quantization.\\n\\nWhat is MXFP4\\n\\nFigure 2: The E2M1 format used in the MXFP4 format\\n\\nMXFP4 is a 4-bit floating format with E2M1 layout: 1 sign bit, 2 exponent bits, and 1 mantissa bit, as shown in Figure 2. On its own, E2M1 is very coarse. MXFP4 compensates with blockwise scaling:\\n\\nVectors are grouped into blocks of 32 elements.\\n\\nEach block stores a shared scale that restores dynamic range when dequantizing.\\n\\nInside each block, 4-bit values represent numbers relative to that scale.\\n\\nThis blockwise scheme lets MXFP4 keep range while using very few bits. In practice, GPT-OSS 20B fits in roughly 16 GB of VRAM and GPT-OSS 120B fits in roughly 80 GB when MXFP4 is active, which is the difference between “cannot load” and “can run on a single GPU.” The catch is that matrix multiplies now have to respect block scales. Doing this efficiently at scale requires dedicated kernels.\\n\\nMXFP4 in transformers\\n\\ntransformers now includes native support for MXFP4, leveraging optimized triton (MXFP4) kernels for enhanced performance. This builds on the community-driven kernel distribution discussed earlier, utilizing pre-compiled kernels from the Hub to simplify deployment.\\n\\nKey implementation details:\\n\\nQuantizer logic: Found in the MXFP4 quantizer file, this handles the core quantization process for MXFP4.\\n\\nIntegration hooks: The MXFP4 integration file enables seamless use of MXFP4 within the transformers framework.\\n\\nTo check if a model supports MXFP4, inspect its configuration:\\n\\nfrom transformers import GptOssConfig model_id = \"openai/gpt-oss-120b\" cfg = GptOssConfig.from_pretrained(model_id) print(cfg.quantization_config) # Example output: # { # \\'modules_to_not_convert\\': [ # \\'model.layers.*.self_attn\\', # \\'model.layers.*.mlp.router\\', # \\'model.embed_tokens\\', # \\'lm_head\\' # ], # \\'quant_method\\': \\'mxfp4\\' # }\\n\\nIf \\'quant_method\\': \\'mxfp4\\' is present, the model will automatically use the MXFP4 pathway with Triton kernels when supported.\\n\\nThanks to this pull request, you can fine-tune gpt-oss models and save them directly to the Hub in MXFP4 format, streamlining deployment with optimized performance.\\n\\nRequirements and fallbacks\\n\\nTo run MXFP4 on GPU you need:\\n\\naccelerate, kernels, and triton>=3.4 installed. Note that Pytorch 2.8 already comes with triton 3.4, so you only need to manually install triton if using Pytorch 2.7.\\n\\nNVIDIA GPU with compute capability ≥ 7.5. This goes all the way back to Tesla, so you can run gpt-oss-20b on the free tiers of Google Colab and Kaggle, and on many consumer GPUs.\\n\\nIf these constraints are not met, transformers falls back to a higher-precision path (bfloat16 is used by default), which requires about 4× the memory of MXFP4.\\n\\nThe snippet loads GPT-OSS twice on CUDA: once with Mxfp4Config(dequantize=True) (memory intensive) and once in the default quantized path (memory efficient). Figure 3 shows the amount of used VRAM after each load so you can visualize the savings.\\n\\nFigure 3: Memory requirements for the quantized and dequantized models\\n\\nKernels for MXFP4\\n\\nEfficient MXFP4 requires kernels that understand 32-element blocks and their scales during GEMMs and fused ops. This is where Kernels from the Hub comes in again. 
transformers automatically pulls in the MXFP4-aware Triton kernels from the community repository when you load a model that needs them. The repository will appear in your local cache and will be used during the forward pass. For the MXFP4 kernels one does not need to use the use_kernels=True parameter like before, it is set to default in transformers.\\n\\nQuick sanity check with the Hugging Face cache CLI, after running gpt-oss-20b on a GPU compatible with the triton MXFP4 kernels:\\n\\nhf cache scan\\n\\nSample output:\\n\\nREPO ID REPO TYPE SIZE ON DISK -------------------------------- --------- ------------ kernels-community/triton_kernels model 536.2K openai/gpt-oss-20b model 13.8G\\n\\nThis indicates the MXFP4 kernels were fetched and are available for execution.\\n\\nLet\\'s run some benchmarks and see how well the MXFP4 kernels perform. In Figure 4, we see that the MXFP4 kernels are even better than the custom MoE and RMSNorm kernels for larger batches.\\n\\nFigure 4: MXFP4 kernel benchmark\\n\\nYou can explore and play with the benchmarking script here\\n\\nTensor Parallelism\\n\\nFigure 5: Explanation of tensor parallelism.\\n\\nTensor Parallelism (TP) splits tensors inside a layer across multiple GPUs (as shown in Figure 5). Each GPU multiplies its shard in parallel, and then partial results are collected using all-gather or all-reduce operations. This reduces per-GPU memory and keeps all GPUs working on the same layer, which improves throughput as sequence length or batch size grow. TP is communication-intensive and generally works best on a single machine with fast intra-node links.\\n\\nWhat this enables in transformers\\n\\ntransformers implements TP directly in from_pretrained. You can start with the predefined plan:\\n\\n# run with: torchrun --nproc-per-node 4 tp_gpt_oss.py import torch from transformers import PreTrainedTokenizerFast, GptOssForCausalLM model_id = \"openai/gpt-oss-120b\" tokenizer = PreTrainedTokenizerFast.from_pretrained(model_id) model = GptOssForCausalLM.from_pretrained( model_id, tp_plan=\"auto\", # built in TP support dtype=\"auto\", ).eval() messages = [ {\"role\": \"system\", \"content\": \"Be concise.\"}, {\"role\": \"user\", \"content\": \"Explain KV caching briefly.\"}, ] inputs = tokenizer.apply_chat_template( messages, add_generation_prompt=True, return_tensors=\"pt\", return_dict=True, reasoning_effort=\"low\", ).to(model.device) with torch.inference_mode(): generations = model.generate(**inputs, max_new_tokens=128) print(tokenizer.decode(generations[0][inputs[\"input_ids\"].shape[-1]:]))\\n\\nIf you don’t have the infrastructure to run the above, you can just spawn a process on our GPUs using Hugging Face Jobs!\\n\\nhf jobs run --detach --flavor l4x4 ghcr.io/astral-sh/uv:debian /bin/bash -c \\\\ \"uv venv .venv --python 3.12 && \\\\ source .venv/bin/activate && \\\\ uv pip install --upgrade torch numpy transformers accelerate triton kernels && \\\\ wget https://huggingface.co/datasets/ariG23498/distributed/raw/main/tp_gpt_oss.py && \\\\ torchrun --nproc-per-node=4 tp_gpt_oss.py\"\\n\\nhf jobs is available for all Hugging Face PRO & Enterprise users.\\n\\nUnder the hood, tp_plan=\"auto\" selects a predefined sharding recipe for each layer and wires the necessary collectives. You can inspect the active plan with print(model._tp_plan) if you want to verify what is being sharded.\\n\\nWhen to reach for TP\\n\\nUse TP when the model is too large for one GPU and you want parallel compute, not only memory placement. 
TP tends to scale throughput with more GPUs, especially for long sequences or larger batches.\\n\\nIf you are curious about how TP differs from device_map=\"auto\" (memory placement), this short Stack Overflow answer explains the distinction and when to use each.\\n\\nTo learn more about TP, here are two must-read resources:\\n\\ntransformers guide: Tensor parallelism, supported models, plans, and extension points.\\n\\nUltra-Scale Playbook: background on TP and its relationship to other parallelism modes.\\n\\nExpert Parallelism\\n\\nExpert Parallelism (EP) shards experts inside MoE layers across GPUs. Each token is routed to one or a few experts, so only those experts run their feed-forward pass. Since experts are independent MLPs, we can place different experts on different ranks and exchange only the hidden states for the routed tokens. This keeps the matrix multiplies intact on each rank and replaces tensor slicing with routing and collectives.\\n\\nRun with multiple processes using torchrun. EP is enabled via the distributed configuration and works with GPT-OSS MoE layers out of the box in transformers.\\n\\n# run with: torchrun --nproc-per-node 4 ep_gpt_oss.py import torch from transformers import PreTrainedTokenizerFast, GptOssForCausalLM from transformers.distributed import DistributedConfig model_id = \"openai/gpt-oss-120b\" tokenizer = PreTrainedTokenizerFast.from_pretrained(model_id) model = GptOssForCausalLM.from_pretrained( model_id, distributed_config=DistributedConfig(enable_expert_parallel=True), # enabling EP dtype=\"auto\", ).eval() messages = [ {\"role\": \"system\", \"content\": \"Be concise.\"}, {\"role\": \"user\", \"content\": \"Explain KV caching briefly.\"}, ] inputs = tokenizer.apply_chat_template( messages, add_generation_prompt=True, return_tensors=\"pt\", return_dict=True, reasoning_effort=\"low\", ).to(model.device) with torch.inference_mode(): generations = model.generate(**inputs, max_new_tokens=128) print(tokenizer.decode(generations[0][inputs[\"input_ids\"].shape[-1]:]))\\n\\nHere is how you would run using hf jobs\\n\\nhf jobs run --detach --flavor l4x4 ghcr.io/astral-sh/uv:debian /bin/bash -c \\\\ \"uv venv .venv --python 3.12 && \\\\ source .venv/bin/activate && \\\\ uv pip install --upgrade torch numpy transformers accelerate triton kernels && \\\\ wget https://huggingface.co/datasets/ariG23498/distributed/raw/main/ep_gpt_oss.py && \\\\ torchrun --nproc-per-node=4 ep_gpt_oss.py\"\\n\\nWhen you enable Expert Parallelism, Tensor Parallelism is also activated. This means you enjoy the best of both worlds!\\n\\nDynamic Sliding Window Layer & Cache\\n\\nMany recent LLMs use sliding window attention, or a combination of sliding and global attention layers, as a means to save memory and reduce those expensive quadratic matmuls that grow with sequence length. However, the dynamic KV cache implementation in transformers used to continue to allocate space according to sequence length, without looking at the individual attention layers. You could always optimize memory using compilation (meaning, fixed shapes), but that\\'s a separate scenario altogether.\\n\\ntransformers now has a DynamicSlidingWindowLayer and a config‑aware DynamicCache. If the model config declares sliding‑window or hybrid attention (both sliding and global attention layers are used), the cache stops growing past the window for the sliding layers. 
If you don’t pass the config, behavior stays as before (full, ever‑growing KV as sequence length grows).\\n\\nFor models that only use sliding window layers, such as Mistral 7B, cache memory stops growing when the sequence reaches the window size (4096, in this case). This makes sense, because the sliding layers can\\'t look past the previous 4K tokens anyway.\\n\\nOpenAI gpt-oss alternates between sliding and global attention layers, which results in total KV cache memory being halved, as we\\'ll see, as sequence length increases. This provides us with:\\n\\nMuch lower KV‑cache memory for models with sliding or hybrid attention (e.g. GPT‑OSS). Cache growth plateaus once the window is reached (e.g., 4K for Mistral; 128 for GPT‑OSS sliding layers), instead of scaling linearly with total generated tokens. (GitHub, Transformers)\\n\\nSpeed/latency wins on long prompts/long generations: smaller KV tensors mean lighter attention reads/writes and less memory bandwidth pressure, especially after the window is hit. (This is the central motivation behind sliding‑window/hybrid LLMs.) (AI21, vLLM Blog)\\n\\nHow to use it\\n\\nThe optimized cache is set by default, that means you don\\'t have to make any changes to your existing code. If you want to create the DynamicCache explicitly here is how you would do it:\\n\\nfrom transformers import AutoModelForCausalLM, AutoTokenizer, DynamicCache model_id = \"openai/gpt-oss-20b\" tokenizer = AutoTokenizer.from_pretrained(model_id) model = AutoModelForCausalLM.from_pretrained( model_id, dtype=\"auto\", device_map=\"auto\", ).eval() messages = [ {\"role\": \"system\", \"content\": \"Always respond in riddles\"}, {\"role\": \"user\", \"content\": \"What is the weather like in Madrid?\"}, ] inputs = tokenizer.apply_chat_template( messages, add_generation_prompt=True, return_tensors=\"pt\", return_dict=True, reasoning_effort=\"low\", ).to(model.device) cache = DynamicCache(config=model.config) # create the cache with the model\\'s config generated = model.generate( **inputs, max_new_tokens=500, past_key_values=cache ) print(tokenizer.decode(generated[0][inputs[\"input_ids\"].shape[-1]:]))\\n\\nFigure 6 showcases how much of a difference it makes for us to use the Dynamic KV Cache with sliding window attention.\\n\\nFigure 6: The memory analysis of dynamic cache with sliding window attention\\n\\nContinuous Batching & Paged Attention\\n\\nA typical autoregressive generation process looks like Figure 7. You input the prefill tokens, and the model predicts each new token one after the other until it predicts the EOS (End of Sequence) token.\\n\\nFigure 7: Autoregressive token generation\\n\\nLet’s see what the generation process looks like when we pass a batch of inputs. In Figure 8 you notice that some generations finish off earlier than the others. This mismatch of length underutilizes the GPUs.\\n\\nFigure 8: Static batching of sequences\\n\\nThis type of batching sequences is called static batching. While this is simple and easy to understand, it inherently comes with inefficiencies. Only after each sentence is completely generated can we move on to the next batch.\\n\\nTo bypass this issue, we use dynamic batching (also known as continuous batching). Instead of waiting for all the generation to finish, we schedule incoming requests to the completed generations. That way, as soon as a generation in a batch is complete, we prefill the batch with the next request. 
The process looks like Figure 9.\\n\\nFigure 9: Continuous Batching of sequences\\n\\nTransformers supports continuous batching with the generate_batch API. This is not meant for production-grade model serving –frameworks like vLLM and SGLang are great at that–, but can be very helpful for evaluation and experimentation. Here is an example script that runs CB end to end on Qwen/Qwen3-4B-Instruct-2507.\\n\\nWe have also performed a benchmark between Continuous Batching and Static Batching with 100 samples. In Figure 9, we note that CB is quite faster than SB.\\n\\nFigure 9: Continuous vs Static Batching Tokens/Second\\n\\nYou can play around with the benchmark here: SB, CB\\n\\nLoad larger models faster\\n\\nWhen you load a large model into your GPU, PyTorch needs to reserve GPU memory for each layer’s weights. Each of these requests (per layer) takes time, and for multi-billion-parameter models it can mean thousands of tiny memory allocations, adding up to a long wait before the model is ready. Instead of asking the GPU for new memory every single time, it can hold on to a big chunk once and then hand out slices from it quickly.\\n\\nPyTorch allocators can do exactly this. The catch is that the allocator only gets fast after you’ve given it some memory to work with. If you don’t “stock the pantry” first, you still end up doing many slow trips to the market. This PR (🎉 #36380) taught transformers to pre-stock the pantry before it starts copying model weights.\\n\\nIt:\\n\\nLooks at the device_map (where each layer will live).\\n\\nPre-allocates a big enough block on each GPU.\\n\\nThen, as layers are copied in, they just slot neatly into this pre-reserved space.\\n\\nYou have to make no changes to your existing code, as this is default behaviour in transformers. If you use device_map=\"auto\" or provide your own device map, your model will now load faster automatically. If you’re running with Tensor Parallel (tp_plan=\"auto\") and torchrun you also benefit from companion changes that make multi-GPU loading smarter.\\n\\nConclusion\\n\\ntransformers moves quickly and it is community-first. The library evolves at the pace of the field because contributors shape it in the open. Pieces added for new models become part of the toolkit and are reused in future integrations.\\n\\nThis velocity enables day-zero integrations like the GPT-OSS series. As the stack becomes increasingly PyTorch-first, it trims bloat and doubles down on the PyTorch paths that matter in practice. The result is a cleaner core that unlocks new capabilities through community kernels, quantization, and parallelism plans, while also standardizing model definitions so that architectures supported in transformers are a reference and extend across the wider ecosystem.\\n\\nThis post is a one-time snapshot of a process we repeatedly iterate on towards the same direction: serve the needs of the community. To be up to date with the latest additions to transformers, check the docs and release notes. 
And please, keep sharing your feedback and releasing your models in transformers for the community to enjoy 🤗\\n\\nRead More\\n\\nIf you want to go further into particular topics, here is a list of links that one should visit:\\n\\nHugging Face GPT-OSS Recipes Repository\\n\\nWelcome GPT OSS: OpenAI\\'s New Open-Source Model Family\\n\\nOpenAI Cookbook: GPT-OSS Topic\\n\\nTransformers Documentation: Distributed Inference on Multiple GPUs\\n\\nMatthew Carrigan\\'s X Thread on GPT OSS Innovations\\n\\nYouTube Video: OpenAI GPT OSS Announcement\\n\\nTransformers PR #36380: Faster Model Loading on Accelerators\\n\\nTransformers PR #36335: Update from_pretrained for Tensor Parallelism\\n\\nTransformers PR #40039: New Dynamic Sliding Window Layer and Cache\\n\\nHAN Lab Blog: How Attention Sinks Keep Language Models Stable\\n\\nMore Articles from our Blog\\n\\nMake your ZeroGPU Spaces go brrr with PyTorch ahead-of-time compilation\\n\\nBy September 2, 2025 • 66\\n\\nFrom Zero to GPU: A Guide to Building and Scaling Production-Ready CUDA Kernels\\n\\nBy August 18, 2025 • 75\\n\\nCommunity\\n\\nmmalam786\\n\\nabout 19 hours ago\\n\\nVery interesting post — I appreciate how you highlight MXFP4 quantization and the performance trade-offs in Transformers. In my own experiments, I applied quantization to a mmalam786/distilbert-sst2-int8-onnx-demo and saw ~20-30% latency reduction with negligible accuracy drop. Do you have insights on how MXFP4 handles activation quantization vs weight quantization in edge scenarios? Happy to share my demo if useful: https://www.linkedin.com/posts/dr-mm-alam-93991120b_demofirst-aichips-edgeai-activity-7381674484098883584-0Rwn/?utm_source=share&utm_medium=member_desktop&rcm=ACoAADVZuP0BheDJgKL8dWk-bNo7Yd4zhsOnNL4\\n\\nSign up or log in to comment\\n\\nUpvote\\n\\n152\\n\\nBack to Articles\\n\\nTricks from OpenAI gpt-oss YOU 🫵 can use with transformers\\n\\nPublished September 11, 2025\\n\\nUpdate on GitHub\\n\\nUpvote\\n\\n152\\n\\nAritra Roy Gosthipaty\\n\\nariG23498\\n\\nSergio Paniego\\n\\nsergiopaniego\\n\\nVaibhav Srivastav\\n\\nreach-vb\\n\\nPedro Cuenca\\n\\npcuenq\\n\\nArthur Zucker\\n\\nArthurZ\\n\\nNathan Habib\\n\\nSaylorTwift\\n\\nCyril Vallez\\n\\ncyrilvallez\\n\\nOpenAI recently released their GPT-OSS series of models. The models feature some novel techniques like MXFP4 quantization, efficient kernels, a brand new chat format, and more. To enable the release of gpt-oss through transformers, we have upgraded the library considerably. The updates make it very efficient to load, run, and fine-tune the models.\\n\\nIn this blog post, we talk about all the upgrades in-depth, and how they become part of the transformers toolkit so other models (current and future) can benefit from them. Providing clean implementations of new methods in transformers also allows the community to quickly understand and adopt them. 
Frameworks such as MLX, llama.cpp or vLLM can use the transformers code as a reference to build their own implementations.\n\nFor this release, we worked on:\n\nZero-build Kernels, downloadable from the Hub\n\nMXFP4 Quantization\n\nTensor Parallelism\n\nExpert Parallelism\n\nDynamic Sliding Window Layer & Cache\n\nContinuous Batching & Paged Attention\n\nLoad larger models faster\n\nBest part: Most of these features should work across all major models within transformers!\n\nZero-build Kernels, downloadable from the Hub\n\nA kernel is a specialized, compact program that runs on accelerators to execute tasks like matrix multiplications, activations, or normalizations. In eager PyTorch, operations trigger individual kernels sequentially, which is straightforward but can incur extra memory transfers and launch overheads. PyTorch 2.0\'s torch.compile with backends like TorchInductor addresses this by automatically fusing and optimizing kernels, delivering 2–10× performance gains.\n\nIn addition, the community has created custom kernels for frequent combinations of operations, not just individual PyTorch ops like matmul. For example, Flash Attention was created to optimize the critical attention block that defines the transformers architecture, and is present in many models including most LLMs. By carefully combining all the attention operations inside a single kernel, memory transfers are minimized, memory use is reduced, and speedups can be achieved.\n\nThe problem is that all these various kernels are available in separate libraries, which creates a dependency bloat if they were to be added to the transformers library. Furthermore, these kernels are not just Python code, they consist of low-level cuda code, glued together with C++ and exposed through a Python layer. This means they have to be compiled in the target system, which in turn requires whatever build system is required by each kernel library.\n\nThe kernels package solves this problem by downloading pre-built binaries of supported kernels from the Hub. 
You just indicate the kernel you want to use, and kernels will look for a version compatible with your system and download it on first use.\\n\\nCustom Kernels for GPT-OSS\\n\\nGPT-OSS, a Mixture of Experts (MoE) model, is a big user of Kernels from the Hub. It leverages several custom kernels:\\n\\nLiger RMSNorm, used as @use_kernel_forward_from_hub(\"RMSNorm\")`\\n\\nMegablocks MoE kernels: @use_kernel_forward_from_hub(\"MegaBlocksMoeMLP\")\\n\\nFlash Attention 3 with support for attention sinks.\\n\\nMXFP4 triton kernels (covered later)\\n\\nLet\\'s take a look at the first two ones.\\n\\nBehind the scenes, the decorators (1 and 2) simply point to community-contributed kernels. For example, RMSNorm comes from liger_kernels, while the MegaBlocksMoeMLP kernel comes from megablocks. Depending on your device (CUDA or ROCm) and whether you’re training or running inference, the right kernel is pulled in automatically.\\n\\nThis design is both specific and general: the RMSNorm liger kernels are already being reused across multiple models, and the MoE kernel could be applied to future MoEs as well.\\n\\nBecause kernels pulls code from the Hub, you have to opt-in to this feature by passing use_kernels=True in your model instantiation, as shown below. We enable INFO logging in the example so you can easily verify that downloadable kernels are in use.\\n\\nThese kernels are not compatible with mxfp4, so inference will happen in bfloat16 if you use them. Please, benchmark your system for the best combination in memory and throughput that suits your project!\\n\\nfrom transformers import AutoTokenizer, AutoModelForCausalLM import logging logging.basicConfig(level=logging.INFO) model_id = \"openai/gpt-oss-20b\" tokenizer = AutoTokenizer.from_pretrained(model_id) model = AutoModelForCausalLM.from_pretrained( model_id, dtype=\"auto\", device_map=\"auto\", use_kernels=True, )\\n\\nRunning a quick generation yields log messages like\\n\\nINFO:root:Using layer `LigerRMSNorm` from repo `kernels-community/liger_kernels` INFO:root:Using layer `MegaBlocksMoeMLP` from repo `kernels-community/megablocks`\\n\\nFigure 1 shows that, in the system we tested, these kernels work best for larger batch sizes. We always recommend to benchmark any performance-related changes as closely to your production conditions as possible.\\n\\nFigure 1: Benchmarking results of custom kernels\\n\\nYou can explore and play with the benchmarking script here\\n\\nFlash Attention 3\\n\\nOpenAI gpt-oss models use attention sinks, which improves quality and facilitates the use of longer contexts. The vLLM team added this feature to the latest version of Flash Attention (Flash Attention 3), and the resulting custom kernel is available on the Hub. Currently, this kernel is compatible with the Hopper architecture. If you have one, this is the way to enable it:\\n\\nmodel = AutoModelForCausalLM.from_pretrained( model_id, dtype=\"auto\", device_map=\"auto\", + # Flash Attention with Sinks + attn_implementation=\"kernels-community/vllm-flash-attn3\", )\\n\\nMXFP4 Quantization\\n\\nLarge language models are memory-hungry. Quantization reduces memory footprint by storing weights (and sometimes activations) in lower-precision formats. For reference, FP32 uses 32 bits per number and BF16 uses 16. 
By reducing bit width, we trade some precision for smaller models and faster memory movement.\\n\\nIf you want a visual primer on quantization trade-offs, Maarten Grootendorst’s article is excellent: A Visual Guide to Quantization.\\n\\nWhat is MXFP4\\n\\nFigure 2: The E2M1 format used in the MXFP4 format\\n\\nMXFP4 is a 4-bit floating format with E2M1 layout: 1 sign bit, 2 exponent bits, and 1 mantissa bit, as shown in Figure 2. On its own, E2M1 is very coarse. MXFP4 compensates with blockwise scaling:\\n\\nVectors are grouped into blocks of 32 elements.\\n\\nEach block stores a shared scale that restores dynamic range when dequantizing.\\n\\nInside each block, 4-bit values represent numbers relative to that scale.\\n\\nThis blockwise scheme lets MXFP4 keep range while using very few bits. In practice, GPT-OSS 20B fits in roughly 16 GB of VRAM and GPT-OSS 120B fits in roughly 80 GB when MXFP4 is active, which is the difference between “cannot load” and “can run on a single GPU.” The catch is that matrix multiplies now have to respect block scales. Doing this efficiently at scale requires dedicated kernels.\\n\\nMXFP4 in transformers\\n\\ntransformers now includes native support for MXFP4, leveraging optimized triton (MXFP4) kernels for enhanced performance. This builds on the community-driven kernel distribution discussed earlier, utilizing pre-compiled kernels from the Hub to simplify deployment.\\n\\nKey implementation details:\\n\\nQuantizer logic: Found in the MXFP4 quantizer file, this handles the core quantization process for MXFP4.\\n\\nIntegration hooks: The MXFP4 integration file enables seamless use of MXFP4 within the transformers framework.\\n\\nTo check if a model supports MXFP4, inspect its configuration:\\n\\nfrom transformers import GptOssConfig model_id = \"openai/gpt-oss-120b\" cfg = GptOssConfig.from_pretrained(model_id) print(cfg.quantization_config) # Example output: # { # \\'modules_to_not_convert\\': [ # \\'model.layers.*.self_attn\\', # \\'model.layers.*.mlp.router\\', # \\'model.embed_tokens\\', # \\'lm_head\\' # ], # \\'quant_method\\': \\'mxfp4\\' # }\\n\\nIf \\'quant_method\\': \\'mxfp4\\' is present, the model will automatically use the MXFP4 pathway with Triton kernels when supported.\\n\\nThanks to this pull request, you can fine-tune gpt-oss models and save them directly to the Hub in MXFP4 format, streamlining deployment with optimized performance.\\n\\nRequirements and fallbacks\\n\\nTo run MXFP4 on GPU you need:\\n\\naccelerate, kernels, and triton>=3.4 installed. Note that Pytorch 2.8 already comes with triton 3.4, so you only need to manually install triton if using Pytorch 2.7.\\n\\nNVIDIA GPU with compute capability ≥ 7.5. This goes all the way back to Tesla, so you can run gpt-oss-20b on the free tiers of Google Colab and Kaggle, and on many consumer GPUs.\\n\\nIf these constraints are not met, transformers falls back to a higher-precision path (bfloat16 is used by default), which requires about 4× the memory of MXFP4.\\n\\nThe snippet loads GPT-OSS twice on CUDA: once with Mxfp4Config(dequantize=True) (memory intensive) and once in the default quantized path (memory efficient). Figure 3 shows the amount of used VRAM after each load so you can visualize the savings.\\n\\nFigure 3: Memory requirements for the quantized and dequantized models\\n\\nKernels for MXFP4\\n\\nEfficient MXFP4 requires kernels that understand 32-element blocks and their scales during GEMMs and fused ops. This is where Kernels from the Hub comes in again. 
transformers automatically pulls in the MXFP4-aware Triton kernels from the community repository when you load a model that needs them. The repository will appear in your local cache and will be used during the forward pass. For the MXFP4 kernels one does not need to use the use_kernels=True parameter like before, it is set to default in transformers.\\n\\nQuick sanity check with the Hugging Face cache CLI, after running gpt-oss-20b on a GPU compatible with the triton MXFP4 kernels:\\n\\nhf cache scan\\n\\nSample output:\\n\\nREPO ID REPO TYPE SIZE ON DISK -------------------------------- --------- ------------ kernels-community/triton_kernels model 536.2K openai/gpt-oss-20b model 13.8G\\n\\nThis indicates the MXFP4 kernels were fetched and are available for execution.\\n\\nLet\\'s run some benchmarks and see how well the MXFP4 kernels perform. In Figure 4, we see that the MXFP4 kernels are even better than the custom MoE and RMSNorm kernels for larger batches.\\n\\nFigure 4: MXFP4 kernel benchmark\\n\\nYou can explore and play with the benchmarking script here\\n\\nTensor Parallelism\\n\\nFigure 5: Explanation of tensor parallelism.\\n\\nTensor Parallelism (TP) splits tensors inside a layer across multiple GPUs (as shown in Figure 5). Each GPU multiplies its shard in parallel, and then partial results are collected using all-gather or all-reduce operations. This reduces per-GPU memory and keeps all GPUs working on the same layer, which improves throughput as sequence length or batch size grow. TP is communication-intensive and generally works best on a single machine with fast intra-node links.\\n\\nWhat this enables in transformers\\n\\ntransformers implements TP directly in from_pretrained. You can start with the predefined plan:\\n\\n# run with: torchrun --nproc-per-node 4 tp_gpt_oss.py import torch from transformers import PreTrainedTokenizerFast, GptOssForCausalLM model_id = \"openai/gpt-oss-120b\" tokenizer = PreTrainedTokenizerFast.from_pretrained(model_id) model = GptOssForCausalLM.from_pretrained( model_id, tp_plan=\"auto\", # built in TP support dtype=\"auto\", ).eval() messages = [ {\"role\": \"system\", \"content\": \"Be concise.\"}, {\"role\": \"user\", \"content\": \"Explain KV caching briefly.\"}, ] inputs = tokenizer.apply_chat_template( messages, add_generation_prompt=True, return_tensors=\"pt\", return_dict=True, reasoning_effort=\"low\", ).to(model.device) with torch.inference_mode(): generations = model.generate(**inputs, max_new_tokens=128) print(tokenizer.decode(generations[0][inputs[\"input_ids\"].shape[-1]:]))\\n\\nIf you don’t have the infrastructure to run the above, you can just spawn a process on our GPUs using Hugging Face Jobs!\\n\\nhf jobs run --detach --flavor l4x4 ghcr.io/astral-sh/uv:debian /bin/bash -c \\\\ \"uv venv .venv --python 3.12 && \\\\ source .venv/bin/activate && \\\\ uv pip install --upgrade torch numpy transformers accelerate triton kernels && \\\\ wget https://huggingface.co/datasets/ariG23498/distributed/raw/main/tp_gpt_oss.py && \\\\ torchrun --nproc-per-node=4 tp_gpt_oss.py\"\\n\\nhf jobs is available for all Hugging Face PRO & Enterprise users.\\n\\nUnder the hood, tp_plan=\"auto\" selects a predefined sharding recipe for each layer and wires the necessary collectives. You can inspect the active plan with print(model._tp_plan) if you want to verify what is being sharded.\\n\\nWhen to reach for TP\\n\\nUse TP when the model is too large for one GPU and you want parallel compute, not only memory placement. 
TP tends to scale throughput with more GPUs, especially for long sequences or larger batches.\\n\\nIf you are curious about how TP differs from device_map=\"auto\" (memory placement), this short Stack Overflow answer explains the distinction and when to use each.\\n\\nTo learn more about TP, here are two must-read resources:\\n\\ntransformers guide: Tensor parallelism, supported models, plans, and extension points.\\n\\nUltra-Scale Playbook: background on TP and its relationship to other parallelism modes.\\n\\nExpert Parallelism\\n\\nExpert Parallelism (EP) shards experts inside MoE layers across GPUs. Each token is routed to one or a few experts, so only those experts run their feed-forward pass. Since experts are independent MLPs, we can place different experts on different ranks and exchange only the hidden states for the routed tokens. This keeps the matrix multiplies intact on each rank and replaces tensor slicing with routing and collectives.\\n\\nRun with multiple processes using torchrun. EP is enabled via the distributed configuration and works with GPT-OSS MoE layers out of the box in transformers.\\n\\n# run with: torchrun --nproc-per-node 4 ep_gpt_oss.py import torch from transformers import PreTrainedTokenizerFast, GptOssForCausalLM from transformers.distributed import DistributedConfig model_id = \"openai/gpt-oss-120b\" tokenizer = PreTrainedTokenizerFast.from_pretrained(model_id) model = GptOssForCausalLM.from_pretrained( model_id, distributed_config=DistributedConfig(enable_expert_parallel=True), # enabling EP dtype=\"auto\", ).eval() messages = [ {\"role\": \"system\", \"content\": \"Be concise.\"}, {\"role\": \"user\", \"content\": \"Explain KV caching briefly.\"}, ] inputs = tokenizer.apply_chat_template( messages, add_generation_prompt=True, return_tensors=\"pt\", return_dict=True, reasoning_effort=\"low\", ).to(model.device) with torch.inference_mode(): generations = model.generate(**inputs, max_new_tokens=128) print(tokenizer.decode(generations[0][inputs[\"input_ids\"].shape[-1]:]))\\n\\nHere is how you would run using hf jobs\\n\\nhf jobs run --detach --flavor l4x4 ghcr.io/astral-sh/uv:debian /bin/bash -c \\\\ \"uv venv .venv --python 3.12 && \\\\ source .venv/bin/activate && \\\\ uv pip install --upgrade torch numpy transformers accelerate triton kernels && \\\\ wget https://huggingface.co/datasets/ariG23498/distributed/raw/main/ep_gpt_oss.py && \\\\ torchrun --nproc-per-node=4 ep_gpt_oss.py\"\\n\\nWhen you enable Expert Parallelism, Tensor Parallelism is also activated. This means you enjoy the best of both worlds!\\n\\nDynamic Sliding Window Layer & Cache\\n\\nMany recent LLMs use sliding window attention, or a combination of sliding and global attention layers, as a means to save memory and reduce those expensive quadratic matmuls that grow with sequence length. However, the dynamic KV cache implementation in transformers used to continue to allocate space according to sequence length, without looking at the individual attention layers. You could always optimize memory using compilation (meaning, fixed shapes), but that\\'s a separate scenario altogether.\\n\\ntransformers now has a DynamicSlidingWindowLayer and a config‑aware DynamicCache. If the model config declares sliding‑window or hybrid attention (both sliding and global attention layers are used), the cache stops growing past the window for the sliding layers. 
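To get a feel for what this means in numbers, here is a back-of-the-envelope sketch. The 50/50 sliding/global split and the 128-token window follow the gpt-oss description in this post, while the layer count and the "tokens held in cache" accounting are illustrative assumptions rather than measured internals:

```python
# Rough sketch of KV-cache growth: config-unaware vs. sliding/hybrid-aware.
def kv_cache_tokens(seq_len, n_layers, window=None):
    """Tokens held in the KV cache across `n_layers` attention layers."""
    per_layer = seq_len if window is None else min(seq_len, window)
    return n_layers * per_layer

def hybrid_cache_tokens(seq_len, n_layers=24, window=128):
    # Half the layers use sliding-window attention, half are global.
    sliding = kv_cache_tokens(seq_len, n_layers // 2, window)
    global_ = kv_cache_tokens(seq_len, n_layers // 2)
    return sliding + global_

for seq_len in (1_000, 10_000, 100_000):
    naive = kv_cache_tokens(seq_len, 24)        # cache grows on every layer
    hybrid = hybrid_cache_tokens(seq_len)       # sliding layers plateau at the window
    print(f"{seq_len:>7} tokens: naive={naive:>10,}  hybrid={hybrid:>10,}  "
          f"({hybrid / naive:.0%} of naive)")
```

As the sequence grows, the hybrid cache approaches half the size of the naive one, which is the "halved" behaviour described for gpt-oss.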
If you don’t pass the config, behavior stays as before (full, ever‑growing KV as sequence length grows).\\n\\nFor models that only use sliding window layers, such as Mistral 7B, cache memory stops growing when the sequence reaches the window size (4096, in this case). This makes sense, because the sliding layers can\\'t look past the previous 4K tokens anyway.\\n\\nOpenAI gpt-oss alternates between sliding and global attention layers, which results in total KV cache memory being halved, as we\\'ll see, as sequence length increases. This provides us with:\\n\\nMuch lower KV‑cache memory for models with sliding or hybrid attention (e.g. GPT‑OSS). Cache growth plateaus once the window is reached (e.g., 4K for Mistral; 128 for GPT‑OSS sliding layers), instead of scaling linearly with total generated tokens. (GitHub, Transformers)\\n\\nSpeed/latency wins on long prompts/long generations: smaller KV tensors mean lighter attention reads/writes and less memory bandwidth pressure, especially after the window is hit. (This is the central motivation behind sliding‑window/hybrid LLMs.) (AI21, vLLM Blog)\\n\\nHow to use it\\n\\nThe optimized cache is set by default, that means you don\\'t have to make any changes to your existing code. If you want to create the DynamicCache explicitly here is how you would do it:\\n\\nfrom transformers import AutoModelForCausalLM, AutoTokenizer, DynamicCache model_id = \"openai/gpt-oss-20b\" tokenizer = AutoTokenizer.from_pretrained(model_id) model = AutoModelForCausalLM.from_pretrained( model_id, dtype=\"auto\", device_map=\"auto\", ).eval() messages = [ {\"role\": \"system\", \"content\": \"Always respond in riddles\"}, {\"role\": \"user\", \"content\": \"What is the weather like in Madrid?\"}, ] inputs = tokenizer.apply_chat_template( messages, add_generation_prompt=True, return_tensors=\"pt\", return_dict=True, reasoning_effort=\"low\", ).to(model.device) cache = DynamicCache(config=model.config) # create the cache with the model\\'s config generated = model.generate( **inputs, max_new_tokens=500, past_key_values=cache ) print(tokenizer.decode(generated[0][inputs[\"input_ids\"].shape[-1]:]))\\n\\nFigure 6 showcases how much of a difference it makes for us to use the Dynamic KV Cache with sliding window attention.\\n\\nFigure 6: The memory analysis of dynamic cache with sliding window attention\\n\\nContinuous Batching & Paged Attention\\n\\nA typical autoregressive generation process looks like Figure 7. You input the prefill tokens, and the model predicts each new token one after the other until it predicts the EOS (End of Sequence) token.\\n\\nFigure 7: Autoregressive token generation\\n\\nLet’s see what the generation process looks like when we pass a batch of inputs. In Figure 8 you notice that some generations finish off earlier than the others. This mismatch of length underutilizes the GPUs.\\n\\nFigure 8: Static batching of sequences\\n\\nThis type of batching sequences is called static batching. While this is simple and easy to understand, it inherently comes with inefficiencies. Only after each sentence is completely generated can we move on to the next batch.\\n\\nTo bypass this issue, we use dynamic batching (also known as continuous batching). Instead of waiting for all the generation to finish, we schedule incoming requests to the completed generations. That way, as soon as a generation in a batch is complete, we prefill the batch with the next request. 
The process looks like Figure 9.\\n\\nFigure 9: Continuous Batching of sequences\\n\\nTransformers supports continuous batching with the generate_batch API. This is not meant for production-grade model serving –frameworks like vLLM and SGLang are great at that–, but can be very helpful for evaluation and experimentation. Here is an example script that runs CB end to end on Qwen/Qwen3-4B-Instruct-2507.\\n\\nWe have also performed a benchmark between Continuous Batching and Static Batching with 100 samples. In Figure 9, we note that CB is quite faster than SB.\\n\\nFigure 9: Continuous vs Static Batching Tokens/Second\\n\\nYou can play around with the benchmark here: SB, CB\\n\\nLoad larger models faster\\n\\nWhen you load a large model into your GPU, PyTorch needs to reserve GPU memory for each layer’s weights. Each of these requests (per layer) takes time, and for multi-billion-parameter models it can mean thousands of tiny memory allocations, adding up to a long wait before the model is ready. Instead of asking the GPU for new memory every single time, it can hold on to a big chunk once and then hand out slices from it quickly.\\n\\nPyTorch allocators can do exactly this. The catch is that the allocator only gets fast after you’ve given it some memory to work with. If you don’t “stock the pantry” first, you still end up doing many slow trips to the market. This PR (🎉 #36380) taught transformers to pre-stock the pantry before it starts copying model weights.\\n\\nIt:\\n\\nLooks at the device_map (where each layer will live).\\n\\nPre-allocates a big enough block on each GPU.\\n\\nThen, as layers are copied in, they just slot neatly into this pre-reserved space.\\n\\nYou have to make no changes to your existing code, as this is default behaviour in transformers. If you use device_map=\"auto\" or provide your own device map, your model will now load faster automatically. If you’re running with Tensor Parallel (tp_plan=\"auto\") and torchrun you also benefit from companion changes that make multi-GPU loading smarter.\\n\\nConclusion\\n\\ntransformers moves quickly and it is community-first. The library evolves at the pace of the field because contributors shape it in the open. Pieces added for new models become part of the toolkit and are reused in future integrations.\\n\\nThis velocity enables day-zero integrations like the GPT-OSS series. As the stack becomes increasingly PyTorch-first, it trims bloat and doubles down on the PyTorch paths that matter in practice. The result is a cleaner core that unlocks new capabilities through community kernels, quantization, and parallelism plans, while also standardizing model definitions so that architectures supported in transformers are a reference and extend across the wider ecosystem.\\n\\nThis post is a one-time snapshot of a process we repeatedly iterate on towards the same direction: serve the needs of the community. To be up to date with the latest additions to transformers, check the docs and release notes. 
And please, keep sharing your feedback and releasing your models in transformers for the community to enjoy 🤗\\n\\nRead More\\n\\nIf you want to go further into particular topics, here is a list of links that one should visit:\\n\\nHugging Face GPT-OSS Recipes Repository\\n\\nWelcome GPT OSS: OpenAI\\'s New Open-Source Model Family\\n\\nOpenAI Cookbook: GPT-OSS Topic\\n\\nTransformers Documentation: Distributed Inference on Multiple GPUs\\n\\nMatthew Carrigan\\'s X Thread on GPT OSS Innovations\\n\\nYouTube Video: OpenAI GPT OSS Announcement\\n\\nTransformers PR #36380: Faster Model Loading on Accelerators\\n\\nTransformers PR #36335: Update from_pretrained for Tensor Parallelism\\n\\nTransformers PR #40039: New Dynamic Sliding Window Layer and Cache\\n\\nHAN Lab Blog: How Attention Sinks Keep Language Models Stable\\n\\nMore Articles from our Blog\\n\\nMake your ZeroGPU Spaces go brrr with PyTorch ahead-of-time compilation\\n\\nBy September 2, 2025 • 66\\n\\nFrom Zero to GPU: A Guide to Building and Scaling Production-Ready CUDA Kernels\\n\\nBy August 18, 2025 • 75\\n\\nCommunity\\n\\nmmalam786\\n\\nabout 19 hours ago\\n\\nVery interesting post — I appreciate how you highlight MXFP4 quantization and the performance trade-offs in Transformers. In my own experiments, I applied quantization to a mmalam786/distilbert-sst2-int8-onnx-demo and saw ~20-30% latency reduction with negligible accuracy drop. Do you have insights on how MXFP4 handles activation quantization vs weight quantization in edge scenarios? Happy to share my demo if useful: https://www.linkedin.com/posts/dr-mm-alam-93991120b_demofirst-aichips-edgeai-activity-7381674484098883584-0Rwn/?utm_source=share&utm_medium=member_desktop&rcm=ACoAADVZuP0BheDJgKL8dWk-bNo7Yd4zhsOnNL4\\n\\nSign up or log in to comment\\n\\nUpvote\\n\\n152', 'text_length': 72868, 'message': 'Retrieved 15 pages for blog_faster-transformers_20251009_165057.html'}\u001b[0m" + ] + }, + { + "name": "stderr", + "output_type": "stream", + "text": [ + "INFO: HTTP Request: POST https://api.openai.com/v1/chat/completions \"HTTP/1.1 200 OK\"\n" + ] + }, + { + "name": "stdout", + "output_type": "stream", + "text": [ + "\u001b[32;1m\u001b[1;3m\n", + "Invoking: `generate_summary` with `Back to Articles\n", + "\n", + "Tricks from OpenAI gpt-oss YOU 🫵 can use with transformers\n", + "\n", + "Published September 11, 2025\n", + "\n", + "Update on GitHub\n", + "\n", + "Upvote\n", + "\n", + "152\n", + "\n", + "Aritra Roy Gosthipaty\n", + "\n", + "ariG23498\n", + "\n", + "Sergio Paniego\n", + "\n", + "sergiopaniego\n", + "\n", + "Vaibhav Srivastav\n", + "\n", + "reach-vb\n", + "\n", + "Pedro Cuenca\n", + "\n", + "pcuenq\n", + "\n", + "Arthur Zucker\n", + "\n", + "ArthurZ\n", + "\n", + "Nathan Habib\n", + "\n", + "SaylorTwift\n", + "\n", + "Cyril Vallez\n", + "\n", + "cyrilvallez\n", + "\n", + "OpenAI recently released their GPT-OSS series of models. The models feature some novel techniques like MXFP4 quantization, efficient kernels, a brand new chat format, and more. To enable the release of gpt-oss through transformers, we have upgraded the library considerably. The updates make it very efficient to load, run, and fine-tune the models.\n", + "\n", + "In this blog post, we talk about all the upgrades in-depth, and how they become part of the transformers toolkit so other models (current and future) can benefit from them. Providing clean implementations of new methods in transformers also allows the community to quickly understand and adopt them. 
Frameworks such as MLX, llama.cpp or vLLM can use the transformers code as a reference to build their own implementations.\n", + "\n", + "For this release, we worked on:\n", + "\n", + "Zero-build Kernels, downloadable from the Hub\n", + "\n", + "MXFP4 Quantization\n", + "\n", + "Tensor Parallelism\n", + "\n", + "Expert Parallelism\n", + "\n", + "Dynamic Sliding Window Layer & Cache\n", + "\n", + "Continuous Batching & Paged Attention\n", + "\n", + "Load larger models faster\n", + "\n", + "Best part: Most of these features should work across all major models within transformers!\n", + "\n", + "Zero-build Kernels, downloadable from the Hub\n", + "\n", + "A kernel is a specialized, compact program that runs on accelerators to execute tasks like matrix multiplications, activations, or normalizations. In eager PyTorch, operations trigger individual kernels sequentially, which is straightforward but can incur extra memory transfers and launch overheads. PyTorch 2.0's torch.compile with backends like TorchInductor addresses this by automatically fusing and optimizing kernels, delivering 2–10× performance gains.\n", + "\n", + "In addition, the community has created custom kernels for frequent combinations of operations, not just individual PyTorch ops like matmul. For example, Flash Attention was created to optimize the critical attention block that defines the transformers architecture, and is present in many models including most LLMs. By carefully combining all the attention operations inside a single kernel, memory transfers are minimized, memory use is reduced, and speedups can be achieved.\n", + "\n", + "The problem is that all these various kernels are available in separate libraries, which creates a dependency bloat if they were to be added to the transformers library. Furthermore, these kernels are not just Python code, they consist of low-level cuda code, glued together with C++ and exposed through a Python layer. This means they have to be compiled in the target system, which in turn requires whatever build system is required by each kernel library.\n", + "\n", + "The kernels package solves this problem by downloading pre-built binaries of supported kernels from the Hub. You just indicate the kernel you want to use, and kernels will look for a version compatible with your system and download it on first use.\n", + "\n", + "Custom Kernels for GPT-OSS\n", + "\n", + "GPT-OSS, a Mixture of Experts (MoE) model, is a big user of Kernels from the Hub. It leverages several custom kernels:\n", + "\n", + "Liger RMSNorm, used as @use_kernel_forward_from_hub(\"RMSNorm\")`\n", + "\n", + "Megablocks MoE kernels: @use_kernel_forward_from_hub(\"MegaBlocksMoeMLP\")\n", + "\n", + "Flash Attention 3 with support for attention sinks.\n", + "\n", + "MXFP4 triton kernels (covered later)\n", + "\n", + "Let's take a look at the first two ones.\n", + "\n", + "Behind the scenes, the decorators (1 and 2) simply point to community-contributed kernels. For example, RMSNorm comes from liger_kernels, while the MegaBlocksMoeMLP kernel comes from megablocks. 
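If you want to see the zero-build mechanism in isolation, a minimal sketch with the kernels package looks like this; the repository name is just one example from the kernels-community organization, and the exact attributes exposed depend on the kernel you load:

```python
# Minimal sketch of fetching a pre-built kernel from the Hub (illustrative repo).
from kernels import get_kernel

# First call downloads a binary compatible with this machine and caches it;
# no local CUDA/C++ toolchain or compilation step is needed.
activation = get_kernel("kernels-community/activation")
print([name for name in dir(activation) if not name.startswith("_")])
```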
Depending on your device (CUDA or ROCm) and whether you’re training or running inference, the right kernel is pulled in automatically.\n", + "\n", + "This design is both specific and general: the RMSNorm liger kernels are already being reused across multiple models, and the MoE kernel could be applied to future MoEs as well.\n", + "\n", + "Because kernels pulls code from the Hub, you have to opt-in to this feature by passing use_kernels=True in your model instantiation, as shown below. We enable INFO logging in the example so you can easily verify that downloadable kernels are in use.\n", + "\n", + "These kernels are not compatible with mxfp4, so inference will happen in bfloat16 if you use them. Please, benchmark your system for the best combination in memory and throughput that suits your project!\n", + "\n", + "from transformers import AutoTokenizer, AutoModelForCausalLM import logging logging.basicConfig(level=logging.INFO) model_id = \"openai/gpt-oss-20b\" tokenizer = AutoTokenizer.from_pretrained(model_id) model = AutoModelForCausalLM.from_pretrained( model_id, dtype=\"auto\", device_map=\"auto\", use_kernels=True, )\n", + "\n", + "Running a quick generation yields log messages like\n", + "\n", + "INFO:root:Using layer `LigerRMSNorm` from repo `kernels-community/liger_kernels` INFO:root:Using layer `MegaBlocksMoeMLP` from repo `kernels-community/megablocks`\n", + "\n", + "Figure 1 shows that, in the system we tested, these kernels work best for larger batch sizes. We always recommend to benchmark any performance-related changes as closely to your production conditions as possible.\n", + "\n", + "Figure 1: Benchmarking results of custom kernels\n", + "\n", + "You can explore and play with the benchmarking script here\n", + "\n", + "Flash Attention 3\n", + "\n", + "OpenAI gpt-oss models use attention sinks, which improves quality and facilitates the use of longer contexts. The vLLM team added this feature to the latest version of Flash Attention (Flash Attention 3), and the resulting custom kernel is available on the Hub. Currently, this kernel is compatible with the Hopper architecture. If you have one, this is the way to enable it:\n", + "\n", + "model = AutoModelForCausalLM.from_pretrained( model_id, dtype=\"auto\", device_map=\"auto\", + # Flash Attention with Sinks + attn_implementation=\"kernels-community/vllm-flash-attn3\", )\n", + "\n", + "MXFP4 Quantization\n", + "\n", + "Large language models are memory-hungry. Quantization reduces memory footprint by storing weights (and sometimes activations) in lower-precision formats. For reference, FP32 uses 32 bits per number and BF16 uses 16. By reducing bit width, we trade some precision for smaller models and faster memory movement.\n", + "\n", + "If you want a visual primer on quantization trade-offs, Maarten Grootendorst’s article is excellent: A Visual Guide to Quantization.\n", + "\n", + "What is MXFP4\n", + "\n", + "Figure 2: The E2M1 format used in the MXFP4 format\n", + "\n", + "MXFP4 is a 4-bit floating format with E2M1 layout: 1 sign bit, 2 exponent bits, and 1 mantissa bit, as shown in Figure 2. On its own, E2M1 is very coarse. MXFP4 compensates with blockwise scaling:\n", + "\n", + "Vectors are grouped into blocks of 32 elements.\n", + "\n", + "Each block stores a shared scale that restores dynamic range when dequantizing.\n", + "\n", + "Inside each block, 4-bit values represent numbers relative to that scale.\n", + "\n", + "This blockwise scheme lets MXFP4 keep range while using very few bits. 
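To make the blockwise idea concrete, here is a small, purely illustrative numpy sketch that quantizes one 32-element block to the E2M1 grid with a shared scale and measures the round-trip error. The real MX format constrains the shared scale to a power of two and runs inside fused kernels; a plain float scale is used here for brevity:

```python
# Toy blockwise MXFP4-style quantization (not the library's kernel code).
import numpy as np

E2M1_GRID = np.array([0.0, 0.5, 1.0, 1.5, 2.0, 3.0, 4.0, 6.0])  # representable magnitudes

def quantize_block(block: np.ndarray):
    scale = np.abs(block).max() / E2M1_GRID.max()      # shared per-block scale
    scaled = block / scale
    # Snap each scaled value to the nearest representable E2M1 magnitude.
    codes = np.argmin(np.abs(np.abs(scaled)[:, None] - E2M1_GRID[None, :]), axis=1)
    quant = np.sign(scaled) * E2M1_GRID[codes]          # what the 4-bit codes represent
    return quant, scale

def dequantize_block(quant: np.ndarray, scale: float):
    return quant * scale

rng = np.random.default_rng(0)
block = rng.normal(size=32).astype(np.float32)          # one 32-element block
quant, scale = quantize_block(block)
recon = dequantize_block(quant, scale)
print("shared scale :", scale)
print("mean abs err :", np.abs(block - recon).mean())
```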
In practice, GPT-OSS 20B fits in roughly 16 GB of VRAM and GPT-OSS 120B fits in roughly 80 GB when MXFP4 is active, which is the difference between “cannot load” and “can run on a single GPU.” The catch is that matrix multiplies now have to respect block scales. Doing this efficiently at scale requires dedicated kernels.\n", + "\n", + "MXFP4 in transformers\n", + "\n", + "transformers now includes native support for MXFP4, leveraging optimized triton (MXFP4) kernels for enhanced performance. This builds on the community-driven kernel distribution discussed earlier, utilizing pre-compiled kernels from the Hub to simplify deployment.\n", + "\n", + "Key implementation details:\n", + "\n", + "Quantizer logic: Found in the MXFP4 quantizer file, this handles the core quantization process for MXFP4.\n", + "\n", + "Integration hooks: The MXFP4 integration file enables seamless use of MXFP4 within the transformers framework.\n", + "\n", + "To check if a model supports MXFP4, inspect its configuration:\n", + "\n", + "from transformers import GptOssConfig model_id = \"openai/gpt-oss-120b\" cfg = GptOssConfig.from_pretrained(model_id) print(cfg.quantization_config) # Example output: # { # 'modules_to_not_convert': [ # 'model.layers.*.self_attn', # 'model.layers.*.mlp.router', # 'model.embed_tokens', # 'lm_head' # ], # 'quant_method': 'mxfp4' # }\n", + "\n", + "If 'quant_method': 'mxfp4' is present, the model will automatically use the MXFP4 pathway with Triton kernels when supported.\n", + "\n", + "Thanks to this pull request, you can fine-tune gpt-oss models and save them directly to the Hub in MXFP4 format, streamlining deployment with optimized performance.\n", + "\n", + "Requirements and fallbacks\n", + "\n", + "To run MXFP4 on GPU you need:\n", + "\n", + "accelerate, kernels, and triton>=3.4 installed. Note that Pytorch 2.8 already comes with triton 3.4, so you only need to manually install triton if using Pytorch 2.7.\n", + "\n", + "NVIDIA GPU with compute capability ≥ 7.5. This goes all the way back to Tesla, so you can run gpt-oss-20b on the free tiers of Google Colab and Kaggle, and on many consumer GPUs.\n", + "\n", + "If these constraints are not met, transformers falls back to a higher-precision path (bfloat16 is used by default), which requires about 4× the memory of MXFP4.\n", + "\n", + "The snippet loads GPT-OSS twice on CUDA: once with Mxfp4Config(dequantize=True) (memory intensive) and once in the default quantized path (memory efficient). Figure 3 shows the amount of used VRAM after each load so you can visualize the savings.\n", + "\n", + "Figure 3: Memory requirements for the quantized and dequantized models\n", + "\n", + "Kernels for MXFP4\n", + "\n", + "Efficient MXFP4 requires kernels that understand 32-element blocks and their scales during GEMMs and fused ops. This is where Kernels from the Hub comes in again. transformers automatically pulls in the MXFP4-aware Triton kernels from the community repository when you load a model that needs them. The repository will appear in your local cache and will be used during the forward pass. 
For the MXFP4 kernels one does not need to use the use_kernels=True parameter like before, it is set to default in transformers.\n", + "\n", + "Quick sanity check with the Hugging Face cache CLI, after running gpt-oss-20b on a GPU compatible with the triton MXFP4 kernels:\n", + "\n", + "hf cache scan\n", + "\n", + "Sample output:\n", + "\n", + "REPO ID REPO TYPE SIZE ON DISK -------------------------------- --------- ------------ kernels-community/triton_kernels model 536.2K openai/gpt-oss-20b model 13.8G\n", + "\n", + "This indicates the MXFP4 kernels were fetched and are available for execution.\n", + "\n", + "Let's run some benchmarks and see how well the MXFP4 kernels perform. In Figure 4, we see that the MXFP4 kernels are even better than the custom MoE and RMSNorm kernels for larger batches.\n", + "\n", + "Figure 4: MXFP4 kernel benchmark\n", + "\n", + "You can explore and play with the benchmarking script here\n", + "\n", + "Tensor Parallelism\n", + "\n", + "Figure 5: Explanation of tensor parallelism.\n", + "\n", + "Tensor Parallelism (TP) splits tensors inside a layer across multiple GPUs (as shown in Figure 5). Each GPU multiplies its shard in parallel, and then partial results are collected using all-gather or all-reduce operations. This reduces per-GPU memory and keeps all GPUs working on the same layer, which improves throughput as sequence length or batch size grow. TP is communication-intensive and generally works best on a single machine with fast intra-node links.\n", + "\n", + "What this enables in transformers\n", + "\n", + "transformers implements TP directly in from_pretrained. You can start with the predefined plan:\n", + "\n", + "# run with: torchrun --nproc-per-node 4 tp_gpt_oss.py import torch from transformers import PreTrainedTokenizerFast, GptOssForCausalLM model_id = \"openai/gpt-oss-120b\" tokenizer = PreTrainedTokenizerFast.from_pretrained(model_id) model = GptOssForCausalLM.from_pretrained( model_id, tp_plan=\"auto\", # built in TP support dtype=\"auto\", ).eval() messages = [ {\"role\": \"system\", \"content\": \"Be concise.\"}, {\"role\": \"user\", \"content\": \"Explain KV caching briefly.\"}, ] inputs = tokenizer.apply_chat_template( messages, add_generation_prompt=True, return_tensors=\"pt\", return_dict=True, reasoning_effort=\"low\", ).to(model.device) with torch.inference_mode(): generations = model.generate(**inputs, max_new_tokens=128) print(tokenizer.decode(generations[0][inputs[\"input_ids\"].shape[-1]:]))\n", + "\n", + "If you don’t have the infrastructure to run the above, you can just spawn a process on our GPUs using Hugging Face Jobs!\n", + "\n", + "hf jobs run --detach --flavor l4x4 ghcr.io/astral-sh/uv:debian /bin/bash -c \\ \"uv venv .venv --python 3.12 && \\ source .venv/bin/activate && \\ uv pip install --upgrade torch numpy transformers accelerate triton kernels && \\ wget https://huggingface.co/datasets/ariG23498/distributed/raw/main/tp_gpt_oss.py && \\ torchrun --nproc-per-node=4 tp_gpt_oss.py\"\n", + "\n", + "hf jobs is available for all Hugging Face PRO & Enterprise users.\n", + "\n", + "Under the hood, tp_plan=\"auto\" selects a predefined sharding recipe for each layer and wires the necessary collectives. You can inspect the active plan with print(model._tp_plan) if you want to verify what is being sharded.\n", + "\n", + "When to reach for TP\n", + "\n", + "Use TP when the model is too large for one GPU and you want parallel compute, not only memory placement. 
TP tends to scale throughput with more GPUs, especially for long sequences or larger batches.\n", + "\n", + "If you are curious about how TP differs from device_map=\"auto\" (memory placement), this short Stack Overflow answer explains the distinction and when to use each.\n", + "\n", + "To learn more about TP, here are two must-read resources:\n", + "\n", + "transformers guide: Tensor parallelism, supported models, plans, and extension points.\n", + "\n", + "Ultra-Scale Playbook: background on TP and its relationship to other parallelism modes.\n", + "\n", + "Expert Parallelism\n", + "\n", + "Expert Parallelism (EP) shards experts inside MoE layers across GPUs. Each token is routed to one or a few experts, so only those experts run their feed-forward pass. Since experts are independent MLPs, we can place different experts on different ranks and exchange only the hidden states for the routed tokens. This keeps the matrix multiplies intact on each rank and replaces tensor slicing with routing and collectives.\n", + "\n", + "Run with multiple processes using torchrun. EP is enabled via the distributed configuration and works with GPT-OSS MoE layers out of the box in transformers.\n", + "\n", + "# run with: torchrun --nproc-per-node 4 ep_gpt_oss.py import torch from transformers import PreTrainedTokenizerFast, GptOssForCausalLM from transformers.distributed import DistributedConfig model_id = \"openai/gpt-oss-120b\" tokenizer = PreTrainedTokenizerFast.from_pretrained(model_id) model = GptOssForCausalLM.from_pretrained( model_id, distributed_config=DistributedConfig(enable_expert_parallel=True), # enabling EP dtype=\"auto\", ).eval() messages = [ {\"role\": \"system\", \"content\": \"Be concise.\"}, {\"role\": \"user\", \"content\": \"Explain KV caching briefly.\"}, ] inputs = tokenizer.apply_chat_template( messages, add_generation_prompt=True, return_tensors=\"pt\", return_dict=True, reasoning_effort=\"low\", ).to(model.device) with torch.inference_mode(): generations = model.generate(**inputs, max_new_tokens=128) print(tokenizer.decode(generations[0][inputs[\"input_ids\"].shape[-1]:]))\n", + "\n", + "Here is how you would run using hf jobs\n", + "\n", + "hf jobs run --detach --flavor l4x4 ghcr.io/astral-sh/uv:debian /bin/bash -c \\ \"uv venv .venv --python 3.12 && \\ source .venv/bin/activate && \\ uv pip install --upgrade torch numpy transformers accelerate triton kernels && \\ wget https://huggingface.co/datasets/ariG23498/distributed/raw/main/ep_gpt_oss.py && \\ torchrun --nproc-per-node=4 ep_gpt_oss.py\"\n", + "\n", + "When you enable Expert Parallelism, Tensor Parallelism is also activated. This means you enjoy the best of both worlds!\n", + "\n", + "Dynamic Sliding Window Layer & Cache\n", + "\n", + "Many recent LLMs use sliding window attention, or a combination of sliding and global attention layers, as a means to save memory and reduce those expensive quadratic matmuls that grow with sequence length. However, the dynamic KV cache implementation in transformers used to continue to allocate space according to sequence length, without looking at the individual attention layers. You could always optimize memory using compilation (meaning, fixed shapes), but that's a separate scenario altogether.\n", + "\n", + "transformers now has a DynamicSlidingWindowLayer and a config‑aware DynamicCache. If the model config declares sliding‑window or hybrid attention (both sliding and global attention layers are used), the cache stops growing past the window for the sliding layers. 
If you don’t pass the config, behavior stays as before (full, ever‑growing KV as sequence length grows).\n", + "\n", + "For models that only use sliding window layers, such as Mistral 7B, cache memory stops growing when the sequence reaches the window size (4096, in this case). This makes sense, because the sliding layers can't look past the previous 4K tokens anyway.\n", + "\n", + "OpenAI gpt-oss alternates between sliding and global attention layers, which results in total KV cache memory being halved, as we'll see, as sequence length increases. This provides us with:\n", + "\n", + "Much lower KV‑cache memory for models with sliding or hybrid attention (e.g. GPT‑OSS). Cache growth plateaus once the window is reached (e.g., 4K for Mistral; 128 for GPT‑OSS sliding layers), instead of scaling linearly with total generated tokens. (GitHub, Transformers)\n", + "\n", + "Speed/latency wins on long prompts/long generations: smaller KV tensors mean lighter attention reads/writes and less memory bandwidth pressure, especially after the window is hit. (This is the central motivation behind sliding‑window/hybrid LLMs.) (AI21, vLLM Blog)\n", + "\n", + "How to use it\n", + "\n", + "The optimized cache is set by default, that means you don't have to make any changes to your existing code. If you want to create the DynamicCache explicitly here is how you would do it:\n", + "\n", + "from transformers import AutoModelForCausalLM, AutoTokenizer, DynamicCache model_id = \"openai/gpt-oss-20b\" tokenizer = AutoTokenizer.from_pretrained(model_id) model = AutoModelForCausalLM.from_pretrained( model_id, dtype=\"auto\", device_map=\"auto\", ).eval() messages = [ {\"role\": \"system\", \"content\": \"Always respond in riddles\"}, {\"role\": \"user\", \"content\": \"What is the weather like in Madrid?\"}, ] inputs = tokenizer.apply_chat_template( messages, add_generation_prompt=True, return_tensors=\"pt\", return_dict=True, reasoning_effort=\"low\", ).to(model.device) cache = DynamicCache(config=model.config) # create the cache with the model's config generated = model.generate( **inputs, max_new_tokens=500, past_key_values=cache ) print(tokenizer.decode(generated[0][inputs[\"input_ids\"].shape[-1]:]))\n", + "\n", + "Figure 6 showcases how much of a difference it makes for us to use the Dynamic KV Cache with sliding window attention.\n", + "\n", + "Figure 6: The memory analysis of dynamic cache with sliding window attention\n", + "\n", + "Continuous Batching & Paged Attention\n", + "\n", + "A typical autoregressive generation process looks like Figure 7. You input the prefill tokens, and the model predicts each new token one after the other until it predicts the EOS (End of Sequence) token.\n", + "\n", + "Figure 7: Autoregressive token generation\n", + "\n", + "Let’s see what the generation process looks like when we pass a batch of inputs. In Figure 8 you notice that some generations finish off earlier than the others. This mismatch of length underutilizes the GPUs.\n", + "\n", + "Figure 8: Static batching of sequences\n", + "\n", + "This type of batching sequences is called static batching. While this is simple and easy to understand, it inherently comes with inefficiencies. Only after each sentence is completely generated can we move on to the next batch.\n", + "\n", + "To bypass this issue, we use dynamic batching (also known as continuous batching). Instead of waiting for all the generation to finish, we schedule incoming requests to the completed generations. 
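The scheduling idea itself is easy to simulate. The toy sketch below is independent of the transformers API and uses made-up generation lengths; it simply counts how many decode steps each strategy needs to serve the same set of requests with a batch size of 4:

```python
# Toy comparison of static vs. continuous batching (illustrative only).
import random

random.seed(0)
requests = [random.randint(10, 100) for _ in range(16)]  # tokens each request needs
batch_size = 4

def static_batching_steps(lengths, batch_size):
    # A batch only retires when its longest sequence finishes.
    steps = 0
    for i in range(0, len(lengths), batch_size):
        steps += max(lengths[i:i + batch_size])
    return steps

def continuous_batching_steps(lengths, batch_size):
    # A finished slot is immediately refilled with the next pending request.
    pending = list(lengths)
    active = [pending.pop() for _ in range(batch_size)]
    steps = 0
    while active:
        steps += 1
        active = [t - 1 for t in active if t > 1]        # drop finished sequences
        while pending and len(active) < batch_size:      # refill free slots
            active.append(pending.pop())
    return steps

print("static    :", static_batching_steps(requests, batch_size), "steps")
print("continuous:", continuous_batching_steps(requests, batch_size), "steps")
```

Continuous batching comes out well ahead because no GPU slot sits idle waiting for the longest sequence in its batch.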
That way, as soon as a generation in a batch is complete, we prefill the batch with the next request. The process looks like Figure 9.\n", + "\n", + "Figure 9: Continuous Batching of sequences\n", + "\n", + "Transformers supports continuous batching with the generate_batch API. This is not meant for production-grade model serving –frameworks like vLLM and SGLang are great at that–, but can be very helpful for evaluation and experimentation. Here is an example script that runs CB end to end on Qwen/Qwen3-4B-Instruct-2507.\n", + "\n", + "We have also performed a benchmark between Continuous Batching and Static Batching with 100 samples. In Figure 9, we note that CB is quite faster than SB.\n", + "\n", + "Figure 9: Continuous vs Static Batching Tokens/Second\n", + "\n", + "You can play around with the benchmark here: SB, CB\n", + "\n", + "Load larger models faster\n", + "\n", + "When you load a large model into your GPU, PyTorch needs to reserve GPU memory for each layer’s weights. Each of these requests (per layer) takes time, and for multi-billion-parameter models it can mean thousands of tiny memory allocations, adding up to a long wait before the model is ready. Instead of asking the GPU for new memory every single time, it can hold on to a big chunk once and then hand out slices from it quickly.\n", + "\n", + "PyTorch allocators can do exactly this. The catch is that the allocator only gets fast after you’ve given it some memory to work with. If you don’t “stock the pantry” first, you still end up doing many slow trips to the market. This PR (🎉 #36380) taught transformers to pre-stock the pantry before it starts copying model weights.\n", + "\n", + "It:\n", + "\n", + "Looks at the device_map (where each layer will live).\n", + "\n", + "Pre-allocates a big enough block on each GPU.\n", + "\n", + "Then, as layers are copied in, they just slot neatly into this pre-reserved space.\n", + "\n", + "You have to make no changes to your existing code, as this is default behaviour in transformers. If you use device_map=\"auto\" or provide your own device map, your model will now load faster automatically. If you’re running with Tensor Parallel (tp_plan=\"auto\") and torchrun you also benefit from companion changes that make multi-GPU loading smarter.\n", + "\n", + "Conclusion\n", + "\n", + "transformers moves quickly and it is community-first. The library evolves at the pace of the field because contributors shape it in the open. Pieces added for new models become part of the toolkit and are reused in future integrations.\n", + "\n", + "This velocity enables day-zero integrations like the GPT-OSS series. As the stack becomes increasingly PyTorch-first, it trims bloat and doubles down on the PyTorch paths that matter in practice. The result is a cleaner core that unlocks new capabilities through community kernels, quantization, and parallelism plans, while also standardizing model definitions so that architectures supported in transformers are a reference and extend across the wider ecosystem.\n", + "\n", + "This post is a one-time snapshot of a process we repeatedly iterate on towards the same direction: serve the needs of the community. To be up to date with the latest additions to transformers, check the docs and release notes. 
And please, keep sharing your feedback and releasing your models in transformers for the community to enjoy 🤗\n", + "\n", + "Read More\n", + "\n", + "If you want to go further into particular topics, here is a list of links that one should visit:\n", + "\n", + "Hugging Face GPT-OSS Recipes Repository\n", + "\n", + "Welcome GPT OSS: OpenAI's New Open-Source Model Family\n", + "\n", + "OpenAI Cookbook: GPT-OSS Topic\n", + "\n", + "Transformers Documentation: Distributed Inference on Multiple GPUs\n", + "\n", + "Matthew Carrigan's X Thread on GPT OSS Innovations\n", + "\n", + "YouTube Video: OpenAI GPT OSS Announcement\n", + "\n", + "Transformers PR #36380: Faster Model Loading on Accelerators\n", + "\n", + "Transformers PR #36335: Update from_pretrained for Tensor Parallelism\n", + "\n", + "Transformers PR #40039: New Dynamic Sliding Window Layer and Cache\n", + "\n", + "HAN Lab Blog: How Attention Sinks Keep Language Models Stable\n", + "\n", + "More Articles from our Blog\n", + "\n", + "Make your ZeroGPU Spaces go brrr with PyTorch ahead-of-time compilation\n", + "\n", + "By September 2, 2025 • 66\n", + "\n", + "From Zero to GPU: A Guide to Building and Scaling Production-Ready CUDA Kernels\n", + "\n", + "By August 18, 2025 • 75\n", + "\n", + "Community\n", + "\n", + "mmalam786\n", + "\n", + "about 19 hours ago\n", + "\n", + "Very interesting post — I appreciate how you highlight MXFP4 quantization and the performance trade-offs in Transformers. In my own experiments, I applied quantization to a mmalam786/distilbert-sst2-int8-onnx-demo and saw ~20-30% latency reduction with negligible accuracy drop. Do you have insights on how MXFP4 handles activation quantization vs weight quantization in edge scenarios? Happy to share my demo if useful: https://www.linkedin.com/posts/dr-mm-alam-93991120b_demofirst-aichips-edgeai-activity-7381674484098883584-0Rwn/?utm_source=share&utm_medium=member_desktop&rcm=ACoAADVZuP0BheDJgKL8dWk-bNo7Yd4zhsOnNL4\n", + "\n", + "Sign up or log in to comment\n", + "\n", + "Upvote\n", + "\n", + "152`\n", + "\n", + "\n", + "\u001b[0m" + ] + }, + { + "name": "stderr", + "output_type": "stream", + "text": [ + "INFO: HTTP Request: POST https://api.openai.com/v1/chat/completions \"HTTP/1.1 200 OK\"\n" + ] + }, + { + "name": "stdout", + "output_type": "stream", + "text": [ + "\u001b[33;1m\u001b[1;3m{'status': 'success', 'summary': \"OpenAI's GPT-OSS series introduces novel advancements such as MXFP4 quantization, efficient kernels, and a new chat format, significantly enhancing the transformers library's efficiency in loading, running, and fine-tuning models. Performance improvements are evident with PyTorch 2.0's torch.compile, which optimizes kernels for 2–10× gains, and the use of custom kernels like Flash Attention 3, which minimizes memory transfers and speeds up operations. Practical applications include the ability to load larger models faster, with GPT-OSS 20B fitting in 16 GB of VRAM using MXFP4, enabling single GPU deployment. Industry impact is seen in the integration of Tensor and Expert Parallelism, which distribute workloads across GPUs for improved throughput and memory efficiency. The introduction of Dynamic Sliding Window Layer & Cache reduces memory usage for models with sliding or hybrid attention, enhancing speed and latency for long prompts. Continuous Batching further optimizes GPU utilization by dynamically scheduling requests, outperforming static batching. 
These advancements, driven by community contributions, position transformers as a reference framework for future model implementations and integrations.\", 'length': 1221, 'message': 'Generated summary (1221 characters)'}\u001b[0m" + ] + }, + { + "name": "stderr", + "output_type": "stream", + "text": [ + "INFO: HTTP Request: POST https://api.openai.com/v1/chat/completions \"HTTP/1.1 200 OK\"\n" + ] + }, + { + "name": "stdout", + "output_type": "stream", + "text": [ + "\u001b[32;1m\u001b[1;3m\n", + "Invoking: `collapse_summaries` with `[{\"summary\":\"BigCodeArena introduces a novel advancement in AI code generation evaluation by enabling real-time execution of AI-generated code, allowing users to compare models and vote on their performance based on actual outputs. This platform addresses the limitations of traditional benchmarks by providing a human-in-the-loop system where code can be run in isolated environments, supporting multiple languages and frameworks. The platform has shown significant performance improvements, with models like o3-mini and o1-mini consistently ranking at the top across various languages and execution environments. Practical applications span web design, game development, scientific computing, and more, highlighting the platform's versatility and industry impact. BigCodeArena's community-driven approach has led to over 14,000 conversations and 4,700 preference votes, offering valuable insights into model performance across diverse coding scenarios. The introduction of benchmarks like BigCodeReward and AutoCodeArena further enhances the evaluation process, with execution feedback proving crucial for accurate code quality assessment. Open-source contributions and continuous updates aim to expand language support and improve automated code quality assessment, positioning BigCodeArena as a leading tool for evaluating AI-assisted programming.\"},{\"summary\":\"In 2025, the feasibility of running competitive OCR models on-device has significantly improved, exemplified by RedNote's dots.ocr, a 3 billion parameter model that outperforms Gemini 2.5 Pro in the OmniDocBench. This advancement allows for OCR applications without network dependency, API costs, or data privacy concerns. Apple's Neural Engine, a custom AI accelerator, enhances on-device performance by being 12x more power-efficient than CPUs and 4x more than GPUs. However, the Neural Engine is only accessible via Apple's closed-source Core ML framework, which poses challenges for developers converting models from PyTorch. To address this, Apple offers MLX, a flexible ML framework targeting GPUs, which can be used alongside Core ML. The article outlines a conversion process for dots.ocr using CoreML and MLX, demonstrating the potential for broader application in on-device model deployment. Despite successful conversion, the model's initial size of over 5GB highlights the need for further optimization for practical deployment. Future parts of the series will explore integration and optimization strategies, including quantization and dynamic shapes, to enhance performance on the Neural Engine.\"},{\"summary\":\"OpenAI's GPT-OSS series introduces novel advancements such as MXFP4 quantization, efficient kernels, and a new chat format, significantly enhancing the transformers library's efficiency in loading, running, and fine-tuning models. 
Performance improvements are evident with PyTorch 2.0's torch.compile, which optimizes kernels for 2–10× gains, and the use of custom kernels like Flash Attention 3, which minimizes memory transfers and speeds up operations. Practical applications include the ability to load larger models faster, with GPT-OSS 20B fitting in 16 GB of VRAM using MXFP4, enabling single GPU deployment. Industry impact is seen in the integration of Tensor and Expert Parallelism, which distribute workloads across GPUs for improved throughput and memory efficiency. The introduction of Dynamic Sliding Window Layer & Cache reduces memory usage for models with sliding or hybrid attention, enhancing speed and latency for long prompts. Continuous Batching further optimizes GPU utilization by dynamically scheduling requests, outperforming static batching. These advancements, driven by community contributions, position transformers as a reference framework for future model implementations and integrations.\"}]`\n", + "\n", + "\n", + "\u001b[0m\u001b[38;5;200m\u001b[1;3m{'status': 'success', 'collapsed_summaries': [{'summary': \"BigCodeArena introduces a novel advancement in AI code generation evaluation by enabling real-time execution of AI-generated code, allowing users to compare models and vote on their performance based on actual outputs. This platform addresses the limitations of traditional benchmarks by providing a human-in-the-loop system where code can be run in isolated environments, supporting multiple languages and frameworks. The platform has shown significant performance improvements, with models like o3-mini and o1-mini consistently ranking at the top across various languages and execution environments. Practical applications span web design, game development, scientific computing, and more, highlighting the platform's versatility and industry impact. BigCodeArena's community-driven approach has led to over 14,000 conversations and 4,700 preference votes, offering valuable insights into model performance across diverse coding scenarios. The introduction of benchmarks like BigCodeReward and AutoCodeArena further enhances the evaluation process, with execution feedback proving crucial for accurate code quality assessment. Open-source contributions and continuous updates aim to expand language support and improve automated code quality assessment, positioning BigCodeArena as a leading tool for evaluating AI-assisted programming.\"}, {'summary': \"In 2025, the feasibility of running competitive OCR models on-device has significantly improved, exemplified by RedNote's dots.ocr, a 3 billion parameter model that outperforms Gemini 2.5 Pro in the OmniDocBench. This advancement allows for OCR applications without network dependency, API costs, or data privacy concerns. Apple's Neural Engine, a custom AI accelerator, enhances on-device performance by being 12x more power-efficient than CPUs and 4x more than GPUs. However, the Neural Engine is only accessible via Apple's closed-source Core ML framework, which poses challenges for developers converting models from PyTorch. To address this, Apple offers MLX, a flexible ML framework targeting GPUs, which can be used alongside Core ML. The article outlines a conversion process for dots.ocr using CoreML and MLX, demonstrating the potential for broader application in on-device model deployment. Despite successful conversion, the model's initial size of over 5GB highlights the need for further optimization for practical deployment. 
Future parts of the series will explore integration and optimization strategies, including quantization and dynamic shapes, to enhance performance on the Neural Engine.\"}, {'summary': \"OpenAI's GPT-OSS series introduces novel advancements such as MXFP4 quantization, efficient kernels, and a new chat format, significantly enhancing the transformers library's efficiency in loading, running, and fine-tuning models. Performance improvements are evident with PyTorch 2.0's torch.compile, which optimizes kernels for 2–10× gains, and the use of custom kernels like Flash Attention 3, which minimizes memory transfers and speeds up operations. Practical applications include the ability to load larger models faster, with GPT-OSS 20B fitting in 16 GB of VRAM using MXFP4, enabling single GPU deployment. Industry impact is seen in the integration of Tensor and Expert Parallelism, which distribute workloads across GPUs for improved throughput and memory efficiency. The introduction of Dynamic Sliding Window Layer & Cache reduces memory usage for models with sliding or hybrid attention, enhancing speed and latency for long prompts. Continuous Batching further optimizes GPU utilization by dynamically scheduling requests, outperforming static batching. These advancements, driven by community contributions, position transformers as a reference framework for future model implementations and integrations.\"}], 'message': 'Summaries already fit in context (942 tokens). No collapse needed.'}\u001b[0m" + ] + }, + { + "name": "stderr", + "output_type": "stream", + "text": [ + "INFO: HTTP Request: POST https://api.openai.com/v1/chat/completions \"HTTP/1.1 200 OK\"\n" + ] + }, + { + "name": "stdout", + "output_type": "stream", + "text": [ + "\u001b[32;1m\u001b[1;3m\n", + "Invoking: `get_document_text` with `blog_AdamF92_reactive-transformer-intro_20251009_165200.html`\n", + "\n", + "\n", + "\u001b[0m\u001b[33;1m\u001b[1;3m{'status': 'success', 'filename': 'blog_AdamF92_reactive-transformer-intro_20251009_165200.html', 'pages': 4, 'text': 'Back to Articles\\n\\nReactive Transformer (RxT): Fixing the Memory Problem in Conversational AI\\n\\nCommunity Article Published October 8, 2025\\n\\nUpvote\\n\\nAdam Filipek\\n\\nAdamF92\\n\\nLarge Language Models (LLMs) have transformed the landscape of AI, but when it comes to natural, long-form conversation, they have a fundamental weakness: they are stateless. To maintain context, models like those in the GPT series must re-process the entire conversation history with every single turn. This \"brute-force\" approach is not only inefficient but also makes interactions prohibitively expensive and slow as dialogues grow longer. The computational cost scales quadratically (O(L2)) with the length of the conversation, a bottleneck that larger context windows don\\'t solve but merely postpone.\\n\\nToday, we\\'re introducing the Reactive Transformer (RxT), a novel architecture detailed in our paper, \"Reactive Transformer (RxT) - Stateful Real-Time Processing for Event-Driven Reactive Language Models\". RxT shifts the paradigm from data-driven, stateless processing to event-driven, stateful computation. It is designed from the ground up to enable real-time, coherent, and economically viable long-form conversations.\\n\\nThe Core Problem: Why Stateless LLMs Struggle with Dialogue\\n\\nImagine having to reread an entire book from the beginning every time you wanted to start a new page. This is essentially how today\\'s LLMs handle conversations. 
Because they have no inherent memory, context is managed by concatenating the entire dialogue history and feeding it back into the model with each new user query.\\n\\nThis leads to two critical issues:\\n\\nExploding Computational Costs: The total number of tokens processed over a conversation with N turns scales quadratically (O(N2⋅T)). This makes long-running dialogues incredibly expensive, a problem familiar to anyone using LLM APIs for conversational agents.\\n\\nIncreasing Latency: The time it takes to process the initial prompt grows with every turn. This means the model gets slower and less responsive the longer you talk to it, hindering the user experience in real-time applications.\\n\\nWhile architectures like State Space Models (Mamba) or Retrieval-Augmented Generation (RAG) have addressed parts of this problem, they don\\'t solve the core issue for dialogue. SSMs still typically process the full history, and RAG treats memory as an external tool, not an integrated part of the model\\'s reasoning process.\\n\\nA Paradigm Shift: Event-Driven and Asynchronous\\n\\nThe Reactive Transformer (RxT) redefines the entire process by treating each conversational turn as a discrete event. Instead of processing a monolithic history, RxT operates in a continuous, cyclical workflow with a fixed-size internal Short-Term Memory (STM).\\n\\nThe key innovation is its asynchronous operational cycle, which separates response generation from memory consolidation:\\n\\n⚡️ Real-Time Response Generation (Synchronous Phase): When a user sends a query, the Generator-Decoder immediately produces a response. It does this by referencing the user\\'s query and the previous memory state (STMt−1\\u200b). This entire process is lightweight and fast, ensuring minimal user-perceived latency.\\n\\n🧠 Memory Update (Asynchronous Phase): After the response has been sent to the user, the Memory Encoder and Memory Attention network work in the background. They process the complete interaction (both the user\\'s query and the model\\'s answer) and update the memory state to STMt\\u200b.\\n\\nThis decoupling is crucial. The computationally intensive task of consolidating new information into memory happens after the user has already received their response, meaning it adds zero latency to the interaction.\\n\\nThis design provides two transformative benefits:\\n\\nLinear Cost Scaling: The total user-facing cost of a conversation scales linearly (O(N⋅T)) with the number of turns, making long dialogues computationally feasible.\\n\\nConstant, Low Latency: Since response generation depends only on the current query and a fixed-size memory, the inference time remains constant, no matter how long the conversation has been going on.\\n\\nUnder the Hood: The RxT Architecture\\n\\nRxT is an encoder-decoder model, but its components serve unique, specialized roles within its event-driven cycle.\\n\\nGenerator-Decoder: This is the user-facing component responsible for autoregressive text generation. Crucially, each layer includes a Memory Cross-Attention sub-layer, allowing it to query the STM for relevant context from past interactions. To maintain efficiency, it uses Mixture-of-Experts (MoE) layers.\\n\\nMemory Encoder: Its sole purpose is to create a condensed, rich semantic representation of the just-completed interaction (query + answer). This \"Encoded Data\" is then passed to the memory system.\\n\\nAttention-Based Memory System (ABMS): This is the core of RxT\\'s statefulness. 
The STM is not a log of past tokens but a collection of fixed-size, learnable vectors (memory slots). The Memory Attention network updates these slots by using them as queries to \"seek out\" relevant information from the Encoded Data of the latest interaction. We\\'ve developed several variants, including Interlayer and Gated Self-Attention, to allow for more sophisticated memory consolidation .\\n\\nResidual Gates: To control how much old information is retained and how much new information is written, we use gated residual connections. This helps prevent \"catastrophic forgetting\" and ensures stable learning over many interactions.\\n\\nExperimental Results: Performance and Efficiency\\n\\nWe conducted a series of experiments to validate RxT\\'s architecture, training several models of increasing scale and comparing them against a baseline stateless decoder-only LLM of a comparable size. All models were trained on datasets derived from TinyStories.\\n\\nSuperior Conversational Performance\\n\\nOur results show that architectural specialization pays off. Even our smallest model, RxT-Alpha Nano (12M parameters), significantly outperformed a larger 22M parameter stateless LLM baseline on multi-turn dialogue tasks.\\n\\nPerplexity: The 12M RxT model achieved a perplexity of 2.74, far better than the 22M LLM\\'s 4.37. Our largest model, RxT-Alpha Synthetic (160M), reached a PPL of 2.18.\\n\\nAccuracy: The RxT models consistently achieved ~80-82% next-token prediction accuracy, compared to just 55% for the stateless baseline.\\n\\nCoherence: Using a custom MRL Reward Score to measure conversational quality, all RxT models demonstrated a superior ability to maintain context and coherence over long dialogues compared to the baseline.\\n\\nThese results confirm that a specialized, memory-augmented architecture is far more effective and parameter-efficient for conversational tasks than a generic, monolithic one.\\n\\nConstant Low Latency\\n\\nThe latency benchmark highlights RxT\\'s primary advantage for real-time applications. We measured the prompt processing time over an 8-step dialogue.\\n\\nThe stateless LLM\\'s latency grew steadily with each turn, from 0.09s to over 0.22s, as its context window filled up.\\n\\nRxT\\'s latency remained nearly constant at ~0.06s across all steps, completely independent of the dialogue\\'s history.\\n\\nThis demonstrates RxT\\'s ability to deliver a snappy, responsive user experience that doesn\\'t degrade over time.\\n\\nConclusion and Future Work\\n\\nThe Reactive Transformer offers a new path forward for building truly interactive and scalable conversational AI. By moving from a stateless to a stateful, event-driven paradigm, RxT solves the critical bottlenecks of computational cost and latency that limit current LLMs.\\n\\nOur experiments provide strong proof-of-concept that this architectural specialization leads to superior performance and efficiency. The work presented here, focusing on the architecture and supervised training, is the first step. 
Our upcoming papers will detail the advanced multi-stage training curriculum, including novel Reinforcement Learning stages designed to further enhance the memory system\\'s capabilities.\\n\\nWe believe that building models with integrated, persistent memory systems—including future work on Long-Term Memory (LTM)—is essential for moving beyond simple language modeling and toward creating more capable, aware, and genuinely interactive AI agents.\\n\\nRxT-Beta - moving to real-world data and bigger scale\\n\\nAfter introducing synthetic Proof-of-Concept RxT-Alpha models, described in research paper, we are moving to bigger scale, real-world data and MVP RxT-Beta models. As a MVP, models will still be english-only, but they should be competitive for small stateless models for english-based benchmarks. RxT-Beta will be released in multiple variants:\\n\\nRxT-Beta-Micro (270M params) - already in training and should be released this month\\n\\nRxT-Beta-Micro-Reasoning (270M params)\\n\\nRxT-Beta-Mini (1B params) with hybrid reasoning\\n\\nRxT-Beta (4B params)\\n\\nPlease follow me and Reactive AI for more updates.\\n\\nFor a deeper dive into the architecture, training methodology, and results, please read the full research paper: \"Reactive Transformer (RxT) - Stateful Real-Time Processing for Event-Driven Reactive Language Models\".\\n\\nThe Reactive Transformer architecture is patent-pending (#P.453260). Commercial usage is regulated by the Reactive AI Models & Architecture License. For more details, visit our GitHub: https://github.com/RxAI-dev/rxlm.\\n\\nCommunity\\n\\nSign up or log in to comment\\n\\nUpvote\\n\\nBack to Articles\\n\\nReactive Transformer (RxT): Fixing the Memory Problem in Conversational AI\\n\\nCommunity Article Published October 8, 2025\\n\\nUpvote\\n\\nAdam Filipek\\n\\nAdamF92\\n\\nLarge Language Models (LLMs) have transformed the landscape of AI, but when it comes to natural, long-form conversation, they have a fundamental weakness: they are stateless. To maintain context, models like those in the GPT series must re-process the entire conversation history with every single turn. This \"brute-force\" approach is not only inefficient but also makes interactions prohibitively expensive and slow as dialogues grow longer. The computational cost scales quadratically (O(L2)) with the length of the conversation, a bottleneck that larger context windows don\\'t solve but merely postpone.\\n\\nToday, we\\'re introducing the Reactive Transformer (RxT), a novel architecture detailed in our paper, \"Reactive Transformer (RxT) - Stateful Real-Time Processing for Event-Driven Reactive Language Models\". RxT shifts the paradigm from data-driven, stateless processing to event-driven, stateful computation. It is designed from the ground up to enable real-time, coherent, and economically viable long-form conversations.\\n\\nThe Core Problem: Why Stateless LLMs Struggle with Dialogue\\n\\nImagine having to reread an entire book from the beginning every time you wanted to start a new page. This is essentially how today\\'s LLMs handle conversations. Because they have no inherent memory, context is managed by concatenating the entire dialogue history and feeding it back into the model with each new user query.\\n\\nThis leads to two critical issues:\\n\\nExploding Computational Costs: The total number of tokens processed over a conversation with N turns scales quadratically (O(N2⋅T)). 
This makes long-running dialogues incredibly expensive, a problem familiar to anyone using LLM APIs for conversational agents.\\n\\nIncreasing Latency: The time it takes to process the initial prompt grows with every turn. This means the model gets slower and less responsive the longer you talk to it, hindering the user experience in real-time applications.\\n\\nWhile architectures like State Space Models (Mamba) or Retrieval-Augmented Generation (RAG) have addressed parts of this problem, they don\\'t solve the core issue for dialogue. SSMs still typically process the full history, and RAG treats memory as an external tool, not an integrated part of the model\\'s reasoning process.\\n\\nA Paradigm Shift: Event-Driven and Asynchronous\\n\\nThe Reactive Transformer (RxT) redefines the entire process by treating each conversational turn as a discrete event. Instead of processing a monolithic history, RxT operates in a continuous, cyclical workflow with a fixed-size internal Short-Term Memory (STM).\\n\\nThe key innovation is its asynchronous operational cycle, which separates response generation from memory consolidation:\\n\\n⚡️ Real-Time Response Generation (Synchronous Phase): When a user sends a query, the Generator-Decoder immediately produces a response. It does this by referencing the user\\'s query and the previous memory state (STMt−1\\u200b). This entire process is lightweight and fast, ensuring minimal user-perceived latency.\\n\\n🧠 Memory Update (Asynchronous Phase): After the response has been sent to the user, the Memory Encoder and Memory Attention network work in the background. They process the complete interaction (both the user\\'s query and the model\\'s answer) and update the memory state to STMt\\u200b.\\n\\nThis decoupling is crucial. The computationally intensive task of consolidating new information into memory happens after the user has already received their response, meaning it adds zero latency to the interaction.\\n\\nThis design provides two transformative benefits:\\n\\nLinear Cost Scaling: The total user-facing cost of a conversation scales linearly (O(N⋅T)) with the number of turns, making long dialogues computationally feasible.\\n\\nConstant, Low Latency: Since response generation depends only on the current query and a fixed-size memory, the inference time remains constant, no matter how long the conversation has been going on.\\n\\nUnder the Hood: The RxT Architecture\\n\\nRxT is an encoder-decoder model, but its components serve unique, specialized roles within its event-driven cycle.\\n\\nGenerator-Decoder: This is the user-facing component responsible for autoregressive text generation. Crucially, each layer includes a Memory Cross-Attention sub-layer, allowing it to query the STM for relevant context from past interactions. To maintain efficiency, it uses Mixture-of-Experts (MoE) layers.\\n\\nMemory Encoder: Its sole purpose is to create a condensed, rich semantic representation of the just-completed interaction (query + answer). This \"Encoded Data\" is then passed to the memory system.\\n\\nAttention-Based Memory System (ABMS): This is the core of RxT\\'s statefulness. The STM is not a log of past tokens but a collection of fixed-size, learnable vectors (memory slots). The Memory Attention network updates these slots by using them as queries to \"seek out\" relevant information from the Encoded Data of the latest interaction. 
We\\'ve developed several variants, including Interlayer and Gated Self-Attention, to allow for more sophisticated memory consolidation .\\n\\nResidual Gates: To control how much old information is retained and how much new information is written, we use gated residual connections. This helps prevent \"catastrophic forgetting\" and ensures stable learning over many interactions.\\n\\nExperimental Results: Performance and Efficiency\\n\\nWe conducted a series of experiments to validate RxT\\'s architecture, training several models of increasing scale and comparing them against a baseline stateless decoder-only LLM of a comparable size. All models were trained on datasets derived from TinyStories.\\n\\nSuperior Conversational Performance\\n\\nOur results show that architectural specialization pays off. Even our smallest model, RxT-Alpha Nano (12M parameters), significantly outperformed a larger 22M parameter stateless LLM baseline on multi-turn dialogue tasks.\\n\\nPerplexity: The 12M RxT model achieved a perplexity of 2.74, far better than the 22M LLM\\'s 4.37. Our largest model, RxT-Alpha Synthetic (160M), reached a PPL of 2.18.\\n\\nAccuracy: The RxT models consistently achieved ~80-82% next-token prediction accuracy, compared to just 55% for the stateless baseline.\\n\\nCoherence: Using a custom MRL Reward Score to measure conversational quality, all RxT models demonstrated a superior ability to maintain context and coherence over long dialogues compared to the baseline.\\n\\nThese results confirm that a specialized, memory-augmented architecture is far more effective and parameter-efficient for conversational tasks than a generic, monolithic one.\\n\\nConstant Low Latency\\n\\nThe latency benchmark highlights RxT\\'s primary advantage for real-time applications. We measured the prompt processing time over an 8-step dialogue.\\n\\nThe stateless LLM\\'s latency grew steadily with each turn, from 0.09s to over 0.22s, as its context window filled up.\\n\\nRxT\\'s latency remained nearly constant at ~0.06s across all steps, completely independent of the dialogue\\'s history.\\n\\nThis demonstrates RxT\\'s ability to deliver a snappy, responsive user experience that doesn\\'t degrade over time.\\n\\nConclusion and Future Work\\n\\nThe Reactive Transformer offers a new path forward for building truly interactive and scalable conversational AI. By moving from a stateless to a stateful, event-driven paradigm, RxT solves the critical bottlenecks of computational cost and latency that limit current LLMs.\\n\\nOur experiments provide strong proof-of-concept that this architectural specialization leads to superior performance and efficiency. The work presented here, focusing on the architecture and supervised training, is the first step. Our upcoming papers will detail the advanced multi-stage training curriculum, including novel Reinforcement Learning stages designed to further enhance the memory system\\'s capabilities.\\n\\nWe believe that building models with integrated, persistent memory systems—including future work on Long-Term Memory (LTM)—is essential for moving beyond simple language modeling and toward creating more capable, aware, and genuinely interactive AI agents.\\n\\nRxT-Beta - moving to real-world data and bigger scale\\n\\nAfter introducing synthetic Proof-of-Concept RxT-Alpha models, described in research paper, we are moving to bigger scale, real-world data and MVP RxT-Beta models. 
As a MVP, models will still be english-only, but they should be competitive for small stateless models for english-based benchmarks. RxT-Beta will be released in multiple variants:\\n\\nRxT-Beta-Micro (270M params) - already in training and should be released this month\\n\\nRxT-Beta-Micro-Reasoning (270M params)\\n\\nRxT-Beta-Mini (1B params) with hybrid reasoning\\n\\nRxT-Beta (4B params)\\n\\nPlease follow me and Reactive AI for more updates.\\n\\nFor a deeper dive into the architecture, training methodology, and results, please read the full research paper: \"Reactive Transformer (RxT) - Stateful Real-Time Processing for Event-Driven Reactive Language Models\".\\n\\nThe Reactive Transformer architecture is patent-pending (#P.453260). Commercial usage is regulated by the Reactive AI Models & Architecture License. For more details, visit our GitHub: https://github.com/RxAI-dev/rxlm.\\n\\nCommunity\\n\\nSign up or log in to comment\\n\\nUpvote', 'text_length': 18726, 'message': 'Retrieved 4 pages for blog_AdamF92_reactive-transformer-intro_20251009_165200.html'}\u001b[0m" + ] + }, + { + "name": "stderr", + "output_type": "stream", + "text": [ + "INFO: HTTP Request: POST https://api.openai.com/v1/chat/completions \"HTTP/1.1 200 OK\"\n" + ] + }, + { + "name": "stdout", + "output_type": "stream", + "text": [ + "\u001b[32;1m\u001b[1;3m\n", + "Invoking: `generate_summary` with `Back to Articles\n", + "\n", + "Reactive Transformer (RxT): Fixing the Memory Problem in Conversational AI\n", + "\n", + "Community Article Published October 8, 2025\n", + "\n", + "Upvote\n", + "\n", + "Adam Filipek\n", + "\n", + "AdamF92\n", + "\n", + "Large Language Models (LLMs) have transformed the landscape of AI, but when it comes to natural, long-form conversation, they have a fundamental weakness: they are stateless. To maintain context, models like those in the GPT series must re-process the entire conversation history with every single turn. This \"brute-force\" approach is not only inefficient but also makes interactions prohibitively expensive and slow as dialogues grow longer. The computational cost scales quadratically (O(L2)) with the length of the conversation, a bottleneck that larger context windows don't solve but merely postpone.\n", + "\n", + "Today, we're introducing the Reactive Transformer (RxT), a novel architecture detailed in our paper, \"Reactive Transformer (RxT) - Stateful Real-Time Processing for Event-Driven Reactive Language Models\". RxT shifts the paradigm from data-driven, stateless processing to event-driven, stateful computation. It is designed from the ground up to enable real-time, coherent, and economically viable long-form conversations.\n", + "\n", + "The Core Problem: Why Stateless LLMs Struggle with Dialogue\n", + "\n", + "Imagine having to reread an entire book from the beginning every time you wanted to start a new page. This is essentially how today's LLMs handle conversations. Because they have no inherent memory, context is managed by concatenating the entire dialogue history and feeding it back into the model with each new user query.\n", + "\n", + "This leads to two critical issues:\n", + "\n", + "Exploding Computational Costs: The total number of tokens processed over a conversation with N turns scales quadratically (O(N2⋅T)). This makes long-running dialogues incredibly expensive, a problem familiar to anyone using LLM APIs for conversational agents.\n", + "\n", + "Increasing Latency: The time it takes to process the initial prompt grows with every turn. 
This means the model gets slower and less responsive the longer you talk to it, hindering the user experience in real-time applications.\n", + "\n", + "While architectures like State Space Models (Mamba) or Retrieval-Augmented Generation (RAG) have addressed parts of this problem, they don't solve the core issue for dialogue. SSMs still typically process the full history, and RAG treats memory as an external tool, not an integrated part of the model's reasoning process.\n", + "\n", + "A Paradigm Shift: Event-Driven and Asynchronous\n", + "\n", + "The Reactive Transformer (RxT) redefines the entire process by treating each conversational turn as a discrete event. Instead of processing a monolithic history, RxT operates in a continuous, cyclical workflow with a fixed-size internal Short-Term Memory (STM).\n", + "\n", + "The key innovation is its asynchronous operational cycle, which separates response generation from memory consolidation:\n", + "\n", + "⚡️ Real-Time Response Generation (Synchronous Phase): When a user sends a query, the Generator-Decoder immediately produces a response. It does this by referencing the user's query and the previous memory state (STMt−1​). This entire process is lightweight and fast, ensuring minimal user-perceived latency.\n", + "\n", + "🧠 Memory Update (Asynchronous Phase): After the response has been sent to the user, the Memory Encoder and Memory Attention network work in the background. They process the complete interaction (both the user's query and the model's answer) and update the memory state to STMt​.\n", + "\n", + "This decoupling is crucial. The computationally intensive task of consolidating new information into memory happens after the user has already received their response, meaning it adds zero latency to the interaction.\n", + "\n", + "This design provides two transformative benefits:\n", + "\n", + "Linear Cost Scaling: The total user-facing cost of a conversation scales linearly (O(N⋅T)) with the number of turns, making long dialogues computationally feasible.\n", + "\n", + "Constant, Low Latency: Since response generation depends only on the current query and a fixed-size memory, the inference time remains constant, no matter how long the conversation has been going on.\n", + "\n", + "Under the Hood: The RxT Architecture\n", + "\n", + "RxT is an encoder-decoder model, but its components serve unique, specialized roles within its event-driven cycle.\n", + "\n", + "Generator-Decoder: This is the user-facing component responsible for autoregressive text generation. Crucially, each layer includes a Memory Cross-Attention sub-layer, allowing it to query the STM for relevant context from past interactions. To maintain efficiency, it uses Mixture-of-Experts (MoE) layers.\n", + "\n", + "Memory Encoder: Its sole purpose is to create a condensed, rich semantic representation of the just-completed interaction (query + answer). This \"Encoded Data\" is then passed to the memory system.\n", + "\n", + "Attention-Based Memory System (ABMS): This is the core of RxT's statefulness. The STM is not a log of past tokens but a collection of fixed-size, learnable vectors (memory slots). The Memory Attention network updates these slots by using them as queries to \"seek out\" relevant information from the Encoded Data of the latest interaction. 
We've developed several variants, including Interlayer and Gated Self-Attention, to allow for more sophisticated memory consolidation .\n", + "\n", + "Residual Gates: To control how much old information is retained and how much new information is written, we use gated residual connections. This helps prevent \"catastrophic forgetting\" and ensures stable learning over many interactions.\n", + "\n", + "Experimental Results: Performance and Efficiency\n", + "\n", + "We conducted a series of experiments to validate RxT's architecture, training several models of increasing scale and comparing them against a baseline stateless decoder-only LLM of a comparable size. All models were trained on datasets derived from TinyStories.\n", + "\n", + "Superior Conversational Performance\n", + "\n", + "Our results show that architectural specialization pays off. Even our smallest model, RxT-Alpha Nano (12M parameters), significantly outperformed a larger 22M parameter stateless LLM baseline on multi-turn dialogue tasks.\n", + "\n", + "Perplexity: The 12M RxT model achieved a perplexity of 2.74, far better than the 22M LLM's 4.37. Our largest model, RxT-Alpha Synthetic (160M), reached a PPL of 2.18.\n", + "\n", + "Accuracy: The RxT models consistently achieved ~80-82% next-token prediction accuracy, compared to just 55% for the stateless baseline.\n", + "\n", + "Coherence: Using a custom MRL Reward Score to measure conversational quality, all RxT models demonstrated a superior ability to maintain context and coherence over long dialogues compared to the baseline.\n", + "\n", + "These results confirm that a specialized, memory-augmented architecture is far more effective and parameter-efficient for conversational tasks than a generic, monolithic one.\n", + "\n", + "Constant Low Latency\n", + "\n", + "The latency benchmark highlights RxT's primary advantage for real-time applications. We measured the prompt processing time over an 8-step dialogue.\n", + "\n", + "The stateless LLM's latency grew steadily with each turn, from 0.09s to over 0.22s, as its context window filled up.\n", + "\n", + "RxT's latency remained nearly constant at ~0.06s across all steps, completely independent of the dialogue's history.\n", + "\n", + "This demonstrates RxT's ability to deliver a snappy, responsive user experience that doesn't degrade over time.\n", + "\n", + "Conclusion and Future Work\n", + "\n", + "The Reactive Transformer offers a new path forward for building truly interactive and scalable conversational AI. By moving from a stateless to a stateful, event-driven paradigm, RxT solves the critical bottlenecks of computational cost and latency that limit current LLMs.\n", + "\n", + "Our experiments provide strong proof-of-concept that this architectural specialization leads to superior performance and efficiency. The work presented here, focusing on the architecture and supervised training, is the first step. 
Our upcoming papers will detail the advanced multi-stage training curriculum, including novel Reinforcement Learning stages designed to further enhance the memory system's capabilities.\n", + "\n", + "We believe that building models with integrated, persistent memory systems—including future work on Long-Term Memory (LTM)—is essential for moving beyond simple language modeling and toward creating more capable, aware, and genuinely interactive AI agents.\n", + "\n", + "RxT-Beta - moving to real-world data and bigger scale\n", + "\n", + "After introducing synthetic Proof-of-Concept RxT-Alpha models, described in research paper, we are moving to bigger scale, real-world data and MVP RxT-Beta models. As a MVP, models will still be english-only, but they should be competitive for small stateless models for english-based benchmarks. RxT-Beta will be released in multiple variants:\n", + "\n", + "RxT-Beta-Micro (270M params) - already in training and should be released this month\n", + "\n", + "RxT-Beta-Micro-Reasoning (270M params)\n", + "\n", + "RxT-Beta-Mini (1B params) with hybrid reasoning\n", + "\n", + "RxT-Beta (4B params)\n", + "\n", + "Please follow me and Reactive AI for more updates.\n", + "\n", + "For a deeper dive into the architecture, training methodology, and results, please read the full research paper: \"Reactive Transformer (RxT) - Stateful Real-Time Processing for Event-Driven Reactive Language Models\".\n", + "\n", + "The Reactive Transformer architecture is patent-pending (#P.453260). Commercial usage is regulated by the Reactive AI Models & Architecture License. For more details, visit our GitHub: https://github.com/RxAI-dev/rxlm.\n", + "\n", + "Community\n", + "\n", + "Sign up or log in to comment\n", + "\n", + "Upvote`\n", + "\n", + "\n", + "\u001b[0m" + ] + }, + { + "name": "stderr", + "output_type": "stream", + "text": [ + "INFO: HTTP Request: POST https://api.openai.com/v1/chat/completions \"HTTP/1.1 200 OK\"\n" + ] + }, + { + "name": "stdout", + "output_type": "stream", + "text": [ + "\u001b[33;1m\u001b[1;3m{'status': 'success', 'summary': \"The Reactive Transformer (RxT) introduces a novel architecture designed to address the inefficiencies of stateless large language models (LLMs) in conversational AI. Unlike traditional models that reprocess entire conversation histories, RxT employs an event-driven, stateful approach with a fixed-size Short-Term Memory (STM) to enable real-time, coherent, and cost-effective long-form dialogues. This shift results in linear cost scaling and constant low latency, significantly enhancing performance and user experience. Experimental results demonstrate that even the smallest RxT model outperforms larger stateless LLMs in multi-turn dialogue tasks, achieving superior perplexity, accuracy, and coherence. The architecture's asynchronous cycle separates response generation from memory updates, ensuring minimal latency. RxT's specialized memory-augmented design proves more effective and parameter-efficient for conversational tasks. Future developments include scaling to real-world data with RxT-Beta models and exploring integrated Long-Term Memory systems, aiming to create more interactive AI agents. 
RxT's advancements promise substantial industry impact by overcoming the computational and latency challenges of current conversational AI systems.\", 'length': 1257, 'message': 'Generated summary (1257 characters)'}\u001b[0m" + ] + }, + { + "name": "stderr", + "output_type": "stream", + "text": [ + "INFO: HTTP Request: POST https://api.openai.com/v1/chat/completions \"HTTP/1.1 200 OK\"\n" + ] + }, + { + "name": "stdout", + "output_type": "stream", + "text": [ + "\u001b[32;1m\u001b[1;3m\n", + "Invoking: `collapse_summaries` with `[{\"summary\":\"BigCodeArena introduces a novel advancement in AI code generation evaluation by enabling real-time execution of AI-generated code, allowing users to compare models and vote on their performance based on actual outputs. This platform addresses the limitations of traditional benchmarks by providing a human-in-the-loop system where code can be run in isolated environments, supporting multiple languages and frameworks. The platform has shown significant performance improvements, with models like o3-mini and o1-mini consistently ranking at the top across various languages and execution environments. Practical applications span web design, game development, scientific computing, and more, highlighting the platform's versatility and industry impact. BigCodeArena's community-driven approach has led to over 14,000 conversations and 4,700 preference votes, offering valuable insights into model performance across diverse coding scenarios. The introduction of benchmarks like BigCodeReward and AutoCodeArena further enhances the evaluation process, with execution feedback proving crucial for accurate code quality assessment. Open-source contributions and continuous updates aim to expand language support and improve automated code quality assessment, positioning BigCodeArena as a leading tool for evaluating AI-assisted programming.\"},{\"summary\":\"In 2025, the feasibility of running competitive OCR models on-device has significantly improved, exemplified by RedNote's dots.ocr, a 3 billion parameter model that outperforms Gemini 2.5 Pro in the OmniDocBench. This advancement allows for OCR applications without network dependency, API costs, or data privacy concerns. Apple's Neural Engine, a custom AI accelerator, enhances on-device performance by being 12x more power-efficient than CPUs and 4x more than GPUs. However, the Neural Engine is only accessible via Apple's closed-source Core ML framework, which poses challenges for developers converting models from PyTorch. To address this, Apple offers MLX, a flexible ML framework targeting GPUs, which can be used alongside Core ML. The article outlines a conversion process for dots.ocr using CoreML and MLX, demonstrating the potential for broader application in on-device model deployment. Despite successful conversion, the model's initial size of over 5GB highlights the need for further optimization for practical deployment. Future parts of the series will explore integration and optimization strategies, including quantization and dynamic shapes, to enhance performance on the Neural Engine.\"},{\"summary\":\"OpenAI's GPT-OSS series introduces novel advancements such as MXFP4 quantization, efficient kernels, and a new chat format, significantly enhancing the transformers library's efficiency in loading, running, and fine-tuning models. 
Performance improvements are evident with PyTorch 2.0's torch.compile, which optimizes kernels for 2–10× gains, and the use of custom kernels like Flash Attention 3, which minimizes memory transfers and speeds up operations. Practical applications include the ability to load larger models faster, with GPT-OSS 20B fitting in 16 GB of VRAM using MXFP4, enabling single GPU deployment. Industry impact is seen in the integration of Tensor and Expert Parallelism, which distribute workloads across GPUs for improved throughput and memory efficiency. The introduction of Dynamic Sliding Window Layer & Cache reduces memory usage for models with sliding or hybrid attention, enhancing speed and latency for long prompts. Continuous Batching further optimizes GPU utilization by dynamically scheduling requests, outperforming static batching. These advancements, driven by community contributions, position transformers as a reference framework for future model implementations and integrations.\"},{\"summary\":\"The Reactive Transformer (RxT) introduces a novel architecture designed to address the inefficiencies of stateless large language models (LLMs) in conversational AI. Unlike traditional models that reprocess entire conversation histories, RxT employs an event-driven, stateful approach with a fixed-size Short-Term Memory (STM) to enable real-time, coherent, and cost-effective long-form dialogues. This shift results in linear cost scaling and constant low latency, significantly enhancing performance and user experience. Experimental results demonstrate that even the smallest RxT model outperforms larger stateless LLMs in multi-turn dialogue tasks, achieving superior perplexity, accuracy, and coherence. The architecture's asynchronous cycle separates response generation from memory updates, ensuring minimal latency. RxT's specialized memory-augmented design proves more effective and parameter-efficient for conversational tasks. Future developments include scaling to real-world data with RxT-Beta models and exploring integrated Long-Term Memory systems, aiming to create more interactive AI agents. RxT's advancements promise substantial industry impact by overcoming the computational and latency challenges of current conversational AI systems.\"}]`\n", + "\n", + "\n", + "\u001b[0m\u001b[38;5;200m\u001b[1;3m{'status': 'success', 'collapsed_summaries': [{'summary': \"BigCodeArena introduces a novel advancement in AI code generation evaluation by enabling real-time execution of AI-generated code, allowing users to compare models and vote on their performance based on actual outputs. This platform addresses the limitations of traditional benchmarks by providing a human-in-the-loop system where code can be run in isolated environments, supporting multiple languages and frameworks. The platform has shown significant performance improvements, with models like o3-mini and o1-mini consistently ranking at the top across various languages and execution environments. Practical applications span web design, game development, scientific computing, and more, highlighting the platform's versatility and industry impact. BigCodeArena's community-driven approach has led to over 14,000 conversations and 4,700 preference votes, offering valuable insights into model performance across diverse coding scenarios. The introduction of benchmarks like BigCodeReward and AutoCodeArena further enhances the evaluation process, with execution feedback proving crucial for accurate code quality assessment. 
Open-source contributions and continuous updates aim to expand language support and improve automated code quality assessment, positioning BigCodeArena as a leading tool for evaluating AI-assisted programming.\"}, {'summary': \"In 2025, the feasibility of running competitive OCR models on-device has significantly improved, exemplified by RedNote's dots.ocr, a 3 billion parameter model that outperforms Gemini 2.5 Pro in the OmniDocBench. This advancement allows for OCR applications without network dependency, API costs, or data privacy concerns. Apple's Neural Engine, a custom AI accelerator, enhances on-device performance by being 12x more power-efficient than CPUs and 4x more than GPUs. However, the Neural Engine is only accessible via Apple's closed-source Core ML framework, which poses challenges for developers converting models from PyTorch. To address this, Apple offers MLX, a flexible ML framework targeting GPUs, which can be used alongside Core ML. The article outlines a conversion process for dots.ocr using CoreML and MLX, demonstrating the potential for broader application in on-device model deployment. Despite successful conversion, the model's initial size of over 5GB highlights the need for further optimization for practical deployment. Future parts of the series will explore integration and optimization strategies, including quantization and dynamic shapes, to enhance performance on the Neural Engine.\"}, {'summary': \"OpenAI's GPT-OSS series introduces novel advancements such as MXFP4 quantization, efficient kernels, and a new chat format, significantly enhancing the transformers library's efficiency in loading, running, and fine-tuning models. Performance improvements are evident with PyTorch 2.0's torch.compile, which optimizes kernels for 2–10× gains, and the use of custom kernels like Flash Attention 3, which minimizes memory transfers and speeds up operations. Practical applications include the ability to load larger models faster, with GPT-OSS 20B fitting in 16 GB of VRAM using MXFP4, enabling single GPU deployment. Industry impact is seen in the integration of Tensor and Expert Parallelism, which distribute workloads across GPUs for improved throughput and memory efficiency. The introduction of Dynamic Sliding Window Layer & Cache reduces memory usage for models with sliding or hybrid attention, enhancing speed and latency for long prompts. Continuous Batching further optimizes GPU utilization by dynamically scheduling requests, outperforming static batching. These advancements, driven by community contributions, position transformers as a reference framework for future model implementations and integrations.\"}, {'summary': \"The Reactive Transformer (RxT) introduces a novel architecture designed to address the inefficiencies of stateless large language models (LLMs) in conversational AI. Unlike traditional models that reprocess entire conversation histories, RxT employs an event-driven, stateful approach with a fixed-size Short-Term Memory (STM) to enable real-time, coherent, and cost-effective long-form dialogues. This shift results in linear cost scaling and constant low latency, significantly enhancing performance and user experience. Experimental results demonstrate that even the smallest RxT model outperforms larger stateless LLMs in multi-turn dialogue tasks, achieving superior perplexity, accuracy, and coherence. The architecture's asynchronous cycle separates response generation from memory updates, ensuring minimal latency. 
RxT's specialized memory-augmented design proves more effective and parameter-efficient for conversational tasks. Future developments include scaling to real-world data with RxT-Beta models and exploring integrated Long-Term Memory systems, aiming to create more interactive AI agents. RxT's advancements promise substantial industry impact by overcoming the computational and latency challenges of current conversational AI systems.\"}], 'message': 'Summaries already fit in context (1,257 tokens). No collapse needed.'}\u001b[0m" + ] + }, + { + "name": "stderr", + "output_type": "stream", + "text": [ + "INFO: HTTP Request: POST https://api.openai.com/v1/chat/completions \"HTTP/1.1 200 OK\"\n" + ] + }, + { + "name": "stdout", + "output_type": "stream", + "text": [ + "\u001b[32;1m\u001b[1;3m\n", + "Invoking: `generate_brief` with `[{\"summary\":\"BigCodeArena introduces a novel advancement in AI code generation evaluation by enabling real-time execution of AI-generated code, allowing users to compare models and vote on their performance based on actual outputs. This platform addresses the limitations of traditional benchmarks by providing a human-in-the-loop system where code can be run in isolated environments, supporting multiple languages and frameworks. The platform has shown significant performance improvements, with models like o3-mini and o1-mini consistently ranking at the top across various languages and execution environments. Practical applications span web design, game development, scientific computing, and more, highlighting the platform's versatility and industry impact. BigCodeArena's community-driven approach has led to over 14,000 conversations and 4,700 preference votes, offering valuable insights into model performance across diverse coding scenarios. The introduction of benchmarks like BigCodeReward and AutoCodeArena further enhances the evaluation process, with execution feedback proving crucial for accurate code quality assessment. Open-source contributions and continuous updates aim to expand language support and improve automated code quality assessment, positioning BigCodeArena as a leading tool for evaluating AI-assisted programming.\"},{\"summary\":\"In 2025, the feasibility of running competitive OCR models on-device has significantly improved, exemplified by RedNote's dots.ocr, a 3 billion parameter model that outperforms Gemini 2.5 Pro in the OmniDocBench. This advancement allows for OCR applications without network dependency, API costs, or data privacy concerns. Apple's Neural Engine, a custom AI accelerator, enhances on-device performance by being 12x more power-efficient than CPUs and 4x more than GPUs. However, the Neural Engine is only accessible via Apple's closed-source Core ML framework, which poses challenges for developers converting models from PyTorch. To address this, Apple offers MLX, a flexible ML framework targeting GPUs, which can be used alongside Core ML. The article outlines a conversion process for dots.ocr using CoreML and MLX, demonstrating the potential for broader application in on-device model deployment. Despite successful conversion, the model's initial size of over 5GB highlights the need for further optimization for practical deployment. 
Future parts of the series will explore integration and optimization strategies, including quantization and dynamic shapes, to enhance performance on the Neural Engine.\"},{\"summary\":\"OpenAI's GPT-OSS series introduces novel advancements such as MXFP4 quantization, efficient kernels, and a new chat format, significantly enhancing the transformers library's efficiency in loading, running, and fine-tuning models. Performance improvements are evident with PyTorch 2.0's torch.compile, which optimizes kernels for 2–10× gains, and the use of custom kernels like Flash Attention 3, which minimizes memory transfers and speeds up operations. Practical applications include the ability to load larger models faster, with GPT-OSS 20B fitting in 16 GB of VRAM using MXFP4, enabling single GPU deployment. Industry impact is seen in the integration of Tensor and Expert Parallelism, which distribute workloads across GPUs for improved throughput and memory efficiency. The introduction of Dynamic Sliding Window Layer & Cache reduces memory usage for models with sliding or hybrid attention, enhancing speed and latency for long prompts. Continuous Batching further optimizes GPU utilization by dynamically scheduling requests, outperforming static batching. These advancements, driven by community contributions, position transformers as a reference framework for future model implementations and integrations.\"},{\"summary\":\"The Reactive Transformer (RxT) introduces a novel architecture designed to address the inefficiencies of stateless large language models (LLMs) in conversational AI. Unlike traditional models that reprocess entire conversation histories, RxT employs an event-driven, stateful approach with a fixed-size Short-Term Memory (STM) to enable real-time, coherent, and cost-effective long-form dialogues. This shift results in linear cost scaling and constant low latency, significantly enhancing performance and user experience. Experimental results demonstrate that even the smallest RxT model outperforms larger stateless LLMs in multi-turn dialogue tasks, achieving superior perplexity, accuracy, and coherence. The architecture's asynchronous cycle separates response generation from memory updates, ensuring minimal latency. RxT's specialized memory-augmented design proves more effective and parameter-efficient for conversational tasks. Future developments include scaling to real-world data with RxT-Beta models and exploring integrated Long-Term Memory systems, aiming to create more interactive AI agents. RxT's advancements promise substantial industry impact by overcoming the computational and latency challenges of current conversational AI systems.\"}]`\n", + "\n", + "\n", + "\u001b[0m" + ] + }, + { + "name": "stderr", + "output_type": "stream", + "text": [ + "INFO: HTTP Request: POST https://api.openai.com/v1/chat/completions \"HTTP/1.1 200 OK\"\n" + ] + }, + { + "name": "stdout", + "output_type": "stream", + "text": [ + "\u001b[36;1m\u001b[1;3m{'status': 'success', 'brief': \"**Executive Summary: AI Industry Developments and Strategic Implications**\\n\\nAs we navigate the rapidly evolving landscape of artificial intelligence, several significant developments have emerged that are poised to reshape industries and redefine competitive strategies. This executive summary highlights the most impactful advancements, their practical applications, and key performance milestones, synthesizing overarching trends to inform strategic decision-making for C-suite executives.\\n\\n**1. 
Advancements in AI Code Generation and Evaluation**\\n\\nThe introduction of BigCodeArena marks a transformative step in AI-assisted programming by enabling real-time execution and evaluation of AI-generated code. This platform addresses the limitations of traditional benchmarks through a human-in-the-loop system, allowing users to compare models based on actual outputs across multiple languages and frameworks. Notably, models like o3-mini and o1-mini have consistently excelled, demonstrating significant performance improvements. The practical applications of this platform are vast, spanning web design, game development, and scientific computing. BigCodeArena's community-driven approach, with over 14,000 conversations and 4,700 preference votes, provides valuable insights into model performance, positioning it as a leading tool for evaluating AI-assisted programming. The introduction of benchmarks like BigCodeReward and AutoCodeArena further enhances the evaluation process, underscoring the platform's industry impact.\\n\\n**2. On-Device Optical Character Recognition (OCR) Innovations**\\n\\nThe feasibility of running competitive OCR models on-device has seen remarkable improvements, exemplified by RedNote's dots.ocr model. This advancement allows for OCR applications without network dependency, API costs, or data privacy concerns, leveraging Apple's Neural Engine for enhanced on-device performance. The Neural Engine's power efficiency—12x more than CPUs and 4x more than GPUs—highlights its potential for broader application in on-device model deployment. However, challenges remain, particularly in model conversion from PyTorch to Apple's closed-source Core ML framework. Apple's introduction of MLX, a flexible ML framework, offers a solution, demonstrating the potential for broader application in on-device model deployment. Future optimization strategies, including quantization and dynamic shapes, are anticipated to enhance performance further, aligning with industry trends towards more efficient, privacy-conscious AI solutions.\\n\\n**3. Transformative Developments in Transformer Models**\\n\\nOpenAI's GPT-OSS series introduces significant advancements in transformer model efficiency, notably through MXFP4 quantization and efficient kernels. These innovations enhance the transformers library's efficiency in loading, running, and fine-tuning models, with performance improvements evident through PyTorch 2.0's torch.compile. The ability to load larger models faster, with GPT-OSS 20B fitting in 16 GB of VRAM using MXFP4, enables single GPU deployment, demonstrating substantial industry impact. The integration of Tensor and Expert Parallelism, along with Dynamic Sliding Window Layer & Cache, optimizes memory usage and enhances speed for long prompts. These advancements, driven by community contributions, position transformers as a reference framework for future model implementations, underscoring the trend towards more efficient and scalable AI models.\\n\\n**4. Innovations in Conversational AI with Reactive Transformers**\\n\\nThe Reactive Transformer (RxT) introduces a novel architecture designed to address inefficiencies in stateless large language models (LLMs) for conversational AI. By employing an event-driven, stateful approach with a fixed-size Short-Term Memory (STM), RxT enables real-time, coherent, and cost-effective long-form dialogues. This shift results in linear cost scaling and constant low latency, significantly enhancing performance and user experience. 
Experimental results demonstrate that even the smallest RxT model outperforms larger stateless LLMs in multi-turn dialogue tasks. Future developments include scaling to real-world data with RxT-Beta models and exploring integrated Long-Term Memory systems, promising substantial industry impact by overcoming the computational and latency challenges of current conversational AI systems.\\n\\n**Synthesis of Trends**\\n\\nAcross these developments, several key trends emerge. There is a clear movement towards more efficient, scalable, and privacy-conscious AI solutions, as seen in the advancements in on-device OCR and transformer models. The focus on real-time performance and user experience is evident in both BigCodeArena's code evaluation platform and the Reactive Transformer's conversational AI architecture. Community-driven contributions and open-source collaborations are driving innovation, underscoring the importance of ecosystem engagement in AI development. As these trends continue to unfold, they will shape strategic priorities, offering new opportunities for competitive differentiation and operational efficiency.\\n\\nIn conclusion, these developments represent significant strides in AI technology, offering transformative potential across industries. For C-suite executives, understanding and leveraging these advancements will be crucial in navigating the future landscape of AI-driven innovation.\", 'word_count': 678, 'message': 'Generated executive brief (678 words)'}\u001b[0m" + ] + }, + { + "name": "stderr", + "output_type": "stream", + "text": [ + "INFO: HTTP Request: POST https://api.openai.com/v1/chat/completions \"HTTP/1.1 200 OK\"\n" + ] + }, + { + "name": "stdout", + "output_type": "stream", + "text": [ + "\u001b[32;1m\u001b[1;3m**AI Newsletter - This Week's Highlights**\n", + "\n", + "**1. Advancements in AI Code Generation and Evaluation**\n", + "BigCodeArena has revolutionized AI-assisted programming by enabling real-time execution and evaluation of AI-generated code. This platform allows users to compare models based on actual outputs across multiple languages and frameworks. Models like o3-mini and o1-mini have shown significant performance improvements, excelling in various applications such as web design, game development, and scientific computing. With over 14,000 conversations and 4,700 preference votes, BigCodeArena offers valuable insights into model performance, positioning itself as a leading tool for AI-assisted programming evaluation. The introduction of benchmarks like BigCodeReward and AutoCodeArena further enhances its evaluation process.\n", + "\n", + "**2. On-Device Optical Character Recognition (OCR) Innovations**\n", + "RedNote's dots.ocr model has improved the feasibility of running competitive OCR models on-device, eliminating the need for network dependency, API costs, or data privacy concerns. Leveraging Apple's Neural Engine, which is 12x more power-efficient than CPUs and 4x more than GPUs, this advancement demonstrates potential for broader application in on-device model deployment. Despite challenges in converting models from PyTorch to Apple's Core ML framework, Apple's MLX framework offers a solution, paving the way for future optimization strategies like quantization and dynamic shapes.\n", + "\n", + "**3. Transformative Developments in Transformer Models**\n", + "OpenAI's GPT-OSS series introduces significant advancements in transformer model efficiency, including MXFP4 quantization and efficient kernels. 
These innovations enhance the transformers library's efficiency in loading, running, and fine-tuning models, with notable performance improvements through PyTorch 2.0's torch.compile. The ability to load larger models faster, such as GPT-OSS 20B fitting in 16 GB of VRAM using MXFP4, enables single GPU deployment. The integration of Tensor and Expert Parallelism, along with Dynamic Sliding Window Layer & Cache, optimizes memory usage and enhances speed for long prompts, positioning transformers as a reference framework for future implementations.\n", + "\n", + "**4. Innovations in Conversational AI with Reactive Transformers**\n", + "The Reactive Transformer (RxT) addresses inefficiencies in stateless large language models (LLMs) for conversational AI by employing an event-driven, stateful approach with a fixed-size Short-Term Memory (STM). This architecture enables real-time, coherent, and cost-effective long-form dialogues, resulting in linear cost scaling and constant low latency. Experimental results show that even the smallest RxT model outperforms larger stateless LLMs in multi-turn dialogue tasks. Future developments include scaling to real-world data with RxT-Beta models and exploring integrated Long-Term Memory systems, promising substantial industry impact.\n", + "\n", + "**Synthesis of Trends**\n", + "These developments highlight a trend towards more efficient, scalable, and privacy-conscious AI solutions. The focus on real-time performance and user experience is evident in both BigCodeArena's code evaluation platform and the Reactive Transformer's conversational AI architecture. Community-driven contributions and open-source collaborations are driving innovation, emphasizing the importance of ecosystem engagement in AI development. These trends will shape strategic priorities, offering new opportunities for competitive differentiation and operational efficiency.\n", + "\n", + "In conclusion, these advancements represent significant strides in AI technology, offering transformative potential across industries. Understanding and leveraging these developments will be crucial for navigating the future landscape of AI-driven innovation.\u001b[0m\n", + "\n", + "\u001b[1m> Finished chain.\u001b[0m\n", + "\n", + "============================================================\n", + "✅ SUMMARIZER COMPLETE\n", + "============================================================\n", + "\n", + "**AI Newsletter - This Week's Highlights**\n", + "\n", + "**1. Advancements in AI Code Generation and Evaluation**\n", + "BigCodeArena has revolutionized AI-assisted programming by enabling real-time execution and evaluation of AI-generated code. This platform allows users to compare models based on actual outputs across multiple languages and frameworks. Models like o3-mini and o1-mini have shown significant performance improvements, excelling in various applications such as web design, game development, and scientific computing. With over 14,000 conversations and 4,700 preference votes, BigCodeArena offers valuable insights into model performance, positioning itself as a leading tool for AI-assisted programming evaluation. The introduction of benchmarks like BigCodeReward and AutoCodeArena further enhances its evaluation process.\n", + "\n", + "**2. On-Device Optical Character Recognition (OCR) Innovations**\n", + "RedNote's dots.ocr model has improved the feasibility of running competitive OCR models on-device, eliminating the need for network dependency, API costs, or data privacy concerns. 
Leveraging Apple's Neural Engine, which is 12x more power-efficient than CPUs and 4x more than GPUs, this advancement demonstrates potential for broader application in on-device model deployment. Despite challenges in converting models from PyTorch to Apple's Core ML framework, Apple's MLX framework offers a solution, paving the way for future optimization strategies like quantization and dynamic shapes.\n", + "\n", + "**3. Transformative Developments in Transformer Models**\n", + "OpenAI's GPT-OSS series introduces significant advancements in transformer model efficiency, including MXFP4 quantization and efficient kernels. These innovations enhance the transformers library's efficiency in loading, running, and fine-tuning models, with notable performance improvements through PyTorch 2.0's torch.compile. The ability to load larger models faster, such as GPT-OSS 20B fitting in 16 GB of VRAM using MXFP4, enables single GPU deployment. The integration of Tensor and Expert Parallelism, along with Dynamic Sliding Window Layer & Cache, optimizes memory usage and enhances speed for long prompts, positioning transformers as a reference framework for future implementations.\n", + "\n", + "**4. Innovations in Conversational AI with Reactive Transformers**\n", + "The Reactive Transformer (RxT) addresses inefficiencies in stateless large language models (LLMs) for conversational AI by employing an event-driven, stateful approach with a fixed-size Short-Term Memory (STM). This architecture enables real-time, coherent, and cost-effective long-form dialogues, resulting in linear cost scaling and constant low latency. Experimental results show that even the smallest RxT model outperforms larger stateless LLMs in multi-turn dialogue tasks. Future developments include scaling to real-world data with RxT-Beta models and exploring integrated Long-Term Memory systems, promising substantial industry impact.\n", + "\n", + "**Synthesis of Trends**\n", + "These developments highlight a trend towards more efficient, scalable, and privacy-conscious AI solutions. The focus on real-time performance and user experience is evident in both BigCodeArena's code evaluation platform and the Reactive Transformer's conversational AI architecture. Community-driven contributions and open-source collaborations are driving innovation, emphasizing the importance of ecosystem engagement in AI development. These trends will shape strategic priorities, offering new opportunities for competitive differentiation and operational efficiency.\n", + "\n", + "In conclusion, these advancements represent significant strides in AI technology, offering transformative potential across industries. 
Understanding and leveraging these developments will be crucial for navigating the future landscape of AI-driven innovation.\n" + ] + } + ], + "source": [ + "# ============================================================\n", + "# CUSTOMIZE YOUR PROMPTS HERE\n", + "# ============================================================\n", + "\n", + "SUMMARY_PROMPT = \"\"\"You are an expert at summarizing AI research papers and industry developments.\n", + "\n", + "Please write a concise, informative summary of the following content, focusing specifically on:\n", + "- Novel advancements or breakthroughs in AI/ML\n", + "- State-of-the-art techniques or methodologies\n", + "- Performance improvements or benchmark results\n", + "- Practical applications and industry impact\n", + "- Significance to the AI research community\n", + "\n", + "Keep the summary focused and relevant to AI industry professionals. Maximum 12 sentences.\"\"\"\n", + "\n", + "EXECUTIVE_BRIEF_PROMPT = \"\"\"You are an expert AI industry analyst creating executive summaries for C-suite executives and industry leaders.\n", + "\n", + "You are given detailed summaries of recent AI research papers and industry developments. Your task is to create a concise executive summary of approximately 700 words that:\n", + "\n", + "1. **Identifies the most significant industry developments** - Focus on breakthroughs that will impact businesses, products, or the competitive landscape\n", + "2. **Highlights practical applications** - Emphasize real-world uses and business implications\n", + "3. **Notes key performance milestones** - Include impressive benchmark results or technical achievements\n", + "4. **Synthesizes trends** - Look for patterns or themes across multiple developments\n", + "5. **Maintains accessibility** - Write for business leaders who may not have deep technical expertise\n", + "\n", + "Structure your summary with:\n", + "- A brief opening paragraph highlighting the week's most significant theme or development\n", + "- 3-4 paragraphs covering the most important individual developments, organized by impact or theme\n", + "- A concluding paragraph on what these developments mean for the AI industry going forward\n", + "\n", + "Target length: approximately 700 words. 
Be selective - only include the most industry-relevant developments.\"\"\"\n", + "\n", + "# ============================================================\n", + "# Execute Summarizer Agent\n", + "# ============================================================\n", + "\n", + "print(\"📝 Starting Summarizer Agent\")\n", + "print(\"=\" * 60)\n", + "print(f\"📋 Task: Generate AI newsletter from MongoDB\")\n", + "print(f\"🗄️ Database: {MONGODB_DATABASE}\")\n", + "print(f\"📁 Collection: {MONGODB_COLLECTION}\")\n", + "\n", + "# Get document count before starting\n", + "doc_info = retrieve_documents_from_mongodb()\n", + "if doc_info[\"status\"] == \"success\":\n", + " print(f\"📄 Documents to process: {doc_info['unique_files']} unique files ({doc_info['total_documents']} total chunks)\")\n", + "else:\n", + " print(f\"⚠️ Could not retrieve document count\")\n", + "\n", + "print(\"=\" * 60)\n", + "\n", + "summarizer_response = summarizer_executor.invoke({\n", + " \"input\": f\"\"\"Generate this week's AI newsletter from MongoDB documents.\n", + "\n", + "For each document summary, use these instructions:\n", + "{SUMMARY_PROMPT}\n", + "\n", + "For the executive brief, use these instructions:\n", + "{EXECUTIVE_BRIEF_PROMPT}\n", + "\n", + "Process all documents and generate the complete newsletter.\"\"\",\n", + " \"mongodb_database\": MONGODB_DATABASE,\n", + " \"mongodb_collection\": MONGODB_COLLECTION\n", + "})\n", + "\n", + "print(\"\\n\" + \"=\" * 60)\n", + "print(\"✅ SUMMARIZER COMPLETE\")\n", + "print(\"=\" * 60)\n", + "print(f\"\\n{summarizer_response['output']}\")" + ] + }, + { + "cell_type": "markdown", + "id": "55d019ed", + "metadata": {}, + "source": [ + "## What You've Learned\n", + "\n", + "**Document Processing Pipeline**: You've learned how to process PDF documents and HTML files with high-resolution partitioning, maintain page boundaries with page-based chunking, and store structured content in MongoDB for downstream applications.\n", + "\n", + "**Unstructured API Capabilities**: You've experienced intelligent document processing with hi_res strategy, advanced table detection and structure preservation, flexible chunking strategies for optimal text organization, and seamless integration with MongoDB for document storage.\n", + "\n", + "**AI-Powered Newsletter Generation**: You've built a complete system for retrieving processed documents from MongoDB, generating detailed summaries with customizable prompts, creating executive briefs that highlight key developments, and iterating on prompts to perfect your newsletter content.\n", + "\n", + "### Ready to Scale?\n", + "\n", + "Deploy automated newsletter systems for industry intelligence, build document summarization tools for research teams, or create AI-powered content aggregation systems. Add more document sources using additional S3 buckets, implement scheduled pipeline runs for fresh content, or scale up for production document volumes with automated processing.\n", + "\n", + "### Try Unstructured Today\n", + "\n", + "Ready to build your own AI-powered document processing system? [Sign up for a free trial](https://unstructured.io/?modal=try-for-free) and start transforming your documents into intelligent, searchable knowledge.\n", + "\n", + "**Need help getting started?** Contact our team to schedule a demo and see how Unstructured can solve your specific document processing challenges." 
+ ] + } + ], + "metadata": { + "jupytext": { + "cell_metadata_filter": "-all", + "executable": "/usr/bin/env python3", + "main_language": "python", + "notebook_metadata_filter": "-all" + }, + "language_info": { + "name": "python" + } + }, + "nbformat": 4, + "nbformat_minor": 5 +} diff --git a/notebooks/RAG_Llama3_Unstructured_LangChain.ipynb b/notebooks/RAG_Llama3_Unstructured_LangChain.ipynb index fccb796..a5c4055 100644 --- a/notebooks/RAG_Llama3_Unstructured_LangChain.ipynb +++ b/notebooks/RAG_Llama3_Unstructured_LangChain.ipynb @@ -1,8831 +1,8831 @@ { - "nbformat": 4, - "nbformat_minor": 0, - "metadata": { - "colab": { - "provenance": [], - "gpuType": "T4" + "cells": [ + { + "cell_type": "markdown", + "metadata": { + "id": "Y2m2l-vt_RSp" + }, + "source": [ + "# Build a RAG system with Llama 3 8B-Instruct for your PDFs" + ] + }, - "kernelspec": { - "name": "python3", - "display_name": "Python 3" + { + "cell_type": "markdown", + "metadata": { + "id": "bKZWdc1_J5hm" + }, + "source": [ + "In this quick tutorial, we'll build a simple RAG system with the latest LLM from Meta - Llama 3, specifically the `Llama-3-8B-Instruct` version that you can get on Hugging Face.\n", + "We'll use [Unstructured API](https://unstructured.io/) for preprocessing PDF files, LangChain for RAG, FAISS for vector storage, and Hugging Face `transformers` to get the model. Let's go!" + ] + }, - "language_info": { - "name": "python" + { + "cell_type": "markdown", + "metadata": { + "id": "fKS2grloNhrM" + }, + "source": [ + "Install all the libraries, get your [free Unstructured API key](https://unstructured.io/api-key-free), and instantiate the Unstructured client to preprocess your PDF file:" + ] + }, - "accelerator": "GPU",
"_view_name": "ProgressView", - "bar_style": "success", - "description": "", - "description_tooltip": null, - "layout": "IPY_MODEL_265de5620d9c483f96b0074ad1ca132d", - "max": 349, - "min": 0, - "orientation": "horizontal", - "style": "IPY_MODEL_f7cea318fe684c7b95575e6d33dc207d", - "value": 349 - } - }, - "229e2ecd668a49d1b9d4b78b5822d0e6": { - "model_module": "@jupyter-widgets/controls", - "model_name": "HTMLModel", - "model_module_version": "1.5.0", - "state": { - "_dom_classes": [], - "_model_module": "@jupyter-widgets/controls", - "_model_module_version": "1.5.0", - "_model_name": "HTMLModel", - "_view_count": null, - "_view_module": "@jupyter-widgets/controls", - "_view_module_version": "1.5.0", - "_view_name": "HTMLView", - "description": "", - "description_tooltip": null, - "layout": "IPY_MODEL_d92c1c7eaec2439fa0d885d2ae833fd9", - "placeholder": "​", - "style": "IPY_MODEL_69cc7595e4764ad2806479b52d736faa", - "value": " 349/349 [00:00<00:00, 23.6kB/s]" - } - }, - "7daed9b7dfc34b3a94e6d47805d26427": { - "model_module": "@jupyter-widgets/base", - "model_name": "LayoutModel", - "model_module_version": "1.2.0", - "state": { - "_model_module": "@jupyter-widgets/base", - "_model_module_version": "1.2.0", - "_model_name": "LayoutModel", - "_view_count": null, - "_view_module": "@jupyter-widgets/base", - "_view_module_version": "1.2.0", - "_view_name": "LayoutView", - "align_content": null, - "align_items": null, - "align_self": null, - "border": null, - "bottom": null, - "display": null, - "flex": null, - "flex_flow": null, - "grid_area": null, - "grid_auto_columns": null, - "grid_auto_flow": null, - "grid_auto_rows": null, - "grid_column": null, - "grid_gap": null, - "grid_row": null, - "grid_template_areas": null, - "grid_template_columns": null, - "grid_template_rows": null, - "height": null, - "justify_content": null, - "justify_items": null, - "left": null, - "margin": null, - "max_height": null, - "max_width": null, - "min_height": null, - "min_width": null, - "object_fit": null, - "object_position": null, - "order": null, - "overflow": null, - "overflow_x": null, - "overflow_y": null, - "padding": null, - "right": null, - "top": null, - "visibility": null, - "width": null - } - }, - "6b8ce03feffe4747b0af404e90e0975e": { - "model_module": "@jupyter-widgets/base", - "model_name": "LayoutModel", - "model_module_version": "1.2.0", - "state": { - "_model_module": "@jupyter-widgets/base", - "_model_module_version": "1.2.0", - "_model_name": "LayoutModel", - "_view_count": null, - "_view_module": "@jupyter-widgets/base", - "_view_module_version": "1.2.0", - "_view_name": "LayoutView", - "align_content": null, - "align_items": null, - "align_self": null, - "border": null, - "bottom": null, - "display": null, - "flex": null, - "flex_flow": null, - "grid_area": null, - "grid_auto_columns": null, - "grid_auto_flow": null, - "grid_auto_rows": null, - "grid_column": null, - "grid_gap": null, - "grid_row": null, - "grid_template_areas": null, - "grid_template_columns": null, - "grid_template_rows": null, - "height": null, - "justify_content": null, - "justify_items": null, - "left": null, - "margin": null, - "max_height": null, - "max_width": null, - "min_height": null, - "min_width": null, - "object_fit": null, - "object_position": null, - "order": null, - "overflow": null, - "overflow_x": null, - "overflow_y": null, - "padding": null, - "right": null, - "top": null, - "visibility": null, - "width": null - } - }, - "7870c1f4e16e48dea5feb7498b9e837f": { - "model_module": 
"@jupyter-widgets/controls", - "model_name": "DescriptionStyleModel", - "model_module_version": "1.5.0", - "state": { - "_model_module": "@jupyter-widgets/controls", - "_model_module_version": "1.5.0", - "_model_name": "DescriptionStyleModel", - "_view_count": null, - "_view_module": "@jupyter-widgets/base", - "_view_module_version": "1.2.0", - "_view_name": "StyleView", - "description_width": "" - } - }, - "265de5620d9c483f96b0074ad1ca132d": { - "model_module": "@jupyter-widgets/base", - "model_name": "LayoutModel", - "model_module_version": "1.2.0", - "state": { - "_model_module": "@jupyter-widgets/base", - "_model_module_version": "1.2.0", - "_model_name": "LayoutModel", - "_view_count": null, - "_view_module": "@jupyter-widgets/base", - "_view_module_version": "1.2.0", - "_view_name": "LayoutView", - "align_content": null, - "align_items": null, - "align_self": null, - "border": null, - "bottom": null, - "display": null, - "flex": null, - "flex_flow": null, - "grid_area": null, - "grid_auto_columns": null, - "grid_auto_flow": null, - "grid_auto_rows": null, - "grid_column": null, - "grid_gap": null, - "grid_row": null, - "grid_template_areas": null, - "grid_template_columns": null, - "grid_template_rows": null, - "height": null, - "justify_content": null, - "justify_items": null, - "left": null, - "margin": null, - "max_height": null, - "max_width": null, - "min_height": null, - "min_width": null, - "object_fit": null, - "object_position": null, - "order": null, - "overflow": null, - "overflow_x": null, - "overflow_y": null, - "padding": null, - "right": null, - "top": null, - "visibility": null, - "width": null - } - }, - "f7cea318fe684c7b95575e6d33dc207d": { - "model_module": "@jupyter-widgets/controls", - "model_name": "ProgressStyleModel", - "model_module_version": "1.5.0", - "state": { - "_model_module": "@jupyter-widgets/controls", - "_model_module_version": "1.5.0", - "_model_name": "ProgressStyleModel", - "_view_count": null, - "_view_module": "@jupyter-widgets/base", - "_view_module_version": "1.2.0", - "_view_name": "StyleView", - "bar_color": null, - "description_width": "" - } - }, - "d92c1c7eaec2439fa0d885d2ae833fd9": { - "model_module": "@jupyter-widgets/base", - "model_name": "LayoutModel", - "model_module_version": "1.2.0", - "state": { - "_model_module": "@jupyter-widgets/base", - "_model_module_version": "1.2.0", - "_model_name": "LayoutModel", - "_view_count": null, - "_view_module": "@jupyter-widgets/base", - "_view_module_version": "1.2.0", - "_view_name": "LayoutView", - "align_content": null, - "align_items": null, - "align_self": null, - "border": null, - "bottom": null, - "display": null, - "flex": null, - "flex_flow": null, - "grid_area": null, - "grid_auto_columns": null, - "grid_auto_flow": null, - "grid_auto_rows": null, - "grid_column": null, - "grid_gap": null, - "grid_row": null, - "grid_template_areas": null, - "grid_template_columns": null, - "grid_template_rows": null, - "height": null, - "justify_content": null, - "justify_items": null, - "left": null, - "margin": null, - "max_height": null, - "max_width": null, - "min_height": null, - "min_width": null, - "object_fit": null, - "object_position": null, - "order": null, - "overflow": null, - "overflow_x": null, - "overflow_y": null, - "padding": null, - "right": null, - "top": null, - "visibility": null, - "width": null - } - }, - "69cc7595e4764ad2806479b52d736faa": { - "model_module": "@jupyter-widgets/controls", - "model_name": "DescriptionStyleModel", - "model_module_version": "1.5.0", - "state": { 
- "_model_module": "@jupyter-widgets/controls", - "_model_module_version": "1.5.0", - "_model_name": "DescriptionStyleModel", - "_view_count": null, - "_view_module": "@jupyter-widgets/base", - "_view_module_version": "1.2.0", - "_view_name": "StyleView", - "description_width": "" - } - }, - "8d536f000daf4a928b9daeddf76cb375": { - "model_module": "@jupyter-widgets/controls", - "model_name": "HBoxModel", - "model_module_version": "1.5.0", - "state": { - "_dom_classes": [], - "_model_module": "@jupyter-widgets/controls", - "_model_module_version": "1.5.0", - "_model_name": "HBoxModel", - "_view_count": null, - "_view_module": "@jupyter-widgets/controls", - "_view_module_version": "1.5.0", - "_view_name": "HBoxView", - "box_style": "", - "children": [ - "IPY_MODEL_a3e1d6f639c84b21aeb7e4ac96381ec8", - "IPY_MODEL_5f9e84ae3afe47d49b257941777ca5b0", - "IPY_MODEL_69d64f8babda4c52aea682b26d1a6e00" - ], - "layout": "IPY_MODEL_b9d09d77494f4e2aa9b500f5d0fdadaf" - } - }, - "a3e1d6f639c84b21aeb7e4ac96381ec8": { - "model_module": "@jupyter-widgets/controls", - "model_name": "HTMLModel", - "model_module_version": "1.5.0", - "state": { - "_dom_classes": [], - "_model_module": "@jupyter-widgets/controls", - "_model_module_version": "1.5.0", - "_model_name": "HTMLModel", - "_view_count": null, - "_view_module": "@jupyter-widgets/controls", - "_view_module_version": "1.5.0", - "_view_name": "HTMLView", - "description": "", - "description_tooltip": null, - "layout": "IPY_MODEL_9b318fd56d8e45dc8c4d532662c5aba0", - "placeholder": "​", - "style": "IPY_MODEL_a8513c36560449099f7c257c21d5065b", - "value": "config_sentence_transformers.json: 100%" - } - }, - "5f9e84ae3afe47d49b257941777ca5b0": { - "model_module": "@jupyter-widgets/controls", - "model_name": "FloatProgressModel", - "model_module_version": "1.5.0", - "state": { - "_dom_classes": [], - "_model_module": "@jupyter-widgets/controls", - "_model_module_version": "1.5.0", - "_model_name": "FloatProgressModel", - "_view_count": null, - "_view_module": "@jupyter-widgets/controls", - "_view_module_version": "1.5.0", - "_view_name": "ProgressView", - "bar_style": "success", - "description": "", - "description_tooltip": null, - "layout": "IPY_MODEL_5071d412953e41458aea25cf2864ac23", - "max": 124, - "min": 0, - "orientation": "horizontal", - "style": "IPY_MODEL_100f17a84bad4191bb7977949be2cdd4", - "value": 124 - } - }, - "69d64f8babda4c52aea682b26d1a6e00": { - "model_module": "@jupyter-widgets/controls", - "model_name": "HTMLModel", - "model_module_version": "1.5.0", - "state": { - "_dom_classes": [], - "_model_module": "@jupyter-widgets/controls", - "_model_module_version": "1.5.0", - "_model_name": "HTMLModel", - "_view_count": null, - "_view_module": "@jupyter-widgets/controls", - "_view_module_version": "1.5.0", - "_view_name": "HTMLView", - "description": "", - "description_tooltip": null, - "layout": "IPY_MODEL_151a4e93b66d4da5974bdb5f38596754", - "placeholder": "​", - "style": "IPY_MODEL_088e8878cd114c59b18e8bb9ed048457", - "value": " 124/124 [00:00<00:00, 8.76kB/s]" - } - }, - "b9d09d77494f4e2aa9b500f5d0fdadaf": { - "model_module": "@jupyter-widgets/base", - "model_name": "LayoutModel", - "model_module_version": "1.2.0", - "state": { - "_model_module": "@jupyter-widgets/base", - "_model_module_version": "1.2.0", - "_model_name": "LayoutModel", - "_view_count": null, - "_view_module": "@jupyter-widgets/base", - "_view_module_version": "1.2.0", - "_view_name": "LayoutView", - "align_content": null, - "align_items": null, - "align_self": null, - "border": 
null, - "bottom": null, - "display": null, - "flex": null, - "flex_flow": null, - "grid_area": null, - "grid_auto_columns": null, - "grid_auto_flow": null, - "grid_auto_rows": null, - "grid_column": null, - "grid_gap": null, - "grid_row": null, - "grid_template_areas": null, - "grid_template_columns": null, - "grid_template_rows": null, - "height": null, - "justify_content": null, - "justify_items": null, - "left": null, - "margin": null, - "max_height": null, - "max_width": null, - "min_height": null, - "min_width": null, - "object_fit": null, - "object_position": null, - "order": null, - "overflow": null, - "overflow_x": null, - "overflow_y": null, - "padding": null, - "right": null, - "top": null, - "visibility": null, - "width": null - } - }, - "9b318fd56d8e45dc8c4d532662c5aba0": { - "model_module": "@jupyter-widgets/base", - "model_name": "LayoutModel", - "model_module_version": "1.2.0", - "state": { - "_model_module": "@jupyter-widgets/base", - "_model_module_version": "1.2.0", - "_model_name": "LayoutModel", - "_view_count": null, - "_view_module": "@jupyter-widgets/base", - "_view_module_version": "1.2.0", - "_view_name": "LayoutView", - "align_content": null, - "align_items": null, - "align_self": null, - "border": null, - "bottom": null, - "display": null, - "flex": null, - "flex_flow": null, - "grid_area": null, - "grid_auto_columns": null, - "grid_auto_flow": null, - "grid_auto_rows": null, - "grid_column": null, - "grid_gap": null, - "grid_row": null, - "grid_template_areas": null, - "grid_template_columns": null, - "grid_template_rows": null, - "height": null, - "justify_content": null, - "justify_items": null, - "left": null, - "margin": null, - "max_height": null, - "max_width": null, - "min_height": null, - "min_width": null, - "object_fit": null, - "object_position": null, - "order": null, - "overflow": null, - "overflow_x": null, - "overflow_y": null, - "padding": null, - "right": null, - "top": null, - "visibility": null, - "width": null - } - }, - "a8513c36560449099f7c257c21d5065b": { - "model_module": "@jupyter-widgets/controls", - "model_name": "DescriptionStyleModel", - "model_module_version": "1.5.0", - "state": { - "_model_module": "@jupyter-widgets/controls", - "_model_module_version": "1.5.0", - "_model_name": "DescriptionStyleModel", - "_view_count": null, - "_view_module": "@jupyter-widgets/base", - "_view_module_version": "1.2.0", - "_view_name": "StyleView", - "description_width": "" - } - }, - "5071d412953e41458aea25cf2864ac23": { - "model_module": "@jupyter-widgets/base", - "model_name": "LayoutModel", - "model_module_version": "1.2.0", - "state": { - "_model_module": "@jupyter-widgets/base", - "_model_module_version": "1.2.0", - "_model_name": "LayoutModel", - "_view_count": null, - "_view_module": "@jupyter-widgets/base", - "_view_module_version": "1.2.0", - "_view_name": "LayoutView", - "align_content": null, - "align_items": null, - "align_self": null, - "border": null, - "bottom": null, - "display": null, - "flex": null, - "flex_flow": null, - "grid_area": null, - "grid_auto_columns": null, - "grid_auto_flow": null, - "grid_auto_rows": null, - "grid_column": null, - "grid_gap": null, - "grid_row": null, - "grid_template_areas": null, - "grid_template_columns": null, - "grid_template_rows": null, - "height": null, - "justify_content": null, - "justify_items": null, - "left": null, - "margin": null, - "max_height": null, - "max_width": null, - "min_height": null, - "min_width": null, - "object_fit": null, - "object_position": null, - "order": null, - 
"overflow": null, - "overflow_x": null, - "overflow_y": null, - "padding": null, - "right": null, - "top": null, - "visibility": null, - "width": null - } - }, - "100f17a84bad4191bb7977949be2cdd4": { - "model_module": "@jupyter-widgets/controls", - "model_name": "ProgressStyleModel", - "model_module_version": "1.5.0", - "state": { - "_model_module": "@jupyter-widgets/controls", - "_model_module_version": "1.5.0", - "_model_name": "ProgressStyleModel", - "_view_count": null, - "_view_module": "@jupyter-widgets/base", - "_view_module_version": "1.2.0", - "_view_name": "StyleView", - "bar_color": null, - "description_width": "" - } - }, - "151a4e93b66d4da5974bdb5f38596754": { - "model_module": "@jupyter-widgets/base", - "model_name": "LayoutModel", - "model_module_version": "1.2.0", - "state": { - "_model_module": "@jupyter-widgets/base", - "_model_module_version": "1.2.0", - "_model_name": "LayoutModel", - "_view_count": null, - "_view_module": "@jupyter-widgets/base", - "_view_module_version": "1.2.0", - "_view_name": "LayoutView", - "align_content": null, - "align_items": null, - "align_self": null, - "border": null, - "bottom": null, - "display": null, - "flex": null, - "flex_flow": null, - "grid_area": null, - "grid_auto_columns": null, - "grid_auto_flow": null, - "grid_auto_rows": null, - "grid_column": null, - "grid_gap": null, - "grid_row": null, - "grid_template_areas": null, - "grid_template_columns": null, - "grid_template_rows": null, - "height": null, - "justify_content": null, - "justify_items": null, - "left": null, - "margin": null, - "max_height": null, - "max_width": null, - "min_height": null, - "min_width": null, - "object_fit": null, - "object_position": null, - "order": null, - "overflow": null, - "overflow_x": null, - "overflow_y": null, - "padding": null, - "right": null, - "top": null, - "visibility": null, - "width": null - } - }, - "088e8878cd114c59b18e8bb9ed048457": { - "model_module": "@jupyter-widgets/controls", - "model_name": "DescriptionStyleModel", - "model_module_version": "1.5.0", - "state": { - "_model_module": "@jupyter-widgets/controls", - "_model_module_version": "1.5.0", - "_model_name": "DescriptionStyleModel", - "_view_count": null, - "_view_module": "@jupyter-widgets/base", - "_view_module_version": "1.2.0", - "_view_name": "StyleView", - "description_width": "" - } - }, - "f6766f2b04da46bdbf2df051606fce2f": { - "model_module": "@jupyter-widgets/controls", - "model_name": "HBoxModel", - "model_module_version": "1.5.0", - "state": { - "_dom_classes": [], - "_model_module": "@jupyter-widgets/controls", - "_model_module_version": "1.5.0", - "_model_name": "HBoxModel", - "_view_count": null, - "_view_module": "@jupyter-widgets/controls", - "_view_module_version": "1.5.0", - "_view_name": "HBoxView", - "box_style": "", - "children": [ - "IPY_MODEL_2fd3a6fe6727423aa44840b17ebbcb34", - "IPY_MODEL_e857e11e572742168303037071a477ba", - "IPY_MODEL_7eb7ef9ef3af46acb461b42efb9c4af0" - ], - "layout": "IPY_MODEL_4db0da1b27344e45b7f235911185a4a0" - } - }, - "2fd3a6fe6727423aa44840b17ebbcb34": { - "model_module": "@jupyter-widgets/controls", - "model_name": "HTMLModel", - "model_module_version": "1.5.0", - "state": { - "_dom_classes": [], - "_model_module": "@jupyter-widgets/controls", - "_model_module_version": "1.5.0", - "_model_name": "HTMLModel", - "_view_count": null, - "_view_module": "@jupyter-widgets/controls", - "_view_module_version": "1.5.0", - "_view_name": "HTMLView", - "description": "", - "description_tooltip": null, - "layout": 
"IPY_MODEL_ff1415afeac740c69bee0facc692a8e1", - "placeholder": "​", - "style": "IPY_MODEL_86104b3ee4564b8e9cef14eec70f1391", - "value": "README.md: 100%" - } - }, - "e857e11e572742168303037071a477ba": { - "model_module": "@jupyter-widgets/controls", - "model_name": "FloatProgressModel", - "model_module_version": "1.5.0", - "state": { - "_dom_classes": [], - "_model_module": "@jupyter-widgets/controls", - "_model_module_version": "1.5.0", - "_model_name": "FloatProgressModel", - "_view_count": null, - "_view_module": "@jupyter-widgets/controls", - "_view_module_version": "1.5.0", - "_view_name": "ProgressView", - "bar_style": "success", - "description": "", - "description_tooltip": null, - "layout": "IPY_MODEL_40a2a615a56e445fa72383446136d980", - "max": 94551, - "min": 0, - "orientation": "horizontal", - "style": "IPY_MODEL_f39c6dc24936447dbbd0a3279c20223e", - "value": 94551 - } - }, - "7eb7ef9ef3af46acb461b42efb9c4af0": { - "model_module": "@jupyter-widgets/controls", - "model_name": "HTMLModel", - "model_module_version": "1.5.0", - "state": { - "_dom_classes": [], - "_model_module": "@jupyter-widgets/controls", - "_model_module_version": "1.5.0", - "_model_name": "HTMLModel", - "_view_count": null, - "_view_module": "@jupyter-widgets/controls", - "_view_module_version": "1.5.0", - "_view_name": "HTMLView", - "description": "", - "description_tooltip": null, - "layout": "IPY_MODEL_b4a4dca5f5864bd3a1b202517c49b792", - "placeholder": "​", - "style": "IPY_MODEL_58c008b0cbf04cf2872afa353e6b9f76", - "value": " 94.6k/94.6k [00:00<00:00, 3.37MB/s]" - } - }, - "4db0da1b27344e45b7f235911185a4a0": { - "model_module": "@jupyter-widgets/base", - "model_name": "LayoutModel", - "model_module_version": "1.2.0", - "state": { - "_model_module": "@jupyter-widgets/base", - "_model_module_version": "1.2.0", - "_model_name": "LayoutModel", - "_view_count": null, - "_view_module": "@jupyter-widgets/base", - "_view_module_version": "1.2.0", - "_view_name": "LayoutView", - "align_content": null, - "align_items": null, - "align_self": null, - "border": null, - "bottom": null, - "display": null, - "flex": null, - "flex_flow": null, - "grid_area": null, - "grid_auto_columns": null, - "grid_auto_flow": null, - "grid_auto_rows": null, - "grid_column": null, - "grid_gap": null, - "grid_row": null, - "grid_template_areas": null, - "grid_template_columns": null, - "grid_template_rows": null, - "height": null, - "justify_content": null, - "justify_items": null, - "left": null, - "margin": null, - "max_height": null, - "max_width": null, - "min_height": null, - "min_width": null, - "object_fit": null, - "object_position": null, - "order": null, - "overflow": null, - "overflow_x": null, - "overflow_y": null, - "padding": null, - "right": null, - "top": null, - "visibility": null, - "width": null - } - }, - "ff1415afeac740c69bee0facc692a8e1": { - "model_module": "@jupyter-widgets/base", - "model_name": "LayoutModel", - "model_module_version": "1.2.0", - "state": { - "_model_module": "@jupyter-widgets/base", - "_model_module_version": "1.2.0", - "_model_name": "LayoutModel", - "_view_count": null, - "_view_module": "@jupyter-widgets/base", - "_view_module_version": "1.2.0", - "_view_name": "LayoutView", - "align_content": null, - "align_items": null, - "align_self": null, - "border": null, - "bottom": null, - "display": null, - "flex": null, - "flex_flow": null, - "grid_area": null, - "grid_auto_columns": null, - "grid_auto_flow": null, - "grid_auto_rows": null, - "grid_column": null, - "grid_gap": null, - "grid_row": null, - 
"grid_template_areas": null, - "grid_template_columns": null, - "grid_template_rows": null, - "height": null, - "justify_content": null, - "justify_items": null, - "left": null, - "margin": null, - "max_height": null, - "max_width": null, - "min_height": null, - "min_width": null, - "object_fit": null, - "object_position": null, - "order": null, - "overflow": null, - "overflow_x": null, - "overflow_y": null, - "padding": null, - "right": null, - "top": null, - "visibility": null, - "width": null - } - }, - "86104b3ee4564b8e9cef14eec70f1391": { - "model_module": "@jupyter-widgets/controls", - "model_name": "DescriptionStyleModel", - "model_module_version": "1.5.0", - "state": { - "_model_module": "@jupyter-widgets/controls", - "_model_module_version": "1.5.0", - "_model_name": "DescriptionStyleModel", - "_view_count": null, - "_view_module": "@jupyter-widgets/base", - "_view_module_version": "1.2.0", - "_view_name": "StyleView", - "description_width": "" - } - }, - "40a2a615a56e445fa72383446136d980": { - "model_module": "@jupyter-widgets/base", - "model_name": "LayoutModel", - "model_module_version": "1.2.0", - "state": { - "_model_module": "@jupyter-widgets/base", - "_model_module_version": "1.2.0", - "_model_name": "LayoutModel", - "_view_count": null, - "_view_module": "@jupyter-widgets/base", - "_view_module_version": "1.2.0", - "_view_name": "LayoutView", - "align_content": null, - "align_items": null, - "align_self": null, - "border": null, - "bottom": null, - "display": null, - "flex": null, - "flex_flow": null, - "grid_area": null, - "grid_auto_columns": null, - "grid_auto_flow": null, - "grid_auto_rows": null, - "grid_column": null, - "grid_gap": null, - "grid_row": null, - "grid_template_areas": null, - "grid_template_columns": null, - "grid_template_rows": null, - "height": null, - "justify_content": null, - "justify_items": null, - "left": null, - "margin": null, - "max_height": null, - "max_width": null, - "min_height": null, - "min_width": null, - "object_fit": null, - "object_position": null, - "order": null, - "overflow": null, - "overflow_x": null, - "overflow_y": null, - "padding": null, - "right": null, - "top": null, - "visibility": null, - "width": null - } - }, - "f39c6dc24936447dbbd0a3279c20223e": { - "model_module": "@jupyter-widgets/controls", - "model_name": "ProgressStyleModel", - "model_module_version": "1.5.0", - "state": { - "_model_module": "@jupyter-widgets/controls", - "_model_module_version": "1.5.0", - "_model_name": "ProgressStyleModel", - "_view_count": null, - "_view_module": "@jupyter-widgets/base", - "_view_module_version": "1.2.0", - "_view_name": "StyleView", - "bar_color": null, - "description_width": "" - } - }, - "b4a4dca5f5864bd3a1b202517c49b792": { - "model_module": "@jupyter-widgets/base", - "model_name": "LayoutModel", - "model_module_version": "1.2.0", - "state": { - "_model_module": "@jupyter-widgets/base", - "_model_module_version": "1.2.0", - "_model_name": "LayoutModel", - "_view_count": null, - "_view_module": "@jupyter-widgets/base", - "_view_module_version": "1.2.0", - "_view_name": "LayoutView", - "align_content": null, - "align_items": null, - "align_self": null, - "border": null, - "bottom": null, - "display": null, - "flex": null, - "flex_flow": null, - "grid_area": null, - "grid_auto_columns": null, - "grid_auto_flow": null, - "grid_auto_rows": null, - "grid_column": null, - "grid_gap": null, - "grid_row": null, - "grid_template_areas": null, - "grid_template_columns": null, - "grid_template_rows": null, - "height": null, - 
"justify_content": null, - "justify_items": null, - "left": null, - "margin": null, - "max_height": null, - "max_width": null, - "min_height": null, - "min_width": null, - "object_fit": null, - "object_position": null, - "order": null, - "overflow": null, - "overflow_x": null, - "overflow_y": null, - "padding": null, - "right": null, - "top": null, - "visibility": null, - "width": null - } - }, - "58c008b0cbf04cf2872afa353e6b9f76": { - "model_module": "@jupyter-widgets/controls", - "model_name": "DescriptionStyleModel", - "model_module_version": "1.5.0", - "state": { - "_model_module": "@jupyter-widgets/controls", - "_model_module_version": "1.5.0", - "_model_name": "DescriptionStyleModel", - "_view_count": null, - "_view_module": "@jupyter-widgets/base", - "_view_module_version": "1.2.0", - "_view_name": "StyleView", - "description_width": "" - } - }, - "8c5086b82bc243f5bac08f9aed527c19": { - "model_module": "@jupyter-widgets/controls", - "model_name": "HBoxModel", - "model_module_version": "1.5.0", - "state": { - "_dom_classes": [], - "_model_module": "@jupyter-widgets/controls", - "_model_module_version": "1.5.0", - "_model_name": "HBoxModel", - "_view_count": null, - "_view_module": "@jupyter-widgets/controls", - "_view_module_version": "1.5.0", - "_view_name": "HBoxView", - "box_style": "", - "children": [ - "IPY_MODEL_f98ad69de71b4ff5bc43e3d4cd4cc5ee", - "IPY_MODEL_8b98d1642c3e4c959952248ce40c60c7", - "IPY_MODEL_1134e014bcc242888828682645029a1e" - ], - "layout": "IPY_MODEL_423cefd4340c482cb70337168049f1c7" - } - }, - "f98ad69de71b4ff5bc43e3d4cd4cc5ee": { - "model_module": "@jupyter-widgets/controls", - "model_name": "HTMLModel", - "model_module_version": "1.5.0", - "state": { - "_dom_classes": [], - "_model_module": "@jupyter-widgets/controls", - "_model_module_version": "1.5.0", - "_model_name": "HTMLModel", - "_view_count": null, - "_view_module": "@jupyter-widgets/controls", - "_view_module_version": "1.5.0", - "_view_name": "HTMLView", - "description": "", - "description_tooltip": null, - "layout": "IPY_MODEL_e7dd117cd6684500bf4b11a10e7cf0e9", - "placeholder": "​", - "style": "IPY_MODEL_2ba6d64387244836bc00e3d62a552cae", - "value": "sentence_bert_config.json: 100%" - } - }, - "8b98d1642c3e4c959952248ce40c60c7": { - "model_module": "@jupyter-widgets/controls", - "model_name": "FloatProgressModel", - "model_module_version": "1.5.0", - "state": { - "_dom_classes": [], - "_model_module": "@jupyter-widgets/controls", - "_model_module_version": "1.5.0", - "_model_name": "FloatProgressModel", - "_view_count": null, - "_view_module": "@jupyter-widgets/controls", - "_view_module_version": "1.5.0", - "_view_name": "ProgressView", - "bar_style": "success", - "description": "", - "description_tooltip": null, - "layout": "IPY_MODEL_e3d77e059a5f47eb8dbeeef22febc9d8", - "max": 52, - "min": 0, - "orientation": "horizontal", - "style": "IPY_MODEL_3cad7442bcfa45fd9aed4002324a9d7f", - "value": 52 - } - }, - "1134e014bcc242888828682645029a1e": { - "model_module": "@jupyter-widgets/controls", - "model_name": "HTMLModel", - "model_module_version": "1.5.0", - "state": { - "_dom_classes": [], - "_model_module": "@jupyter-widgets/controls", - "_model_module_version": "1.5.0", - "_model_name": "HTMLModel", - "_view_count": null, - "_view_module": "@jupyter-widgets/controls", - "_view_module_version": "1.5.0", - "_view_name": "HTMLView", - "description": "", - "description_tooltip": null, - "layout": "IPY_MODEL_df5fe4b465b646d2938cf8da656a8d33", - "placeholder": "​", - "style": 
"IPY_MODEL_56a410608c3a49ed90153c2989f6b8d9", - "value": " 52.0/52.0 [00:00<00:00, 3.66kB/s]" - } - }, - "423cefd4340c482cb70337168049f1c7": { - "model_module": "@jupyter-widgets/base", - "model_name": "LayoutModel", - "model_module_version": "1.2.0", - "state": { - "_model_module": "@jupyter-widgets/base", - "_model_module_version": "1.2.0", - "_model_name": "LayoutModel", - "_view_count": null, - "_view_module": "@jupyter-widgets/base", - "_view_module_version": "1.2.0", - "_view_name": "LayoutView", - "align_content": null, - "align_items": null, - "align_self": null, - "border": null, - "bottom": null, - "display": null, - "flex": null, - "flex_flow": null, - "grid_area": null, - "grid_auto_columns": null, - "grid_auto_flow": null, - "grid_auto_rows": null, - "grid_column": null, - "grid_gap": null, - "grid_row": null, - "grid_template_areas": null, - "grid_template_columns": null, - "grid_template_rows": null, - "height": null, - "justify_content": null, - "justify_items": null, - "left": null, - "margin": null, - "max_height": null, - "max_width": null, - "min_height": null, - "min_width": null, - "object_fit": null, - "object_position": null, - "order": null, - "overflow": null, - "overflow_x": null, - "overflow_y": null, - "padding": null, - "right": null, - "top": null, - "visibility": null, - "width": null - } - }, - "e7dd117cd6684500bf4b11a10e7cf0e9": { - "model_module": "@jupyter-widgets/base", - "model_name": "LayoutModel", - "model_module_version": "1.2.0", - "state": { - "_model_module": "@jupyter-widgets/base", - "_model_module_version": "1.2.0", - "_model_name": "LayoutModel", - "_view_count": null, - "_view_module": "@jupyter-widgets/base", - "_view_module_version": "1.2.0", - "_view_name": "LayoutView", - "align_content": null, - "align_items": null, - "align_self": null, - "border": null, - "bottom": null, - "display": null, - "flex": null, - "flex_flow": null, - "grid_area": null, - "grid_auto_columns": null, - "grid_auto_flow": null, - "grid_auto_rows": null, - "grid_column": null, - "grid_gap": null, - "grid_row": null, - "grid_template_areas": null, - "grid_template_columns": null, - "grid_template_rows": null, - "height": null, - "justify_content": null, - "justify_items": null, - "left": null, - "margin": null, - "max_height": null, - "max_width": null, - "min_height": null, - "min_width": null, - "object_fit": null, - "object_position": null, - "order": null, - "overflow": null, - "overflow_x": null, - "overflow_y": null, - "padding": null, - "right": null, - "top": null, - "visibility": null, - "width": null - } - }, - "2ba6d64387244836bc00e3d62a552cae": { - "model_module": "@jupyter-widgets/controls", - "model_name": "DescriptionStyleModel", - "model_module_version": "1.5.0", - "state": { - "_model_module": "@jupyter-widgets/controls", - "_model_module_version": "1.5.0", - "_model_name": "DescriptionStyleModel", - "_view_count": null, - "_view_module": "@jupyter-widgets/base", - "_view_module_version": "1.2.0", - "_view_name": "StyleView", - "description_width": "" - } - }, - "e3d77e059a5f47eb8dbeeef22febc9d8": { - "model_module": "@jupyter-widgets/base", - "model_name": "LayoutModel", - "model_module_version": "1.2.0", - "state": { - "_model_module": "@jupyter-widgets/base", - "_model_module_version": "1.2.0", - "_model_name": "LayoutModel", - "_view_count": null, - "_view_module": "@jupyter-widgets/base", - "_view_module_version": "1.2.0", - "_view_name": "LayoutView", - "align_content": null, - "align_items": null, - "align_self": null, - "border": null, - 
"bottom": null, - "display": null, - "flex": null, - "flex_flow": null, - "grid_area": null, - "grid_auto_columns": null, - "grid_auto_flow": null, - "grid_auto_rows": null, - "grid_column": null, - "grid_gap": null, - "grid_row": null, - "grid_template_areas": null, - "grid_template_columns": null, - "grid_template_rows": null, - "height": null, - "justify_content": null, - "justify_items": null, - "left": null, - "margin": null, - "max_height": null, - "max_width": null, - "min_height": null, - "min_width": null, - "object_fit": null, - "object_position": null, - "order": null, - "overflow": null, - "overflow_x": null, - "overflow_y": null, - "padding": null, - "right": null, - "top": null, - "visibility": null, - "width": null - } - }, - "3cad7442bcfa45fd9aed4002324a9d7f": { - "model_module": "@jupyter-widgets/controls", - "model_name": "ProgressStyleModel", - "model_module_version": "1.5.0", - "state": { - "_model_module": "@jupyter-widgets/controls", - "_model_module_version": "1.5.0", - "_model_name": "ProgressStyleModel", - "_view_count": null, - "_view_module": "@jupyter-widgets/base", - "_view_module_version": "1.2.0", - "_view_name": "StyleView", - "bar_color": null, - "description_width": "" - } - }, - "df5fe4b465b646d2938cf8da656a8d33": { - "model_module": "@jupyter-widgets/base", - "model_name": "LayoutModel", - "model_module_version": "1.2.0", - "state": { - "_model_module": "@jupyter-widgets/base", - "_model_module_version": "1.2.0", - "_model_name": "LayoutModel", - "_view_count": null, - "_view_module": "@jupyter-widgets/base", - "_view_module_version": "1.2.0", - "_view_name": "LayoutView", - "align_content": null, - "align_items": null, - "align_self": null, - "border": null, - "bottom": null, - "display": null, - "flex": null, - "flex_flow": null, - "grid_area": null, - "grid_auto_columns": null, - "grid_auto_flow": null, - "grid_auto_rows": null, - "grid_column": null, - "grid_gap": null, - "grid_row": null, - "grid_template_areas": null, - "grid_template_columns": null, - "grid_template_rows": null, - "height": null, - "justify_content": null, - "justify_items": null, - "left": null, - "margin": null, - "max_height": null, - "max_width": null, - "min_height": null, - "min_width": null, - "object_fit": null, - "object_position": null, - "order": null, - "overflow": null, - "overflow_x": null, - "overflow_y": null, - "padding": null, - "right": null, - "top": null, - "visibility": null, - "width": null - } - }, - "56a410608c3a49ed90153c2989f6b8d9": { - "model_module": "@jupyter-widgets/controls", - "model_name": "DescriptionStyleModel", - "model_module_version": "1.5.0", - "state": { - "_model_module": "@jupyter-widgets/controls", - "_model_module_version": "1.5.0", - "_model_name": "DescriptionStyleModel", - "_view_count": null, - "_view_module": "@jupyter-widgets/base", - "_view_module_version": "1.2.0", - "_view_name": "StyleView", - "description_width": "" - } - }, - "50550ce1b30649b9b7a382f488025af1": { - "model_module": "@jupyter-widgets/controls", - "model_name": "HBoxModel", - "model_module_version": "1.5.0", - "state": { - "_dom_classes": [], - "_model_module": "@jupyter-widgets/controls", - "_model_module_version": "1.5.0", - "_model_name": "HBoxModel", - "_view_count": null, - "_view_module": "@jupyter-widgets/controls", - "_view_module_version": "1.5.0", - "_view_name": "HBoxView", - "box_style": "", - "children": [ - "IPY_MODEL_ebea966f35ea40d7ad4a4f6fe34b4bda", - "IPY_MODEL_f732f41c4de44ea6a0db1de49695f85e", - "IPY_MODEL_fd3ea606c386466d9c6968f8728533b5" - 
], - "layout": "IPY_MODEL_8fccf616a7694c529c0c966d417cad51" - } - }, - "ebea966f35ea40d7ad4a4f6fe34b4bda": { - "model_module": "@jupyter-widgets/controls", - "model_name": "HTMLModel", - "model_module_version": "1.5.0", - "state": { - "_dom_classes": [], - "_model_module": "@jupyter-widgets/controls", - "_model_module_version": "1.5.0", - "_model_name": "HTMLModel", - "_view_count": null, - "_view_module": "@jupyter-widgets/controls", - "_view_module_version": "1.5.0", - "_view_name": "HTMLView", - "description": "", - "description_tooltip": null, - "layout": "IPY_MODEL_fb8b5e13e19e4424ac25a499550868da", - "placeholder": "​", - "style": "IPY_MODEL_fd44f15fb985473eb6ddccae5a39f04d", - "value": "config.json: 100%" - } - }, - "f732f41c4de44ea6a0db1de49695f85e": { - "model_module": "@jupyter-widgets/controls", - "model_name": "FloatProgressModel", - "model_module_version": "1.5.0", - "state": { - "_dom_classes": [], - "_model_module": "@jupyter-widgets/controls", - "_model_module_version": "1.5.0", - "_model_name": "FloatProgressModel", - "_view_count": null, - "_view_module": "@jupyter-widgets/controls", - "_view_module_version": "1.5.0", - "_view_name": "ProgressView", - "bar_style": "success", - "description": "", - "description_tooltip": null, - "layout": "IPY_MODEL_61d9fb21d35948fead864e381993a677", - "max": 777, - "min": 0, - "orientation": "horizontal", - "style": "IPY_MODEL_acd718fefdc84b8fa60d4255077914ee", - "value": 777 - } - }, - "fd3ea606c386466d9c6968f8728533b5": { - "model_module": "@jupyter-widgets/controls", - "model_name": "HTMLModel", - "model_module_version": "1.5.0", - "state": { - "_dom_classes": [], - "_model_module": "@jupyter-widgets/controls", - "_model_module_version": "1.5.0", - "_model_name": "HTMLModel", - "_view_count": null, - "_view_module": "@jupyter-widgets/controls", - "_view_module_version": "1.5.0", - "_view_name": "HTMLView", - "description": "", - "description_tooltip": null, - "layout": "IPY_MODEL_ebdcaa70abe74743890e7de4d4685291", - "placeholder": "​", - "style": "IPY_MODEL_845ffe1f73164cd2a3e4f413e2a35241", - "value": " 777/777 [00:00<00:00, 64.0kB/s]" - } - }, - "8fccf616a7694c529c0c966d417cad51": { - "model_module": "@jupyter-widgets/base", - "model_name": "LayoutModel", - "model_module_version": "1.2.0", - "state": { - "_model_module": "@jupyter-widgets/base", - "_model_module_version": "1.2.0", - "_model_name": "LayoutModel", - "_view_count": null, - "_view_module": "@jupyter-widgets/base", - "_view_module_version": "1.2.0", - "_view_name": "LayoutView", - "align_content": null, - "align_items": null, - "align_self": null, - "border": null, - "bottom": null, - "display": null, - "flex": null, - "flex_flow": null, - "grid_area": null, - "grid_auto_columns": null, - "grid_auto_flow": null, - "grid_auto_rows": null, - "grid_column": null, - "grid_gap": null, - "grid_row": null, - "grid_template_areas": null, - "grid_template_columns": null, - "grid_template_rows": null, - "height": null, - "justify_content": null, - "justify_items": null, - "left": null, - "margin": null, - "max_height": null, - "max_width": null, - "min_height": null, - "min_width": null, - "object_fit": null, - "object_position": null, - "order": null, - "overflow": null, - "overflow_x": null, - "overflow_y": null, - "padding": null, - "right": null, - "top": null, - "visibility": null, - "width": null - } - }, - "fb8b5e13e19e4424ac25a499550868da": { - "model_module": "@jupyter-widgets/base", - "model_name": "LayoutModel", - "model_module_version": "1.2.0", - "state": { - 
"_model_module": "@jupyter-widgets/base", - "_model_module_version": "1.2.0", - "_model_name": "LayoutModel", - "_view_count": null, - "_view_module": "@jupyter-widgets/base", - "_view_module_version": "1.2.0", - "_view_name": "LayoutView", - "align_content": null, - "align_items": null, - "align_self": null, - "border": null, - "bottom": null, - "display": null, - "flex": null, - "flex_flow": null, - "grid_area": null, - "grid_auto_columns": null, - "grid_auto_flow": null, - "grid_auto_rows": null, - "grid_column": null, - "grid_gap": null, - "grid_row": null, - "grid_template_areas": null, - "grid_template_columns": null, - "grid_template_rows": null, - "height": null, - "justify_content": null, - "justify_items": null, - "left": null, - "margin": null, - "max_height": null, - "max_width": null, - "min_height": null, - "min_width": null, - "object_fit": null, - "object_position": null, - "order": null, - "overflow": null, - "overflow_x": null, - "overflow_y": null, - "padding": null, - "right": null, - "top": null, - "visibility": null, - "width": null - } - }, - "fd44f15fb985473eb6ddccae5a39f04d": { - "model_module": "@jupyter-widgets/controls", - "model_name": "DescriptionStyleModel", - "model_module_version": "1.5.0", - "state": { - "_model_module": "@jupyter-widgets/controls", - "_model_module_version": "1.5.0", - "_model_name": "DescriptionStyleModel", - "_view_count": null, - "_view_module": "@jupyter-widgets/base", - "_view_module_version": "1.2.0", - "_view_name": "StyleView", - "description_width": "" - } - }, - "61d9fb21d35948fead864e381993a677": { - "model_module": "@jupyter-widgets/base", - "model_name": "LayoutModel", - "model_module_version": "1.2.0", - "state": { - "_model_module": "@jupyter-widgets/base", - "_model_module_version": "1.2.0", - "_model_name": "LayoutModel", - "_view_count": null, - "_view_module": "@jupyter-widgets/base", - "_view_module_version": "1.2.0", - "_view_name": "LayoutView", - "align_content": null, - "align_items": null, - "align_self": null, - "border": null, - "bottom": null, - "display": null, - "flex": null, - "flex_flow": null, - "grid_area": null, - "grid_auto_columns": null, - "grid_auto_flow": null, - "grid_auto_rows": null, - "grid_column": null, - "grid_gap": null, - "grid_row": null, - "grid_template_areas": null, - "grid_template_columns": null, - "grid_template_rows": null, - "height": null, - "justify_content": null, - "justify_items": null, - "left": null, - "margin": null, - "max_height": null, - "max_width": null, - "min_height": null, - "min_width": null, - "object_fit": null, - "object_position": null, - "order": null, - "overflow": null, - "overflow_x": null, - "overflow_y": null, - "padding": null, - "right": null, - "top": null, - "visibility": null, - "width": null - } - }, - "acd718fefdc84b8fa60d4255077914ee": { - "model_module": "@jupyter-widgets/controls", - "model_name": "ProgressStyleModel", - "model_module_version": "1.5.0", - "state": { - "_model_module": "@jupyter-widgets/controls", - "_model_module_version": "1.5.0", - "_model_name": "ProgressStyleModel", - "_view_count": null, - "_view_module": "@jupyter-widgets/base", - "_view_module_version": "1.2.0", - "_view_name": "StyleView", - "bar_color": null, - "description_width": "" - } - }, - "ebdcaa70abe74743890e7de4d4685291": { - "model_module": "@jupyter-widgets/base", - "model_name": "LayoutModel", - "model_module_version": "1.2.0", - "state": { - "_model_module": "@jupyter-widgets/base", - "_model_module_version": "1.2.0", - "_model_name": "LayoutModel", - 
"_view_count": null, - "_view_module": "@jupyter-widgets/base", - "_view_module_version": "1.2.0", - "_view_name": "LayoutView", - "align_content": null, - "align_items": null, - "align_self": null, - "border": null, - "bottom": null, - "display": null, - "flex": null, - "flex_flow": null, - "grid_area": null, - "grid_auto_columns": null, - "grid_auto_flow": null, - "grid_auto_rows": null, - "grid_column": null, - "grid_gap": null, - "grid_row": null, - "grid_template_areas": null, - "grid_template_columns": null, - "grid_template_rows": null, - "height": null, - "justify_content": null, - "justify_items": null, - "left": null, - "margin": null, - "max_height": null, - "max_width": null, - "min_height": null, - "min_width": null, - "object_fit": null, - "object_position": null, - "order": null, - "overflow": null, - "overflow_x": null, - "overflow_y": null, - "padding": null, - "right": null, - "top": null, - "visibility": null, - "width": null - } - }, - "845ffe1f73164cd2a3e4f413e2a35241": { - "model_module": "@jupyter-widgets/controls", - "model_name": "DescriptionStyleModel", - "model_module_version": "1.5.0", - "state": { - "_model_module": "@jupyter-widgets/controls", - "_model_module_version": "1.5.0", - "_model_name": "DescriptionStyleModel", - "_view_count": null, - "_view_module": "@jupyter-widgets/base", - "_view_module_version": "1.2.0", - "_view_name": "StyleView", - "description_width": "" - } - }, - "715c90e0865d4e01943a841fdf3eb0dd": { - "model_module": "@jupyter-widgets/controls", - "model_name": "HBoxModel", - "model_module_version": "1.5.0", - "state": { - "_dom_classes": [], - "_model_module": "@jupyter-widgets/controls", - "_model_module_version": "1.5.0", - "_model_name": "HBoxModel", - "_view_count": null, - "_view_module": "@jupyter-widgets/controls", - "_view_module_version": "1.5.0", - "_view_name": "HBoxView", - "box_style": "", - "children": [ - "IPY_MODEL_38ac4502272241f291ce81ec6802b55f", - "IPY_MODEL_11fb0295d6ad47ce85d9382bdb1723fd", - "IPY_MODEL_b62c504dd67f4d08b5e19d4ca86a8379" - ], - "layout": "IPY_MODEL_f94070d542904bb4ba45a550203c5ac2" - } - }, - "38ac4502272241f291ce81ec6802b55f": { - "model_module": "@jupyter-widgets/controls", - "model_name": "HTMLModel", - "model_module_version": "1.5.0", - "state": { - "_dom_classes": [], - "_model_module": "@jupyter-widgets/controls", - "_model_module_version": "1.5.0", - "_model_name": "HTMLModel", - "_view_count": null, - "_view_module": "@jupyter-widgets/controls", - "_view_module_version": "1.5.0", - "_view_name": "HTMLView", - "description": "", - "description_tooltip": null, - "layout": "IPY_MODEL_77af9e788b0446de8cd84f1828877588", - "placeholder": "​", - "style": "IPY_MODEL_7efaf19ab2624dbf9adde1f8d03f3911", - "value": "model.safetensors: 100%" - } - }, - "11fb0295d6ad47ce85d9382bdb1723fd": { - "model_module": "@jupyter-widgets/controls", - "model_name": "FloatProgressModel", - "model_module_version": "1.5.0", - "state": { - "_dom_classes": [], - "_model_module": "@jupyter-widgets/controls", - "_model_module_version": "1.5.0", - "_model_name": "FloatProgressModel", - "_view_count": null, - "_view_module": "@jupyter-widgets/controls", - "_view_module_version": "1.5.0", - "_view_name": "ProgressView", - "bar_style": "success", - "description": "", - "description_tooltip": null, - "layout": "IPY_MODEL_d0518bbbde494ab1876c98f3d514832a", - "max": 437955512, - "min": 0, - "orientation": "horizontal", - "style": "IPY_MODEL_c2b7df874d334dcdbb433897c84b079d", - "value": 437955512 - } - }, - 
"b62c504dd67f4d08b5e19d4ca86a8379": { - "model_module": "@jupyter-widgets/controls", - "model_name": "HTMLModel", - "model_module_version": "1.5.0", - "state": { - "_dom_classes": [], - "_model_module": "@jupyter-widgets/controls", - "_model_module_version": "1.5.0", - "_model_name": "HTMLModel", - "_view_count": null, - "_view_module": "@jupyter-widgets/controls", - "_view_module_version": "1.5.0", - "_view_name": "HTMLView", - "description": "", - "description_tooltip": null, - "layout": "IPY_MODEL_f7bbd95f043d46339377ee21821735d8", - "placeholder": "​", - "style": "IPY_MODEL_2e4a239969f443bf80829b3f21ffebdd", - "value": " 438M/438M [00:02<00:00, 186MB/s]" - } - }, - "f94070d542904bb4ba45a550203c5ac2": { - "model_module": "@jupyter-widgets/base", - "model_name": "LayoutModel", - "model_module_version": "1.2.0", - "state": { - "_model_module": "@jupyter-widgets/base", - "_model_module_version": "1.2.0", - "_model_name": "LayoutModel", - "_view_count": null, - "_view_module": "@jupyter-widgets/base", - "_view_module_version": "1.2.0", - "_view_name": "LayoutView", - "align_content": null, - "align_items": null, - "align_self": null, - "border": null, - "bottom": null, - "display": null, - "flex": null, - "flex_flow": null, - "grid_area": null, - "grid_auto_columns": null, - "grid_auto_flow": null, - "grid_auto_rows": null, - "grid_column": null, - "grid_gap": null, - "grid_row": null, - "grid_template_areas": null, - "grid_template_columns": null, - "grid_template_rows": null, - "height": null, - "justify_content": null, - "justify_items": null, - "left": null, - "margin": null, - "max_height": null, - "max_width": null, - "min_height": null, - "min_width": null, - "object_fit": null, - "object_position": null, - "order": null, - "overflow": null, - "overflow_x": null, - "overflow_y": null, - "padding": null, - "right": null, - "top": null, - "visibility": null, - "width": null - } - }, - "77af9e788b0446de8cd84f1828877588": { - "model_module": "@jupyter-widgets/base", - "model_name": "LayoutModel", - "model_module_version": "1.2.0", - "state": { - "_model_module": "@jupyter-widgets/base", - "_model_module_version": "1.2.0", - "_model_name": "LayoutModel", - "_view_count": null, - "_view_module": "@jupyter-widgets/base", - "_view_module_version": "1.2.0", - "_view_name": "LayoutView", - "align_content": null, - "align_items": null, - "align_self": null, - "border": null, - "bottom": null, - "display": null, - "flex": null, - "flex_flow": null, - "grid_area": null, - "grid_auto_columns": null, - "grid_auto_flow": null, - "grid_auto_rows": null, - "grid_column": null, - "grid_gap": null, - "grid_row": null, - "grid_template_areas": null, - "grid_template_columns": null, - "grid_template_rows": null, - "height": null, - "justify_content": null, - "justify_items": null, - "left": null, - "margin": null, - "max_height": null, - "max_width": null, - "min_height": null, - "min_width": null, - "object_fit": null, - "object_position": null, - "order": null, - "overflow": null, - "overflow_x": null, - "overflow_y": null, - "padding": null, - "right": null, - "top": null, - "visibility": null, - "width": null - } - }, - "7efaf19ab2624dbf9adde1f8d03f3911": { - "model_module": "@jupyter-widgets/controls", - "model_name": "DescriptionStyleModel", - "model_module_version": "1.5.0", - "state": { - "_model_module": "@jupyter-widgets/controls", - "_model_module_version": "1.5.0", - "_model_name": "DescriptionStyleModel", - "_view_count": null, - "_view_module": "@jupyter-widgets/base", - 
"_view_module_version": "1.2.0", - "_view_name": "StyleView", - "description_width": "" - } - }, - "d0518bbbde494ab1876c98f3d514832a": { - "model_module": "@jupyter-widgets/base", - "model_name": "LayoutModel", - "model_module_version": "1.2.0", - "state": { - "_model_module": "@jupyter-widgets/base", - "_model_module_version": "1.2.0", - "_model_name": "LayoutModel", - "_view_count": null, - "_view_module": "@jupyter-widgets/base", - "_view_module_version": "1.2.0", - "_view_name": "LayoutView", - "align_content": null, - "align_items": null, - "align_self": null, - "border": null, - "bottom": null, - "display": null, - "flex": null, - "flex_flow": null, - "grid_area": null, - "grid_auto_columns": null, - "grid_auto_flow": null, - "grid_auto_rows": null, - "grid_column": null, - "grid_gap": null, - "grid_row": null, - "grid_template_areas": null, - "grid_template_columns": null, - "grid_template_rows": null, - "height": null, - "justify_content": null, - "justify_items": null, - "left": null, - "margin": null, - "max_height": null, - "max_width": null, - "min_height": null, - "min_width": null, - "object_fit": null, - "object_position": null, - "order": null, - "overflow": null, - "overflow_x": null, - "overflow_y": null, - "padding": null, - "right": null, - "top": null, - "visibility": null, - "width": null - } - }, - "c2b7df874d334dcdbb433897c84b079d": { - "model_module": "@jupyter-widgets/controls", - "model_name": "ProgressStyleModel", - "model_module_version": "1.5.0", - "state": { - "_model_module": "@jupyter-widgets/controls", - "_model_module_version": "1.5.0", - "_model_name": "ProgressStyleModel", - "_view_count": null, - "_view_module": "@jupyter-widgets/base", - "_view_module_version": "1.2.0", - "_view_name": "StyleView", - "bar_color": null, - "description_width": "" - } - }, - "f7bbd95f043d46339377ee21821735d8": { - "model_module": "@jupyter-widgets/base", - "model_name": "LayoutModel", - "model_module_version": "1.2.0", - "state": { - "_model_module": "@jupyter-widgets/base", - "_model_module_version": "1.2.0", - "_model_name": "LayoutModel", - "_view_count": null, - "_view_module": "@jupyter-widgets/base", - "_view_module_version": "1.2.0", - "_view_name": "LayoutView", - "align_content": null, - "align_items": null, - "align_self": null, - "border": null, - "bottom": null, - "display": null, - "flex": null, - "flex_flow": null, - "grid_area": null, - "grid_auto_columns": null, - "grid_auto_flow": null, - "grid_auto_rows": null, - "grid_column": null, - "grid_gap": null, - "grid_row": null, - "grid_template_areas": null, - "grid_template_columns": null, - "grid_template_rows": null, - "height": null, - "justify_content": null, - "justify_items": null, - "left": null, - "margin": null, - "max_height": null, - "max_width": null, - "min_height": null, - "min_width": null, - "object_fit": null, - "object_position": null, - "order": null, - "overflow": null, - "overflow_x": null, - "overflow_y": null, - "padding": null, - "right": null, - "top": null, - "visibility": null, - "width": null - } - }, - "2e4a239969f443bf80829b3f21ffebdd": { - "model_module": "@jupyter-widgets/controls", - "model_name": "DescriptionStyleModel", - "model_module_version": "1.5.0", - "state": { - "_model_module": "@jupyter-widgets/controls", - "_model_module_version": "1.5.0", - "_model_name": "DescriptionStyleModel", - "_view_count": null, - "_view_module": "@jupyter-widgets/base", - "_view_module_version": "1.2.0", - "_view_name": "StyleView", - "description_width": "" - } - }, - 
"519c0f5256974835b547b28e94ff2224": { - "model_module": "@jupyter-widgets/controls", - "model_name": "HBoxModel", - "model_module_version": "1.5.0", - "state": { - "_dom_classes": [], - "_model_module": "@jupyter-widgets/controls", - "_model_module_version": "1.5.0", - "_model_name": "HBoxModel", - "_view_count": null, - "_view_module": "@jupyter-widgets/controls", - "_view_module_version": "1.5.0", - "_view_name": "HBoxView", - "box_style": "", - "children": [ - "IPY_MODEL_f4b86788d3ce49b7bce7d273d8bf2cad", - "IPY_MODEL_098e801e0c794453b31d69805208dcd9", - "IPY_MODEL_3b20a4851944472198eb2e4f24ee3d96" - ], - "layout": "IPY_MODEL_58ff4d02a7b0449da85e8f48562d2b87" - } - }, - "f4b86788d3ce49b7bce7d273d8bf2cad": { - "model_module": "@jupyter-widgets/controls", - "model_name": "HTMLModel", - "model_module_version": "1.5.0", - "state": { - "_dom_classes": [], - "_model_module": "@jupyter-widgets/controls", - "_model_module_version": "1.5.0", - "_model_name": "HTMLModel", - "_view_count": null, - "_view_module": "@jupyter-widgets/controls", - "_view_module_version": "1.5.0", - "_view_name": "HTMLView", - "description": "", - "description_tooltip": null, - "layout": "IPY_MODEL_a57019b4b2394e3a8144831d400d7c0b", - "placeholder": "​", - "style": "IPY_MODEL_932611e22f1f4a1f9598be2a287d397a", - "value": "tokenizer_config.json: 100%" - } - }, - "098e801e0c794453b31d69805208dcd9": { - "model_module": "@jupyter-widgets/controls", - "model_name": "FloatProgressModel", - "model_module_version": "1.5.0", - "state": { - "_dom_classes": [], - "_model_module": "@jupyter-widgets/controls", - "_model_module_version": "1.5.0", - "_model_name": "FloatProgressModel", - "_view_count": null, - "_view_module": "@jupyter-widgets/controls", - "_view_module_version": "1.5.0", - "_view_name": "ProgressView", - "bar_style": "success", - "description": "", - "description_tooltip": null, - "layout": "IPY_MODEL_1b093409a5c340a6a74fd0f64e4e6154", - "max": 366, - "min": 0, - "orientation": "horizontal", - "style": "IPY_MODEL_998a415e72704b20b00c2e3e4bf37614", - "value": 366 - } - }, - "3b20a4851944472198eb2e4f24ee3d96": { - "model_module": "@jupyter-widgets/controls", - "model_name": "HTMLModel", - "model_module_version": "1.5.0", - "state": { - "_dom_classes": [], - "_model_module": "@jupyter-widgets/controls", - "_model_module_version": "1.5.0", - "_model_name": "HTMLModel", - "_view_count": null, - "_view_module": "@jupyter-widgets/controls", - "_view_module_version": "1.5.0", - "_view_name": "HTMLView", - "description": "", - "description_tooltip": null, - "layout": "IPY_MODEL_e177d2ee77e24c17a723986c62ae5367", - "placeholder": "​", - "style": "IPY_MODEL_ce1dcb5db8c546cb84761c141924cd98", - "value": " 366/366 [00:00<00:00, 17.8kB/s]" - } - }, - "58ff4d02a7b0449da85e8f48562d2b87": { - "model_module": "@jupyter-widgets/base", - "model_name": "LayoutModel", - "model_module_version": "1.2.0", - "state": { - "_model_module": "@jupyter-widgets/base", - "_model_module_version": "1.2.0", - "_model_name": "LayoutModel", - "_view_count": null, - "_view_module": "@jupyter-widgets/base", - "_view_module_version": "1.2.0", - "_view_name": "LayoutView", - "align_content": null, - "align_items": null, - "align_self": null, - "border": null, - "bottom": null, - "display": null, - "flex": null, - "flex_flow": null, - "grid_area": null, - "grid_auto_columns": null, - "grid_auto_flow": null, - "grid_auto_rows": null, - "grid_column": null, - "grid_gap": null, - "grid_row": null, - "grid_template_areas": null, - "grid_template_columns": null, 
- "grid_template_rows": null, - "height": null, - "justify_content": null, - "justify_items": null, - "left": null, - "margin": null, - "max_height": null, - "max_width": null, - "min_height": null, - "min_width": null, - "object_fit": null, - "object_position": null, - "order": null, - "overflow": null, - "overflow_x": null, - "overflow_y": null, - "padding": null, - "right": null, - "top": null, - "visibility": null, - "width": null - } - }, - "a57019b4b2394e3a8144831d400d7c0b": { - "model_module": "@jupyter-widgets/base", - "model_name": "LayoutModel", - "model_module_version": "1.2.0", - "state": { - "_model_module": "@jupyter-widgets/base", - "_model_module_version": "1.2.0", - "_model_name": "LayoutModel", - "_view_count": null, - "_view_module": "@jupyter-widgets/base", - "_view_module_version": "1.2.0", - "_view_name": "LayoutView", - "align_content": null, - "align_items": null, - "align_self": null, - "border": null, - "bottom": null, - "display": null, - "flex": null, - "flex_flow": null, - "grid_area": null, - "grid_auto_columns": null, - "grid_auto_flow": null, - "grid_auto_rows": null, - "grid_column": null, - "grid_gap": null, - "grid_row": null, - "grid_template_areas": null, - "grid_template_columns": null, - "grid_template_rows": null, - "height": null, - "justify_content": null, - "justify_items": null, - "left": null, - "margin": null, - "max_height": null, - "max_width": null, - "min_height": null, - "min_width": null, - "object_fit": null, - "object_position": null, - "order": null, - "overflow": null, - "overflow_x": null, - "overflow_y": null, - "padding": null, - "right": null, - "top": null, - "visibility": null, - "width": null - } - }, - "932611e22f1f4a1f9598be2a287d397a": { - "model_module": "@jupyter-widgets/controls", - "model_name": "DescriptionStyleModel", - "model_module_version": "1.5.0", - "state": { - "_model_module": "@jupyter-widgets/controls", - "_model_module_version": "1.5.0", - "_model_name": "DescriptionStyleModel", - "_view_count": null, - "_view_module": "@jupyter-widgets/base", - "_view_module_version": "1.2.0", - "_view_name": "StyleView", - "description_width": "" - } - }, - "1b093409a5c340a6a74fd0f64e4e6154": { - "model_module": "@jupyter-widgets/base", - "model_name": "LayoutModel", - "model_module_version": "1.2.0", - "state": { - "_model_module": "@jupyter-widgets/base", - "_model_module_version": "1.2.0", - "_model_name": "LayoutModel", - "_view_count": null, - "_view_module": "@jupyter-widgets/base", - "_view_module_version": "1.2.0", - "_view_name": "LayoutView", - "align_content": null, - "align_items": null, - "align_self": null, - "border": null, - "bottom": null, - "display": null, - "flex": null, - "flex_flow": null, - "grid_area": null, - "grid_auto_columns": null, - "grid_auto_flow": null, - "grid_auto_rows": null, - "grid_column": null, - "grid_gap": null, - "grid_row": null, - "grid_template_areas": null, - "grid_template_columns": null, - "grid_template_rows": null, - "height": null, - "justify_content": null, - "justify_items": null, - "left": null, - "margin": null, - "max_height": null, - "max_width": null, - "min_height": null, - "min_width": null, - "object_fit": null, - "object_position": null, - "order": null, - "overflow": null, - "overflow_x": null, - "overflow_y": null, - "padding": null, - "right": null, - "top": null, - "visibility": null, - "width": null - } - }, - "998a415e72704b20b00c2e3e4bf37614": { - "model_module": "@jupyter-widgets/controls", - "model_name": "ProgressStyleModel", - "model_module_version": 
"1.5.0", - "state": { - "_model_module": "@jupyter-widgets/controls", - "_model_module_version": "1.5.0", - "_model_name": "ProgressStyleModel", - "_view_count": null, - "_view_module": "@jupyter-widgets/base", - "_view_module_version": "1.2.0", - "_view_name": "StyleView", - "bar_color": null, - "description_width": "" - } - }, - "e177d2ee77e24c17a723986c62ae5367": { - "model_module": "@jupyter-widgets/base", - "model_name": "LayoutModel", - "model_module_version": "1.2.0", - "state": { - "_model_module": "@jupyter-widgets/base", - "_model_module_version": "1.2.0", - "_model_name": "LayoutModel", - "_view_count": null, - "_view_module": "@jupyter-widgets/base", - "_view_module_version": "1.2.0", - "_view_name": "LayoutView", - "align_content": null, - "align_items": null, - "align_self": null, - "border": null, - "bottom": null, - "display": null, - "flex": null, - "flex_flow": null, - "grid_area": null, - "grid_auto_columns": null, - "grid_auto_flow": null, - "grid_auto_rows": null, - "grid_column": null, - "grid_gap": null, - "grid_row": null, - "grid_template_areas": null, - "grid_template_columns": null, - "grid_template_rows": null, - "height": null, - "justify_content": null, - "justify_items": null, - "left": null, - "margin": null, - "max_height": null, - "max_width": null, - "min_height": null, - "min_width": null, - "object_fit": null, - "object_position": null, - "order": null, - "overflow": null, - "overflow_x": null, - "overflow_y": null, - "padding": null, - "right": null, - "top": null, - "visibility": null, - "width": null - } - }, - "ce1dcb5db8c546cb84761c141924cd98": { - "model_module": "@jupyter-widgets/controls", - "model_name": "DescriptionStyleModel", - "model_module_version": "1.5.0", - "state": { - "_model_module": "@jupyter-widgets/controls", - "_model_module_version": "1.5.0", - "_model_name": "DescriptionStyleModel", - "_view_count": null, - "_view_module": "@jupyter-widgets/base", - "_view_module_version": "1.2.0", - "_view_name": "StyleView", - "description_width": "" - } - }, - "c03ef8cf6b654e1994d691cbdb331e55": { - "model_module": "@jupyter-widgets/controls", - "model_name": "HBoxModel", - "model_module_version": "1.5.0", - "state": { - "_dom_classes": [], - "_model_module": "@jupyter-widgets/controls", - "_model_module_version": "1.5.0", - "_model_name": "HBoxModel", - "_view_count": null, - "_view_module": "@jupyter-widgets/controls", - "_view_module_version": "1.5.0", - "_view_name": "HBoxView", - "box_style": "", - "children": [ - "IPY_MODEL_350258a4b1e848e0b28880b868412862", - "IPY_MODEL_9e003352a7d742fb833f15b532a5ad53", - "IPY_MODEL_4e3dd4a0fa4e42be9515f89bbdada0b1" - ], - "layout": "IPY_MODEL_7d84283a574f470a9f130aa40fad0083" - } - }, - "350258a4b1e848e0b28880b868412862": { - "model_module": "@jupyter-widgets/controls", - "model_name": "HTMLModel", - "model_module_version": "1.5.0", - "state": { - "_dom_classes": [], - "_model_module": "@jupyter-widgets/controls", - "_model_module_version": "1.5.0", - "_model_name": "HTMLModel", - "_view_count": null, - "_view_module": "@jupyter-widgets/controls", - "_view_module_version": "1.5.0", - "_view_name": "HTMLView", - "description": "", - "description_tooltip": null, - "layout": "IPY_MODEL_b0512a445d8748e0a538cd037dd80480", - "placeholder": "​", - "style": "IPY_MODEL_eebc474029844a1b88a022ad694d1f17", - "value": "vocab.txt: 100%" - } - }, - "9e003352a7d742fb833f15b532a5ad53": { - "model_module": "@jupyter-widgets/controls", - "model_name": "FloatProgressModel", - "model_module_version": "1.5.0", - "state": 
{ - "_dom_classes": [], - "_model_module": "@jupyter-widgets/controls", - "_model_module_version": "1.5.0", - "_model_name": "FloatProgressModel", - "_view_count": null, - "_view_module": "@jupyter-widgets/controls", - "_view_module_version": "1.5.0", - "_view_name": "ProgressView", - "bar_style": "success", - "description": "", - "description_tooltip": null, - "layout": "IPY_MODEL_6fac241a17194caf908895b8f4cdb377", - "max": 231508, - "min": 0, - "orientation": "horizontal", - "style": "IPY_MODEL_ef9b4e9212c845359acb1a0d295c2211", - "value": 231508 - } - }, - "4e3dd4a0fa4e42be9515f89bbdada0b1": { - "model_module": "@jupyter-widgets/controls", - "model_name": "HTMLModel", - "model_module_version": "1.5.0", - "state": { - "_dom_classes": [], - "_model_module": "@jupyter-widgets/controls", - "_model_module_version": "1.5.0", - "_model_name": "HTMLModel", - "_view_count": null, - "_view_module": "@jupyter-widgets/controls", - "_view_module_version": "1.5.0", - "_view_name": "HTMLView", - "description": "", - "description_tooltip": null, - "layout": "IPY_MODEL_c2e105c1ac584ba9b12bd4b22191d3a8", - "placeholder": "​", - "style": "IPY_MODEL_9d6f989fbb584b2eb25172fcc662b2e7", - "value": " 232k/232k [00:00<00:00, 11.1MB/s]" - } - }, - "7d84283a574f470a9f130aa40fad0083": { - "model_module": "@jupyter-widgets/base", - "model_name": "LayoutModel", - "model_module_version": "1.2.0", - "state": { - "_model_module": "@jupyter-widgets/base", - "_model_module_version": "1.2.0", - "_model_name": "LayoutModel", - "_view_count": null, - "_view_module": "@jupyter-widgets/base", - "_view_module_version": "1.2.0", - "_view_name": "LayoutView", - "align_content": null, - "align_items": null, - "align_self": null, - "border": null, - "bottom": null, - "display": null, - "flex": null, - "flex_flow": null, - "grid_area": null, - "grid_auto_columns": null, - "grid_auto_flow": null, - "grid_auto_rows": null, - "grid_column": null, - "grid_gap": null, - "grid_row": null, - "grid_template_areas": null, - "grid_template_columns": null, - "grid_template_rows": null, - "height": null, - "justify_content": null, - "justify_items": null, - "left": null, - "margin": null, - "max_height": null, - "max_width": null, - "min_height": null, - "min_width": null, - "object_fit": null, - "object_position": null, - "order": null, - "overflow": null, - "overflow_x": null, - "overflow_y": null, - "padding": null, - "right": null, - "top": null, - "visibility": null, - "width": null - } - }, - "b0512a445d8748e0a538cd037dd80480": { - "model_module": "@jupyter-widgets/base", - "model_name": "LayoutModel", - "model_module_version": "1.2.0", - "state": { - "_model_module": "@jupyter-widgets/base", - "_model_module_version": "1.2.0", - "_model_name": "LayoutModel", - "_view_count": null, - "_view_module": "@jupyter-widgets/base", - "_view_module_version": "1.2.0", - "_view_name": "LayoutView", - "align_content": null, - "align_items": null, - "align_self": null, - "border": null, - "bottom": null, - "display": null, - "flex": null, - "flex_flow": null, - "grid_area": null, - "grid_auto_columns": null, - "grid_auto_flow": null, - "grid_auto_rows": null, - "grid_column": null, - "grid_gap": null, - "grid_row": null, - "grid_template_areas": null, - "grid_template_columns": null, - "grid_template_rows": null, - "height": null, - "justify_content": null, - "justify_items": null, - "left": null, - "margin": null, - "max_height": null, - "max_width": null, - "min_height": null, - "min_width": null, - "object_fit": null, - "object_position": null, - 
"order": null, - "overflow": null, - "overflow_x": null, - "overflow_y": null, - "padding": null, - "right": null, - "top": null, - "visibility": null, - "width": null - } - }, - "eebc474029844a1b88a022ad694d1f17": { - "model_module": "@jupyter-widgets/controls", - "model_name": "DescriptionStyleModel", - "model_module_version": "1.5.0", - "state": { - "_model_module": "@jupyter-widgets/controls", - "_model_module_version": "1.5.0", - "_model_name": "DescriptionStyleModel", - "_view_count": null, - "_view_module": "@jupyter-widgets/base", - "_view_module_version": "1.2.0", - "_view_name": "StyleView", - "description_width": "" - } - }, - "6fac241a17194caf908895b8f4cdb377": { - "model_module": "@jupyter-widgets/base", - "model_name": "LayoutModel", - "model_module_version": "1.2.0", - "state": { - "_model_module": "@jupyter-widgets/base", - "_model_module_version": "1.2.0", - "_model_name": "LayoutModel", - "_view_count": null, - "_view_module": "@jupyter-widgets/base", - "_view_module_version": "1.2.0", - "_view_name": "LayoutView", - "align_content": null, - "align_items": null, - "align_self": null, - "border": null, - "bottom": null, - "display": null, - "flex": null, - "flex_flow": null, - "grid_area": null, - "grid_auto_columns": null, - "grid_auto_flow": null, - "grid_auto_rows": null, - "grid_column": null, - "grid_gap": null, - "grid_row": null, - "grid_template_areas": null, - "grid_template_columns": null, - "grid_template_rows": null, - "height": null, - "justify_content": null, - "justify_items": null, - "left": null, - "margin": null, - "max_height": null, - "max_width": null, - "min_height": null, - "min_width": null, - "object_fit": null, - "object_position": null, - "order": null, - "overflow": null, - "overflow_x": null, - "overflow_y": null, - "padding": null, - "right": null, - "top": null, - "visibility": null, - "width": null - } - }, - "ef9b4e9212c845359acb1a0d295c2211": { - "model_module": "@jupyter-widgets/controls", - "model_name": "ProgressStyleModel", - "model_module_version": "1.5.0", - "state": { - "_model_module": "@jupyter-widgets/controls", - "_model_module_version": "1.5.0", - "_model_name": "ProgressStyleModel", - "_view_count": null, - "_view_module": "@jupyter-widgets/base", - "_view_module_version": "1.2.0", - "_view_name": "StyleView", - "bar_color": null, - "description_width": "" - } - }, - "c2e105c1ac584ba9b12bd4b22191d3a8": { - "model_module": "@jupyter-widgets/base", - "model_name": "LayoutModel", - "model_module_version": "1.2.0", - "state": { - "_model_module": "@jupyter-widgets/base", - "_model_module_version": "1.2.0", - "_model_name": "LayoutModel", - "_view_count": null, - "_view_module": "@jupyter-widgets/base", - "_view_module_version": "1.2.0", - "_view_name": "LayoutView", - "align_content": null, - "align_items": null, - "align_self": null, - "border": null, - "bottom": null, - "display": null, - "flex": null, - "flex_flow": null, - "grid_area": null, - "grid_auto_columns": null, - "grid_auto_flow": null, - "grid_auto_rows": null, - "grid_column": null, - "grid_gap": null, - "grid_row": null, - "grid_template_areas": null, - "grid_template_columns": null, - "grid_template_rows": null, - "height": null, - "justify_content": null, - "justify_items": null, - "left": null, - "margin": null, - "max_height": null, - "max_width": null, - "min_height": null, - "min_width": null, - "object_fit": null, - "object_position": null, - "order": null, - "overflow": null, - "overflow_x": null, - "overflow_y": null, - "padding": null, - "right": null, - 
"top": null, - "visibility": null, - "width": null - } - }, - "9d6f989fbb584b2eb25172fcc662b2e7": { - "model_module": "@jupyter-widgets/controls", - "model_name": "DescriptionStyleModel", - "model_module_version": "1.5.0", - "state": { - "_model_module": "@jupyter-widgets/controls", - "_model_module_version": "1.5.0", - "_model_name": "DescriptionStyleModel", - "_view_count": null, - "_view_module": "@jupyter-widgets/base", - "_view_module_version": "1.2.0", - "_view_name": "StyleView", - "description_width": "" - } - }, - "8e3912f4f3134e7f90ae0959536f6429": { - "model_module": "@jupyter-widgets/controls", - "model_name": "HBoxModel", - "model_module_version": "1.5.0", - "state": { - "_dom_classes": [], - "_model_module": "@jupyter-widgets/controls", - "_model_module_version": "1.5.0", - "_model_name": "HBoxModel", - "_view_count": null, - "_view_module": "@jupyter-widgets/controls", - "_view_module_version": "1.5.0", - "_view_name": "HBoxView", - "box_style": "", - "children": [ - "IPY_MODEL_264da80b4e3e446199cf9d53b3f4ed58", - "IPY_MODEL_866796ac33744148b117d864df6283dc", - "IPY_MODEL_702057124ce3481dab7da4bedce9c7e4" - ], - "layout": "IPY_MODEL_a67b0a951275400bbfdc5ab341cd98bc" - } - }, - "264da80b4e3e446199cf9d53b3f4ed58": { - "model_module": "@jupyter-widgets/controls", - "model_name": "HTMLModel", - "model_module_version": "1.5.0", - "state": { - "_dom_classes": [], - "_model_module": "@jupyter-widgets/controls", - "_model_module_version": "1.5.0", - "_model_name": "HTMLModel", - "_view_count": null, - "_view_module": "@jupyter-widgets/controls", - "_view_module_version": "1.5.0", - "_view_name": "HTMLView", - "description": "", - "description_tooltip": null, - "layout": "IPY_MODEL_58b683922de04bff9a26487612d10288", - "placeholder": "​", - "style": "IPY_MODEL_4464dca33eb24b5793457e70910b8aa4", - "value": "tokenizer.json: 100%" - } - }, - "866796ac33744148b117d864df6283dc": { - "model_module": "@jupyter-widgets/controls", - "model_name": "FloatProgressModel", - "model_module_version": "1.5.0", - "state": { - "_dom_classes": [], - "_model_module": "@jupyter-widgets/controls", - "_model_module_version": "1.5.0", - "_model_name": "FloatProgressModel", - "_view_count": null, - "_view_module": "@jupyter-widgets/controls", - "_view_module_version": "1.5.0", - "_view_name": "ProgressView", - "bar_style": "success", - "description": "", - "description_tooltip": null, - "layout": "IPY_MODEL_8945460a9d714e559bfeb3d6b5c77196", - "max": 711396, - "min": 0, - "orientation": "horizontal", - "style": "IPY_MODEL_595ca1e634504c98adf61fc08c98a826", - "value": 711396 - } - }, - "702057124ce3481dab7da4bedce9c7e4": { - "model_module": "@jupyter-widgets/controls", - "model_name": "HTMLModel", - "model_module_version": "1.5.0", - "state": { - "_dom_classes": [], - "_model_module": "@jupyter-widgets/controls", - "_model_module_version": "1.5.0", - "_model_name": "HTMLModel", - "_view_count": null, - "_view_module": "@jupyter-widgets/controls", - "_view_module_version": "1.5.0", - "_view_name": "HTMLView", - "description": "", - "description_tooltip": null, - "layout": "IPY_MODEL_1b7d1263951b4a3ea339b4d2c871f2c8", - "placeholder": "​", - "style": "IPY_MODEL_e0c7d450ac65475b869768d8bb5fb7bb", - "value": " 711k/711k [00:00<00:00, 26.5MB/s]" - } - }, - "a67b0a951275400bbfdc5ab341cd98bc": { - "model_module": "@jupyter-widgets/base", - "model_name": "LayoutModel", - "model_module_version": "1.2.0", - "state": { - "_model_module": "@jupyter-widgets/base", - "_model_module_version": "1.2.0", - "_model_name": 
"LayoutModel", - "_view_count": null, - "_view_module": "@jupyter-widgets/base", - "_view_module_version": "1.2.0", - "_view_name": "LayoutView", - "align_content": null, - "align_items": null, - "align_self": null, - "border": null, - "bottom": null, - "display": null, - "flex": null, - "flex_flow": null, - "grid_area": null, - "grid_auto_columns": null, - "grid_auto_flow": null, - "grid_auto_rows": null, - "grid_column": null, - "grid_gap": null, - "grid_row": null, - "grid_template_areas": null, - "grid_template_columns": null, - "grid_template_rows": null, - "height": null, - "justify_content": null, - "justify_items": null, - "left": null, - "margin": null, - "max_height": null, - "max_width": null, - "min_height": null, - "min_width": null, - "object_fit": null, - "object_position": null, - "order": null, - "overflow": null, - "overflow_x": null, - "overflow_y": null, - "padding": null, - "right": null, - "top": null, - "visibility": null, - "width": null - } - }, - "58b683922de04bff9a26487612d10288": { - "model_module": "@jupyter-widgets/base", - "model_name": "LayoutModel", - "model_module_version": "1.2.0", - "state": { - "_model_module": "@jupyter-widgets/base", - "_model_module_version": "1.2.0", - "_model_name": "LayoutModel", - "_view_count": null, - "_view_module": "@jupyter-widgets/base", - "_view_module_version": "1.2.0", - "_view_name": "LayoutView", - "align_content": null, - "align_items": null, - "align_self": null, - "border": null, - "bottom": null, - "display": null, - "flex": null, - "flex_flow": null, - "grid_area": null, - "grid_auto_columns": null, - "grid_auto_flow": null, - "grid_auto_rows": null, - "grid_column": null, - "grid_gap": null, - "grid_row": null, - "grid_template_areas": null, - "grid_template_columns": null, - "grid_template_rows": null, - "height": null, - "justify_content": null, - "justify_items": null, - "left": null, - "margin": null, - "max_height": null, - "max_width": null, - "min_height": null, - "min_width": null, - "object_fit": null, - "object_position": null, - "order": null, - "overflow": null, - "overflow_x": null, - "overflow_y": null, - "padding": null, - "right": null, - "top": null, - "visibility": null, - "width": null - } - }, - "4464dca33eb24b5793457e70910b8aa4": { - "model_module": "@jupyter-widgets/controls", - "model_name": "DescriptionStyleModel", - "model_module_version": "1.5.0", - "state": { - "_model_module": "@jupyter-widgets/controls", - "_model_module_version": "1.5.0", - "_model_name": "DescriptionStyleModel", - "_view_count": null, - "_view_module": "@jupyter-widgets/base", - "_view_module_version": "1.2.0", - "_view_name": "StyleView", - "description_width": "" - } - }, - "8945460a9d714e559bfeb3d6b5c77196": { - "model_module": "@jupyter-widgets/base", - "model_name": "LayoutModel", - "model_module_version": "1.2.0", - "state": { - "_model_module": "@jupyter-widgets/base", - "_model_module_version": "1.2.0", - "_model_name": "LayoutModel", - "_view_count": null, - "_view_module": "@jupyter-widgets/base", - "_view_module_version": "1.2.0", - "_view_name": "LayoutView", - "align_content": null, - "align_items": null, - "align_self": null, - "border": null, - "bottom": null, - "display": null, - "flex": null, - "flex_flow": null, - "grid_area": null, - "grid_auto_columns": null, - "grid_auto_flow": null, - "grid_auto_rows": null, - "grid_column": null, - "grid_gap": null, - "grid_row": null, - "grid_template_areas": null, - "grid_template_columns": null, - "grid_template_rows": null, - "height": null, - 
"justify_content": null, - "justify_items": null, - "left": null, - "margin": null, - "max_height": null, - "max_width": null, - "min_height": null, - "min_width": null, - "object_fit": null, - "object_position": null, - "order": null, - "overflow": null, - "overflow_x": null, - "overflow_y": null, - "padding": null, - "right": null, - "top": null, - "visibility": null, - "width": null - } - }, - "595ca1e634504c98adf61fc08c98a826": { - "model_module": "@jupyter-widgets/controls", - "model_name": "ProgressStyleModel", - "model_module_version": "1.5.0", - "state": { - "_model_module": "@jupyter-widgets/controls", - "_model_module_version": "1.5.0", - "_model_name": "ProgressStyleModel", - "_view_count": null, - "_view_module": "@jupyter-widgets/base", - "_view_module_version": "1.2.0", - "_view_name": "StyleView", - "bar_color": null, - "description_width": "" - } - }, - "1b7d1263951b4a3ea339b4d2c871f2c8": { - "model_module": "@jupyter-widgets/base", - "model_name": "LayoutModel", - "model_module_version": "1.2.0", - "state": { - "_model_module": "@jupyter-widgets/base", - "_model_module_version": "1.2.0", - "_model_name": "LayoutModel", - "_view_count": null, - "_view_module": "@jupyter-widgets/base", - "_view_module_version": "1.2.0", - "_view_name": "LayoutView", - "align_content": null, - "align_items": null, - "align_self": null, - "border": null, - "bottom": null, - "display": null, - "flex": null, - "flex_flow": null, - "grid_area": null, - "grid_auto_columns": null, - "grid_auto_flow": null, - "grid_auto_rows": null, - "grid_column": null, - "grid_gap": null, - "grid_row": null, - "grid_template_areas": null, - "grid_template_columns": null, - "grid_template_rows": null, - "height": null, - "justify_content": null, - "justify_items": null, - "left": null, - "margin": null, - "max_height": null, - "max_width": null, - "min_height": null, - "min_width": null, - "object_fit": null, - "object_position": null, - "order": null, - "overflow": null, - "overflow_x": null, - "overflow_y": null, - "padding": null, - "right": null, - "top": null, - "visibility": null, - "width": null - } - }, - "e0c7d450ac65475b869768d8bb5fb7bb": { - "model_module": "@jupyter-widgets/controls", - "model_name": "DescriptionStyleModel", - "model_module_version": "1.5.0", - "state": { - "_model_module": "@jupyter-widgets/controls", - "_model_module_version": "1.5.0", - "_model_name": "DescriptionStyleModel", - "_view_count": null, - "_view_module": "@jupyter-widgets/base", - "_view_module_version": "1.2.0", - "_view_name": "StyleView", - "description_width": "" - } - }, - "e6efa0c798d14790a6a57a566927bfb2": { - "model_module": "@jupyter-widgets/controls", - "model_name": "HBoxModel", - "model_module_version": "1.5.0", - "state": { - "_dom_classes": [], - "_model_module": "@jupyter-widgets/controls", - "_model_module_version": "1.5.0", - "_model_name": "HBoxModel", - "_view_count": null, - "_view_module": "@jupyter-widgets/controls", - "_view_module_version": "1.5.0", - "_view_name": "HBoxView", - "box_style": "", - "children": [ - "IPY_MODEL_c31d528f88ff43bc93b2d0284c52b308", - "IPY_MODEL_45282b1857ab4ebcb84a7cb5cd209062", - "IPY_MODEL_34a7c9e315c745bb8075547e8414a5e6" - ], - "layout": "IPY_MODEL_19c0c0c106f24040a18d74cb75d8057c" - } - }, - "c31d528f88ff43bc93b2d0284c52b308": { - "model_module": "@jupyter-widgets/controls", - "model_name": "HTMLModel", - "model_module_version": "1.5.0", - "state": { - "_dom_classes": [], - "_model_module": "@jupyter-widgets/controls", - "_model_module_version": "1.5.0", - 
"_model_name": "HTMLModel", - "_view_count": null, - "_view_module": "@jupyter-widgets/controls", - "_view_module_version": "1.5.0", - "_view_name": "HTMLView", - "description": "", - "description_tooltip": null, - "layout": "IPY_MODEL_ed201ab44f23458e9b8752c6920484f5", - "placeholder": "​", - "style": "IPY_MODEL_1f4ac8691f234d50abf1460a029403c7", - "value": "special_tokens_map.json: 100%" - } - }, - "45282b1857ab4ebcb84a7cb5cd209062": { - "model_module": "@jupyter-widgets/controls", - "model_name": "FloatProgressModel", - "model_module_version": "1.5.0", - "state": { - "_dom_classes": [], - "_model_module": "@jupyter-widgets/controls", - "_model_module_version": "1.5.0", - "_model_name": "FloatProgressModel", - "_view_count": null, - "_view_module": "@jupyter-widgets/controls", - "_view_module_version": "1.5.0", - "_view_name": "ProgressView", - "bar_style": "success", - "description": "", - "description_tooltip": null, - "layout": "IPY_MODEL_e8d3f4424c4b42e2ba4c94429830b977", - "max": 125, - "min": 0, - "orientation": "horizontal", - "style": "IPY_MODEL_28a912965c254793a1bf3624f7faa17b", - "value": 125 - } - }, - "34a7c9e315c745bb8075547e8414a5e6": { - "model_module": "@jupyter-widgets/controls", - "model_name": "HTMLModel", - "model_module_version": "1.5.0", - "state": { - "_dom_classes": [], - "_model_module": "@jupyter-widgets/controls", - "_model_module_version": "1.5.0", - "_model_name": "HTMLModel", - "_view_count": null, - "_view_module": "@jupyter-widgets/controls", - "_view_module_version": "1.5.0", - "_view_name": "HTMLView", - "description": "", - "description_tooltip": null, - "layout": "IPY_MODEL_b46c929b3e904e39ae2187a8b08c66a9", - "placeholder": "​", - "style": "IPY_MODEL_5645173b5b724ad4be483a81839fcf08", - "value": " 125/125 [00:00<00:00, 6.78kB/s]" - } - }, - "19c0c0c106f24040a18d74cb75d8057c": { - "model_module": "@jupyter-widgets/base", - "model_name": "LayoutModel", - "model_module_version": "1.2.0", - "state": { - "_model_module": "@jupyter-widgets/base", - "_model_module_version": "1.2.0", - "_model_name": "LayoutModel", - "_view_count": null, - "_view_module": "@jupyter-widgets/base", - "_view_module_version": "1.2.0", - "_view_name": "LayoutView", - "align_content": null, - "align_items": null, - "align_self": null, - "border": null, - "bottom": null, - "display": null, - "flex": null, - "flex_flow": null, - "grid_area": null, - "grid_auto_columns": null, - "grid_auto_flow": null, - "grid_auto_rows": null, - "grid_column": null, - "grid_gap": null, - "grid_row": null, - "grid_template_areas": null, - "grid_template_columns": null, - "grid_template_rows": null, - "height": null, - "justify_content": null, - "justify_items": null, - "left": null, - "margin": null, - "max_height": null, - "max_width": null, - "min_height": null, - "min_width": null, - "object_fit": null, - "object_position": null, - "order": null, - "overflow": null, - "overflow_x": null, - "overflow_y": null, - "padding": null, - "right": null, - "top": null, - "visibility": null, - "width": null - } - }, - "ed201ab44f23458e9b8752c6920484f5": { - "model_module": "@jupyter-widgets/base", - "model_name": "LayoutModel", - "model_module_version": "1.2.0", - "state": { - "_model_module": "@jupyter-widgets/base", - "_model_module_version": "1.2.0", - "_model_name": "LayoutModel", - "_view_count": null, - "_view_module": "@jupyter-widgets/base", - "_view_module_version": "1.2.0", - "_view_name": "LayoutView", - "align_content": null, - "align_items": null, - "align_self": null, - "border": null, - 
"bottom": null, - "display": null, - "flex": null, - "flex_flow": null, - "grid_area": null, - "grid_auto_columns": null, - "grid_auto_flow": null, - "grid_auto_rows": null, - "grid_column": null, - "grid_gap": null, - "grid_row": null, - "grid_template_areas": null, - "grid_template_columns": null, - "grid_template_rows": null, - "height": null, - "justify_content": null, - "justify_items": null, - "left": null, - "margin": null, - "max_height": null, - "max_width": null, - "min_height": null, - "min_width": null, - "object_fit": null, - "object_position": null, - "order": null, - "overflow": null, - "overflow_x": null, - "overflow_y": null, - "padding": null, - "right": null, - "top": null, - "visibility": null, - "width": null - } - }, - "1f4ac8691f234d50abf1460a029403c7": { - "model_module": "@jupyter-widgets/controls", - "model_name": "DescriptionStyleModel", - "model_module_version": "1.5.0", - "state": { - "_model_module": "@jupyter-widgets/controls", - "_model_module_version": "1.5.0", - "_model_name": "DescriptionStyleModel", - "_view_count": null, - "_view_module": "@jupyter-widgets/base", - "_view_module_version": "1.2.0", - "_view_name": "StyleView", - "description_width": "" - } - }, - "e8d3f4424c4b42e2ba4c94429830b977": { - "model_module": "@jupyter-widgets/base", - "model_name": "LayoutModel", - "model_module_version": "1.2.0", - "state": { - "_model_module": "@jupyter-widgets/base", - "_model_module_version": "1.2.0", - "_model_name": "LayoutModel", - "_view_count": null, - "_view_module": "@jupyter-widgets/base", - "_view_module_version": "1.2.0", - "_view_name": "LayoutView", - "align_content": null, - "align_items": null, - "align_self": null, - "border": null, - "bottom": null, - "display": null, - "flex": null, - "flex_flow": null, - "grid_area": null, - "grid_auto_columns": null, - "grid_auto_flow": null, - "grid_auto_rows": null, - "grid_column": null, - "grid_gap": null, - "grid_row": null, - "grid_template_areas": null, - "grid_template_columns": null, - "grid_template_rows": null, - "height": null, - "justify_content": null, - "justify_items": null, - "left": null, - "margin": null, - "max_height": null, - "max_width": null, - "min_height": null, - "min_width": null, - "object_fit": null, - "object_position": null, - "order": null, - "overflow": null, - "overflow_x": null, - "overflow_y": null, - "padding": null, - "right": null, - "top": null, - "visibility": null, - "width": null - } - }, - "28a912965c254793a1bf3624f7faa17b": { - "model_module": "@jupyter-widgets/controls", - "model_name": "ProgressStyleModel", - "model_module_version": "1.5.0", - "state": { - "_model_module": "@jupyter-widgets/controls", - "_model_module_version": "1.5.0", - "_model_name": "ProgressStyleModel", - "_view_count": null, - "_view_module": "@jupyter-widgets/base", - "_view_module_version": "1.2.0", - "_view_name": "StyleView", - "bar_color": null, - "description_width": "" - } - }, - "b46c929b3e904e39ae2187a8b08c66a9": { - "model_module": "@jupyter-widgets/base", - "model_name": "LayoutModel", - "model_module_version": "1.2.0", - "state": { - "_model_module": "@jupyter-widgets/base", - "_model_module_version": "1.2.0", - "_model_name": "LayoutModel", - "_view_count": null, - "_view_module": "@jupyter-widgets/base", - "_view_module_version": "1.2.0", - "_view_name": "LayoutView", - "align_content": null, - "align_items": null, - "align_self": null, - "border": null, - "bottom": null, - "display": null, - "flex": null, - "flex_flow": null, - "grid_area": null, - "grid_auto_columns": 
null, - "grid_auto_flow": null, - "grid_auto_rows": null, - "grid_column": null, - "grid_gap": null, - "grid_row": null, - "grid_template_areas": null, - "grid_template_columns": null, - "grid_template_rows": null, - "height": null, - "justify_content": null, - "justify_items": null, - "left": null, - "margin": null, - "max_height": null, - "max_width": null, - "min_height": null, - "min_width": null, - "object_fit": null, - "object_position": null, - "order": null, - "overflow": null, - "overflow_x": null, - "overflow_y": null, - "padding": null, - "right": null, - "top": null, - "visibility": null, - "width": null - } - }, - "5645173b5b724ad4be483a81839fcf08": { - "model_module": "@jupyter-widgets/controls", - "model_name": "DescriptionStyleModel", - "model_module_version": "1.5.0", - "state": { - "_model_module": "@jupyter-widgets/controls", - "_model_module_version": "1.5.0", - "_model_name": "DescriptionStyleModel", - "_view_count": null, - "_view_module": "@jupyter-widgets/base", - "_view_module_version": "1.2.0", - "_view_name": "StyleView", - "description_width": "" - } - }, - "65383a4be13a441ba688676051fe6d29": { - "model_module": "@jupyter-widgets/controls", - "model_name": "HBoxModel", - "model_module_version": "1.5.0", - "state": { - "_dom_classes": [], - "_model_module": "@jupyter-widgets/controls", - "_model_module_version": "1.5.0", - "_model_name": "HBoxModel", - "_view_count": null, - "_view_module": "@jupyter-widgets/controls", - "_view_module_version": "1.5.0", - "_view_name": "HBoxView", - "box_style": "", - "children": [ - "IPY_MODEL_d76b9f06c08b4634a51199db44c0a11c", - "IPY_MODEL_8a429e04f8c04b4cbccc2a0fbfc69f39", - "IPY_MODEL_ac5e4ffea47a4a148219e7e998b5c325" - ], - "layout": "IPY_MODEL_d1fef74b9af44ac0b3ff021292982577" - } - }, - "d76b9f06c08b4634a51199db44c0a11c": { - "model_module": "@jupyter-widgets/controls", - "model_name": "HTMLModel", - "model_module_version": "1.5.0", - "state": { - "_dom_classes": [], - "_model_module": "@jupyter-widgets/controls", - "_model_module_version": "1.5.0", - "_model_name": "HTMLModel", - "_view_count": null, - "_view_module": "@jupyter-widgets/controls", - "_view_module_version": "1.5.0", - "_view_name": "HTMLView", - "description": "", - "description_tooltip": null, - "layout": "IPY_MODEL_2d446aa187c04de79bdb5aae679799e7", - "placeholder": "​", - "style": "IPY_MODEL_b6e5248afcbd4241a6f7e08e81b52777", - "value": "1_Pooling/config.json: 100%" - } - }, - "8a429e04f8c04b4cbccc2a0fbfc69f39": { - "model_module": "@jupyter-widgets/controls", - "model_name": "FloatProgressModel", - "model_module_version": "1.5.0", - "state": { - "_dom_classes": [], - "_model_module": "@jupyter-widgets/controls", - "_model_module_version": "1.5.0", - "_model_name": "FloatProgressModel", - "_view_count": null, - "_view_module": "@jupyter-widgets/controls", - "_view_module_version": "1.5.0", - "_view_name": "ProgressView", - "bar_style": "success", - "description": "", - "description_tooltip": null, - "layout": "IPY_MODEL_594e9fa90fac4d40b1524d57304b37ee", - "max": 190, - "min": 0, - "orientation": "horizontal", - "style": "IPY_MODEL_1b0ce7f124384da48b7ed934a283a191", - "value": 190 - } - }, - "ac5e4ffea47a4a148219e7e998b5c325": { - "model_module": "@jupyter-widgets/controls", - "model_name": "HTMLModel", - "model_module_version": "1.5.0", - "state": { - "_dom_classes": [], - "_model_module": "@jupyter-widgets/controls", - "_model_module_version": "1.5.0", - "_model_name": "HTMLModel", - "_view_count": null, - "_view_module": "@jupyter-widgets/controls", - 
"_view_module_version": "1.5.0", - "_view_name": "HTMLView", - "description": "", - "description_tooltip": null, - "layout": "IPY_MODEL_f705b798938b4e919593be8fd9bbcc64", - "placeholder": "​", - "style": "IPY_MODEL_e28088caf5904fbcb81ff49cd04a1667", - "value": " 190/190 [00:00<00:00, 10.9kB/s]" - } - }, - "d1fef74b9af44ac0b3ff021292982577": { - "model_module": "@jupyter-widgets/base", - "model_name": "LayoutModel", - "model_module_version": "1.2.0", - "state": { - "_model_module": "@jupyter-widgets/base", - "_model_module_version": "1.2.0", - "_model_name": "LayoutModel", - "_view_count": null, - "_view_module": "@jupyter-widgets/base", - "_view_module_version": "1.2.0", - "_view_name": "LayoutView", - "align_content": null, - "align_items": null, - "align_self": null, - "border": null, - "bottom": null, - "display": null, - "flex": null, - "flex_flow": null, - "grid_area": null, - "grid_auto_columns": null, - "grid_auto_flow": null, - "grid_auto_rows": null, - "grid_column": null, - "grid_gap": null, - "grid_row": null, - "grid_template_areas": null, - "grid_template_columns": null, - "grid_template_rows": null, - "height": null, - "justify_content": null, - "justify_items": null, - "left": null, - "margin": null, - "max_height": null, - "max_width": null, - "min_height": null, - "min_width": null, - "object_fit": null, - "object_position": null, - "order": null, - "overflow": null, - "overflow_x": null, - "overflow_y": null, - "padding": null, - "right": null, - "top": null, - "visibility": null, - "width": null - } - }, - "2d446aa187c04de79bdb5aae679799e7": { - "model_module": "@jupyter-widgets/base", - "model_name": "LayoutModel", - "model_module_version": "1.2.0", - "state": { - "_model_module": "@jupyter-widgets/base", - "_model_module_version": "1.2.0", - "_model_name": "LayoutModel", - "_view_count": null, - "_view_module": "@jupyter-widgets/base", - "_view_module_version": "1.2.0", - "_view_name": "LayoutView", - "align_content": null, - "align_items": null, - "align_self": null, - "border": null, - "bottom": null, - "display": null, - "flex": null, - "flex_flow": null, - "grid_area": null, - "grid_auto_columns": null, - "grid_auto_flow": null, - "grid_auto_rows": null, - "grid_column": null, - "grid_gap": null, - "grid_row": null, - "grid_template_areas": null, - "grid_template_columns": null, - "grid_template_rows": null, - "height": null, - "justify_content": null, - "justify_items": null, - "left": null, - "margin": null, - "max_height": null, - "max_width": null, - "min_height": null, - "min_width": null, - "object_fit": null, - "object_position": null, - "order": null, - "overflow": null, - "overflow_x": null, - "overflow_y": null, - "padding": null, - "right": null, - "top": null, - "visibility": null, - "width": null - } - }, - "b6e5248afcbd4241a6f7e08e81b52777": { - "model_module": "@jupyter-widgets/controls", - "model_name": "DescriptionStyleModel", - "model_module_version": "1.5.0", - "state": { - "_model_module": "@jupyter-widgets/controls", - "_model_module_version": "1.5.0", - "_model_name": "DescriptionStyleModel", - "_view_count": null, - "_view_module": "@jupyter-widgets/base", - "_view_module_version": "1.2.0", - "_view_name": "StyleView", - "description_width": "" - } - }, - "594e9fa90fac4d40b1524d57304b37ee": { - "model_module": "@jupyter-widgets/base", - "model_name": "LayoutModel", - "model_module_version": "1.2.0", - "state": { - "_model_module": "@jupyter-widgets/base", - "_model_module_version": "1.2.0", - "_model_name": "LayoutModel", - "_view_count": 
null, - "_view_module": "@jupyter-widgets/base", - "_view_module_version": "1.2.0", - "_view_name": "LayoutView", - "align_content": null, - "align_items": null, - "align_self": null, - "border": null, - "bottom": null, - "display": null, - "flex": null, - "flex_flow": null, - "grid_area": null, - "grid_auto_columns": null, - "grid_auto_flow": null, - "grid_auto_rows": null, - "grid_column": null, - "grid_gap": null, - "grid_row": null, - "grid_template_areas": null, - "grid_template_columns": null, - "grid_template_rows": null, - "height": null, - "justify_content": null, - "justify_items": null, - "left": null, - "margin": null, - "max_height": null, - "max_width": null, - "min_height": null, - "min_width": null, - "object_fit": null, - "object_position": null, - "order": null, - "overflow": null, - "overflow_x": null, - "overflow_y": null, - "padding": null, - "right": null, - "top": null, - "visibility": null, - "width": null - } - }, - "1b0ce7f124384da48b7ed934a283a191": { - "model_module": "@jupyter-widgets/controls", - "model_name": "ProgressStyleModel", - "model_module_version": "1.5.0", - "state": { - "_model_module": "@jupyter-widgets/controls", - "_model_module_version": "1.5.0", - "_model_name": "ProgressStyleModel", - "_view_count": null, - "_view_module": "@jupyter-widgets/base", - "_view_module_version": "1.2.0", - "_view_name": "StyleView", - "bar_color": null, - "description_width": "" - } - }, - "f705b798938b4e919593be8fd9bbcc64": { - "model_module": "@jupyter-widgets/base", - "model_name": "LayoutModel", - "model_module_version": "1.2.0", - "state": { - "_model_module": "@jupyter-widgets/base", - "_model_module_version": "1.2.0", - "_model_name": "LayoutModel", - "_view_count": null, - "_view_module": "@jupyter-widgets/base", - "_view_module_version": "1.2.0", - "_view_name": "LayoutView", - "align_content": null, - "align_items": null, - "align_self": null, - "border": null, - "bottom": null, - "display": null, - "flex": null, - "flex_flow": null, - "grid_area": null, - "grid_auto_columns": null, - "grid_auto_flow": null, - "grid_auto_rows": null, - "grid_column": null, - "grid_gap": null, - "grid_row": null, - "grid_template_areas": null, - "grid_template_columns": null, - "grid_template_rows": null, - "height": null, - "justify_content": null, - "justify_items": null, - "left": null, - "margin": null, - "max_height": null, - "max_width": null, - "min_height": null, - "min_width": null, - "object_fit": null, - "object_position": null, - "order": null, - "overflow": null, - "overflow_x": null, - "overflow_y": null, - "padding": null, - "right": null, - "top": null, - "visibility": null, - "width": null - } - }, - "e28088caf5904fbcb81ff49cd04a1667": { - "model_module": "@jupyter-widgets/controls", - "model_name": "DescriptionStyleModel", - "model_module_version": "1.5.0", - "state": { - "_model_module": "@jupyter-widgets/controls", - "_model_module_version": "1.5.0", - "_model_name": "DescriptionStyleModel", - "_view_count": null, - "_view_module": "@jupyter-widgets/base", - "_view_module_version": "1.2.0", - "_view_name": "StyleView", - "description_width": "" - } - }, - "2e1ee91dac5c4299aecf906b33789fcf": { - "model_module": "@jupyter-widgets/controls", - "model_name": "HBoxModel", - "model_module_version": "1.5.0", - "state": { - "_dom_classes": [], - "_model_module": "@jupyter-widgets/controls", - "_model_module_version": "1.5.0", - "_model_name": "HBoxModel", - "_view_count": null, - "_view_module": "@jupyter-widgets/controls", - "_view_module_version": "1.5.0", - 
"_view_name": "HBoxView", - "box_style": "", - "children": [ - "IPY_MODEL_fdf91a039db749b38122ac0eb6eab108", - "IPY_MODEL_77b13df8a8504556b51744b1f02603ee", - "IPY_MODEL_298c3d1f6a934aa6b15a3b665a066eb6" - ], - "layout": "IPY_MODEL_3f1f7478f226408e964bea5adf304f21" - } - }, - "fdf91a039db749b38122ac0eb6eab108": { - "model_module": "@jupyter-widgets/controls", - "model_name": "HTMLModel", - "model_module_version": "1.5.0", - "state": { - "_dom_classes": [], - "_model_module": "@jupyter-widgets/controls", - "_model_module_version": "1.5.0", - "_model_name": "HTMLModel", - "_view_count": null, - "_view_module": "@jupyter-widgets/controls", - "_view_module_version": "1.5.0", - "_view_name": "HTMLView", - "description": "", - "description_tooltip": null, - "layout": "IPY_MODEL_73768d82de1e4b979a7947f600fe9d09", - "placeholder": "​", - "style": "IPY_MODEL_b47a97e2f1b8413b9a705c173d6c896b", - "value": "config.json: 100%" - } - }, - "77b13df8a8504556b51744b1f02603ee": { - "model_module": "@jupyter-widgets/controls", - "model_name": "FloatProgressModel", - "model_module_version": "1.5.0", - "state": { - "_dom_classes": [], - "_model_module": "@jupyter-widgets/controls", - "_model_module_version": "1.5.0", - "_model_name": "FloatProgressModel", - "_view_count": null, - "_view_module": "@jupyter-widgets/controls", - "_view_module_version": "1.5.0", - "_view_name": "ProgressView", - "bar_style": "success", - "description": "", - "description_tooltip": null, - "layout": "IPY_MODEL_d5cff24e326f4b31b666c7aaf70e92f4", - "max": 654, - "min": 0, - "orientation": "horizontal", - "style": "IPY_MODEL_4780eb726783487892cddce619057efe", - "value": 654 - } - }, - "298c3d1f6a934aa6b15a3b665a066eb6": { - "model_module": "@jupyter-widgets/controls", - "model_name": "HTMLModel", - "model_module_version": "1.5.0", - "state": { - "_dom_classes": [], - "_model_module": "@jupyter-widgets/controls", - "_model_module_version": "1.5.0", - "_model_name": "HTMLModel", - "_view_count": null, - "_view_module": "@jupyter-widgets/controls", - "_view_module_version": "1.5.0", - "_view_name": "HTMLView", - "description": "", - "description_tooltip": null, - "layout": "IPY_MODEL_5eacda4f75df44c88e3cec16c147fd0e", - "placeholder": "​", - "style": "IPY_MODEL_aa8fb50e8b524ff58f0493671fb0ba37", - "value": " 654/654 [00:00<00:00, 45.2kB/s]" - } - }, - "3f1f7478f226408e964bea5adf304f21": { - "model_module": "@jupyter-widgets/base", - "model_name": "LayoutModel", - "model_module_version": "1.2.0", - "state": { - "_model_module": "@jupyter-widgets/base", - "_model_module_version": "1.2.0", - "_model_name": "LayoutModel", - "_view_count": null, - "_view_module": "@jupyter-widgets/base", - "_view_module_version": "1.2.0", - "_view_name": "LayoutView", - "align_content": null, - "align_items": null, - "align_self": null, - "border": null, - "bottom": null, - "display": null, - "flex": null, - "flex_flow": null, - "grid_area": null, - "grid_auto_columns": null, - "grid_auto_flow": null, - "grid_auto_rows": null, - "grid_column": null, - "grid_gap": null, - "grid_row": null, - "grid_template_areas": null, - "grid_template_columns": null, - "grid_template_rows": null, - "height": null, - "justify_content": null, - "justify_items": null, - "left": null, - "margin": null, - "max_height": null, - "max_width": null, - "min_height": null, - "min_width": null, - "object_fit": null, - "object_position": null, - "order": null, - "overflow": null, - "overflow_x": null, - "overflow_y": null, - "padding": null, - "right": null, - "top": null, - 
"visibility": null, - "width": null - } - }, - "73768d82de1e4b979a7947f600fe9d09": { - "model_module": "@jupyter-widgets/base", - "model_name": "LayoutModel", - "model_module_version": "1.2.0", - "state": { - "_model_module": "@jupyter-widgets/base", - "_model_module_version": "1.2.0", - "_model_name": "LayoutModel", - "_view_count": null, - "_view_module": "@jupyter-widgets/base", - "_view_module_version": "1.2.0", - "_view_name": "LayoutView", - "align_content": null, - "align_items": null, - "align_self": null, - "border": null, - "bottom": null, - "display": null, - "flex": null, - "flex_flow": null, - "grid_area": null, - "grid_auto_columns": null, - "grid_auto_flow": null, - "grid_auto_rows": null, - "grid_column": null, - "grid_gap": null, - "grid_row": null, - "grid_template_areas": null, - "grid_template_columns": null, - "grid_template_rows": null, - "height": null, - "justify_content": null, - "justify_items": null, - "left": null, - "margin": null, - "max_height": null, - "max_width": null, - "min_height": null, - "min_width": null, - "object_fit": null, - "object_position": null, - "order": null, - "overflow": null, - "overflow_x": null, - "overflow_y": null, - "padding": null, - "right": null, - "top": null, - "visibility": null, - "width": null - } - }, - "b47a97e2f1b8413b9a705c173d6c896b": { - "model_module": "@jupyter-widgets/controls", - "model_name": "DescriptionStyleModel", - "model_module_version": "1.5.0", - "state": { - "_model_module": "@jupyter-widgets/controls", - "_model_module_version": "1.5.0", - "_model_name": "DescriptionStyleModel", - "_view_count": null, - "_view_module": "@jupyter-widgets/base", - "_view_module_version": "1.2.0", - "_view_name": "StyleView", - "description_width": "" - } - }, - "d5cff24e326f4b31b666c7aaf70e92f4": { - "model_module": "@jupyter-widgets/base", - "model_name": "LayoutModel", - "model_module_version": "1.2.0", - "state": { - "_model_module": "@jupyter-widgets/base", - "_model_module_version": "1.2.0", - "_model_name": "LayoutModel", - "_view_count": null, - "_view_module": "@jupyter-widgets/base", - "_view_module_version": "1.2.0", - "_view_name": "LayoutView", - "align_content": null, - "align_items": null, - "align_self": null, - "border": null, - "bottom": null, - "display": null, - "flex": null, - "flex_flow": null, - "grid_area": null, - "grid_auto_columns": null, - "grid_auto_flow": null, - "grid_auto_rows": null, - "grid_column": null, - "grid_gap": null, - "grid_row": null, - "grid_template_areas": null, - "grid_template_columns": null, - "grid_template_rows": null, - "height": null, - "justify_content": null, - "justify_items": null, - "left": null, - "margin": null, - "max_height": null, - "max_width": null, - "min_height": null, - "min_width": null, - "object_fit": null, - "object_position": null, - "order": null, - "overflow": null, - "overflow_x": null, - "overflow_y": null, - "padding": null, - "right": null, - "top": null, - "visibility": null, - "width": null - } - }, - "4780eb726783487892cddce619057efe": { - "model_module": "@jupyter-widgets/controls", - "model_name": "ProgressStyleModel", - "model_module_version": "1.5.0", - "state": { - "_model_module": "@jupyter-widgets/controls", - "_model_module_version": "1.5.0", - "_model_name": "ProgressStyleModel", - "_view_count": null, - "_view_module": "@jupyter-widgets/base", - "_view_module_version": "1.2.0", - "_view_name": "StyleView", - "bar_color": null, - "description_width": "" - } - }, - "5eacda4f75df44c88e3cec16c147fd0e": { - "model_module": 
"@jupyter-widgets/base", - "model_name": "LayoutModel", - "model_module_version": "1.2.0", - "state": { - "_model_module": "@jupyter-widgets/base", - "_model_module_version": "1.2.0", - "_model_name": "LayoutModel", - "_view_count": null, - "_view_module": "@jupyter-widgets/base", - "_view_module_version": "1.2.0", - "_view_name": "LayoutView", - "align_content": null, - "align_items": null, - "align_self": null, - "border": null, - "bottom": null, - "display": null, - "flex": null, - "flex_flow": null, - "grid_area": null, - "grid_auto_columns": null, - "grid_auto_flow": null, - "grid_auto_rows": null, - "grid_column": null, - "grid_gap": null, - "grid_row": null, - "grid_template_areas": null, - "grid_template_columns": null, - "grid_template_rows": null, - "height": null, - "justify_content": null, - "justify_items": null, - "left": null, - "margin": null, - "max_height": null, - "max_width": null, - "min_height": null, - "min_width": null, - "object_fit": null, - "object_position": null, - "order": null, - "overflow": null, - "overflow_x": null, - "overflow_y": null, - "padding": null, - "right": null, - "top": null, - "visibility": null, - "width": null - } - }, - "aa8fb50e8b524ff58f0493671fb0ba37": { - "model_module": "@jupyter-widgets/controls", - "model_name": "DescriptionStyleModel", - "model_module_version": "1.5.0", - "state": { - "_model_module": "@jupyter-widgets/controls", - "_model_module_version": "1.5.0", - "_model_name": "DescriptionStyleModel", - "_view_count": null, - "_view_module": "@jupyter-widgets/base", - "_view_module_version": "1.2.0", - "_view_name": "StyleView", - "description_width": "" - } - }, - "fd6ee259a61d45399d3bddbff62e08fa": { - "model_module": "@jupyter-widgets/controls", - "model_name": "HBoxModel", - "model_module_version": "1.5.0", - "state": { - "_dom_classes": [], - "_model_module": "@jupyter-widgets/controls", - "_model_module_version": "1.5.0", - "_model_name": "HBoxModel", - "_view_count": null, - "_view_module": "@jupyter-widgets/controls", - "_view_module_version": "1.5.0", - "_view_name": "HBoxView", - "box_style": "", - "children": [ - "IPY_MODEL_67506c1cff4e4f02ab15b05379debe89", - "IPY_MODEL_697705f63b4e4234bec3dff82f9a668f", - "IPY_MODEL_12e047d39fa545ac9ecfd900e88fba68" - ], - "layout": "IPY_MODEL_14cd39e47c764792a1362259532fd09b" - } - }, - "67506c1cff4e4f02ab15b05379debe89": { - "model_module": "@jupyter-widgets/controls", - "model_name": "HTMLModel", - "model_module_version": "1.5.0", - "state": { - "_dom_classes": [], - "_model_module": "@jupyter-widgets/controls", - "_model_module_version": "1.5.0", - "_model_name": "HTMLModel", - "_view_count": null, - "_view_module": "@jupyter-widgets/controls", - "_view_module_version": "1.5.0", - "_view_name": "HTMLView", - "description": "", - "description_tooltip": null, - "layout": "IPY_MODEL_875064682fde47e89d090f9dea7f2a9e", - "placeholder": "​", - "style": "IPY_MODEL_1c5fc66b15d644768d328fc67d6d8c94", - "value": "model.safetensors.index.json: 100%" - } - }, - "697705f63b4e4234bec3dff82f9a668f": { - "model_module": "@jupyter-widgets/controls", - "model_name": "FloatProgressModel", - "model_module_version": "1.5.0", - "state": { - "_dom_classes": [], - "_model_module": "@jupyter-widgets/controls", - "_model_module_version": "1.5.0", - "_model_name": "FloatProgressModel", - "_view_count": null, - "_view_module": "@jupyter-widgets/controls", - "_view_module_version": "1.5.0", - "_view_name": "ProgressView", - "bar_style": "success", - "description": "", - "description_tooltip": null, - 
"layout": "IPY_MODEL_e4a90f0ea3bc4c8b958ed923c104b3d6", - "max": 23950, - "min": 0, - "orientation": "horizontal", - "style": "IPY_MODEL_afeca83368d44e4fa29d882a8f9cc0a5", - "value": 23950 - } - }, - "12e047d39fa545ac9ecfd900e88fba68": { - "model_module": "@jupyter-widgets/controls", - "model_name": "HTMLModel", - "model_module_version": "1.5.0", - "state": { - "_dom_classes": [], - "_model_module": "@jupyter-widgets/controls", - "_model_module_version": "1.5.0", - "_model_name": "HTMLModel", - "_view_count": null, - "_view_module": "@jupyter-widgets/controls", - "_view_module_version": "1.5.0", - "_view_name": "HTMLView", - "description": "", - "description_tooltip": null, - "layout": "IPY_MODEL_5ff02061baf14cc69b6e99fa3375cc9b", - "placeholder": "​", - "style": "IPY_MODEL_6996f20968ca4c16af50df81f148355c", - "value": " 23.9k/23.9k [00:00<00:00, 1.92MB/s]" - } - }, - "14cd39e47c764792a1362259532fd09b": { - "model_module": "@jupyter-widgets/base", - "model_name": "LayoutModel", - "model_module_version": "1.2.0", - "state": { - "_model_module": "@jupyter-widgets/base", - "_model_module_version": "1.2.0", - "_model_name": "LayoutModel", - "_view_count": null, - "_view_module": "@jupyter-widgets/base", - "_view_module_version": "1.2.0", - "_view_name": "LayoutView", - "align_content": null, - "align_items": null, - "align_self": null, - "border": null, - "bottom": null, - "display": null, - "flex": null, - "flex_flow": null, - "grid_area": null, - "grid_auto_columns": null, - "grid_auto_flow": null, - "grid_auto_rows": null, - "grid_column": null, - "grid_gap": null, - "grid_row": null, - "grid_template_areas": null, - "grid_template_columns": null, - "grid_template_rows": null, - "height": null, - "justify_content": null, - "justify_items": null, - "left": null, - "margin": null, - "max_height": null, - "max_width": null, - "min_height": null, - "min_width": null, - "object_fit": null, - "object_position": null, - "order": null, - "overflow": null, - "overflow_x": null, - "overflow_y": null, - "padding": null, - "right": null, - "top": null, - "visibility": null, - "width": null - } - }, - "875064682fde47e89d090f9dea7f2a9e": { - "model_module": "@jupyter-widgets/base", - "model_name": "LayoutModel", - "model_module_version": "1.2.0", - "state": { - "_model_module": "@jupyter-widgets/base", - "_model_module_version": "1.2.0", - "_model_name": "LayoutModel", - "_view_count": null, - "_view_module": "@jupyter-widgets/base", - "_view_module_version": "1.2.0", - "_view_name": "LayoutView", - "align_content": null, - "align_items": null, - "align_self": null, - "border": null, - "bottom": null, - "display": null, - "flex": null, - "flex_flow": null, - "grid_area": null, - "grid_auto_columns": null, - "grid_auto_flow": null, - "grid_auto_rows": null, - "grid_column": null, - "grid_gap": null, - "grid_row": null, - "grid_template_areas": null, - "grid_template_columns": null, - "grid_template_rows": null, - "height": null, - "justify_content": null, - "justify_items": null, - "left": null, - "margin": null, - "max_height": null, - "max_width": null, - "min_height": null, - "min_width": null, - "object_fit": null, - "object_position": null, - "order": null, - "overflow": null, - "overflow_x": null, - "overflow_y": null, - "padding": null, - "right": null, - "top": null, - "visibility": null, - "width": null - } - }, - "1c5fc66b15d644768d328fc67d6d8c94": { - "model_module": "@jupyter-widgets/controls", - "model_name": "DescriptionStyleModel", - "model_module_version": "1.5.0", - "state": { - 
"_model_module": "@jupyter-widgets/controls", - "_model_module_version": "1.5.0", - "_model_name": "DescriptionStyleModel", - "_view_count": null, - "_view_module": "@jupyter-widgets/base", - "_view_module_version": "1.2.0", - "_view_name": "StyleView", - "description_width": "" - } - }, - "e4a90f0ea3bc4c8b958ed923c104b3d6": { - "model_module": "@jupyter-widgets/base", - "model_name": "LayoutModel", - "model_module_version": "1.2.0", - "state": { - "_model_module": "@jupyter-widgets/base", - "_model_module_version": "1.2.0", - "_model_name": "LayoutModel", - "_view_count": null, - "_view_module": "@jupyter-widgets/base", - "_view_module_version": "1.2.0", - "_view_name": "LayoutView", - "align_content": null, - "align_items": null, - "align_self": null, - "border": null, - "bottom": null, - "display": null, - "flex": null, - "flex_flow": null, - "grid_area": null, - "grid_auto_columns": null, - "grid_auto_flow": null, - "grid_auto_rows": null, - "grid_column": null, - "grid_gap": null, - "grid_row": null, - "grid_template_areas": null, - "grid_template_columns": null, - "grid_template_rows": null, - "height": null, - "justify_content": null, - "justify_items": null, - "left": null, - "margin": null, - "max_height": null, - "max_width": null, - "min_height": null, - "min_width": null, - "object_fit": null, - "object_position": null, - "order": null, - "overflow": null, - "overflow_x": null, - "overflow_y": null, - "padding": null, - "right": null, - "top": null, - "visibility": null, - "width": null - } - }, - "afeca83368d44e4fa29d882a8f9cc0a5": { - "model_module": "@jupyter-widgets/controls", - "model_name": "ProgressStyleModel", - "model_module_version": "1.5.0", - "state": { - "_model_module": "@jupyter-widgets/controls", - "_model_module_version": "1.5.0", - "_model_name": "ProgressStyleModel", - "_view_count": null, - "_view_module": "@jupyter-widgets/base", - "_view_module_version": "1.2.0", - "_view_name": "StyleView", - "bar_color": null, - "description_width": "" - } - }, - "5ff02061baf14cc69b6e99fa3375cc9b": { - "model_module": "@jupyter-widgets/base", - "model_name": "LayoutModel", - "model_module_version": "1.2.0", - "state": { - "_model_module": "@jupyter-widgets/base", - "_model_module_version": "1.2.0", - "_model_name": "LayoutModel", - "_view_count": null, - "_view_module": "@jupyter-widgets/base", - "_view_module_version": "1.2.0", - "_view_name": "LayoutView", - "align_content": null, - "align_items": null, - "align_self": null, - "border": null, - "bottom": null, - "display": null, - "flex": null, - "flex_flow": null, - "grid_area": null, - "grid_auto_columns": null, - "grid_auto_flow": null, - "grid_auto_rows": null, - "grid_column": null, - "grid_gap": null, - "grid_row": null, - "grid_template_areas": null, - "grid_template_columns": null, - "grid_template_rows": null, - "height": null, - "justify_content": null, - "justify_items": null, - "left": null, - "margin": null, - "max_height": null, - "max_width": null, - "min_height": null, - "min_width": null, - "object_fit": null, - "object_position": null, - "order": null, - "overflow": null, - "overflow_x": null, - "overflow_y": null, - "padding": null, - "right": null, - "top": null, - "visibility": null, - "width": null - } - }, - "6996f20968ca4c16af50df81f148355c": { - "model_module": "@jupyter-widgets/controls", - "model_name": "DescriptionStyleModel", - "model_module_version": "1.5.0", - "state": { - "_model_module": "@jupyter-widgets/controls", - "_model_module_version": "1.5.0", - "_model_name": 
"DescriptionStyleModel", - "_view_count": null, - "_view_module": "@jupyter-widgets/base", - "_view_module_version": "1.2.0", - "_view_name": "StyleView", - "description_width": "" - } - }, - "d80a990ec240484c8b3e16bcd1165320": { - "model_module": "@jupyter-widgets/controls", - "model_name": "HBoxModel", - "model_module_version": "1.5.0", - "state": { - "_dom_classes": [], - "_model_module": "@jupyter-widgets/controls", - "_model_module_version": "1.5.0", - "_model_name": "HBoxModel", - "_view_count": null, - "_view_module": "@jupyter-widgets/controls", - "_view_module_version": "1.5.0", - "_view_name": "HBoxView", - "box_style": "", - "children": [ - "IPY_MODEL_21fb6035bff44eb9809e022af0bb7104", - "IPY_MODEL_b65f406dcd7b48d29f6d234398f9e2c1", - "IPY_MODEL_089ed96280c64a0588d48e04f92c183a" - ], - "layout": "IPY_MODEL_c3ac25e2171b40c49e681bf7b09dc1d2" - } - }, - "21fb6035bff44eb9809e022af0bb7104": { - "model_module": "@jupyter-widgets/controls", - "model_name": "HTMLModel", - "model_module_version": "1.5.0", - "state": { - "_dom_classes": [], - "_model_module": "@jupyter-widgets/controls", - "_model_module_version": "1.5.0", - "_model_name": "HTMLModel", - "_view_count": null, - "_view_module": "@jupyter-widgets/controls", - "_view_module_version": "1.5.0", - "_view_name": "HTMLView", - "description": "", - "description_tooltip": null, - "layout": "IPY_MODEL_5c342c9d3b364194888115758819b7a0", - "placeholder": "​", - "style": "IPY_MODEL_22b8f6c8a73b42f183a9041e3fe61f14", - "value": "Downloading shards: 100%" - } - }, - "b65f406dcd7b48d29f6d234398f9e2c1": { - "model_module": "@jupyter-widgets/controls", - "model_name": "FloatProgressModel", - "model_module_version": "1.5.0", - "state": { - "_dom_classes": [], - "_model_module": "@jupyter-widgets/controls", - "_model_module_version": "1.5.0", - "_model_name": "FloatProgressModel", - "_view_count": null, - "_view_module": "@jupyter-widgets/controls", - "_view_module_version": "1.5.0", - "_view_name": "ProgressView", - "bar_style": "success", - "description": "", - "description_tooltip": null, - "layout": "IPY_MODEL_fdb89e3a02c94b9988db666725fb5cc0", - "max": 4, - "min": 0, - "orientation": "horizontal", - "style": "IPY_MODEL_d911fdac4d3e43bb801e4ae81f8f108d", - "value": 4 - } - }, - "089ed96280c64a0588d48e04f92c183a": { - "model_module": "@jupyter-widgets/controls", - "model_name": "HTMLModel", - "model_module_version": "1.5.0", - "state": { - "_dom_classes": [], - "_model_module": "@jupyter-widgets/controls", - "_model_module_version": "1.5.0", - "_model_name": "HTMLModel", - "_view_count": null, - "_view_module": "@jupyter-widgets/controls", - "_view_module_version": "1.5.0", - "_view_name": "HTMLView", - "description": "", - "description_tooltip": null, - "layout": "IPY_MODEL_89b20da2defb418299dd1b4b41f3414a", - "placeholder": "​", - "style": "IPY_MODEL_8d6d4222b2c348bf8f79031dba831dae", - "value": " 4/4 [02:40<00:00, 35.62s/it]" - } - }, - "c3ac25e2171b40c49e681bf7b09dc1d2": { - "model_module": "@jupyter-widgets/base", - "model_name": "LayoutModel", - "model_module_version": "1.2.0", - "state": { - "_model_module": "@jupyter-widgets/base", - "_model_module_version": "1.2.0", - "_model_name": "LayoutModel", - "_view_count": null, - "_view_module": "@jupyter-widgets/base", - "_view_module_version": "1.2.0", - "_view_name": "LayoutView", - "align_content": null, - "align_items": null, - "align_self": null, - "border": null, - "bottom": null, - "display": null, - "flex": null, - "flex_flow": null, - "grid_area": null, - "grid_auto_columns": 
null, - "grid_auto_flow": null, - "grid_auto_rows": null, - "grid_column": null, - "grid_gap": null, - "grid_row": null, - "grid_template_areas": null, - "grid_template_columns": null, - "grid_template_rows": null, - "height": null, - "justify_content": null, - "justify_items": null, - "left": null, - "margin": null, - "max_height": null, - "max_width": null, - "min_height": null, - "min_width": null, - "object_fit": null, - "object_position": null, - "order": null, - "overflow": null, - "overflow_x": null, - "overflow_y": null, - "padding": null, - "right": null, - "top": null, - "visibility": null, - "width": null - } - }, - "5c342c9d3b364194888115758819b7a0": { - "model_module": "@jupyter-widgets/base", - "model_name": "LayoutModel", - "model_module_version": "1.2.0", - "state": { - "_model_module": "@jupyter-widgets/base", - "_model_module_version": "1.2.0", - "_model_name": "LayoutModel", - "_view_count": null, - "_view_module": "@jupyter-widgets/base", - "_view_module_version": "1.2.0", - "_view_name": "LayoutView", - "align_content": null, - "align_items": null, - "align_self": null, - "border": null, - "bottom": null, - "display": null, - "flex": null, - "flex_flow": null, - "grid_area": null, - "grid_auto_columns": null, - "grid_auto_flow": null, - "grid_auto_rows": null, - "grid_column": null, - "grid_gap": null, - "grid_row": null, - "grid_template_areas": null, - "grid_template_columns": null, - "grid_template_rows": null, - "height": null, - "justify_content": null, - "justify_items": null, - "left": null, - "margin": null, - "max_height": null, - "max_width": null, - "min_height": null, - "min_width": null, - "object_fit": null, - "object_position": null, - "order": null, - "overflow": null, - "overflow_x": null, - "overflow_y": null, - "padding": null, - "right": null, - "top": null, - "visibility": null, - "width": null - } - }, - "22b8f6c8a73b42f183a9041e3fe61f14": { - "model_module": "@jupyter-widgets/controls", - "model_name": "DescriptionStyleModel", - "model_module_version": "1.5.0", - "state": { - "_model_module": "@jupyter-widgets/controls", - "_model_module_version": "1.5.0", - "_model_name": "DescriptionStyleModel", - "_view_count": null, - "_view_module": "@jupyter-widgets/base", - "_view_module_version": "1.2.0", - "_view_name": "StyleView", - "description_width": "" - } - }, - "fdb89e3a02c94b9988db666725fb5cc0": { - "model_module": "@jupyter-widgets/base", - "model_name": "LayoutModel", - "model_module_version": "1.2.0", - "state": { - "_model_module": "@jupyter-widgets/base", - "_model_module_version": "1.2.0", - "_model_name": "LayoutModel", - "_view_count": null, - "_view_module": "@jupyter-widgets/base", - "_view_module_version": "1.2.0", - "_view_name": "LayoutView", - "align_content": null, - "align_items": null, - "align_self": null, - "border": null, - "bottom": null, - "display": null, - "flex": null, - "flex_flow": null, - "grid_area": null, - "grid_auto_columns": null, - "grid_auto_flow": null, - "grid_auto_rows": null, - "grid_column": null, - "grid_gap": null, - "grid_row": null, - "grid_template_areas": null, - "grid_template_columns": null, - "grid_template_rows": null, - "height": null, - "justify_content": null, - "justify_items": null, - "left": null, - "margin": null, - "max_height": null, - "max_width": null, - "min_height": null, - "min_width": null, - "object_fit": null, - "object_position": null, - "order": null, - "overflow": null, - "overflow_x": null, - "overflow_y": null, - "padding": null, - "right": null, - "top": null, - 
"visibility": null, - "width": null - } - }, - "d911fdac4d3e43bb801e4ae81f8f108d": { - "model_module": "@jupyter-widgets/controls", - "model_name": "ProgressStyleModel", - "model_module_version": "1.5.0", - "state": { - "_model_module": "@jupyter-widgets/controls", - "_model_module_version": "1.5.0", - "_model_name": "ProgressStyleModel", - "_view_count": null, - "_view_module": "@jupyter-widgets/base", - "_view_module_version": "1.2.0", - "_view_name": "StyleView", - "bar_color": null, - "description_width": "" - } - }, - "89b20da2defb418299dd1b4b41f3414a": { - "model_module": "@jupyter-widgets/base", - "model_name": "LayoutModel", - "model_module_version": "1.2.0", - "state": { - "_model_module": "@jupyter-widgets/base", - "_model_module_version": "1.2.0", - "_model_name": "LayoutModel", - "_view_count": null, - "_view_module": "@jupyter-widgets/base", - "_view_module_version": "1.2.0", - "_view_name": "LayoutView", - "align_content": null, - "align_items": null, - "align_self": null, - "border": null, - "bottom": null, - "display": null, - "flex": null, - "flex_flow": null, - "grid_area": null, - "grid_auto_columns": null, - "grid_auto_flow": null, - "grid_auto_rows": null, - "grid_column": null, - "grid_gap": null, - "grid_row": null, - "grid_template_areas": null, - "grid_template_columns": null, - "grid_template_rows": null, - "height": null, - "justify_content": null, - "justify_items": null, - "left": null, - "margin": null, - "max_height": null, - "max_width": null, - "min_height": null, - "min_width": null, - "object_fit": null, - "object_position": null, - "order": null, - "overflow": null, - "overflow_x": null, - "overflow_y": null, - "padding": null, - "right": null, - "top": null, - "visibility": null, - "width": null - } - }, - "8d6d4222b2c348bf8f79031dba831dae": { - "model_module": "@jupyter-widgets/controls", - "model_name": "DescriptionStyleModel", - "model_module_version": "1.5.0", - "state": { - "_model_module": "@jupyter-widgets/controls", - "_model_module_version": "1.5.0", - "_model_name": "DescriptionStyleModel", - "_view_count": null, - "_view_module": "@jupyter-widgets/base", - "_view_module_version": "1.2.0", - "_view_name": "StyleView", - "description_width": "" - } - }, - "3d4bd9fc29094069989a9c53ceb129b9": { - "model_module": "@jupyter-widgets/controls", - "model_name": "HBoxModel", - "model_module_version": "1.5.0", - "state": { - "_dom_classes": [], - "_model_module": "@jupyter-widgets/controls", - "_model_module_version": "1.5.0", - "_model_name": "HBoxModel", - "_view_count": null, - "_view_module": "@jupyter-widgets/controls", - "_view_module_version": "1.5.0", - "_view_name": "HBoxView", - "box_style": "", - "children": [ - "IPY_MODEL_566d2392323746d6ad976937b2a9da80", - "IPY_MODEL_29c85a59881f4a57b71d2a707b7b7fb8", - "IPY_MODEL_486f67e06ce64b019f17053f0a6d410c" - ], - "layout": "IPY_MODEL_57fd623252fe46d7a89b5f404a61da1a" - } - }, - "566d2392323746d6ad976937b2a9da80": { - "model_module": "@jupyter-widgets/controls", - "model_name": "HTMLModel", - "model_module_version": "1.5.0", - "state": { - "_dom_classes": [], - "_model_module": "@jupyter-widgets/controls", - "_model_module_version": "1.5.0", - "_model_name": "HTMLModel", - "_view_count": null, - "_view_module": "@jupyter-widgets/controls", - "_view_module_version": "1.5.0", - "_view_name": "HTMLView", - "description": "", - "description_tooltip": null, - "layout": "IPY_MODEL_a3b3b8175680410483eb988ed1c6bffa", - "placeholder": "​", - "style": "IPY_MODEL_80e83a9fd2c2402c8eed6a659938acf6", - "value": 
"model-00001-of-00004.safetensors: 100%" - } - }, - "29c85a59881f4a57b71d2a707b7b7fb8": { - "model_module": "@jupyter-widgets/controls", - "model_name": "FloatProgressModel", - "model_module_version": "1.5.0", - "state": { - "_dom_classes": [], - "_model_module": "@jupyter-widgets/controls", - "_model_module_version": "1.5.0", - "_model_name": "FloatProgressModel", - "_view_count": null, - "_view_module": "@jupyter-widgets/controls", - "_view_module_version": "1.5.0", - "_view_name": "ProgressView", - "bar_style": "success", - "description": "", - "description_tooltip": null, - "layout": "IPY_MODEL_9aff0e55c7a24c618c998804a769669f", - "max": 4976698672, - "min": 0, - "orientation": "horizontal", - "style": "IPY_MODEL_bcd1f2cdb488417a888a9ddce142dcdd", - "value": 4976698672 - } - }, - "486f67e06ce64b019f17053f0a6d410c": { - "model_module": "@jupyter-widgets/controls", - "model_name": "HTMLModel", - "model_module_version": "1.5.0", - "state": { - "_dom_classes": [], - "_model_module": "@jupyter-widgets/controls", - "_model_module_version": "1.5.0", - "_model_name": "HTMLModel", - "_view_count": null, - "_view_module": "@jupyter-widgets/controls", - "_view_module_version": "1.5.0", - "_view_name": "HTMLView", - "description": "", - "description_tooltip": null, - "layout": "IPY_MODEL_516bb336ed96406499d23a2b86ed6a9c", - "placeholder": "​", - "style": "IPY_MODEL_bb561f17393947ef833a3208abf83b00", - "value": " 4.98G/4.98G [00:43<00:00, 174MB/s]" - } - }, - "57fd623252fe46d7a89b5f404a61da1a": { - "model_module": "@jupyter-widgets/base", - "model_name": "LayoutModel", - "model_module_version": "1.2.0", - "state": { - "_model_module": "@jupyter-widgets/base", - "_model_module_version": "1.2.0", - "_model_name": "LayoutModel", - "_view_count": null, - "_view_module": "@jupyter-widgets/base", - "_view_module_version": "1.2.0", - "_view_name": "LayoutView", - "align_content": null, - "align_items": null, - "align_self": null, - "border": null, - "bottom": null, - "display": null, - "flex": null, - "flex_flow": null, - "grid_area": null, - "grid_auto_columns": null, - "grid_auto_flow": null, - "grid_auto_rows": null, - "grid_column": null, - "grid_gap": null, - "grid_row": null, - "grid_template_areas": null, - "grid_template_columns": null, - "grid_template_rows": null, - "height": null, - "justify_content": null, - "justify_items": null, - "left": null, - "margin": null, - "max_height": null, - "max_width": null, - "min_height": null, - "min_width": null, - "object_fit": null, - "object_position": null, - "order": null, - "overflow": null, - "overflow_x": null, - "overflow_y": null, - "padding": null, - "right": null, - "top": null, - "visibility": null, - "width": null - } - }, - "a3b3b8175680410483eb988ed1c6bffa": { - "model_module": "@jupyter-widgets/base", - "model_name": "LayoutModel", - "model_module_version": "1.2.0", - "state": { - "_model_module": "@jupyter-widgets/base", - "_model_module_version": "1.2.0", - "_model_name": "LayoutModel", - "_view_count": null, - "_view_module": "@jupyter-widgets/base", - "_view_module_version": "1.2.0", - "_view_name": "LayoutView", - "align_content": null, - "align_items": null, - "align_self": null, - "border": null, - "bottom": null, - "display": null, - "flex": null, - "flex_flow": null, - "grid_area": null, - "grid_auto_columns": null, - "grid_auto_flow": null, - "grid_auto_rows": null, - "grid_column": null, - "grid_gap": null, - "grid_row": null, - "grid_template_areas": null, - "grid_template_columns": null, - "grid_template_rows": null, - "height": 
null, - "justify_content": null, - "justify_items": null, - "left": null, - "margin": null, - "max_height": null, - "max_width": null, - "min_height": null, - "min_width": null, - "object_fit": null, - "object_position": null, - "order": null, - "overflow": null, - "overflow_x": null, - "overflow_y": null, - "padding": null, - "right": null, - "top": null, - "visibility": null, - "width": null - } - }, - "80e83a9fd2c2402c8eed6a659938acf6": { - "model_module": "@jupyter-widgets/controls", - "model_name": "DescriptionStyleModel", - "model_module_version": "1.5.0", - "state": { - "_model_module": "@jupyter-widgets/controls", - "_model_module_version": "1.5.0", - "_model_name": "DescriptionStyleModel", - "_view_count": null, - "_view_module": "@jupyter-widgets/base", - "_view_module_version": "1.2.0", - "_view_name": "StyleView", - "description_width": "" - } - }, - "9aff0e55c7a24c618c998804a769669f": { - "model_module": "@jupyter-widgets/base", - "model_name": "LayoutModel", - "model_module_version": "1.2.0", - "state": { - "_model_module": "@jupyter-widgets/base", - "_model_module_version": "1.2.0", - "_model_name": "LayoutModel", - "_view_count": null, - "_view_module": "@jupyter-widgets/base", - "_view_module_version": "1.2.0", - "_view_name": "LayoutView", - "align_content": null, - "align_items": null, - "align_self": null, - "border": null, - "bottom": null, - "display": null, - "flex": null, - "flex_flow": null, - "grid_area": null, - "grid_auto_columns": null, - "grid_auto_flow": null, - "grid_auto_rows": null, - "grid_column": null, - "grid_gap": null, - "grid_row": null, - "grid_template_areas": null, - "grid_template_columns": null, - "grid_template_rows": null, - "height": null, - "justify_content": null, - "justify_items": null, - "left": null, - "margin": null, - "max_height": null, - "max_width": null, - "min_height": null, - "min_width": null, - "object_fit": null, - "object_position": null, - "order": null, - "overflow": null, - "overflow_x": null, - "overflow_y": null, - "padding": null, - "right": null, - "top": null, - "visibility": null, - "width": null - } - }, - "bcd1f2cdb488417a888a9ddce142dcdd": { - "model_module": "@jupyter-widgets/controls", - "model_name": "ProgressStyleModel", - "model_module_version": "1.5.0", - "state": { - "_model_module": "@jupyter-widgets/controls", - "_model_module_version": "1.5.0", - "_model_name": "ProgressStyleModel", - "_view_count": null, - "_view_module": "@jupyter-widgets/base", - "_view_module_version": "1.2.0", - "_view_name": "StyleView", - "bar_color": null, - "description_width": "" - } - }, - "516bb336ed96406499d23a2b86ed6a9c": { - "model_module": "@jupyter-widgets/base", - "model_name": "LayoutModel", - "model_module_version": "1.2.0", - "state": { - "_model_module": "@jupyter-widgets/base", - "_model_module_version": "1.2.0", - "_model_name": "LayoutModel", - "_view_count": null, - "_view_module": "@jupyter-widgets/base", - "_view_module_version": "1.2.0", - "_view_name": "LayoutView", - "align_content": null, - "align_items": null, - "align_self": null, - "border": null, - "bottom": null, - "display": null, - "flex": null, - "flex_flow": null, - "grid_area": null, - "grid_auto_columns": null, - "grid_auto_flow": null, - "grid_auto_rows": null, - "grid_column": null, - "grid_gap": null, - "grid_row": null, - "grid_template_areas": null, - "grid_template_columns": null, - "grid_template_rows": null, - "height": null, - "justify_content": null, - "justify_items": null, - "left": null, - "margin": null, - "max_height": null, - 
"max_width": null, - "min_height": null, - "min_width": null, - "object_fit": null, - "object_position": null, - "order": null, - "overflow": null, - "overflow_x": null, - "overflow_y": null, - "padding": null, - "right": null, - "top": null, - "visibility": null, - "width": null - } - }, - "bb561f17393947ef833a3208abf83b00": { - "model_module": "@jupyter-widgets/controls", - "model_name": "DescriptionStyleModel", - "model_module_version": "1.5.0", - "state": { - "_model_module": "@jupyter-widgets/controls", - "_model_module_version": "1.5.0", - "_model_name": "DescriptionStyleModel", - "_view_count": null, - "_view_module": "@jupyter-widgets/base", - "_view_module_version": "1.2.0", - "_view_name": "StyleView", - "description_width": "" - } - }, - "d5a0eac1f227495886ecd40fd2d035f9": { - "model_module": "@jupyter-widgets/controls", - "model_name": "HBoxModel", - "model_module_version": "1.5.0", - "state": { - "_dom_classes": [], - "_model_module": "@jupyter-widgets/controls", - "_model_module_version": "1.5.0", - "_model_name": "HBoxModel", - "_view_count": null, - "_view_module": "@jupyter-widgets/controls", - "_view_module_version": "1.5.0", - "_view_name": "HBoxView", - "box_style": "", - "children": [ - "IPY_MODEL_91f078a552854c83a9821f0e23a2179f", - "IPY_MODEL_e8b871dcfca946e08453c6f788f78b4b", - "IPY_MODEL_f4a11ae55c724924ac1318e5730f6d2c" - ], - "layout": "IPY_MODEL_58e024f57fcf4039b360b0bdbbe23520" - } - }, - "91f078a552854c83a9821f0e23a2179f": { - "model_module": "@jupyter-widgets/controls", - "model_name": "HTMLModel", - "model_module_version": "1.5.0", - "state": { - "_dom_classes": [], - "_model_module": "@jupyter-widgets/controls", - "_model_module_version": "1.5.0", - "_model_name": "HTMLModel", - "_view_count": null, - "_view_module": "@jupyter-widgets/controls", - "_view_module_version": "1.5.0", - "_view_name": "HTMLView", - "description": "", - "description_tooltip": null, - "layout": "IPY_MODEL_0c91a9d8b10d49a5afde9f9208acb95f", - "placeholder": "​", - "style": "IPY_MODEL_0f33403f4da64ee6bc338719f05aaa02", - "value": "model-00002-of-00004.safetensors: 100%" - } - }, - "e8b871dcfca946e08453c6f788f78b4b": { - "model_module": "@jupyter-widgets/controls", - "model_name": "FloatProgressModel", - "model_module_version": "1.5.0", - "state": { - "_dom_classes": [], - "_model_module": "@jupyter-widgets/controls", - "_model_module_version": "1.5.0", - "_model_name": "FloatProgressModel", - "_view_count": null, - "_view_module": "@jupyter-widgets/controls", - "_view_module_version": "1.5.0", - "_view_name": "ProgressView", - "bar_style": "success", - "description": "", - "description_tooltip": null, - "layout": "IPY_MODEL_b3c06b8aa61d4e6ba9df1e90e86f2ba9", - "max": 4999802720, - "min": 0, - "orientation": "horizontal", - "style": "IPY_MODEL_faeb0c503d83411f8f162a23ea49375b", - "value": 4999802720 - } - }, - "f4a11ae55c724924ac1318e5730f6d2c": { - "model_module": "@jupyter-widgets/controls", - "model_name": "HTMLModel", - "model_module_version": "1.5.0", - "state": { - "_dom_classes": [], - "_model_module": "@jupyter-widgets/controls", - "_model_module_version": "1.5.0", - "_model_name": "HTMLModel", - "_view_count": null, - "_view_module": "@jupyter-widgets/controls", - "_view_module_version": "1.5.0", - "_view_name": "HTMLView", - "description": "", - "description_tooltip": null, - "layout": "IPY_MODEL_63fcde7be7d94fbdb2e9bf9ca9084ad7", - "placeholder": "​", - "style": "IPY_MODEL_95aa717fc02b43fbb08ed633480bf1ee", - "value": " 5.00G/5.00G [00:51<00:00, 189MB/s]" - } - }, - 
"58e024f57fcf4039b360b0bdbbe23520": { - "model_module": "@jupyter-widgets/base", - "model_name": "LayoutModel", - "model_module_version": "1.2.0", - "state": { - "_model_module": "@jupyter-widgets/base", - "_model_module_version": "1.2.0", - "_model_name": "LayoutModel", - "_view_count": null, - "_view_module": "@jupyter-widgets/base", - "_view_module_version": "1.2.0", - "_view_name": "LayoutView", - "align_content": null, - "align_items": null, - "align_self": null, - "border": null, - "bottom": null, - "display": null, - "flex": null, - "flex_flow": null, - "grid_area": null, - "grid_auto_columns": null, - "grid_auto_flow": null, - "grid_auto_rows": null, - "grid_column": null, - "grid_gap": null, - "grid_row": null, - "grid_template_areas": null, - "grid_template_columns": null, - "grid_template_rows": null, - "height": null, - "justify_content": null, - "justify_items": null, - "left": null, - "margin": null, - "max_height": null, - "max_width": null, - "min_height": null, - "min_width": null, - "object_fit": null, - "object_position": null, - "order": null, - "overflow": null, - "overflow_x": null, - "overflow_y": null, - "padding": null, - "right": null, - "top": null, - "visibility": null, - "width": null - } - }, - "0c91a9d8b10d49a5afde9f9208acb95f": { - "model_module": "@jupyter-widgets/base", - "model_name": "LayoutModel", - "model_module_version": "1.2.0", - "state": { - "_model_module": "@jupyter-widgets/base", - "_model_module_version": "1.2.0", - "_model_name": "LayoutModel", - "_view_count": null, - "_view_module": "@jupyter-widgets/base", - "_view_module_version": "1.2.0", - "_view_name": "LayoutView", - "align_content": null, - "align_items": null, - "align_self": null, - "border": null, - "bottom": null, - "display": null, - "flex": null, - "flex_flow": null, - "grid_area": null, - "grid_auto_columns": null, - "grid_auto_flow": null, - "grid_auto_rows": null, - "grid_column": null, - "grid_gap": null, - "grid_row": null, - "grid_template_areas": null, - "grid_template_columns": null, - "grid_template_rows": null, - "height": null, - "justify_content": null, - "justify_items": null, - "left": null, - "margin": null, - "max_height": null, - "max_width": null, - "min_height": null, - "min_width": null, - "object_fit": null, - "object_position": null, - "order": null, - "overflow": null, - "overflow_x": null, - "overflow_y": null, - "padding": null, - "right": null, - "top": null, - "visibility": null, - "width": null - } - }, - "0f33403f4da64ee6bc338719f05aaa02": { - "model_module": "@jupyter-widgets/controls", - "model_name": "DescriptionStyleModel", - "model_module_version": "1.5.0", - "state": { - "_model_module": "@jupyter-widgets/controls", - "_model_module_version": "1.5.0", - "_model_name": "DescriptionStyleModel", - "_view_count": null, - "_view_module": "@jupyter-widgets/base", - "_view_module_version": "1.2.0", - "_view_name": "StyleView", - "description_width": "" - } - }, - "b3c06b8aa61d4e6ba9df1e90e86f2ba9": { - "model_module": "@jupyter-widgets/base", - "model_name": "LayoutModel", - "model_module_version": "1.2.0", - "state": { - "_model_module": "@jupyter-widgets/base", - "_model_module_version": "1.2.0", - "_model_name": "LayoutModel", - "_view_count": null, - "_view_module": "@jupyter-widgets/base", - "_view_module_version": "1.2.0", - "_view_name": "LayoutView", - "align_content": null, - "align_items": null, - "align_self": null, - "border": null, - "bottom": null, - "display": null, - "flex": null, - "flex_flow": null, - "grid_area": null, - 
"grid_auto_columns": null, - "grid_auto_flow": null, - "grid_auto_rows": null, - "grid_column": null, - "grid_gap": null, - "grid_row": null, - "grid_template_areas": null, - "grid_template_columns": null, - "grid_template_rows": null, - "height": null, - "justify_content": null, - "justify_items": null, - "left": null, - "margin": null, - "max_height": null, - "max_width": null, - "min_height": null, - "min_width": null, - "object_fit": null, - "object_position": null, - "order": null, - "overflow": null, - "overflow_x": null, - "overflow_y": null, - "padding": null, - "right": null, - "top": null, - "visibility": null, - "width": null - } - }, - "faeb0c503d83411f8f162a23ea49375b": { - "model_module": "@jupyter-widgets/controls", - "model_name": "ProgressStyleModel", - "model_module_version": "1.5.0", - "state": { - "_model_module": "@jupyter-widgets/controls", - "_model_module_version": "1.5.0", - "_model_name": "ProgressStyleModel", - "_view_count": null, - "_view_module": "@jupyter-widgets/base", - "_view_module_version": "1.2.0", - "_view_name": "StyleView", - "bar_color": null, - "description_width": "" - } - }, - "63fcde7be7d94fbdb2e9bf9ca9084ad7": { - "model_module": "@jupyter-widgets/base", - "model_name": "LayoutModel", - "model_module_version": "1.2.0", - "state": { - "_model_module": "@jupyter-widgets/base", - "_model_module_version": "1.2.0", - "_model_name": "LayoutModel", - "_view_count": null, - "_view_module": "@jupyter-widgets/base", - "_view_module_version": "1.2.0", - "_view_name": "LayoutView", - "align_content": null, - "align_items": null, - "align_self": null, - "border": null, - "bottom": null, - "display": null, - "flex": null, - "flex_flow": null, - "grid_area": null, - "grid_auto_columns": null, - "grid_auto_flow": null, - "grid_auto_rows": null, - "grid_column": null, - "grid_gap": null, - "grid_row": null, - "grid_template_areas": null, - "grid_template_columns": null, - "grid_template_rows": null, - "height": null, - "justify_content": null, - "justify_items": null, - "left": null, - "margin": null, - "max_height": null, - "max_width": null, - "min_height": null, - "min_width": null, - "object_fit": null, - "object_position": null, - "order": null, - "overflow": null, - "overflow_x": null, - "overflow_y": null, - "padding": null, - "right": null, - "top": null, - "visibility": null, - "width": null - } - }, - "95aa717fc02b43fbb08ed633480bf1ee": { - "model_module": "@jupyter-widgets/controls", - "model_name": "DescriptionStyleModel", - "model_module_version": "1.5.0", - "state": { - "_model_module": "@jupyter-widgets/controls", - "_model_module_version": "1.5.0", - "_model_name": "DescriptionStyleModel", - "_view_count": null, - "_view_module": "@jupyter-widgets/base", - "_view_module_version": "1.2.0", - "_view_name": "StyleView", - "description_width": "" - } - }, - "5bc4d0af62f047928f9de1183100acaa": { - "model_module": "@jupyter-widgets/controls", - "model_name": "HBoxModel", - "model_module_version": "1.5.0", - "state": { - "_dom_classes": [], - "_model_module": "@jupyter-widgets/controls", - "_model_module_version": "1.5.0", - "_model_name": "HBoxModel", - "_view_count": null, - "_view_module": "@jupyter-widgets/controls", - "_view_module_version": "1.5.0", - "_view_name": "HBoxView", - "box_style": "", - "children": [ - "IPY_MODEL_17fbfefcf6ab41ccbf3fbb570cfe3c0d", - "IPY_MODEL_ab7055cac4e940229ee11ff98caafdd0", - "IPY_MODEL_fac438b999bb40459c245b0e307bd132" - ], - "layout": "IPY_MODEL_07d99c2f9a2e4a038072a085a745a080" - } - }, - 
"17fbfefcf6ab41ccbf3fbb570cfe3c0d": { - "model_module": "@jupyter-widgets/controls", - "model_name": "HTMLModel", - "model_module_version": "1.5.0", - "state": { - "_dom_classes": [], - "_model_module": "@jupyter-widgets/controls", - "_model_module_version": "1.5.0", - "_model_name": "HTMLModel", - "_view_count": null, - "_view_module": "@jupyter-widgets/controls", - "_view_module_version": "1.5.0", - "_view_name": "HTMLView", - "description": "", - "description_tooltip": null, - "layout": "IPY_MODEL_4902defd076c4820b3a3e61c876024bc", - "placeholder": "​", - "style": "IPY_MODEL_f8e7f835c1f24fd082f9b61d4785723f", - "value": "model-00003-of-00004.safetensors: 100%" - } - }, - "ab7055cac4e940229ee11ff98caafdd0": { - "model_module": "@jupyter-widgets/controls", - "model_name": "FloatProgressModel", - "model_module_version": "1.5.0", - "state": { - "_dom_classes": [], - "_model_module": "@jupyter-widgets/controls", - "_model_module_version": "1.5.0", - "_model_name": "FloatProgressModel", - "_view_count": null, - "_view_module": "@jupyter-widgets/controls", - "_view_module_version": "1.5.0", - "_view_name": "ProgressView", - "bar_style": "success", - "description": "", - "description_tooltip": null, - "layout": "IPY_MODEL_9c0b221cf5bf4b8495e435137a9468e5", - "max": 4915916176, - "min": 0, - "orientation": "horizontal", - "style": "IPY_MODEL_b590bf4d0dda4fd99ebc4b2cfe8050bc", - "value": 4915916176 - } - }, - "fac438b999bb40459c245b0e307bd132": { - "model_module": "@jupyter-widgets/controls", - "model_name": "HTMLModel", - "model_module_version": "1.5.0", - "state": { - "_dom_classes": [], - "_model_module": "@jupyter-widgets/controls", - "_model_module_version": "1.5.0", - "_model_name": "HTMLModel", - "_view_count": null, - "_view_module": "@jupyter-widgets/controls", - "_view_module_version": "1.5.0", - "_view_name": "HTMLView", - "description": "", - "description_tooltip": null, - "layout": "IPY_MODEL_4f33241a7378442a8711b00402742526", - "placeholder": "​", - "style": "IPY_MODEL_0a379a9f2ae84ed6bc45427e3e3eb916", - "value": " 4.92G/4.92G [00:49<00:00, 177MB/s]" - } - }, - "07d99c2f9a2e4a038072a085a745a080": { - "model_module": "@jupyter-widgets/base", - "model_name": "LayoutModel", - "model_module_version": "1.2.0", - "state": { - "_model_module": "@jupyter-widgets/base", - "_model_module_version": "1.2.0", - "_model_name": "LayoutModel", - "_view_count": null, - "_view_module": "@jupyter-widgets/base", - "_view_module_version": "1.2.0", - "_view_name": "LayoutView", - "align_content": null, - "align_items": null, - "align_self": null, - "border": null, - "bottom": null, - "display": null, - "flex": null, - "flex_flow": null, - "grid_area": null, - "grid_auto_columns": null, - "grid_auto_flow": null, - "grid_auto_rows": null, - "grid_column": null, - "grid_gap": null, - "grid_row": null, - "grid_template_areas": null, - "grid_template_columns": null, - "grid_template_rows": null, - "height": null, - "justify_content": null, - "justify_items": null, - "left": null, - "margin": null, - "max_height": null, - "max_width": null, - "min_height": null, - "min_width": null, - "object_fit": null, - "object_position": null, - "order": null, - "overflow": null, - "overflow_x": null, - "overflow_y": null, - "padding": null, - "right": null, - "top": null, - "visibility": null, - "width": null - } - }, - "4902defd076c4820b3a3e61c876024bc": { - "model_module": "@jupyter-widgets/base", - "model_name": "LayoutModel", - "model_module_version": "1.2.0", - "state": { - "_model_module": "@jupyter-widgets/base", 
- "_model_module_version": "1.2.0", - "_model_name": "LayoutModel", - "_view_count": null, - "_view_module": "@jupyter-widgets/base", - "_view_module_version": "1.2.0", - "_view_name": "LayoutView", - "align_content": null, - "align_items": null, - "align_self": null, - "border": null, - "bottom": null, - "display": null, - "flex": null, - "flex_flow": null, - "grid_area": null, - "grid_auto_columns": null, - "grid_auto_flow": null, - "grid_auto_rows": null, - "grid_column": null, - "grid_gap": null, - "grid_row": null, - "grid_template_areas": null, - "grid_template_columns": null, - "grid_template_rows": null, - "height": null, - "justify_content": null, - "justify_items": null, - "left": null, - "margin": null, - "max_height": null, - "max_width": null, - "min_height": null, - "min_width": null, - "object_fit": null, - "object_position": null, - "order": null, - "overflow": null, - "overflow_x": null, - "overflow_y": null, - "padding": null, - "right": null, - "top": null, - "visibility": null, - "width": null - } - }, - "f8e7f835c1f24fd082f9b61d4785723f": { - "model_module": "@jupyter-widgets/controls", - "model_name": "DescriptionStyleModel", - "model_module_version": "1.5.0", - "state": { - "_model_module": "@jupyter-widgets/controls", - "_model_module_version": "1.5.0", - "_model_name": "DescriptionStyleModel", - "_view_count": null, - "_view_module": "@jupyter-widgets/base", - "_view_module_version": "1.2.0", - "_view_name": "StyleView", - "description_width": "" - } - }, - "9c0b221cf5bf4b8495e435137a9468e5": { - "model_module": "@jupyter-widgets/base", - "model_name": "LayoutModel", - "model_module_version": "1.2.0", - "state": { - "_model_module": "@jupyter-widgets/base", - "_model_module_version": "1.2.0", - "_model_name": "LayoutModel", - "_view_count": null, - "_view_module": "@jupyter-widgets/base", - "_view_module_version": "1.2.0", - "_view_name": "LayoutView", - "align_content": null, - "align_items": null, - "align_self": null, - "border": null, - "bottom": null, - "display": null, - "flex": null, - "flex_flow": null, - "grid_area": null, - "grid_auto_columns": null, - "grid_auto_flow": null, - "grid_auto_rows": null, - "grid_column": null, - "grid_gap": null, - "grid_row": null, - "grid_template_areas": null, - "grid_template_columns": null, - "grid_template_rows": null, - "height": null, - "justify_content": null, - "justify_items": null, - "left": null, - "margin": null, - "max_height": null, - "max_width": null, - "min_height": null, - "min_width": null, - "object_fit": null, - "object_position": null, - "order": null, - "overflow": null, - "overflow_x": null, - "overflow_y": null, - "padding": null, - "right": null, - "top": null, - "visibility": null, - "width": null - } - }, - "b590bf4d0dda4fd99ebc4b2cfe8050bc": { - "model_module": "@jupyter-widgets/controls", - "model_name": "ProgressStyleModel", - "model_module_version": "1.5.0", - "state": { - "_model_module": "@jupyter-widgets/controls", - "_model_module_version": "1.5.0", - "_model_name": "ProgressStyleModel", - "_view_count": null, - "_view_module": "@jupyter-widgets/base", - "_view_module_version": "1.2.0", - "_view_name": "StyleView", - "bar_color": null, - "description_width": "" - } - }, - "4f33241a7378442a8711b00402742526": { - "model_module": "@jupyter-widgets/base", - "model_name": "LayoutModel", - "model_module_version": "1.2.0", - "state": { - "_model_module": "@jupyter-widgets/base", - "_model_module_version": "1.2.0", - "_model_name": "LayoutModel", - "_view_count": null, - "_view_module": 
"@jupyter-widgets/base", - "_view_module_version": "1.2.0", - "_view_name": "LayoutView", - "align_content": null, - "align_items": null, - "align_self": null, - "border": null, - "bottom": null, - "display": null, - "flex": null, - "flex_flow": null, - "grid_area": null, - "grid_auto_columns": null, - "grid_auto_flow": null, - "grid_auto_rows": null, - "grid_column": null, - "grid_gap": null, - "grid_row": null, - "grid_template_areas": null, - "grid_template_columns": null, - "grid_template_rows": null, - "height": null, - "justify_content": null, - "justify_items": null, - "left": null, - "margin": null, - "max_height": null, - "max_width": null, - "min_height": null, - "min_width": null, - "object_fit": null, - "object_position": null, - "order": null, - "overflow": null, - "overflow_x": null, - "overflow_y": null, - "padding": null, - "right": null, - "top": null, - "visibility": null, - "width": null - } - }, - "0a379a9f2ae84ed6bc45427e3e3eb916": { - "model_module": "@jupyter-widgets/controls", - "model_name": "DescriptionStyleModel", - "model_module_version": "1.5.0", - "state": { - "_model_module": "@jupyter-widgets/controls", - "_model_module_version": "1.5.0", - "_model_name": "DescriptionStyleModel", - "_view_count": null, - "_view_module": "@jupyter-widgets/base", - "_view_module_version": "1.2.0", - "_view_name": "StyleView", - "description_width": "" - } - }, - "d555f930636e41d58c02d18020cf0635": { - "model_module": "@jupyter-widgets/controls", - "model_name": "HBoxModel", - "model_module_version": "1.5.0", - "state": { - "_dom_classes": [], - "_model_module": "@jupyter-widgets/controls", - "_model_module_version": "1.5.0", - "_model_name": "HBoxModel", - "_view_count": null, - "_view_module": "@jupyter-widgets/controls", - "_view_module_version": "1.5.0", - "_view_name": "HBoxView", - "box_style": "", - "children": [ - "IPY_MODEL_421b9ec5cfc34e17b193366bd619f916", - "IPY_MODEL_d437a724018e4f4795b1e3bddbba782a", - "IPY_MODEL_0038bd8e5cc446cfa5637aecb86a405c" - ], - "layout": "IPY_MODEL_e97bcab104234aa3a22130e421cec71e" - } - }, - "421b9ec5cfc34e17b193366bd619f916": { - "model_module": "@jupyter-widgets/controls", - "model_name": "HTMLModel", - "model_module_version": "1.5.0", - "state": { - "_dom_classes": [], - "_model_module": "@jupyter-widgets/controls", - "_model_module_version": "1.5.0", - "_model_name": "HTMLModel", - "_view_count": null, - "_view_module": "@jupyter-widgets/controls", - "_view_module_version": "1.5.0", - "_view_name": "HTMLView", - "description": "", - "description_tooltip": null, - "layout": "IPY_MODEL_5cacb4eb6ece4fb284adb0eca4c474f3", - "placeholder": "​", - "style": "IPY_MODEL_431786378bd74bdfb9dca67b212a01df", - "value": "model-00004-of-00004.safetensors: 100%" - } - }, - "d437a724018e4f4795b1e3bddbba782a": { - "model_module": "@jupyter-widgets/controls", - "model_name": "FloatProgressModel", - "model_module_version": "1.5.0", - "state": { - "_dom_classes": [], - "_model_module": "@jupyter-widgets/controls", - "_model_module_version": "1.5.0", - "_model_name": "FloatProgressModel", - "_view_count": null, - "_view_module": "@jupyter-widgets/controls", - "_view_module_version": "1.5.0", - "_view_name": "ProgressView", - "bar_style": "success", - "description": "", - "description_tooltip": null, - "layout": "IPY_MODEL_c9d72c019b2544ff86f818534269e9b5", - "max": 1168138808, - "min": 0, - "orientation": "horizontal", - "style": "IPY_MODEL_f754d601c4134e0f8701735171ec75a9", - "value": 1168138808 - } - }, - "0038bd8e5cc446cfa5637aecb86a405c": { - 
"model_module": "@jupyter-widgets/controls", - "model_name": "HTMLModel", - "model_module_version": "1.5.0", - "state": { - "_dom_classes": [], - "_model_module": "@jupyter-widgets/controls", - "_model_module_version": "1.5.0", - "_model_name": "HTMLModel", - "_view_count": null, - "_view_module": "@jupyter-widgets/controls", - "_view_module_version": "1.5.0", - "_view_name": "HTMLView", - "description": "", - "description_tooltip": null, - "layout": "IPY_MODEL_85a1497ae9964e77885583613a7c6d9a", - "placeholder": "​", - "style": "IPY_MODEL_0b43aa2b874b487b91731e463cf853a5", - "value": " 1.17G/1.17G [00:14<00:00, 121MB/s]" - } - }, - "e97bcab104234aa3a22130e421cec71e": { - "model_module": "@jupyter-widgets/base", - "model_name": "LayoutModel", - "model_module_version": "1.2.0", - "state": { - "_model_module": "@jupyter-widgets/base", - "_model_module_version": "1.2.0", - "_model_name": "LayoutModel", - "_view_count": null, - "_view_module": "@jupyter-widgets/base", - "_view_module_version": "1.2.0", - "_view_name": "LayoutView", - "align_content": null, - "align_items": null, - "align_self": null, - "border": null, - "bottom": null, - "display": null, - "flex": null, - "flex_flow": null, - "grid_area": null, - "grid_auto_columns": null, - "grid_auto_flow": null, - "grid_auto_rows": null, - "grid_column": null, - "grid_gap": null, - "grid_row": null, - "grid_template_areas": null, - "grid_template_columns": null, - "grid_template_rows": null, - "height": null, - "justify_content": null, - "justify_items": null, - "left": null, - "margin": null, - "max_height": null, - "max_width": null, - "min_height": null, - "min_width": null, - "object_fit": null, - "object_position": null, - "order": null, - "overflow": null, - "overflow_x": null, - "overflow_y": null, - "padding": null, - "right": null, - "top": null, - "visibility": null, - "width": null - } - }, - "5cacb4eb6ece4fb284adb0eca4c474f3": { - "model_module": "@jupyter-widgets/base", - "model_name": "LayoutModel", - "model_module_version": "1.2.0", - "state": { - "_model_module": "@jupyter-widgets/base", - "_model_module_version": "1.2.0", - "_model_name": "LayoutModel", - "_view_count": null, - "_view_module": "@jupyter-widgets/base", - "_view_module_version": "1.2.0", - "_view_name": "LayoutView", - "align_content": null, - "align_items": null, - "align_self": null, - "border": null, - "bottom": null, - "display": null, - "flex": null, - "flex_flow": null, - "grid_area": null, - "grid_auto_columns": null, - "grid_auto_flow": null, - "grid_auto_rows": null, - "grid_column": null, - "grid_gap": null, - "grid_row": null, - "grid_template_areas": null, - "grid_template_columns": null, - "grid_template_rows": null, - "height": null, - "justify_content": null, - "justify_items": null, - "left": null, - "margin": null, - "max_height": null, - "max_width": null, - "min_height": null, - "min_width": null, - "object_fit": null, - "object_position": null, - "order": null, - "overflow": null, - "overflow_x": null, - "overflow_y": null, - "padding": null, - "right": null, - "top": null, - "visibility": null, - "width": null - } - }, - "431786378bd74bdfb9dca67b212a01df": { - "model_module": "@jupyter-widgets/controls", - "model_name": "DescriptionStyleModel", - "model_module_version": "1.5.0", - "state": { - "_model_module": "@jupyter-widgets/controls", - "_model_module_version": "1.5.0", - "_model_name": "DescriptionStyleModel", - "_view_count": null, - "_view_module": "@jupyter-widgets/base", - "_view_module_version": "1.2.0", - "_view_name": 
"StyleView", - "description_width": "" - } - }, - "c9d72c019b2544ff86f818534269e9b5": { - "model_module": "@jupyter-widgets/base", - "model_name": "LayoutModel", - "model_module_version": "1.2.0", - "state": { - "_model_module": "@jupyter-widgets/base", - "_model_module_version": "1.2.0", - "_model_name": "LayoutModel", - "_view_count": null, - "_view_module": "@jupyter-widgets/base", - "_view_module_version": "1.2.0", - "_view_name": "LayoutView", - "align_content": null, - "align_items": null, - "align_self": null, - "border": null, - "bottom": null, - "display": null, - "flex": null, - "flex_flow": null, - "grid_area": null, - "grid_auto_columns": null, - "grid_auto_flow": null, - "grid_auto_rows": null, - "grid_column": null, - "grid_gap": null, - "grid_row": null, - "grid_template_areas": null, - "grid_template_columns": null, - "grid_template_rows": null, - "height": null, - "justify_content": null, - "justify_items": null, - "left": null, - "margin": null, - "max_height": null, - "max_width": null, - "min_height": null, - "min_width": null, - "object_fit": null, - "object_position": null, - "order": null, - "overflow": null, - "overflow_x": null, - "overflow_y": null, - "padding": null, - "right": null, - "top": null, - "visibility": null, - "width": null - } - }, - "f754d601c4134e0f8701735171ec75a9": { - "model_module": "@jupyter-widgets/controls", - "model_name": "ProgressStyleModel", - "model_module_version": "1.5.0", - "state": { - "_model_module": "@jupyter-widgets/controls", - "_model_module_version": "1.5.0", - "_model_name": "ProgressStyleModel", - "_view_count": null, - "_view_module": "@jupyter-widgets/base", - "_view_module_version": "1.2.0", - "_view_name": "StyleView", - "bar_color": null, - "description_width": "" - } - }, - "85a1497ae9964e77885583613a7c6d9a": { - "model_module": "@jupyter-widgets/base", - "model_name": "LayoutModel", - "model_module_version": "1.2.0", - "state": { - "_model_module": "@jupyter-widgets/base", - "_model_module_version": "1.2.0", - "_model_name": "LayoutModel", - "_view_count": null, - "_view_module": "@jupyter-widgets/base", - "_view_module_version": "1.2.0", - "_view_name": "LayoutView", - "align_content": null, - "align_items": null, - "align_self": null, - "border": null, - "bottom": null, - "display": null, - "flex": null, - "flex_flow": null, - "grid_area": null, - "grid_auto_columns": null, - "grid_auto_flow": null, - "grid_auto_rows": null, - "grid_column": null, - "grid_gap": null, - "grid_row": null, - "grid_template_areas": null, - "grid_template_columns": null, - "grid_template_rows": null, - "height": null, - "justify_content": null, - "justify_items": null, - "left": null, - "margin": null, - "max_height": null, - "max_width": null, - "min_height": null, - "min_width": null, - "object_fit": null, - "object_position": null, - "order": null, - "overflow": null, - "overflow_x": null, - "overflow_y": null, - "padding": null, - "right": null, - "top": null, - "visibility": null, - "width": null - } - }, - "0b43aa2b874b487b91731e463cf853a5": { - "model_module": "@jupyter-widgets/controls", - "model_name": "DescriptionStyleModel", - "model_module_version": "1.5.0", - "state": { - "_model_module": "@jupyter-widgets/controls", - "_model_module_version": "1.5.0", - "_model_name": "DescriptionStyleModel", - "_view_count": null, - "_view_module": "@jupyter-widgets/base", - "_view_module_version": "1.2.0", - "_view_name": "StyleView", - "description_width": "" - } - }, - "f5b1706f09524b9c9e1f134a6df03b51": { - "model_module": 
"@jupyter-widgets/controls", - "model_name": "HBoxModel", - "model_module_version": "1.5.0", - "state": { - "_dom_classes": [], - "_model_module": "@jupyter-widgets/controls", - "_model_module_version": "1.5.0", - "_model_name": "HBoxModel", - "_view_count": null, - "_view_module": "@jupyter-widgets/controls", - "_view_module_version": "1.5.0", - "_view_name": "HBoxView", - "box_style": "", - "children": [ - "IPY_MODEL_f481bd5fe45f4f66887b99ad7ebb6fbb", - "IPY_MODEL_817fc199281249e982f5b3ac78fe3923", - "IPY_MODEL_72709f5c251b47dea3cb6b04fbedf6e8" - ], - "layout": "IPY_MODEL_efd6d5809f924d288a9d1799dd3b4806" - } - }, - "f481bd5fe45f4f66887b99ad7ebb6fbb": { - "model_module": "@jupyter-widgets/controls", - "model_name": "HTMLModel", - "model_module_version": "1.5.0", - "state": { - "_dom_classes": [], - "_model_module": "@jupyter-widgets/controls", - "_model_module_version": "1.5.0", - "_model_name": "HTMLModel", - "_view_count": null, - "_view_module": "@jupyter-widgets/controls", - "_view_module_version": "1.5.0", - "_view_name": "HTMLView", - "description": "", - "description_tooltip": null, - "layout": "IPY_MODEL_0df353d732694e9ab1d6c9530c45a0d4", - "placeholder": "​", - "style": "IPY_MODEL_eacdabbb4b154c5385b76392ca17bfcf", - "value": "Loading checkpoint shards: 100%" - } - }, - "817fc199281249e982f5b3ac78fe3923": { - "model_module": "@jupyter-widgets/controls", - "model_name": "FloatProgressModel", - "model_module_version": "1.5.0", - "state": { - "_dom_classes": [], - "_model_module": "@jupyter-widgets/controls", - "_model_module_version": "1.5.0", - "_model_name": "FloatProgressModel", - "_view_count": null, - "_view_module": "@jupyter-widgets/controls", - "_view_module_version": "1.5.0", - "_view_name": "ProgressView", - "bar_style": "success", - "description": "", - "description_tooltip": null, - "layout": "IPY_MODEL_d1881e5dd3bb42f3a62444a987aff829", - "max": 4, - "min": 0, - "orientation": "horizontal", - "style": "IPY_MODEL_115d71f18c1b4fc79ed6a27fc6401dbc", - "value": 4 - } - }, - "72709f5c251b47dea3cb6b04fbedf6e8": { - "model_module": "@jupyter-widgets/controls", - "model_name": "HTMLModel", - "model_module_version": "1.5.0", - "state": { - "_dom_classes": [], - "_model_module": "@jupyter-widgets/controls", - "_model_module_version": "1.5.0", - "_model_name": "HTMLModel", - "_view_count": null, - "_view_module": "@jupyter-widgets/controls", - "_view_module_version": "1.5.0", - "_view_name": "HTMLView", - "description": "", - "description_tooltip": null, - "layout": "IPY_MODEL_e0e358f24f9a4d959ac7b4f7c26d69eb", - "placeholder": "​", - "style": "IPY_MODEL_e790081734bb4461b525d46b6895a9b1", - "value": " 4/4 [01:24<00:00, 17.95s/it]" - } - }, - "efd6d5809f924d288a9d1799dd3b4806": { - "model_module": "@jupyter-widgets/base", - "model_name": "LayoutModel", - "model_module_version": "1.2.0", - "state": { - "_model_module": "@jupyter-widgets/base", - "_model_module_version": "1.2.0", - "_model_name": "LayoutModel", - "_view_count": null, - "_view_module": "@jupyter-widgets/base", - "_view_module_version": "1.2.0", - "_view_name": "LayoutView", - "align_content": null, - "align_items": null, - "align_self": null, - "border": null, - "bottom": null, - "display": null, - "flex": null, - "flex_flow": null, - "grid_area": null, - "grid_auto_columns": null, - "grid_auto_flow": null, - "grid_auto_rows": null, - "grid_column": null, - "grid_gap": null, - "grid_row": null, - "grid_template_areas": null, - "grid_template_columns": null, - "grid_template_rows": null, - "height": null, - 
"justify_content": null, - "justify_items": null, - "left": null, - "margin": null, - "max_height": null, - "max_width": null, - "min_height": null, - "min_width": null, - "object_fit": null, - "object_position": null, - "order": null, - "overflow": null, - "overflow_x": null, - "overflow_y": null, - "padding": null, - "right": null, - "top": null, - "visibility": null, - "width": null - } - }, - "0df353d732694e9ab1d6c9530c45a0d4": { - "model_module": "@jupyter-widgets/base", - "model_name": "LayoutModel", - "model_module_version": "1.2.0", - "state": { - "_model_module": "@jupyter-widgets/base", - "_model_module_version": "1.2.0", - "_model_name": "LayoutModel", - "_view_count": null, - "_view_module": "@jupyter-widgets/base", - "_view_module_version": "1.2.0", - "_view_name": "LayoutView", - "align_content": null, - "align_items": null, - "align_self": null, - "border": null, - "bottom": null, - "display": null, - "flex": null, - "flex_flow": null, - "grid_area": null, - "grid_auto_columns": null, - "grid_auto_flow": null, - "grid_auto_rows": null, - "grid_column": null, - "grid_gap": null, - "grid_row": null, - "grid_template_areas": null, - "grid_template_columns": null, - "grid_template_rows": null, - "height": null, - "justify_content": null, - "justify_items": null, - "left": null, - "margin": null, - "max_height": null, - "max_width": null, - "min_height": null, - "min_width": null, - "object_fit": null, - "object_position": null, - "order": null, - "overflow": null, - "overflow_x": null, - "overflow_y": null, - "padding": null, - "right": null, - "top": null, - "visibility": null, - "width": null - } - }, - "eacdabbb4b154c5385b76392ca17bfcf": { - "model_module": "@jupyter-widgets/controls", - "model_name": "DescriptionStyleModel", - "model_module_version": "1.5.0", - "state": { - "_model_module": "@jupyter-widgets/controls", - "_model_module_version": "1.5.0", - "_model_name": "DescriptionStyleModel", - "_view_count": null, - "_view_module": "@jupyter-widgets/base", - "_view_module_version": "1.2.0", - "_view_name": "StyleView", - "description_width": "" - } - }, - "d1881e5dd3bb42f3a62444a987aff829": { - "model_module": "@jupyter-widgets/base", - "model_name": "LayoutModel", - "model_module_version": "1.2.0", - "state": { - "_model_module": "@jupyter-widgets/base", - "_model_module_version": "1.2.0", - "_model_name": "LayoutModel", - "_view_count": null, - "_view_module": "@jupyter-widgets/base", - "_view_module_version": "1.2.0", - "_view_name": "LayoutView", - "align_content": null, - "align_items": null, - "align_self": null, - "border": null, - "bottom": null, - "display": null, - "flex": null, - "flex_flow": null, - "grid_area": null, - "grid_auto_columns": null, - "grid_auto_flow": null, - "grid_auto_rows": null, - "grid_column": null, - "grid_gap": null, - "grid_row": null, - "grid_template_areas": null, - "grid_template_columns": null, - "grid_template_rows": null, - "height": null, - "justify_content": null, - "justify_items": null, - "left": null, - "margin": null, - "max_height": null, - "max_width": null, - "min_height": null, - "min_width": null, - "object_fit": null, - "object_position": null, - "order": null, - "overflow": null, - "overflow_x": null, - "overflow_y": null, - "padding": null, - "right": null, - "top": null, - "visibility": null, - "width": null - } - }, - "115d71f18c1b4fc79ed6a27fc6401dbc": { - "model_module": "@jupyter-widgets/controls", - "model_name": "ProgressStyleModel", - "model_module_version": "1.5.0", - "state": { - "_model_module": 
"@jupyter-widgets/controls", - "_model_module_version": "1.5.0", - "_model_name": "ProgressStyleModel", - "_view_count": null, - "_view_module": "@jupyter-widgets/base", - "_view_module_version": "1.2.0", - "_view_name": "StyleView", - "bar_color": null, - "description_width": "" - } - }, - "e0e358f24f9a4d959ac7b4f7c26d69eb": { - "model_module": "@jupyter-widgets/base", - "model_name": "LayoutModel", - "model_module_version": "1.2.0", - "state": { - "_model_module": "@jupyter-widgets/base", - "_model_module_version": "1.2.0", - "_model_name": "LayoutModel", - "_view_count": null, - "_view_module": "@jupyter-widgets/base", - "_view_module_version": "1.2.0", - "_view_name": "LayoutView", - "align_content": null, - "align_items": null, - "align_self": null, - "border": null, - "bottom": null, - "display": null, - "flex": null, - "flex_flow": null, - "grid_area": null, - "grid_auto_columns": null, - "grid_auto_flow": null, - "grid_auto_rows": null, - "grid_column": null, - "grid_gap": null, - "grid_row": null, - "grid_template_areas": null, - "grid_template_columns": null, - "grid_template_rows": null, - "height": null, - "justify_content": null, - "justify_items": null, - "left": null, - "margin": null, - "max_height": null, - "max_width": null, - "min_height": null, - "min_width": null, - "object_fit": null, - "object_position": null, - "order": null, - "overflow": null, - "overflow_x": null, - "overflow_y": null, - "padding": null, - "right": null, - "top": null, - "visibility": null, - "width": null - } - }, - "e790081734bb4461b525d46b6895a9b1": { - "model_module": "@jupyter-widgets/controls", - "model_name": "DescriptionStyleModel", - "model_module_version": "1.5.0", - "state": { - "_model_module": "@jupyter-widgets/controls", - "_model_module_version": "1.5.0", - "_model_name": "DescriptionStyleModel", - "_view_count": null, - "_view_module": "@jupyter-widgets/base", - "_view_module_version": "1.2.0", - "_view_name": "StyleView", - "description_width": "" - } - }, - "4bbeba17775c423f8dd570e25245982b": { - "model_module": "@jupyter-widgets/controls", - "model_name": "HBoxModel", - "model_module_version": "1.5.0", - "state": { - "_dom_classes": [], - "_model_module": "@jupyter-widgets/controls", - "_model_module_version": "1.5.0", - "_model_name": "HBoxModel", - "_view_count": null, - "_view_module": "@jupyter-widgets/controls", - "_view_module_version": "1.5.0", - "_view_name": "HBoxView", - "box_style": "", - "children": [ - "IPY_MODEL_ce86048691224ac495c0dbb6817a4721", - "IPY_MODEL_d4dea83270d148eb9fb6a19150f4b580", - "IPY_MODEL_a91b8f174a1a49e1a1f114316ea23133" - ], - "layout": "IPY_MODEL_b57e9af6a5f9433496526beffb7a269f" - } - }, - "ce86048691224ac495c0dbb6817a4721": { - "model_module": "@jupyter-widgets/controls", - "model_name": "HTMLModel", - "model_module_version": "1.5.0", - "state": { - "_dom_classes": [], - "_model_module": "@jupyter-widgets/controls", - "_model_module_version": "1.5.0", - "_model_name": "HTMLModel", - "_view_count": null, - "_view_module": "@jupyter-widgets/controls", - "_view_module_version": "1.5.0", - "_view_name": "HTMLView", - "description": "", - "description_tooltip": null, - "layout": "IPY_MODEL_ca841ddbe07f4b77bcff43bc706b1a20", - "placeholder": "​", - "style": "IPY_MODEL_dc44bc377ec941eea4f627d2bc847477", - "value": "generation_config.json: 100%" - } - }, - "d4dea83270d148eb9fb6a19150f4b580": { - "model_module": "@jupyter-widgets/controls", - "model_name": "FloatProgressModel", - "model_module_version": "1.5.0", - "state": { - "_dom_classes": [], - 
"_model_module": "@jupyter-widgets/controls", - "_model_module_version": "1.5.0", - "_model_name": "FloatProgressModel", - "_view_count": null, - "_view_module": "@jupyter-widgets/controls", - "_view_module_version": "1.5.0", - "_view_name": "ProgressView", - "bar_style": "success", - "description": "", - "description_tooltip": null, - "layout": "IPY_MODEL_380e392c659a418fba5d3564cbf62e94", - "max": 126, - "min": 0, - "orientation": "horizontal", - "style": "IPY_MODEL_24f3609a29944a6bbc88c59c5e5eee14", - "value": 126 - } - }, - "a91b8f174a1a49e1a1f114316ea23133": { - "model_module": "@jupyter-widgets/controls", - "model_name": "HTMLModel", - "model_module_version": "1.5.0", - "state": { - "_dom_classes": [], - "_model_module": "@jupyter-widgets/controls", - "_model_module_version": "1.5.0", - "_model_name": "HTMLModel", - "_view_count": null, - "_view_module": "@jupyter-widgets/controls", - "_view_module_version": "1.5.0", - "_view_name": "HTMLView", - "description": "", - "description_tooltip": null, - "layout": "IPY_MODEL_b99c9208985f485cbbfbca51d5260ba5", - "placeholder": "​", - "style": "IPY_MODEL_d6960ffb549e4e69a919405abecb22a3", - "value": " 126/126 [00:00<00:00, 8.70kB/s]" - } - }, - "b57e9af6a5f9433496526beffb7a269f": { - "model_module": "@jupyter-widgets/base", - "model_name": "LayoutModel", - "model_module_version": "1.2.0", - "state": { - "_model_module": "@jupyter-widgets/base", - "_model_module_version": "1.2.0", - "_model_name": "LayoutModel", - "_view_count": null, - "_view_module": "@jupyter-widgets/base", - "_view_module_version": "1.2.0", - "_view_name": "LayoutView", - "align_content": null, - "align_items": null, - "align_self": null, - "border": null, - "bottom": null, - "display": null, - "flex": null, - "flex_flow": null, - "grid_area": null, - "grid_auto_columns": null, - "grid_auto_flow": null, - "grid_auto_rows": null, - "grid_column": null, - "grid_gap": null, - "grid_row": null, - "grid_template_areas": null, - "grid_template_columns": null, - "grid_template_rows": null, - "height": null, - "justify_content": null, - "justify_items": null, - "left": null, - "margin": null, - "max_height": null, - "max_width": null, - "min_height": null, - "min_width": null, - "object_fit": null, - "object_position": null, - "order": null, - "overflow": null, - "overflow_x": null, - "overflow_y": null, - "padding": null, - "right": null, - "top": null, - "visibility": null, - "width": null - } - }, - "ca841ddbe07f4b77bcff43bc706b1a20": { - "model_module": "@jupyter-widgets/base", - "model_name": "LayoutModel", - "model_module_version": "1.2.0", - "state": { - "_model_module": "@jupyter-widgets/base", - "_model_module_version": "1.2.0", - "_model_name": "LayoutModel", - "_view_count": null, - "_view_module": "@jupyter-widgets/base", - "_view_module_version": "1.2.0", - "_view_name": "LayoutView", - "align_content": null, - "align_items": null, - "align_self": null, - "border": null, - "bottom": null, - "display": null, - "flex": null, - "flex_flow": null, - "grid_area": null, - "grid_auto_columns": null, - "grid_auto_flow": null, - "grid_auto_rows": null, - "grid_column": null, - "grid_gap": null, - "grid_row": null, - "grid_template_areas": null, - "grid_template_columns": null, - "grid_template_rows": null, - "height": null, - "justify_content": null, - "justify_items": null, - "left": null, - "margin": null, - "max_height": null, - "max_width": null, - "min_height": null, - "min_width": null, - "object_fit": null, - "object_position": null, - "order": null, - "overflow": null, 
- "overflow_x": null, - "overflow_y": null, - "padding": null, - "right": null, - "top": null, - "visibility": null, - "width": null - } - }, - "dc44bc377ec941eea4f627d2bc847477": { - "model_module": "@jupyter-widgets/controls", - "model_name": "DescriptionStyleModel", - "model_module_version": "1.5.0", - "state": { - "_model_module": "@jupyter-widgets/controls", - "_model_module_version": "1.5.0", - "_model_name": "DescriptionStyleModel", - "_view_count": null, - "_view_module": "@jupyter-widgets/base", - "_view_module_version": "1.2.0", - "_view_name": "StyleView", - "description_width": "" - } - }, - "380e392c659a418fba5d3564cbf62e94": { - "model_module": "@jupyter-widgets/base", - "model_name": "LayoutModel", - "model_module_version": "1.2.0", - "state": { - "_model_module": "@jupyter-widgets/base", - "_model_module_version": "1.2.0", - "_model_name": "LayoutModel", - "_view_count": null, - "_view_module": "@jupyter-widgets/base", - "_view_module_version": "1.2.0", - "_view_name": "LayoutView", - "align_content": null, - "align_items": null, - "align_self": null, - "border": null, - "bottom": null, - "display": null, - "flex": null, - "flex_flow": null, - "grid_area": null, - "grid_auto_columns": null, - "grid_auto_flow": null, - "grid_auto_rows": null, - "grid_column": null, - "grid_gap": null, - "grid_row": null, - "grid_template_areas": null, - "grid_template_columns": null, - "grid_template_rows": null, - "height": null, - "justify_content": null, - "justify_items": null, - "left": null, - "margin": null, - "max_height": null, - "max_width": null, - "min_height": null, - "min_width": null, - "object_fit": null, - "object_position": null, - "order": null, - "overflow": null, - "overflow_x": null, - "overflow_y": null, - "padding": null, - "right": null, - "top": null, - "visibility": null, - "width": null - } - }, - "24f3609a29944a6bbc88c59c5e5eee14": { - "model_module": "@jupyter-widgets/controls", - "model_name": "ProgressStyleModel", - "model_module_version": "1.5.0", - "state": { - "_model_module": "@jupyter-widgets/controls", - "_model_module_version": "1.5.0", - "_model_name": "ProgressStyleModel", - "_view_count": null, - "_view_module": "@jupyter-widgets/base", - "_view_module_version": "1.2.0", - "_view_name": "StyleView", - "bar_color": null, - "description_width": "" - } - }, - "b99c9208985f485cbbfbca51d5260ba5": { - "model_module": "@jupyter-widgets/base", - "model_name": "LayoutModel", - "model_module_version": "1.2.0", - "state": { - "_model_module": "@jupyter-widgets/base", - "_model_module_version": "1.2.0", - "_model_name": "LayoutModel", - "_view_count": null, - "_view_module": "@jupyter-widgets/base", - "_view_module_version": "1.2.0", - "_view_name": "LayoutView", - "align_content": null, - "align_items": null, - "align_self": null, - "border": null, - "bottom": null, - "display": null, - "flex": null, - "flex_flow": null, - "grid_area": null, - "grid_auto_columns": null, - "grid_auto_flow": null, - "grid_auto_rows": null, - "grid_column": null, - "grid_gap": null, - "grid_row": null, - "grid_template_areas": null, - "grid_template_columns": null, - "grid_template_rows": null, - "height": null, - "justify_content": null, - "justify_items": null, - "left": null, - "margin": null, - "max_height": null, - "max_width": null, - "min_height": null, - "min_width": null, - "object_fit": null, - "object_position": null, - "order": null, - "overflow": null, - "overflow_x": null, - "overflow_y": null, - "padding": null, - "right": null, - "top": null, - "visibility": null, 
- "width": null - } - }, - "d6960ffb549e4e69a919405abecb22a3": { - "model_module": "@jupyter-widgets/controls", - "model_name": "DescriptionStyleModel", - "model_module_version": "1.5.0", - "state": { - "_model_module": "@jupyter-widgets/controls", - "_model_module_version": "1.5.0", - "_model_name": "DescriptionStyleModel", - "_view_count": null, - "_view_module": "@jupyter-widgets/base", - "_view_module_version": "1.2.0", - "_view_name": "StyleView", - "description_width": "" - } - }, - "a2e3896132754b40bb9252709d86b17a": { - "model_module": "@jupyter-widgets/controls", - "model_name": "HBoxModel", - "model_module_version": "1.5.0", - "state": { - "_dom_classes": [], - "_model_module": "@jupyter-widgets/controls", - "_model_module_version": "1.5.0", - "_model_name": "HBoxModel", - "_view_count": null, - "_view_module": "@jupyter-widgets/controls", - "_view_module_version": "1.5.0", - "_view_name": "HBoxView", - "box_style": "", - "children": [ - "IPY_MODEL_920320e9727946abbbb4bfaf899b6d09", - "IPY_MODEL_e71f70b8dde344b38818b098fa418ec4", - "IPY_MODEL_1738720dc3e640aaaaf1d2eb609af8b6" - ], - "layout": "IPY_MODEL_ada1722de7c443ff93ef04df53b89404" - } - }, - "920320e9727946abbbb4bfaf899b6d09": { - "model_module": "@jupyter-widgets/controls", - "model_name": "HTMLModel", - "model_module_version": "1.5.0", - "state": { - "_dom_classes": [], - "_model_module": "@jupyter-widgets/controls", - "_model_module_version": "1.5.0", - "_model_name": "HTMLModel", - "_view_count": null, - "_view_module": "@jupyter-widgets/controls", - "_view_module_version": "1.5.0", - "_view_name": "HTMLView", - "description": "", - "description_tooltip": null, - "layout": "IPY_MODEL_74c3c91eb1964ac4a6ae58b0d95b874c", - "placeholder": "​", - "style": "IPY_MODEL_844d6117b65f442992c299535b66728e", - "value": "tokenizer_config.json: 100%" - } - }, - "e71f70b8dde344b38818b098fa418ec4": { - "model_module": "@jupyter-widgets/controls", - "model_name": "FloatProgressModel", - "model_module_version": "1.5.0", - "state": { - "_dom_classes": [], - "_model_module": "@jupyter-widgets/controls", - "_model_module_version": "1.5.0", - "_model_name": "FloatProgressModel", - "_view_count": null, - "_view_module": "@jupyter-widgets/controls", - "_view_module_version": "1.5.0", - "_view_name": "ProgressView", - "bar_style": "success", - "description": "", - "description_tooltip": null, - "layout": "IPY_MODEL_a62910759d0440e4a1e7b9a99cdccda0", - "max": 50941, - "min": 0, - "orientation": "horizontal", - "style": "IPY_MODEL_d93e7a2e9e1a4f3692fae8d59a3cdfcf", - "value": 50941 - } - }, - "1738720dc3e640aaaaf1d2eb609af8b6": { - "model_module": "@jupyter-widgets/controls", - "model_name": "HTMLModel", - "model_module_version": "1.5.0", - "state": { - "_dom_classes": [], - "_model_module": "@jupyter-widgets/controls", - "_model_module_version": "1.5.0", - "_model_name": "HTMLModel", - "_view_count": null, - "_view_module": "@jupyter-widgets/controls", - "_view_module_version": "1.5.0", - "_view_name": "HTMLView", - "description": "", - "description_tooltip": null, - "layout": "IPY_MODEL_de98dcfd70aa4144abb13896da001ccc", - "placeholder": "​", - "style": "IPY_MODEL_ec2ce8e9ce1a45e38af3196fd2e60546", - "value": " 50.9k/50.9k [00:00<00:00, 3.56MB/s]" - } - }, - "ada1722de7c443ff93ef04df53b89404": { - "model_module": "@jupyter-widgets/base", - "model_name": "LayoutModel", - "model_module_version": "1.2.0", - "state": { - "_model_module": "@jupyter-widgets/base", - "_model_module_version": "1.2.0", - "_model_name": "LayoutModel", - "_view_count": 
null, - "_view_module": "@jupyter-widgets/base", - "_view_module_version": "1.2.0", - "_view_name": "LayoutView", - "align_content": null, - "align_items": null, - "align_self": null, - "border": null, - "bottom": null, - "display": null, - "flex": null, - "flex_flow": null, - "grid_area": null, - "grid_auto_columns": null, - "grid_auto_flow": null, - "grid_auto_rows": null, - "grid_column": null, - "grid_gap": null, - "grid_row": null, - "grid_template_areas": null, - "grid_template_columns": null, - "grid_template_rows": null, - "height": null, - "justify_content": null, - "justify_items": null, - "left": null, - "margin": null, - "max_height": null, - "max_width": null, - "min_height": null, - "min_width": null, - "object_fit": null, - "object_position": null, - "order": null, - "overflow": null, - "overflow_x": null, - "overflow_y": null, - "padding": null, - "right": null, - "top": null, - "visibility": null, - "width": null - } - }, - "74c3c91eb1964ac4a6ae58b0d95b874c": { - "model_module": "@jupyter-widgets/base", - "model_name": "LayoutModel", - "model_module_version": "1.2.0", - "state": { - "_model_module": "@jupyter-widgets/base", - "_model_module_version": "1.2.0", - "_model_name": "LayoutModel", - "_view_count": null, - "_view_module": "@jupyter-widgets/base", - "_view_module_version": "1.2.0", - "_view_name": "LayoutView", - "align_content": null, - "align_items": null, - "align_self": null, - "border": null, - "bottom": null, - "display": null, - "flex": null, - "flex_flow": null, - "grid_area": null, - "grid_auto_columns": null, - "grid_auto_flow": null, - "grid_auto_rows": null, - "grid_column": null, - "grid_gap": null, - "grid_row": null, - "grid_template_areas": null, - "grid_template_columns": null, - "grid_template_rows": null, - "height": null, - "justify_content": null, - "justify_items": null, - "left": null, - "margin": null, - "max_height": null, - "max_width": null, - "min_height": null, - "min_width": null, - "object_fit": null, - "object_position": null, - "order": null, - "overflow": null, - "overflow_x": null, - "overflow_y": null, - "padding": null, - "right": null, - "top": null, - "visibility": null, - "width": null - } - }, - "844d6117b65f442992c299535b66728e": { - "model_module": "@jupyter-widgets/controls", - "model_name": "DescriptionStyleModel", - "model_module_version": "1.5.0", - "state": { - "_model_module": "@jupyter-widgets/controls", - "_model_module_version": "1.5.0", - "_model_name": "DescriptionStyleModel", - "_view_count": null, - "_view_module": "@jupyter-widgets/base", - "_view_module_version": "1.2.0", - "_view_name": "StyleView", - "description_width": "" - } - }, - "a62910759d0440e4a1e7b9a99cdccda0": { - "model_module": "@jupyter-widgets/base", - "model_name": "LayoutModel", - "model_module_version": "1.2.0", - "state": { - "_model_module": "@jupyter-widgets/base", - "_model_module_version": "1.2.0", - "_model_name": "LayoutModel", - "_view_count": null, - "_view_module": "@jupyter-widgets/base", - "_view_module_version": "1.2.0", - "_view_name": "LayoutView", - "align_content": null, - "align_items": null, - "align_self": null, - "border": null, - "bottom": null, - "display": null, - "flex": null, - "flex_flow": null, - "grid_area": null, - "grid_auto_columns": null, - "grid_auto_flow": null, - "grid_auto_rows": null, - "grid_column": null, - "grid_gap": null, - "grid_row": null, - "grid_template_areas": null, - "grid_template_columns": null, - "grid_template_rows": null, - "height": null, - "justify_content": null, - "justify_items": 
null, - "left": null, - "margin": null, - "max_height": null, - "max_width": null, - "min_height": null, - "min_width": null, - "object_fit": null, - "object_position": null, - "order": null, - "overflow": null, - "overflow_x": null, - "overflow_y": null, - "padding": null, - "right": null, - "top": null, - "visibility": null, - "width": null - } - }, - "d93e7a2e9e1a4f3692fae8d59a3cdfcf": { - "model_module": "@jupyter-widgets/controls", - "model_name": "ProgressStyleModel", - "model_module_version": "1.5.0", - "state": { - "_model_module": "@jupyter-widgets/controls", - "_model_module_version": "1.5.0", - "_model_name": "ProgressStyleModel", - "_view_count": null, - "_view_module": "@jupyter-widgets/base", - "_view_module_version": "1.2.0", - "_view_name": "StyleView", - "bar_color": null, - "description_width": "" - } - }, - "de98dcfd70aa4144abb13896da001ccc": { - "model_module": "@jupyter-widgets/base", - "model_name": "LayoutModel", - "model_module_version": "1.2.0", - "state": { - "_model_module": "@jupyter-widgets/base", - "_model_module_version": "1.2.0", - "_model_name": "LayoutModel", - "_view_count": null, - "_view_module": "@jupyter-widgets/base", - "_view_module_version": "1.2.0", - "_view_name": "LayoutView", - "align_content": null, - "align_items": null, - "align_self": null, - "border": null, - "bottom": null, - "display": null, - "flex": null, - "flex_flow": null, - "grid_area": null, - "grid_auto_columns": null, - "grid_auto_flow": null, - "grid_auto_rows": null, - "grid_column": null, - "grid_gap": null, - "grid_row": null, - "grid_template_areas": null, - "grid_template_columns": null, - "grid_template_rows": null, - "height": null, - "justify_content": null, - "justify_items": null, - "left": null, - "margin": null, - "max_height": null, - "max_width": null, - "min_height": null, - "min_width": null, - "object_fit": null, - "object_position": null, - "order": null, - "overflow": null, - "overflow_x": null, - "overflow_y": null, - "padding": null, - "right": null, - "top": null, - "visibility": null, - "width": null - } - }, - "ec2ce8e9ce1a45e38af3196fd2e60546": { - "model_module": "@jupyter-widgets/controls", - "model_name": "DescriptionStyleModel", - "model_module_version": "1.5.0", - "state": { - "_model_module": "@jupyter-widgets/controls", - "_model_module_version": "1.5.0", - "_model_name": "DescriptionStyleModel", - "_view_count": null, - "_view_module": "@jupyter-widgets/base", - "_view_module_version": "1.2.0", - "_view_name": "StyleView", - "description_width": "" - } - }, - "260fa61b341b407ea1732ef8ace93c5e": { - "model_module": "@jupyter-widgets/controls", - "model_name": "HBoxModel", - "model_module_version": "1.5.0", - "state": { - "_dom_classes": [], - "_model_module": "@jupyter-widgets/controls", - "_model_module_version": "1.5.0", - "_model_name": "HBoxModel", - "_view_count": null, - "_view_module": "@jupyter-widgets/controls", - "_view_module_version": "1.5.0", - "_view_name": "HBoxView", - "box_style": "", - "children": [ - "IPY_MODEL_644390bc94554bc6a2cf8f5add94a1d7", - "IPY_MODEL_f3f39db72b2b4ac99641eda95606701b", - "IPY_MODEL_8dac808a75e94c33bc766ac34befc1df" - ], - "layout": "IPY_MODEL_65f8d58f5f01479cb4a1241a23ea7978" - } - }, - "644390bc94554bc6a2cf8f5add94a1d7": { - "model_module": "@jupyter-widgets/controls", - "model_name": "HTMLModel", - "model_module_version": "1.5.0", - "state": { - "_dom_classes": [], - "_model_module": "@jupyter-widgets/controls", - "_model_module_version": "1.5.0", - "_model_name": "HTMLModel", - "_view_count": null, - 
"_view_module": "@jupyter-widgets/controls", - "_view_module_version": "1.5.0", - "_view_name": "HTMLView", - "description": "", - "description_tooltip": null, - "layout": "IPY_MODEL_2e16ed7fde6e409a8dac1fc3fed2b4c1", - "placeholder": "​", - "style": "IPY_MODEL_813881219e0c420e80c9a4b27a023d54", - "value": "tokenizer.json: 100%" - } + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "id": "rJ9juQ-XKJKK" + }, + "outputs": [], + "source": [ + "!pip install -q unstructured-client unstructured[all-docs] langchain transformers accelerate bitsandbytes sentence-transformers faiss-gpu" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "id": "3JWGSEoZKbtN" + }, + "outputs": [], + "source": [ + "import os\n", + "\n", + "os.environ[\"UNSTRUCTURED_API_KEY\"] = \"YOUR_UNSTRUCTURED_API_KEY\"" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "id": "OIG6V3swKyIZ" + }, + "outputs": [], + "source": [ + "from unstructured_client import UnstructuredClient\n", + "\n", + "unstructured_api_key = os.environ.get(\"UNSTRUCTURED_API_KEY\")\n", + "\n", + "client = UnstructuredClient(\n", + " api_key_auth=unstructured_api_key,\n", + " # if using paid API, provide your unique API URL:\n", + " # server_url=\"YOUR_API_URL\",\n", + ")" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "CwzrH-9_K6-z" + }, + "source": [ + "Partition, and chunk your file so that the logical structure of the document is preserved for better RAG results." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "id": "qThmnl_fJzrb" + }, + "outputs": [], + "source": [ + "from unstructured_client.models import shared\n", + "from unstructured_client.models.errors import SDKError\n", + "from unstructured.staging.base import dict_to_elements\n", + "\n", + "path_to_pdf=\"PATH_TO_YOUR_PDF_FILE\"\n", + "\n", + "with open(path_to_pdf, \"rb\") as f:\n", + " files=shared.Files(\n", + " content=f.read(),\n", + " file_name=path_to_pdf,\n", + " )\n", + " req = shared.PartitionParameters(\n", + " files=files,\n", + " chunking_strategy=\"by_title\",\n", + " max_characters=512,\n", + " )\n", + " try:\n", + " resp = client.general.partition(req)\n", + " except SDKError as e:\n", + " print(e)\n", + "\n", + "elements = dict_to_elements(resp.elements)" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "UF1TaeByPEBR" + }, + "source": [ + "Create LangChain documents from document chunks and their metadata, and ingest those documents into the FAISS vectorstore.\n", + "\n", + "Set up the retriever." 
+ ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "colab": { + "base_uri": "https://localhost:8080/", + "height": 496, + "referenced_widgets": [ + "282a07aee3e04d7a9902cf4621c2db7a", + "d9682345a06f4c668b69077b1f677399", + "530174416a3e4cac9d9eac6633a5d886", + "229e2ecd668a49d1b9d4b78b5822d0e6", + "7daed9b7dfc34b3a94e6d47805d26427", + "6b8ce03feffe4747b0af404e90e0975e", + "7870c1f4e16e48dea5feb7498b9e837f", + "265de5620d9c483f96b0074ad1ca132d", + "f7cea318fe684c7b95575e6d33dc207d", + "d92c1c7eaec2439fa0d885d2ae833fd9", + "69cc7595e4764ad2806479b52d736faa", + "8d536f000daf4a928b9daeddf76cb375", + "a3e1d6f639c84b21aeb7e4ac96381ec8", + "5f9e84ae3afe47d49b257941777ca5b0", + "69d64f8babda4c52aea682b26d1a6e00", + "b9d09d77494f4e2aa9b500f5d0fdadaf", + "9b318fd56d8e45dc8c4d532662c5aba0", + "a8513c36560449099f7c257c21d5065b", + "5071d412953e41458aea25cf2864ac23", + "100f17a84bad4191bb7977949be2cdd4", + "151a4e93b66d4da5974bdb5f38596754", + "088e8878cd114c59b18e8bb9ed048457", + "f6766f2b04da46bdbf2df051606fce2f", + "2fd3a6fe6727423aa44840b17ebbcb34", + "e857e11e572742168303037071a477ba", + "7eb7ef9ef3af46acb461b42efb9c4af0", + "4db0da1b27344e45b7f235911185a4a0", + "ff1415afeac740c69bee0facc692a8e1", + "86104b3ee4564b8e9cef14eec70f1391", + "40a2a615a56e445fa72383446136d980", + "f39c6dc24936447dbbd0a3279c20223e", + "b4a4dca5f5864bd3a1b202517c49b792", + "58c008b0cbf04cf2872afa353e6b9f76", + "8c5086b82bc243f5bac08f9aed527c19", + "f98ad69de71b4ff5bc43e3d4cd4cc5ee", + "8b98d1642c3e4c959952248ce40c60c7", + "1134e014bcc242888828682645029a1e", + "423cefd4340c482cb70337168049f1c7", + "e7dd117cd6684500bf4b11a10e7cf0e9", + "2ba6d64387244836bc00e3d62a552cae", + "e3d77e059a5f47eb8dbeeef22febc9d8", + "3cad7442bcfa45fd9aed4002324a9d7f", + "df5fe4b465b646d2938cf8da656a8d33", + "56a410608c3a49ed90153c2989f6b8d9", + "50550ce1b30649b9b7a382f488025af1", + "ebea966f35ea40d7ad4a4f6fe34b4bda", + "f732f41c4de44ea6a0db1de49695f85e", + "fd3ea606c386466d9c6968f8728533b5", + "8fccf616a7694c529c0c966d417cad51", + "fb8b5e13e19e4424ac25a499550868da", + "fd44f15fb985473eb6ddccae5a39f04d", + "61d9fb21d35948fead864e381993a677", + "acd718fefdc84b8fa60d4255077914ee", + "ebdcaa70abe74743890e7de4d4685291", + "845ffe1f73164cd2a3e4f413e2a35241", + "715c90e0865d4e01943a841fdf3eb0dd", + "38ac4502272241f291ce81ec6802b55f", + "11fb0295d6ad47ce85d9382bdb1723fd", + "b62c504dd67f4d08b5e19d4ca86a8379", + "f94070d542904bb4ba45a550203c5ac2", + "77af9e788b0446de8cd84f1828877588", + "7efaf19ab2624dbf9adde1f8d03f3911", + "d0518bbbde494ab1876c98f3d514832a", + "c2b7df874d334dcdbb433897c84b079d", + "f7bbd95f043d46339377ee21821735d8", + "2e4a239969f443bf80829b3f21ffebdd", + "519c0f5256974835b547b28e94ff2224", + "f4b86788d3ce49b7bce7d273d8bf2cad", + "098e801e0c794453b31d69805208dcd9", + "3b20a4851944472198eb2e4f24ee3d96", + "58ff4d02a7b0449da85e8f48562d2b87", + "a57019b4b2394e3a8144831d400d7c0b", + "932611e22f1f4a1f9598be2a287d397a", + "1b093409a5c340a6a74fd0f64e4e6154", + "998a415e72704b20b00c2e3e4bf37614", + "e177d2ee77e24c17a723986c62ae5367", + "ce1dcb5db8c546cb84761c141924cd98", + "c03ef8cf6b654e1994d691cbdb331e55", + "350258a4b1e848e0b28880b868412862", + "9e003352a7d742fb833f15b532a5ad53", + "4e3dd4a0fa4e42be9515f89bbdada0b1", + "7d84283a574f470a9f130aa40fad0083", + "b0512a445d8748e0a538cd037dd80480", + "eebc474029844a1b88a022ad694d1f17", + "6fac241a17194caf908895b8f4cdb377", + "ef9b4e9212c845359acb1a0d295c2211", + "c2e105c1ac584ba9b12bd4b22191d3a8", + "9d6f989fbb584b2eb25172fcc662b2e7", + 
"8e3912f4f3134e7f90ae0959536f6429", + "264da80b4e3e446199cf9d53b3f4ed58", + "866796ac33744148b117d864df6283dc", + "702057124ce3481dab7da4bedce9c7e4", + "a67b0a951275400bbfdc5ab341cd98bc", + "58b683922de04bff9a26487612d10288", + "4464dca33eb24b5793457e70910b8aa4", + "8945460a9d714e559bfeb3d6b5c77196", + "595ca1e634504c98adf61fc08c98a826", + "1b7d1263951b4a3ea339b4d2c871f2c8", + "e0c7d450ac65475b869768d8bb5fb7bb", + "e6efa0c798d14790a6a57a566927bfb2", + "c31d528f88ff43bc93b2d0284c52b308", + "45282b1857ab4ebcb84a7cb5cd209062", + "34a7c9e315c745bb8075547e8414a5e6", + "19c0c0c106f24040a18d74cb75d8057c", + "ed201ab44f23458e9b8752c6920484f5", + "1f4ac8691f234d50abf1460a029403c7", + "e8d3f4424c4b42e2ba4c94429830b977", + "28a912965c254793a1bf3624f7faa17b", + "b46c929b3e904e39ae2187a8b08c66a9", + "5645173b5b724ad4be483a81839fcf08", + "65383a4be13a441ba688676051fe6d29", + "d76b9f06c08b4634a51199db44c0a11c", + "8a429e04f8c04b4cbccc2a0fbfc69f39", + "ac5e4ffea47a4a148219e7e998b5c325", + "d1fef74b9af44ac0b3ff021292982577", + "2d446aa187c04de79bdb5aae679799e7", + "b6e5248afcbd4241a6f7e08e81b52777", + "594e9fa90fac4d40b1524d57304b37ee", + "1b0ce7f124384da48b7ed934a283a191", + "f705b798938b4e919593be8fd9bbcc64", + "e28088caf5904fbcb81ff49cd04a1667" + ] }, - "f3f39db72b2b4ac99641eda95606701b": { - "model_module": "@jupyter-widgets/controls", - "model_name": "FloatProgressModel", - "model_module_version": "1.5.0", - "state": { - "_dom_classes": [], - "_model_module": "@jupyter-widgets/controls", - "_model_module_version": "1.5.0", - "_model_name": "FloatProgressModel", - "_view_count": null, - "_view_module": "@jupyter-widgets/controls", - "_view_module_version": "1.5.0", - "_view_name": "ProgressView", - "bar_style": "success", - "description": "", - "description_tooltip": null, - "layout": "IPY_MODEL_88f77b9840e543be8c2eedd9ea3e8f09", - "max": 9084490, - "min": 0, - "orientation": "horizontal", - "style": "IPY_MODEL_69e9503329df4e7f97476d2b4c1d00b7", - "value": 9084490 - } + "id": "bT1XTlQLMBX2", + "outputId": "0fd10f26-6ae5-4183-89cc-fa53e438e073" + }, + "outputs": [ + { + "name": "stderr", + "output_type": "stream", + "text": [ + "/usr/local/lib/python3.10/dist-packages/huggingface_hub/utils/_token.py:88: UserWarning: \n", + "The secret `HF_TOKEN` does not exist in your Colab secrets.\n", + "To authenticate with the Hugging Face Hub, create a token in your settings tab (https://huggingface.co/settings/tokens), set it as secret in your Google Colab and restart your session.\n", + "You will be able to reuse this secret in all of your notebooks.\n", + "Please note that authentication is recommended but still optional to access public models or datasets.\n", + " warnings.warn(\n" + ] }, - "8dac808a75e94c33bc766ac34befc1df": { - "model_module": "@jupyter-widgets/controls", - "model_name": "HTMLModel", - "model_module_version": "1.5.0", - "state": { - "_dom_classes": [], - "_model_module": "@jupyter-widgets/controls", - "_model_module_version": "1.5.0", - "_model_name": "HTMLModel", - "_view_count": null, - "_view_module": "@jupyter-widgets/controls", - "_view_module_version": "1.5.0", - "_view_name": "HTMLView", - "description": "", - "description_tooltip": null, - "layout": "IPY_MODEL_24243f2c5cfc4dc0b19f69eca49639a5", - "placeholder": "​", - "style": "IPY_MODEL_bdcb2072ab124038a60f41bb165974d1", - "value": " 9.08M/9.08M [00:00<00:00, 24.2MB/s]" - } + { + "data": { + "application/vnd.jupyter.widget-view+json": { + "model_id": "282a07aee3e04d7a9902cf4621c2db7a", + "version_major": 2, + "version_minor": 0 + }, 
+ "text/plain": [ + "modules.json: 0%| | 0.00/349 [00:00<|start_header_id|>system<|end_header_id|>\n", "\n", - "with open(path_to_pdf, \"rb\") as f:\n", - " files=shared.Files(\n", - " content=f.read(),\n", - " file_name=path_to_pdf,\n", - " )\n", - " req = shared.PartitionParameters(\n", - " files=files,\n", - " chunking_strategy=\"by_title\",\n", - " max_characters=512,\n", - " )\n", - " try:\n", - " resp = client.general.partition(req)\n", - " except SDKError as e:\n", - " print(e)\n", + "{{ system_prompt }}<|eot_id|><|start_header_id|>user<|end_header_id|>\n", "\n", - "elements = dict_to_elements(resp.elements)" - ] - }, - { - "cell_type": "markdown", - "source": [ - "Create LangChain documents from document chunks and their metadata, and ingest those documents into the FAISS vectorstore.\n", + "{{ user_msg_1 }}<|eot_id|><|start_header_id|>assistant<|end_header_id|>\n", "\n", - "Set up the retriever." - ], - "metadata": { - "id": "UF1TaeByPEBR" - } + "{{ model_answer_1 }}<|eot_id|>\n", + "```" + ] }, { "cell_type": "code", + "execution_count": null, + "metadata": { + "id": "b-8hFNkoM0LC" + }, + "outputs": [], "source": [ - "from langchain_core.documents import Document\n", - "from langchain.vectorstores import FAISS\n", - "from langchain.embeddings import HuggingFaceEmbeddings\n", + "from langchain.llms import HuggingFacePipeline\n", + "from langchain.prompts import PromptTemplate\n", + "from transformers import pipeline\n", + "from langchain_core.output_parsers import StrOutputParser\n", + "from langchain_core.runnables import RunnablePassthrough\n", "\n", - "documents = []\n", - "for element in elements:\n", - " metadata = element.metadata.to_dict()\n", - " documents.append(Document(page_content=element.text, metadata=metadata))\n", + "terminators = [\n", + " tokenizer.eos_token_id,\n", + " tokenizer.convert_tokens_to_ids(\"<|eot_id|>\")\n", + "]\n", "\n", - "db = FAISS.from_documents(documents, HuggingFaceEmbeddings(model_name=\"BAAI/bge-base-en-v1.5\"))\n", - "retriever = db.as_retriever(search_type=\"similarity\", search_kwargs={\"k\": 4})" - ], - "metadata": { - "colab": { - "base_uri": "https://localhost:8080/", - "height": 496, - "referenced_widgets": [ - "282a07aee3e04d7a9902cf4621c2db7a", - "d9682345a06f4c668b69077b1f677399", - "530174416a3e4cac9d9eac6633a5d886", - "229e2ecd668a49d1b9d4b78b5822d0e6", - "7daed9b7dfc34b3a94e6d47805d26427", - "6b8ce03feffe4747b0af404e90e0975e", - "7870c1f4e16e48dea5feb7498b9e837f", - "265de5620d9c483f96b0074ad1ca132d", - "f7cea318fe684c7b95575e6d33dc207d", - "d92c1c7eaec2439fa0d885d2ae833fd9", - "69cc7595e4764ad2806479b52d736faa", - "8d536f000daf4a928b9daeddf76cb375", - "a3e1d6f639c84b21aeb7e4ac96381ec8", - "5f9e84ae3afe47d49b257941777ca5b0", - "69d64f8babda4c52aea682b26d1a6e00", - "b9d09d77494f4e2aa9b500f5d0fdadaf", - "9b318fd56d8e45dc8c4d532662c5aba0", - "a8513c36560449099f7c257c21d5065b", - "5071d412953e41458aea25cf2864ac23", - "100f17a84bad4191bb7977949be2cdd4", - "151a4e93b66d4da5974bdb5f38596754", - "088e8878cd114c59b18e8bb9ed048457", - "f6766f2b04da46bdbf2df051606fce2f", - "2fd3a6fe6727423aa44840b17ebbcb34", - "e857e11e572742168303037071a477ba", - "7eb7ef9ef3af46acb461b42efb9c4af0", - "4db0da1b27344e45b7f235911185a4a0", - "ff1415afeac740c69bee0facc692a8e1", - "86104b3ee4564b8e9cef14eec70f1391", - "40a2a615a56e445fa72383446136d980", - "f39c6dc24936447dbbd0a3279c20223e", - "b4a4dca5f5864bd3a1b202517c49b792", - "58c008b0cbf04cf2872afa353e6b9f76", - "8c5086b82bc243f5bac08f9aed527c19", - "f98ad69de71b4ff5bc43e3d4cd4cc5ee", - 
"8b98d1642c3e4c959952248ce40c60c7", - "1134e014bcc242888828682645029a1e", - "423cefd4340c482cb70337168049f1c7", - "e7dd117cd6684500bf4b11a10e7cf0e9", - "2ba6d64387244836bc00e3d62a552cae", - "e3d77e059a5f47eb8dbeeef22febc9d8", - "3cad7442bcfa45fd9aed4002324a9d7f", - "df5fe4b465b646d2938cf8da656a8d33", - "56a410608c3a49ed90153c2989f6b8d9", - "50550ce1b30649b9b7a382f488025af1", - "ebea966f35ea40d7ad4a4f6fe34b4bda", - "f732f41c4de44ea6a0db1de49695f85e", - "fd3ea606c386466d9c6968f8728533b5", - "8fccf616a7694c529c0c966d417cad51", - "fb8b5e13e19e4424ac25a499550868da", - "fd44f15fb985473eb6ddccae5a39f04d", - "61d9fb21d35948fead864e381993a677", - "acd718fefdc84b8fa60d4255077914ee", - "ebdcaa70abe74743890e7de4d4685291", - "845ffe1f73164cd2a3e4f413e2a35241", - "715c90e0865d4e01943a841fdf3eb0dd", - "38ac4502272241f291ce81ec6802b55f", - "11fb0295d6ad47ce85d9382bdb1723fd", - "b62c504dd67f4d08b5e19d4ca86a8379", - "f94070d542904bb4ba45a550203c5ac2", - "77af9e788b0446de8cd84f1828877588", - "7efaf19ab2624dbf9adde1f8d03f3911", - "d0518bbbde494ab1876c98f3d514832a", - "c2b7df874d334dcdbb433897c84b079d", - "f7bbd95f043d46339377ee21821735d8", - "2e4a239969f443bf80829b3f21ffebdd", - "519c0f5256974835b547b28e94ff2224", - "f4b86788d3ce49b7bce7d273d8bf2cad", - "098e801e0c794453b31d69805208dcd9", - "3b20a4851944472198eb2e4f24ee3d96", - "58ff4d02a7b0449da85e8f48562d2b87", - "a57019b4b2394e3a8144831d400d7c0b", - "932611e22f1f4a1f9598be2a287d397a", - "1b093409a5c340a6a74fd0f64e4e6154", - "998a415e72704b20b00c2e3e4bf37614", - "e177d2ee77e24c17a723986c62ae5367", - "ce1dcb5db8c546cb84761c141924cd98", - "c03ef8cf6b654e1994d691cbdb331e55", - "350258a4b1e848e0b28880b868412862", - "9e003352a7d742fb833f15b532a5ad53", - "4e3dd4a0fa4e42be9515f89bbdada0b1", - "7d84283a574f470a9f130aa40fad0083", - "b0512a445d8748e0a538cd037dd80480", - "eebc474029844a1b88a022ad694d1f17", - "6fac241a17194caf908895b8f4cdb377", - "ef9b4e9212c845359acb1a0d295c2211", - "c2e105c1ac584ba9b12bd4b22191d3a8", - "9d6f989fbb584b2eb25172fcc662b2e7", - "8e3912f4f3134e7f90ae0959536f6429", - "264da80b4e3e446199cf9d53b3f4ed58", - "866796ac33744148b117d864df6283dc", - "702057124ce3481dab7da4bedce9c7e4", - "a67b0a951275400bbfdc5ab341cd98bc", - "58b683922de04bff9a26487612d10288", - "4464dca33eb24b5793457e70910b8aa4", - "8945460a9d714e559bfeb3d6b5c77196", - "595ca1e634504c98adf61fc08c98a826", - "1b7d1263951b4a3ea339b4d2c871f2c8", - "e0c7d450ac65475b869768d8bb5fb7bb", - "e6efa0c798d14790a6a57a566927bfb2", - "c31d528f88ff43bc93b2d0284c52b308", - "45282b1857ab4ebcb84a7cb5cd209062", - "34a7c9e315c745bb8075547e8414a5e6", - "19c0c0c106f24040a18d74cb75d8057c", - "ed201ab44f23458e9b8752c6920484f5", - "1f4ac8691f234d50abf1460a029403c7", - "e8d3f4424c4b42e2ba4c94429830b977", - "28a912965c254793a1bf3624f7faa17b", - "b46c929b3e904e39ae2187a8b08c66a9", - "5645173b5b724ad4be483a81839fcf08", - "65383a4be13a441ba688676051fe6d29", - "d76b9f06c08b4634a51199db44c0a11c", - "8a429e04f8c04b4cbccc2a0fbfc69f39", - "ac5e4ffea47a4a148219e7e998b5c325", - "d1fef74b9af44ac0b3ff021292982577", - "2d446aa187c04de79bdb5aae679799e7", - "b6e5248afcbd4241a6f7e08e81b52777", - "594e9fa90fac4d40b1524d57304b37ee", - "1b0ce7f124384da48b7ed934a283a191", - "f705b798938b4e919593be8fd9bbcc64", - "e28088caf5904fbcb81ff49cd04a1667" + "text_generation_pipeline = pipeline(\n", + " model=model,\n", + " tokenizer=tokenizer,\n", + " task=\"text-generation\",\n", + " temperature=0.2,\n", + " do_sample=True,\n", + " repetition_penalty=1.1,\n", + " return_full_text=False,\n", + " max_new_tokens=200,\n", + " 
eos_token_id=terminators,\n", + ")\n", + "\n", + "llm = HuggingFacePipeline(pipeline=text_generation_pipeline)\n", + "\n", + "prompt_template = \"\"\"\n", + "<|start_header_id|>user<|end_header_id|>\n", + "You are an assistant for answering questions about IPM.\n", + "You are given the extracted parts of a long document and a question. Provide a conversational answer.\n", + "If you don't know the answer, just say \"I do not know.\" Don't make up an answer.\n", + "Question: {question}\n", + "Context: {context}<|eot_id|><|start_header_id|>assistant<|end_header_id|>\n", + "\"\"\"\n", + "\n", + "prompt = PromptTemplate(\n", + " input_variables=[\"context\", \"question\"],\n", + " template=prompt_template,\n", + ")\n", + "\n", + "def format_docs(docs):\n", + " return \"\\n\\n\".join(doc.page_content for doc in docs)\n", + "\n", + "rag_chain = (\n", + " {\"context\": retriever | format_docs, \"question\": RunnablePassthrough()}\n", + " | prompt\n", + " | llm\n", + " | StrOutputParser()\n", + ")" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "u18Cc6msce5l" + }, + "source": [ + "Tada! Your RAG is ready to use. Pass a question: the retriever will add relevant context from your document, and Llama 3 will generate an answer.\n", + "Here, my document was a chapter from a book on IPM, which stands for \"Integrated Pest Management\"." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "colab": { + "base_uri": "https://localhost:8080/", + "height": 143 + }, + "id": "i47dsLeFNVPB", + "outputId": "ce896b69-d035-4b69-ecba-be229e68aef1" + }, + "outputs": [ + { + "name": "stderr", + "output_type": "stream", + "text": [ + "Setting `pad_token_id` to `eos_token_id`:128001 for open-end generation.\n" + ] + }, + { + "data": { + "application/vnd.google.colaboratory.intrinsic+json": { + "type": "string" + }, + "text/plain": [ + "'Based on the provided documents, a cultural control in IPM refers to disrupting the pest life cycle or making the environment less suited for survival. This includes practices such as rotating crops, using optimum growing conditions, and maintaining sanitation.\\n\\nFor instance, rotating crops can help break the life cycle of certain pests, while using optimum growing conditions can promote healthy plant growth and reduce the likelihood of pest infestation. Similarly, maintaining sanitation can prevent pests from finding food and shelter, thereby reducing their ability to survive and reproduce.\\n\\nThese cultural controls are often considered preventive measures, as they can help prevent pest problems from occurring in the first place. 
By incorporating cultural controls into an IPM program, farmers and gardeners can reduce their reliance on chemical pesticides and create a more sustainable and environmentally friendly approach to managing pests.'" + ] + }, + "execution_count": 38, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "question = \"What is considered a cultural control in IPM?\"\n", + "rag_chain.invoke(question)" + ] + } + ], + "metadata": { + "accelerator": "GPU", + "colab": { + "gpuType": "T4", + "provenance": [] + }, + "kernelspec": { + "display_name": "Python 3", + "name": "python3" + }, + "language_info": { + "name": "python" + }, + "widgets": { + "application/vnd.jupyter.widget-state+json": { + "0038bd8e5cc446cfa5637aecb86a405c": { + "model_module": "@jupyter-widgets/controls", + "model_module_version": "1.5.0", + "model_name": "HTMLModel", + "state": { + "_dom_classes": [], + "_model_module": "@jupyter-widgets/controls", + "_model_module_version": "1.5.0", + "_model_name": "HTMLModel", + "_view_count": null, + "_view_module": "@jupyter-widgets/controls", + "_view_module_version": "1.5.0", + "_view_name": "HTMLView", + "description": "", + "description_tooltip": null, + "layout": "IPY_MODEL_85a1497ae9964e77885583613a7c6d9a", + "placeholder": "​", + "style": "IPY_MODEL_0b43aa2b874b487b91731e463cf853a5", + "value": " 1.17G/1.17G [00:14<00:00, 121MB/s]" + } + }, + "07d99c2f9a2e4a038072a085a745a080": { + "model_module": "@jupyter-widgets/base", + "model_module_version": "1.2.0", + "model_name": "LayoutModel", + "state": { + "_model_module": "@jupyter-widgets/base", + "_model_module_version": "1.2.0", + "_model_name": "LayoutModel", + "_view_count": null, + "_view_module": "@jupyter-widgets/base", + "_view_module_version": "1.2.0", + "_view_name": "LayoutView", + "align_content": null, + "align_items": null, + "align_self": null, + "border": null, + "bottom": null, + "display": null, + "flex": null, + "flex_flow": null, + "grid_area": null, + "grid_auto_columns": null, + "grid_auto_flow": null, + "grid_auto_rows": null, + "grid_column": null, + "grid_gap": null, + "grid_row": null, + "grid_template_areas": null, + "grid_template_columns": null, + "grid_template_rows": null, + "height": null, + "justify_content": null, + "justify_items": null, + "left": null, + "margin": null, + "max_height": null, + "max_width": null, + "min_height": null, + "min_width": null, + "object_fit": null, + "object_position": null, + "order": null, + "overflow": null, + "overflow_x": null, + "overflow_y": null, + "padding": null, + "right": null, + "top": null, + "visibility": null, + "width": null + } + }, + "088e8878cd114c59b18e8bb9ed048457": { + "model_module": "@jupyter-widgets/controls", + "model_module_version": "1.5.0", + "model_name": "DescriptionStyleModel", + "state": { + "_model_module": "@jupyter-widgets/controls", + "_model_module_version": "1.5.0", + "_model_name": "DescriptionStyleModel", + "_view_count": null, + "_view_module": "@jupyter-widgets/base", + "_view_module_version": "1.2.0", + "_view_name": "StyleView", + "description_width": "" + } + }, + "089ed96280c64a0588d48e04f92c183a": { + "model_module": "@jupyter-widgets/controls", + "model_module_version": "1.5.0", + "model_name": "HTMLModel", + "state": { + "_dom_classes": [], + "_model_module": "@jupyter-widgets/controls", + "_model_module_version": "1.5.0", + "_model_name": "HTMLModel", + "_view_count": null, + "_view_module": "@jupyter-widgets/controls", + "_view_module_version": "1.5.0", + "_view_name": "HTMLView", + 
"description": "", + "description_tooltip": null, + "layout": "IPY_MODEL_89b20da2defb418299dd1b4b41f3414a", + "placeholder": "​", + "style": "IPY_MODEL_8d6d4222b2c348bf8f79031dba831dae", + "value": " 4/4 [02:40<00:00, 35.62s/it]" + } + }, + "098e801e0c794453b31d69805208dcd9": { + "model_module": "@jupyter-widgets/controls", + "model_module_version": "1.5.0", + "model_name": "FloatProgressModel", + "state": { + "_dom_classes": [], + "_model_module": "@jupyter-widgets/controls", + "_model_module_version": "1.5.0", + "_model_name": "FloatProgressModel", + "_view_count": null, + "_view_module": "@jupyter-widgets/controls", + "_view_module_version": "1.5.0", + "_view_name": "ProgressView", + "bar_style": "success", + "description": "", + "description_tooltip": null, + "layout": "IPY_MODEL_1b093409a5c340a6a74fd0f64e4e6154", + "max": 366, + "min": 0, + "orientation": "horizontal", + "style": "IPY_MODEL_998a415e72704b20b00c2e3e4bf37614", + "value": 366 + } + }, + "0a379a9f2ae84ed6bc45427e3e3eb916": { + "model_module": "@jupyter-widgets/controls", + "model_module_version": "1.5.0", + "model_name": "DescriptionStyleModel", + "state": { + "_model_module": "@jupyter-widgets/controls", + "_model_module_version": "1.5.0", + "_model_name": "DescriptionStyleModel", + "_view_count": null, + "_view_module": "@jupyter-widgets/base", + "_view_module_version": "1.2.0", + "_view_name": "StyleView", + "description_width": "" + } + }, + "0b43aa2b874b487b91731e463cf853a5": { + "model_module": "@jupyter-widgets/controls", + "model_module_version": "1.5.0", + "model_name": "DescriptionStyleModel", + "state": { + "_model_module": "@jupyter-widgets/controls", + "_model_module_version": "1.5.0", + "_model_name": "DescriptionStyleModel", + "_view_count": null, + "_view_module": "@jupyter-widgets/base", + "_view_module_version": "1.2.0", + "_view_name": "StyleView", + "description_width": "" + } + }, + "0c91a9d8b10d49a5afde9f9208acb95f": { + "model_module": "@jupyter-widgets/base", + "model_module_version": "1.2.0", + "model_name": "LayoutModel", + "state": { + "_model_module": "@jupyter-widgets/base", + "_model_module_version": "1.2.0", + "_model_name": "LayoutModel", + "_view_count": null, + "_view_module": "@jupyter-widgets/base", + "_view_module_version": "1.2.0", + "_view_name": "LayoutView", + "align_content": null, + "align_items": null, + "align_self": null, + "border": null, + "bottom": null, + "display": null, + "flex": null, + "flex_flow": null, + "grid_area": null, + "grid_auto_columns": null, + "grid_auto_flow": null, + "grid_auto_rows": null, + "grid_column": null, + "grid_gap": null, + "grid_row": null, + "grid_template_areas": null, + "grid_template_columns": null, + "grid_template_rows": null, + "height": null, + "justify_content": null, + "justify_items": null, + "left": null, + "margin": null, + "max_height": null, + "max_width": null, + "min_height": null, + "min_width": null, + "object_fit": null, + "object_position": null, + "order": null, + "overflow": null, + "overflow_x": null, + "overflow_y": null, + "padding": null, + "right": null, + "top": null, + "visibility": null, + "width": null + } + }, + "0df29c938b5940edaf4d2804ae08ef2e": { + "model_module": "@jupyter-widgets/controls", + "model_module_version": "1.5.0", + "model_name": "ProgressStyleModel", + "state": { + "_model_module": "@jupyter-widgets/controls", + "_model_module_version": "1.5.0", + "_model_name": "ProgressStyleModel", + "_view_count": null, + "_view_module": "@jupyter-widgets/base", + "_view_module_version": "1.2.0", + 
"_view_name": "StyleView", + "bar_color": null, + "description_width": "" + } + }, + "0df353d732694e9ab1d6c9530c45a0d4": { + "model_module": "@jupyter-widgets/base", + "model_module_version": "1.2.0", + "model_name": "LayoutModel", + "state": { + "_model_module": "@jupyter-widgets/base", + "_model_module_version": "1.2.0", + "_model_name": "LayoutModel", + "_view_count": null, + "_view_module": "@jupyter-widgets/base", + "_view_module_version": "1.2.0", + "_view_name": "LayoutView", + "align_content": null, + "align_items": null, + "align_self": null, + "border": null, + "bottom": null, + "display": null, + "flex": null, + "flex_flow": null, + "grid_area": null, + "grid_auto_columns": null, + "grid_auto_flow": null, + "grid_auto_rows": null, + "grid_column": null, + "grid_gap": null, + "grid_row": null, + "grid_template_areas": null, + "grid_template_columns": null, + "grid_template_rows": null, + "height": null, + "justify_content": null, + "justify_items": null, + "left": null, + "margin": null, + "max_height": null, + "max_width": null, + "min_height": null, + "min_width": null, + "object_fit": null, + "object_position": null, + "order": null, + "overflow": null, + "overflow_x": null, + "overflow_y": null, + "padding": null, + "right": null, + "top": null, + "visibility": null, + "width": null + } + }, + "0f33403f4da64ee6bc338719f05aaa02": { + "model_module": "@jupyter-widgets/controls", + "model_module_version": "1.5.0", + "model_name": "DescriptionStyleModel", + "state": { + "_model_module": "@jupyter-widgets/controls", + "_model_module_version": "1.5.0", + "_model_name": "DescriptionStyleModel", + "_view_count": null, + "_view_module": "@jupyter-widgets/base", + "_view_module_version": "1.2.0", + "_view_name": "StyleView", + "description_width": "" + } + }, + "100f17a84bad4191bb7977949be2cdd4": { + "model_module": "@jupyter-widgets/controls", + "model_module_version": "1.5.0", + "model_name": "ProgressStyleModel", + "state": { + "_model_module": "@jupyter-widgets/controls", + "_model_module_version": "1.5.0", + "_model_name": "ProgressStyleModel", + "_view_count": null, + "_view_module": "@jupyter-widgets/base", + "_view_module_version": "1.2.0", + "_view_name": "StyleView", + "bar_color": null, + "description_width": "" + } + }, + "1134e014bcc242888828682645029a1e": { + "model_module": "@jupyter-widgets/controls", + "model_module_version": "1.5.0", + "model_name": "HTMLModel", + "state": { + "_dom_classes": [], + "_model_module": "@jupyter-widgets/controls", + "_model_module_version": "1.5.0", + "_model_name": "HTMLModel", + "_view_count": null, + "_view_module": "@jupyter-widgets/controls", + "_view_module_version": "1.5.0", + "_view_name": "HTMLView", + "description": "", + "description_tooltip": null, + "layout": "IPY_MODEL_df5fe4b465b646d2938cf8da656a8d33", + "placeholder": "​", + "style": "IPY_MODEL_56a410608c3a49ed90153c2989f6b8d9", + "value": " 52.0/52.0 [00:00<00:00, 3.66kB/s]" + } + }, + "115d71f18c1b4fc79ed6a27fc6401dbc": { + "model_module": "@jupyter-widgets/controls", + "model_module_version": "1.5.0", + "model_name": "ProgressStyleModel", + "state": { + "_model_module": "@jupyter-widgets/controls", + "_model_module_version": "1.5.0", + "_model_name": "ProgressStyleModel", + "_view_count": null, + "_view_module": "@jupyter-widgets/base", + "_view_module_version": "1.2.0", + "_view_name": "StyleView", + "bar_color": null, + "description_width": "" + } + }, + "11fb0295d6ad47ce85d9382bdb1723fd": { + "model_module": "@jupyter-widgets/controls", + "model_module_version": 
"1.5.0", + "model_name": "FloatProgressModel", + "state": { + "_dom_classes": [], + "_model_module": "@jupyter-widgets/controls", + "_model_module_version": "1.5.0", + "_model_name": "FloatProgressModel", + "_view_count": null, + "_view_module": "@jupyter-widgets/controls", + "_view_module_version": "1.5.0", + "_view_name": "ProgressView", + "bar_style": "success", + "description": "", + "description_tooltip": null, + "layout": "IPY_MODEL_d0518bbbde494ab1876c98f3d514832a", + "max": 437955512, + "min": 0, + "orientation": "horizontal", + "style": "IPY_MODEL_c2b7df874d334dcdbb433897c84b079d", + "value": 437955512 + } + }, + "12e047d39fa545ac9ecfd900e88fba68": { + "model_module": "@jupyter-widgets/controls", + "model_module_version": "1.5.0", + "model_name": "HTMLModel", + "state": { + "_dom_classes": [], + "_model_module": "@jupyter-widgets/controls", + "_model_module_version": "1.5.0", + "_model_name": "HTMLModel", + "_view_count": null, + "_view_module": "@jupyter-widgets/controls", + "_view_module_version": "1.5.0", + "_view_name": "HTMLView", + "description": "", + "description_tooltip": null, + "layout": "IPY_MODEL_5ff02061baf14cc69b6e99fa3375cc9b", + "placeholder": "​", + "style": "IPY_MODEL_6996f20968ca4c16af50df81f148355c", + "value": " 23.9k/23.9k [00:00<00:00, 1.92MB/s]" + } + }, + "14cd39e47c764792a1362259532fd09b": { + "model_module": "@jupyter-widgets/base", + "model_module_version": "1.2.0", + "model_name": "LayoutModel", + "state": { + "_model_module": "@jupyter-widgets/base", + "_model_module_version": "1.2.0", + "_model_name": "LayoutModel", + "_view_count": null, + "_view_module": "@jupyter-widgets/base", + "_view_module_version": "1.2.0", + "_view_name": "LayoutView", + "align_content": null, + "align_items": null, + "align_self": null, + "border": null, + "bottom": null, + "display": null, + "flex": null, + "flex_flow": null, + "grid_area": null, + "grid_auto_columns": null, + "grid_auto_flow": null, + "grid_auto_rows": null, + "grid_column": null, + "grid_gap": null, + "grid_row": null, + "grid_template_areas": null, + "grid_template_columns": null, + "grid_template_rows": null, + "height": null, + "justify_content": null, + "justify_items": null, + "left": null, + "margin": null, + "max_height": null, + "max_width": null, + "min_height": null, + "min_width": null, + "object_fit": null, + "object_position": null, + "order": null, + "overflow": null, + "overflow_x": null, + "overflow_y": null, + "padding": null, + "right": null, + "top": null, + "visibility": null, + "width": null + } + }, + "151a4e93b66d4da5974bdb5f38596754": { + "model_module": "@jupyter-widgets/base", + "model_module_version": "1.2.0", + "model_name": "LayoutModel", + "state": { + "_model_module": "@jupyter-widgets/base", + "_model_module_version": "1.2.0", + "_model_name": "LayoutModel", + "_view_count": null, + "_view_module": "@jupyter-widgets/base", + "_view_module_version": "1.2.0", + "_view_name": "LayoutView", + "align_content": null, + "align_items": null, + "align_self": null, + "border": null, + "bottom": null, + "display": null, + "flex": null, + "flex_flow": null, + "grid_area": null, + "grid_auto_columns": null, + "grid_auto_flow": null, + "grid_auto_rows": null, + "grid_column": null, + "grid_gap": null, + "grid_row": null, + "grid_template_areas": null, + "grid_template_columns": null, + "grid_template_rows": null, + "height": null, + "justify_content": null, + "justify_items": null, + "left": null, + "margin": null, + "max_height": null, + "max_width": null, + "min_height": null, + 
"min_width": null, + "object_fit": null, + "object_position": null, + "order": null, + "overflow": null, + "overflow_x": null, + "overflow_y": null, + "padding": null, + "right": null, + "top": null, + "visibility": null, + "width": null + } + }, + "1738720dc3e640aaaaf1d2eb609af8b6": { + "model_module": "@jupyter-widgets/controls", + "model_module_version": "1.5.0", + "model_name": "HTMLModel", + "state": { + "_dom_classes": [], + "_model_module": "@jupyter-widgets/controls", + "_model_module_version": "1.5.0", + "_model_name": "HTMLModel", + "_view_count": null, + "_view_module": "@jupyter-widgets/controls", + "_view_module_version": "1.5.0", + "_view_name": "HTMLView", + "description": "", + "description_tooltip": null, + "layout": "IPY_MODEL_de98dcfd70aa4144abb13896da001ccc", + "placeholder": "​", + "style": "IPY_MODEL_ec2ce8e9ce1a45e38af3196fd2e60546", + "value": " 50.9k/50.9k [00:00<00:00, 3.56MB/s]" + } + }, + "17fbfefcf6ab41ccbf3fbb570cfe3c0d": { + "model_module": "@jupyter-widgets/controls", + "model_module_version": "1.5.0", + "model_name": "HTMLModel", + "state": { + "_dom_classes": [], + "_model_module": "@jupyter-widgets/controls", + "_model_module_version": "1.5.0", + "_model_name": "HTMLModel", + "_view_count": null, + "_view_module": "@jupyter-widgets/controls", + "_view_module_version": "1.5.0", + "_view_name": "HTMLView", + "description": "", + "description_tooltip": null, + "layout": "IPY_MODEL_4902defd076c4820b3a3e61c876024bc", + "placeholder": "​", + "style": "IPY_MODEL_f8e7f835c1f24fd082f9b61d4785723f", + "value": "model-00003-of-00004.safetensors: 100%" + } + }, + "1895e085ea5942fcb429bc1f64efb7f4": { + "model_module": "@jupyter-widgets/controls", + "model_module_version": "1.5.0", + "model_name": "FloatProgressModel", + "state": { + "_dom_classes": [], + "_model_module": "@jupyter-widgets/controls", + "_model_module_version": "1.5.0", + "_model_name": "FloatProgressModel", + "_view_count": null, + "_view_module": "@jupyter-widgets/controls", + "_view_module_version": "1.5.0", + "_view_name": "ProgressView", + "bar_style": "success", + "description": "", + "description_tooltip": null, + "layout": "IPY_MODEL_8b0310dae18b4f7fa12048a9376f021d", + "max": 73, + "min": 0, + "orientation": "horizontal", + "style": "IPY_MODEL_0df29c938b5940edaf4d2804ae08ef2e", + "value": 73 + } + }, + "19c0c0c106f24040a18d74cb75d8057c": { + "model_module": "@jupyter-widgets/base", + "model_module_version": "1.2.0", + "model_name": "LayoutModel", + "state": { + "_model_module": "@jupyter-widgets/base", + "_model_module_version": "1.2.0", + "_model_name": "LayoutModel", + "_view_count": null, + "_view_module": "@jupyter-widgets/base", + "_view_module_version": "1.2.0", + "_view_name": "LayoutView", + "align_content": null, + "align_items": null, + "align_self": null, + "border": null, + "bottom": null, + "display": null, + "flex": null, + "flex_flow": null, + "grid_area": null, + "grid_auto_columns": null, + "grid_auto_flow": null, + "grid_auto_rows": null, + "grid_column": null, + "grid_gap": null, + "grid_row": null, + "grid_template_areas": null, + "grid_template_columns": null, + "grid_template_rows": null, + "height": null, + "justify_content": null, + "justify_items": null, + "left": null, + "margin": null, + "max_height": null, + "max_width": null, + "min_height": null, + "min_width": null, + "object_fit": null, + "object_position": null, + "order": null, + "overflow": null, + "overflow_x": null, + "overflow_y": null, + "padding": null, + "right": null, + "top": null, + "visibility": 
null, + "width": null + } + }, + "1b093409a5c340a6a74fd0f64e4e6154": { + "model_module": "@jupyter-widgets/base", + "model_module_version": "1.2.0", + "model_name": "LayoutModel", + "state": { + "_model_module": "@jupyter-widgets/base", + "_model_module_version": "1.2.0", + "_model_name": "LayoutModel", + "_view_count": null, + "_view_module": "@jupyter-widgets/base", + "_view_module_version": "1.2.0", + "_view_name": "LayoutView", + "align_content": null, + "align_items": null, + "align_self": null, + "border": null, + "bottom": null, + "display": null, + "flex": null, + "flex_flow": null, + "grid_area": null, + "grid_auto_columns": null, + "grid_auto_flow": null, + "grid_auto_rows": null, + "grid_column": null, + "grid_gap": null, + "grid_row": null, + "grid_template_areas": null, + "grid_template_columns": null, + "grid_template_rows": null, + "height": null, + "justify_content": null, + "justify_items": null, + "left": null, + "margin": null, + "max_height": null, + "max_width": null, + "min_height": null, + "min_width": null, + "object_fit": null, + "object_position": null, + "order": null, + "overflow": null, + "overflow_x": null, + "overflow_y": null, + "padding": null, + "right": null, + "top": null, + "visibility": null, + "width": null + } + }, + "1b0ce7f124384da48b7ed934a283a191": { + "model_module": "@jupyter-widgets/controls", + "model_module_version": "1.5.0", + "model_name": "ProgressStyleModel", + "state": { + "_model_module": "@jupyter-widgets/controls", + "_model_module_version": "1.5.0", + "_model_name": "ProgressStyleModel", + "_view_count": null, + "_view_module": "@jupyter-widgets/base", + "_view_module_version": "1.2.0", + "_view_name": "StyleView", + "bar_color": null, + "description_width": "" + } + }, + "1b7d1263951b4a3ea339b4d2c871f2c8": { + "model_module": "@jupyter-widgets/base", + "model_module_version": "1.2.0", + "model_name": "LayoutModel", + "state": { + "_model_module": "@jupyter-widgets/base", + "_model_module_version": "1.2.0", + "_model_name": "LayoutModel", + "_view_count": null, + "_view_module": "@jupyter-widgets/base", + "_view_module_version": "1.2.0", + "_view_name": "LayoutView", + "align_content": null, + "align_items": null, + "align_self": null, + "border": null, + "bottom": null, + "display": null, + "flex": null, + "flex_flow": null, + "grid_area": null, + "grid_auto_columns": null, + "grid_auto_flow": null, + "grid_auto_rows": null, + "grid_column": null, + "grid_gap": null, + "grid_row": null, + "grid_template_areas": null, + "grid_template_columns": null, + "grid_template_rows": null, + "height": null, + "justify_content": null, + "justify_items": null, + "left": null, + "margin": null, + "max_height": null, + "max_width": null, + "min_height": null, + "min_width": null, + "object_fit": null, + "object_position": null, + "order": null, + "overflow": null, + "overflow_x": null, + "overflow_y": null, + "padding": null, + "right": null, + "top": null, + "visibility": null, + "width": null + } + }, + "1c5fc66b15d644768d328fc67d6d8c94": { + "model_module": "@jupyter-widgets/controls", + "model_module_version": "1.5.0", + "model_name": "DescriptionStyleModel", + "state": { + "_model_module": "@jupyter-widgets/controls", + "_model_module_version": "1.5.0", + "_model_name": "DescriptionStyleModel", + "_view_count": null, + "_view_module": "@jupyter-widgets/base", + "_view_module_version": "1.2.0", + "_view_name": "StyleView", + "description_width": "" + } + }, + "1f4ac8691f234d50abf1460a029403c7": { + "model_module": "@jupyter-widgets/controls", 
+ "model_module_version": "1.5.0", + "model_name": "DescriptionStyleModel", + "state": { + "_model_module": "@jupyter-widgets/controls", + "_model_module_version": "1.5.0", + "_model_name": "DescriptionStyleModel", + "_view_count": null, + "_view_module": "@jupyter-widgets/base", + "_view_module_version": "1.2.0", + "_view_name": "StyleView", + "description_width": "" + } + }, + "21fb6035bff44eb9809e022af0bb7104": { + "model_module": "@jupyter-widgets/controls", + "model_module_version": "1.5.0", + "model_name": "HTMLModel", + "state": { + "_dom_classes": [], + "_model_module": "@jupyter-widgets/controls", + "_model_module_version": "1.5.0", + "_model_name": "HTMLModel", + "_view_count": null, + "_view_module": "@jupyter-widgets/controls", + "_view_module_version": "1.5.0", + "_view_name": "HTMLView", + "description": "", + "description_tooltip": null, + "layout": "IPY_MODEL_5c342c9d3b364194888115758819b7a0", + "placeholder": "​", + "style": "IPY_MODEL_22b8f6c8a73b42f183a9041e3fe61f14", + "value": "Downloading shards: 100%" + } + }, + "229e2ecd668a49d1b9d4b78b5822d0e6": { + "model_module": "@jupyter-widgets/controls", + "model_module_version": "1.5.0", + "model_name": "HTMLModel", + "state": { + "_dom_classes": [], + "_model_module": "@jupyter-widgets/controls", + "_model_module_version": "1.5.0", + "_model_name": "HTMLModel", + "_view_count": null, + "_view_module": "@jupyter-widgets/controls", + "_view_module_version": "1.5.0", + "_view_name": "HTMLView", + "description": "", + "description_tooltip": null, + "layout": "IPY_MODEL_d92c1c7eaec2439fa0d885d2ae833fd9", + "placeholder": "​", + "style": "IPY_MODEL_69cc7595e4764ad2806479b52d736faa", + "value": " 349/349 [00:00<00:00, 23.6kB/s]" + } + }, + "22b8f6c8a73b42f183a9041e3fe61f14": { + "model_module": "@jupyter-widgets/controls", + "model_module_version": "1.5.0", + "model_name": "DescriptionStyleModel", + "state": { + "_model_module": "@jupyter-widgets/controls", + "_model_module_version": "1.5.0", + "_model_name": "DescriptionStyleModel", + "_view_count": null, + "_view_module": "@jupyter-widgets/base", + "_view_module_version": "1.2.0", + "_view_name": "StyleView", + "description_width": "" + } + }, + "24243f2c5cfc4dc0b19f69eca49639a5": { + "model_module": "@jupyter-widgets/base", + "model_module_version": "1.2.0", + "model_name": "LayoutModel", + "state": { + "_model_module": "@jupyter-widgets/base", + "_model_module_version": "1.2.0", + "_model_name": "LayoutModel", + "_view_count": null, + "_view_module": "@jupyter-widgets/base", + "_view_module_version": "1.2.0", + "_view_name": "LayoutView", + "align_content": null, + "align_items": null, + "align_self": null, + "border": null, + "bottom": null, + "display": null, + "flex": null, + "flex_flow": null, + "grid_area": null, + "grid_auto_columns": null, + "grid_auto_flow": null, + "grid_auto_rows": null, + "grid_column": null, + "grid_gap": null, + "grid_row": null, + "grid_template_areas": null, + "grid_template_columns": null, + "grid_template_rows": null, + "height": null, + "justify_content": null, + "justify_items": null, + "left": null, + "margin": null, + "max_height": null, + "max_width": null, + "min_height": null, + "min_width": null, + "object_fit": null, + "object_position": null, + "order": null, + "overflow": null, + "overflow_x": null, + "overflow_y": null, + "padding": null, + "right": null, + "top": null, + "visibility": null, + "width": null + } + }, + "24f3609a29944a6bbc88c59c5e5eee14": { + "model_module": "@jupyter-widgets/controls", + "model_module_version": 
"1.5.0", + "model_name": "ProgressStyleModel", + "state": { + "_model_module": "@jupyter-widgets/controls", + "_model_module_version": "1.5.0", + "_model_name": "ProgressStyleModel", + "_view_count": null, + "_view_module": "@jupyter-widgets/base", + "_view_module_version": "1.2.0", + "_view_name": "StyleView", + "bar_color": null, + "description_width": "" + } + }, + "260fa61b341b407ea1732ef8ace93c5e": { + "model_module": "@jupyter-widgets/controls", + "model_module_version": "1.5.0", + "model_name": "HBoxModel", + "state": { + "_dom_classes": [], + "_model_module": "@jupyter-widgets/controls", + "_model_module_version": "1.5.0", + "_model_name": "HBoxModel", + "_view_count": null, + "_view_module": "@jupyter-widgets/controls", + "_view_module_version": "1.5.0", + "_view_name": "HBoxView", + "box_style": "", + "children": [ + "IPY_MODEL_644390bc94554bc6a2cf8f5add94a1d7", + "IPY_MODEL_f3f39db72b2b4ac99641eda95606701b", + "IPY_MODEL_8dac808a75e94c33bc766ac34befc1df" + ], + "layout": "IPY_MODEL_65f8d58f5f01479cb4a1241a23ea7978" + } + }, + "264da80b4e3e446199cf9d53b3f4ed58": { + "model_module": "@jupyter-widgets/controls", + "model_module_version": "1.5.0", + "model_name": "HTMLModel", + "state": { + "_dom_classes": [], + "_model_module": "@jupyter-widgets/controls", + "_model_module_version": "1.5.0", + "_model_name": "HTMLModel", + "_view_count": null, + "_view_module": "@jupyter-widgets/controls", + "_view_module_version": "1.5.0", + "_view_name": "HTMLView", + "description": "", + "description_tooltip": null, + "layout": "IPY_MODEL_58b683922de04bff9a26487612d10288", + "placeholder": "​", + "style": "IPY_MODEL_4464dca33eb24b5793457e70910b8aa4", + "value": "tokenizer.json: 100%" + } + }, + "265de5620d9c483f96b0074ad1ca132d": { + "model_module": "@jupyter-widgets/base", + "model_module_version": "1.2.0", + "model_name": "LayoutModel", + "state": { + "_model_module": "@jupyter-widgets/base", + "_model_module_version": "1.2.0", + "_model_name": "LayoutModel", + "_view_count": null, + "_view_module": "@jupyter-widgets/base", + "_view_module_version": "1.2.0", + "_view_name": "LayoutView", + "align_content": null, + "align_items": null, + "align_self": null, + "border": null, + "bottom": null, + "display": null, + "flex": null, + "flex_flow": null, + "grid_area": null, + "grid_auto_columns": null, + "grid_auto_flow": null, + "grid_auto_rows": null, + "grid_column": null, + "grid_gap": null, + "grid_row": null, + "grid_template_areas": null, + "grid_template_columns": null, + "grid_template_rows": null, + "height": null, + "justify_content": null, + "justify_items": null, + "left": null, + "margin": null, + "max_height": null, + "max_width": null, + "min_height": null, + "min_width": null, + "object_fit": null, + "object_position": null, + "order": null, + "overflow": null, + "overflow_x": null, + "overflow_y": null, + "padding": null, + "right": null, + "top": null, + "visibility": null, + "width": null + } + }, + "282a07aee3e04d7a9902cf4621c2db7a": { + "model_module": "@jupyter-widgets/controls", + "model_module_version": "1.5.0", + "model_name": "HBoxModel", + "state": { + "_dom_classes": [], + "_model_module": "@jupyter-widgets/controls", + "_model_module_version": "1.5.0", + "_model_name": "HBoxModel", + "_view_count": null, + "_view_module": "@jupyter-widgets/controls", + "_view_module_version": "1.5.0", + "_view_name": "HBoxView", + "box_style": "", + "children": [ + "IPY_MODEL_d9682345a06f4c668b69077b1f677399", + "IPY_MODEL_530174416a3e4cac9d9eac6633a5d886", + 
"IPY_MODEL_229e2ecd668a49d1b9d4b78b5822d0e6" + ], + "layout": "IPY_MODEL_7daed9b7dfc34b3a94e6d47805d26427" + } + }, + "28a912965c254793a1bf3624f7faa17b": { + "model_module": "@jupyter-widgets/controls", + "model_module_version": "1.5.0", + "model_name": "ProgressStyleModel", + "state": { + "_model_module": "@jupyter-widgets/controls", + "_model_module_version": "1.5.0", + "_model_name": "ProgressStyleModel", + "_view_count": null, + "_view_module": "@jupyter-widgets/base", + "_view_module_version": "1.2.0", + "_view_name": "StyleView", + "bar_color": null, + "description_width": "" + } + }, + "298c3d1f6a934aa6b15a3b665a066eb6": { + "model_module": "@jupyter-widgets/controls", + "model_module_version": "1.5.0", + "model_name": "HTMLModel", + "state": { + "_dom_classes": [], + "_model_module": "@jupyter-widgets/controls", + "_model_module_version": "1.5.0", + "_model_name": "HTMLModel", + "_view_count": null, + "_view_module": "@jupyter-widgets/controls", + "_view_module_version": "1.5.0", + "_view_name": "HTMLView", + "description": "", + "description_tooltip": null, + "layout": "IPY_MODEL_5eacda4f75df44c88e3cec16c147fd0e", + "placeholder": "​", + "style": "IPY_MODEL_aa8fb50e8b524ff58f0493671fb0ba37", + "value": " 654/654 [00:00<00:00, 45.2kB/s]" + } + }, + "299ce57151134fe38ae9ba3260e5383c": { + "model_module": "@jupyter-widgets/base", + "model_module_version": "1.2.0", + "model_name": "LayoutModel", + "state": { + "_model_module": "@jupyter-widgets/base", + "_model_module_version": "1.2.0", + "_model_name": "LayoutModel", + "_view_count": null, + "_view_module": "@jupyter-widgets/base", + "_view_module_version": "1.2.0", + "_view_name": "LayoutView", + "align_content": null, + "align_items": null, + "align_self": null, + "border": null, + "bottom": null, + "display": null, + "flex": null, + "flex_flow": null, + "grid_area": null, + "grid_auto_columns": null, + "grid_auto_flow": null, + "grid_auto_rows": null, + "grid_column": null, + "grid_gap": null, + "grid_row": null, + "grid_template_areas": null, + "grid_template_columns": null, + "grid_template_rows": null, + "height": null, + "justify_content": null, + "justify_items": null, + "left": null, + "margin": null, + "max_height": null, + "max_width": null, + "min_height": null, + "min_width": null, + "object_fit": null, + "object_position": null, + "order": null, + "overflow": null, + "overflow_x": null, + "overflow_y": null, + "padding": null, + "right": null, + "top": null, + "visibility": null, + "width": null + } + }, + "29c85a59881f4a57b71d2a707b7b7fb8": { + "model_module": "@jupyter-widgets/controls", + "model_module_version": "1.5.0", + "model_name": "FloatProgressModel", + "state": { + "_dom_classes": [], + "_model_module": "@jupyter-widgets/controls", + "_model_module_version": "1.5.0", + "_model_name": "FloatProgressModel", + "_view_count": null, + "_view_module": "@jupyter-widgets/controls", + "_view_module_version": "1.5.0", + "_view_name": "ProgressView", + "bar_style": "success", + "description": "", + "description_tooltip": null, + "layout": "IPY_MODEL_9aff0e55c7a24c618c998804a769669f", + "max": 4976698672, + "min": 0, + "orientation": "horizontal", + "style": "IPY_MODEL_bcd1f2cdb488417a888a9ddce142dcdd", + "value": 4976698672 + } + }, + "2a75284ca1a74cf6b2df311a802a8df7": { + "model_module": "@jupyter-widgets/controls", + "model_module_version": "1.5.0", + "model_name": "DescriptionStyleModel", + "state": { + "_model_module": "@jupyter-widgets/controls", + "_model_module_version": "1.5.0", + "_model_name": 
"DescriptionStyleModel", + "_view_count": null, + "_view_module": "@jupyter-widgets/base", + "_view_module_version": "1.2.0", + "_view_name": "StyleView", + "description_width": "" + } + }, + "2ba6d64387244836bc00e3d62a552cae": { + "model_module": "@jupyter-widgets/controls", + "model_module_version": "1.5.0", + "model_name": "DescriptionStyleModel", + "state": { + "_model_module": "@jupyter-widgets/controls", + "_model_module_version": "1.5.0", + "_model_name": "DescriptionStyleModel", + "_view_count": null, + "_view_module": "@jupyter-widgets/base", + "_view_module_version": "1.2.0", + "_view_name": "StyleView", + "description_width": "" + } + }, + "2d446aa187c04de79bdb5aae679799e7": { + "model_module": "@jupyter-widgets/base", + "model_module_version": "1.2.0", + "model_name": "LayoutModel", + "state": { + "_model_module": "@jupyter-widgets/base", + "_model_module_version": "1.2.0", + "_model_name": "LayoutModel", + "_view_count": null, + "_view_module": "@jupyter-widgets/base", + "_view_module_version": "1.2.0", + "_view_name": "LayoutView", + "align_content": null, + "align_items": null, + "align_self": null, + "border": null, + "bottom": null, + "display": null, + "flex": null, + "flex_flow": null, + "grid_area": null, + "grid_auto_columns": null, + "grid_auto_flow": null, + "grid_auto_rows": null, + "grid_column": null, + "grid_gap": null, + "grid_row": null, + "grid_template_areas": null, + "grid_template_columns": null, + "grid_template_rows": null, + "height": null, + "justify_content": null, + "justify_items": null, + "left": null, + "margin": null, + "max_height": null, + "max_width": null, + "min_height": null, + "min_width": null, + "object_fit": null, + "object_position": null, + "order": null, + "overflow": null, + "overflow_x": null, + "overflow_y": null, + "padding": null, + "right": null, + "top": null, + "visibility": null, + "width": null + } + }, + "2e16ed7fde6e409a8dac1fc3fed2b4c1": { + "model_module": "@jupyter-widgets/base", + "model_module_version": "1.2.0", + "model_name": "LayoutModel", + "state": { + "_model_module": "@jupyter-widgets/base", + "_model_module_version": "1.2.0", + "_model_name": "LayoutModel", + "_view_count": null, + "_view_module": "@jupyter-widgets/base", + "_view_module_version": "1.2.0", + "_view_name": "LayoutView", + "align_content": null, + "align_items": null, + "align_self": null, + "border": null, + "bottom": null, + "display": null, + "flex": null, + "flex_flow": null, + "grid_area": null, + "grid_auto_columns": null, + "grid_auto_flow": null, + "grid_auto_rows": null, + "grid_column": null, + "grid_gap": null, + "grid_row": null, + "grid_template_areas": null, + "grid_template_columns": null, + "grid_template_rows": null, + "height": null, + "justify_content": null, + "justify_items": null, + "left": null, + "margin": null, + "max_height": null, + "max_width": null, + "min_height": null, + "min_width": null, + "object_fit": null, + "object_position": null, + "order": null, + "overflow": null, + "overflow_x": null, + "overflow_y": null, + "padding": null, + "right": null, + "top": null, + "visibility": null, + "width": null + } + }, + "2e1ee91dac5c4299aecf906b33789fcf": { + "model_module": "@jupyter-widgets/controls", + "model_module_version": "1.5.0", + "model_name": "HBoxModel", + "state": { + "_dom_classes": [], + "_model_module": "@jupyter-widgets/controls", + "_model_module_version": "1.5.0", + "_model_name": "HBoxModel", + "_view_count": null, + "_view_module": "@jupyter-widgets/controls", + "_view_module_version": "1.5.0", + 
"_view_name": "HBoxView", + "box_style": "", + "children": [ + "IPY_MODEL_fdf91a039db749b38122ac0eb6eab108", + "IPY_MODEL_77b13df8a8504556b51744b1f02603ee", + "IPY_MODEL_298c3d1f6a934aa6b15a3b665a066eb6" + ], + "layout": "IPY_MODEL_3f1f7478f226408e964bea5adf304f21" + } + }, + "2e4a239969f443bf80829b3f21ffebdd": { + "model_module": "@jupyter-widgets/controls", + "model_module_version": "1.5.0", + "model_name": "DescriptionStyleModel", + "state": { + "_model_module": "@jupyter-widgets/controls", + "_model_module_version": "1.5.0", + "_model_name": "DescriptionStyleModel", + "_view_count": null, + "_view_module": "@jupyter-widgets/base", + "_view_module_version": "1.2.0", + "_view_name": "StyleView", + "description_width": "" + } + }, + "2fd3a6fe6727423aa44840b17ebbcb34": { + "model_module": "@jupyter-widgets/controls", + "model_module_version": "1.5.0", + "model_name": "HTMLModel", + "state": { + "_dom_classes": [], + "_model_module": "@jupyter-widgets/controls", + "_model_module_version": "1.5.0", + "_model_name": "HTMLModel", + "_view_count": null, + "_view_module": "@jupyter-widgets/controls", + "_view_module_version": "1.5.0", + "_view_name": "HTMLView", + "description": "", + "description_tooltip": null, + "layout": "IPY_MODEL_ff1415afeac740c69bee0facc692a8e1", + "placeholder": "​", + "style": "IPY_MODEL_86104b3ee4564b8e9cef14eec70f1391", + "value": "README.md: 100%" + } + }, + "34a7c9e315c745bb8075547e8414a5e6": { + "model_module": "@jupyter-widgets/controls", + "model_module_version": "1.5.0", + "model_name": "HTMLModel", + "state": { + "_dom_classes": [], + "_model_module": "@jupyter-widgets/controls", + "_model_module_version": "1.5.0", + "_model_name": "HTMLModel", + "_view_count": null, + "_view_module": "@jupyter-widgets/controls", + "_view_module_version": "1.5.0", + "_view_name": "HTMLView", + "description": "", + "description_tooltip": null, + "layout": "IPY_MODEL_b46c929b3e904e39ae2187a8b08c66a9", + "placeholder": "​", + "style": "IPY_MODEL_5645173b5b724ad4be483a81839fcf08", + "value": " 125/125 [00:00<00:00, 6.78kB/s]" + } + }, + "350258a4b1e848e0b28880b868412862": { + "model_module": "@jupyter-widgets/controls", + "model_module_version": "1.5.0", + "model_name": "HTMLModel", + "state": { + "_dom_classes": [], + "_model_module": "@jupyter-widgets/controls", + "_model_module_version": "1.5.0", + "_model_name": "HTMLModel", + "_view_count": null, + "_view_module": "@jupyter-widgets/controls", + "_view_module_version": "1.5.0", + "_view_name": "HTMLView", + "description": "", + "description_tooltip": null, + "layout": "IPY_MODEL_b0512a445d8748e0a538cd037dd80480", + "placeholder": "​", + "style": "IPY_MODEL_eebc474029844a1b88a022ad694d1f17", + "value": "vocab.txt: 100%" + } + }, + "380e392c659a418fba5d3564cbf62e94": { + "model_module": "@jupyter-widgets/base", + "model_module_version": "1.2.0", + "model_name": "LayoutModel", + "state": { + "_model_module": "@jupyter-widgets/base", + "_model_module_version": "1.2.0", + "_model_name": "LayoutModel", + "_view_count": null, + "_view_module": "@jupyter-widgets/base", + "_view_module_version": "1.2.0", + "_view_name": "LayoutView", + "align_content": null, + "align_items": null, + "align_self": null, + "border": null, + "bottom": null, + "display": null, + "flex": null, + "flex_flow": null, + "grid_area": null, + "grid_auto_columns": null, + "grid_auto_flow": null, + "grid_auto_rows": null, + "grid_column": null, + "grid_gap": null, + "grid_row": null, + "grid_template_areas": null, + "grid_template_columns": null, + 
"grid_template_rows": null, + "height": null, + "justify_content": null, + "justify_items": null, + "left": null, + "margin": null, + "max_height": null, + "max_width": null, + "min_height": null, + "min_width": null, + "object_fit": null, + "object_position": null, + "order": null, + "overflow": null, + "overflow_x": null, + "overflow_y": null, + "padding": null, + "right": null, + "top": null, + "visibility": null, + "width": null + } + }, + "38ac4502272241f291ce81ec6802b55f": { + "model_module": "@jupyter-widgets/controls", + "model_module_version": "1.5.0", + "model_name": "HTMLModel", + "state": { + "_dom_classes": [], + "_model_module": "@jupyter-widgets/controls", + "_model_module_version": "1.5.0", + "_model_name": "HTMLModel", + "_view_count": null, + "_view_module": "@jupyter-widgets/controls", + "_view_module_version": "1.5.0", + "_view_name": "HTMLView", + "description": "", + "description_tooltip": null, + "layout": "IPY_MODEL_77af9e788b0446de8cd84f1828877588", + "placeholder": "​", + "style": "IPY_MODEL_7efaf19ab2624dbf9adde1f8d03f3911", + "value": "model.safetensors: 100%" + } + }, + "3b20a4851944472198eb2e4f24ee3d96": { + "model_module": "@jupyter-widgets/controls", + "model_module_version": "1.5.0", + "model_name": "HTMLModel", + "state": { + "_dom_classes": [], + "_model_module": "@jupyter-widgets/controls", + "_model_module_version": "1.5.0", + "_model_name": "HTMLModel", + "_view_count": null, + "_view_module": "@jupyter-widgets/controls", + "_view_module_version": "1.5.0", + "_view_name": "HTMLView", + "description": "", + "description_tooltip": null, + "layout": "IPY_MODEL_e177d2ee77e24c17a723986c62ae5367", + "placeholder": "​", + "style": "IPY_MODEL_ce1dcb5db8c546cb84761c141924cd98", + "value": " 366/366 [00:00<00:00, 17.8kB/s]" + } + }, + "3cad7442bcfa45fd9aed4002324a9d7f": { + "model_module": "@jupyter-widgets/controls", + "model_module_version": "1.5.0", + "model_name": "ProgressStyleModel", + "state": { + "_model_module": "@jupyter-widgets/controls", + "_model_module_version": "1.5.0", + "_model_name": "ProgressStyleModel", + "_view_count": null, + "_view_module": "@jupyter-widgets/base", + "_view_module_version": "1.2.0", + "_view_name": "StyleView", + "bar_color": null, + "description_width": "" + } + }, + "3d4bd9fc29094069989a9c53ceb129b9": { + "model_module": "@jupyter-widgets/controls", + "model_module_version": "1.5.0", + "model_name": "HBoxModel", + "state": { + "_dom_classes": [], + "_model_module": "@jupyter-widgets/controls", + "_model_module_version": "1.5.0", + "_model_name": "HBoxModel", + "_view_count": null, + "_view_module": "@jupyter-widgets/controls", + "_view_module_version": "1.5.0", + "_view_name": "HBoxView", + "box_style": "", + "children": [ + "IPY_MODEL_566d2392323746d6ad976937b2a9da80", + "IPY_MODEL_29c85a59881f4a57b71d2a707b7b7fb8", + "IPY_MODEL_486f67e06ce64b019f17053f0a6d410c" + ], + "layout": "IPY_MODEL_57fd623252fe46d7a89b5f404a61da1a" + } + }, + "3f1f7478f226408e964bea5adf304f21": { + "model_module": "@jupyter-widgets/base", + "model_module_version": "1.2.0", + "model_name": "LayoutModel", + "state": { + "_model_module": "@jupyter-widgets/base", + "_model_module_version": "1.2.0", + "_model_name": "LayoutModel", + "_view_count": null, + "_view_module": "@jupyter-widgets/base", + "_view_module_version": "1.2.0", + "_view_name": "LayoutView", + "align_content": null, + "align_items": null, + "align_self": null, + "border": null, + "bottom": null, + "display": null, + "flex": null, + "flex_flow": null, + "grid_area": null, + 
"grid_auto_columns": null, + "grid_auto_flow": null, + "grid_auto_rows": null, + "grid_column": null, + "grid_gap": null, + "grid_row": null, + "grid_template_areas": null, + "grid_template_columns": null, + "grid_template_rows": null, + "height": null, + "justify_content": null, + "justify_items": null, + "left": null, + "margin": null, + "max_height": null, + "max_width": null, + "min_height": null, + "min_width": null, + "object_fit": null, + "object_position": null, + "order": null, + "overflow": null, + "overflow_x": null, + "overflow_y": null, + "padding": null, + "right": null, + "top": null, + "visibility": null, + "width": null + } + }, + "40a2a615a56e445fa72383446136d980": { + "model_module": "@jupyter-widgets/base", + "model_module_version": "1.2.0", + "model_name": "LayoutModel", + "state": { + "_model_module": "@jupyter-widgets/base", + "_model_module_version": "1.2.0", + "_model_name": "LayoutModel", + "_view_count": null, + "_view_module": "@jupyter-widgets/base", + "_view_module_version": "1.2.0", + "_view_name": "LayoutView", + "align_content": null, + "align_items": null, + "align_self": null, + "border": null, + "bottom": null, + "display": null, + "flex": null, + "flex_flow": null, + "grid_area": null, + "grid_auto_columns": null, + "grid_auto_flow": null, + "grid_auto_rows": null, + "grid_column": null, + "grid_gap": null, + "grid_row": null, + "grid_template_areas": null, + "grid_template_columns": null, + "grid_template_rows": null, + "height": null, + "justify_content": null, + "justify_items": null, + "left": null, + "margin": null, + "max_height": null, + "max_width": null, + "min_height": null, + "min_width": null, + "object_fit": null, + "object_position": null, + "order": null, + "overflow": null, + "overflow_x": null, + "overflow_y": null, + "padding": null, + "right": null, + "top": null, + "visibility": null, + "width": null + } + }, + "421b9ec5cfc34e17b193366bd619f916": { + "model_module": "@jupyter-widgets/controls", + "model_module_version": "1.5.0", + "model_name": "HTMLModel", + "state": { + "_dom_classes": [], + "_model_module": "@jupyter-widgets/controls", + "_model_module_version": "1.5.0", + "_model_name": "HTMLModel", + "_view_count": null, + "_view_module": "@jupyter-widgets/controls", + "_view_module_version": "1.5.0", + "_view_name": "HTMLView", + "description": "", + "description_tooltip": null, + "layout": "IPY_MODEL_5cacb4eb6ece4fb284adb0eca4c474f3", + "placeholder": "​", + "style": "IPY_MODEL_431786378bd74bdfb9dca67b212a01df", + "value": "model-00004-of-00004.safetensors: 100%" + } + }, + "423cefd4340c482cb70337168049f1c7": { + "model_module": "@jupyter-widgets/base", + "model_module_version": "1.2.0", + "model_name": "LayoutModel", + "state": { + "_model_module": "@jupyter-widgets/base", + "_model_module_version": "1.2.0", + "_model_name": "LayoutModel", + "_view_count": null, + "_view_module": "@jupyter-widgets/base", + "_view_module_version": "1.2.0", + "_view_name": "LayoutView", + "align_content": null, + "align_items": null, + "align_self": null, + "border": null, + "bottom": null, + "display": null, + "flex": null, + "flex_flow": null, + "grid_area": null, + "grid_auto_columns": null, + "grid_auto_flow": null, + "grid_auto_rows": null, + "grid_column": null, + "grid_gap": null, + "grid_row": null, + "grid_template_areas": null, + "grid_template_columns": null, + "grid_template_rows": null, + "height": null, + "justify_content": null, + "justify_items": null, + "left": null, + "margin": null, + "max_height": null, + "max_width": null, 
+ "min_height": null, + "min_width": null, + "object_fit": null, + "object_position": null, + "order": null, + "overflow": null, + "overflow_x": null, + "overflow_y": null, + "padding": null, + "right": null, + "top": null, + "visibility": null, + "width": null + } + }, + "431786378bd74bdfb9dca67b212a01df": { + "model_module": "@jupyter-widgets/controls", + "model_module_version": "1.5.0", + "model_name": "DescriptionStyleModel", + "state": { + "_model_module": "@jupyter-widgets/controls", + "_model_module_version": "1.5.0", + "_model_name": "DescriptionStyleModel", + "_view_count": null, + "_view_module": "@jupyter-widgets/base", + "_view_module_version": "1.2.0", + "_view_name": "StyleView", + "description_width": "" + } + }, + "4464dca33eb24b5793457e70910b8aa4": { + "model_module": "@jupyter-widgets/controls", + "model_module_version": "1.5.0", + "model_name": "DescriptionStyleModel", + "state": { + "_model_module": "@jupyter-widgets/controls", + "_model_module_version": "1.5.0", + "_model_name": "DescriptionStyleModel", + "_view_count": null, + "_view_module": "@jupyter-widgets/base", + "_view_module_version": "1.2.0", + "_view_name": "StyleView", + "description_width": "" + } + }, + "45282b1857ab4ebcb84a7cb5cd209062": { + "model_module": "@jupyter-widgets/controls", + "model_module_version": "1.5.0", + "model_name": "FloatProgressModel", + "state": { + "_dom_classes": [], + "_model_module": "@jupyter-widgets/controls", + "_model_module_version": "1.5.0", + "_model_name": "FloatProgressModel", + "_view_count": null, + "_view_module": "@jupyter-widgets/controls", + "_view_module_version": "1.5.0", + "_view_name": "ProgressView", + "bar_style": "success", + "description": "", + "description_tooltip": null, + "layout": "IPY_MODEL_e8d3f4424c4b42e2ba4c94429830b977", + "max": 125, + "min": 0, + "orientation": "horizontal", + "style": "IPY_MODEL_28a912965c254793a1bf3624f7faa17b", + "value": 125 + } + }, + "4780eb726783487892cddce619057efe": { + "model_module": "@jupyter-widgets/controls", + "model_module_version": "1.5.0", + "model_name": "ProgressStyleModel", + "state": { + "_model_module": "@jupyter-widgets/controls", + "_model_module_version": "1.5.0", + "_model_name": "ProgressStyleModel", + "_view_count": null, + "_view_module": "@jupyter-widgets/base", + "_view_module_version": "1.2.0", + "_view_name": "StyleView", + "bar_color": null, + "description_width": "" + } + }, + "486f67e06ce64b019f17053f0a6d410c": { + "model_module": "@jupyter-widgets/controls", + "model_module_version": "1.5.0", + "model_name": "HTMLModel", + "state": { + "_dom_classes": [], + "_model_module": "@jupyter-widgets/controls", + "_model_module_version": "1.5.0", + "_model_name": "HTMLModel", + "_view_count": null, + "_view_module": "@jupyter-widgets/controls", + "_view_module_version": "1.5.0", + "_view_name": "HTMLView", + "description": "", + "description_tooltip": null, + "layout": "IPY_MODEL_516bb336ed96406499d23a2b86ed6a9c", + "placeholder": "​", + "style": "IPY_MODEL_bb561f17393947ef833a3208abf83b00", + "value": " 4.98G/4.98G [00:43<00:00, 174MB/s]" + } + }, + "4902defd076c4820b3a3e61c876024bc": { + "model_module": "@jupyter-widgets/base", + "model_module_version": "1.2.0", + "model_name": "LayoutModel", + "state": { + "_model_module": "@jupyter-widgets/base", + "_model_module_version": "1.2.0", + "_model_name": "LayoutModel", + "_view_count": null, + "_view_module": "@jupyter-widgets/base", + "_view_module_version": "1.2.0", + "_view_name": "LayoutView", + "align_content": null, + "align_items": null, + 
"align_self": null, + "border": null, + "bottom": null, + "display": null, + "flex": null, + "flex_flow": null, + "grid_area": null, + "grid_auto_columns": null, + "grid_auto_flow": null, + "grid_auto_rows": null, + "grid_column": null, + "grid_gap": null, + "grid_row": null, + "grid_template_areas": null, + "grid_template_columns": null, + "grid_template_rows": null, + "height": null, + "justify_content": null, + "justify_items": null, + "left": null, + "margin": null, + "max_height": null, + "max_width": null, + "min_height": null, + "min_width": null, + "object_fit": null, + "object_position": null, + "order": null, + "overflow": null, + "overflow_x": null, + "overflow_y": null, + "padding": null, + "right": null, + "top": null, + "visibility": null, + "width": null + } + }, + "4bbeba17775c423f8dd570e25245982b": { + "model_module": "@jupyter-widgets/controls", + "model_module_version": "1.5.0", + "model_name": "HBoxModel", + "state": { + "_dom_classes": [], + "_model_module": "@jupyter-widgets/controls", + "_model_module_version": "1.5.0", + "_model_name": "HBoxModel", + "_view_count": null, + "_view_module": "@jupyter-widgets/controls", + "_view_module_version": "1.5.0", + "_view_name": "HBoxView", + "box_style": "", + "children": [ + "IPY_MODEL_ce86048691224ac495c0dbb6817a4721", + "IPY_MODEL_d4dea83270d148eb9fb6a19150f4b580", + "IPY_MODEL_a91b8f174a1a49e1a1f114316ea23133" + ], + "layout": "IPY_MODEL_b57e9af6a5f9433496526beffb7a269f" + } + }, + "4db0da1b27344e45b7f235911185a4a0": { + "model_module": "@jupyter-widgets/base", + "model_module_version": "1.2.0", + "model_name": "LayoutModel", + "state": { + "_model_module": "@jupyter-widgets/base", + "_model_module_version": "1.2.0", + "_model_name": "LayoutModel", + "_view_count": null, + "_view_module": "@jupyter-widgets/base", + "_view_module_version": "1.2.0", + "_view_name": "LayoutView", + "align_content": null, + "align_items": null, + "align_self": null, + "border": null, + "bottom": null, + "display": null, + "flex": null, + "flex_flow": null, + "grid_area": null, + "grid_auto_columns": null, + "grid_auto_flow": null, + "grid_auto_rows": null, + "grid_column": null, + "grid_gap": null, + "grid_row": null, + "grid_template_areas": null, + "grid_template_columns": null, + "grid_template_rows": null, + "height": null, + "justify_content": null, + "justify_items": null, + "left": null, + "margin": null, + "max_height": null, + "max_width": null, + "min_height": null, + "min_width": null, + "object_fit": null, + "object_position": null, + "order": null, + "overflow": null, + "overflow_x": null, + "overflow_y": null, + "padding": null, + "right": null, + "top": null, + "visibility": null, + "width": null + } + }, + "4e3dd4a0fa4e42be9515f89bbdada0b1": { + "model_module": "@jupyter-widgets/controls", + "model_module_version": "1.5.0", + "model_name": "HTMLModel", + "state": { + "_dom_classes": [], + "_model_module": "@jupyter-widgets/controls", + "_model_module_version": "1.5.0", + "_model_name": "HTMLModel", + "_view_count": null, + "_view_module": "@jupyter-widgets/controls", + "_view_module_version": "1.5.0", + "_view_name": "HTMLView", + "description": "", + "description_tooltip": null, + "layout": "IPY_MODEL_c2e105c1ac584ba9b12bd4b22191d3a8", + "placeholder": "​", + "style": "IPY_MODEL_9d6f989fbb584b2eb25172fcc662b2e7", + "value": " 232k/232k [00:00<00:00, 11.1MB/s]" + } + }, + "4f33241a7378442a8711b00402742526": { + "model_module": "@jupyter-widgets/base", + "model_module_version": "1.2.0", + "model_name": "LayoutModel", + "state": 
{ + "_model_module": "@jupyter-widgets/base", + "_model_module_version": "1.2.0", + "_model_name": "LayoutModel", + "_view_count": null, + "_view_module": "@jupyter-widgets/base", + "_view_module_version": "1.2.0", + "_view_name": "LayoutView", + "align_content": null, + "align_items": null, + "align_self": null, + "border": null, + "bottom": null, + "display": null, + "flex": null, + "flex_flow": null, + "grid_area": null, + "grid_auto_columns": null, + "grid_auto_flow": null, + "grid_auto_rows": null, + "grid_column": null, + "grid_gap": null, + "grid_row": null, + "grid_template_areas": null, + "grid_template_columns": null, + "grid_template_rows": null, + "height": null, + "justify_content": null, + "justify_items": null, + "left": null, + "margin": null, + "max_height": null, + "max_width": null, + "min_height": null, + "min_width": null, + "object_fit": null, + "object_position": null, + "order": null, + "overflow": null, + "overflow_x": null, + "overflow_y": null, + "padding": null, + "right": null, + "top": null, + "visibility": null, + "width": null + } + }, + "50550ce1b30649b9b7a382f488025af1": { + "model_module": "@jupyter-widgets/controls", + "model_module_version": "1.5.0", + "model_name": "HBoxModel", + "state": { + "_dom_classes": [], + "_model_module": "@jupyter-widgets/controls", + "_model_module_version": "1.5.0", + "_model_name": "HBoxModel", + "_view_count": null, + "_view_module": "@jupyter-widgets/controls", + "_view_module_version": "1.5.0", + "_view_name": "HBoxView", + "box_style": "", + "children": [ + "IPY_MODEL_ebea966f35ea40d7ad4a4f6fe34b4bda", + "IPY_MODEL_f732f41c4de44ea6a0db1de49695f85e", + "IPY_MODEL_fd3ea606c386466d9c6968f8728533b5" + ], + "layout": "IPY_MODEL_8fccf616a7694c529c0c966d417cad51" + } + }, + "5071d412953e41458aea25cf2864ac23": { + "model_module": "@jupyter-widgets/base", + "model_module_version": "1.2.0", + "model_name": "LayoutModel", + "state": { + "_model_module": "@jupyter-widgets/base", + "_model_module_version": "1.2.0", + "_model_name": "LayoutModel", + "_view_count": null, + "_view_module": "@jupyter-widgets/base", + "_view_module_version": "1.2.0", + "_view_name": "LayoutView", + "align_content": null, + "align_items": null, + "align_self": null, + "border": null, + "bottom": null, + "display": null, + "flex": null, + "flex_flow": null, + "grid_area": null, + "grid_auto_columns": null, + "grid_auto_flow": null, + "grid_auto_rows": null, + "grid_column": null, + "grid_gap": null, + "grid_row": null, + "grid_template_areas": null, + "grid_template_columns": null, + "grid_template_rows": null, + "height": null, + "justify_content": null, + "justify_items": null, + "left": null, + "margin": null, + "max_height": null, + "max_width": null, + "min_height": null, + "min_width": null, + "object_fit": null, + "object_position": null, + "order": null, + "overflow": null, + "overflow_x": null, + "overflow_y": null, + "padding": null, + "right": null, + "top": null, + "visibility": null, + "width": null + } + }, + "516bb336ed96406499d23a2b86ed6a9c": { + "model_module": "@jupyter-widgets/base", + "model_module_version": "1.2.0", + "model_name": "LayoutModel", + "state": { + "_model_module": "@jupyter-widgets/base", + "_model_module_version": "1.2.0", + "_model_name": "LayoutModel", + "_view_count": null, + "_view_module": "@jupyter-widgets/base", + "_view_module_version": "1.2.0", + "_view_name": "LayoutView", + "align_content": null, + "align_items": null, + "align_self": null, + "border": null, + "bottom": null, + "display": null, + "flex": 
null, + "flex_flow": null, + "grid_area": null, + "grid_auto_columns": null, + "grid_auto_flow": null, + "grid_auto_rows": null, + "grid_column": null, + "grid_gap": null, + "grid_row": null, + "grid_template_areas": null, + "grid_template_columns": null, + "grid_template_rows": null, + "height": null, + "justify_content": null, + "justify_items": null, + "left": null, + "margin": null, + "max_height": null, + "max_width": null, + "min_height": null, + "min_width": null, + "object_fit": null, + "object_position": null, + "order": null, + "overflow": null, + "overflow_x": null, + "overflow_y": null, + "padding": null, + "right": null, + "top": null, + "visibility": null, + "width": null + } + }, + "519c0f5256974835b547b28e94ff2224": { + "model_module": "@jupyter-widgets/controls", + "model_module_version": "1.5.0", + "model_name": "HBoxModel", + "state": { + "_dom_classes": [], + "_model_module": "@jupyter-widgets/controls", + "_model_module_version": "1.5.0", + "_model_name": "HBoxModel", + "_view_count": null, + "_view_module": "@jupyter-widgets/controls", + "_view_module_version": "1.5.0", + "_view_name": "HBoxView", + "box_style": "", + "children": [ + "IPY_MODEL_f4b86788d3ce49b7bce7d273d8bf2cad", + "IPY_MODEL_098e801e0c794453b31d69805208dcd9", + "IPY_MODEL_3b20a4851944472198eb2e4f24ee3d96" + ], + "layout": "IPY_MODEL_58ff4d02a7b0449da85e8f48562d2b87" + } + }, + "530174416a3e4cac9d9eac6633a5d886": { + "model_module": "@jupyter-widgets/controls", + "model_module_version": "1.5.0", + "model_name": "FloatProgressModel", + "state": { + "_dom_classes": [], + "_model_module": "@jupyter-widgets/controls", + "_model_module_version": "1.5.0", + "_model_name": "FloatProgressModel", + "_view_count": null, + "_view_module": "@jupyter-widgets/controls", + "_view_module_version": "1.5.0", + "_view_name": "ProgressView", + "bar_style": "success", + "description": "", + "description_tooltip": null, + "layout": "IPY_MODEL_265de5620d9c483f96b0074ad1ca132d", + "max": 349, + "min": 0, + "orientation": "horizontal", + "style": "IPY_MODEL_f7cea318fe684c7b95575e6d33dc207d", + "value": 349 + } + }, + "5520f2680fd14de2a9277bddec4e3b2e": { + "model_module": "@jupyter-widgets/controls", + "model_module_version": "1.5.0", + "model_name": "DescriptionStyleModel", + "state": { + "_model_module": "@jupyter-widgets/controls", + "_model_module_version": "1.5.0", + "_model_name": "DescriptionStyleModel", + "_view_count": null, + "_view_module": "@jupyter-widgets/base", + "_view_module_version": "1.2.0", + "_view_name": "StyleView", + "description_width": "" + } + }, + "5645173b5b724ad4be483a81839fcf08": { + "model_module": "@jupyter-widgets/controls", + "model_module_version": "1.5.0", + "model_name": "DescriptionStyleModel", + "state": { + "_model_module": "@jupyter-widgets/controls", + "_model_module_version": "1.5.0", + "_model_name": "DescriptionStyleModel", + "_view_count": null, + "_view_module": "@jupyter-widgets/base", + "_view_module_version": "1.2.0", + "_view_name": "StyleView", + "description_width": "" + } + }, + "566d2392323746d6ad976937b2a9da80": { + "model_module": "@jupyter-widgets/controls", + "model_module_version": "1.5.0", + "model_name": "HTMLModel", + "state": { + "_dom_classes": [], + "_model_module": "@jupyter-widgets/controls", + "_model_module_version": "1.5.0", + "_model_name": "HTMLModel", + "_view_count": null, + "_view_module": "@jupyter-widgets/controls", + "_view_module_version": "1.5.0", + "_view_name": "HTMLView", + "description": "", + "description_tooltip": null, + "layout": 
"IPY_MODEL_a3b3b8175680410483eb988ed1c6bffa", + "placeholder": "​", + "style": "IPY_MODEL_80e83a9fd2c2402c8eed6a659938acf6", + "value": "model-00001-of-00004.safetensors: 100%" + } + }, + "56a410608c3a49ed90153c2989f6b8d9": { + "model_module": "@jupyter-widgets/controls", + "model_module_version": "1.5.0", + "model_name": "DescriptionStyleModel", + "state": { + "_model_module": "@jupyter-widgets/controls", + "_model_module_version": "1.5.0", + "_model_name": "DescriptionStyleModel", + "_view_count": null, + "_view_module": "@jupyter-widgets/base", + "_view_module_version": "1.2.0", + "_view_name": "StyleView", + "description_width": "" + } + }, + "57fd623252fe46d7a89b5f404a61da1a": { + "model_module": "@jupyter-widgets/base", + "model_module_version": "1.2.0", + "model_name": "LayoutModel", + "state": { + "_model_module": "@jupyter-widgets/base", + "_model_module_version": "1.2.0", + "_model_name": "LayoutModel", + "_view_count": null, + "_view_module": "@jupyter-widgets/base", + "_view_module_version": "1.2.0", + "_view_name": "LayoutView", + "align_content": null, + "align_items": null, + "align_self": null, + "border": null, + "bottom": null, + "display": null, + "flex": null, + "flex_flow": null, + "grid_area": null, + "grid_auto_columns": null, + "grid_auto_flow": null, + "grid_auto_rows": null, + "grid_column": null, + "grid_gap": null, + "grid_row": null, + "grid_template_areas": null, + "grid_template_columns": null, + "grid_template_rows": null, + "height": null, + "justify_content": null, + "justify_items": null, + "left": null, + "margin": null, + "max_height": null, + "max_width": null, + "min_height": null, + "min_width": null, + "object_fit": null, + "object_position": null, + "order": null, + "overflow": null, + "overflow_x": null, + "overflow_y": null, + "padding": null, + "right": null, + "top": null, + "visibility": null, + "width": null + } + }, + "58b683922de04bff9a26487612d10288": { + "model_module": "@jupyter-widgets/base", + "model_module_version": "1.2.0", + "model_name": "LayoutModel", + "state": { + "_model_module": "@jupyter-widgets/base", + "_model_module_version": "1.2.0", + "_model_name": "LayoutModel", + "_view_count": null, + "_view_module": "@jupyter-widgets/base", + "_view_module_version": "1.2.0", + "_view_name": "LayoutView", + "align_content": null, + "align_items": null, + "align_self": null, + "border": null, + "bottom": null, + "display": null, + "flex": null, + "flex_flow": null, + "grid_area": null, + "grid_auto_columns": null, + "grid_auto_flow": null, + "grid_auto_rows": null, + "grid_column": null, + "grid_gap": null, + "grid_row": null, + "grid_template_areas": null, + "grid_template_columns": null, + "grid_template_rows": null, + "height": null, + "justify_content": null, + "justify_items": null, + "left": null, + "margin": null, + "max_height": null, + "max_width": null, + "min_height": null, + "min_width": null, + "object_fit": null, + "object_position": null, + "order": null, + "overflow": null, + "overflow_x": null, + "overflow_y": null, + "padding": null, + "right": null, + "top": null, + "visibility": null, + "width": null + } + }, + "58c008b0cbf04cf2872afa353e6b9f76": { + "model_module": "@jupyter-widgets/controls", + "model_module_version": "1.5.0", + "model_name": "DescriptionStyleModel", + "state": { + "_model_module": "@jupyter-widgets/controls", + "_model_module_version": "1.5.0", + "_model_name": "DescriptionStyleModel", + "_view_count": null, + "_view_module": "@jupyter-widgets/base", + "_view_module_version": "1.2.0", + 
"_view_name": "StyleView", + "description_width": "" + } + }, + "58e024f57fcf4039b360b0bdbbe23520": { + "model_module": "@jupyter-widgets/base", + "model_module_version": "1.2.0", + "model_name": "LayoutModel", + "state": { + "_model_module": "@jupyter-widgets/base", + "_model_module_version": "1.2.0", + "_model_name": "LayoutModel", + "_view_count": null, + "_view_module": "@jupyter-widgets/base", + "_view_module_version": "1.2.0", + "_view_name": "LayoutView", + "align_content": null, + "align_items": null, + "align_self": null, + "border": null, + "bottom": null, + "display": null, + "flex": null, + "flex_flow": null, + "grid_area": null, + "grid_auto_columns": null, + "grid_auto_flow": null, + "grid_auto_rows": null, + "grid_column": null, + "grid_gap": null, + "grid_row": null, + "grid_template_areas": null, + "grid_template_columns": null, + "grid_template_rows": null, + "height": null, + "justify_content": null, + "justify_items": null, + "left": null, + "margin": null, + "max_height": null, + "max_width": null, + "min_height": null, + "min_width": null, + "object_fit": null, + "object_position": null, + "order": null, + "overflow": null, + "overflow_x": null, + "overflow_y": null, + "padding": null, + "right": null, + "top": null, + "visibility": null, + "width": null + } + }, + "58ff4d02a7b0449da85e8f48562d2b87": { + "model_module": "@jupyter-widgets/base", + "model_module_version": "1.2.0", + "model_name": "LayoutModel", + "state": { + "_model_module": "@jupyter-widgets/base", + "_model_module_version": "1.2.0", + "_model_name": "LayoutModel", + "_view_count": null, + "_view_module": "@jupyter-widgets/base", + "_view_module_version": "1.2.0", + "_view_name": "LayoutView", + "align_content": null, + "align_items": null, + "align_self": null, + "border": null, + "bottom": null, + "display": null, + "flex": null, + "flex_flow": null, + "grid_area": null, + "grid_auto_columns": null, + "grid_auto_flow": null, + "grid_auto_rows": null, + "grid_column": null, + "grid_gap": null, + "grid_row": null, + "grid_template_areas": null, + "grid_template_columns": null, + "grid_template_rows": null, + "height": null, + "justify_content": null, + "justify_items": null, + "left": null, + "margin": null, + "max_height": null, + "max_width": null, + "min_height": null, + "min_width": null, + "object_fit": null, + "object_position": null, + "order": null, + "overflow": null, + "overflow_x": null, + "overflow_y": null, + "padding": null, + "right": null, + "top": null, + "visibility": null, + "width": null + } + }, + "594e9fa90fac4d40b1524d57304b37ee": { + "model_module": "@jupyter-widgets/base", + "model_module_version": "1.2.0", + "model_name": "LayoutModel", + "state": { + "_model_module": "@jupyter-widgets/base", + "_model_module_version": "1.2.0", + "_model_name": "LayoutModel", + "_view_count": null, + "_view_module": "@jupyter-widgets/base", + "_view_module_version": "1.2.0", + "_view_name": "LayoutView", + "align_content": null, + "align_items": null, + "align_self": null, + "border": null, + "bottom": null, + "display": null, + "flex": null, + "flex_flow": null, + "grid_area": null, + "grid_auto_columns": null, + "grid_auto_flow": null, + "grid_auto_rows": null, + "grid_column": null, + "grid_gap": null, + "grid_row": null, + "grid_template_areas": null, + "grid_template_columns": null, + "grid_template_rows": null, + "height": null, + "justify_content": null, + "justify_items": null, + "left": null, + "margin": null, + "max_height": null, + "max_width": null, + "min_height": null, + 
"min_width": null, + "object_fit": null, + "object_position": null, + "order": null, + "overflow": null, + "overflow_x": null, + "overflow_y": null, + "padding": null, + "right": null, + "top": null, + "visibility": null, + "width": null + } + }, + "595ca1e634504c98adf61fc08c98a826": { + "model_module": "@jupyter-widgets/controls", + "model_module_version": "1.5.0", + "model_name": "ProgressStyleModel", + "state": { + "_model_module": "@jupyter-widgets/controls", + "_model_module_version": "1.5.0", + "_model_name": "ProgressStyleModel", + "_view_count": null, + "_view_module": "@jupyter-widgets/base", + "_view_module_version": "1.2.0", + "_view_name": "StyleView", + "bar_color": null, + "description_width": "" + } + }, + "5bc4d0af62f047928f9de1183100acaa": { + "model_module": "@jupyter-widgets/controls", + "model_module_version": "1.5.0", + "model_name": "HBoxModel", + "state": { + "_dom_classes": [], + "_model_module": "@jupyter-widgets/controls", + "_model_module_version": "1.5.0", + "_model_name": "HBoxModel", + "_view_count": null, + "_view_module": "@jupyter-widgets/controls", + "_view_module_version": "1.5.0", + "_view_name": "HBoxView", + "box_style": "", + "children": [ + "IPY_MODEL_17fbfefcf6ab41ccbf3fbb570cfe3c0d", + "IPY_MODEL_ab7055cac4e940229ee11ff98caafdd0", + "IPY_MODEL_fac438b999bb40459c245b0e307bd132" + ], + "layout": "IPY_MODEL_07d99c2f9a2e4a038072a085a745a080" + } + }, + "5c342c9d3b364194888115758819b7a0": { + "model_module": "@jupyter-widgets/base", + "model_module_version": "1.2.0", + "model_name": "LayoutModel", + "state": { + "_model_module": "@jupyter-widgets/base", + "_model_module_version": "1.2.0", + "_model_name": "LayoutModel", + "_view_count": null, + "_view_module": "@jupyter-widgets/base", + "_view_module_version": "1.2.0", + "_view_name": "LayoutView", + "align_content": null, + "align_items": null, + "align_self": null, + "border": null, + "bottom": null, + "display": null, + "flex": null, + "flex_flow": null, + "grid_area": null, + "grid_auto_columns": null, + "grid_auto_flow": null, + "grid_auto_rows": null, + "grid_column": null, + "grid_gap": null, + "grid_row": null, + "grid_template_areas": null, + "grid_template_columns": null, + "grid_template_rows": null, + "height": null, + "justify_content": null, + "justify_items": null, + "left": null, + "margin": null, + "max_height": null, + "max_width": null, + "min_height": null, + "min_width": null, + "object_fit": null, + "object_position": null, + "order": null, + "overflow": null, + "overflow_x": null, + "overflow_y": null, + "padding": null, + "right": null, + "top": null, + "visibility": null, + "width": null + } + }, + "5cacb4eb6ece4fb284adb0eca4c474f3": { + "model_module": "@jupyter-widgets/base", + "model_module_version": "1.2.0", + "model_name": "LayoutModel", + "state": { + "_model_module": "@jupyter-widgets/base", + "_model_module_version": "1.2.0", + "_model_name": "LayoutModel", + "_view_count": null, + "_view_module": "@jupyter-widgets/base", + "_view_module_version": "1.2.0", + "_view_name": "LayoutView", + "align_content": null, + "align_items": null, + "align_self": null, + "border": null, + "bottom": null, + "display": null, + "flex": null, + "flex_flow": null, + "grid_area": null, + "grid_auto_columns": null, + "grid_auto_flow": null, + "grid_auto_rows": null, + "grid_column": null, + "grid_gap": null, + "grid_row": null, + "grid_template_areas": null, + "grid_template_columns": null, + "grid_template_rows": null, + "height": null, + "justify_content": null, + "justify_items": null, + 
"left": null, + "margin": null, + "max_height": null, + "max_width": null, + "min_height": null, + "min_width": null, + "object_fit": null, + "object_position": null, + "order": null, + "overflow": null, + "overflow_x": null, + "overflow_y": null, + "padding": null, + "right": null, + "top": null, + "visibility": null, + "width": null + } + }, + "5eacda4f75df44c88e3cec16c147fd0e": { + "model_module": "@jupyter-widgets/base", + "model_module_version": "1.2.0", + "model_name": "LayoutModel", + "state": { + "_model_module": "@jupyter-widgets/base", + "_model_module_version": "1.2.0", + "_model_name": "LayoutModel", + "_view_count": null, + "_view_module": "@jupyter-widgets/base", + "_view_module_version": "1.2.0", + "_view_name": "LayoutView", + "align_content": null, + "align_items": null, + "align_self": null, + "border": null, + "bottom": null, + "display": null, + "flex": null, + "flex_flow": null, + "grid_area": null, + "grid_auto_columns": null, + "grid_auto_flow": null, + "grid_auto_rows": null, + "grid_column": null, + "grid_gap": null, + "grid_row": null, + "grid_template_areas": null, + "grid_template_columns": null, + "grid_template_rows": null, + "height": null, + "justify_content": null, + "justify_items": null, + "left": null, + "margin": null, + "max_height": null, + "max_width": null, + "min_height": null, + "min_width": null, + "object_fit": null, + "object_position": null, + "order": null, + "overflow": null, + "overflow_x": null, + "overflow_y": null, + "padding": null, + "right": null, + "top": null, + "visibility": null, + "width": null + } + }, + "5f9e84ae3afe47d49b257941777ca5b0": { + "model_module": "@jupyter-widgets/controls", + "model_module_version": "1.5.0", + "model_name": "FloatProgressModel", + "state": { + "_dom_classes": [], + "_model_module": "@jupyter-widgets/controls", + "_model_module_version": "1.5.0", + "_model_name": "FloatProgressModel", + "_view_count": null, + "_view_module": "@jupyter-widgets/controls", + "_view_module_version": "1.5.0", + "_view_name": "ProgressView", + "bar_style": "success", + "description": "", + "description_tooltip": null, + "layout": "IPY_MODEL_5071d412953e41458aea25cf2864ac23", + "max": 124, + "min": 0, + "orientation": "horizontal", + "style": "IPY_MODEL_100f17a84bad4191bb7977949be2cdd4", + "value": 124 + } + }, + "5ff02061baf14cc69b6e99fa3375cc9b": { + "model_module": "@jupyter-widgets/base", + "model_module_version": "1.2.0", + "model_name": "LayoutModel", + "state": { + "_model_module": "@jupyter-widgets/base", + "_model_module_version": "1.2.0", + "_model_name": "LayoutModel", + "_view_count": null, + "_view_module": "@jupyter-widgets/base", + "_view_module_version": "1.2.0", + "_view_name": "LayoutView", + "align_content": null, + "align_items": null, + "align_self": null, + "border": null, + "bottom": null, + "display": null, + "flex": null, + "flex_flow": null, + "grid_area": null, + "grid_auto_columns": null, + "grid_auto_flow": null, + "grid_auto_rows": null, + "grid_column": null, + "grid_gap": null, + "grid_row": null, + "grid_template_areas": null, + "grid_template_columns": null, + "grid_template_rows": null, + "height": null, + "justify_content": null, + "justify_items": null, + "left": null, + "margin": null, + "max_height": null, + "max_width": null, + "min_height": null, + "min_width": null, + "object_fit": null, + "object_position": null, + "order": null, + "overflow": null, + "overflow_x": null, + "overflow_y": null, + "padding": null, + "right": null, + "top": null, + "visibility": null, + "width": null 
+ } + }, + "61d9fb21d35948fead864e381993a677": { + "model_module": "@jupyter-widgets/base", + "model_module_version": "1.2.0", + "model_name": "LayoutModel", + "state": { + "_model_module": "@jupyter-widgets/base", + "_model_module_version": "1.2.0", + "_model_name": "LayoutModel", + "_view_count": null, + "_view_module": "@jupyter-widgets/base", + "_view_module_version": "1.2.0", + "_view_name": "LayoutView", + "align_content": null, + "align_items": null, + "align_self": null, + "border": null, + "bottom": null, + "display": null, + "flex": null, + "flex_flow": null, + "grid_area": null, + "grid_auto_columns": null, + "grid_auto_flow": null, + "grid_auto_rows": null, + "grid_column": null, + "grid_gap": null, + "grid_row": null, + "grid_template_areas": null, + "grid_template_columns": null, + "grid_template_rows": null, + "height": null, + "justify_content": null, + "justify_items": null, + "left": null, + "margin": null, + "max_height": null, + "max_width": null, + "min_height": null, + "min_width": null, + "object_fit": null, + "object_position": null, + "order": null, + "overflow": null, + "overflow_x": null, + "overflow_y": null, + "padding": null, + "right": null, + "top": null, + "visibility": null, + "width": null + } + }, + "63fcde7be7d94fbdb2e9bf9ca9084ad7": { + "model_module": "@jupyter-widgets/base", + "model_module_version": "1.2.0", + "model_name": "LayoutModel", + "state": { + "_model_module": "@jupyter-widgets/base", + "_model_module_version": "1.2.0", + "_model_name": "LayoutModel", + "_view_count": null, + "_view_module": "@jupyter-widgets/base", + "_view_module_version": "1.2.0", + "_view_name": "LayoutView", + "align_content": null, + "align_items": null, + "align_self": null, + "border": null, + "bottom": null, + "display": null, + "flex": null, + "flex_flow": null, + "grid_area": null, + "grid_auto_columns": null, + "grid_auto_flow": null, + "grid_auto_rows": null, + "grid_column": null, + "grid_gap": null, + "grid_row": null, + "grid_template_areas": null, + "grid_template_columns": null, + "grid_template_rows": null, + "height": null, + "justify_content": null, + "justify_items": null, + "left": null, + "margin": null, + "max_height": null, + "max_width": null, + "min_height": null, + "min_width": null, + "object_fit": null, + "object_position": null, + "order": null, + "overflow": null, + "overflow_x": null, + "overflow_y": null, + "padding": null, + "right": null, + "top": null, + "visibility": null, + "width": null + } + }, + "644390bc94554bc6a2cf8f5add94a1d7": { + "model_module": "@jupyter-widgets/controls", + "model_module_version": "1.5.0", + "model_name": "HTMLModel", + "state": { + "_dom_classes": [], + "_model_module": "@jupyter-widgets/controls", + "_model_module_version": "1.5.0", + "_model_name": "HTMLModel", + "_view_count": null, + "_view_module": "@jupyter-widgets/controls", + "_view_module_version": "1.5.0", + "_view_name": "HTMLView", + "description": "", + "description_tooltip": null, + "layout": "IPY_MODEL_2e16ed7fde6e409a8dac1fc3fed2b4c1", + "placeholder": "​", + "style": "IPY_MODEL_813881219e0c420e80c9a4b27a023d54", + "value": "tokenizer.json: 100%" + } + }, + "65383a4be13a441ba688676051fe6d29": { + "model_module": "@jupyter-widgets/controls", + "model_module_version": "1.5.0", + "model_name": "HBoxModel", + "state": { + "_dom_classes": [], + "_model_module": "@jupyter-widgets/controls", + "_model_module_version": "1.5.0", + "_model_name": "HBoxModel", + "_view_count": null, + "_view_module": "@jupyter-widgets/controls", + 
"_view_module_version": "1.5.0", + "_view_name": "HBoxView", + "box_style": "", + "children": [ + "IPY_MODEL_d76b9f06c08b4634a51199db44c0a11c", + "IPY_MODEL_8a429e04f8c04b4cbccc2a0fbfc69f39", + "IPY_MODEL_ac5e4ffea47a4a148219e7e998b5c325" + ], + "layout": "IPY_MODEL_d1fef74b9af44ac0b3ff021292982577" + } + }, + "65f8d58f5f01479cb4a1241a23ea7978": { + "model_module": "@jupyter-widgets/base", + "model_module_version": "1.2.0", + "model_name": "LayoutModel", + "state": { + "_model_module": "@jupyter-widgets/base", + "_model_module_version": "1.2.0", + "_model_name": "LayoutModel", + "_view_count": null, + "_view_module": "@jupyter-widgets/base", + "_view_module_version": "1.2.0", + "_view_name": "LayoutView", + "align_content": null, + "align_items": null, + "align_self": null, + "border": null, + "bottom": null, + "display": null, + "flex": null, + "flex_flow": null, + "grid_area": null, + "grid_auto_columns": null, + "grid_auto_flow": null, + "grid_auto_rows": null, + "grid_column": null, + "grid_gap": null, + "grid_row": null, + "grid_template_areas": null, + "grid_template_columns": null, + "grid_template_rows": null, + "height": null, + "justify_content": null, + "justify_items": null, + "left": null, + "margin": null, + "max_height": null, + "max_width": null, + "min_height": null, + "min_width": null, + "object_fit": null, + "object_position": null, + "order": null, + "overflow": null, + "overflow_x": null, + "overflow_y": null, + "padding": null, + "right": null, + "top": null, + "visibility": null, + "width": null + } + }, + "67506c1cff4e4f02ab15b05379debe89": { + "model_module": "@jupyter-widgets/controls", + "model_module_version": "1.5.0", + "model_name": "HTMLModel", + "state": { + "_dom_classes": [], + "_model_module": "@jupyter-widgets/controls", + "_model_module_version": "1.5.0", + "_model_name": "HTMLModel", + "_view_count": null, + "_view_module": "@jupyter-widgets/controls", + "_view_module_version": "1.5.0", + "_view_name": "HTMLView", + "description": "", + "description_tooltip": null, + "layout": "IPY_MODEL_875064682fde47e89d090f9dea7f2a9e", + "placeholder": "​", + "style": "IPY_MODEL_1c5fc66b15d644768d328fc67d6d8c94", + "value": "model.safetensors.index.json: 100%" + } + }, + "697705f63b4e4234bec3dff82f9a668f": { + "model_module": "@jupyter-widgets/controls", + "model_module_version": "1.5.0", + "model_name": "FloatProgressModel", + "state": { + "_dom_classes": [], + "_model_module": "@jupyter-widgets/controls", + "_model_module_version": "1.5.0", + "_model_name": "FloatProgressModel", + "_view_count": null, + "_view_module": "@jupyter-widgets/controls", + "_view_module_version": "1.5.0", + "_view_name": "ProgressView", + "bar_style": "success", + "description": "", + "description_tooltip": null, + "layout": "IPY_MODEL_e4a90f0ea3bc4c8b958ed923c104b3d6", + "max": 23950, + "min": 0, + "orientation": "horizontal", + "style": "IPY_MODEL_afeca83368d44e4fa29d882a8f9cc0a5", + "value": 23950 + } + }, + "6996f20968ca4c16af50df81f148355c": { + "model_module": "@jupyter-widgets/controls", + "model_module_version": "1.5.0", + "model_name": "DescriptionStyleModel", + "state": { + "_model_module": "@jupyter-widgets/controls", + "_model_module_version": "1.5.0", + "_model_name": "DescriptionStyleModel", + "_view_count": null, + "_view_module": "@jupyter-widgets/base", + "_view_module_version": "1.2.0", + "_view_name": "StyleView", + "description_width": "" + } + }, + "69cc7595e4764ad2806479b52d736faa": { + "model_module": "@jupyter-widgets/controls", + "model_module_version": "1.5.0", 
+ "model_name": "DescriptionStyleModel", + "state": { + "_model_module": "@jupyter-widgets/controls", + "_model_module_version": "1.5.0", + "_model_name": "DescriptionStyleModel", + "_view_count": null, + "_view_module": "@jupyter-widgets/base", + "_view_module_version": "1.2.0", + "_view_name": "StyleView", + "description_width": "" + } + }, + "69d64f8babda4c52aea682b26d1a6e00": { + "model_module": "@jupyter-widgets/controls", + "model_module_version": "1.5.0", + "model_name": "HTMLModel", + "state": { + "_dom_classes": [], + "_model_module": "@jupyter-widgets/controls", + "_model_module_version": "1.5.0", + "_model_name": "HTMLModel", + "_view_count": null, + "_view_module": "@jupyter-widgets/controls", + "_view_module_version": "1.5.0", + "_view_name": "HTMLView", + "description": "", + "description_tooltip": null, + "layout": "IPY_MODEL_151a4e93b66d4da5974bdb5f38596754", + "placeholder": "​", + "style": "IPY_MODEL_088e8878cd114c59b18e8bb9ed048457", + "value": " 124/124 [00:00<00:00, 8.76kB/s]" + } + }, + "69e9503329df4e7f97476d2b4c1d00b7": { + "model_module": "@jupyter-widgets/controls", + "model_module_version": "1.5.0", + "model_name": "ProgressStyleModel", + "state": { + "_model_module": "@jupyter-widgets/controls", + "_model_module_version": "1.5.0", + "_model_name": "ProgressStyleModel", + "_view_count": null, + "_view_module": "@jupyter-widgets/base", + "_view_module_version": "1.2.0", + "_view_name": "StyleView", + "bar_color": null, + "description_width": "" + } + }, + "6b8ce03feffe4747b0af404e90e0975e": { + "model_module": "@jupyter-widgets/base", + "model_module_version": "1.2.0", + "model_name": "LayoutModel", + "state": { + "_model_module": "@jupyter-widgets/base", + "_model_module_version": "1.2.0", + "_model_name": "LayoutModel", + "_view_count": null, + "_view_module": "@jupyter-widgets/base", + "_view_module_version": "1.2.0", + "_view_name": "LayoutView", + "align_content": null, + "align_items": null, + "align_self": null, + "border": null, + "bottom": null, + "display": null, + "flex": null, + "flex_flow": null, + "grid_area": null, + "grid_auto_columns": null, + "grid_auto_flow": null, + "grid_auto_rows": null, + "grid_column": null, + "grid_gap": null, + "grid_row": null, + "grid_template_areas": null, + "grid_template_columns": null, + "grid_template_rows": null, + "height": null, + "justify_content": null, + "justify_items": null, + "left": null, + "margin": null, + "max_height": null, + "max_width": null, + "min_height": null, + "min_width": null, + "object_fit": null, + "object_position": null, + "order": null, + "overflow": null, + "overflow_x": null, + "overflow_y": null, + "padding": null, + "right": null, + "top": null, + "visibility": null, + "width": null + } + }, + "6dbddb37a5c84395b8e4a8db76e16c50": { + "model_module": "@jupyter-widgets/base", + "model_module_version": "1.2.0", + "model_name": "LayoutModel", + "state": { + "_model_module": "@jupyter-widgets/base", + "_model_module_version": "1.2.0", + "_model_name": "LayoutModel", + "_view_count": null, + "_view_module": "@jupyter-widgets/base", + "_view_module_version": "1.2.0", + "_view_name": "LayoutView", + "align_content": null, + "align_items": null, + "align_self": null, + "border": null, + "bottom": null, + "display": null, + "flex": null, + "flex_flow": null, + "grid_area": null, + "grid_auto_columns": null, + "grid_auto_flow": null, + "grid_auto_rows": null, + "grid_column": null, + "grid_gap": null, + "grid_row": null, + "grid_template_areas": null, + "grid_template_columns": null, + 
"grid_template_rows": null, + "height": null, + "justify_content": null, + "justify_items": null, + "left": null, + "margin": null, + "max_height": null, + "max_width": null, + "min_height": null, + "min_width": null, + "object_fit": null, + "object_position": null, + "order": null, + "overflow": null, + "overflow_x": null, + "overflow_y": null, + "padding": null, + "right": null, + "top": null, + "visibility": null, + "width": null + } + }, + "6fac241a17194caf908895b8f4cdb377": { + "model_module": "@jupyter-widgets/base", + "model_module_version": "1.2.0", + "model_name": "LayoutModel", + "state": { + "_model_module": "@jupyter-widgets/base", + "_model_module_version": "1.2.0", + "_model_name": "LayoutModel", + "_view_count": null, + "_view_module": "@jupyter-widgets/base", + "_view_module_version": "1.2.0", + "_view_name": "LayoutView", + "align_content": null, + "align_items": null, + "align_self": null, + "border": null, + "bottom": null, + "display": null, + "flex": null, + "flex_flow": null, + "grid_area": null, + "grid_auto_columns": null, + "grid_auto_flow": null, + "grid_auto_rows": null, + "grid_column": null, + "grid_gap": null, + "grid_row": null, + "grid_template_areas": null, + "grid_template_columns": null, + "grid_template_rows": null, + "height": null, + "justify_content": null, + "justify_items": null, + "left": null, + "margin": null, + "max_height": null, + "max_width": null, + "min_height": null, + "min_width": null, + "object_fit": null, + "object_position": null, + "order": null, + "overflow": null, + "overflow_x": null, + "overflow_y": null, + "padding": null, + "right": null, + "top": null, + "visibility": null, + "width": null + } + }, + "702057124ce3481dab7da4bedce9c7e4": { + "model_module": "@jupyter-widgets/controls", + "model_module_version": "1.5.0", + "model_name": "HTMLModel", + "state": { + "_dom_classes": [], + "_model_module": "@jupyter-widgets/controls", + "_model_module_version": "1.5.0", + "_model_name": "HTMLModel", + "_view_count": null, + "_view_module": "@jupyter-widgets/controls", + "_view_module_version": "1.5.0", + "_view_name": "HTMLView", + "description": "", + "description_tooltip": null, + "layout": "IPY_MODEL_1b7d1263951b4a3ea339b4d2c871f2c8", + "placeholder": "​", + "style": "IPY_MODEL_e0c7d450ac65475b869768d8bb5fb7bb", + "value": " 711k/711k [00:00<00:00, 26.5MB/s]" + } + }, + "715c90e0865d4e01943a841fdf3eb0dd": { + "model_module": "@jupyter-widgets/controls", + "model_module_version": "1.5.0", + "model_name": "HBoxModel", + "state": { + "_dom_classes": [], + "_model_module": "@jupyter-widgets/controls", + "_model_module_version": "1.5.0", + "_model_name": "HBoxModel", + "_view_count": null, + "_view_module": "@jupyter-widgets/controls", + "_view_module_version": "1.5.0", + "_view_name": "HBoxView", + "box_style": "", + "children": [ + "IPY_MODEL_38ac4502272241f291ce81ec6802b55f", + "IPY_MODEL_11fb0295d6ad47ce85d9382bdb1723fd", + "IPY_MODEL_b62c504dd67f4d08b5e19d4ca86a8379" + ], + "layout": "IPY_MODEL_f94070d542904bb4ba45a550203c5ac2" + } + }, + "72709f5c251b47dea3cb6b04fbedf6e8": { + "model_module": "@jupyter-widgets/controls", + "model_module_version": "1.5.0", + "model_name": "HTMLModel", + "state": { + "_dom_classes": [], + "_model_module": "@jupyter-widgets/controls", + "_model_module_version": "1.5.0", + "_model_name": "HTMLModel", + "_view_count": null, + "_view_module": "@jupyter-widgets/controls", + "_view_module_version": "1.5.0", + "_view_name": "HTMLView", + "description": "", + "description_tooltip": null, + "layout": 
"IPY_MODEL_e0e358f24f9a4d959ac7b4f7c26d69eb", + "placeholder": "​", + "style": "IPY_MODEL_e790081734bb4461b525d46b6895a9b1", + "value": " 4/4 [01:24<00:00, 17.95s/it]" + } + }, + "73768d82de1e4b979a7947f600fe9d09": { + "model_module": "@jupyter-widgets/base", + "model_module_version": "1.2.0", + "model_name": "LayoutModel", + "state": { + "_model_module": "@jupyter-widgets/base", + "_model_module_version": "1.2.0", + "_model_name": "LayoutModel", + "_view_count": null, + "_view_module": "@jupyter-widgets/base", + "_view_module_version": "1.2.0", + "_view_name": "LayoutView", + "align_content": null, + "align_items": null, + "align_self": null, + "border": null, + "bottom": null, + "display": null, + "flex": null, + "flex_flow": null, + "grid_area": null, + "grid_auto_columns": null, + "grid_auto_flow": null, + "grid_auto_rows": null, + "grid_column": null, + "grid_gap": null, + "grid_row": null, + "grid_template_areas": null, + "grid_template_columns": null, + "grid_template_rows": null, + "height": null, + "justify_content": null, + "justify_items": null, + "left": null, + "margin": null, + "max_height": null, + "max_width": null, + "min_height": null, + "min_width": null, + "object_fit": null, + "object_position": null, + "order": null, + "overflow": null, + "overflow_x": null, + "overflow_y": null, + "padding": null, + "right": null, + "top": null, + "visibility": null, + "width": null + } + }, + "74c3c91eb1964ac4a6ae58b0d95b874c": { + "model_module": "@jupyter-widgets/base", + "model_module_version": "1.2.0", + "model_name": "LayoutModel", + "state": { + "_model_module": "@jupyter-widgets/base", + "_model_module_version": "1.2.0", + "_model_name": "LayoutModel", + "_view_count": null, + "_view_module": "@jupyter-widgets/base", + "_view_module_version": "1.2.0", + "_view_name": "LayoutView", + "align_content": null, + "align_items": null, + "align_self": null, + "border": null, + "bottom": null, + "display": null, + "flex": null, + "flex_flow": null, + "grid_area": null, + "grid_auto_columns": null, + "grid_auto_flow": null, + "grid_auto_rows": null, + "grid_column": null, + "grid_gap": null, + "grid_row": null, + "grid_template_areas": null, + "grid_template_columns": null, + "grid_template_rows": null, + "height": null, + "justify_content": null, + "justify_items": null, + "left": null, + "margin": null, + "max_height": null, + "max_width": null, + "min_height": null, + "min_width": null, + "object_fit": null, + "object_position": null, + "order": null, + "overflow": null, + "overflow_x": null, + "overflow_y": null, + "padding": null, + "right": null, + "top": null, + "visibility": null, + "width": null + } + }, + "77af9e788b0446de8cd84f1828877588": { + "model_module": "@jupyter-widgets/base", + "model_module_version": "1.2.0", + "model_name": "LayoutModel", + "state": { + "_model_module": "@jupyter-widgets/base", + "_model_module_version": "1.2.0", + "_model_name": "LayoutModel", + "_view_count": null, + "_view_module": "@jupyter-widgets/base", + "_view_module_version": "1.2.0", + "_view_name": "LayoutView", + "align_content": null, + "align_items": null, + "align_self": null, + "border": null, + "bottom": null, + "display": null, + "flex": null, + "flex_flow": null, + "grid_area": null, + "grid_auto_columns": null, + "grid_auto_flow": null, + "grid_auto_rows": null, + "grid_column": null, + "grid_gap": null, + "grid_row": null, + "grid_template_areas": null, + "grid_template_columns": null, + "grid_template_rows": null, + "height": null, + "justify_content": null, + "justify_items": 
null, + "left": null, + "margin": null, + "max_height": null, + "max_width": null, + "min_height": null, + "min_width": null, + "object_fit": null, + "object_position": null, + "order": null, + "overflow": null, + "overflow_x": null, + "overflow_y": null, + "padding": null, + "right": null, + "top": null, + "visibility": null, + "width": null + } + }, + "77b13df8a8504556b51744b1f02603ee": { + "model_module": "@jupyter-widgets/controls", + "model_module_version": "1.5.0", + "model_name": "FloatProgressModel", + "state": { + "_dom_classes": [], + "_model_module": "@jupyter-widgets/controls", + "_model_module_version": "1.5.0", + "_model_name": "FloatProgressModel", + "_view_count": null, + "_view_module": "@jupyter-widgets/controls", + "_view_module_version": "1.5.0", + "_view_name": "ProgressView", + "bar_style": "success", + "description": "", + "description_tooltip": null, + "layout": "IPY_MODEL_d5cff24e326f4b31b666c7aaf70e92f4", + "max": 654, + "min": 0, + "orientation": "horizontal", + "style": "IPY_MODEL_4780eb726783487892cddce619057efe", + "value": 654 + } + }, + "7870c1f4e16e48dea5feb7498b9e837f": { + "model_module": "@jupyter-widgets/controls", + "model_module_version": "1.5.0", + "model_name": "DescriptionStyleModel", + "state": { + "_model_module": "@jupyter-widgets/controls", + "_model_module_version": "1.5.0", + "_model_name": "DescriptionStyleModel", + "_view_count": null, + "_view_module": "@jupyter-widgets/base", + "_view_module_version": "1.2.0", + "_view_name": "StyleView", + "description_width": "" + } + }, + "7d84283a574f470a9f130aa40fad0083": { + "model_module": "@jupyter-widgets/base", + "model_module_version": "1.2.0", + "model_name": "LayoutModel", + "state": { + "_model_module": "@jupyter-widgets/base", + "_model_module_version": "1.2.0", + "_model_name": "LayoutModel", + "_view_count": null, + "_view_module": "@jupyter-widgets/base", + "_view_module_version": "1.2.0", + "_view_name": "LayoutView", + "align_content": null, + "align_items": null, + "align_self": null, + "border": null, + "bottom": null, + "display": null, + "flex": null, + "flex_flow": null, + "grid_area": null, + "grid_auto_columns": null, + "grid_auto_flow": null, + "grid_auto_rows": null, + "grid_column": null, + "grid_gap": null, + "grid_row": null, + "grid_template_areas": null, + "grid_template_columns": null, + "grid_template_rows": null, + "height": null, + "justify_content": null, + "justify_items": null, + "left": null, + "margin": null, + "max_height": null, + "max_width": null, + "min_height": null, + "min_width": null, + "object_fit": null, + "object_position": null, + "order": null, + "overflow": null, + "overflow_x": null, + "overflow_y": null, + "padding": null, + "right": null, + "top": null, + "visibility": null, + "width": null + } + }, + "7daed9b7dfc34b3a94e6d47805d26427": { + "model_module": "@jupyter-widgets/base", + "model_module_version": "1.2.0", + "model_name": "LayoutModel", + "state": { + "_model_module": "@jupyter-widgets/base", + "_model_module_version": "1.2.0", + "_model_name": "LayoutModel", + "_view_count": null, + "_view_module": "@jupyter-widgets/base", + "_view_module_version": "1.2.0", + "_view_name": "LayoutView", + "align_content": null, + "align_items": null, + "align_self": null, + "border": null, + "bottom": null, + "display": null, + "flex": null, + "flex_flow": null, + "grid_area": null, + "grid_auto_columns": null, + "grid_auto_flow": null, + "grid_auto_rows": null, + "grid_column": null, + "grid_gap": null, + "grid_row": null, + "grid_template_areas": 
null, + "grid_template_columns": null, + "grid_template_rows": null, + "height": null, + "justify_content": null, + "justify_items": null, + "left": null, + "margin": null, + "max_height": null, + "max_width": null, + "min_height": null, + "min_width": null, + "object_fit": null, + "object_position": null, + "order": null, + "overflow": null, + "overflow_x": null, + "overflow_y": null, + "padding": null, + "right": null, + "top": null, + "visibility": null, + "width": null + } + }, + "7eb7ef9ef3af46acb461b42efb9c4af0": { + "model_module": "@jupyter-widgets/controls", + "model_module_version": "1.5.0", + "model_name": "HTMLModel", + "state": { + "_dom_classes": [], + "_model_module": "@jupyter-widgets/controls", + "_model_module_version": "1.5.0", + "_model_name": "HTMLModel", + "_view_count": null, + "_view_module": "@jupyter-widgets/controls", + "_view_module_version": "1.5.0", + "_view_name": "HTMLView", + "description": "", + "description_tooltip": null, + "layout": "IPY_MODEL_b4a4dca5f5864bd3a1b202517c49b792", + "placeholder": "​", + "style": "IPY_MODEL_58c008b0cbf04cf2872afa353e6b9f76", + "value": " 94.6k/94.6k [00:00<00:00, 3.37MB/s]" + } + }, + "7efaf19ab2624dbf9adde1f8d03f3911": { + "model_module": "@jupyter-widgets/controls", + "model_module_version": "1.5.0", + "model_name": "DescriptionStyleModel", + "state": { + "_model_module": "@jupyter-widgets/controls", + "_model_module_version": "1.5.0", + "_model_name": "DescriptionStyleModel", + "_view_count": null, + "_view_module": "@jupyter-widgets/base", + "_view_module_version": "1.2.0", + "_view_name": "StyleView", + "description_width": "" + } + }, + "80e83a9fd2c2402c8eed6a659938acf6": { + "model_module": "@jupyter-widgets/controls", + "model_module_version": "1.5.0", + "model_name": "DescriptionStyleModel", + "state": { + "_model_module": "@jupyter-widgets/controls", + "_model_module_version": "1.5.0", + "_model_name": "DescriptionStyleModel", + "_view_count": null, + "_view_module": "@jupyter-widgets/base", + "_view_module_version": "1.2.0", + "_view_name": "StyleView", + "description_width": "" + } + }, + "813881219e0c420e80c9a4b27a023d54": { + "model_module": "@jupyter-widgets/controls", + "model_module_version": "1.5.0", + "model_name": "DescriptionStyleModel", + "state": { + "_model_module": "@jupyter-widgets/controls", + "_model_module_version": "1.5.0", + "_model_name": "DescriptionStyleModel", + "_view_count": null, + "_view_module": "@jupyter-widgets/base", + "_view_module_version": "1.2.0", + "_view_name": "StyleView", + "description_width": "" + } + }, + "817fc199281249e982f5b3ac78fe3923": { + "model_module": "@jupyter-widgets/controls", + "model_module_version": "1.5.0", + "model_name": "FloatProgressModel", + "state": { + "_dom_classes": [], + "_model_module": "@jupyter-widgets/controls", + "_model_module_version": "1.5.0", + "_model_name": "FloatProgressModel", + "_view_count": null, + "_view_module": "@jupyter-widgets/controls", + "_view_module_version": "1.5.0", + "_view_name": "ProgressView", + "bar_style": "success", + "description": "", + "description_tooltip": null, + "layout": "IPY_MODEL_d1881e5dd3bb42f3a62444a987aff829", + "max": 4, + "min": 0, + "orientation": "horizontal", + "style": "IPY_MODEL_115d71f18c1b4fc79ed6a27fc6401dbc", + "value": 4 + } + }, + "844d6117b65f442992c299535b66728e": { + "model_module": "@jupyter-widgets/controls", + "model_module_version": "1.5.0", + "model_name": "DescriptionStyleModel", + "state": { + "_model_module": "@jupyter-widgets/controls", + "_model_module_version": "1.5.0", 
+ "_model_name": "DescriptionStyleModel", + "_view_count": null, + "_view_module": "@jupyter-widgets/base", + "_view_module_version": "1.2.0", + "_view_name": "StyleView", + "description_width": "" + } + }, + "845ffe1f73164cd2a3e4f413e2a35241": { + "model_module": "@jupyter-widgets/controls", + "model_module_version": "1.5.0", + "model_name": "DescriptionStyleModel", + "state": { + "_model_module": "@jupyter-widgets/controls", + "_model_module_version": "1.5.0", + "_model_name": "DescriptionStyleModel", + "_view_count": null, + "_view_module": "@jupyter-widgets/base", + "_view_module_version": "1.2.0", + "_view_name": "StyleView", + "description_width": "" + } + }, + "85a1497ae9964e77885583613a7c6d9a": { + "model_module": "@jupyter-widgets/base", + "model_module_version": "1.2.0", + "model_name": "LayoutModel", + "state": { + "_model_module": "@jupyter-widgets/base", + "_model_module_version": "1.2.0", + "_model_name": "LayoutModel", + "_view_count": null, + "_view_module": "@jupyter-widgets/base", + "_view_module_version": "1.2.0", + "_view_name": "LayoutView", + "align_content": null, + "align_items": null, + "align_self": null, + "border": null, + "bottom": null, + "display": null, + "flex": null, + "flex_flow": null, + "grid_area": null, + "grid_auto_columns": null, + "grid_auto_flow": null, + "grid_auto_rows": null, + "grid_column": null, + "grid_gap": null, + "grid_row": null, + "grid_template_areas": null, + "grid_template_columns": null, + "grid_template_rows": null, + "height": null, + "justify_content": null, + "justify_items": null, + "left": null, + "margin": null, + "max_height": null, + "max_width": null, + "min_height": null, + "min_width": null, + "object_fit": null, + "object_position": null, + "order": null, + "overflow": null, + "overflow_x": null, + "overflow_y": null, + "padding": null, + "right": null, + "top": null, + "visibility": null, + "width": null + } + }, + "8607de98d88b42aba06c916138ad60d0": { + "model_module": "@jupyter-widgets/base", + "model_module_version": "1.2.0", + "model_name": "LayoutModel", + "state": { + "_model_module": "@jupyter-widgets/base", + "_model_module_version": "1.2.0", + "_model_name": "LayoutModel", + "_view_count": null, + "_view_module": "@jupyter-widgets/base", + "_view_module_version": "1.2.0", + "_view_name": "LayoutView", + "align_content": null, + "align_items": null, + "align_self": null, + "border": null, + "bottom": null, + "display": null, + "flex": null, + "flex_flow": null, + "grid_area": null, + "grid_auto_columns": null, + "grid_auto_flow": null, + "grid_auto_rows": null, + "grid_column": null, + "grid_gap": null, + "grid_row": null, + "grid_template_areas": null, + "grid_template_columns": null, + "grid_template_rows": null, + "height": null, + "justify_content": null, + "justify_items": null, + "left": null, + "margin": null, + "max_height": null, + "max_width": null, + "min_height": null, + "min_width": null, + "object_fit": null, + "object_position": null, + "order": null, + "overflow": null, + "overflow_x": null, + "overflow_y": null, + "padding": null, + "right": null, + "top": null, + "visibility": null, + "width": null + } + }, + "86104b3ee4564b8e9cef14eec70f1391": { + "model_module": "@jupyter-widgets/controls", + "model_module_version": "1.5.0", + "model_name": "DescriptionStyleModel", + "state": { + "_model_module": "@jupyter-widgets/controls", + "_model_module_version": "1.5.0", + "_model_name": "DescriptionStyleModel", + "_view_count": null, + "_view_module": "@jupyter-widgets/base", + 
"_view_module_version": "1.2.0", + "_view_name": "StyleView", + "description_width": "" + } + }, + "866796ac33744148b117d864df6283dc": { + "model_module": "@jupyter-widgets/controls", + "model_module_version": "1.5.0", + "model_name": "FloatProgressModel", + "state": { + "_dom_classes": [], + "_model_module": "@jupyter-widgets/controls", + "_model_module_version": "1.5.0", + "_model_name": "FloatProgressModel", + "_view_count": null, + "_view_module": "@jupyter-widgets/controls", + "_view_module_version": "1.5.0", + "_view_name": "ProgressView", + "bar_style": "success", + "description": "", + "description_tooltip": null, + "layout": "IPY_MODEL_8945460a9d714e559bfeb3d6b5c77196", + "max": 711396, + "min": 0, + "orientation": "horizontal", + "style": "IPY_MODEL_595ca1e634504c98adf61fc08c98a826", + "value": 711396 + } + }, + "875064682fde47e89d090f9dea7f2a9e": { + "model_module": "@jupyter-widgets/base", + "model_module_version": "1.2.0", + "model_name": "LayoutModel", + "state": { + "_model_module": "@jupyter-widgets/base", + "_model_module_version": "1.2.0", + "_model_name": "LayoutModel", + "_view_count": null, + "_view_module": "@jupyter-widgets/base", + "_view_module_version": "1.2.0", + "_view_name": "LayoutView", + "align_content": null, + "align_items": null, + "align_self": null, + "border": null, + "bottom": null, + "display": null, + "flex": null, + "flex_flow": null, + "grid_area": null, + "grid_auto_columns": null, + "grid_auto_flow": null, + "grid_auto_rows": null, + "grid_column": null, + "grid_gap": null, + "grid_row": null, + "grid_template_areas": null, + "grid_template_columns": null, + "grid_template_rows": null, + "height": null, + "justify_content": null, + "justify_items": null, + "left": null, + "margin": null, + "max_height": null, + "max_width": null, + "min_height": null, + "min_width": null, + "object_fit": null, + "object_position": null, + "order": null, + "overflow": null, + "overflow_x": null, + "overflow_y": null, + "padding": null, + "right": null, + "top": null, + "visibility": null, + "width": null + } + }, + "88f77b9840e543be8c2eedd9ea3e8f09": { + "model_module": "@jupyter-widgets/base", + "model_module_version": "1.2.0", + "model_name": "LayoutModel", + "state": { + "_model_module": "@jupyter-widgets/base", + "_model_module_version": "1.2.0", + "_model_name": "LayoutModel", + "_view_count": null, + "_view_module": "@jupyter-widgets/base", + "_view_module_version": "1.2.0", + "_view_name": "LayoutView", + "align_content": null, + "align_items": null, + "align_self": null, + "border": null, + "bottom": null, + "display": null, + "flex": null, + "flex_flow": null, + "grid_area": null, + "grid_auto_columns": null, + "grid_auto_flow": null, + "grid_auto_rows": null, + "grid_column": null, + "grid_gap": null, + "grid_row": null, + "grid_template_areas": null, + "grid_template_columns": null, + "grid_template_rows": null, + "height": null, + "justify_content": null, + "justify_items": null, + "left": null, + "margin": null, + "max_height": null, + "max_width": null, + "min_height": null, + "min_width": null, + "object_fit": null, + "object_position": null, + "order": null, + "overflow": null, + "overflow_x": null, + "overflow_y": null, + "padding": null, + "right": null, + "top": null, + "visibility": null, + "width": null + } + }, + "8945460a9d714e559bfeb3d6b5c77196": { + "model_module": "@jupyter-widgets/base", + "model_module_version": "1.2.0", + "model_name": "LayoutModel", + "state": { + "_model_module": "@jupyter-widgets/base", + "_model_module_version": 
"1.2.0", + "_model_name": "LayoutModel", + "_view_count": null, + "_view_module": "@jupyter-widgets/base", + "_view_module_version": "1.2.0", + "_view_name": "LayoutView", + "align_content": null, + "align_items": null, + "align_self": null, + "border": null, + "bottom": null, + "display": null, + "flex": null, + "flex_flow": null, + "grid_area": null, + "grid_auto_columns": null, + "grid_auto_flow": null, + "grid_auto_rows": null, + "grid_column": null, + "grid_gap": null, + "grid_row": null, + "grid_template_areas": null, + "grid_template_columns": null, + "grid_template_rows": null, + "height": null, + "justify_content": null, + "justify_items": null, + "left": null, + "margin": null, + "max_height": null, + "max_width": null, + "min_height": null, + "min_width": null, + "object_fit": null, + "object_position": null, + "order": null, + "overflow": null, + "overflow_x": null, + "overflow_y": null, + "padding": null, + "right": null, + "top": null, + "visibility": null, + "width": null + } + }, + "89b20da2defb418299dd1b4b41f3414a": { + "model_module": "@jupyter-widgets/base", + "model_module_version": "1.2.0", + "model_name": "LayoutModel", + "state": { + "_model_module": "@jupyter-widgets/base", + "_model_module_version": "1.2.0", + "_model_name": "LayoutModel", + "_view_count": null, + "_view_module": "@jupyter-widgets/base", + "_view_module_version": "1.2.0", + "_view_name": "LayoutView", + "align_content": null, + "align_items": null, + "align_self": null, + "border": null, + "bottom": null, + "display": null, + "flex": null, + "flex_flow": null, + "grid_area": null, + "grid_auto_columns": null, + "grid_auto_flow": null, + "grid_auto_rows": null, + "grid_column": null, + "grid_gap": null, + "grid_row": null, + "grid_template_areas": null, + "grid_template_columns": null, + "grid_template_rows": null, + "height": null, + "justify_content": null, + "justify_items": null, + "left": null, + "margin": null, + "max_height": null, + "max_width": null, + "min_height": null, + "min_width": null, + "object_fit": null, + "object_position": null, + "order": null, + "overflow": null, + "overflow_x": null, + "overflow_y": null, + "padding": null, + "right": null, + "top": null, + "visibility": null, + "width": null + } + }, + "8a429e04f8c04b4cbccc2a0fbfc69f39": { + "model_module": "@jupyter-widgets/controls", + "model_module_version": "1.5.0", + "model_name": "FloatProgressModel", + "state": { + "_dom_classes": [], + "_model_module": "@jupyter-widgets/controls", + "_model_module_version": "1.5.0", + "_model_name": "FloatProgressModel", + "_view_count": null, + "_view_module": "@jupyter-widgets/controls", + "_view_module_version": "1.5.0", + "_view_name": "ProgressView", + "bar_style": "success", + "description": "", + "description_tooltip": null, + "layout": "IPY_MODEL_594e9fa90fac4d40b1524d57304b37ee", + "max": 190, + "min": 0, + "orientation": "horizontal", + "style": "IPY_MODEL_1b0ce7f124384da48b7ed934a283a191", + "value": 190 + } + }, + "8b0310dae18b4f7fa12048a9376f021d": { + "model_module": "@jupyter-widgets/base", + "model_module_version": "1.2.0", + "model_name": "LayoutModel", + "state": { + "_model_module": "@jupyter-widgets/base", + "_model_module_version": "1.2.0", + "_model_name": "LayoutModel", + "_view_count": null, + "_view_module": "@jupyter-widgets/base", + "_view_module_version": "1.2.0", + "_view_name": "LayoutView", + "align_content": null, + "align_items": null, + "align_self": null, + "border": null, + "bottom": null, + "display": null, + "flex": null, + "flex_flow": null, + 
"grid_area": null, + "grid_auto_columns": null, + "grid_auto_flow": null, + "grid_auto_rows": null, + "grid_column": null, + "grid_gap": null, + "grid_row": null, + "grid_template_areas": null, + "grid_template_columns": null, + "grid_template_rows": null, + "height": null, + "justify_content": null, + "justify_items": null, + "left": null, + "margin": null, + "max_height": null, + "max_width": null, + "min_height": null, + "min_width": null, + "object_fit": null, + "object_position": null, + "order": null, + "overflow": null, + "overflow_x": null, + "overflow_y": null, + "padding": null, + "right": null, + "top": null, + "visibility": null, + "width": null + } + }, + "8b98d1642c3e4c959952248ce40c60c7": { + "model_module": "@jupyter-widgets/controls", + "model_module_version": "1.5.0", + "model_name": "FloatProgressModel", + "state": { + "_dom_classes": [], + "_model_module": "@jupyter-widgets/controls", + "_model_module_version": "1.5.0", + "_model_name": "FloatProgressModel", + "_view_count": null, + "_view_module": "@jupyter-widgets/controls", + "_view_module_version": "1.5.0", + "_view_name": "ProgressView", + "bar_style": "success", + "description": "", + "description_tooltip": null, + "layout": "IPY_MODEL_e3d77e059a5f47eb8dbeeef22febc9d8", + "max": 52, + "min": 0, + "orientation": "horizontal", + "style": "IPY_MODEL_3cad7442bcfa45fd9aed4002324a9d7f", + "value": 52 + } + }, + "8c5086b82bc243f5bac08f9aed527c19": { + "model_module": "@jupyter-widgets/controls", + "model_module_version": "1.5.0", + "model_name": "HBoxModel", + "state": { + "_dom_classes": [], + "_model_module": "@jupyter-widgets/controls", + "_model_module_version": "1.5.0", + "_model_name": "HBoxModel", + "_view_count": null, + "_view_module": "@jupyter-widgets/controls", + "_view_module_version": "1.5.0", + "_view_name": "HBoxView", + "box_style": "", + "children": [ + "IPY_MODEL_f98ad69de71b4ff5bc43e3d4cd4cc5ee", + "IPY_MODEL_8b98d1642c3e4c959952248ce40c60c7", + "IPY_MODEL_1134e014bcc242888828682645029a1e" + ], + "layout": "IPY_MODEL_423cefd4340c482cb70337168049f1c7" + } + }, + "8d536f000daf4a928b9daeddf76cb375": { + "model_module": "@jupyter-widgets/controls", + "model_module_version": "1.5.0", + "model_name": "HBoxModel", + "state": { + "_dom_classes": [], + "_model_module": "@jupyter-widgets/controls", + "_model_module_version": "1.5.0", + "_model_name": "HBoxModel", + "_view_count": null, + "_view_module": "@jupyter-widgets/controls", + "_view_module_version": "1.5.0", + "_view_name": "HBoxView", + "box_style": "", + "children": [ + "IPY_MODEL_a3e1d6f639c84b21aeb7e4ac96381ec8", + "IPY_MODEL_5f9e84ae3afe47d49b257941777ca5b0", + "IPY_MODEL_69d64f8babda4c52aea682b26d1a6e00" + ], + "layout": "IPY_MODEL_b9d09d77494f4e2aa9b500f5d0fdadaf" + } + }, + "8d6d4222b2c348bf8f79031dba831dae": { + "model_module": "@jupyter-widgets/controls", + "model_module_version": "1.5.0", + "model_name": "DescriptionStyleModel", + "state": { + "_model_module": "@jupyter-widgets/controls", + "_model_module_version": "1.5.0", + "_model_name": "DescriptionStyleModel", + "_view_count": null, + "_view_module": "@jupyter-widgets/base", + "_view_module_version": "1.2.0", + "_view_name": "StyleView", + "description_width": "" + } + }, + "8dac808a75e94c33bc766ac34befc1df": { + "model_module": "@jupyter-widgets/controls", + "model_module_version": "1.5.0", + "model_name": "HTMLModel", + "state": { + "_dom_classes": [], + "_model_module": "@jupyter-widgets/controls", + "_model_module_version": "1.5.0", + "_model_name": "HTMLModel", + "_view_count": 
null, + "_view_module": "@jupyter-widgets/controls", + "_view_module_version": "1.5.0", + "_view_name": "HTMLView", + "description": "", + "description_tooltip": null, + "layout": "IPY_MODEL_24243f2c5cfc4dc0b19f69eca49639a5", + "placeholder": "​", + "style": "IPY_MODEL_bdcb2072ab124038a60f41bb165974d1", + "value": " 9.08M/9.08M [00:00<00:00, 24.2MB/s]" + } + }, + "8e3912f4f3134e7f90ae0959536f6429": { + "model_module": "@jupyter-widgets/controls", + "model_module_version": "1.5.0", + "model_name": "HBoxModel", + "state": { + "_dom_classes": [], + "_model_module": "@jupyter-widgets/controls", + "_model_module_version": "1.5.0", + "_model_name": "HBoxModel", + "_view_count": null, + "_view_module": "@jupyter-widgets/controls", + "_view_module_version": "1.5.0", + "_view_name": "HBoxView", + "box_style": "", + "children": [ + "IPY_MODEL_264da80b4e3e446199cf9d53b3f4ed58", + "IPY_MODEL_866796ac33744148b117d864df6283dc", + "IPY_MODEL_702057124ce3481dab7da4bedce9c7e4" + ], + "layout": "IPY_MODEL_a67b0a951275400bbfdc5ab341cd98bc" + } + }, + "8fccf616a7694c529c0c966d417cad51": { + "model_module": "@jupyter-widgets/base", + "model_module_version": "1.2.0", + "model_name": "LayoutModel", + "state": { + "_model_module": "@jupyter-widgets/base", + "_model_module_version": "1.2.0", + "_model_name": "LayoutModel", + "_view_count": null, + "_view_module": "@jupyter-widgets/base", + "_view_module_version": "1.2.0", + "_view_name": "LayoutView", + "align_content": null, + "align_items": null, + "align_self": null, + "border": null, + "bottom": null, + "display": null, + "flex": null, + "flex_flow": null, + "grid_area": null, + "grid_auto_columns": null, + "grid_auto_flow": null, + "grid_auto_rows": null, + "grid_column": null, + "grid_gap": null, + "grid_row": null, + "grid_template_areas": null, + "grid_template_columns": null, + "grid_template_rows": null, + "height": null, + "justify_content": null, + "justify_items": null, + "left": null, + "margin": null, + "max_height": null, + "max_width": null, + "min_height": null, + "min_width": null, + "object_fit": null, + "object_position": null, + "order": null, + "overflow": null, + "overflow_x": null, + "overflow_y": null, + "padding": null, + "right": null, + "top": null, + "visibility": null, + "width": null + } + }, + "91f078a552854c83a9821f0e23a2179f": { + "model_module": "@jupyter-widgets/controls", + "model_module_version": "1.5.0", + "model_name": "HTMLModel", + "state": { + "_dom_classes": [], + "_model_module": "@jupyter-widgets/controls", + "_model_module_version": "1.5.0", + "_model_name": "HTMLModel", + "_view_count": null, + "_view_module": "@jupyter-widgets/controls", + "_view_module_version": "1.5.0", + "_view_name": "HTMLView", + "description": "", + "description_tooltip": null, + "layout": "IPY_MODEL_0c91a9d8b10d49a5afde9f9208acb95f", + "placeholder": "​", + "style": "IPY_MODEL_0f33403f4da64ee6bc338719f05aaa02", + "value": "model-00002-of-00004.safetensors: 100%" + } + }, + "920320e9727946abbbb4bfaf899b6d09": { + "model_module": "@jupyter-widgets/controls", + "model_module_version": "1.5.0", + "model_name": "HTMLModel", + "state": { + "_dom_classes": [], + "_model_module": "@jupyter-widgets/controls", + "_model_module_version": "1.5.0", + "_model_name": "HTMLModel", + "_view_count": null, + "_view_module": "@jupyter-widgets/controls", + "_view_module_version": "1.5.0", + "_view_name": "HTMLView", + "description": "", + "description_tooltip": null, + "layout": "IPY_MODEL_74c3c91eb1964ac4a6ae58b0d95b874c", + "placeholder": "​", + "style": 
"IPY_MODEL_844d6117b65f442992c299535b66728e", + "value": "tokenizer_config.json: 100%" + } + }, + "932611e22f1f4a1f9598be2a287d397a": { + "model_module": "@jupyter-widgets/controls", + "model_module_version": "1.5.0", + "model_name": "DescriptionStyleModel", + "state": { + "_model_module": "@jupyter-widgets/controls", + "_model_module_version": "1.5.0", + "_model_name": "DescriptionStyleModel", + "_view_count": null, + "_view_module": "@jupyter-widgets/base", + "_view_module_version": "1.2.0", + "_view_name": "StyleView", + "description_width": "" + } + }, + "95aa717fc02b43fbb08ed633480bf1ee": { + "model_module": "@jupyter-widgets/controls", + "model_module_version": "1.5.0", + "model_name": "DescriptionStyleModel", + "state": { + "_model_module": "@jupyter-widgets/controls", + "_model_module_version": "1.5.0", + "_model_name": "DescriptionStyleModel", + "_view_count": null, + "_view_module": "@jupyter-widgets/base", + "_view_module_version": "1.2.0", + "_view_name": "StyleView", + "description_width": "" + } + }, + "998a415e72704b20b00c2e3e4bf37614": { + "model_module": "@jupyter-widgets/controls", + "model_module_version": "1.5.0", + "model_name": "ProgressStyleModel", + "state": { + "_model_module": "@jupyter-widgets/controls", + "_model_module_version": "1.5.0", + "_model_name": "ProgressStyleModel", + "_view_count": null, + "_view_module": "@jupyter-widgets/base", + "_view_module_version": "1.2.0", + "_view_name": "StyleView", + "bar_color": null, + "description_width": "" + } + }, + "9aff0e55c7a24c618c998804a769669f": { + "model_module": "@jupyter-widgets/base", + "model_module_version": "1.2.0", + "model_name": "LayoutModel", + "state": { + "_model_module": "@jupyter-widgets/base", + "_model_module_version": "1.2.0", + "_model_name": "LayoutModel", + "_view_count": null, + "_view_module": "@jupyter-widgets/base", + "_view_module_version": "1.2.0", + "_view_name": "LayoutView", + "align_content": null, + "align_items": null, + "align_self": null, + "border": null, + "bottom": null, + "display": null, + "flex": null, + "flex_flow": null, + "grid_area": null, + "grid_auto_columns": null, + "grid_auto_flow": null, + "grid_auto_rows": null, + "grid_column": null, + "grid_gap": null, + "grid_row": null, + "grid_template_areas": null, + "grid_template_columns": null, + "grid_template_rows": null, + "height": null, + "justify_content": null, + "justify_items": null, + "left": null, + "margin": null, + "max_height": null, + "max_width": null, + "min_height": null, + "min_width": null, + "object_fit": null, + "object_position": null, + "order": null, + "overflow": null, + "overflow_x": null, + "overflow_y": null, + "padding": null, + "right": null, + "top": null, + "visibility": null, + "width": null + } + }, + "9b318fd56d8e45dc8c4d532662c5aba0": { + "model_module": "@jupyter-widgets/base", + "model_module_version": "1.2.0", + "model_name": "LayoutModel", + "state": { + "_model_module": "@jupyter-widgets/base", + "_model_module_version": "1.2.0", + "_model_name": "LayoutModel", + "_view_count": null, + "_view_module": "@jupyter-widgets/base", + "_view_module_version": "1.2.0", + "_view_name": "LayoutView", + "align_content": null, + "align_items": null, + "align_self": null, + "border": null, + "bottom": null, + "display": null, + "flex": null, + "flex_flow": null, + "grid_area": null, + "grid_auto_columns": null, + "grid_auto_flow": null, + "grid_auto_rows": null, + "grid_column": null, + "grid_gap": null, + "grid_row": null, + "grid_template_areas": null, + "grid_template_columns": null, + 
"grid_template_rows": null, + "height": null, + "justify_content": null, + "justify_items": null, + "left": null, + "margin": null, + "max_height": null, + "max_width": null, + "min_height": null, + "min_width": null, + "object_fit": null, + "object_position": null, + "order": null, + "overflow": null, + "overflow_x": null, + "overflow_y": null, + "padding": null, + "right": null, + "top": null, + "visibility": null, + "width": null + } + }, + "9c0b221cf5bf4b8495e435137a9468e5": { + "model_module": "@jupyter-widgets/base", + "model_module_version": "1.2.0", + "model_name": "LayoutModel", + "state": { + "_model_module": "@jupyter-widgets/base", + "_model_module_version": "1.2.0", + "_model_name": "LayoutModel", + "_view_count": null, + "_view_module": "@jupyter-widgets/base", + "_view_module_version": "1.2.0", + "_view_name": "LayoutView", + "align_content": null, + "align_items": null, + "align_self": null, + "border": null, + "bottom": null, + "display": null, + "flex": null, + "flex_flow": null, + "grid_area": null, + "grid_auto_columns": null, + "grid_auto_flow": null, + "grid_auto_rows": null, + "grid_column": null, + "grid_gap": null, + "grid_row": null, + "grid_template_areas": null, + "grid_template_columns": null, + "grid_template_rows": null, + "height": null, + "justify_content": null, + "justify_items": null, + "left": null, + "margin": null, + "max_height": null, + "max_width": null, + "min_height": null, + "min_width": null, + "object_fit": null, + "object_position": null, + "order": null, + "overflow": null, + "overflow_x": null, + "overflow_y": null, + "padding": null, + "right": null, + "top": null, + "visibility": null, + "width": null + } + }, + "9d6f989fbb584b2eb25172fcc662b2e7": { + "model_module": "@jupyter-widgets/controls", + "model_module_version": "1.5.0", + "model_name": "DescriptionStyleModel", + "state": { + "_model_module": "@jupyter-widgets/controls", + "_model_module_version": "1.5.0", + "_model_name": "DescriptionStyleModel", + "_view_count": null, + "_view_module": "@jupyter-widgets/base", + "_view_module_version": "1.2.0", + "_view_name": "StyleView", + "description_width": "" + } + }, + "9e003352a7d742fb833f15b532a5ad53": { + "model_module": "@jupyter-widgets/controls", + "model_module_version": "1.5.0", + "model_name": "FloatProgressModel", + "state": { + "_dom_classes": [], + "_model_module": "@jupyter-widgets/controls", + "_model_module_version": "1.5.0", + "_model_name": "FloatProgressModel", + "_view_count": null, + "_view_module": "@jupyter-widgets/controls", + "_view_module_version": "1.5.0", + "_view_name": "ProgressView", + "bar_style": "success", + "description": "", + "description_tooltip": null, + "layout": "IPY_MODEL_6fac241a17194caf908895b8f4cdb377", + "max": 231508, + "min": 0, + "orientation": "horizontal", + "style": "IPY_MODEL_ef9b4e9212c845359acb1a0d295c2211", + "value": 231508 + } + }, + "a2e3896132754b40bb9252709d86b17a": { + "model_module": "@jupyter-widgets/controls", + "model_module_version": "1.5.0", + "model_name": "HBoxModel", + "state": { + "_dom_classes": [], + "_model_module": "@jupyter-widgets/controls", + "_model_module_version": "1.5.0", + "_model_name": "HBoxModel", + "_view_count": null, + "_view_module": "@jupyter-widgets/controls", + "_view_module_version": "1.5.0", + "_view_name": "HBoxView", + "box_style": "", + "children": [ + "IPY_MODEL_920320e9727946abbbb4bfaf899b6d09", + "IPY_MODEL_e71f70b8dde344b38818b098fa418ec4", + "IPY_MODEL_1738720dc3e640aaaaf1d2eb609af8b6" + ], + "layout": 
"IPY_MODEL_ada1722de7c443ff93ef04df53b89404" + } + }, + "a3b3b8175680410483eb988ed1c6bffa": { + "model_module": "@jupyter-widgets/base", + "model_module_version": "1.2.0", + "model_name": "LayoutModel", + "state": { + "_model_module": "@jupyter-widgets/base", + "_model_module_version": "1.2.0", + "_model_name": "LayoutModel", + "_view_count": null, + "_view_module": "@jupyter-widgets/base", + "_view_module_version": "1.2.0", + "_view_name": "LayoutView", + "align_content": null, + "align_items": null, + "align_self": null, + "border": null, + "bottom": null, + "display": null, + "flex": null, + "flex_flow": null, + "grid_area": null, + "grid_auto_columns": null, + "grid_auto_flow": null, + "grid_auto_rows": null, + "grid_column": null, + "grid_gap": null, + "grid_row": null, + "grid_template_areas": null, + "grid_template_columns": null, + "grid_template_rows": null, + "height": null, + "justify_content": null, + "justify_items": null, + "left": null, + "margin": null, + "max_height": null, + "max_width": null, + "min_height": null, + "min_width": null, + "object_fit": null, + "object_position": null, + "order": null, + "overflow": null, + "overflow_x": null, + "overflow_y": null, + "padding": null, + "right": null, + "top": null, + "visibility": null, + "width": null + } + }, + "a3e1d6f639c84b21aeb7e4ac96381ec8": { + "model_module": "@jupyter-widgets/controls", + "model_module_version": "1.5.0", + "model_name": "HTMLModel", + "state": { + "_dom_classes": [], + "_model_module": "@jupyter-widgets/controls", + "_model_module_version": "1.5.0", + "_model_name": "HTMLModel", + "_view_count": null, + "_view_module": "@jupyter-widgets/controls", + "_view_module_version": "1.5.0", + "_view_name": "HTMLView", + "description": "", + "description_tooltip": null, + "layout": "IPY_MODEL_9b318fd56d8e45dc8c4d532662c5aba0", + "placeholder": "​", + "style": "IPY_MODEL_a8513c36560449099f7c257c21d5065b", + "value": "config_sentence_transformers.json: 100%" + } + }, + "a57019b4b2394e3a8144831d400d7c0b": { + "model_module": "@jupyter-widgets/base", + "model_module_version": "1.2.0", + "model_name": "LayoutModel", + "state": { + "_model_module": "@jupyter-widgets/base", + "_model_module_version": "1.2.0", + "_model_name": "LayoutModel", + "_view_count": null, + "_view_module": "@jupyter-widgets/base", + "_view_module_version": "1.2.0", + "_view_name": "LayoutView", + "align_content": null, + "align_items": null, + "align_self": null, + "border": null, + "bottom": null, + "display": null, + "flex": null, + "flex_flow": null, + "grid_area": null, + "grid_auto_columns": null, + "grid_auto_flow": null, + "grid_auto_rows": null, + "grid_column": null, + "grid_gap": null, + "grid_row": null, + "grid_template_areas": null, + "grid_template_columns": null, + "grid_template_rows": null, + "height": null, + "justify_content": null, + "justify_items": null, + "left": null, + "margin": null, + "max_height": null, + "max_width": null, + "min_height": null, + "min_width": null, + "object_fit": null, + "object_position": null, + "order": null, + "overflow": null, + "overflow_x": null, + "overflow_y": null, + "padding": null, + "right": null, + "top": null, + "visibility": null, + "width": null + } + }, + "a62910759d0440e4a1e7b9a99cdccda0": { + "model_module": "@jupyter-widgets/base", + "model_module_version": "1.2.0", + "model_name": "LayoutModel", + "state": { + "_model_module": "@jupyter-widgets/base", + "_model_module_version": "1.2.0", + "_model_name": "LayoutModel", + "_view_count": null, + "_view_module": 
"@jupyter-widgets/base", + "_view_module_version": "1.2.0", + "_view_name": "LayoutView", + "align_content": null, + "align_items": null, + "align_self": null, + "border": null, + "bottom": null, + "display": null, + "flex": null, + "flex_flow": null, + "grid_area": null, + "grid_auto_columns": null, + "grid_auto_flow": null, + "grid_auto_rows": null, + "grid_column": null, + "grid_gap": null, + "grid_row": null, + "grid_template_areas": null, + "grid_template_columns": null, + "grid_template_rows": null, + "height": null, + "justify_content": null, + "justify_items": null, + "left": null, + "margin": null, + "max_height": null, + "max_width": null, + "min_height": null, + "min_width": null, + "object_fit": null, + "object_position": null, + "order": null, + "overflow": null, + "overflow_x": null, + "overflow_y": null, + "padding": null, + "right": null, + "top": null, + "visibility": null, + "width": null + } + }, + "a6403a66557341dba6634d7ab2b5b4b1": { + "model_module": "@jupyter-widgets/controls", + "model_module_version": "1.5.0", + "model_name": "HTMLModel", + "state": { + "_dom_classes": [], + "_model_module": "@jupyter-widgets/controls", + "_model_module_version": "1.5.0", + "_model_name": "HTMLModel", + "_view_count": null, + "_view_module": "@jupyter-widgets/controls", + "_view_module_version": "1.5.0", + "_view_name": "HTMLView", + "description": "", + "description_tooltip": null, + "layout": "IPY_MODEL_6dbddb37a5c84395b8e4a8db76e16c50", + "placeholder": "​", + "style": "IPY_MODEL_5520f2680fd14de2a9277bddec4e3b2e", + "value": " 73.0/73.0 [00:00<00:00, 3.30kB/s]" + } + }, + "a67b0a951275400bbfdc5ab341cd98bc": { + "model_module": "@jupyter-widgets/base", + "model_module_version": "1.2.0", + "model_name": "LayoutModel", + "state": { + "_model_module": "@jupyter-widgets/base", + "_model_module_version": "1.2.0", + "_model_name": "LayoutModel", + "_view_count": null, + "_view_module": "@jupyter-widgets/base", + "_view_module_version": "1.2.0", + "_view_name": "LayoutView", + "align_content": null, + "align_items": null, + "align_self": null, + "border": null, + "bottom": null, + "display": null, + "flex": null, + "flex_flow": null, + "grid_area": null, + "grid_auto_columns": null, + "grid_auto_flow": null, + "grid_auto_rows": null, + "grid_column": null, + "grid_gap": null, + "grid_row": null, + "grid_template_areas": null, + "grid_template_columns": null, + "grid_template_rows": null, + "height": null, + "justify_content": null, + "justify_items": null, + "left": null, + "margin": null, + "max_height": null, + "max_width": null, + "min_height": null, + "min_width": null, + "object_fit": null, + "object_position": null, + "order": null, + "overflow": null, + "overflow_x": null, + "overflow_y": null, + "padding": null, + "right": null, + "top": null, + "visibility": null, + "width": null + } + }, + "a8513c36560449099f7c257c21d5065b": { + "model_module": "@jupyter-widgets/controls", + "model_module_version": "1.5.0", + "model_name": "DescriptionStyleModel", + "state": { + "_model_module": "@jupyter-widgets/controls", + "_model_module_version": "1.5.0", + "_model_name": "DescriptionStyleModel", + "_view_count": null, + "_view_module": "@jupyter-widgets/base", + "_view_module_version": "1.2.0", + "_view_name": "StyleView", + "description_width": "" + } + }, + "a91b8f174a1a49e1a1f114316ea23133": { + "model_module": "@jupyter-widgets/controls", + "model_module_version": "1.5.0", + "model_name": "HTMLModel", + "state": { + "_dom_classes": [], + "_model_module": "@jupyter-widgets/controls", + 
"_model_module_version": "1.5.0", + "_model_name": "HTMLModel", + "_view_count": null, + "_view_module": "@jupyter-widgets/controls", + "_view_module_version": "1.5.0", + "_view_name": "HTMLView", + "description": "", + "description_tooltip": null, + "layout": "IPY_MODEL_b99c9208985f485cbbfbca51d5260ba5", + "placeholder": "​", + "style": "IPY_MODEL_d6960ffb549e4e69a919405abecb22a3", + "value": " 126/126 [00:00<00:00, 8.70kB/s]" + } + }, + "aa8fb50e8b524ff58f0493671fb0ba37": { + "model_module": "@jupyter-widgets/controls", + "model_module_version": "1.5.0", + "model_name": "DescriptionStyleModel", + "state": { + "_model_module": "@jupyter-widgets/controls", + "_model_module_version": "1.5.0", + "_model_name": "DescriptionStyleModel", + "_view_count": null, + "_view_module": "@jupyter-widgets/base", + "_view_module_version": "1.2.0", + "_view_name": "StyleView", + "description_width": "" + } + }, + "ab7055cac4e940229ee11ff98caafdd0": { + "model_module": "@jupyter-widgets/controls", + "model_module_version": "1.5.0", + "model_name": "FloatProgressModel", + "state": { + "_dom_classes": [], + "_model_module": "@jupyter-widgets/controls", + "_model_module_version": "1.5.0", + "_model_name": "FloatProgressModel", + "_view_count": null, + "_view_module": "@jupyter-widgets/controls", + "_view_module_version": "1.5.0", + "_view_name": "ProgressView", + "bar_style": "success", + "description": "", + "description_tooltip": null, + "layout": "IPY_MODEL_9c0b221cf5bf4b8495e435137a9468e5", + "max": 4915916176, + "min": 0, + "orientation": "horizontal", + "style": "IPY_MODEL_b590bf4d0dda4fd99ebc4b2cfe8050bc", + "value": 4915916176 + } + }, + "ac5e4ffea47a4a148219e7e998b5c325": { + "model_module": "@jupyter-widgets/controls", + "model_module_version": "1.5.0", + "model_name": "HTMLModel", + "state": { + "_dom_classes": [], + "_model_module": "@jupyter-widgets/controls", + "_model_module_version": "1.5.0", + "_model_name": "HTMLModel", + "_view_count": null, + "_view_module": "@jupyter-widgets/controls", + "_view_module_version": "1.5.0", + "_view_name": "HTMLView", + "description": "", + "description_tooltip": null, + "layout": "IPY_MODEL_f705b798938b4e919593be8fd9bbcc64", + "placeholder": "​", + "style": "IPY_MODEL_e28088caf5904fbcb81ff49cd04a1667", + "value": " 190/190 [00:00<00:00, 10.9kB/s]" + } + }, + "acd718fefdc84b8fa60d4255077914ee": { + "model_module": "@jupyter-widgets/controls", + "model_module_version": "1.5.0", + "model_name": "ProgressStyleModel", + "state": { + "_model_module": "@jupyter-widgets/controls", + "_model_module_version": "1.5.0", + "_model_name": "ProgressStyleModel", + "_view_count": null, + "_view_module": "@jupyter-widgets/base", + "_view_module_version": "1.2.0", + "_view_name": "StyleView", + "bar_color": null, + "description_width": "" + } + }, + "ada1722de7c443ff93ef04df53b89404": { + "model_module": "@jupyter-widgets/base", + "model_module_version": "1.2.0", + "model_name": "LayoutModel", + "state": { + "_model_module": "@jupyter-widgets/base", + "_model_module_version": "1.2.0", + "_model_name": "LayoutModel", + "_view_count": null, + "_view_module": "@jupyter-widgets/base", + "_view_module_version": "1.2.0", + "_view_name": "LayoutView", + "align_content": null, + "align_items": null, + "align_self": null, + "border": null, + "bottom": null, + "display": null, + "flex": null, + "flex_flow": null, + "grid_area": null, + "grid_auto_columns": null, + "grid_auto_flow": null, + "grid_auto_rows": null, + "grid_column": null, + "grid_gap": null, + "grid_row": null, + 
"grid_template_areas": null, + "grid_template_columns": null, + "grid_template_rows": null, + "height": null, + "justify_content": null, + "justify_items": null, + "left": null, + "margin": null, + "max_height": null, + "max_width": null, + "min_height": null, + "min_width": null, + "object_fit": null, + "object_position": null, + "order": null, + "overflow": null, + "overflow_x": null, + "overflow_y": null, + "padding": null, + "right": null, + "top": null, + "visibility": null, + "width": null + } + }, + "afeca83368d44e4fa29d882a8f9cc0a5": { + "model_module": "@jupyter-widgets/controls", + "model_module_version": "1.5.0", + "model_name": "ProgressStyleModel", + "state": { + "_model_module": "@jupyter-widgets/controls", + "_model_module_version": "1.5.0", + "_model_name": "ProgressStyleModel", + "_view_count": null, + "_view_module": "@jupyter-widgets/base", + "_view_module_version": "1.2.0", + "_view_name": "StyleView", + "bar_color": null, + "description_width": "" + } + }, + "b0512a445d8748e0a538cd037dd80480": { + "model_module": "@jupyter-widgets/base", + "model_module_version": "1.2.0", + "model_name": "LayoutModel", + "state": { + "_model_module": "@jupyter-widgets/base", + "_model_module_version": "1.2.0", + "_model_name": "LayoutModel", + "_view_count": null, + "_view_module": "@jupyter-widgets/base", + "_view_module_version": "1.2.0", + "_view_name": "LayoutView", + "align_content": null, + "align_items": null, + "align_self": null, + "border": null, + "bottom": null, + "display": null, + "flex": null, + "flex_flow": null, + "grid_area": null, + "grid_auto_columns": null, + "grid_auto_flow": null, + "grid_auto_rows": null, + "grid_column": null, + "grid_gap": null, + "grid_row": null, + "grid_template_areas": null, + "grid_template_columns": null, + "grid_template_rows": null, + "height": null, + "justify_content": null, + "justify_items": null, + "left": null, + "margin": null, + "max_height": null, + "max_width": null, + "min_height": null, + "min_width": null, + "object_fit": null, + "object_position": null, + "order": null, + "overflow": null, + "overflow_x": null, + "overflow_y": null, + "padding": null, + "right": null, + "top": null, + "visibility": null, + "width": null + } + }, + "b08e94b9ed83461fb9744cc6cdac0adf": { + "model_module": "@jupyter-widgets/controls", + "model_module_version": "1.5.0", + "model_name": "HTMLModel", + "state": { + "_dom_classes": [], + "_model_module": "@jupyter-widgets/controls", + "_model_module_version": "1.5.0", + "_model_name": "HTMLModel", + "_view_count": null, + "_view_module": "@jupyter-widgets/controls", + "_view_module_version": "1.5.0", + "_view_name": "HTMLView", + "description": "", + "description_tooltip": null, + "layout": "IPY_MODEL_299ce57151134fe38ae9ba3260e5383c", + "placeholder": "​", + "style": "IPY_MODEL_2a75284ca1a74cf6b2df311a802a8df7", + "value": "special_tokens_map.json: 100%" + } + }, + "b3c06b8aa61d4e6ba9df1e90e86f2ba9": { + "model_module": "@jupyter-widgets/base", + "model_module_version": "1.2.0", + "model_name": "LayoutModel", + "state": { + "_model_module": "@jupyter-widgets/base", + "_model_module_version": "1.2.0", + "_model_name": "LayoutModel", + "_view_count": null, + "_view_module": "@jupyter-widgets/base", + "_view_module_version": "1.2.0", + "_view_name": "LayoutView", + "align_content": null, + "align_items": null, + "align_self": null, + "border": null, + "bottom": null, + "display": null, + "flex": null, + "flex_flow": null, + "grid_area": null, + "grid_auto_columns": null, + "grid_auto_flow": null, + 
"grid_auto_rows": null, + "grid_column": null, + "grid_gap": null, + "grid_row": null, + "grid_template_areas": null, + "grid_template_columns": null, + "grid_template_rows": null, + "height": null, + "justify_content": null, + "justify_items": null, + "left": null, + "margin": null, + "max_height": null, + "max_width": null, + "min_height": null, + "min_width": null, + "object_fit": null, + "object_position": null, + "order": null, + "overflow": null, + "overflow_x": null, + "overflow_y": null, + "padding": null, + "right": null, + "top": null, + "visibility": null, + "width": null + } + }, + "b46c929b3e904e39ae2187a8b08c66a9": { + "model_module": "@jupyter-widgets/base", + "model_module_version": "1.2.0", + "model_name": "LayoutModel", + "state": { + "_model_module": "@jupyter-widgets/base", + "_model_module_version": "1.2.0", + "_model_name": "LayoutModel", + "_view_count": null, + "_view_module": "@jupyter-widgets/base", + "_view_module_version": "1.2.0", + "_view_name": "LayoutView", + "align_content": null, + "align_items": null, + "align_self": null, + "border": null, + "bottom": null, + "display": null, + "flex": null, + "flex_flow": null, + "grid_area": null, + "grid_auto_columns": null, + "grid_auto_flow": null, + "grid_auto_rows": null, + "grid_column": null, + "grid_gap": null, + "grid_row": null, + "grid_template_areas": null, + "grid_template_columns": null, + "grid_template_rows": null, + "height": null, + "justify_content": null, + "justify_items": null, + "left": null, + "margin": null, + "max_height": null, + "max_width": null, + "min_height": null, + "min_width": null, + "object_fit": null, + "object_position": null, + "order": null, + "overflow": null, + "overflow_x": null, + "overflow_y": null, + "padding": null, + "right": null, + "top": null, + "visibility": null, + "width": null + } + }, + "b47a97e2f1b8413b9a705c173d6c896b": { + "model_module": "@jupyter-widgets/controls", + "model_module_version": "1.5.0", + "model_name": "DescriptionStyleModel", + "state": { + "_model_module": "@jupyter-widgets/controls", + "_model_module_version": "1.5.0", + "_model_name": "DescriptionStyleModel", + "_view_count": null, + "_view_module": "@jupyter-widgets/base", + "_view_module_version": "1.2.0", + "_view_name": "StyleView", + "description_width": "" + } + }, + "b4a4dca5f5864bd3a1b202517c49b792": { + "model_module": "@jupyter-widgets/base", + "model_module_version": "1.2.0", + "model_name": "LayoutModel", + "state": { + "_model_module": "@jupyter-widgets/base", + "_model_module_version": "1.2.0", + "_model_name": "LayoutModel", + "_view_count": null, + "_view_module": "@jupyter-widgets/base", + "_view_module_version": "1.2.0", + "_view_name": "LayoutView", + "align_content": null, + "align_items": null, + "align_self": null, + "border": null, + "bottom": null, + "display": null, + "flex": null, + "flex_flow": null, + "grid_area": null, + "grid_auto_columns": null, + "grid_auto_flow": null, + "grid_auto_rows": null, + "grid_column": null, + "grid_gap": null, + "grid_row": null, + "grid_template_areas": null, + "grid_template_columns": null, + "grid_template_rows": null, + "height": null, + "justify_content": null, + "justify_items": null, + "left": null, + "margin": null, + "max_height": null, + "max_width": null, + "min_height": null, + "min_width": null, + "object_fit": null, + "object_position": null, + "order": null, + "overflow": null, + "overflow_x": null, + "overflow_y": null, + "padding": null, + "right": null, + "top": null, + "visibility": null, + "width": null + } + }, 
+ "b57e9af6a5f9433496526beffb7a269f": { + "model_module": "@jupyter-widgets/base", + "model_module_version": "1.2.0", + "model_name": "LayoutModel", + "state": { + "_model_module": "@jupyter-widgets/base", + "_model_module_version": "1.2.0", + "_model_name": "LayoutModel", + "_view_count": null, + "_view_module": "@jupyter-widgets/base", + "_view_module_version": "1.2.0", + "_view_name": "LayoutView", + "align_content": null, + "align_items": null, + "align_self": null, + "border": null, + "bottom": null, + "display": null, + "flex": null, + "flex_flow": null, + "grid_area": null, + "grid_auto_columns": null, + "grid_auto_flow": null, + "grid_auto_rows": null, + "grid_column": null, + "grid_gap": null, + "grid_row": null, + "grid_template_areas": null, + "grid_template_columns": null, + "grid_template_rows": null, + "height": null, + "justify_content": null, + "justify_items": null, + "left": null, + "margin": null, + "max_height": null, + "max_width": null, + "min_height": null, + "min_width": null, + "object_fit": null, + "object_position": null, + "order": null, + "overflow": null, + "overflow_x": null, + "overflow_y": null, + "padding": null, + "right": null, + "top": null, + "visibility": null, + "width": null + } + }, + "b590bf4d0dda4fd99ebc4b2cfe8050bc": { + "model_module": "@jupyter-widgets/controls", + "model_module_version": "1.5.0", + "model_name": "ProgressStyleModel", + "state": { + "_model_module": "@jupyter-widgets/controls", + "_model_module_version": "1.5.0", + "_model_name": "ProgressStyleModel", + "_view_count": null, + "_view_module": "@jupyter-widgets/base", + "_view_module_version": "1.2.0", + "_view_name": "StyleView", + "bar_color": null, + "description_width": "" + } + }, + "b62c504dd67f4d08b5e19d4ca86a8379": { + "model_module": "@jupyter-widgets/controls", + "model_module_version": "1.5.0", + "model_name": "HTMLModel", + "state": { + "_dom_classes": [], + "_model_module": "@jupyter-widgets/controls", + "_model_module_version": "1.5.0", + "_model_name": "HTMLModel", + "_view_count": null, + "_view_module": "@jupyter-widgets/controls", + "_view_module_version": "1.5.0", + "_view_name": "HTMLView", + "description": "", + "description_tooltip": null, + "layout": "IPY_MODEL_f7bbd95f043d46339377ee21821735d8", + "placeholder": "​", + "style": "IPY_MODEL_2e4a239969f443bf80829b3f21ffebdd", + "value": " 438M/438M [00:02<00:00, 186MB/s]" + } + }, + "b65f406dcd7b48d29f6d234398f9e2c1": { + "model_module": "@jupyter-widgets/controls", + "model_module_version": "1.5.0", + "model_name": "FloatProgressModel", + "state": { + "_dom_classes": [], + "_model_module": "@jupyter-widgets/controls", + "_model_module_version": "1.5.0", + "_model_name": "FloatProgressModel", + "_view_count": null, + "_view_module": "@jupyter-widgets/controls", + "_view_module_version": "1.5.0", + "_view_name": "ProgressView", + "bar_style": "success", + "description": "", + "description_tooltip": null, + "layout": "IPY_MODEL_fdb89e3a02c94b9988db666725fb5cc0", + "max": 4, + "min": 0, + "orientation": "horizontal", + "style": "IPY_MODEL_d911fdac4d3e43bb801e4ae81f8f108d", + "value": 4 + } + }, + "b6e5248afcbd4241a6f7e08e81b52777": { + "model_module": "@jupyter-widgets/controls", + "model_module_version": "1.5.0", + "model_name": "DescriptionStyleModel", + "state": { + "_model_module": "@jupyter-widgets/controls", + "_model_module_version": "1.5.0", + "_model_name": "DescriptionStyleModel", + "_view_count": null, + "_view_module": "@jupyter-widgets/base", + "_view_module_version": "1.2.0", + "_view_name": 
"StyleView", + "description_width": "" + } + }, + "b99c9208985f485cbbfbca51d5260ba5": { + "model_module": "@jupyter-widgets/base", + "model_module_version": "1.2.0", + "model_name": "LayoutModel", + "state": { + "_model_module": "@jupyter-widgets/base", + "_model_module_version": "1.2.0", + "_model_name": "LayoutModel", + "_view_count": null, + "_view_module": "@jupyter-widgets/base", + "_view_module_version": "1.2.0", + "_view_name": "LayoutView", + "align_content": null, + "align_items": null, + "align_self": null, + "border": null, + "bottom": null, + "display": null, + "flex": null, + "flex_flow": null, + "grid_area": null, + "grid_auto_columns": null, + "grid_auto_flow": null, + "grid_auto_rows": null, + "grid_column": null, + "grid_gap": null, + "grid_row": null, + "grid_template_areas": null, + "grid_template_columns": null, + "grid_template_rows": null, + "height": null, + "justify_content": null, + "justify_items": null, + "left": null, + "margin": null, + "max_height": null, + "max_width": null, + "min_height": null, + "min_width": null, + "object_fit": null, + "object_position": null, + "order": null, + "overflow": null, + "overflow_x": null, + "overflow_y": null, + "padding": null, + "right": null, + "top": null, + "visibility": null, + "width": null + } + }, + "b9d09d77494f4e2aa9b500f5d0fdadaf": { + "model_module": "@jupyter-widgets/base", + "model_module_version": "1.2.0", + "model_name": "LayoutModel", + "state": { + "_model_module": "@jupyter-widgets/base", + "_model_module_version": "1.2.0", + "_model_name": "LayoutModel", + "_view_count": null, + "_view_module": "@jupyter-widgets/base", + "_view_module_version": "1.2.0", + "_view_name": "LayoutView", + "align_content": null, + "align_items": null, + "align_self": null, + "border": null, + "bottom": null, + "display": null, + "flex": null, + "flex_flow": null, + "grid_area": null, + "grid_auto_columns": null, + "grid_auto_flow": null, + "grid_auto_rows": null, + "grid_column": null, + "grid_gap": null, + "grid_row": null, + "grid_template_areas": null, + "grid_template_columns": null, + "grid_template_rows": null, + "height": null, + "justify_content": null, + "justify_items": null, + "left": null, + "margin": null, + "max_height": null, + "max_width": null, + "min_height": null, + "min_width": null, + "object_fit": null, + "object_position": null, + "order": null, + "overflow": null, + "overflow_x": null, + "overflow_y": null, + "padding": null, + "right": null, + "top": null, + "visibility": null, + "width": null + } + }, + "bb561f17393947ef833a3208abf83b00": { + "model_module": "@jupyter-widgets/controls", + "model_module_version": "1.5.0", + "model_name": "DescriptionStyleModel", + "state": { + "_model_module": "@jupyter-widgets/controls", + "_model_module_version": "1.5.0", + "_model_name": "DescriptionStyleModel", + "_view_count": null, + "_view_module": "@jupyter-widgets/base", + "_view_module_version": "1.2.0", + "_view_name": "StyleView", + "description_width": "" + } + }, + "bcd1f2cdb488417a888a9ddce142dcdd": { + "model_module": "@jupyter-widgets/controls", + "model_module_version": "1.5.0", + "model_name": "ProgressStyleModel", + "state": { + "_model_module": "@jupyter-widgets/controls", + "_model_module_version": "1.5.0", + "_model_name": "ProgressStyleModel", + "_view_count": null, + "_view_module": "@jupyter-widgets/base", + "_view_module_version": "1.2.0", + "_view_name": "StyleView", + "bar_color": null, + "description_width": "" + } + }, + "bdcb2072ab124038a60f41bb165974d1": { + "model_module": 
"@jupyter-widgets/controls", + "model_module_version": "1.5.0", + "model_name": "DescriptionStyleModel", + "state": { + "_model_module": "@jupyter-widgets/controls", + "_model_module_version": "1.5.0", + "_model_name": "DescriptionStyleModel", + "_view_count": null, + "_view_module": "@jupyter-widgets/base", + "_view_module_version": "1.2.0", + "_view_name": "StyleView", + "description_width": "" + } + }, + "c03ef8cf6b654e1994d691cbdb331e55": { + "model_module": "@jupyter-widgets/controls", + "model_module_version": "1.5.0", + "model_name": "HBoxModel", + "state": { + "_dom_classes": [], + "_model_module": "@jupyter-widgets/controls", + "_model_module_version": "1.5.0", + "_model_name": "HBoxModel", + "_view_count": null, + "_view_module": "@jupyter-widgets/controls", + "_view_module_version": "1.5.0", + "_view_name": "HBoxView", + "box_style": "", + "children": [ + "IPY_MODEL_350258a4b1e848e0b28880b868412862", + "IPY_MODEL_9e003352a7d742fb833f15b532a5ad53", + "IPY_MODEL_4e3dd4a0fa4e42be9515f89bbdada0b1" + ], + "layout": "IPY_MODEL_7d84283a574f470a9f130aa40fad0083" + } + }, + "c2b7df874d334dcdbb433897c84b079d": { + "model_module": "@jupyter-widgets/controls", + "model_module_version": "1.5.0", + "model_name": "ProgressStyleModel", + "state": { + "_model_module": "@jupyter-widgets/controls", + "_model_module_version": "1.5.0", + "_model_name": "ProgressStyleModel", + "_view_count": null, + "_view_module": "@jupyter-widgets/base", + "_view_module_version": "1.2.0", + "_view_name": "StyleView", + "bar_color": null, + "description_width": "" + } + }, + "c2e105c1ac584ba9b12bd4b22191d3a8": { + "model_module": "@jupyter-widgets/base", + "model_module_version": "1.2.0", + "model_name": "LayoutModel", + "state": { + "_model_module": "@jupyter-widgets/base", + "_model_module_version": "1.2.0", + "_model_name": "LayoutModel", + "_view_count": null, + "_view_module": "@jupyter-widgets/base", + "_view_module_version": "1.2.0", + "_view_name": "LayoutView", + "align_content": null, + "align_items": null, + "align_self": null, + "border": null, + "bottom": null, + "display": null, + "flex": null, + "flex_flow": null, + "grid_area": null, + "grid_auto_columns": null, + "grid_auto_flow": null, + "grid_auto_rows": null, + "grid_column": null, + "grid_gap": null, + "grid_row": null, + "grid_template_areas": null, + "grid_template_columns": null, + "grid_template_rows": null, + "height": null, + "justify_content": null, + "justify_items": null, + "left": null, + "margin": null, + "max_height": null, + "max_width": null, + "min_height": null, + "min_width": null, + "object_fit": null, + "object_position": null, + "order": null, + "overflow": null, + "overflow_x": null, + "overflow_y": null, + "padding": null, + "right": null, + "top": null, + "visibility": null, + "width": null + } + }, + "c31d528f88ff43bc93b2d0284c52b308": { + "model_module": "@jupyter-widgets/controls", + "model_module_version": "1.5.0", + "model_name": "HTMLModel", + "state": { + "_dom_classes": [], + "_model_module": "@jupyter-widgets/controls", + "_model_module_version": "1.5.0", + "_model_name": "HTMLModel", + "_view_count": null, + "_view_module": "@jupyter-widgets/controls", + "_view_module_version": "1.5.0", + "_view_name": "HTMLView", + "description": "", + "description_tooltip": null, + "layout": "IPY_MODEL_ed201ab44f23458e9b8752c6920484f5", + "placeholder": "​", + "style": "IPY_MODEL_1f4ac8691f234d50abf1460a029403c7", + "value": "special_tokens_map.json: 100%" + } + }, + "c3ac25e2171b40c49e681bf7b09dc1d2": { + "model_module": 
"@jupyter-widgets/base", + "model_module_version": "1.2.0", + "model_name": "LayoutModel", + "state": { + "_model_module": "@jupyter-widgets/base", + "_model_module_version": "1.2.0", + "_model_name": "LayoutModel", + "_view_count": null, + "_view_module": "@jupyter-widgets/base", + "_view_module_version": "1.2.0", + "_view_name": "LayoutView", + "align_content": null, + "align_items": null, + "align_self": null, + "border": null, + "bottom": null, + "display": null, + "flex": null, + "flex_flow": null, + "grid_area": null, + "grid_auto_columns": null, + "grid_auto_flow": null, + "grid_auto_rows": null, + "grid_column": null, + "grid_gap": null, + "grid_row": null, + "grid_template_areas": null, + "grid_template_columns": null, + "grid_template_rows": null, + "height": null, + "justify_content": null, + "justify_items": null, + "left": null, + "margin": null, + "max_height": null, + "max_width": null, + "min_height": null, + "min_width": null, + "object_fit": null, + "object_position": null, + "order": null, + "overflow": null, + "overflow_x": null, + "overflow_y": null, + "padding": null, + "right": null, + "top": null, + "visibility": null, + "width": null + } + }, + "c9d72c019b2544ff86f818534269e9b5": { + "model_module": "@jupyter-widgets/base", + "model_module_version": "1.2.0", + "model_name": "LayoutModel", + "state": { + "_model_module": "@jupyter-widgets/base", + "_model_module_version": "1.2.0", + "_model_name": "LayoutModel", + "_view_count": null, + "_view_module": "@jupyter-widgets/base", + "_view_module_version": "1.2.0", + "_view_name": "LayoutView", + "align_content": null, + "align_items": null, + "align_self": null, + "border": null, + "bottom": null, + "display": null, + "flex": null, + "flex_flow": null, + "grid_area": null, + "grid_auto_columns": null, + "grid_auto_flow": null, + "grid_auto_rows": null, + "grid_column": null, + "grid_gap": null, + "grid_row": null, + "grid_template_areas": null, + "grid_template_columns": null, + "grid_template_rows": null, + "height": null, + "justify_content": null, + "justify_items": null, + "left": null, + "margin": null, + "max_height": null, + "max_width": null, + "min_height": null, + "min_width": null, + "object_fit": null, + "object_position": null, + "order": null, + "overflow": null, + "overflow_x": null, + "overflow_y": null, + "padding": null, + "right": null, + "top": null, + "visibility": null, + "width": null + } + }, + "ca841ddbe07f4b77bcff43bc706b1a20": { + "model_module": "@jupyter-widgets/base", + "model_module_version": "1.2.0", + "model_name": "LayoutModel", + "state": { + "_model_module": "@jupyter-widgets/base", + "_model_module_version": "1.2.0", + "_model_name": "LayoutModel", + "_view_count": null, + "_view_module": "@jupyter-widgets/base", + "_view_module_version": "1.2.0", + "_view_name": "LayoutView", + "align_content": null, + "align_items": null, + "align_self": null, + "border": null, + "bottom": null, + "display": null, + "flex": null, + "flex_flow": null, + "grid_area": null, + "grid_auto_columns": null, + "grid_auto_flow": null, + "grid_auto_rows": null, + "grid_column": null, + "grid_gap": null, + "grid_row": null, + "grid_template_areas": null, + "grid_template_columns": null, + "grid_template_rows": null, + "height": null, + "justify_content": null, + "justify_items": null, + "left": null, + "margin": null, + "max_height": null, + "max_width": null, + "min_height": null, + "min_width": null, + "object_fit": null, + "object_position": null, + "order": null, + "overflow": null, + "overflow_x": null, 
+ "overflow_y": null, + "padding": null, + "right": null, + "top": null, + "visibility": null, + "width": null + } + }, + "ce1dcb5db8c546cb84761c141924cd98": { + "model_module": "@jupyter-widgets/controls", + "model_module_version": "1.5.0", + "model_name": "DescriptionStyleModel", + "state": { + "_model_module": "@jupyter-widgets/controls", + "_model_module_version": "1.5.0", + "_model_name": "DescriptionStyleModel", + "_view_count": null, + "_view_module": "@jupyter-widgets/base", + "_view_module_version": "1.2.0", + "_view_name": "StyleView", + "description_width": "" + } + }, + "ce86048691224ac495c0dbb6817a4721": { + "model_module": "@jupyter-widgets/controls", + "model_module_version": "1.5.0", + "model_name": "HTMLModel", + "state": { + "_dom_classes": [], + "_model_module": "@jupyter-widgets/controls", + "_model_module_version": "1.5.0", + "_model_name": "HTMLModel", + "_view_count": null, + "_view_module": "@jupyter-widgets/controls", + "_view_module_version": "1.5.0", + "_view_name": "HTMLView", + "description": "", + "description_tooltip": null, + "layout": "IPY_MODEL_ca841ddbe07f4b77bcff43bc706b1a20", + "placeholder": "​", + "style": "IPY_MODEL_dc44bc377ec941eea4f627d2bc847477", + "value": "generation_config.json: 100%" + } + }, + "d0518bbbde494ab1876c98f3d514832a": { + "model_module": "@jupyter-widgets/base", + "model_module_version": "1.2.0", + "model_name": "LayoutModel", + "state": { + "_model_module": "@jupyter-widgets/base", + "_model_module_version": "1.2.0", + "_model_name": "LayoutModel", + "_view_count": null, + "_view_module": "@jupyter-widgets/base", + "_view_module_version": "1.2.0", + "_view_name": "LayoutView", + "align_content": null, + "align_items": null, + "align_self": null, + "border": null, + "bottom": null, + "display": null, + "flex": null, + "flex_flow": null, + "grid_area": null, + "grid_auto_columns": null, + "grid_auto_flow": null, + "grid_auto_rows": null, + "grid_column": null, + "grid_gap": null, + "grid_row": null, + "grid_template_areas": null, + "grid_template_columns": null, + "grid_template_rows": null, + "height": null, + "justify_content": null, + "justify_items": null, + "left": null, + "margin": null, + "max_height": null, + "max_width": null, + "min_height": null, + "min_width": null, + "object_fit": null, + "object_position": null, + "order": null, + "overflow": null, + "overflow_x": null, + "overflow_y": null, + "padding": null, + "right": null, + "top": null, + "visibility": null, + "width": null + } + }, + "d1881e5dd3bb42f3a62444a987aff829": { + "model_module": "@jupyter-widgets/base", + "model_module_version": "1.2.0", + "model_name": "LayoutModel", + "state": { + "_model_module": "@jupyter-widgets/base", + "_model_module_version": "1.2.0", + "_model_name": "LayoutModel", + "_view_count": null, + "_view_module": "@jupyter-widgets/base", + "_view_module_version": "1.2.0", + "_view_name": "LayoutView", + "align_content": null, + "align_items": null, + "align_self": null, + "border": null, + "bottom": null, + "display": null, + "flex": null, + "flex_flow": null, + "grid_area": null, + "grid_auto_columns": null, + "grid_auto_flow": null, + "grid_auto_rows": null, + "grid_column": null, + "grid_gap": null, + "grid_row": null, + "grid_template_areas": null, + "grid_template_columns": null, + "grid_template_rows": null, + "height": null, + "justify_content": null, + "justify_items": null, + "left": null, + "margin": null, + "max_height": null, + "max_width": null, + "min_height": null, + "min_width": null, + "object_fit": null, + 
"object_position": null, + "order": null, + "overflow": null, + "overflow_x": null, + "overflow_y": null, + "padding": null, + "right": null, + "top": null, + "visibility": null, + "width": null + } + }, + "d1fef74b9af44ac0b3ff021292982577": { + "model_module": "@jupyter-widgets/base", + "model_module_version": "1.2.0", + "model_name": "LayoutModel", + "state": { + "_model_module": "@jupyter-widgets/base", + "_model_module_version": "1.2.0", + "_model_name": "LayoutModel", + "_view_count": null, + "_view_module": "@jupyter-widgets/base", + "_view_module_version": "1.2.0", + "_view_name": "LayoutView", + "align_content": null, + "align_items": null, + "align_self": null, + "border": null, + "bottom": null, + "display": null, + "flex": null, + "flex_flow": null, + "grid_area": null, + "grid_auto_columns": null, + "grid_auto_flow": null, + "grid_auto_rows": null, + "grid_column": null, + "grid_gap": null, + "grid_row": null, + "grid_template_areas": null, + "grid_template_columns": null, + "grid_template_rows": null, + "height": null, + "justify_content": null, + "justify_items": null, + "left": null, + "margin": null, + "max_height": null, + "max_width": null, + "min_height": null, + "min_width": null, + "object_fit": null, + "object_position": null, + "order": null, + "overflow": null, + "overflow_x": null, + "overflow_y": null, + "padding": null, + "right": null, + "top": null, + "visibility": null, + "width": null + } + }, + "d437a724018e4f4795b1e3bddbba782a": { + "model_module": "@jupyter-widgets/controls", + "model_module_version": "1.5.0", + "model_name": "FloatProgressModel", + "state": { + "_dom_classes": [], + "_model_module": "@jupyter-widgets/controls", + "_model_module_version": "1.5.0", + "_model_name": "FloatProgressModel", + "_view_count": null, + "_view_module": "@jupyter-widgets/controls", + "_view_module_version": "1.5.0", + "_view_name": "ProgressView", + "bar_style": "success", + "description": "", + "description_tooltip": null, + "layout": "IPY_MODEL_c9d72c019b2544ff86f818534269e9b5", + "max": 1168138808, + "min": 0, + "orientation": "horizontal", + "style": "IPY_MODEL_f754d601c4134e0f8701735171ec75a9", + "value": 1168138808 + } + }, + "d4dea83270d148eb9fb6a19150f4b580": { + "model_module": "@jupyter-widgets/controls", + "model_module_version": "1.5.0", + "model_name": "FloatProgressModel", + "state": { + "_dom_classes": [], + "_model_module": "@jupyter-widgets/controls", + "_model_module_version": "1.5.0", + "_model_name": "FloatProgressModel", + "_view_count": null, + "_view_module": "@jupyter-widgets/controls", + "_view_module_version": "1.5.0", + "_view_name": "ProgressView", + "bar_style": "success", + "description": "", + "description_tooltip": null, + "layout": "IPY_MODEL_380e392c659a418fba5d3564cbf62e94", + "max": 126, + "min": 0, + "orientation": "horizontal", + "style": "IPY_MODEL_24f3609a29944a6bbc88c59c5e5eee14", + "value": 126 + } + }, + "d555f930636e41d58c02d18020cf0635": { + "model_module": "@jupyter-widgets/controls", + "model_module_version": "1.5.0", + "model_name": "HBoxModel", + "state": { + "_dom_classes": [], + "_model_module": "@jupyter-widgets/controls", + "_model_module_version": "1.5.0", + "_model_name": "HBoxModel", + "_view_count": null, + "_view_module": "@jupyter-widgets/controls", + "_view_module_version": "1.5.0", + "_view_name": "HBoxView", + "box_style": "", + "children": [ + "IPY_MODEL_421b9ec5cfc34e17b193366bd619f916", + "IPY_MODEL_d437a724018e4f4795b1e3bddbba782a", + "IPY_MODEL_0038bd8e5cc446cfa5637aecb86a405c" + ], + "layout": 
"IPY_MODEL_e97bcab104234aa3a22130e421cec71e" + } + }, + "d5a0eac1f227495886ecd40fd2d035f9": { + "model_module": "@jupyter-widgets/controls", + "model_module_version": "1.5.0", + "model_name": "HBoxModel", + "state": { + "_dom_classes": [], + "_model_module": "@jupyter-widgets/controls", + "_model_module_version": "1.5.0", + "_model_name": "HBoxModel", + "_view_count": null, + "_view_module": "@jupyter-widgets/controls", + "_view_module_version": "1.5.0", + "_view_name": "HBoxView", + "box_style": "", + "children": [ + "IPY_MODEL_91f078a552854c83a9821f0e23a2179f", + "IPY_MODEL_e8b871dcfca946e08453c6f788f78b4b", + "IPY_MODEL_f4a11ae55c724924ac1318e5730f6d2c" + ], + "layout": "IPY_MODEL_58e024f57fcf4039b360b0bdbbe23520" + } + }, + "d5cff24e326f4b31b666c7aaf70e92f4": { + "model_module": "@jupyter-widgets/base", + "model_module_version": "1.2.0", + "model_name": "LayoutModel", + "state": { + "_model_module": "@jupyter-widgets/base", + "_model_module_version": "1.2.0", + "_model_name": "LayoutModel", + "_view_count": null, + "_view_module": "@jupyter-widgets/base", + "_view_module_version": "1.2.0", + "_view_name": "LayoutView", + "align_content": null, + "align_items": null, + "align_self": null, + "border": null, + "bottom": null, + "display": null, + "flex": null, + "flex_flow": null, + "grid_area": null, + "grid_auto_columns": null, + "grid_auto_flow": null, + "grid_auto_rows": null, + "grid_column": null, + "grid_gap": null, + "grid_row": null, + "grid_template_areas": null, + "grid_template_columns": null, + "grid_template_rows": null, + "height": null, + "justify_content": null, + "justify_items": null, + "left": null, + "margin": null, + "max_height": null, + "max_width": null, + "min_height": null, + "min_width": null, + "object_fit": null, + "object_position": null, + "order": null, + "overflow": null, + "overflow_x": null, + "overflow_y": null, + "padding": null, + "right": null, + "top": null, + "visibility": null, + "width": null + } + }, + "d6960ffb549e4e69a919405abecb22a3": { + "model_module": "@jupyter-widgets/controls", + "model_module_version": "1.5.0", + "model_name": "DescriptionStyleModel", + "state": { + "_model_module": "@jupyter-widgets/controls", + "_model_module_version": "1.5.0", + "_model_name": "DescriptionStyleModel", + "_view_count": null, + "_view_module": "@jupyter-widgets/base", + "_view_module_version": "1.2.0", + "_view_name": "StyleView", + "description_width": "" + } + }, + "d76b9f06c08b4634a51199db44c0a11c": { + "model_module": "@jupyter-widgets/controls", + "model_module_version": "1.5.0", + "model_name": "HTMLModel", + "state": { + "_dom_classes": [], + "_model_module": "@jupyter-widgets/controls", + "_model_module_version": "1.5.0", + "_model_name": "HTMLModel", + "_view_count": null, + "_view_module": "@jupyter-widgets/controls", + "_view_module_version": "1.5.0", + "_view_name": "HTMLView", + "description": "", + "description_tooltip": null, + "layout": "IPY_MODEL_2d446aa187c04de79bdb5aae679799e7", + "placeholder": "​", + "style": "IPY_MODEL_b6e5248afcbd4241a6f7e08e81b52777", + "value": "1_Pooling/config.json: 100%" + } + }, + "d80a990ec240484c8b3e16bcd1165320": { + "model_module": "@jupyter-widgets/controls", + "model_module_version": "1.5.0", + "model_name": "HBoxModel", + "state": { + "_dom_classes": [], + "_model_module": "@jupyter-widgets/controls", + "_model_module_version": "1.5.0", + "_model_name": "HBoxModel", + "_view_count": null, + "_view_module": "@jupyter-widgets/controls", + "_view_module_version": "1.5.0", + "_view_name": "HBoxView", + 
"box_style": "", + "children": [ + "IPY_MODEL_21fb6035bff44eb9809e022af0bb7104", + "IPY_MODEL_b65f406dcd7b48d29f6d234398f9e2c1", + "IPY_MODEL_089ed96280c64a0588d48e04f92c183a" + ], + "layout": "IPY_MODEL_c3ac25e2171b40c49e681bf7b09dc1d2" + } + }, + "d911fdac4d3e43bb801e4ae81f8f108d": { + "model_module": "@jupyter-widgets/controls", + "model_module_version": "1.5.0", + "model_name": "ProgressStyleModel", + "state": { + "_model_module": "@jupyter-widgets/controls", + "_model_module_version": "1.5.0", + "_model_name": "ProgressStyleModel", + "_view_count": null, + "_view_module": "@jupyter-widgets/base", + "_view_module_version": "1.2.0", + "_view_name": "StyleView", + "bar_color": null, + "description_width": "" + } + }, + "d92c1c7eaec2439fa0d885d2ae833fd9": { + "model_module": "@jupyter-widgets/base", + "model_module_version": "1.2.0", + "model_name": "LayoutModel", + "state": { + "_model_module": "@jupyter-widgets/base", + "_model_module_version": "1.2.0", + "_model_name": "LayoutModel", + "_view_count": null, + "_view_module": "@jupyter-widgets/base", + "_view_module_version": "1.2.0", + "_view_name": "LayoutView", + "align_content": null, + "align_items": null, + "align_self": null, + "border": null, + "bottom": null, + "display": null, + "flex": null, + "flex_flow": null, + "grid_area": null, + "grid_auto_columns": null, + "grid_auto_flow": null, + "grid_auto_rows": null, + "grid_column": null, + "grid_gap": null, + "grid_row": null, + "grid_template_areas": null, + "grid_template_columns": null, + "grid_template_rows": null, + "height": null, + "justify_content": null, + "justify_items": null, + "left": null, + "margin": null, + "max_height": null, + "max_width": null, + "min_height": null, + "min_width": null, + "object_fit": null, + "object_position": null, + "order": null, + "overflow": null, + "overflow_x": null, + "overflow_y": null, + "padding": null, + "right": null, + "top": null, + "visibility": null, + "width": null + } + }, + "d93e7a2e9e1a4f3692fae8d59a3cdfcf": { + "model_module": "@jupyter-widgets/controls", + "model_module_version": "1.5.0", + "model_name": "ProgressStyleModel", + "state": { + "_model_module": "@jupyter-widgets/controls", + "_model_module_version": "1.5.0", + "_model_name": "ProgressStyleModel", + "_view_count": null, + "_view_module": "@jupyter-widgets/base", + "_view_module_version": "1.2.0", + "_view_name": "StyleView", + "bar_color": null, + "description_width": "" + } + }, + "d9682345a06f4c668b69077b1f677399": { + "model_module": "@jupyter-widgets/controls", + "model_module_version": "1.5.0", + "model_name": "HTMLModel", + "state": { + "_dom_classes": [], + "_model_module": "@jupyter-widgets/controls", + "_model_module_version": "1.5.0", + "_model_name": "HTMLModel", + "_view_count": null, + "_view_module": "@jupyter-widgets/controls", + "_view_module_version": "1.5.0", + "_view_name": "HTMLView", + "description": "", + "description_tooltip": null, + "layout": "IPY_MODEL_6b8ce03feffe4747b0af404e90e0975e", + "placeholder": "​", + "style": "IPY_MODEL_7870c1f4e16e48dea5feb7498b9e837f", + "value": "modules.json: 100%" + } + }, + "dc44bc377ec941eea4f627d2bc847477": { + "model_module": "@jupyter-widgets/controls", + "model_module_version": "1.5.0", + "model_name": "DescriptionStyleModel", + "state": { + "_model_module": "@jupyter-widgets/controls", + "_model_module_version": "1.5.0", + "_model_name": "DescriptionStyleModel", + "_view_count": null, + "_view_module": "@jupyter-widgets/base", + "_view_module_version": "1.2.0", + "_view_name": "StyleView", + 
"description_width": "" + } + }, + "de98dcfd70aa4144abb13896da001ccc": { + "model_module": "@jupyter-widgets/base", + "model_module_version": "1.2.0", + "model_name": "LayoutModel", + "state": { + "_model_module": "@jupyter-widgets/base", + "_model_module_version": "1.2.0", + "_model_name": "LayoutModel", + "_view_count": null, + "_view_module": "@jupyter-widgets/base", + "_view_module_version": "1.2.0", + "_view_name": "LayoutView", + "align_content": null, + "align_items": null, + "align_self": null, + "border": null, + "bottom": null, + "display": null, + "flex": null, + "flex_flow": null, + "grid_area": null, + "grid_auto_columns": null, + "grid_auto_flow": null, + "grid_auto_rows": null, + "grid_column": null, + "grid_gap": null, + "grid_row": null, + "grid_template_areas": null, + "grid_template_columns": null, + "grid_template_rows": null, + "height": null, + "justify_content": null, + "justify_items": null, + "left": null, + "margin": null, + "max_height": null, + "max_width": null, + "min_height": null, + "min_width": null, + "object_fit": null, + "object_position": null, + "order": null, + "overflow": null, + "overflow_x": null, + "overflow_y": null, + "padding": null, + "right": null, + "top": null, + "visibility": null, + "width": null + } + }, + "df5fe4b465b646d2938cf8da656a8d33": { + "model_module": "@jupyter-widgets/base", + "model_module_version": "1.2.0", + "model_name": "LayoutModel", + "state": { + "_model_module": "@jupyter-widgets/base", + "_model_module_version": "1.2.0", + "_model_name": "LayoutModel", + "_view_count": null, + "_view_module": "@jupyter-widgets/base", + "_view_module_version": "1.2.0", + "_view_name": "LayoutView", + "align_content": null, + "align_items": null, + "align_self": null, + "border": null, + "bottom": null, + "display": null, + "flex": null, + "flex_flow": null, + "grid_area": null, + "grid_auto_columns": null, + "grid_auto_flow": null, + "grid_auto_rows": null, + "grid_column": null, + "grid_gap": null, + "grid_row": null, + "grid_template_areas": null, + "grid_template_columns": null, + "grid_template_rows": null, + "height": null, + "justify_content": null, + "justify_items": null, + "left": null, + "margin": null, + "max_height": null, + "max_width": null, + "min_height": null, + "min_width": null, + "object_fit": null, + "object_position": null, + "order": null, + "overflow": null, + "overflow_x": null, + "overflow_y": null, + "padding": null, + "right": null, + "top": null, + "visibility": null, + "width": null + } + }, + "dfa9c9f4c2334d17a0518b4006e4965b": { + "model_module": "@jupyter-widgets/controls", + "model_module_version": "1.5.0", + "model_name": "HBoxModel", + "state": { + "_dom_classes": [], + "_model_module": "@jupyter-widgets/controls", + "_model_module_version": "1.5.0", + "_model_name": "HBoxModel", + "_view_count": null, + "_view_module": "@jupyter-widgets/controls", + "_view_module_version": "1.5.0", + "_view_name": "HBoxView", + "box_style": "", + "children": [ + "IPY_MODEL_b08e94b9ed83461fb9744cc6cdac0adf", + "IPY_MODEL_1895e085ea5942fcb429bc1f64efb7f4", + "IPY_MODEL_a6403a66557341dba6634d7ab2b5b4b1" + ], + "layout": "IPY_MODEL_8607de98d88b42aba06c916138ad60d0" + } + }, + "e0c7d450ac65475b869768d8bb5fb7bb": { + "model_module": "@jupyter-widgets/controls", + "model_module_version": "1.5.0", + "model_name": "DescriptionStyleModel", + "state": { + "_model_module": "@jupyter-widgets/controls", + "_model_module_version": "1.5.0", + "_model_name": "DescriptionStyleModel", + "_view_count": null, + "_view_module": 
"@jupyter-widgets/base", + "_view_module_version": "1.2.0", + "_view_name": "StyleView", + "description_width": "" + } + }, + "e0e358f24f9a4d959ac7b4f7c26d69eb": { + "model_module": "@jupyter-widgets/base", + "model_module_version": "1.2.0", + "model_name": "LayoutModel", + "state": { + "_model_module": "@jupyter-widgets/base", + "_model_module_version": "1.2.0", + "_model_name": "LayoutModel", + "_view_count": null, + "_view_module": "@jupyter-widgets/base", + "_view_module_version": "1.2.0", + "_view_name": "LayoutView", + "align_content": null, + "align_items": null, + "align_self": null, + "border": null, + "bottom": null, + "display": null, + "flex": null, + "flex_flow": null, + "grid_area": null, + "grid_auto_columns": null, + "grid_auto_flow": null, + "grid_auto_rows": null, + "grid_column": null, + "grid_gap": null, + "grid_row": null, + "grid_template_areas": null, + "grid_template_columns": null, + "grid_template_rows": null, + "height": null, + "justify_content": null, + "justify_items": null, + "left": null, + "margin": null, + "max_height": null, + "max_width": null, + "min_height": null, + "min_width": null, + "object_fit": null, + "object_position": null, + "order": null, + "overflow": null, + "overflow_x": null, + "overflow_y": null, + "padding": null, + "right": null, + "top": null, + "visibility": null, + "width": null + } + }, + "e177d2ee77e24c17a723986c62ae5367": { + "model_module": "@jupyter-widgets/base", + "model_module_version": "1.2.0", + "model_name": "LayoutModel", + "state": { + "_model_module": "@jupyter-widgets/base", + "_model_module_version": "1.2.0", + "_model_name": "LayoutModel", + "_view_count": null, + "_view_module": "@jupyter-widgets/base", + "_view_module_version": "1.2.0", + "_view_name": "LayoutView", + "align_content": null, + "align_items": null, + "align_self": null, + "border": null, + "bottom": null, + "display": null, + "flex": null, + "flex_flow": null, + "grid_area": null, + "grid_auto_columns": null, + "grid_auto_flow": null, + "grid_auto_rows": null, + "grid_column": null, + "grid_gap": null, + "grid_row": null, + "grid_template_areas": null, + "grid_template_columns": null, + "grid_template_rows": null, + "height": null, + "justify_content": null, + "justify_items": null, + "left": null, + "margin": null, + "max_height": null, + "max_width": null, + "min_height": null, + "min_width": null, + "object_fit": null, + "object_position": null, + "order": null, + "overflow": null, + "overflow_x": null, + "overflow_y": null, + "padding": null, + "right": null, + "top": null, + "visibility": null, + "width": null + } + }, + "e28088caf5904fbcb81ff49cd04a1667": { + "model_module": "@jupyter-widgets/controls", + "model_module_version": "1.5.0", + "model_name": "DescriptionStyleModel", + "state": { + "_model_module": "@jupyter-widgets/controls", + "_model_module_version": "1.5.0", + "_model_name": "DescriptionStyleModel", + "_view_count": null, + "_view_module": "@jupyter-widgets/base", + "_view_module_version": "1.2.0", + "_view_name": "StyleView", + "description_width": "" + } }, - { - "output_type": "display_data", - "data": { - "text/plain": [ - "modules.json: 0%| | 0.00/349 [00:00<|start_header_id|>system<|end_header_id|>\n", - "\n", - "{{ system_prompt }}<|eot_id|><|start_header_id|>user<|end_header_id|>\n", - "\n", - "{{ user_msg_1 }}<|eot_id|><|start_header_id|>assistant<|end_header_id|>\n", - "\n", - "{{ model_answer_1 }}<|eot_id|>\n", - "```" - ], - "metadata": { - "id": "aeOvh7Y2cIym" - } - }, - { - "cell_type": "code", - "source": [ - 
"from langchain.llms import HuggingFacePipeline\n", - "from langchain.prompts import PromptTemplate\n", - "from transformers import pipeline\n", - "from langchain_core.output_parsers import StrOutputParser\n", - "from langchain_core.runnables import RunnablePassthrough\n", - "\n", - "terminators = [\n", - " tokenizer.eos_token_id,\n", - " tokenizer.convert_tokens_to_ids(\"<|eot_id|>\")\n", - "]\n", - "\n", - "text_generation_pipeline = pipeline(\n", - " model=model,\n", - " tokenizer=tokenizer,\n", - " task=\"text-generation\",\n", - " temperature=0.2,\n", - " do_sample=True,\n", - " repetition_penalty=1.1,\n", - " return_full_text=False,\n", - " max_new_tokens=200,\n", - " eos_token_id=terminators,\n", - ")\n", - "\n", - "llm = HuggingFacePipeline(pipeline=text_generation_pipeline)\n", - "\n", - "prompt_template = \"\"\"\n", - "<|start_header_id|>user<|end_header_id|>\n", - "You are an assistant for answering questions about IPM.\n", - "You are given the extracted parts of a long document and a question. Provide a conversational answer.\n", - "If you don't know the answer, just say \"I do not know.\" Don't make up an answer.\n", - "Question: {question}\n", - "Context: {context}<|eot_id|><|start_header_id|>assistant<|end_header_id|>\n", - "\"\"\"\n", - "\n", - "prompt = PromptTemplate(\n", - " input_variables=[\"context\", \"question\"],\n", - " template=prompt_template,\n", - ")\n", - "\n", - "def format_docs(docs):\n", - " return \"\\n\\n\".join(doc.page_content for doc in docs)\n", - "\n", - "rag_chain = (\n", - " {\"context\": retriever | format_docs, \"question\": RunnablePassthrough()}\n", - " | prompt\n", - " | llm\n", - " | StrOutputParser()\n", - ")" - ], - "metadata": { - "id": "b-8hFNkoM0LC" - }, - "execution_count": null, - "outputs": [] - }, - { - "cell_type": "markdown", - "source": [ - "Tada! Your RAG is ready to use. Pass a question, the retriver will add relevant context from your document, and Llama3 will generate an answer.\n", - "Here, my document was a chapter from a book on IPM that stands for \"Integrated Pest Management\". 
" - ], - "metadata": { - "id": "u18Cc6msce5l" - } - }, - { - "cell_type": "code", - "source": [ - "question = \"What is considered a cultural control in IPM?\"\n", - "rag_chain.invoke(question)" - ], - "metadata": { - "colab": { - "base_uri": "https://localhost:8080/", - "height": 143 + "fac438b999bb40459c245b0e307bd132": { + "model_module": "@jupyter-widgets/controls", + "model_module_version": "1.5.0", + "model_name": "HTMLModel", + "state": { + "_dom_classes": [], + "_model_module": "@jupyter-widgets/controls", + "_model_module_version": "1.5.0", + "_model_name": "HTMLModel", + "_view_count": null, + "_view_module": "@jupyter-widgets/controls", + "_view_module_version": "1.5.0", + "_view_name": "HTMLView", + "description": "", + "description_tooltip": null, + "layout": "IPY_MODEL_4f33241a7378442a8711b00402742526", + "placeholder": "​", + "style": "IPY_MODEL_0a379a9f2ae84ed6bc45427e3e3eb916", + "value": " 4.92G/4.92G [00:49<00:00, 177MB/s]" + } + }, + "faeb0c503d83411f8f162a23ea49375b": { + "model_module": "@jupyter-widgets/controls", + "model_module_version": "1.5.0", + "model_name": "ProgressStyleModel", + "state": { + "_model_module": "@jupyter-widgets/controls", + "_model_module_version": "1.5.0", + "_model_name": "ProgressStyleModel", + "_view_count": null, + "_view_module": "@jupyter-widgets/base", + "_view_module_version": "1.2.0", + "_view_name": "StyleView", + "bar_color": null, + "description_width": "" + } }, - "id": "i47dsLeFNVPB", - "outputId": "ce896b69-d035-4b69-ecba-be229e68aef1" - }, - "execution_count": null, - "outputs": [ - { - "output_type": "stream", - "name": "stderr", - "text": [ - "Setting `pad_token_id` to `eos_token_id`:128001 for open-end generation.\n" - ] + "fb8b5e13e19e4424ac25a499550868da": { + "model_module": "@jupyter-widgets/base", + "model_module_version": "1.2.0", + "model_name": "LayoutModel", + "state": { + "_model_module": "@jupyter-widgets/base", + "_model_module_version": "1.2.0", + "_model_name": "LayoutModel", + "_view_count": null, + "_view_module": "@jupyter-widgets/base", + "_view_module_version": "1.2.0", + "_view_name": "LayoutView", + "align_content": null, + "align_items": null, + "align_self": null, + "border": null, + "bottom": null, + "display": null, + "flex": null, + "flex_flow": null, + "grid_area": null, + "grid_auto_columns": null, + "grid_auto_flow": null, + "grid_auto_rows": null, + "grid_column": null, + "grid_gap": null, + "grid_row": null, + "grid_template_areas": null, + "grid_template_columns": null, + "grid_template_rows": null, + "height": null, + "justify_content": null, + "justify_items": null, + "left": null, + "margin": null, + "max_height": null, + "max_width": null, + "min_height": null, + "min_width": null, + "object_fit": null, + "object_position": null, + "order": null, + "overflow": null, + "overflow_x": null, + "overflow_y": null, + "padding": null, + "right": null, + "top": null, + "visibility": null, + "width": null + } }, - { - "output_type": "execute_result", - "data": { - "text/plain": [ - "'Based on the provided documents, a cultural control in IPM refers to disrupting the pest life cycle or making the environment less suited for survival. This includes practices such as rotating crops, using optimum growing conditions, and maintaining sanitation.\\n\\nFor instance, rotating crops can help break the life cycle of certain pests, while using optimum growing conditions can promote healthy plant growth and reduce the likelihood of pest infestation. 
Similarly, maintaining sanitation can prevent pests from finding food and shelter, thereby reducing their ability to survive and reproduce.\\n\\nThese cultural controls are often considered preventive measures, as they can help prevent pest problems from occurring in the first place. By incorporating cultural controls into an IPM program, farmers and gardeners can reduce their reliance on chemical pesticides and create a more sustainable and environmentally friendly approach to managing pests.'" + "fd3ea606c386466d9c6968f8728533b5": { + "model_module": "@jupyter-widgets/controls", + "model_module_version": "1.5.0", + "model_name": "HTMLModel", + "state": { + "_dom_classes": [], + "_model_module": "@jupyter-widgets/controls", + "_model_module_version": "1.5.0", + "_model_name": "HTMLModel", + "_view_count": null, + "_view_module": "@jupyter-widgets/controls", + "_view_module_version": "1.5.0", + "_view_name": "HTMLView", + "description": "", + "description_tooltip": null, + "layout": "IPY_MODEL_ebdcaa70abe74743890e7de4d4685291", + "placeholder": "​", + "style": "IPY_MODEL_845ffe1f73164cd2a3e4f413e2a35241", + "value": " 777/777 [00:00<00:00, 64.0kB/s]" + } + }, + "fd44f15fb985473eb6ddccae5a39f04d": { + "model_module": "@jupyter-widgets/controls", + "model_module_version": "1.5.0", + "model_name": "DescriptionStyleModel", + "state": { + "_model_module": "@jupyter-widgets/controls", + "_model_module_version": "1.5.0", + "_model_name": "DescriptionStyleModel", + "_view_count": null, + "_view_module": "@jupyter-widgets/base", + "_view_module_version": "1.2.0", + "_view_name": "StyleView", + "description_width": "" + } + }, + "fd6ee259a61d45399d3bddbff62e08fa": { + "model_module": "@jupyter-widgets/controls", + "model_module_version": "1.5.0", + "model_name": "HBoxModel", + "state": { + "_dom_classes": [], + "_model_module": "@jupyter-widgets/controls", + "_model_module_version": "1.5.0", + "_model_name": "HBoxModel", + "_view_count": null, + "_view_module": "@jupyter-widgets/controls", + "_view_module_version": "1.5.0", + "_view_name": "HBoxView", + "box_style": "", + "children": [ + "IPY_MODEL_67506c1cff4e4f02ab15b05379debe89", + "IPY_MODEL_697705f63b4e4234bec3dff82f9a668f", + "IPY_MODEL_12e047d39fa545ac9ecfd900e88fba68" ], - "application/vnd.google.colaboratory.intrinsic+json": { - "type": "string" - } - }, - "metadata": {}, - "execution_count": 38 + "layout": "IPY_MODEL_14cd39e47c764792a1362259532fd09b" + } + }, + "fdb89e3a02c94b9988db666725fb5cc0": { + "model_module": "@jupyter-widgets/base", + "model_module_version": "1.2.0", + "model_name": "LayoutModel", + "state": { + "_model_module": "@jupyter-widgets/base", + "_model_module_version": "1.2.0", + "_model_name": "LayoutModel", + "_view_count": null, + "_view_module": "@jupyter-widgets/base", + "_view_module_version": "1.2.0", + "_view_name": "LayoutView", + "align_content": null, + "align_items": null, + "align_self": null, + "border": null, + "bottom": null, + "display": null, + "flex": null, + "flex_flow": null, + "grid_area": null, + "grid_auto_columns": null, + "grid_auto_flow": null, + "grid_auto_rows": null, + "grid_column": null, + "grid_gap": null, + "grid_row": null, + "grid_template_areas": null, + "grid_template_columns": null, + "grid_template_rows": null, + "height": null, + "justify_content": null, + "justify_items": null, + "left": null, + "margin": null, + "max_height": null, + "max_width": null, + "min_height": null, + "min_width": null, + "object_fit": null, + "object_position": null, + "order": null, + "overflow": null, 
+ "overflow_x": null, + "overflow_y": null, + "padding": null, + "right": null, + "top": null, + "visibility": null, + "width": null + } + }, + "fdf91a039db749b38122ac0eb6eab108": { + "model_module": "@jupyter-widgets/controls", + "model_module_version": "1.5.0", + "model_name": "HTMLModel", + "state": { + "_dom_classes": [], + "_model_module": "@jupyter-widgets/controls", + "_model_module_version": "1.5.0", + "_model_name": "HTMLModel", + "_view_count": null, + "_view_module": "@jupyter-widgets/controls", + "_view_module_version": "1.5.0", + "_view_name": "HTMLView", + "description": "", + "description_tooltip": null, + "layout": "IPY_MODEL_73768d82de1e4b979a7947f600fe9d09", + "placeholder": "​", + "style": "IPY_MODEL_b47a97e2f1b8413b9a705c173d6c896b", + "value": "config.json: 100%" + } + }, + "ff1415afeac740c69bee0facc692a8e1": { + "model_module": "@jupyter-widgets/base", + "model_module_version": "1.2.0", + "model_name": "LayoutModel", + "state": { + "_model_module": "@jupyter-widgets/base", + "_model_module_version": "1.2.0", + "_model_name": "LayoutModel", + "_view_count": null, + "_view_module": "@jupyter-widgets/base", + "_view_module_version": "1.2.0", + "_view_name": "LayoutView", + "align_content": null, + "align_items": null, + "align_self": null, + "border": null, + "bottom": null, + "display": null, + "flex": null, + "flex_flow": null, + "grid_area": null, + "grid_auto_columns": null, + "grid_auto_flow": null, + "grid_auto_rows": null, + "grid_column": null, + "grid_gap": null, + "grid_row": null, + "grid_template_areas": null, + "grid_template_columns": null, + "grid_template_rows": null, + "height": null, + "justify_content": null, + "justify_items": null, + "left": null, + "margin": null, + "max_height": null, + "max_width": null, + "min_height": null, + "min_width": null, + "object_fit": null, + "object_position": null, + "order": null, + "overflow": null, + "overflow_x": null, + "overflow_y": null, + "padding": null, + "right": null, + "top": null, + "visibility": null, + "width": null + } } - ] + } } - ] -} \ No newline at end of file + }, + "nbformat": 4, + "nbformat_minor": 0 +}