unslothai · shimmyshimmer · Oct 31, 2025
diff --git a/nb/Kaggle-Qwen3_VL_(8B)-Vision.ipynb b/nb/Kaggle-Qwen3_VL_(8B)-Vision.ipynb
@@ -18,14 +18,18 @@
   },
   {
    "cell_type": "markdown",
-   "metadata": {},
+   "metadata": {
+    "id": "gib37dRGOWGF"
+   },
    "source": [
     "### News"
    ]
   },
   {
    "cell_type": "markdown",
-   "metadata": {},
+   "metadata": {
+    "id": "WES1cJDEOWGF"
+   },
    "source": [
     "\n",
     "Unsloth's [Docker image](https://hub.docker.com/r/unsloth/unsloth) is here! Start training with no setup & environment issues. [Read our Guide](https://docs.unsloth.ai/new/how-to-train-llms-with-unsloth-and-docker).\n",
@@ -41,15 +45,19 @@
   },
   {
    "cell_type": "markdown",
-   "metadata": {},
+   "metadata": {
+    "id": "yGlEz2VVOWGG"
+   },
    "source": [
     "### Installation"
    ]
   },
   {
    "cell_type": "code",
    "execution_count": null,
-   "metadata": {},
+   "metadata": {
+    "id": "PglJeZZoOWGG"
+   },
    "outputs": [],
    "source": "%%capture\nimport os\n\n!pip install pip3-autoremove\n!pip install torch torchvision torchaudio xformers --index-url https://download.pytorch.org/whl/cu128\n!pip install unsloth\n!pip install transformers==4.57.0\n!pip install --no-deps trl==0.22.2"
   },
@@ -156,7 +164,7 @@
   },
   {
    "cell_type": "code",
-   "execution_count": 4,
+   "execution_count": null,
    "metadata": {
     "id": "6bZsfBuZDeCL"
    },
@@ -195,7 +203,7 @@
   },
   {
    "cell_type": "code",
-   "execution_count": 5,
+   "execution_count": null,
    "metadata": {
     "colab": {
      "base_uri": "https://localhost:8080/",
@@ -274,7 +282,7 @@
   },
   {
    "cell_type": "code",
-   "execution_count": 6,
+   "execution_count": null,
    "metadata": {
     "colab": {
      "base_uri": "https://localhost:8080/"
@@ -303,7 +311,7 @@
   },
   {
    "cell_type": "code",
-   "execution_count": 7,
+   "execution_count": null,
    "metadata": {
     "colab": {
      "base_uri": "https://localhost:8080/",
@@ -332,7 +340,7 @@
   },
   {
    "cell_type": "code",
-   "execution_count": 8,
+   "execution_count": null,
    "metadata": {
     "colab": {
      "base_uri": "https://localhost:8080/",
@@ -371,7 +379,7 @@
   },
   {
    "cell_type": "code",
-   "execution_count": 9,
+   "execution_count": null,
    "metadata": {
     "colab": {
      "base_uri": "https://localhost:8080/",
@@ -423,7 +431,7 @@
   },
   {
    "cell_type": "code",
-   "execution_count": 10,
+   "execution_count": null,
    "metadata": {
     "id": "oPXzJZzHEgXe"
    },
@@ -458,7 +466,7 @@
   },
   {
    "cell_type": "code",
-   "execution_count": 11,
+   "execution_count": null,
    "metadata": {
     "id": "gFW2qXIr7Ezy"
    },
@@ -478,7 +486,7 @@
   },
   {
    "cell_type": "code",
-   "execution_count": 12,
+   "execution_count": null,
    "metadata": {
     "colab": {
      "base_uri": "https://localhost:8080/"
@@ -520,7 +528,7 @@
   },
   {
    "cell_type": "code",
-   "execution_count": 13,
+   "execution_count": null,
    "metadata": {
     "colab": {
      "base_uri": "https://localhost:8080/"
@@ -578,7 +586,7 @@
   },
   {
    "cell_type": "code",
-   "execution_count": 14,
+   "execution_count": null,
    "metadata": {
     "colab": {
      "base_uri": "https://localhost:8080/"
@@ -632,7 +640,7 @@
   },
   {
    "cell_type": "code",
-   "execution_count": 15,
+   "execution_count": null,
    "metadata": {
     "cellView": "form",
     "colab": {
@@ -662,7 +670,7 @@
   },
   {
    "cell_type": "code",
-   "execution_count": 16,
+   "execution_count": null,
    "metadata": {
     "colab": {
      "base_uri": "https://localhost:8080/",
@@ -846,7 +854,7 @@
   },
   {
    "cell_type": "code",
-   "execution_count": 17,
+   "execution_count": null,
    "metadata": {
     "cellView": "form",
     "colab": {
@@ -900,7 +908,7 @@
   },
   {
    "cell_type": "code",
-   "execution_count": 18,
+   "execution_count": null,
    "metadata": {
     "colab": {
      "base_uri": "https://localhost:8080/"
@@ -958,7 +966,7 @@
   },
   {
    "cell_type": "code",
-   "execution_count": 19,
+   "execution_count": null,
    "metadata": {
     "colab": {
      "base_uri": "https://localhost:8080/"
@@ -996,7 +1004,7 @@
   },
   {
    "cell_type": "code",
-   "execution_count": 20,
+   "execution_count": null,
    "metadata": {
     "colab": {
      "base_uri": "https://localhost:8080/"
@@ -1058,7 +1066,7 @@
   },
   {
    "cell_type": "code",
-   "execution_count": 21,
+   "execution_count": null,
    "metadata": {
     "id": "iHjt_SMYsd3P"
    },
@@ -1073,10 +1081,61 @@
     "if False: model.push_to_hub_merged(\"YOUR_USERNAME/unsloth_finetune\", tokenizer, token = \"PUT_HERE\")"
    ]
   },
+  {
+   "cell_type": "markdown",
+   "source": [
+    "### GGUF / llama.cpp Conversion\n",
+    "Unsloth now supports saving to `GGUF` / `llama.cpp` natively. We clone `llama.cpp` and save to `q8_0` by default; other quantization methods such as `q4_k_m` are also supported. Use `save_pretrained_gguf` to save locally and `push_to_hub_gguf` to upload to Hugging Face.\n",
+    "\n",
+    "Some supported quant methods (full list on our [Wiki page](https://github.com/unslothai/unsloth/wiki#gguf-quantization-options)):\n",
+    "* `q8_0` - Fast conversion. High resource use, but generally acceptable.\n",
+    "* `q4_k_m` - Recommended. Uses Q6_K for half of the attention.wv and feed_forward.w2 tensors, else Q4_K.\n",
+    "* `q5_k_m` - Recommended. Uses Q6_K for half of the attention.wv and feed_forward.w2 tensors, else Q5_K.\n",
+    "\n",
+    "[**NEW**] To finetune and auto-export to Ollama, try our [Ollama notebook](https://colab.research.google.com/github/unslothai/notebooks/blob/main/nb/Llama3_(8B)-Ollama.ipynb)."
+   ],
+   "metadata": {
+    "id": "qjuPsiqcOYaA"
+   }
+  },
+  {
+   "cell_type": "code",
+   "source": [
+    "# Save to 8bit Q8_0\n",
+    "if False: model.save_pretrained_gguf(\"unsloth_finetune\", tokenizer,)\n",
+    "# Create a token at https://huggingface.co/settings/tokens and pass it as `token`.\n",
+    "# Replace `hf` with your Hugging Face username.\n",
+    "if False: model.push_to_hub_gguf(\"hf/unsloth_finetune\", tokenizer, token = \"\")\n",
+    "\n",
+    "# Save to 16bit GGUF\n",
+    "if False: model.save_pretrained_gguf(\"unsloth_finetune\", tokenizer, quantization_method = \"f16\")\n",
+    "if False: model.push_to_hub_gguf(\"hf/unsloth_finetune\", tokenizer, quantization_method = \"f16\", token = \"\")\n",
+    "\n",
+    "# Save to q4_k_m GGUF\n",
+    "if False: model.save_pretrained_gguf(\"unsloth_finetune\", tokenizer, quantization_method = \"q4_k_m\")\n",
+    "if False: model.push_to_hub_gguf(\"hf/unsloth_finetune\", tokenizer, quantization_method = \"q4_k_m\", token = \"\")\n",
+    "\n",
+    "# Save to multiple GGUF quantizations in one call - much faster than exporting each one separately.\n",
+    "if False:\n",
+    "    model.push_to_hub_gguf(\n",
+    "        \"hf/unsloth_finetune\", # Change hf to your username!\n",
+    "        tokenizer,\n",
+    "        quantization_method = [\"q4_k_m\", \"q8_0\", \"q5_k_m\",],\n",
+    "        token = \"\",\n",
+    "    )"
+   ],
+   "metadata": {
+    "id": "At1T2hJnOdGM"
+   },
+   "execution_count": null,
+   "outputs": []
+  },
   {
    "cell_type": "markdown",
    "metadata": {},
    "source": [
+    "You can now use the `model-unsloth.gguf` or `model-unsloth-Q4_K_M.gguf` file in llama.cpp.\n",
+    "\n",
     "And we're done! If you have any questions on Unsloth, we have a [Discord](https://discord.gg/unsloth) channel! If you find any bugs or want to keep updated with the latest LLM stuff, or need help, join projects etc, feel free to join our Discord!\n",
     "\n",
     "Some other links:\n",
@@ -1092,7 +1151,7 @@
     "\n",
     "  Join Discord if you need help + \u2b50\ufe0f <i>Star us on <a href=\"https://github.com/unslothai/unsloth\">Github</a> </i> \u2b50\ufe0f\n",
     "\n",
-    "  This notebook and all Unsloth notebooks are licensed [LGPL-3.0](https://github.com/unslothai/notebooks?tab=LGPL-3.0-1-ov-file#readme)\n",
+    "  This notebook and all Unsloth notebooks are licensed [LGPL-3.0](https://github.com/unslothai/notebooks?tab=LGPL-3.0-1-ov-file#readme).\n",
     "</div>\n"
    ]
   }
@@ -2146,4 +2205,4 @@
  },
  "nbformat": 4,
  "nbformat_minor": 0
-}
+}