}]},\n",
- " {'role': 'assistant',\n",
- " 'content': [{'type': 'text',\n",
- " 'text': '{ \\\\frac { N } { M } } \\\\in { \\\\bf Z } , { \\\\frac { M } { P } } \\\\in { \\\\bf Z } , { \\\\frac { P } { Q } } \\\\in { \\\\bf Z }'}]}]}"
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "SXd9bTZd1aaL"
+ },
+ "source": [
+ "We now add LoRA adapters for parameter-efficient finetuning, which means we only need to train about 1% of all parameters.\n",
+ "\n",
+ "**[NEW]** We also support finetuning ONLY the vision part of the model, ONLY the language part, or both! You can also choose whether to finetune the attention layers, the MLP layers, or both."
]
- },
- "execution_count": 12,
- "metadata": {},
- "output_type": "execute_result"
- }
- ],
- "source": [
- "converted_dataset[0]"
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {
- "id": "FecKS-dA82f5"
- },
- "source": [
- "Let's first see before we do any finetuning what the model outputs for the first example!"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": 13,
- "metadata": {
- "colab": {
- "base_uri": "https://localhost:8080/"
},
- "id": "vcat4UxA81vr",
- "outputId": "de67935b-b273-4432-b42a-0eb31dceb4fe"
- },
- "outputs": [
{
- "name": "stdout",
- "output_type": "stream",
- "text": [
- " $$H ^ { \\prime } = \\beta N \\int d \\lambda \\left\\{ \\frac { 1 } { 2 \\beta ^ { 2 } N ^ { 2 } } \\partial _ { \\lambda } \\zeta ^ { \\dagger } \\partial _ { \\lambda } \\zeta + V ( \\lambda ) \\zeta ^ { \\dagger } \\zeta \\right\\} .$$<|im_end|>\n"
- ]
- }
- ],
- "source": [
- "FastVisionModel.for_inference(model) # Enable for inference!\n",
- "\n",
- "image = dataset[2][\"image\"]\n",
- "instruction = \"Write the LaTeX representation for this image.\"\n",
- "\n",
- "messages = [\n",
- " {\"role\": \"user\", \"content\": [\n",
- " {\"type\": \"image\"},\n",
- " {\"type\": \"text\", \"text\": instruction}\n",
- " ]}\n",
- "]\n",
- "input_text = tokenizer.apply_chat_template(messages, add_generation_prompt = True)\n",
- "inputs = tokenizer(\n",
- " image,\n",
- " input_text,\n",
- " add_special_tokens = False,\n",
- " return_tensors = \"pt\",\n",
- ").to(\"cuda\")\n",
- "\n",
- "from transformers import TextStreamer\n",
- "text_streamer = TextStreamer(tokenizer, skip_prompt = True)\n",
- "_ = model.generate(**inputs, streamer = text_streamer, max_new_tokens = 128,\n",
- " use_cache = True, temperature = 1.5, min_p = 0.1)"
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {
- "id": "idAEIeSQ3xdS"
- },
- "source": [
- "\n",
- "### Train the model\n",
- "Now let's train our model. We do 60 steps to speed things up, but you can set `num_train_epochs=1` for a full run, and turn off `max_steps=None`. We also support TRL's `DPOTrainer`!\n",
- "\n",
- "We use our new `UnslothVisionDataCollator` which will help in our vision finetuning setup."
- ]
- },
- {
- "cell_type": "code",
- "execution_count": 14,
- "metadata": {
- "colab": {
- "base_uri": "https://localhost:8080/"
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {
+ "id": "6bZsfBuZDeCL"
+ },
+ "outputs": [],
+ "source": [
+ "model = FastVisionModel.get_peft_model(\n",
+ " model,\n",
+ " finetune_vision_layers = True, # False if not finetuning vision layers\n",
+ " finetune_language_layers = True, # False if not finetuning language layers\n",
+ " finetune_attention_modules = True, # False if not finetuning attention layers\n",
+ " finetune_mlp_modules = True, # False if not finetuning MLP layers\n",
+ "\n",
+ " r = 16, # Larger r can improve accuracy, but may overfit\n",
+ " lora_alpha = 16, # Recommended: alpha >= r\n",
+ " lora_dropout = 0,\n",
+ " bias = \"none\",\n",
+ " random_state = 3407,\n",
+ " use_rslora = False, # We support rank stabilized LoRA\n",
+ " loftq_config = None, # And LoftQ\n",
+ " # target_modules = \"all-linear\", # Optional now! Can specify a list if needed\n",
+ ")"
+ ]
},
- "id": "95_Nn-89DhsL",
- "outputId": "c4a76c64-06c7-4aa8-f2fd-038653804e6f"
- },
- "outputs": [
{
- "name": "stdout",
- "output_type": "stream",
- "text": [
- "Unsloth: Model does not have a default image size - using 512\n"
- ]
- }
- ],
- "source": [
- "from unsloth.trainer import UnslothVisionDataCollator\n",
- "from trl import SFTTrainer, SFTConfig\n",
- "\n",
- "FastVisionModel.for_training(model) # Enable for training!\n",
- "\n",
- "trainer = SFTTrainer(\n",
- " model = model,\n",
- " tokenizer = tokenizer,\n",
- " data_collator = UnslothVisionDataCollator(model, tokenizer), # Must use!\n",
- " train_dataset = converted_dataset,\n",
- " args = SFTConfig(\n",
- " per_device_train_batch_size = 2,\n",
- " gradient_accumulation_steps = 4,\n",
- " warmup_steps = 5,\n",
- " max_steps = 30,\n",
- " # num_train_epochs = 1, # Set this instead of max_steps for full training runs\n",
- " learning_rate = 2e-4,\n",
- " logging_steps = 1,\n",
- " optim = \"adamw_8bit\",\n",
- " weight_decay = 0.001,\n",
- " lr_scheduler_type = \"linear\",\n",
- " seed = 3407,\n",
- " output_dir = \"outputs\",\n",
- " report_to = \"none\", # For Weights and Biases\n",
- "\n",
- " # You MUST put the below items for vision finetuning:\n",
- " remove_unused_columns = False,\n",
- " dataset_text_field = \"\",\n",
- " dataset_kwargs = {\"skip_prepare_dataset\": True},\n",
- " max_length = 2048,\n",
- " ),\n",
- ")"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": 15,
- "metadata": {
- "cellView": "form",
- "colab": {
- "base_uri": "https://localhost:8080/"
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "vITh0KVJ10qX"
+ },
+ "source": [
+ "\n",
+ "### Data Prep\n",
+ "We'll be using a sampled dataset of handwritten maths formulas. The goal is to convert these images into a computer-readable form, i.e. LaTeX, so we can render them. This can be very useful for complex formulas.\n",
+ "\n",
+ "You can access the dataset [here](https://huggingface.co/datasets/unsloth/LaTeX_OCR). The full dataset is [here](https://huggingface.co/datasets/linxy/LaTeX_OCR)."
+ ]
},
- "id": "2ejIt2xSNKKp",
- "outputId": "e4592e9e-a840-4079-b91e-08f919fc5c80"
- },
- "outputs": [
{
- "name": "stdout",
- "output_type": "stream",
- "text": [
- "GPU = Tesla T4. Max memory = 14.741 GB.\n",
- "7.66 GB of memory reserved.\n"
- ]
- }
- ],
- "source": [
- "# @title Show current memory stats\n",
- "gpu_stats = torch.cuda.get_device_properties(0)\n",
- "start_gpu_memory = round(torch.cuda.max_memory_reserved() / 1024 / 1024 / 1024, 3)\n",
- "max_memory = round(gpu_stats.total_memory / 1024 / 1024 / 1024, 3)\n",
- "print(f\"GPU = {gpu_stats.name}. Max memory = {max_memory} GB.\")\n",
- "print(f\"{start_gpu_memory} GB of memory reserved.\")"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": 16,
- "metadata": {
- "colab": {
- "base_uri": "https://localhost:8080/",
- "height": 1000
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {
+ "colab": {
+ "base_uri": "https://localhost:8080/",
+ "height": 81,
+ "referenced_widgets": [
+ "1d7dc3fd42e04783b7e912161ec5f7c2",
+ "2ed0f7a88aab4e7f9e7c3216bea48b90",
+ "48276a93bae44504aa0a485b207051d0",
+ "8466623dbeb442468367747c2b187cdd",
+ "2b60182eb6724c0fa3d241a39854cd1e",
+ "28b385c6e67d4c05a4277889c9f5c0c4",
+ "1bc12b432fe5444c8e854153cf84120d",
+ "3374e62bec23487098102668768cc9ef",
+ "ce09826095904bc18c8eaaff388a216d",
+ "9947797e53bf41c9a93c9b94c877c863",
+ "de403b0aaea3409996b04e0c826bc71a",
+ "27456772617447dc80c58eb292c185c8",
+ "196ef60df0cd417f934f6e2304c3f180",
+ "9113790223204b179dfeff0623f0136d",
+ "d84c3a99025742ae939a2472f36852fa",
+ "daed4150f70b4cd1b913f26a732b13c5",
+ "cfb40ab3aa994fc9b3088d79ed5c4a26",
+ "046156a2fad9435b84a19ff3163e7e4a",
+ "23ec6113e4014445baf3dcc4dd6c4e17",
+ "696d7cf5fbb64bc484d7de8bc66a4062",
+ "087abc8baba848babf93b8f29e5a2bcf",
+ "1a74e168c9554fac9978a4736dbcdb11"
+ ]
+ },
+ "id": "LjY75GoYUCB8",
+ "outputId": "121963f3-8e3f-4c7f-e4a5-36c1ddc12a54"
+ },
+ "outputs": [
+ {
+ "data": {
+ "application/vnd.jupyter.widget-view+json": {
+ "model_id": "1d7dc3fd42e04783b7e912161ec5f7c2",
+ "version_major": 2,
+ "version_minor": 0
+ },
+ "text/plain": [
+ "Generating train split: 0%| | 0/68686 [00:00, ? examples/s]"
+ ]
+ },
+ "metadata": {},
+ "output_type": "display_data"
+ },
+ {
+ "data": {
+ "application/vnd.jupyter.widget-view+json": {
+ "model_id": "27456772617447dc80c58eb292c185c8",
+ "version_major": 2,
+ "version_minor": 0
+ },
+ "text/plain": [
+ "Generating test split: 0%| | 0/7632 [00:00, ? examples/s]"
+ ]
+ },
+ "metadata": {},
+ "output_type": "display_data"
+ }
+ ],
+ "source": [
+ "from datasets import load_dataset\n",
+ "dataset = load_dataset(\"unsloth/LaTeX_OCR\", split = \"train\")"
+ ]
},
- "id": "yqxqAZ7KJ4oL",
- "outputId": "dca00bba-4035-4133-cb34-9fadfb0b011a"
- },
- "outputs": [
{
- "name": "stderr",
- "output_type": "stream",
- "text": [
- "The tokenizer has new PAD/BOS/EOS tokens that differ from the model config and generation config. The model config and generation config were aligned accordingly, being updated with the tokenizer's values. Updated tokens: {'bos_token_id': None}.\n",
- "The model is already on multiple devices. Skipping the move to device specified in `args`.\n",
- "==((====))== Unsloth - 2x faster free finetuning | Num GPUs used = 1\n",
- " \\\\ /| Num examples = 68,686 | Num Epochs = 1 | Total steps = 30\n",
- "O^O/ \\_/ \\ Batch size per device = 2 | Gradient accumulation steps = 4\n",
- "\\ / Data Parallel GPUs = 1 | Total batch size (2 x 4 x 1) = 8\n",
- " \"-____-\" Trainable parameters = 51,346,944 of 8,818,470,640 (0.58% trained)\n"
- ]
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "W1W2Qhsz6rUT"
+ },
+ "source": [
+ "Let's take a quick look at the dataset. We'll check what the 3rd image is and what caption it has."
+ ]
},
{
- "name": "stdout",
- "output_type": "stream",
- "text": [
- "Unsloth: Will smartly offload gradients to save VRAM!\n"
- ]
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {
+ "colab": {
+ "base_uri": "https://localhost:8080/"
+ },
+ "id": "bfcSGwIb6p_R",
+ "outputId": "43de1949-d15c-4956-d57d-d641f0671816"
+ },
+ "outputs": [
+ {
+ "data": {
+ "text/plain": [
+ "Dataset({\n",
+ " features: ['image', 'text'],\n",
+ " num_rows: 68686\n",
+ "})"
+ ]
+ },
+ "execution_count": 6,
+ "metadata": {},
+ "output_type": "execute_result"
+ }
+ ],
+ "source": [
+ "dataset"
+ ]
},
{
- "data": {
- "text/html": [
- "\n",
- " <div>\n",
- " [30/30 02:52, Epoch 0/1]\n",
- " </div>\n",
- " <table>\n",
- " <thead>\n",
- " <tr><th>Step</th><th>Training Loss</th></tr>\n",
- " </thead>\n",
- " <tbody>\n",
- " <tr><td>1</td><td>0.313100</td></tr>\n",
- " <tr><td>2</td><td>0.373600</td></tr>\n",
- " <tr><td>3</td><td>0.359900</td></tr>\n",
- " <tr><td>4</td><td>0.265600</td></tr>\n",
- " <tr><td>5</td><td>0.220300</td></tr>\n",
- " <tr><td>6</td><td>0.228400</td></tr>\n",
- " <tr><td>7</td><td>0.172900</td></tr>\n",
- " <tr><td>8</td><td>0.115100</td></tr>\n",
- " <tr><td>9</td><td>0.065900</td></tr>\n",
- " <tr><td>10</td><td>0.070000</td></tr>\n",
- " <tr><td>11</td><td>0.054500</td></tr>\n",
- " <tr><td>12</td><td>0.056300</td></tr>\n",
- " <tr><td>13</td><td>0.043800</td></tr>\n",
- " <tr><td>14</td><td>0.043600</td></tr>\n",
- " <tr><td>15</td><td>0.037000</td></tr>\n",
- " <tr><td>16</td><td>0.035200</td></tr>\n",
- " <tr><td>17</td><td>0.027400</td></tr>\n",
- " <tr><td>18</td><td>0.023400</td></tr>\n",
- " <tr><td>19</td><td>0.017700</td></tr>\n",
- " <tr><td>20</td><td>0.027400</td></tr>\n",
- " <tr><td>21</td><td>0.014100</td></tr>\n",
- " <tr><td>22</td><td>0.023300</td></tr>\n",
- " <tr><td>23</td><td>0.007900</td></tr>\n",
- " <tr><td>24</td><td>0.013500</td></tr>\n",
- " <tr><td>25</td><td>0.026900</td></tr>\n",
- " <tr><td>26</td><td>0.022000</td></tr>\n",
- " <tr><td>27</td><td>0.010300</td></tr>\n",
- " <tr><td>28</td><td>0.051500</td></tr>\n",
- " <tr><td>29</td><td>0.014600</td></tr>\n",
- " <tr><td>30</td><td>0.031700</td></tr>\n",
- " </tbody>\n",
- " </table>"
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {
+ "colab": {
+ "base_uri": "https://localhost:8080/",
+ "height": 67
+ },
+ "id": "uOLWY2936t1n",
+ "outputId": "cbef1d3b-4859-4f3b-fe76-67b57cad8648"
+ },
+ "outputs": [
+ {
+ "data": {
+ "image/jpeg": "/9j/4AAQSkZJRgABAQAAAQABAAD/2wBDAAgGBgcGBQgHBwcJCQgKDBQNDAsLDBkSEw8UHRofHh0aHBwgJC4nICIsIxwcKDcpLDAxNDQ0Hyc5PTgyPC4zNDL/2wBDAQkJCQwLDBgNDRgyIRwhMjIyMjIyMjIyMjIyMjIyMjIyMjIyMjIyMjIyMjIyMjIyMjIyMjIyMjIyMjIyMjIyMjL/wAARCAAyAUADASIAAhEBAxEB/8QAHwAAAQUBAQEBAQEAAAAAAAAAAAECAwQFBgcICQoL/8QAtRAAAgEDAwIEAwUFBAQAAAF9AQIDAAQRBRIhMUEGE1FhByJxFDKBkaEII0KxwRVS0fAkM2JyggkKFhcYGRolJicoKSo0NTY3ODk6Q0RFRkdISUpTVFVWV1hZWmNkZWZnaGlqc3R1dnd4eXqDhIWGh4iJipKTlJWWl5iZmqKjpKWmp6ipqrKztLW2t7i5usLDxMXGx8jJytLT1NXW19jZ2uHi4+Tl5ufo6erx8vP09fb3+Pn6/8QAHwEAAwEBAQEBAQEBAQAAAAAAAAECAwQFBgcICQoL/8QAtREAAgECBAQDBAcFBAQAAQJ3AAECAxEEBSExBhJBUQdhcRMiMoEIFEKRobHBCSMzUvAVYnLRChYkNOEl8RcYGRomJygpKjU2Nzg5OkNERUZHSElKU1RVVldYWVpjZGVmZ2hpanN0dXZ3eHl6goOEhYaHiImKkpOUlZaXmJmaoqOkpaanqKmqsrO0tba3uLm6wsPExcbHyMnK0tPU1dbX2Nna4uPk5ebn6Onq8vP09fb3+Pn6/9oADAMBAAIRAxEAPwD3+iq1/f22mWE17eSiK2gUvJIQSFUdScdhWYPF+hGGxmW/Vkv8/ZCI3PnY5O35eeOfpzQBuUVk6DrsevJfywQukNrey2iyMQRN5ZAZl9t24f8AATWtQAUUVkxeJdJm1dtLiu99yHaI7Y2KeYo3GPzMbd4GSVzkAHjigDWorOTXdMk1afSlvIzqEEXnPb87wmcbgMcjPcU7S9ZsNagefT7gTxI5jZgrABgcEcgcg8H0oAm1HULbStNudQvJBHbW0TSyueyqMmquhawdb0/7WdOv7A+YyGG+iEcnHfAJ4PY5rnPibIZtF0zRQpb+2NVtbNwoBPl797nBPI2oc9euO9dqOlAC0Vn3Gt6fbal/Z0s5+2fZ2uhCsbMxiU4LcA98DHXJqbT9RtNW06G/sJ0uLWdd0cqdGHrQBl2HiJ9U8SX+nWVkXs9Obybq9eTaPOKhvLRcHdgEbiSMZHWt6uO8Cf8AIQ8Y/wDYfl/9Ew12NABWZea5bWWsWOmSRXJmvHKRyLCfKB2O+C/TOI24GT045rTrmfEssq634eaOyvJ0tr1p5nggLqiGCVMkj/adeBz3oA6aijtRQAUVU1LU7PSLI3d7N5cQZUGFLFmYgKqqASxJIAAGTTNL1ey1iGWSzkZvJlMMqSRtG8bjBKsrAEHBB5HIIPQ0AXqKr3t9a6db/aLydIIdyoXkOFBY4GT25IrCsvFE9341vNEGm3YtYbaKVLrygEJYyZbdu5VtoC4HUNntQB0tFYq+K9GbUxp4uyZmna2DeS/lGYDJj8zGzfgdM57da2qACiq1rqFneyTx21zFLJA5SVFYFkYEjBHUdDVnNABRVe+vbfTbKa8u5PLt4V3SPgnaO5OO1UJfE2jwaRbarJeotjdMqwTbWxIW+7jjJz29aANeiqP9rWraqmmozvctB9oZVU/u484Bb0ycgDqcH0NQw+INPutKudRs5HuYLZnWURRsXVk+8u0gHcPTrQBqUVVTUbOXS11KO4R7JofPWZTlTHjduHtjmud1Lxi1v4k0PT7GymvrTUN+65t0EiDHAw24D5Tkt146c0AdZWBa+JH/AOEsn8PahYm1nMTXFlMsm+O6iBAbHAKupPK88cgkVrTajZ297DZzXMcdxOCYo3YAvggHGevUVy2u/wDJUfC
P/XpqP8oaAOyooooAKKKKACiiigDi/ilfrb+CrjTkuEiutWePT4dzAZ81wrnkjgKWJ7euKxNMtp3+Kmm6bc61Hfw6Jpsk0SpEkflyOREFwpOSEVuvIB/2q9NZFb7yg/UUBEByFUH1AoAr2Gn2ul2KWdjAsNvHnbGvQZJJ/Mkn8a8/fVvExkYi48QKCeAPD8fH/j9elUmB6CgDIj1KS08JtqV55xeC1aaXzofKc7VJOVGdp46Vw2gxx/8ACV+Hre11Y6nb2tjPeXUStGYLKVgoDgoB8zF5eHLHDMeOtejajp8Wp2TWk5YROylgpxkBg2PocYPsamit4IFdYYY4w7F2CKBuY9ScdSaAPJ9cldIJfiNpEaX11p2qzKY4Hz5toALdk4908wdcZJHBr0vw/pzaT4fsLF8GSGFRIR/E+MsfxYk/jWiqqowoA+gpelAHAa+x1P4xeFdOUhk060udRmTGR8w8pCR2wc4NdF4z1m48PeDdW1a0jWS4tbZ5I1YZG7HBI9BnJ+lYugw/2h8UvFGrNl1sobfTIXyCAdvmyAHHq6ZGeuc9q7ZlV1KsAykYIIyCKAPIdD8T6ZpnjDW7/VNfl1aWysobSGSNQ5lkJDSrGFGPmkaMKo7hh0XI1fBV3rGlweIdCOnww6mhOqafYTTEIsVxlhGWx/BJvUkcZ/OvQ0sbSMIEtoVCBQuIwNoXO0DjjGTj0zUhij8wy7F8zbt3Y5x1xn0oA5H4bLAfDdxP50kuoz3076n5qhXW63YdSoJACgKBg/dAPetS4u/FK3Mi2+kaTJCGPltJqcisy54JUQHB9sn61leBP+P/AMZf9h+X/wBEw1z3ivSlTxFGttqivq7alDqDXcoCHTLQYVlZ+6MRtVD94k8cE0AdbNqvii28vz9K0KLzHEab9YkXcx6AZg5PtVaLxJrdxftYw2vhuS8XdugXW3LjacN8vkZ4PB9K5jxvDc3Y1bVvKtZLXz4tJV5kZp4kZ0RzbgjaHLO3zc8ovPGBt6ZoWpJ47a+n0mK3062e5+yGK5UjMpBklZcbjI7AcZCqM8EnNAGwL7xac40bReOv/E2k4/8AIFU73xLremKrX9r4btVYEgz626AgdTzB0GR+dcnrtwNA8SS+PyJRp0d8+l38aJkSW21U345BKzhvc5x2q3eeEb1PCOlWml6LB9quYJRqNwkqQzRLMA0yJuXGWPy5I+ULwM4wAX/FWoXV3feGNK1Oay0yG8kmurm4imWQJ5O0xrHJIgAZiwO7AIwcetY+jardeHkuvFUt2kuk6vriW5e8IWV7UKIIplORk7lycglkG71r0mHTLabSbW0vLG3ZIo0HkOokVCFxgbuuOmaq6n4V0nWbqW41C3Nw0lo1oFkclI0bO4ovRWIOCw5wAKAMGa403VfiTdabrUtoxsbaI6fZXBH7xpA3mShW4cgKFGM7RnpurdmsTpepaprkYWRTp8USW6rg5hMrcH38wDpxiodX8HabrVjp1leNK8Ni8bIWCu7bMYy7KWB+XkqQTk810NAHi+mwS69b+E0t9X+0ahqN6mt6jBbqn2e2Vf3p+RR8jbzGvJyxLZPp0d14zurv4eardPeWtrqNlff2beXNsS0cOZljaZeSQBG+8Z6Hr0rtLnQ7GfT7qzji+yx3Q/etaHyXb1O5cHPv71Fp/hrStKup57C0S38+3jt3ij4jKR5C/L0yA2M+gAoAXQbLRbTTIm0GKzFo8ahJbYqwkUdCXGd31JJ5Ncnq914oPi/w8X0jSllH2ny1XUpCrfuxnJ8njj2P4V1Hhzw1ZeGLS4t7IsRcTm4lJREBcgKcKiqo4UdAO571oy2VtPd291LCjT2+7ynI5TcMNj6igCGSGe/0WSC+RLeaeFkkEEhkCEgj5WIGevoK4H4ZC58Q+G9AvL2J1s9Is1gtVYFRLOq7GkI7hFGxT6lz/dNemU2ONIkCRoqKOgUYAoA47RNQg06+8a6nq8kVstvqIEkrN92BbeIpnk/3ice
pNZXg7U7mLxzqIurGTT7PxHH/AGlYW8xIcNHiOTcP4XZfLk29geea7L+woRr0+ppJhbqFYrq3ZAyTFD+7fnowBI9wRnoKuX1o13bOkUvkTlGWO4VAzRZGCVz3oA8xtLzTVh8PaPqF7bQ6FLf6kwWZ8RzmG5KwQ7jwV+bdg9fLA56V302gQnVdHu7XyYILAzfuY4wAwkXHGOBzz+NJP4V02bwqnh1IxHYpEsSgxpKcD1DqwJPOSRnknrzWjpunwaVpdpp1tu8i1hSCPccnaoAGT34FAHI+BZdL1wXer3Elpc6691J9oU4aS0COyRxhT8yAKB6ZJLc5pPHbNBrvhq50wtJ4gWeWOztSP3c8TKPOEhyNqgBTuGSCBgHOK3j4XsG8WJ4jYE3qQtCmERQAcZywUM3T+IkDJxWRrv8AyVLwj/156h/KGgDr5ZkggeaVgkaKWZj0AAyTUOn6haarYQ31jOk9rOu+OVOjD1FWaOlABRRRQAUUUUAFFFFABRRRQAUUUUAFZ+uWF3qej3FpY6lLp104BiuolDGNgQeh4IOMEdwT0rQooAyfD2hroOnyQG5e6uJ55Lm5uHUKZZXbJOBwAOAB2AArWoooAKKKKAOb0nQr/RfE2qzwSwS6Tqk32t0clZYJ9qq2OCGVgoPJG3Henz+BvDNxrY1qXRrWTUhKs4uGBLb1xg9e2B+VdDRQBzWl+C9MtGt7u7hFzqEchuGkZ28vz2JLSCPO0Nkn5tueBXSModCpzgjBwcUtFAGXD4c0iHSJtJWxiawmLNJbyZdGLHLZDE9Tz9ea1KKKACiiigAooooAKKKKACiiigAooooAKKKKACiiigArmYdB1C78bHXtUlgFvZwyW2nW0JLFVcqXldiB8x2gbRwAOpNdNRQAUUUUAFFFFABRRRQAUUUUAFFFFABRRRQAUUUUAFFFFABRRRQAUUUUAFFFFABRRRQAUUUUAFFFFABRRRQAUUUUAFFFFABRRRQAUUUUAFFFFABRRRQAUUUUAFFFFAH/2Q==",
+ "image/png": "iVBORw0KGgoAAAANSUhEUgAAAUAAAAAyCAIAAACib5WDAAAYrUlEQVR4Ae3cebSuUx0HcCSFEBHKlNZCQi2zureMyTwrY2UeQlSaXNdQxjSSMltFSJF5bjCzDMuilUUUIjSKotLtc+83+z73eYd7znve877n3vXuP56zn/389m/av2FP75l10qRJswzKQAMDDcyYGphtxmS7P1wPgl1/9N4nqoZ77I/4TOvA3dU+bP/5z39mnXXWPtnSgGwfNGC4lf/+9799oD1kkjOnA1N6tD9kPUwHELbZZ5/9hRde+Oc//zkd0Ol9FgteffXVsR/apyfHTP6dCT3zzDP/+Mc/ZpttTPvImGauYxuh9BdffPHvf//7v/71r46RpCNPM5Zc7uijj37729/+ne98J9m4Y7Riwete9zpPKb1jJIOOo6qBjPinP/3pt7zlLddcc82///1vBjCqFDtHzhxnpsIraP/nP//5IossMvfcc3/5y18mHTfuWMZky0022YSKf/rTnwrJSHSMTd8//vGP55133qabbvr888/Dg9uOsQ06jpIG4sCvvPLKxhtvbNzFboS48SiRGwnamTADS25i5+abbz7PPPMYg85j2yyz8F7J/IYbbrjyyiu/8pWvwPmGN7xB/uwAp0HSy7xggw02uO2222688cbXv/71HeAZdOmBBpiQYoAuv/zy973vfccee+xf//pXa6gMYg8YGAaJkXj/WOub3HjVVVfxsb/85S+m0JasI2EyqVs4gNACmD8rI0Go+9/+9jcYllpqKalYZZCBR6LPUe2blHv77bdzp2OOOQatkUzlRonV2Yfh62MelDNImBxYsJxzzjlly66wbCrO8cRjyA3DSHCK6/POO6/JM+NQHwmqbvUl0RjhpFsSTRdPBnG6UovaIFdffXWQpk7TRdsXgJlnCk3XfEyMPPPMMw8++OA55pgj8XLkahUXIMmojxwbPCxj5Hi6hWG6dtwtQmMHD5GHKDWwTOLG1JBVNdmJAzNoCaSKZezUX375ZRqXMKeM0VDHqZf8D9F0esOSaCK3ZBrfG4pjgQojUYYYkU278DxE4N5L14kDE6mLGzBU0xXtWABT33e/+13PDTfc0DOq771O21MkbFJ6F91migqHp8Z0sTez8sorW94LK1qqnAeg2tL3urgc1TXlxCcATT+Vxixrv/SlL33xi18kcmymfJ0RK00cmFTkVFQoRUmLp0kpIR3STJw4EYB6bdSHpQJ94ZycJae971IolvEoLWGpFRUILS/NdtZbbz0w7R0YqqqNtsfciuJw2yPs/PPPb33umdfhIqnBFx3W1FgDa3wFjw1jqlLdrqdtwIW3kQxxI9HGFvxPMbfJ9la+ptEzzGhnDEa2OqbVrwB8AlBspqAqFYKAIReYPAvyAjPDVZo4sB0gCVZRIbCSFk8LSzJffPHFRx11VFXdnYlNiXBKAgoqBUmhWBYepSUsFchSMTYYlknOPvtswwNh+dSqAlWxUTCtMLfq3lk7S3388cfPPffcxx57zFM9cbAzbHoRHOe23J9++umXXnqpqsamOA2f2SN31fGCCy5Ya621jOl73vOeww47jDYwEyvXF8I//elPKto1NsWWRthEgTb2oHsbDPifYm6T7a1QSaNnJIr3/vjHP77++uvBBFv5Gj/0CUBTHw6HCRNO4J3nX3311Ysttli2poItmgHTJgQU9sZOZarKwpOROO200/785z+TiiRbb7219lxGIZvbEXvsscc555zzqU99yjavFqrvQBjIFR0///nP/+hHP1I/5ZRTHJozAght3N9///2sxwbghz70IWBarrvuOpaEpQ9/+MNrrrmmLo3G6uDXIY30izfjAb4Vb4z41FNP/dnPfrbffvtB+PGPf3y++ebbZpttxo8fr2MJHK26d9wOuSNl04TDDz/8d7/7nfqee+7ZMTYdyXjkkUf+5Cc/eec73/mLX/zC0Gy22Wa01CjCZI1PmvTkk0/+
8pe/3GWXXfR985vfTN7vf//7H/vYx3JjIZzcddddAjTH/tWvfrXoooti8o1vfGNoBaA8jRcvEol23nnnueaaq7TXKm0Ggr1973vf+/3vf6+7Kclee+1lUsDfTj75ZPEI/o022shwE8ftl69//euXXnopKfQycG9605vKkGl817veteWWW7LJj370o0UDNQ7dw1lppZWeeuopA7HPPvuwE6xO1su0mqnxP6Zfw32epCI55wnHZ511ltDOJahVi+QmYwhd6txJF9qpdh96HRXAa6+9ttj/3HPPPfLIIwxF1MCAAeC6biyiwm4SSjzTgiVf0QVZJRdOEp4du/tkjKsApa6jXS5fP/GJTyDhvqtPv/nNb9T333//Wscg4WyChbH3tUa3oO19JSJLJjjHP1ZlVG7geBmTjXymhaplXZJedtll4XnxxRdPxaAo9L/ccsvtuOOOvOjZZ59997vf7RgcQKM+IQTv0yGHHLLDDjt89rOfbaUiCldCpfZE7uGHH+ZUpHjwwQchRBfMgQceqEVoMNxePR2/uf2irgswU5jqkFUNoGigkUNXA2DA6qGHHhpUnjXNuLyhsdgYoQQga2aNjUrQ2N8ydeJKHQpL5VTS4Gc+8xkmLgyLcwsvvLBP0pTrB4L9Cius4DoRJVbDvFeabVpIOAX3/x8gUfnhD38oOggWCy20EO8VbumRpnxdYIEFjAE7A3bSSSdpEZv5ueSPJXV0a0EdDOx33HGHp15VctV6SIgLBGSmKi67alx66aWdPIlQ5t4+1RhmUhnjKqqO6yxDoahUOsaDJX3vvPNOHoh/bH/kIx/BqrmMdvirmOmHxi655JI11ljD1SIC0hIM4pcn3QabRnMTWdr4Gvq3vvWtPPMb3/gG95Npq2pRhxM5OfzXv/61QSRObVAKG+bnSnmtMgbPMsssI4HDb0QwYHDNbO+77z5Ts1133dXMCLzuku26667LhUAi/Y53vKM6ZDpqXH/99QlIAzihgUYOkdPOjRWVSNGomSqH6sB4QVRU+9T/V8yVYgzUr732WiKZjJV2NxmoJucNJld8mGoCXGCGXkkYk9ZM3qTWP/zhD8zFPFY7nDBTrtF69NFHGZBZMeuha3nAJMrX6L1Kjma1szzzbXacpK2xClPqIoU1gtD+gQ98wDQMGPyeN910k8FgrCDDYWiZgIj9jJ6xYq+KVr1fBZNIe2KSoR900EHGxcycHbP18lVFAaPQKiu86KKL0pinzOYXGslykUW7GabwLUdZYpxwwgnU0qjzqAiMcN/UEoItsc9kVdGS1yoD8Gi3LYxKYUCG5IfA8pXhielm8qQIrQxEbcgCTA+rrrqqugJDjcMI8slPfjKzLTbTSjP64seTwZhpM9Hw4zmmyjRrYMKIf7feeisWb7nllrvvvptapSkyyIfcmPzmEtXEG717mr+54suAiO21FIGcEi2e4RQXPBNiWQls1sDg5QHZmO9hAAZgeLCoM49ijg888MAqq6xy/vnnW+zla0FeKtphM4VOdi3tpRLRzBsF+wMOOAB+S0H7GWhFHPN5TFpiGVrYwq1gbwpgb+zEE08kCCTgC85qvTT2poI9xexup512MuWjRiZuZWsBabBkVMo0VSnMkMhFbqJtt912SadayjhGENKR1xIJTsq3bhJhH3roIWAGSBQzP4cWTpAq3Nuc5YknngDAWyDUvVAMTkrTkqeWVApMKto5pDqHNAomQcKrqT4qcPpK/1xI+vVKapAoqtSGDKRimiYKQ2JSAGGNw3RncpEdQqWmGZ80ogIbo0L6c5/73BFHHCGC+E0LrtI3zPf9OVXjZDMAYpKJqFBN11rIwCtwPG7cOLwSTJ38kTDcgzF+VjKkVY+O8gkYczEYxYGDRCSWJ0V3eVVfc/JvfvObjqYMJPwyfHZZ9t133+OOO+4LX/jChRdeKCTrizQSQV57winEEKHKQGC04MSg2uSQMfwswTrZls/uu+8OwCcABuZtb3ubEzJgCy64oBZ2
Y+FtN4Wfy8Bl1AtOkMBqIte46u5raHkaIC701a9+1RqYYWULyg6F7UZGRmNOCriW+SeVxi35m0ZSiHHBE+YJbuDwSfMkIqyT4cRuK2GE7Ir99re/pTrxNMeHNAatOAgYNiNY80yYo5xQF2jgN+J55RJKNAOVinP7448/3tTPBuS2227LA8ULjEXnrtbZEtc3XTy1e9aGLI34YQZkwUMjh5GUQoQ5GOhQwKppplBRiTmZj5jS2xtjhzbV0IW/CtbPOjlTyK8SLeD4teZJzB1/5Wtp76xiGHQ0JJnpBcl73/te29rqJkue9G5OmE/qlMWksovAS9NefTIgr1ydXYqUWE1Lgcmr++gE4beClAl8xgYMlnQxR0qMkGyNkPavfe1r4KMKXaIB7amYkVqzmX86jfAcYhEjUoYIX8CqvZZddlkugSVukKmdVzzzWxaJQ5+owrxUI0vVYrFgL5ekitcUX8kiZsW1NEZF9vzV2ToZaSD3+IVazgYy4ltZyPb2Mr1WceZrTTk8SiHLkksu6WlSE/yegVdBSLt77CuuuKLX4MzAkctXMgYe2+G8NmSxDbsndCX4gm/kMORMKEwwQ6KpZhAqpXCYfbUsuEpjwPCDVc/Sq2eVqRkYT9R08803cxgHDMYvhQOvs846+E7kBlMtGGUrLMYsDkz1kzpUpBLChWoVr6h4mj8vv/zyZGZwBttWlrgL3iskNj8TNbTo++1vf/vee+9Ni741EuU1DokKfkpjtWJEfeKlxlWwNx7yjB0UacoK3EQDP5b6ZtGcVsfddtuNXOjapDFNxXkw58kiLRmIDzJKqNJKXTt+1PGWFqR1Zzp0GzwFUqXWUn2tdTQW7BV7VvLYphbSiTgmeygCFu+8WhCChJlicxgD2GvUmOz3wQ9+UAt+ggRR4RKYpCqW6bj33nvDKeB+61vfOuOMM6IcIth6hBM8zUAVbvMsygFGdlwhIcfmNem36MQngRt152omYqeffnpMUXuKqTu00SSVouUpE9aGLEJ5hge0zPtqHOaTdiUiN9XMa5Qnj6xCt5YV9sBNdug2NlxgVLCkVFt6Vw+LnuTxFJDQLo3xyYTMptmPMICBidCWWLXCSbQQGwyFAjYMjMPqhQuFSkbXjCuoRFAbV/kUljJ711FjYPI1z+DEm2mYwTCo2tGqwqDoFYcRLSdhVtRSqB3XnHVfccUVYEgKxpxKPRRzTiNva6mh1TLcIvbbtAs/w+rb2DGTQPOO4IkaZZ6wLSzyIoJEY056yJXNqhrdjEtNRWC06CKNq8PmmWSlAieX81WYYNxa2hfTKKUpTNgz+riFsPYPGPLV/FlSZWMw4FZFPgDcOGQAHClzTnoI/005hDZ2CL6NZnyFBLBpXSFXs4G8msUIcJ7h0LNnZaqv8gEx2wzHdohAaMxsx2+//fb2QqRB2YbMykg4y3gYe0tKOPm2nQmBPCdv3BUV7oQBnOAntsXfDJhX1IOhykNa7IFRsdAen6+BFTzCpIMo2xumGPZpDJ4NNh1zrA1/8Ni/QUJ+CzbrMXt4hZ9CPV+9CjppDIeoR1EcxlpA0VcjkWF2NGJhKdBoiX0Ac6qZRWMwaLGCZYWezHerrbZq7Ii6+YJzWr7hq6M+sSZswJzb4KZOaREuiUxSLYTldeEwXz2LimQw+0BwCrLmqMJEbNT+XzKnWAY/DMSR0GT4iRMnYlhjwabiVYngk933gAPKaw0Mfqq2N2HyzOqik8DoosJgnEeYXefVXK/VkAGQ5+V27CHdnsPI1UYzIUcDZamCtyrz6mmRDLDkWVpqYKP3OtWBjahdEBZseaPCPjzFM2Yk/Fv8YCIiNeWGJK1KsZWozGUMokLoaMGpBvwFoUYUw0DpVb42pZ5GmHkFG9W3KZ/BZrQcIaDC5YRVkKw5dk/ewDgstRXkE5wk0p7kZiCrmANsciHiWNqZiEKlUYmYtuJMzgUpIcMJLcaMsW0kSCwfZDMVyQT/whaFOJFmcCGhxam7mYgAauam
Y45/SkcWDxIhgR8JkcgZkhbc8gHLBI4qLOpos127wsd4mumoRstadEmXT3niXIUUOWUJq0jAKdfpZXExfvx4yIFFRrGVk/skfCRL18YoJNz0UvSqUdSiBBUxlfI65cvkR7rYTlPyFTO5hVIbspAWak211Dkw+OlyCKaVZoJQLIjG6Kcp/zDgBJineo/LVAfuAeHIb0/CJlaVXEyn2lKt02NUWW0s9TL8soHYDLKVlttQafUpqOx4Sd1GCNFwgqgKV3//+99vxc558rNv4U+7osWc0MF1+JTzJVIhI6sJXpTwYchDwjqflSiZvevldBfyfJ0wYUKtY1MZccWLINExdB2/ueSESqxZI2O1xKUo9UYkNT0kf5qFwpn56k1TTsvjabFXocQNZLEmgZjsIZ1nRgfz4T+vVYBavdbd13Thh5JwkUJ7jdXgMa/BqlHwOnkYpjDTnsOCp1Ez6W7cmRYlwNmosdDt43MaB54i8v8fRQV57wqLlCXnuP0bQ6cOwxM1FfyFh9LSvhKdZlrlWAVwGy0j52tIxP60KFUS1ZagkuWYRdWBY0kylXZTO93t96iXFakkoFfQOiHnzElQWri616RQnGBDwT8f1m7TBSHtkMfNQrexoxZgOEyJIVKvE7JI5GlV5sYbsLiiPG9qPW7cONEk3T1rRa8gLHqQyeVeYFqgsrrRXUUxkREsTByEGAwQpIZt6K+FXGMX/Gi02LH9Llx6rRHySkDbGQIWeQGHVZXpcqgvQZpqRjsM9G9kM5rhRGNjCXBj+2i3TOPAo00MfsNsDGJtXSEXnfIcWi6htyuYIQnyRgdmH77KKg5gpSavll7FgUnHyhVrBFu49i3j5NpNOE2n9XXY7hk8Zte2HtAy740U/NC61w1nZqGX4pJZtaMW3duUpvYkavA9qaZNx1afwmojWh5i1dPGsiHUN91bIW/fHiVwJDe0nJIAromfV5P/Mu2vIhwKh001E2GH6MBVir2s99qBuysbFbMM2rcZZg2cZNVoZB0TbeXAjQjtJ/G97BSwmNVWW+0HP/iB1antJROwmJ0rfmBYoQWhe7yQJJPbwdpiiy28OpwEkHNXW6lZz3Nmy85axzYOk5QSDqc4zjTzi7QP61lDUsU/LDy9AR5JpKhxOEM48Owso8eFmth0t4jaNWVS1mYcw46ousOkbiFvj4cgyDkvtafiX89Ks7m96CajzOzkUIHB7w1sQY0bN841I7f2uLcVI4/1CQZPe/5LLLGEug0Y57pSNzxuiQlJcgsAedv9qmrHNqeO1fPVHI3CUEooDkv/NSRV/NB2gLAwM6wKQpwTM61kj7/VuB06hz0TZFhSTxe4Dw48LOuZrgAAZGBWZcu365jbU2dPvNcUmjfa3zbLlYGdgbl/ixO+J8GKJtZmiSmZ51dxsjZIfHWCoovixqJ9Y3dI/TNxlipAiErq1V7qjWZaA2j1ikSrT521dx1hKzYQah+aWzn2EDkcIlgr9vrV3qfrI90Ql3HzEJj8NJ8b5ApKLT90g05zHPyTPdn2lFT90IL3urWS3+5Y9DoE4mN2m++55x5enVNZgQbPKUkXuAVmi9VPLJCRBFwCF4l05MNaIs5rnSb/1bE5Q4PW0dFAFD523ZvRzOjFhq01Jx9WuitLlpqOkdiGtSjkcTyEVJz0xmbKVX6nRz65+MH9+K09FT9b5b06ZqOlsKc7SDD+HwUk7soDCLnsqPN2MErpMqj0WAPRv3HhvWP2GKkPU+gRBkqjSKFykc0eZxiuFvABN5alMo0dzy3bcOUKR5DHUROMPWVaVyMkSb5t/mau6zKGnzG7deTQ0qzYRWsZdcKECUwhvQoVUujilrVr4QKEnze6pBH8bmXI5+qBKV0GlR5rgOsaXFsYBsJ+RI+pD5Uc5maskrjo2gAJXZDI/5SSytLeXVmS0nM9QIyAvJZIG8nJn25uaXda1vh10DKjaCDmZLhd+7Wi6foJZbf0MOOtgaUyyhUR7RtJuTJYrrPSSC3L
DTWGtYaTeNFyRGQT2LV+u81W3QY1PXzCQCkmvT4BkHV5viQMzNfW6Cf/SBMkGM8CRpBCojQOKr3UQLzLuLij4lqru4Nu6RqjrhtYF4TqViToC56clCLNE0aJAZi5k8Hz0zxRw2Wv+G1TcubS/gFAdbXcFGzQOMY1YMTtR/h/YJZFthjVGcDY5HlWbHUhDPQDBS1nxVsqo82Febud5/woZ7RpDfD3VwPCtEVQTgT6y0l76jOwAxNM9OnNrAYhYSKHOu0VKlQPBaw9ksHXsaMBntzqhHksMDljO3CPNdizeNFjuQbkmmogk9PeZIimDAylceDAQ9HSAGaggTGqgRlvF3qMKnLA1kAD/dDAwIH7ofUBzYEGuqSBgQN3SZEDNAMN9EMDAwfuh9YHNAca6JIGBg7cJUUO0Aw00A8NDBy4H1of0BxooEsaGDhwlxQ5QDPQQD80MHDgfmh9QHOggS5pYODAXVLkAM1AA/3QwP8AGMg7qICuIqsAAAAASUVORK5CYII=",
+ "text/plain": [
+ ""
+ ]
+ },
+ "execution_count": 7,
+ "metadata": {},
+ "output_type": "execute_result"
+ }
],
- "text/plain": [
- "<IPython.core.display.HTML object>"
+ "source": [
+ "dataset[2][\"image\"]"
]
- },
- "metadata": {},
- "output_type": "display_data"
- }
- ],
- "source": [
- "trainer_stats = trainer.train()"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": 17,
- "metadata": {
- "cellView": "form",
- "colab": {
- "base_uri": "https://localhost:8080/"
},
- "id": "pCqnaKmlO1U9",
- "outputId": "6e059f68-54f9-46b3-cb03-203e36f3ac11"
- },
- "outputs": [
{
- "name": "stdout",
- "output_type": "stream",
- "text": [
- "214.9762 seconds used for training.\n",
- "3.58 minutes used for training.\n",
- "Peak reserved memory = 8.213 GB.\n",
- "Peak reserved memory for training = 0.553 GB.\n",
- "Peak reserved memory % of max memory = 55.715 %.\n",
- "Peak reserved memory for training % of max memory = 3.751 %.\n"
- ]
- }
- ],
- "source": [
- "# @title Show final memory and time stats\n",
- "used_memory = round(torch.cuda.max_memory_reserved() / 1024 / 1024 / 1024, 3)\n",
- "used_memory_for_lora = round(used_memory - start_gpu_memory, 3)\n",
- "used_percentage = round(used_memory / max_memory * 100, 3)\n",
- "lora_percentage = round(used_memory_for_lora / max_memory * 100, 3)\n",
- "print(f\"{trainer_stats.metrics['train_runtime']} seconds used for training.\")\n",
- "print(\n",
- " f\"{round(trainer_stats.metrics['train_runtime']/60, 2)} minutes used for training.\"\n",
- ")\n",
- "print(f\"Peak reserved memory = {used_memory} GB.\")\n",
- "print(f\"Peak reserved memory for training = {used_memory_for_lora} GB.\")\n",
- "print(f\"Peak reserved memory % of max memory = {used_percentage} %.\")\n",
- "print(f\"Peak reserved memory for training % of max memory = {lora_percentage} %.\")"
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {
- "id": "ekOmTR1hSNcr"
- },
- "source": [
- "\n",
- "### Inference\n",
- "Let's run the model! You can change the instruction and input - leave the output blank!\n",
- "\n",
- "We use `min_p = 0.1` and `temperature = 1.5`. Read this [Tweet](https://x.com/menhguin/status/1826132708508213629) for more information on why."
- ]
- },
- {
- "cell_type": "code",
- "execution_count": 18,
- "metadata": {
- "colab": {
- "base_uri": "https://localhost:8080/"
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {
+ "colab": {
+ "base_uri": "https://localhost:8080/",
+ "height": 53
+ },
+ "id": "VTzhtzNRAEL1",
+ "outputId": "6f12a3ee-9f95-40a2-e3f8-e7af5d4e4c3c"
+ },
+ "outputs": [
+ {
+ "data": {
+ "application/vnd.google.colaboratory.intrinsic+json": {
+ "type": "string"
+ },
+ "text/plain": [
+ "'H ^ { \\\\prime } = \\\\beta N \\\\int d \\\\lambda \\\\biggl \\\\{ \\\\frac { 1 } { 2 \\\\beta ^ { 2 } N ^ { 2 } } \\\\partial _ { \\\\lambda } \\\\zeta ^ { \\\\dagger } \\\\partial _ { \\\\lambda } \\\\zeta + V ( \\\\lambda ) \\\\zeta ^ { \\\\dagger } \\\\zeta \\\\biggr \\\\} \\\\ .'"
+ ]
+ },
+ "execution_count": 8,
+ "metadata": {},
+ "output_type": "execute_result"
+ }
+ ],
+ "source": [
+ "dataset[2][\"text\"]"
+ ]
},
- "id": "kR3gIAX-SM2q",
- "outputId": "1da3fc74-e159-463c-ad2d-f90e17538fde"
- },
- "outputs": [
{
- "name": "stdout",
- "output_type": "stream",
- "text": [
- "H ^ { \\prime } = \\beta N \\int d \\lambda \\Big \\{ { \\frac { 1 } { 2 \\beta ^ { 2 } N ^ { 2 } } } \\partial _ { \\lambda } \\zeta ^ { \\dagger } \\partial _ { \\lambda } \\zeta + V ( \\lambda ) \\zeta ^ { \\dagger } \\zeta \\Big \\} \\, .<|im_end|>\n"
- ]
- }
- ],
- "source": [
- "FastVisionModel.for_inference(model) # Enable for inference!\n",
- "\n",
- "image = dataset[2][\"image\"]\n",
- "instruction = \"Write the LaTeX representation for this image.\"\n",
- "\n",
- "messages = [\n",
- " {\"role\": \"user\", \"content\": [\n",
- " {\"type\": \"image\"},\n",
- " {\"type\": \"text\", \"text\": instruction}\n",
- " ]}\n",
- "]\n",
- "input_text = tokenizer.apply_chat_template(messages, add_generation_prompt = True)\n",
- "inputs = tokenizer(\n",
- " image,\n",
- " input_text,\n",
- " add_special_tokens = False,\n",
- " return_tensors = \"pt\",\n",
- ").to(\"cuda\")\n",
- "\n",
- "from transformers import TextStreamer\n",
- "text_streamer = TextStreamer(tokenizer, skip_prompt = True)\n",
- "_ = model.generate(**inputs, streamer = text_streamer, max_new_tokens = 128,\n",
- " use_cache = True, temperature = 1.5, min_p = 0.1)"
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {
- "id": "uMuVrWbjAzhc"
- },
- "source": [
- "\n",
- "### Saving, loading finetuned models\n",
- "To save the final model as LoRA adapters, either use Huggingface's `push_to_hub` for an online save or `save_pretrained` for a local save.\n",
- "\n",
- "**[NOTE]** This ONLY saves the LoRA adapters, and not the full model. To save to 16bit or GGUF, scroll down!"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": 19,
- "metadata": {
- "colab": {
- "base_uri": "https://localhost:8080/"
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "NAeQ9LXCAEkW"
+ },
+ "source": [
+ "We can also render the LaTeX in the browser directly!"
+ ]
},
- "id": "upcOlWe7A1vc",
- "outputId": "61cf35a0-c007-45ce-f760-8f8f5d82d661"
- },
- "outputs": [
{
- "data": {
- "text/plain": [
- "[]"
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {
+ "colab": {
+ "base_uri": "https://localhost:8080/",
+ "height": 58
+ },
+ "id": "lXjfJr4W6z8P",
+ "outputId": "a00b0c12-0197-46b6-81a0-291904c77c3b"
+ },
+ "outputs": [
+ {
+ "data": {
+ "text/latex": [
+ "$\\displaystyle H ^ { \\prime } = \\beta N \\int d \\lambda \\biggl \\{ \\frac { 1 } { 2 \\beta ^ { 2 } N ^ { 2 } } \\partial _ { \\lambda } \\zeta ^ { \\dagger } \\partial _ { \\lambda } \\zeta + V ( \\lambda ) \\zeta ^ { \\dagger } \\zeta \\biggr \\} \\ .$"
+ ],
+ "text/plain": [
+ "<IPython.core.display.Math object>"
+ ]
+ },
+ "metadata": {},
+ "output_type": "display_data"
+ }
+ ],
+ "source": [
+ "from IPython.display import display, Math\n",
+ "\n",
+ "latex = dataset[2][\"text\"]\n",
+ "display(Math(latex))"
]
- },
- "execution_count": 19,
- "metadata": {},
- "output_type": "execute_result"
- }
- ],
- "source": [
- "model.save_pretrained(\"lora_model\") # Local saving\n",
- "tokenizer.save_pretrained(\"lora_model\")\n",
- "# model.push_to_hub(\"your_name/lora_model\", token = \"...\") # Online saving\n",
- "# tokenizer.push_to_hub(\"your_name/lora_model\", token = \"...\") # Online saving"
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {
- "id": "AEEcJ4qfC7Lp"
- },
- "source": [
- "Now if you want to load the LoRA adapters we just saved for inference, set `False` to `True`:"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": 20,
- "metadata": {
- "colab": {
- "base_uri": "https://localhost:8080/"
},
- "id": "MKX_XKs_BNZR",
- "outputId": "edc73b33-f7cf-4351-9c81-8a06162d59f3"
- },
- "outputs": [
{
- "name": "stdout",
- "output_type": "stream",
- "text": [
- "\\frac { N } { M } \\in { \\bf { Z } } , \\frac { M } { P } \\in { \\bf { Z } } , \\frac { P } { Q } \\in { \\bf { Z } }<|im_end|>\n"
- ]
- }
- ],
- "source": [
- "if False:\n",
- " from unsloth import FastVisionModel\n",
- " model, tokenizer = FastVisionModel.from_pretrained(\n",
- " model_name = \"lora_model\", # YOUR MODEL YOU USED FOR TRAINING\n",
- " load_in_4bit = True, # Set to False for 16bit LoRA\n",
- " )\n",
- " FastVisionModel.for_inference(model) # Enable for inference!\n",
- "\n",
- "image = dataset[0][\"image\"]\n",
- "instruction = \"Write the LaTeX representation for this image.\"\n",
- "\n",
- "messages = [\n",
- " {\"role\": \"user\", \"content\": [\n",
- " {\"type\": \"image\"},\n",
- " {\"type\": \"text\", \"text\": instruction}\n",
- " ]}\n",
- "]\n",
- "input_text = tokenizer.apply_chat_template(messages, add_generation_prompt = True)\n",
- "inputs = tokenizer(\n",
- " image,\n",
- " input_text,\n",
- " add_special_tokens = False,\n",
- " return_tensors = \"pt\",\n",
- ").to(\"cuda\")\n",
- "\n",
- "from transformers import TextStreamer\n",
- "text_streamer = TextStreamer(tokenizer, skip_prompt = True)\n",
- "_ = model.generate(**inputs, streamer = text_streamer, max_new_tokens = 128,\n",
- " use_cache = True, temperature = 1.5, min_p = 0.1)"
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {
- "id": "f422JgM9sdVT"
- },
- "source": [
- "### Saving to float16 for VLLM\n",
- "\n",
- "We also support saving to `float16` directly. Select `merged_16bit` for float16. Use `push_to_hub_merged` to upload to your Hugging Face account! You can go to https://huggingface.co/settings/tokens for your personal tokens."
- ]
- },
- {
- "cell_type": "code",
- "execution_count": 21,
- "metadata": {
- "id": "iHjt_SMYsd3P"
- },
- "outputs": [],
- "source": [
- "# Select ONLY 1 to save! (Both not needed!)\n",
- "\n",
- "# Save locally to 16bit\n",
- "if False: model.save_pretrained_merged(\"unsloth_finetune\", tokenizer,)\n",
- "\n",
- "# To export and save to your Hugging Face account\n",
- "if False: model.push_to_hub_merged(\"YOUR_USERNAME/unsloth_finetune\", tokenizer, token = \"PUT_HERE\")"
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {},
- "source": [
- "And we're done! If you have any questions on Unsloth, we have a [Discord](https://discord.gg/unsloth) channel! If you find any bugs or want to keep updated with the latest LLM stuff, or need help, join projects etc, feel free to join our Discord!\n",
- "\n",
- "Some other links:\n",
- "1. Train your own reasoning model - Llama GRPO notebook [Free Colab](https://colab.research.google.com/github/unslothai/notebooks/blob/main/nb/Llama3.1_(8B)-GRPO.ipynb)\n",
- "2. Saving finetunes to Ollama. [Free notebook](https://colab.research.google.com/github/unslothai/notebooks/blob/main/nb/Llama3_(8B)-Ollama.ipynb)\n",
- "3. Llama 3.2 Vision finetuning - Radiography use case. [Free Colab](https://colab.research.google.com/github/unslothai/notebooks/blob/main/nb/Llama3.2_(11B)-Vision.ipynb)\n",
- "6. See notebooks for DPO, ORPO, Continued pretraining, conversational finetuning and more on our [documentation](https://docs.unsloth.ai/get-started/unsloth-notebooks)!\n",
- "\n",
- "\n",
- "Join Discord if you need help + \u2b50\ufe0f Star us on Github \u2b50\ufe0f\n",
- "\n",
- "This notebook and all Unsloth notebooks are licensed [LGPL-3.0](https://github.com/unslothai/notebooks?tab=LGPL-3.0-1-ov-file#readme)\n"
- ]
- }
- ],
- "metadata": {
- "accelerator": "GPU",
- "colab": {
- "gpuType": "T4",
- "provenance": []
- },
- "kernelspec": {
- "display_name": "unsloth_env",
- "language": "python",
- "name": "python3"
- },
- "language_info": {
- "name": "python",
- "version": "3.11.11"
- },
- "widgets": {
- "application/vnd.jupyter.widget-state+json": {
- "046156a2fad9435b84a19ff3163e7e4a": {
- "model_module": "@jupyter-widgets/controls",
- "model_module_version": "1.5.0",
- "model_name": "DescriptionStyleModel",
- "state": {
- "_model_module": "@jupyter-widgets/controls",
- "_model_module_version": "1.5.0",
- "_model_name": "DescriptionStyleModel",
- "_view_count": null,
- "_view_module": "@jupyter-widgets/base",
- "_view_module_version": "1.2.0",
- "_view_name": "StyleView",
- "description_width": ""
- }
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "K9CBpiISFa6C"
+ },
+ "source": [
+ "To prepare the dataset, format every vision finetuning example as follows:\n",
+ "\n",
+ "```python\n",
+ "[\n",
+ "{ \"role\": \"user\",\n",
+ " \"content\": [{\"type\": \"text\", \"text\": Q}, {\"type\": \"image\", \"image\": image} ]\n",
+ "},\n",
+ "{ \"role\": \"assistant\",\n",
+ " \"content\": [{\"type\": \"text\", \"text\": A} ]\n",
+ "},\n",
+ "]\n",
+ "```"
+ ]
},
- "087abc8baba848babf93b8f29e5a2bcf": {
- "model_module": "@jupyter-widgets/base",
- "model_module_version": "1.2.0",
- "model_name": "LayoutModel",
- "state": {
- "_model_module": "@jupyter-widgets/base",
- "_model_module_version": "1.2.0",
- "_model_name": "LayoutModel",
- "_view_count": null,
- "_view_module": "@jupyter-widgets/base",
- "_view_module_version": "1.2.0",
- "_view_name": "LayoutView",
- "align_content": null,
- "align_items": null,
- "align_self": null,
- "border": null,
- "bottom": null,
- "display": null,
- "flex": null,
- "flex_flow": null,
- "grid_area": null,
- "grid_auto_columns": null,
- "grid_auto_flow": null,
- "grid_auto_rows": null,
- "grid_column": null,
- "grid_gap": null,
- "grid_row": null,
- "grid_template_areas": null,
- "grid_template_columns": null,
- "grid_template_rows": null,
- "height": null,
- "justify_content": null,
- "justify_items": null,
- "left": null,
- "margin": null,
- "max_height": null,
- "max_width": null,
- "min_height": null,
- "min_width": null,
- "object_fit": null,
- "object_position": null,
- "order": null,
- "overflow": null,
- "overflow_x": null,
- "overflow_y": null,
- "padding": null,
- "right": null,
- "top": null,
- "visibility": null,
- "width": null
- }
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {
+ "id": "oPXzJZzHEgXe"
+ },
+ "outputs": [],
+ "source": [
+ "instruction = \"Write the LaTeX representation for this image.\"\n",
+ "\n",
+ "def convert_to_conversation(sample):\n",
+ " conversation = [\n",
+ " { \"role\": \"user\",\n",
+ " \"content\" : [\n",
+ " {\"type\" : \"text\", \"text\" : instruction},\n",
+ " {\"type\" : \"image\", \"image\" : sample[\"image\"]} ]\n",
+ " },\n",
+ " { \"role\" : \"assistant\",\n",
+ " \"content\" : [\n",
+ " {\"type\" : \"text\", \"text\" : sample[\"text\"]} ]\n",
+ " },\n",
+ " ]\n",
+ " return { \"messages\" : conversation }\n",
+ "pass"
+ ]
},
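+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "As a quick sanity check on the format above, the sketch below verifies that a converted sample has the expected roles and content types. `check_conversation` is a hypothetical helper written for illustration (it is not part of Unsloth), and the stand-in sample uses a placeholder string where the real dataset holds a PIL image.\n",
+ "\n",
+ "```python\n",
+ "def check_conversation(sample):\n",
+ "    # Hypothetical helper (not an Unsloth API): validate one converted sample.\n",
+ "    messages = sample['messages']\n",
+ "    # Exactly one user turn followed by one assistant turn.\n",
+ "    assert [m['role'] for m in messages] == ['user', 'assistant']\n",
+ "    # The user turn carries the instruction text and the image.\n",
+ "    user_types = {part['type'] for part in messages[0]['content']}\n",
+ "    assert user_types == {'text', 'image'}, user_types\n",
+ "    # The assistant turn carries only the target text.\n",
+ "    assistant_types = [part['type'] for part in messages[1]['content']]\n",
+ "    assert assistant_types == ['text'], assistant_types\n",
+ "    return True\n",
+ "\n",
+ "# Stand-in sample; the real dataset stores a PIL image under 'image'.\n",
+ "fake_sample = {'messages': [\n",
+ "    {'role': 'user', 'content': [\n",
+ "        {'type': 'text', 'text': 'Write the LaTeX representation for this image.'},\n",
+ "        {'type': 'image', 'image': 'placeholder'}]},\n",
+ "    {'role': 'assistant', 'content': [{'type': 'text', 'text': 'x ^ { 2 }'}]},\n",
+ "]}\n",
+ "print(check_conversation(fake_sample))\n",
+ "```\n",
+ "\n",
+ "On the real data, something like `check_conversation(converted_dataset[0])` could be run before training starts."
+ ]
+ },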
- "196ef60df0cd417f934f6e2304c3f180": {
- "model_module": "@jupyter-widgets/controls",
- "model_module_version": "1.5.0",
- "model_name": "HTMLModel",
- "state": {
- "_dom_classes": [],
- "_model_module": "@jupyter-widgets/controls",
- "_model_module_version": "1.5.0",
- "_model_name": "HTMLModel",
- "_view_count": null,
- "_view_module": "@jupyter-widgets/controls",
- "_view_module_version": "1.5.0",
- "_view_name": "HTMLView",
- "description": "",
- "description_tooltip": null,
- "layout": "IPY_MODEL_cfb40ab3aa994fc9b3088d79ed5c4a26",
- "placeholder": "\u200b",
- "style": "IPY_MODEL_046156a2fad9435b84a19ff3163e7e4a",
- "value": "Generating\u2007test\u2007split:\u2007100%"
- }
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "FY-9u-OD6_gE"
+ },
+ "source": [
+ "Let's convert the dataset into the \"correct\" format for finetuning:"
+ ]
},
- "1a74e168c9554fac9978a4736dbcdb11": {
- "model_module": "@jupyter-widgets/controls",
- "model_module_version": "1.5.0",
- "model_name": "DescriptionStyleModel",
- "state": {
- "_model_module": "@jupyter-widgets/controls",
- "_model_module_version": "1.5.0",
- "_model_name": "DescriptionStyleModel",
- "_view_count": null,
- "_view_module": "@jupyter-widgets/base",
- "_view_module_version": "1.2.0",
- "_view_name": "StyleView",
- "description_width": ""
- }
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {
+ "id": "gFW2qXIr7Ezy"
+ },
+ "outputs": [],
+ "source": [
+ "converted_dataset = [convert_to_conversation(sample) for sample in dataset]"
+ ]
},
- "1bc12b432fe5444c8e854153cf84120d": {
- "model_module": "@jupyter-widgets/controls",
- "model_module_version": "1.5.0",
- "model_name": "DescriptionStyleModel",
- "state": {
- "_model_module": "@jupyter-widgets/controls",
- "_model_module_version": "1.5.0",
- "_model_name": "DescriptionStyleModel",
- "_view_count": null,
- "_view_module": "@jupyter-widgets/base",
- "_view_module_version": "1.2.0",
- "_view_name": "StyleView",
- "description_width": ""
- }
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "ndDUB23CGAC5"
+ },
+ "source": [
+ "We look at how the conversations are structured for the first example:"
+ ]
},
- "1d7dc3fd42e04783b7e912161ec5f7c2": {
- "model_module": "@jupyter-widgets/controls",
- "model_module_version": "1.5.0",
- "model_name": "HBoxModel",
- "state": {
- "_dom_classes": [],
- "_model_module": "@jupyter-widgets/controls",
- "_model_module_version": "1.5.0",
- "_model_name": "HBoxModel",
- "_view_count": null,
- "_view_module": "@jupyter-widgets/controls",
- "_view_module_version": "1.5.0",
- "_view_name": "HBoxView",
- "box_style": "",
- "children": [
- "IPY_MODEL_2ed0f7a88aab4e7f9e7c3216bea48b90",
- "IPY_MODEL_48276a93bae44504aa0a485b207051d0",
- "IPY_MODEL_8466623dbeb442468367747c2b187cdd"
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {
+ "colab": {
+ "base_uri": "https://localhost:8080/"
+ },
+ "id": "gGFzmplrEy9I",
+ "outputId": "09ea7f3d-bca0-4faa-86ff-19db880e2d90"
+ },
+ "outputs": [
+ {
+ "data": {
+ "text/plain": [
+ "{'messages': [{'role': 'user',\n",
+ " 'content': [{'type': 'text',\n",
+ " 'text': 'Write the LaTeX representation for this image.'},\n",
+ " {'type': 'image',\n",
+ " 'image': }]},\n",
+ " {'role': 'assistant',\n",
+ " 'content': [{'type': 'text',\n",
+ " 'text': '{ \\\\frac { N } { M } } \\\\in { \\\\bf Z } , { \\\\frac { M } { P } } \\\\in { \\\\bf Z } , { \\\\frac { P } { Q } } \\\\in { \\\\bf Z }'}]}]}"
+ ]
+ },
+ "execution_count": 12,
+ "metadata": {},
+ "output_type": "execute_result"
+ }
],
- "layout": "IPY_MODEL_2b60182eb6724c0fa3d241a39854cd1e"
- }
+ "source": [
+ "converted_dataset[0]"
+ ]
},
- "23ec6113e4014445baf3dcc4dd6c4e17": {
- "model_module": "@jupyter-widgets/base",
- "model_module_version": "1.2.0",
- "model_name": "LayoutModel",
- "state": {
- "_model_module": "@jupyter-widgets/base",
- "_model_module_version": "1.2.0",
- "_model_name": "LayoutModel",
- "_view_count": null,
- "_view_module": "@jupyter-widgets/base",
- "_view_module_version": "1.2.0",
- "_view_name": "LayoutView",
- "align_content": null,
- "align_items": null,
- "align_self": null,
- "border": null,
- "bottom": null,
- "display": null,
- "flex": null,
- "flex_flow": null,
- "grid_area": null,
- "grid_auto_columns": null,
- "grid_auto_flow": null,
- "grid_auto_rows": null,
- "grid_column": null,
- "grid_gap": null,
- "grid_row": null,
- "grid_template_areas": null,
- "grid_template_columns": null,
- "grid_template_rows": null,
- "height": null,
- "justify_content": null,
- "justify_items": null,
- "left": null,
- "margin": null,
- "max_height": null,
- "max_width": null,
- "min_height": null,
- "min_width": null,
- "object_fit": null,
- "object_position": null,
- "order": null,
- "overflow": null,
- "overflow_x": null,
- "overflow_y": null,
- "padding": null,
- "right": null,
- "top": null,
- "visibility": null,
- "width": null
- }
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "FecKS-dA82f5"
+ },
+ "source": [
+ "Before we do any finetuning, let's see what the model outputs for one of the examples!"
+ ]
},
- "27456772617447dc80c58eb292c185c8": {
- "model_module": "@jupyter-widgets/controls",
- "model_module_version": "1.5.0",
- "model_name": "HBoxModel",
- "state": {
- "_dom_classes": [],
- "_model_module": "@jupyter-widgets/controls",
- "_model_module_version": "1.5.0",
- "_model_name": "HBoxModel",
- "_view_count": null,
- "_view_module": "@jupyter-widgets/controls",
- "_view_module_version": "1.5.0",
- "_view_name": "HBoxView",
- "box_style": "",
- "children": [
- "IPY_MODEL_196ef60df0cd417f934f6e2304c3f180",
- "IPY_MODEL_9113790223204b179dfeff0623f0136d",
- "IPY_MODEL_d84c3a99025742ae939a2472f36852fa"
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {
+ "colab": {
+ "base_uri": "https://localhost:8080/"
+ },
+ "id": "vcat4UxA81vr",
+ "outputId": "de67935b-b273-4432-b42a-0eb31dceb4fe"
+ },
+ "outputs": [
+ {
+ "name": "stdout",
+ "output_type": "stream",
+ "text": [
+ " $$H ^ { \\prime } = \\beta N \\int d \\lambda \\left\\{ \\frac { 1 } { 2 \\beta ^ { 2 } N ^ { 2 } } \\partial _ { \\lambda } \\zeta ^ { \\dagger } \\partial _ { \\lambda } \\zeta + V ( \\lambda ) \\zeta ^ { \\dagger } \\zeta \\right\\} .$$<|im_end|>\n"
+ ]
+ }
],
- "layout": "IPY_MODEL_daed4150f70b4cd1b913f26a732b13c5"
- }
- },
- "28b385c6e67d4c05a4277889c9f5c0c4": {
- "model_module": "@jupyter-widgets/base",
- "model_module_version": "1.2.0",
- "model_name": "LayoutModel",
- "state": {
- "_model_module": "@jupyter-widgets/base",
- "_model_module_version": "1.2.0",
- "_model_name": "LayoutModel",
- "_view_count": null,
- "_view_module": "@jupyter-widgets/base",
- "_view_module_version": "1.2.0",
- "_view_name": "LayoutView",
- "align_content": null,
- "align_items": null,
- "align_self": null,
- "border": null,
- "bottom": null,
- "display": null,
- "flex": null,
- "flex_flow": null,
- "grid_area": null,
- "grid_auto_columns": null,
- "grid_auto_flow": null,
- "grid_auto_rows": null,
- "grid_column": null,
- "grid_gap": null,
- "grid_row": null,
- "grid_template_areas": null,
- "grid_template_columns": null,
- "grid_template_rows": null,
- "height": null,
- "justify_content": null,
- "justify_items": null,
- "left": null,
- "margin": null,
- "max_height": null,
- "max_width": null,
- "min_height": null,
- "min_width": null,
- "object_fit": null,
- "object_position": null,
- "order": null,
- "overflow": null,
- "overflow_x": null,
- "overflow_y": null,
- "padding": null,
- "right": null,
- "top": null,
- "visibility": null,
- "width": null
- }
- },
- "2b60182eb6724c0fa3d241a39854cd1e": {
- "model_module": "@jupyter-widgets/base",
- "model_module_version": "1.2.0",
- "model_name": "LayoutModel",
- "state": {
- "_model_module": "@jupyter-widgets/base",
- "_model_module_version": "1.2.0",
- "_model_name": "LayoutModel",
- "_view_count": null,
- "_view_module": "@jupyter-widgets/base",
- "_view_module_version": "1.2.0",
- "_view_name": "LayoutView",
- "align_content": null,
- "align_items": null,
- "align_self": null,
- "border": null,
- "bottom": null,
- "display": null,
- "flex": null,
- "flex_flow": null,
- "grid_area": null,
- "grid_auto_columns": null,
- "grid_auto_flow": null,
- "grid_auto_rows": null,
- "grid_column": null,
- "grid_gap": null,
- "grid_row": null,
- "grid_template_areas": null,
- "grid_template_columns": null,
- "grid_template_rows": null,
- "height": null,
- "justify_content": null,
- "justify_items": null,
- "left": null,
- "margin": null,
- "max_height": null,
- "max_width": null,
- "min_height": null,
- "min_width": null,
- "object_fit": null,
- "object_position": null,
- "order": null,
- "overflow": null,
- "overflow_x": null,
- "overflow_y": null,
- "padding": null,
- "right": null,
- "top": null,
- "visibility": null,
- "width": null
- }
- },
- "2ed0f7a88aab4e7f9e7c3216bea48b90": {
- "model_module": "@jupyter-widgets/controls",
- "model_module_version": "1.5.0",
- "model_name": "HTMLModel",
- "state": {
- "_dom_classes": [],
- "_model_module": "@jupyter-widgets/controls",
- "_model_module_version": "1.5.0",
- "_model_name": "HTMLModel",
- "_view_count": null,
- "_view_module": "@jupyter-widgets/controls",
- "_view_module_version": "1.5.0",
- "_view_name": "HTMLView",
- "description": "",
- "description_tooltip": null,
- "layout": "IPY_MODEL_28b385c6e67d4c05a4277889c9f5c0c4",
- "placeholder": "\u200b",
- "style": "IPY_MODEL_1bc12b432fe5444c8e854153cf84120d",
- "value": "Generating\u2007train\u2007split:\u2007100%"
- }
- },
- "3374e62bec23487098102668768cc9ef": {
- "model_module": "@jupyter-widgets/base",
- "model_module_version": "1.2.0",
- "model_name": "LayoutModel",
- "state": {
- "_model_module": "@jupyter-widgets/base",
- "_model_module_version": "1.2.0",
- "_model_name": "LayoutModel",
- "_view_count": null,
- "_view_module": "@jupyter-widgets/base",
- "_view_module_version": "1.2.0",
- "_view_name": "LayoutView",
- "align_content": null,
- "align_items": null,
- "align_self": null,
- "border": null,
- "bottom": null,
- "display": null,
- "flex": null,
- "flex_flow": null,
- "grid_area": null,
- "grid_auto_columns": null,
- "grid_auto_flow": null,
- "grid_auto_rows": null,
- "grid_column": null,
- "grid_gap": null,
- "grid_row": null,
- "grid_template_areas": null,
- "grid_template_columns": null,
- "grid_template_rows": null,
- "height": null,
- "justify_content": null,
- "justify_items": null,
- "left": null,
- "margin": null,
- "max_height": null,
- "max_width": null,
- "min_height": null,
- "min_width": null,
- "object_fit": null,
- "object_position": null,
- "order": null,
- "overflow": null,
- "overflow_x": null,
- "overflow_y": null,
- "padding": null,
- "right": null,
- "top": null,
- "visibility": null,
- "width": null
- }
- },
- "41ba45cbf8a74dbebe9b62f5b321d315": {
- "model_module": "@jupyter-widgets/base",
- "model_module_version": "1.2.0",
- "model_name": "LayoutModel",
- "state": {
- "_model_module": "@jupyter-widgets/base",
- "_model_module_version": "1.2.0",
- "_model_name": "LayoutModel",
- "_view_count": null,
- "_view_module": "@jupyter-widgets/base",
- "_view_module_version": "1.2.0",
- "_view_name": "LayoutView",
- "align_content": null,
- "align_items": null,
- "align_self": null,
- "border": null,
- "bottom": null,
- "display": null,
- "flex": null,
- "flex_flow": null,
- "grid_area": null,
- "grid_auto_columns": null,
- "grid_auto_flow": null,
- "grid_auto_rows": null,
- "grid_column": null,
- "grid_gap": null,
- "grid_row": null,
- "grid_template_areas": null,
- "grid_template_columns": null,
- "grid_template_rows": null,
- "height": null,
- "justify_content": null,
- "justify_items": null,
- "left": null,
- "margin": null,
- "max_height": null,
- "max_width": null,
- "min_height": null,
- "min_width": null,
- "object_fit": null,
- "object_position": null,
- "order": null,
- "overflow": null,
- "overflow_x": null,
- "overflow_y": null,
- "padding": null,
- "right": null,
- "top": null,
- "visibility": null,
- "width": null
- }
- },
- "48276a93bae44504aa0a485b207051d0": {
- "model_module": "@jupyter-widgets/controls",
- "model_module_version": "1.5.0",
- "model_name": "FloatProgressModel",
- "state": {
- "_dom_classes": [],
- "_model_module": "@jupyter-widgets/controls",
- "_model_module_version": "1.5.0",
- "_model_name": "FloatProgressModel",
- "_view_count": null,
- "_view_module": "@jupyter-widgets/controls",
- "_view_module_version": "1.5.0",
- "_view_name": "ProgressView",
- "bar_style": "success",
- "description": "",
- "description_tooltip": null,
- "layout": "IPY_MODEL_3374e62bec23487098102668768cc9ef",
- "max": 68686,
- "min": 0,
- "orientation": "horizontal",
- "style": "IPY_MODEL_ce09826095904bc18c8eaaff388a216d",
- "value": 68686
- }
- },
- "5a0a4e972dfd49a0a87b274ec3fd97e2": {
- "model_module": "@jupyter-widgets/controls",
- "model_module_version": "1.5.0",
- "model_name": "HTMLModel",
- "state": {
- "_dom_classes": [],
- "_model_module": "@jupyter-widgets/controls",
- "_model_module_version": "1.5.0",
- "_model_name": "HTMLModel",
- "_view_count": null,
- "_view_module": "@jupyter-widgets/controls",
- "_view_module_version": "1.5.0",
- "_view_name": "HTMLView",
- "description": "",
- "description_tooltip": null,
- "layout": "IPY_MODEL_dff51ffe23724245a70b220217d5b891",
- "placeholder": "\u200b",
- "style": "IPY_MODEL_d29e7cfee3fe449e82569927b355dab0",
- "value": "Loading\u2007checkpoint\u2007shards:\u2007100%"
- }
+ "source": [
+ "FastVisionModel.for_inference(model) # Enable for inference!\n",
+ "\n",
+ "image = dataset[2][\"image\"]\n",
+ "instruction = \"Write the LaTeX representation for this image.\"\n",
+ "\n",
+ "messages = [\n",
+ " {\"role\": \"user\", \"content\": [\n",
+ " {\"type\": \"image\"},\n",
+ " {\"type\": \"text\", \"text\": instruction}\n",
+ " ]}\n",
+ "]\n",
+ "input_text = tokenizer.apply_chat_template(messages, add_generation_prompt = True)\n",
+ "inputs = tokenizer(\n",
+ " image,\n",
+ " input_text,\n",
+ " add_special_tokens = False,\n",
+ " return_tensors = \"pt\",\n",
+ ").to(\"cuda\")\n",
+ "\n",
+ "from transformers import TextStreamer\n",
+ "text_streamer = TextStreamer(tokenizer, skip_prompt = True)\n",
+ "_ = model.generate(**inputs, streamer = text_streamer, max_new_tokens = 128,\n",
+ " use_cache = True, temperature = 1.5, min_p = 0.1)"
+ ]
},
- "695b0f6ecfa143efa7c9c4e22cc33b47": {
- "model_module": "@jupyter-widgets/controls",
- "model_module_version": "1.5.0",
- "model_name": "DescriptionStyleModel",
- "state": {
- "_model_module": "@jupyter-widgets/controls",
- "_model_module_version": "1.5.0",
- "_model_name": "DescriptionStyleModel",
- "_view_count": null,
- "_view_module": "@jupyter-widgets/base",
- "_view_module_version": "1.2.0",
- "_view_name": "StyleView",
- "description_width": ""
- }
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "idAEIeSQ3xdS"
+ },
+ "source": [
+ "\n",
+ "### Train the model\n",
+ "Now let's train our model. We do 30 steps to speed things up, but for a full run you can set `num_train_epochs=1` and turn off the step limit with `max_steps=None`. We also support TRL's `DPOTrainer`!\n",
+ "\n",
+ "We use our new `UnslothVisionDataCollator`, which is required for our vision finetuning setup."
+ ]
},
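+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "To see what the step limit means in practice, here is a small worked calculation using the values from this notebook's config and training log (batch size 2, gradient accumulation 4, one GPU, 68,686 examples, 30 steps). The variable names are only for illustration, and the steps-per-epoch figure is an approximation since the trainer's exact rounding may differ.\n",
+ "\n",
+ "```python\n",
+ "import math\n",
+ "\n",
+ "per_device_batch_size = 2   # per_device_train_batch_size\n",
+ "grad_accum_steps = 4        # gradient_accumulation_steps\n",
+ "num_gpus = 1                # single GPU, as in the training log\n",
+ "num_examples = 68_686       # size of the train split\n",
+ "max_steps = 30\n",
+ "\n",
+ "# Examples consumed per optimizer step.\n",
+ "effective_batch = per_device_batch_size * grad_accum_steps * num_gpus\n",
+ "# Examples seen during the short demo run.\n",
+ "examples_seen = effective_batch * max_steps\n",
+ "# Approximate optimizer steps needed to cover the dataset once.\n",
+ "approx_steps_per_epoch = math.ceil(num_examples / effective_batch)\n",
+ "\n",
+ "print(effective_batch)          # 8\n",
+ "print(examples_seen)            # 240\n",
+ "print(approx_steps_per_epoch)   # 8586\n",
+ "print(f'{examples_seen / num_examples:.2%} of one epoch')\n",
+ "```\n",
+ "\n",
+ "So the demo run only touches a small fraction of the data, which is why a full epoch takes far longer."
+ ]
+ },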
- "696d7cf5fbb64bc484d7de8bc66a4062": {
- "model_module": "@jupyter-widgets/controls",
- "model_module_version": "1.5.0",
- "model_name": "ProgressStyleModel",
- "state": {
- "_model_module": "@jupyter-widgets/controls",
- "_model_module_version": "1.5.0",
- "_model_name": "ProgressStyleModel",
- "_view_count": null,
- "_view_module": "@jupyter-widgets/base",
- "_view_module_version": "1.2.0",
- "_view_name": "StyleView",
- "bar_color": null,
- "description_width": ""
- }
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {
+ "colab": {
+ "base_uri": "https://localhost:8080/"
+ },
+ "id": "95_Nn-89DhsL",
+ "outputId": "c4a76c64-06c7-4aa8-f2fd-038653804e6f"
+ },
+ "outputs": [
+ {
+ "name": "stdout",
+ "output_type": "stream",
+ "text": [
+ "Unsloth: Model does not have a default image size - using 512\n"
+ ]
+ }
+ ],
+ "source": [
+ "from unsloth.trainer import UnslothVisionDataCollator\n",
+ "from trl import SFTTrainer, SFTConfig\n",
+ "\n",
+ "FastVisionModel.for_training(model) # Enable for training!\n",
+ "\n",
+ "trainer = SFTTrainer(\n",
+ " model = model,\n",
+ " tokenizer = tokenizer,\n",
+ " data_collator = UnslothVisionDataCollator(model, tokenizer), # Must use!\n",
+ " train_dataset = converted_dataset,\n",
+ " args = SFTConfig(\n",
+ " per_device_train_batch_size = 2,\n",
+ " gradient_accumulation_steps = 4,\n",
+ " warmup_steps = 5,\n",
+ " max_steps = 30,\n",
+ " # num_train_epochs = 1, # Set this instead of max_steps for full training runs\n",
+ " learning_rate = 2e-4,\n",
+ " logging_steps = 1,\n",
+ " optim = \"adamw_8bit\",\n",
+ " weight_decay = 0.001,\n",
+ " lr_scheduler_type = \"linear\",\n",
+ " seed = 3407,\n",
+ " output_dir = \"outputs\",\n",
+ " report_to = \"none\", # For Weights and Biases\n",
+ "\n",
+ " # You MUST put the below items for vision finetuning:\n",
+ " remove_unused_columns = False,\n",
+ " dataset_text_field = \"\",\n",
+ " dataset_kwargs = {\"skip_prepare_dataset\": True},\n",
+ " max_length = 2048,\n",
+ " ),\n",
+ ")"
+ ]
},
- "751d396294e74f9490817610756c4a01": {
- "model_module": "@jupyter-widgets/controls",
- "model_module_version": "1.5.0",
- "model_name": "ProgressStyleModel",
- "state": {
- "_model_module": "@jupyter-widgets/controls",
- "_model_module_version": "1.5.0",
- "_model_name": "ProgressStyleModel",
- "_view_count": null,
- "_view_module": "@jupyter-widgets/base",
- "_view_module_version": "1.2.0",
- "_view_name": "StyleView",
- "bar_color": null,
- "description_width": ""
- }
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {
+ "cellView": "form",
+ "colab": {
+ "base_uri": "https://localhost:8080/"
+ },
+ "id": "2ejIt2xSNKKp",
+ "outputId": "e4592e9e-a840-4079-b91e-08f919fc5c80"
+ },
+ "outputs": [
+ {
+ "name": "stdout",
+ "output_type": "stream",
+ "text": [
+ "GPU = Tesla T4. Max memory = 14.741 GB.\n",
+ "7.66 GB of memory reserved.\n"
+ ]
+ }
+ ],
+ "source": [
+ "# @title Show current memory stats\n",
+ "gpu_stats = torch.cuda.get_device_properties(0)\n",
+ "start_gpu_memory = round(torch.cuda.max_memory_reserved() / 1024 / 1024 / 1024, 3)\n",
+ "max_memory = round(gpu_stats.total_memory / 1024 / 1024 / 1024, 3)\n",
+ "print(f\"GPU = {gpu_stats.name}. Max memory = {max_memory} GB.\")\n",
+ "print(f\"{start_gpu_memory} GB of memory reserved.\")"
+ ]
},
- "8466623dbeb442468367747c2b187cdd": {
- "model_module": "@jupyter-widgets/controls",
- "model_module_version": "1.5.0",
- "model_name": "HTMLModel",
- "state": {
- "_dom_classes": [],
- "_model_module": "@jupyter-widgets/controls",
- "_model_module_version": "1.5.0",
- "_model_name": "HTMLModel",
- "_view_count": null,
- "_view_module": "@jupyter-widgets/controls",
- "_view_module_version": "1.5.0",
- "_view_name": "HTMLView",
- "description": "",
- "description_tooltip": null,
- "layout": "IPY_MODEL_9947797e53bf41c9a93c9b94c877c863",
- "placeholder": "\u200b",
- "style": "IPY_MODEL_de403b0aaea3409996b04e0c826bc71a",
- "value": "\u200768686/68686\u2007[00:01<00:00,\u200748807.63\u2007examples/s]"
- }
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {
+ "colab": {
+ "base_uri": "https://localhost:8080/",
+ "height": 1000
+ },
+ "id": "yqxqAZ7KJ4oL",
+ "outputId": "dca00bba-4035-4133-cb34-9fadfb0b011a"
+ },
+ "outputs": [
+ {
+ "name": "stderr",
+ "output_type": "stream",
+ "text": [
+ "The tokenizer has new PAD/BOS/EOS tokens that differ from the model config and generation config. The model config and generation config were aligned accordingly, being updated with the tokenizer's values. Updated tokens: {'bos_token_id': None}.\n",
+ "The model is already on multiple devices. Skipping the move to device specified in `args`.\n",
+ "==((====))== Unsloth - 2x faster free finetuning | Num GPUs used = 1\n",
+ " \\\\ /| Num examples = 68,686 | Num Epochs = 1 | Total steps = 30\n",
+ "O^O/ \\_/ \\ Batch size per device = 2 | Gradient accumulation steps = 4\n",
+ "\\ / Data Parallel GPUs = 1 | Total batch size (2 x 4 x 1) = 8\n",
+ " \"-____-\" Trainable parameters = 51,346,944 of 8,818,470,640 (0.58% trained)\n"
+ ]
+ },
+ {
+ "name": "stdout",
+ "output_type": "stream",
+ "text": [
+ "Unsloth: Will smartly offload gradients to save VRAM!\n"
+ ]
+ },
+ {
+ "data": {
+ "text/html": [
+ "\n",
+ "[30/30 02:52, Epoch 0/1]\n",
+ "\n",
+ "| Step | Training Loss |\n",
+ "|------|---------------|\n",
+ "| 1 | 0.313100 |\n",
+ "| 2 | 0.373600 |\n",
+ "| 3 | 0.359900 |\n",
+ "| 4 | 0.265600 |\n",
+ "| 5 | 0.220300 |\n",
+ "| 6 | 0.228400 |\n",
+ "| 7 | 0.172900 |\n",
+ "| 8 | 0.115100 |\n",
+ "| 9 | 0.065900 |\n",
+ "| 10 | 0.070000 |\n",
+ "| 11 | 0.054500 |\n",
+ "| 12 | 0.056300 |\n",
+ "| 13 | 0.043800 |\n",
+ "| 14 | 0.043600 |\n",
+ "| 15 | 0.037000 |\n",
+ "| 16 | 0.035200 |\n",
+ "| 17 | 0.027400 |\n",
+ "| 18 | 0.023400 |\n",
+ "| 19 | 0.017700 |\n",
+ "| 20 | 0.027400 |\n",
+ "| 21 | 0.014100 |\n",
+ "| 22 | 0.023300 |\n",
+ "| 23 | 0.007900 |\n",
+ "| 24 | 0.013500 |\n",
+ "| 25 | 0.026900 |\n",
+ "| 26 | 0.022000 |\n",
+ "| 27 | 0.010300 |\n",
+ "| 28 | 0.051500 |\n",
+ "| 29 | 0.014600 |\n",
+ "| 30 | 0.031700 |"
+ ],
+ "text/plain": [
+ ""
+ ]
+ },
+ "metadata": {},
+ "output_type": "display_data"
+ }
+ ],
+ "source": [
+ "trainer_stats = trainer.train()"
+ ]
},
- "9113790223204b179dfeff0623f0136d": {
- "model_module": "@jupyter-widgets/controls",
- "model_module_version": "1.5.0",
- "model_name": "FloatProgressModel",
- "state": {
- "_dom_classes": [],
- "_model_module": "@jupyter-widgets/controls",
- "_model_module_version": "1.5.0",
- "_model_name": "FloatProgressModel",
- "_view_count": null,
- "_view_module": "@jupyter-widgets/controls",
- "_view_module_version": "1.5.0",
- "_view_name": "ProgressView",
- "bar_style": "success",
- "description": "",
- "description_tooltip": null,
- "layout": "IPY_MODEL_23ec6113e4014445baf3dcc4dd6c4e17",
- "max": 7632,
- "min": 0,
- "orientation": "horizontal",
- "style": "IPY_MODEL_696d7cf5fbb64bc484d7de8bc66a4062",
- "value": 7632
- }
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {
+ "cellView": "form",
+ "colab": {
+ "base_uri": "https://localhost:8080/"
+ },
+ "id": "pCqnaKmlO1U9",
+ "outputId": "6e059f68-54f9-46b3-cb03-203e36f3ac11"
+ },
+ "outputs": [
+ {
+ "name": "stdout",
+ "output_type": "stream",
+ "text": [
+ "214.9762 seconds used for training.\n",
+ "3.58 minutes used for training.\n",
+ "Peak reserved memory = 8.213 GB.\n",
+ "Peak reserved memory for training = 0.553 GB.\n",
+ "Peak reserved memory % of max memory = 55.715 %.\n",
+ "Peak reserved memory for training % of max memory = 3.751 %.\n"
+ ]
+ }
+ ],
+ "source": [
+ "# @title Show final memory and time stats\n",
+ "used_memory = round(torch.cuda.max_memory_reserved() / 1024 / 1024 / 1024, 3)\n",
+ "used_memory_for_lora = round(used_memory - start_gpu_memory, 3)\n",
+ "used_percentage = round(used_memory / max_memory * 100, 3)\n",
+ "lora_percentage = round(used_memory_for_lora / max_memory * 100, 3)\n",
+ "print(f\"{trainer_stats.metrics['train_runtime']} seconds used for training.\")\n",
+ "print(\n",
+ " f\"{round(trainer_stats.metrics['train_runtime']/60, 2)} minutes used for training.\"\n",
+ ")\n",
+ "print(f\"Peak reserved memory = {used_memory} GB.\")\n",
+ "print(f\"Peak reserved memory for training = {used_memory_for_lora} GB.\")\n",
+ "print(f\"Peak reserved memory % of max memory = {used_percentage} %.\")\n",
+ "print(f\"Peak reserved memory for training % of max memory = {lora_percentage} %.\")"
+ ]
},
- "9947797e53bf41c9a93c9b94c877c863": {
- "model_module": "@jupyter-widgets/base",
- "model_module_version": "1.2.0",
- "model_name": "LayoutModel",
- "state": {
- "_model_module": "@jupyter-widgets/base",
- "_model_module_version": "1.2.0",
- "_model_name": "LayoutModel",
- "_view_count": null,
- "_view_module": "@jupyter-widgets/base",
- "_view_module_version": "1.2.0",
- "_view_name": "LayoutView",
- "align_content": null,
- "align_items": null,
- "align_self": null,
- "border": null,
- "bottom": null,
- "display": null,
- "flex": null,
- "flex_flow": null,
- "grid_area": null,
- "grid_auto_columns": null,
- "grid_auto_flow": null,
- "grid_auto_rows": null,
- "grid_column": null,
- "grid_gap": null,
- "grid_row": null,
- "grid_template_areas": null,
- "grid_template_columns": null,
- "grid_template_rows": null,
- "height": null,
- "justify_content": null,
- "justify_items": null,
- "left": null,
- "margin": null,
- "max_height": null,
- "max_width": null,
- "min_height": null,
- "min_width": null,
- "object_fit": null,
- "object_position": null,
- "order": null,
- "overflow": null,
- "overflow_x": null,
- "overflow_y": null,
- "padding": null,
- "right": null,
- "top": null,
- "visibility": null,
- "width": null
- }
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "ekOmTR1hSNcr"
+ },
+ "source": [
+ "\n",
+ "### Inference\n",
+ "Let's run the model! You can change the instruction and the input image.\n",
+ "\n",
+ "We use `min_p = 0.1` and `temperature = 1.5`. Read this [Tweet](https://x.com/menhguin/status/1826132708508213629) for more information on why."
+ ]
},
- "bb5f95f83dce49d49478528866903599": {
- "model_module": "@jupyter-widgets/controls",
- "model_module_version": "1.5.0",
- "model_name": "HTMLModel",
- "state": {
- "_dom_classes": [],
- "_model_module": "@jupyter-widgets/controls",
- "_model_module_version": "1.5.0",
- "_model_name": "HTMLModel",
- "_view_count": null,
- "_view_module": "@jupyter-widgets/controls",
- "_view_module_version": "1.5.0",
- "_view_name": "HTMLView",
- "description": "",
- "description_tooltip": null,
- "layout": "IPY_MODEL_e9dd2d8eef0f437b8e0d2b385e2220bd",
- "placeholder": "\u200b",
- "style": "IPY_MODEL_695b0f6ecfa143efa7c9c4e22cc33b47",
- "value": "\u20072/2\u2007[00:28<00:00,\u200713.72s/it]"
- }
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {
+ "colab": {
+ "base_uri": "https://localhost:8080/"
+ },
+ "id": "kR3gIAX-SM2q",
+ "outputId": "1da3fc74-e159-463c-ad2d-f90e17538fde"
+ },
+ "outputs": [
+ {
+ "name": "stdout",
+ "output_type": "stream",
+ "text": [
+ "H ^ { \\prime } = \\beta N \\int d \\lambda \\Big \\{ { \\frac { 1 } { 2 \\beta ^ { 2 } N ^ { 2 } } } \\partial _ { \\lambda } \\zeta ^ { \\dagger } \\partial _ { \\lambda } \\zeta + V ( \\lambda ) \\zeta ^ { \\dagger } \\zeta \\Big \\} \\, .<|im_end|>\n"
+ ]
+ }
+ ],
+ "source": [
+ "FastVisionModel.for_inference(model) # Enable for inference!\n",
+ "\n",
+ "image = dataset[2][\"image\"]\n",
+ "instruction = \"Write the LaTeX representation for this image.\"\n",
+ "\n",
+ "messages = [\n",
+ " {\"role\": \"user\", \"content\": [\n",
+ " {\"type\": \"image\"},\n",
+ " {\"type\": \"text\", \"text\": instruction}\n",
+ " ]}\n",
+ "]\n",
+ "input_text = tokenizer.apply_chat_template(messages, add_generation_prompt = True)\n",
+ "inputs = tokenizer(\n",
+ " image,\n",
+ " input_text,\n",
+ " add_special_tokens = False,\n",
+ " return_tensors = \"pt\",\n",
+ ").to(\"cuda\")\n",
+ "\n",
+ "from transformers import TextStreamer\n",
+ "text_streamer = TextStreamer(tokenizer, skip_prompt = True)\n",
+ "_ = model.generate(**inputs, streamer = text_streamer, max_new_tokens = 128,\n",
+ " use_cache = True, temperature = 1.5, min_p = 0.1)"
+ ]
},
- "c44003cd636b48a486e85c6dbe57177e": {
- "model_module": "@jupyter-widgets/controls",
- "model_module_version": "1.5.0",
- "model_name": "FloatProgressModel",
- "state": {
- "_dom_classes": [],
- "_model_module": "@jupyter-widgets/controls",
- "_model_module_version": "1.5.0",
- "_model_name": "FloatProgressModel",
- "_view_count": null,
- "_view_module": "@jupyter-widgets/controls",
- "_view_module_version": "1.5.0",
- "_view_name": "ProgressView",
- "bar_style": "success",
- "description": "",
- "description_tooltip": null,
- "layout": "IPY_MODEL_41ba45cbf8a74dbebe9b62f5b321d315",
- "max": 2,
- "min": 0,
- "orientation": "horizontal",
- "style": "IPY_MODEL_751d396294e74f9490817610756c4a01",
- "value": 2
- }
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "uMuVrWbjAzhc"
+ },
+ "source": [
+ "\n",
+ "### Saving, loading finetuned models\n",
+ "To save the final model as LoRA adapters, either use Hugging Face's `push_to_hub` for an online save or `save_pretrained` for a local save.\n",
+ "\n",
+ "**[NOTE]** This ONLY saves the LoRA adapters, and not the full model. To save to 16bit or GGUF, scroll down!"
+ ]
},
- "c48ecf14955a4d80bd5c8b8c2c7da38d": {
- "model_module": "@jupyter-widgets/controls",
- "model_module_version": "1.5.0",
- "model_name": "HBoxModel",
- "state": {
- "_dom_classes": [],
- "_model_module": "@jupyter-widgets/controls",
- "_model_module_version": "1.5.0",
- "_model_name": "HBoxModel",
- "_view_count": null,
- "_view_module": "@jupyter-widgets/controls",
- "_view_module_version": "1.5.0",
- "_view_name": "HBoxView",
- "box_style": "",
- "children": [
- "IPY_MODEL_5a0a4e972dfd49a0a87b274ec3fd97e2",
- "IPY_MODEL_c44003cd636b48a486e85c6dbe57177e",
- "IPY_MODEL_bb5f95f83dce49d49478528866903599"
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {
+ "colab": {
+ "base_uri": "https://localhost:8080/"
+ },
+ "id": "upcOlWe7A1vc",
+ "outputId": "61cf35a0-c007-45ce-f760-8f8f5d82d661"
+ },
+ "outputs": [
+ {
+ "data": {
+ "text/plain": [
+ "[]"
+ ]
+ },
+ "execution_count": 19,
+ "metadata": {},
+ "output_type": "execute_result"
+ }
],
- "layout": "IPY_MODEL_e48f4e809eee461cb458e20d1fd73847"
- }
+ "source": [
+ "model.save_pretrained(\"lora_model\") # Local saving\n",
+ "tokenizer.save_pretrained(\"lora_model\")\n",
+ "# model.push_to_hub(\"your_name/lora_model\", token = \"...\") # Online saving\n",
+ "# tokenizer.push_to_hub(\"your_name/lora_model\", token = \"...\") # Online saving"
+ ]
},
- "ce09826095904bc18c8eaaff388a216d": {
- "model_module": "@jupyter-widgets/controls",
- "model_module_version": "1.5.0",
- "model_name": "ProgressStyleModel",
- "state": {
- "_model_module": "@jupyter-widgets/controls",
- "_model_module_version": "1.5.0",
- "_model_name": "ProgressStyleModel",
- "_view_count": null,
- "_view_module": "@jupyter-widgets/base",
- "_view_module_version": "1.2.0",
- "_view_name": "StyleView",
- "bar_color": null,
- "description_width": ""
- }
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "AEEcJ4qfC7Lp"
+ },
+ "source": [
+ "To load the LoRA adapters we just saved for inference, change `False` to `True`:"
+ ]
},
- "cfb40ab3aa994fc9b3088d79ed5c4a26": {
- "model_module": "@jupyter-widgets/base",
- "model_module_version": "1.2.0",
- "model_name": "LayoutModel",
- "state": {
- "_model_module": "@jupyter-widgets/base",
- "_model_module_version": "1.2.0",
- "_model_name": "LayoutModel",
- "_view_count": null,
- "_view_module": "@jupyter-widgets/base",
- "_view_module_version": "1.2.0",
- "_view_name": "LayoutView",
- "align_content": null,
- "align_items": null,
- "align_self": null,
- "border": null,
- "bottom": null,
- "display": null,
- "flex": null,
- "flex_flow": null,
- "grid_area": null,
- "grid_auto_columns": null,
- "grid_auto_flow": null,
- "grid_auto_rows": null,
- "grid_column": null,
- "grid_gap": null,
- "grid_row": null,
- "grid_template_areas": null,
- "grid_template_columns": null,
- "grid_template_rows": null,
- "height": null,
- "justify_content": null,
- "justify_items": null,
- "left": null,
- "margin": null,
- "max_height": null,
- "max_width": null,
- "min_height": null,
- "min_width": null,
- "object_fit": null,
- "object_position": null,
- "order": null,
- "overflow": null,
- "overflow_x": null,
- "overflow_y": null,
- "padding": null,
- "right": null,
- "top": null,
- "visibility": null,
- "width": null
- }
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {
+ "colab": {
+ "base_uri": "https://localhost:8080/"
+ },
+ "id": "MKX_XKs_BNZR",
+ "outputId": "edc73b33-f7cf-4351-9c81-8a06162d59f3"
+ },
+ "outputs": [
+ {
+ "name": "stdout",
+ "output_type": "stream",
+ "text": [
+ "\\frac { N } { M } \\in { \\bf { Z } } , \\frac { M } { P } \\in { \\bf { Z } } , \\frac { P } { Q } \\in { \\bf { Z } }<|im_end|>\n"
+ ]
+ }
+ ],
+ "source": [
+ "if False:\n",
+ " from unsloth import FastVisionModel\n",
+ " model, tokenizer = FastVisionModel.from_pretrained(\n",
+ " model_name = \"lora_model\", # YOUR MODEL YOU USED FOR TRAINING\n",
+ " load_in_4bit = True, # Set to False for 16bit LoRA\n",
+ " )\n",
+ " FastVisionModel.for_inference(model) # Enable for inference!\n",
+ "\n",
+ "image = dataset[0][\"image\"]\n",
+ "instruction = \"Write the LaTeX representation for this image.\"\n",
+ "\n",
+ "messages = [\n",
+ " {\"role\": \"user\", \"content\": [\n",
+ " {\"type\": \"image\"},\n",
+ " {\"type\": \"text\", \"text\": instruction}\n",
+ " ]}\n",
+ "]\n",
+ "input_text = tokenizer.apply_chat_template(messages, add_generation_prompt = True)\n",
+ "inputs = tokenizer(\n",
+ " image,\n",
+ " input_text,\n",
+ " add_special_tokens = False,\n",
+ " return_tensors = \"pt\",\n",
+ ").to(\"cuda\")\n",
+ "\n",
+ "from transformers import TextStreamer\n",
+ "text_streamer = TextStreamer(tokenizer, skip_prompt = True)\n",
+ "_ = model.generate(**inputs, streamer = text_streamer, max_new_tokens = 128,\n",
+ " use_cache = True, temperature = 1.5, min_p = 0.1)"
+ ]
},
- "d29e7cfee3fe449e82569927b355dab0": {
- "model_module": "@jupyter-widgets/controls",
- "model_module_version": "1.5.0",
- "model_name": "DescriptionStyleModel",
- "state": {
- "_model_module": "@jupyter-widgets/controls",
- "_model_module_version": "1.5.0",
- "_model_name": "DescriptionStyleModel",
- "_view_count": null,
- "_view_module": "@jupyter-widgets/base",
- "_view_module_version": "1.2.0",
- "_view_name": "StyleView",
- "description_width": ""
- }
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "f422JgM9sdVT"
+ },
+ "source": [
+ "### Saving to float16 for vLLM\n",
+ "\n",
+ "We also support saving to `float16` directly. Select `merged_16bit` for float16. Use `push_to_hub_merged` to upload to your Hugging Face account! You can go to https://huggingface.co/settings/tokens for your personal tokens."
+ ]
},
- "d84c3a99025742ae939a2472f36852fa": {
- "model_module": "@jupyter-widgets/controls",
- "model_module_version": "1.5.0",
- "model_name": "HTMLModel",
- "state": {
- "_dom_classes": [],
- "_model_module": "@jupyter-widgets/controls",
- "_model_module_version": "1.5.0",
- "_model_name": "HTMLModel",
- "_view_count": null,
- "_view_module": "@jupyter-widgets/controls",
- "_view_module_version": "1.5.0",
- "_view_name": "HTMLView",
- "description": "",
- "description_tooltip": null,
- "layout": "IPY_MODEL_087abc8baba848babf93b8f29e5a2bcf",
- "placeholder": "\u200b",
- "style": "IPY_MODEL_1a74e168c9554fac9978a4736dbcdb11",
- "value": "\u20077632/7632\u2007[00:00<00:00,\u200732020.60\u2007examples/s]"
- }
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {
+ "id": "iHjt_SMYsd3P"
+ },
+ "outputs": [],
+ "source": [
+ "# Select ONLY 1 to save! (Both not needed!)\n",
+ "\n",
+ "# Save locally to 16bit\n",
+ "if False: model.save_pretrained_merged(\"unsloth_finetune\", tokenizer,)\n",
+ "\n",
+ "# To export and save to your Hugging Face account\n",
+ "if False: model.push_to_hub_merged(\"YOUR_USERNAME/unsloth_finetune\", tokenizer, token = \"PUT_HERE\")"
+ ]
},
- "daed4150f70b4cd1b913f26a732b13c5": {
- "model_module": "@jupyter-widgets/base",
- "model_module_version": "1.2.0",
- "model_name": "LayoutModel",
- "state": {
- "_model_module": "@jupyter-widgets/base",
- "_model_module_version": "1.2.0",
- "_model_name": "LayoutModel",
- "_view_count": null,
- "_view_module": "@jupyter-widgets/base",
- "_view_module_version": "1.2.0",
- "_view_name": "LayoutView",
- "align_content": null,
- "align_items": null,
- "align_self": null,
- "border": null,
- "bottom": null,
- "display": null,
- "flex": null,
- "flex_flow": null,
- "grid_area": null,
- "grid_auto_columns": null,
- "grid_auto_flow": null,
- "grid_auto_rows": null,
- "grid_column": null,
- "grid_gap": null,
- "grid_row": null,
- "grid_template_areas": null,
- "grid_template_columns": null,
- "grid_template_rows": null,
- "height": null,
- "justify_content": null,
- "justify_items": null,
- "left": null,
- "margin": null,
- "max_height": null,
- "max_width": null,
- "min_height": null,
- "min_width": null,
- "object_fit": null,
- "object_position": null,
- "order": null,
- "overflow": null,
- "overflow_x": null,
- "overflow_y": null,
- "padding": null,
- "right": null,
- "top": null,
- "visibility": null,
- "width": null
- }
+ {
+ "cell_type": "markdown",
+ "source": [
+ "### GGUF / llama.cpp Conversion\n",
+ "We now support saving to `GGUF` / `llama.cpp` natively! We clone `llama.cpp` and save to `q8_0` by default. All quantization methods, such as `q4_k_m`, are supported. Use `save_pretrained_gguf` for local saving and `push_to_hub_gguf` for uploading to Hugging Face.\n",
+ "\n",
+ "Some supported quant methods (full list on our [Wiki page](https://github.com/unslothai/unsloth/wiki#gguf-quantization-options)):\n",
+ "* `q8_0` - Fast conversion. High resource use, but generally acceptable.\n",
+ "* `q4_k_m` - Recommended. Uses Q6_K for half of the attention.wv and feed_forward.w2 tensors, else Q4_K.\n",
+ "* `q5_k_m` - Recommended. Uses Q6_K for half of the attention.wv and feed_forward.w2 tensors, else Q5_K.\n",
+ "\n",
+ "[**NEW**] To finetune and auto export to Ollama, try our [Ollama notebook](https://colab.research.google.com/github/unslothai/notebooks/blob/main/nb/Llama3_(8B)-Ollama.ipynb)"
+ ],
+ "metadata": {
+ "id": "-33LYCDq-Q5f"
+ }
},
- "de403b0aaea3409996b04e0c826bc71a": {
- "model_module": "@jupyter-widgets/controls",
- "model_module_version": "1.5.0",
- "model_name": "DescriptionStyleModel",
- "state": {
- "_model_module": "@jupyter-widgets/controls",
- "_model_module_version": "1.5.0",
- "_model_name": "DescriptionStyleModel",
- "_view_count": null,
- "_view_module": "@jupyter-widgets/base",
- "_view_module_version": "1.2.0",
- "_view_name": "StyleView",
- "description_width": ""
- }
+ {
+ "cell_type": "code",
+ "source": [
+ "# Save to 8bit Q8_0\n",
+ "if False: model.save_pretrained_gguf(\"unsloth_finetune\", tokenizer,)\n",
+ "# Remember to go to https://huggingface.co/settings/tokens for a token!\n",
+ "# And change hf to your username!\n",
+ "if False: model.push_to_hub_gguf(\"hf/unsloth_finetune\", tokenizer, token = \"\")\n",
+ "\n",
+ "# Save to 16bit GGUF\n",
+ "if False: model.save_pretrained_gguf(\"unsloth_finetune\", tokenizer, quantization_method = \"f16\")\n",
+ "if False: model.push_to_hub_gguf(\"hf/unsloth_finetune\", tokenizer, quantization_method = \"f16\", token = \"\")\n",
+ "\n",
+ "# Save to q4_k_m GGUF\n",
+ "if False: model.save_pretrained_gguf(\"unsloth_finetune\", tokenizer, quantization_method = \"q4_k_m\")\n",
+ "if False: model.push_to_hub_gguf(\"hf/unsloth_finetune\", tokenizer, quantization_method = \"q4_k_m\", token = \"\")\n",
+ "\n",
+ "# Save to multiple GGUF options - much faster if you want multiple!\n",
+ "if False:\n",
+ " model.push_to_hub_gguf(\n",
+ " \"hf/unsloth_finetune\", # Change hf to your username!\n",
+ " tokenizer,\n",
+ " quantization_method = [\"q4_k_m\", \"q8_0\", \"q5_k_m\",],\n",
+ " token = \"\",\n",
+ " )"
+ ],
+ "metadata": {
+ "id": "w8KfPN4b-RlH"
+ },
+ "execution_count": null,
+ "outputs": []
},
- "dff51ffe23724245a70b220217d5b891": {
- "model_module": "@jupyter-widgets/base",
- "model_module_version": "1.2.0",
- "model_name": "LayoutModel",
- "state": {
- "_model_module": "@jupyter-widgets/base",
- "_model_module_version": "1.2.0",
- "_model_name": "LayoutModel",
- "_view_count": null,
- "_view_module": "@jupyter-widgets/base",
- "_view_module_version": "1.2.0",
- "_view_name": "LayoutView",
- "align_content": null,
- "align_items": null,
- "align_self": null,
- "border": null,
- "bottom": null,
- "display": null,
- "flex": null,
- "flex_flow": null,
- "grid_area": null,
- "grid_auto_columns": null,
- "grid_auto_flow": null,
- "grid_auto_rows": null,
- "grid_column": null,
- "grid_gap": null,
- "grid_row": null,
- "grid_template_areas": null,
- "grid_template_columns": null,
- "grid_template_rows": null,
- "height": null,
- "justify_content": null,
- "justify_items": null,
- "left": null,
- "margin": null,
- "max_height": null,
- "max_width": null,
- "min_height": null,
- "min_width": null,
- "object_fit": null,
- "object_position": null,
- "order": null,
- "overflow": null,
- "overflow_x": null,
- "overflow_y": null,
- "padding": null,
- "right": null,
- "top": null,
- "visibility": null,
- "width": null
- }
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "9oGefWRQ-KUw"
+ },
+ "source": [
+ "Now, use the `model-unsloth.gguf` file or `model-unsloth-Q4_K_M.gguf` file in llama.cpp.\n",
+ "\n",
+ "And we're done! If you have any questions about Unsloth, find a bug, need help, or want to keep up with the latest LLM news and join projects, feel free to join our [Discord](https://discord.gg/unsloth)!\n",
+ "\n",
+ "Some other links:\n",
+ "1. Train your own reasoning model - Llama GRPO notebook [Free Colab](https://colab.research.google.com/github/unslothai/notebooks/blob/main/nb/Llama3.1_(8B)-GRPO.ipynb)\n",
+ "2. Saving finetunes to Ollama. [Free notebook](https://colab.research.google.com/github/unslothai/notebooks/blob/main/nb/Llama3_(8B)-Ollama.ipynb)\n",
+ "3. Llama 3.2 Vision finetuning - Radiography use case. [Free Colab](https://colab.research.google.com/github/unslothai/notebooks/blob/main/nb/Llama3.2_(11B)-Vision.ipynb)\n",
+ "4. See notebooks for DPO, ORPO, Continued pretraining, conversational finetuning and more on our [documentation](https://docs.unsloth.ai/get-started/unsloth-notebooks)!\n",
+ "\n",
+ "\n",
+ "Join Discord if you need help + ⭐️ Star us on GitHub ⭐️\n",
+ "\n",
+ " This notebook and all Unsloth notebooks are licensed under [LGPL-3.0](https://github.com/unslothai/notebooks?tab=LGPL-3.0-1-ov-file#readme).\n"
+ ]
+ }
+ ],
+ "metadata": {
+ "accelerator": "GPU",
+ "colab": {
+ "gpuType": "T4",
+ "provenance": []
},
- "e48f4e809eee461cb458e20d1fd73847": {
- "model_module": "@jupyter-widgets/base",
- "model_module_version": "1.2.0",
- "model_name": "LayoutModel",
- "state": {
- "_model_module": "@jupyter-widgets/base",
- "_model_module_version": "1.2.0",
- "_model_name": "LayoutModel",
- "_view_count": null,
- "_view_module": "@jupyter-widgets/base",
- "_view_module_version": "1.2.0",
- "_view_name": "LayoutView",
- "align_content": null,
- "align_items": null,
- "align_self": null,
- "border": null,
- "bottom": null,
- "display": null,
- "flex": null,
- "flex_flow": null,
- "grid_area": null,
- "grid_auto_columns": null,
- "grid_auto_flow": null,
- "grid_auto_rows": null,
- "grid_column": null,
- "grid_gap": null,
- "grid_row": null,
- "grid_template_areas": null,
- "grid_template_columns": null,
- "grid_template_rows": null,
- "height": null,
- "justify_content": null,
- "justify_items": null,
- "left": null,
- "margin": null,
- "max_height": null,
- "max_width": null,
- "min_height": null,
- "min_width": null,
- "object_fit": null,
- "object_position": null,
- "order": null,
- "overflow": null,
- "overflow_x": null,
- "overflow_y": null,
- "padding": null,
- "right": null,
- "top": null,
- "visibility": null,
- "width": null
- }
+ "kernelspec": {
+ "display_name": "unsloth_env",
+ "language": "python",
+ "name": "python3"
},
- "e9dd2d8eef0f437b8e0d2b385e2220bd": {
- "model_module": "@jupyter-widgets/base",
- "model_module_version": "1.2.0",
- "model_name": "LayoutModel",
- "state": {
- "_model_module": "@jupyter-widgets/base",
- "_model_module_version": "1.2.0",
- "_model_name": "LayoutModel",
- "_view_count": null,
- "_view_module": "@jupyter-widgets/base",
- "_view_module_version": "1.2.0",
- "_view_name": "LayoutView",
- "align_content": null,
- "align_items": null,
- "align_self": null,
- "border": null,
- "bottom": null,
- "display": null,
- "flex": null,
- "flex_flow": null,
- "grid_area": null,
- "grid_auto_columns": null,
- "grid_auto_flow": null,
- "grid_auto_rows": null,
- "grid_column": null,
- "grid_gap": null,
- "grid_row": null,
- "grid_template_areas": null,
- "grid_template_columns": null,
- "grid_template_rows": null,
- "height": null,
- "justify_content": null,
- "justify_items": null,
- "left": null,
- "margin": null,
- "max_height": null,
- "max_width": null,
- "min_height": null,
- "min_width": null,
- "object_fit": null,
- "object_position": null,
- "order": null,
- "overflow": null,
- "overflow_x": null,
- "overflow_y": null,
- "padding": null,
- "right": null,
- "top": null,
- "visibility": null,
- "width": null
- }
+ "language_info": {
+ "name": "python",
+ "version": "3.11.11"
},
- "state": {}
- }
- }
- },
- "nbformat": 4,
- "nbformat_minor": 0
-}
\ No newline at end of file
+ "widgets": {
+ "application/vnd.jupyter.widget-state+json": {
+ "046156a2fad9435b84a19ff3163e7e4a": {
+ "model_module": "@jupyter-widgets/controls",
+ "model_module_version": "1.5.0",
+ "model_name": "DescriptionStyleModel",
+ "state": {
+ "_model_module": "@jupyter-widgets/controls",
+ "_model_module_version": "1.5.0",
+ "_model_name": "DescriptionStyleModel",
+ "_view_count": null,
+ "_view_module": "@jupyter-widgets/base",
+ "_view_module_version": "1.2.0",
+ "_view_name": "StyleView",
+ "description_width": ""
+ }
+ },
+ "087abc8baba848babf93b8f29e5a2bcf": {
+ "model_module": "@jupyter-widgets/base",
+ "model_module_version": "1.2.0",
+ "model_name": "LayoutModel",
+ "state": {
+ "_model_module": "@jupyter-widgets/base",
+ "_model_module_version": "1.2.0",
+ "_model_name": "LayoutModel",
+ "_view_count": null,
+ "_view_module": "@jupyter-widgets/base",
+ "_view_module_version": "1.2.0",
+ "_view_name": "LayoutView",
+ "align_content": null,
+ "align_items": null,
+ "align_self": null,
+ "border": null,
+ "bottom": null,
+ "display": null,
+ "flex": null,
+ "flex_flow": null,
+ "grid_area": null,
+ "grid_auto_columns": null,
+ "grid_auto_flow": null,
+ "grid_auto_rows": null,
+ "grid_column": null,
+ "grid_gap": null,
+ "grid_row": null,
+ "grid_template_areas": null,
+ "grid_template_columns": null,
+ "grid_template_rows": null,
+ "height": null,
+ "justify_content": null,
+ "justify_items": null,
+ "left": null,
+ "margin": null,
+ "max_height": null,
+ "max_width": null,
+ "min_height": null,
+ "min_width": null,
+ "object_fit": null,
+ "object_position": null,
+ "order": null,
+ "overflow": null,
+ "overflow_x": null,
+ "overflow_y": null,
+ "padding": null,
+ "right": null,
+ "top": null,
+ "visibility": null,
+ "width": null
+ }
+ },
+ "196ef60df0cd417f934f6e2304c3f180": {
+ "model_module": "@jupyter-widgets/controls",
+ "model_module_version": "1.5.0",
+ "model_name": "HTMLModel",
+ "state": {
+ "_dom_classes": [],
+ "_model_module": "@jupyter-widgets/controls",
+ "_model_module_version": "1.5.0",
+ "_model_name": "HTMLModel",
+ "_view_count": null,
+ "_view_module": "@jupyter-widgets/controls",
+ "_view_module_version": "1.5.0",
+ "_view_name": "HTMLView",
+ "description": "",
+ "description_tooltip": null,
+ "layout": "IPY_MODEL_cfb40ab3aa994fc9b3088d79ed5c4a26",
+ "placeholder": "",
+ "style": "IPY_MODEL_046156a2fad9435b84a19ff3163e7e4a",
+ "value": "Generating test split: 100%"
+ }
+ },
+ "1a74e168c9554fac9978a4736dbcdb11": {
+ "model_module": "@jupyter-widgets/controls",
+ "model_module_version": "1.5.0",
+ "model_name": "DescriptionStyleModel",
+ "state": {
+ "_model_module": "@jupyter-widgets/controls",
+ "_model_module_version": "1.5.0",
+ "_model_name": "DescriptionStyleModel",
+ "_view_count": null,
+ "_view_module": "@jupyter-widgets/base",
+ "_view_module_version": "1.2.0",
+ "_view_name": "StyleView",
+ "description_width": ""
+ }
+ },
+ "1bc12b432fe5444c8e854153cf84120d": {
+ "model_module": "@jupyter-widgets/controls",
+ "model_module_version": "1.5.0",
+ "model_name": "DescriptionStyleModel",
+ "state": {
+ "_model_module": "@jupyter-widgets/controls",
+ "_model_module_version": "1.5.0",
+ "_model_name": "DescriptionStyleModel",
+ "_view_count": null,
+ "_view_module": "@jupyter-widgets/base",
+ "_view_module_version": "1.2.0",
+ "_view_name": "StyleView",
+ "description_width": ""
+ }
+ },
+ "1d7dc3fd42e04783b7e912161ec5f7c2": {
+ "model_module": "@jupyter-widgets/controls",
+ "model_module_version": "1.5.0",
+ "model_name": "HBoxModel",
+ "state": {
+ "_dom_classes": [],
+ "_model_module": "@jupyter-widgets/controls",
+ "_model_module_version": "1.5.0",
+ "_model_name": "HBoxModel",
+ "_view_count": null,
+ "_view_module": "@jupyter-widgets/controls",
+ "_view_module_version": "1.5.0",
+ "_view_name": "HBoxView",
+ "box_style": "",
+ "children": [
+ "IPY_MODEL_2ed0f7a88aab4e7f9e7c3216bea48b90",
+ "IPY_MODEL_48276a93bae44504aa0a485b207051d0",
+ "IPY_MODEL_8466623dbeb442468367747c2b187cdd"
+ ],
+ "layout": "IPY_MODEL_2b60182eb6724c0fa3d241a39854cd1e"
+ }
+ },
+ "23ec6113e4014445baf3dcc4dd6c4e17": {
+ "model_module": "@jupyter-widgets/base",
+ "model_module_version": "1.2.0",
+ "model_name": "LayoutModel",
+ "state": {
+ "_model_module": "@jupyter-widgets/base",
+ "_model_module_version": "1.2.0",
+ "_model_name": "LayoutModel",
+ "_view_count": null,
+ "_view_module": "@jupyter-widgets/base",
+ "_view_module_version": "1.2.0",
+ "_view_name": "LayoutView",
+ "align_content": null,
+ "align_items": null,
+ "align_self": null,
+ "border": null,
+ "bottom": null,
+ "display": null,
+ "flex": null,
+ "flex_flow": null,
+ "grid_area": null,
+ "grid_auto_columns": null,
+ "grid_auto_flow": null,
+ "grid_auto_rows": null,
+ "grid_column": null,
+ "grid_gap": null,
+ "grid_row": null,
+ "grid_template_areas": null,
+ "grid_template_columns": null,
+ "grid_template_rows": null,
+ "height": null,
+ "justify_content": null,
+ "justify_items": null,
+ "left": null,
+ "margin": null,
+ "max_height": null,
+ "max_width": null,
+ "min_height": null,
+ "min_width": null,
+ "object_fit": null,
+ "object_position": null,
+ "order": null,
+ "overflow": null,
+ "overflow_x": null,
+ "overflow_y": null,
+ "padding": null,
+ "right": null,
+ "top": null,
+ "visibility": null,
+ "width": null
+ }
+ },
+ "27456772617447dc80c58eb292c185c8": {
+ "model_module": "@jupyter-widgets/controls",
+ "model_module_version": "1.5.0",
+ "model_name": "HBoxModel",
+ "state": {
+ "_dom_classes": [],
+ "_model_module": "@jupyter-widgets/controls",
+ "_model_module_version": "1.5.0",
+ "_model_name": "HBoxModel",
+ "_view_count": null,
+ "_view_module": "@jupyter-widgets/controls",
+ "_view_module_version": "1.5.0",
+ "_view_name": "HBoxView",
+ "box_style": "",
+ "children": [
+ "IPY_MODEL_196ef60df0cd417f934f6e2304c3f180",
+ "IPY_MODEL_9113790223204b179dfeff0623f0136d",
+ "IPY_MODEL_d84c3a99025742ae939a2472f36852fa"
+ ],
+ "layout": "IPY_MODEL_daed4150f70b4cd1b913f26a732b13c5"
+ }
+ },
+ "28b385c6e67d4c05a4277889c9f5c0c4": {
+ "model_module": "@jupyter-widgets/base",
+ "model_module_version": "1.2.0",
+ "model_name": "LayoutModel",
+ "state": {
+ "_model_module": "@jupyter-widgets/base",
+ "_model_module_version": "1.2.0",
+ "_model_name": "LayoutModel",
+ "_view_count": null,
+ "_view_module": "@jupyter-widgets/base",
+ "_view_module_version": "1.2.0",
+ "_view_name": "LayoutView",
+ "align_content": null,
+ "align_items": null,
+ "align_self": null,
+ "border": null,
+ "bottom": null,
+ "display": null,
+ "flex": null,
+ "flex_flow": null,
+ "grid_area": null,
+ "grid_auto_columns": null,
+ "grid_auto_flow": null,
+ "grid_auto_rows": null,
+ "grid_column": null,
+ "grid_gap": null,
+ "grid_row": null,
+ "grid_template_areas": null,
+ "grid_template_columns": null,
+ "grid_template_rows": null,
+ "height": null,
+ "justify_content": null,
+ "justify_items": null,
+ "left": null,
+ "margin": null,
+ "max_height": null,
+ "max_width": null,
+ "min_height": null,
+ "min_width": null,
+ "object_fit": null,
+ "object_position": null,
+ "order": null,
+ "overflow": null,
+ "overflow_x": null,
+ "overflow_y": null,
+ "padding": null,
+ "right": null,
+ "top": null,
+ "visibility": null,
+ "width": null
+ }
+ },
+ "2b60182eb6724c0fa3d241a39854cd1e": {
+ "model_module": "@jupyter-widgets/base",
+ "model_module_version": "1.2.0",
+ "model_name": "LayoutModel",
+ "state": {
+ "_model_module": "@jupyter-widgets/base",
+ "_model_module_version": "1.2.0",
+ "_model_name": "LayoutModel",
+ "_view_count": null,
+ "_view_module": "@jupyter-widgets/base",
+ "_view_module_version": "1.2.0",
+ "_view_name": "LayoutView",
+ "align_content": null,
+ "align_items": null,
+ "align_self": null,
+ "border": null,
+ "bottom": null,
+ "display": null,
+ "flex": null,
+ "flex_flow": null,
+ "grid_area": null,
+ "grid_auto_columns": null,
+ "grid_auto_flow": null,
+ "grid_auto_rows": null,
+ "grid_column": null,
+ "grid_gap": null,
+ "grid_row": null,
+ "grid_template_areas": null,
+ "grid_template_columns": null,
+ "grid_template_rows": null,
+ "height": null,
+ "justify_content": null,
+ "justify_items": null,
+ "left": null,
+ "margin": null,
+ "max_height": null,
+ "max_width": null,
+ "min_height": null,
+ "min_width": null,
+ "object_fit": null,
+ "object_position": null,
+ "order": null,
+ "overflow": null,
+ "overflow_x": null,
+ "overflow_y": null,
+ "padding": null,
+ "right": null,
+ "top": null,
+ "visibility": null,
+ "width": null
+ }
+ },
+ "2ed0f7a88aab4e7f9e7c3216bea48b90": {
+ "model_module": "@jupyter-widgets/controls",
+ "model_module_version": "1.5.0",
+ "model_name": "HTMLModel",
+ "state": {
+ "_dom_classes": [],
+ "_model_module": "@jupyter-widgets/controls",
+ "_model_module_version": "1.5.0",
+ "_model_name": "HTMLModel",
+ "_view_count": null,
+ "_view_module": "@jupyter-widgets/controls",
+ "_view_module_version": "1.5.0",
+ "_view_name": "HTMLView",
+ "description": "",
+ "description_tooltip": null,
+ "layout": "IPY_MODEL_28b385c6e67d4c05a4277889c9f5c0c4",
+ "placeholder": "",
+ "style": "IPY_MODEL_1bc12b432fe5444c8e854153cf84120d",
+ "value": "Generating train split: 100%"
+ }
+ },
+ "3374e62bec23487098102668768cc9ef": {
+ "model_module": "@jupyter-widgets/base",
+ "model_module_version": "1.2.0",
+ "model_name": "LayoutModel",
+ "state": {
+ "_model_module": "@jupyter-widgets/base",
+ "_model_module_version": "1.2.0",
+ "_model_name": "LayoutModel",
+ "_view_count": null,
+ "_view_module": "@jupyter-widgets/base",
+ "_view_module_version": "1.2.0",
+ "_view_name": "LayoutView",
+ "align_content": null,
+ "align_items": null,
+ "align_self": null,
+ "border": null,
+ "bottom": null,
+ "display": null,
+ "flex": null,
+ "flex_flow": null,
+ "grid_area": null,
+ "grid_auto_columns": null,
+ "grid_auto_flow": null,
+ "grid_auto_rows": null,
+ "grid_column": null,
+ "grid_gap": null,
+ "grid_row": null,
+ "grid_template_areas": null,
+ "grid_template_columns": null,
+ "grid_template_rows": null,
+ "height": null,
+ "justify_content": null,
+ "justify_items": null,
+ "left": null,
+ "margin": null,
+ "max_height": null,
+ "max_width": null,
+ "min_height": null,
+ "min_width": null,
+ "object_fit": null,
+ "object_position": null,
+ "order": null,
+ "overflow": null,
+ "overflow_x": null,
+ "overflow_y": null,
+ "padding": null,
+ "right": null,
+ "top": null,
+ "visibility": null,
+ "width": null
+ }
+ },
+ "41ba45cbf8a74dbebe9b62f5b321d315": {
+ "model_module": "@jupyter-widgets/base",
+ "model_module_version": "1.2.0",
+ "model_name": "LayoutModel",
+ "state": {
+ "_model_module": "@jupyter-widgets/base",
+ "_model_module_version": "1.2.0",
+ "_model_name": "LayoutModel",
+ "_view_count": null,
+ "_view_module": "@jupyter-widgets/base",
+ "_view_module_version": "1.2.0",
+ "_view_name": "LayoutView",
+ "align_content": null,
+ "align_items": null,
+ "align_self": null,
+ "border": null,
+ "bottom": null,
+ "display": null,
+ "flex": null,
+ "flex_flow": null,
+ "grid_area": null,
+ "grid_auto_columns": null,
+ "grid_auto_flow": null,
+ "grid_auto_rows": null,
+ "grid_column": null,
+ "grid_gap": null,
+ "grid_row": null,
+ "grid_template_areas": null,
+ "grid_template_columns": null,
+ "grid_template_rows": null,
+ "height": null,
+ "justify_content": null,
+ "justify_items": null,
+ "left": null,
+ "margin": null,
+ "max_height": null,
+ "max_width": null,
+ "min_height": null,
+ "min_width": null,
+ "object_fit": null,
+ "object_position": null,
+ "order": null,
+ "overflow": null,
+ "overflow_x": null,
+ "overflow_y": null,
+ "padding": null,
+ "right": null,
+ "top": null,
+ "visibility": null,
+ "width": null
+ }
+ },
+ "48276a93bae44504aa0a485b207051d0": {
+ "model_module": "@jupyter-widgets/controls",
+ "model_module_version": "1.5.0",
+ "model_name": "FloatProgressModel",
+ "state": {
+ "_dom_classes": [],
+ "_model_module": "@jupyter-widgets/controls",
+ "_model_module_version": "1.5.0",
+ "_model_name": "FloatProgressModel",
+ "_view_count": null,
+ "_view_module": "@jupyter-widgets/controls",
+ "_view_module_version": "1.5.0",
+ "_view_name": "ProgressView",
+ "bar_style": "success",
+ "description": "",
+ "description_tooltip": null,
+ "layout": "IPY_MODEL_3374e62bec23487098102668768cc9ef",
+ "max": 68686,
+ "min": 0,
+ "orientation": "horizontal",
+ "style": "IPY_MODEL_ce09826095904bc18c8eaaff388a216d",
+ "value": 68686
+ }
+ },
+ "5a0a4e972dfd49a0a87b274ec3fd97e2": {
+ "model_module": "@jupyter-widgets/controls",
+ "model_module_version": "1.5.0",
+ "model_name": "HTMLModel",
+ "state": {
+ "_dom_classes": [],
+ "_model_module": "@jupyter-widgets/controls",
+ "_model_module_version": "1.5.0",
+ "_model_name": "HTMLModel",
+ "_view_count": null,
+ "_view_module": "@jupyter-widgets/controls",
+ "_view_module_version": "1.5.0",
+ "_view_name": "HTMLView",
+ "description": "",
+ "description_tooltip": null,
+ "layout": "IPY_MODEL_dff51ffe23724245a70b220217d5b891",
+ "placeholder": "",
+ "style": "IPY_MODEL_d29e7cfee3fe449e82569927b355dab0",
+ "value": "Loading checkpoint shards: 100%"
+ }
+ },
+ "695b0f6ecfa143efa7c9c4e22cc33b47": {
+ "model_module": "@jupyter-widgets/controls",
+ "model_module_version": "1.5.0",
+ "model_name": "DescriptionStyleModel",
+ "state": {
+ "_model_module": "@jupyter-widgets/controls",
+ "_model_module_version": "1.5.0",
+ "_model_name": "DescriptionStyleModel",
+ "_view_count": null,
+ "_view_module": "@jupyter-widgets/base",
+ "_view_module_version": "1.2.0",
+ "_view_name": "StyleView",
+ "description_width": ""
+ }
+ },
+ "696d7cf5fbb64bc484d7de8bc66a4062": {
+ "model_module": "@jupyter-widgets/controls",
+ "model_module_version": "1.5.0",
+ "model_name": "ProgressStyleModel",
+ "state": {
+ "_model_module": "@jupyter-widgets/controls",
+ "_model_module_version": "1.5.0",
+ "_model_name": "ProgressStyleModel",
+ "_view_count": null,
+ "_view_module": "@jupyter-widgets/base",
+ "_view_module_version": "1.2.0",
+ "_view_name": "StyleView",
+ "bar_color": null,
+ "description_width": ""
+ }
+ },
+ "751d396294e74f9490817610756c4a01": {
+ "model_module": "@jupyter-widgets/controls",
+ "model_module_version": "1.5.0",
+ "model_name": "ProgressStyleModel",
+ "state": {
+ "_model_module": "@jupyter-widgets/controls",
+ "_model_module_version": "1.5.0",
+ "_model_name": "ProgressStyleModel",
+ "_view_count": null,
+ "_view_module": "@jupyter-widgets/base",
+ "_view_module_version": "1.2.0",
+ "_view_name": "StyleView",
+ "bar_color": null,
+ "description_width": ""
+ }
+ },
+ "8466623dbeb442468367747c2b187cdd": {
+ "model_module": "@jupyter-widgets/controls",
+ "model_module_version": "1.5.0",
+ "model_name": "HTMLModel",
+ "state": {
+ "_dom_classes": [],
+ "_model_module": "@jupyter-widgets/controls",
+ "_model_module_version": "1.5.0",
+ "_model_name": "HTMLModel",
+ "_view_count": null,
+ "_view_module": "@jupyter-widgets/controls",
+ "_view_module_version": "1.5.0",
+ "_view_name": "HTMLView",
+ "description": "",
+ "description_tooltip": null,
+ "layout": "IPY_MODEL_9947797e53bf41c9a93c9b94c877c863",
+ "placeholder": "",
+ "style": "IPY_MODEL_de403b0aaea3409996b04e0c826bc71a",
+ "value": " 68686/68686 [00:01<00:00, 48807.63 examples/s]"
+ }
+ },
+ "9113790223204b179dfeff0623f0136d": {
+ "model_module": "@jupyter-widgets/controls",
+ "model_module_version": "1.5.0",
+ "model_name": "FloatProgressModel",
+ "state": {
+ "_dom_classes": [],
+ "_model_module": "@jupyter-widgets/controls",
+ "_model_module_version": "1.5.0",
+ "_model_name": "FloatProgressModel",
+ "_view_count": null,
+ "_view_module": "@jupyter-widgets/controls",
+ "_view_module_version": "1.5.0",
+ "_view_name": "ProgressView",
+ "bar_style": "success",
+ "description": "",
+ "description_tooltip": null,
+ "layout": "IPY_MODEL_23ec6113e4014445baf3dcc4dd6c4e17",
+ "max": 7632,
+ "min": 0,
+ "orientation": "horizontal",
+ "style": "IPY_MODEL_696d7cf5fbb64bc484d7de8bc66a4062",
+ "value": 7632
+ }
+ },
+ "9947797e53bf41c9a93c9b94c877c863": {
+ "model_module": "@jupyter-widgets/base",
+ "model_module_version": "1.2.0",
+ "model_name": "LayoutModel",
+ "state": {
+ "_model_module": "@jupyter-widgets/base",
+ "_model_module_version": "1.2.0",
+ "_model_name": "LayoutModel",
+ "_view_count": null,
+ "_view_module": "@jupyter-widgets/base",
+ "_view_module_version": "1.2.0",
+ "_view_name": "LayoutView",
+ "align_content": null,
+ "align_items": null,
+ "align_self": null,
+ "border": null,
+ "bottom": null,
+ "display": null,
+ "flex": null,
+ "flex_flow": null,
+ "grid_area": null,
+ "grid_auto_columns": null,
+ "grid_auto_flow": null,
+ "grid_auto_rows": null,
+ "grid_column": null,
+ "grid_gap": null,
+ "grid_row": null,
+ "grid_template_areas": null,
+ "grid_template_columns": null,
+ "grid_template_rows": null,
+ "height": null,
+ "justify_content": null,
+ "justify_items": null,
+ "left": null,
+ "margin": null,
+ "max_height": null,
+ "max_width": null,
+ "min_height": null,
+ "min_width": null,
+ "object_fit": null,
+ "object_position": null,
+ "order": null,
+ "overflow": null,
+ "overflow_x": null,
+ "overflow_y": null,
+ "padding": null,
+ "right": null,
+ "top": null,
+ "visibility": null,
+ "width": null
+ }
+ },
+ "bb5f95f83dce49d49478528866903599": {
+ "model_module": "@jupyter-widgets/controls",
+ "model_module_version": "1.5.0",
+ "model_name": "HTMLModel",
+ "state": {
+ "_dom_classes": [],
+ "_model_module": "@jupyter-widgets/controls",
+ "_model_module_version": "1.5.0",
+ "_model_name": "HTMLModel",
+ "_view_count": null,
+ "_view_module": "@jupyter-widgets/controls",
+ "_view_module_version": "1.5.0",
+ "_view_name": "HTMLView",
+ "description": "",
+ "description_tooltip": null,
+ "layout": "IPY_MODEL_e9dd2d8eef0f437b8e0d2b385e2220bd",
+ "placeholder": "",
+ "style": "IPY_MODEL_695b0f6ecfa143efa7c9c4e22cc33b47",
+ "value": " 2/2 [00:28<00:00, 13.72s/it]"
+ }
+ },
+ "c44003cd636b48a486e85c6dbe57177e": {
+ "model_module": "@jupyter-widgets/controls",
+ "model_module_version": "1.5.0",
+ "model_name": "FloatProgressModel",
+ "state": {
+ "_dom_classes": [],
+ "_model_module": "@jupyter-widgets/controls",
+ "_model_module_version": "1.5.0",
+ "_model_name": "FloatProgressModel",
+ "_view_count": null,
+ "_view_module": "@jupyter-widgets/controls",
+ "_view_module_version": "1.5.0",
+ "_view_name": "ProgressView",
+ "bar_style": "success",
+ "description": "",
+ "description_tooltip": null,
+ "layout": "IPY_MODEL_41ba45cbf8a74dbebe9b62f5b321d315",
+ "max": 2,
+ "min": 0,
+ "orientation": "horizontal",
+ "style": "IPY_MODEL_751d396294e74f9490817610756c4a01",
+ "value": 2
+ }
+ },
+ "c48ecf14955a4d80bd5c8b8c2c7da38d": {
+ "model_module": "@jupyter-widgets/controls",
+ "model_module_version": "1.5.0",
+ "model_name": "HBoxModel",
+ "state": {
+ "_dom_classes": [],
+ "_model_module": "@jupyter-widgets/controls",
+ "_model_module_version": "1.5.0",
+ "_model_name": "HBoxModel",
+ "_view_count": null,
+ "_view_module": "@jupyter-widgets/controls",
+ "_view_module_version": "1.5.0",
+ "_view_name": "HBoxView",
+ "box_style": "",
+ "children": [
+ "IPY_MODEL_5a0a4e972dfd49a0a87b274ec3fd97e2",
+ "IPY_MODEL_c44003cd636b48a486e85c6dbe57177e",
+ "IPY_MODEL_bb5f95f83dce49d49478528866903599"
+ ],
+ "layout": "IPY_MODEL_e48f4e809eee461cb458e20d1fd73847"
+ }
+ },
+ "ce09826095904bc18c8eaaff388a216d": {
+ "model_module": "@jupyter-widgets/controls",
+ "model_module_version": "1.5.0",
+ "model_name": "ProgressStyleModel",
+ "state": {
+ "_model_module": "@jupyter-widgets/controls",
+ "_model_module_version": "1.5.0",
+ "_model_name": "ProgressStyleModel",
+ "_view_count": null,
+ "_view_module": "@jupyter-widgets/base",
+ "_view_module_version": "1.2.0",
+ "_view_name": "StyleView",
+ "bar_color": null,
+ "description_width": ""
+ }
+ },
+ "cfb40ab3aa994fc9b3088d79ed5c4a26": {
+ "model_module": "@jupyter-widgets/base",
+ "model_module_version": "1.2.0",
+ "model_name": "LayoutModel",
+ "state": {
+ "_model_module": "@jupyter-widgets/base",
+ "_model_module_version": "1.2.0",
+ "_model_name": "LayoutModel",
+ "_view_count": null,
+ "_view_module": "@jupyter-widgets/base",
+ "_view_module_version": "1.2.0",
+ "_view_name": "LayoutView",
+ "align_content": null,
+ "align_items": null,
+ "align_self": null,
+ "border": null,
+ "bottom": null,
+ "display": null,
+ "flex": null,
+ "flex_flow": null,
+ "grid_area": null,
+ "grid_auto_columns": null,
+ "grid_auto_flow": null,
+ "grid_auto_rows": null,
+ "grid_column": null,
+ "grid_gap": null,
+ "grid_row": null,
+ "grid_template_areas": null,
+ "grid_template_columns": null,
+ "grid_template_rows": null,
+ "height": null,
+ "justify_content": null,
+ "justify_items": null,
+ "left": null,
+ "margin": null,
+ "max_height": null,
+ "max_width": null,
+ "min_height": null,
+ "min_width": null,
+ "object_fit": null,
+ "object_position": null,
+ "order": null,
+ "overflow": null,
+ "overflow_x": null,
+ "overflow_y": null,
+ "padding": null,
+ "right": null,
+ "top": null,
+ "visibility": null,
+ "width": null
+ }
+ },
+ "d29e7cfee3fe449e82569927b355dab0": {
+ "model_module": "@jupyter-widgets/controls",
+ "model_module_version": "1.5.0",
+ "model_name": "DescriptionStyleModel",
+ "state": {
+ "_model_module": "@jupyter-widgets/controls",
+ "_model_module_version": "1.5.0",
+ "_model_name": "DescriptionStyleModel",
+ "_view_count": null,
+ "_view_module": "@jupyter-widgets/base",
+ "_view_module_version": "1.2.0",
+ "_view_name": "StyleView",
+ "description_width": ""
+ }
+ },
+ "d84c3a99025742ae939a2472f36852fa": {
+ "model_module": "@jupyter-widgets/controls",
+ "model_module_version": "1.5.0",
+ "model_name": "HTMLModel",
+ "state": {
+ "_dom_classes": [],
+ "_model_module": "@jupyter-widgets/controls",
+ "_model_module_version": "1.5.0",
+ "_model_name": "HTMLModel",
+ "_view_count": null,
+ "_view_module": "@jupyter-widgets/controls",
+ "_view_module_version": "1.5.0",
+ "_view_name": "HTMLView",
+ "description": "",
+ "description_tooltip": null,
+ "layout": "IPY_MODEL_087abc8baba848babf93b8f29e5a2bcf",
+ "placeholder": "",
+ "style": "IPY_MODEL_1a74e168c9554fac9978a4736dbcdb11",
+ "value": " 7632/7632 [00:00<00:00, 32020.60 examples/s]"
+ }
+ },
+ "daed4150f70b4cd1b913f26a732b13c5": {
+ "model_module": "@jupyter-widgets/base",
+ "model_module_version": "1.2.0",
+ "model_name": "LayoutModel",
+ "state": {
+ "_model_module": "@jupyter-widgets/base",
+ "_model_module_version": "1.2.0",
+ "_model_name": "LayoutModel",
+ "_view_count": null,
+ "_view_module": "@jupyter-widgets/base",
+ "_view_module_version": "1.2.0",
+ "_view_name": "LayoutView",
+ "align_content": null,
+ "align_items": null,
+ "align_self": null,
+ "border": null,
+ "bottom": null,
+ "display": null,
+ "flex": null,
+ "flex_flow": null,
+ "grid_area": null,
+ "grid_auto_columns": null,
+ "grid_auto_flow": null,
+ "grid_auto_rows": null,
+ "grid_column": null,
+ "grid_gap": null,
+ "grid_row": null,
+ "grid_template_areas": null,
+ "grid_template_columns": null,
+ "grid_template_rows": null,
+ "height": null,
+ "justify_content": null,
+ "justify_items": null,
+ "left": null,
+ "margin": null,
+ "max_height": null,
+ "max_width": null,
+ "min_height": null,
+ "min_width": null,
+ "object_fit": null,
+ "object_position": null,
+ "order": null,
+ "overflow": null,
+ "overflow_x": null,
+ "overflow_y": null,
+ "padding": null,
+ "right": null,
+ "top": null,
+ "visibility": null,
+ "width": null
+ }
+ },
+ "de403b0aaea3409996b04e0c826bc71a": {
+ "model_module": "@jupyter-widgets/controls",
+ "model_module_version": "1.5.0",
+ "model_name": "DescriptionStyleModel",
+ "state": {
+ "_model_module": "@jupyter-widgets/controls",
+ "_model_module_version": "1.5.0",
+ "_model_name": "DescriptionStyleModel",
+ "_view_count": null,
+ "_view_module": "@jupyter-widgets/base",
+ "_view_module_version": "1.2.0",
+ "_view_name": "StyleView",
+ "description_width": ""
+ }
+ },
+ "dff51ffe23724245a70b220217d5b891": {
+ "model_module": "@jupyter-widgets/base",
+ "model_module_version": "1.2.0",
+ "model_name": "LayoutModel",
+ "state": {
+ "_model_module": "@jupyter-widgets/base",
+ "_model_module_version": "1.2.0",
+ "_model_name": "LayoutModel",
+ "_view_count": null,
+ "_view_module": "@jupyter-widgets/base",
+ "_view_module_version": "1.2.0",
+ "_view_name": "LayoutView",
+ "align_content": null,
+ "align_items": null,
+ "align_self": null,
+ "border": null,
+ "bottom": null,
+ "display": null,
+ "flex": null,
+ "flex_flow": null,
+ "grid_area": null,
+ "grid_auto_columns": null,
+ "grid_auto_flow": null,
+ "grid_auto_rows": null,
+ "grid_column": null,
+ "grid_gap": null,
+ "grid_row": null,
+ "grid_template_areas": null,
+ "grid_template_columns": null,
+ "grid_template_rows": null,
+ "height": null,
+ "justify_content": null,
+ "justify_items": null,
+ "left": null,
+ "margin": null,
+ "max_height": null,
+ "max_width": null,
+ "min_height": null,
+ "min_width": null,
+ "object_fit": null,
+ "object_position": null,
+ "order": null,
+ "overflow": null,
+ "overflow_x": null,
+ "overflow_y": null,
+ "padding": null,
+ "right": null,
+ "top": null,
+ "visibility": null,
+ "width": null
+ }
+ },
+ "e48f4e809eee461cb458e20d1fd73847": {
+ "model_module": "@jupyter-widgets/base",
+ "model_module_version": "1.2.0",
+ "model_name": "LayoutModel",
+ "state": {
+ "_model_module": "@jupyter-widgets/base",
+ "_model_module_version": "1.2.0",
+ "_model_name": "LayoutModel",
+ "_view_count": null,
+ "_view_module": "@jupyter-widgets/base",
+ "_view_module_version": "1.2.0",
+ "_view_name": "LayoutView",
+ "align_content": null,
+ "align_items": null,
+ "align_self": null,
+ "border": null,
+ "bottom": null,
+ "display": null,
+ "flex": null,
+ "flex_flow": null,
+ "grid_area": null,
+ "grid_auto_columns": null,
+ "grid_auto_flow": null,
+ "grid_auto_rows": null,
+ "grid_column": null,
+ "grid_gap": null,
+ "grid_row": null,
+ "grid_template_areas": null,
+ "grid_template_columns": null,
+ "grid_template_rows": null,
+ "height": null,
+ "justify_content": null,
+ "justify_items": null,
+ "left": null,
+ "margin": null,
+ "max_height": null,
+ "max_width": null,
+ "min_height": null,
+ "min_width": null,
+ "object_fit": null,
+ "object_position": null,
+ "order": null,
+ "overflow": null,
+ "overflow_x": null,
+ "overflow_y": null,
+ "padding": null,
+ "right": null,
+ "top": null,
+ "visibility": null,
+ "width": null
+ }
+ },
+ "e9dd2d8eef0f437b8e0d2b385e2220bd": {
+ "model_module": "@jupyter-widgets/base",
+ "model_module_version": "1.2.0",
+ "model_name": "LayoutModel",
+ "state": {
+ "_model_module": "@jupyter-widgets/base",
+ "_model_module_version": "1.2.0",
+ "_model_name": "LayoutModel",
+ "_view_count": null,
+ "_view_module": "@jupyter-widgets/base",
+ "_view_module_version": "1.2.0",
+ "_view_name": "LayoutView",
+ "align_content": null,
+ "align_items": null,
+ "align_self": null,
+ "border": null,
+ "bottom": null,
+ "display": null,
+ "flex": null,
+ "flex_flow": null,
+ "grid_area": null,
+ "grid_auto_columns": null,
+ "grid_auto_flow": null,
+ "grid_auto_rows": null,
+ "grid_column": null,
+ "grid_gap": null,
+ "grid_row": null,
+ "grid_template_areas": null,
+ "grid_template_columns": null,
+ "grid_template_rows": null,
+ "height": null,
+ "justify_content": null,
+ "justify_items": null,
+ "left": null,
+ "margin": null,
+ "max_height": null,
+ "max_width": null,
+ "min_height": null,
+ "min_width": null,
+ "object_fit": null,
+ "object_position": null,
+ "order": null,
+ "overflow": null,
+ "overflow_x": null,
+ "overflow_y": null,
+ "padding": null,
+ "right": null,
+ "top": null,
+ "visibility": null,
+ "width": null
+ }
+ }
+ }
+ }
+ },
+ "nbformat": 4,
+ "nbformat_minor": 0
+}
diff --git a/original_template/Qwen3_VL_(8B)-Vision.ipynb b/original_template/Qwen3_VL_(8B)-Vision.ipynb
index 27d69000..cc074d09 100644
--- a/original_template/Qwen3_VL_(8B)-Vision.ipynb
+++ b/original_template/Qwen3_VL_(8B)-Vision.ipynb
@@ -2,21 +2,27 @@
"cells": [
{
"cell_type": "markdown",
- "metadata": {},
+ "metadata": {
+ "id": "gib37dRGOWGF"
+ },
"source": [
"### News"
]
},
{
"cell_type": "markdown",
- "metadata": {},
+ "metadata": {
+ "id": "WES1cJDEOWGF"
+ },
"source": [
"Placeholder"
]
},
{
"cell_type": "markdown",
- "metadata": {},
+ "metadata": {
+ "id": "yGlEz2VVOWGG"
+ },
"source": [
"### Installation"
]
@@ -24,7 +30,9 @@
{
"cell_type": "code",
"execution_count": null,
- "metadata": {},
+ "metadata": {
+ "id": "PglJeZZoOWGG"
+ },
"outputs": [],
"source": [
"# Placeholder"
@@ -133,7 +141,7 @@
},
{
"cell_type": "code",
- "execution_count": 4,
+ "execution_count": null,
"metadata": {
"id": "6bZsfBuZDeCL"
},
@@ -172,7 +180,7 @@
},
{
"cell_type": "code",
- "execution_count": 5,
+ "execution_count": null,
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/",
@@ -251,7 +259,7 @@
},
{
"cell_type": "code",
- "execution_count": 6,
+ "execution_count": null,
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/"
@@ -280,7 +288,7 @@
},
{
"cell_type": "code",
- "execution_count": 7,
+ "execution_count": null,
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/",
@@ -309,7 +317,7 @@
},
{
"cell_type": "code",
- "execution_count": 8,
+ "execution_count": null,
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/",
@@ -348,7 +356,7 @@
},
{
"cell_type": "code",
- "execution_count": 9,
+ "execution_count": null,
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/",
@@ -400,7 +408,7 @@
},
{
"cell_type": "code",
- "execution_count": 10,
+ "execution_count": null,
"metadata": {
"id": "oPXzJZzHEgXe"
},
@@ -435,7 +443,7 @@
},
{
"cell_type": "code",
- "execution_count": 11,
+ "execution_count": null,
"metadata": {
"id": "gFW2qXIr7Ezy"
},
@@ -455,7 +463,7 @@
},
{
"cell_type": "code",
- "execution_count": 12,
+ "execution_count": null,
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/"
@@ -497,7 +505,7 @@
},
{
"cell_type": "code",
- "execution_count": 13,
+ "execution_count": null,
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/"
@@ -555,7 +563,7 @@
},
{
"cell_type": "code",
- "execution_count": 14,
+ "execution_count": null,
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/"
@@ -609,7 +617,7 @@
},
{
"cell_type": "code",
- "execution_count": 15,
+ "execution_count": null,
"metadata": {
"cellView": "form",
"colab": {
@@ -639,7 +647,7 @@
},
{
"cell_type": "code",
- "execution_count": 16,
+ "execution_count": null,
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/",
@@ -823,7 +831,7 @@
},
{
"cell_type": "code",
- "execution_count": 17,
+ "execution_count": null,
"metadata": {
"cellView": "form",
"colab": {
@@ -877,7 +885,7 @@
},
{
"cell_type": "code",
- "execution_count": 18,
+ "execution_count": null,
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/"
@@ -935,7 +943,7 @@
},
{
"cell_type": "code",
- "execution_count": 19,
+ "execution_count": null,
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/"
@@ -973,7 +981,7 @@
},
{
"cell_type": "code",
- "execution_count": 20,
+ "execution_count": null,
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/"
@@ -1035,7 +1043,7 @@
},
{
"cell_type": "code",
- "execution_count": 21,
+ "execution_count": null,
"metadata": {
"id": "iHjt_SMYsd3P"
},
@@ -1049,6 +1057,55 @@
"# To export and save to your Hugging Face account\n",
"if False: model.push_to_hub_merged(\"YOUR_USERNAME/unsloth_finetune\", tokenizer, token = \"PUT_HERE\")"
]
+ },
+ {
+ "cell_type": "markdown",
+ "source": [
+ "### GGUF / llama.cpp Conversion\n",
+ "We now support saving to `GGUF` / `llama.cpp` natively! We clone `llama.cpp` and save to `q8_0` by default. Other quantization methods such as `q4_k_m` are also allowed. Use `save_pretrained_gguf` for local saving and `push_to_hub_gguf` for uploading to HF.\n",
+ "\n",
+ "Some supported quant methods (full list on our [Wiki page](https://github.com/unslothai/unsloth/wiki#gguf-quantization-options)):\n",
+ "* `q8_0` - Fast conversion. High resource use, but generally acceptable.\n",
+ "* `q4_k_m` - Recommended. Uses Q6_K for half of the attention.wv and feed_forward.w2 tensors, else Q4_K.\n",
+ "* `q5_k_m` - Recommended. Uses Q6_K for half of the attention.wv and feed_forward.w2 tensors, else Q5_K.\n",
+ "\n",
+ "[**NEW**] To finetune and auto export to Ollama, try our [Ollama notebook](https://colab.research.google.com/github/unslothai/notebooks/blob/main/nb/Llama3_(8B)-Ollama.ipynb)"
+ ],
+ "metadata": {
+ "id": "qjuPsiqcOYaA"
+ }
+ },
+ {
+ "cell_type": "code",
+ "source": [
+ "# Save to 8bit Q8_0\n",
+ "if False: model.save_pretrained_gguf(\"unsloth_finetune\", tokenizer)\n",
+ "# Remember to go to https://huggingface.co/settings/tokens for a token!\n",
+ "# And change hf to your username!\n",
+ "if False: model.push_to_hub_gguf(\"hf/unsloth_finetune\", tokenizer, token = \"\")\n",
+ "\n",
+ "# Save to 16bit GGUF\n",
+ "if False: model.save_pretrained_gguf(\"unsloth_finetune\", tokenizer, quantization_method = \"f16\")\n",
+ "if False: model.push_to_hub_gguf(\"hf/unsloth_finetune\", tokenizer, quantization_method = \"f16\", token = \"\")\n",
+ "\n",
+ "# Save to q4_k_m GGUF\n",
+ "if False: model.save_pretrained_gguf(\"unsloth_finetune\", tokenizer, quantization_method = \"q4_k_m\")\n",
+ "if False: model.push_to_hub_gguf(\"hf/unsloth_finetune\", tokenizer, quantization_method = \"q4_k_m\", token = \"\")\n",
+ "\n",
+ "# Save to multiple GGUF options - much faster if you want multiple!\n",
+ "if False:\n",
+ " model.push_to_hub_gguf(\n",
+ " \"hf/unsloth_finetune\", # Change hf to your username!\n",
+ " tokenizer,\n",
+ " quantization_method = [\"q4_k_m\", \"q8_0\", \"q5_k_m\",],\n",
+ " token = \"\",\n",
+ " )"
+ ],
+ "metadata": {
+ "id": "At1T2hJnOdGM"
+ },
+ "execution_count": null,
+ "outputs": []
}
],
"metadata": {