<a href="https://colab.research.google.com/github/kmk4444/System_engineering/blob/main/Llama_3_1_8b_%2B_Unsloth_2x_faster_finetuning.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

To run this, press "*Runtime*" and press "*Run all*" on a **free** Tesla T4 Google Colab instance!
<div class="align-center">
  <a href="https://github.com/unslothai/unsloth"><img src="https://github.com/unslothai/unsloth/raw/main/images/unsloth%20new%20logo.png" width="115"></a>
  <a href="https://discord.gg/u54VK8m8tk"><img src="https://github.com/unslothai/unsloth/raw/main/images/Discord button.png" width="145"></a>
  <a href="https://ko-fi.com/unsloth"><img src="https://github.com/unslothai/unsloth/raw/main/images/Kofi button.png" width="145"></a> Join Discord if you need help + ⭐ <i>Star us on <a href="https://github.com/unslothai/unsloth">GitHub</a> </i> ⭐
</div>

To install Unsloth on your own computer, follow the installation instructions on our GitHub page [here](https://github.com/unslothai/unsloth?tab=readme-ov-file#-installation-instructions).

You will learn how to do [data prep](#Data), how to [train](#Train), how to [run the model](#Inference), & [how to save it](#Save) (e.g. for Llama.cpp).

[NEW] Llama-3.1 8b, 70b & 405b are trained on a crazy 15 trillion tokens with 128K long context lengths!

**[NEW] Try 2x faster inference in a free Colab for Llama-3.1 8b Instruct [here](https://colab.research.google.com/drive/1T-YBVfnphoVc8E2E854qF3jdia2Ll2W2?usp=sharing)**

In [None]:
%%capture
!pip install unsloth
# Also get the latest nightly Unsloth!
!pip uninstall unsloth -y && pip install --upgrade --no-cache-dir "unsloth[colab-new] @ git+https://github.com/unslothai/unsloth.git"

* We support Llama, Mistral, Phi-3, Gemma, Yi, DeepSeek, Qwen, TinyLlama, Vicuna, Open Hermes etc
* We support 16bit LoRA or 4bit QLoRA. Both 2x faster.
* `max_seq_length` can be set to anything, since we do automatic RoPE Scaling via [kaiokendev's](https://kaiokendev.github.io/til) method.
* [**NEW**] We make Gemma-2 9b / 27b **2x faster**! See our [Gemma-2 9b notebook](https://colab.research.google.com/drive/1vIrqH5uYDQwsJ4-OO3DErvuv4pBgVwk4?usp=sharing)
* [**NEW**] To finetune and auto export to Ollama, try our [Ollama notebook](https://colab.research.google.com/drive/1WZDi7APtQ9VsvOrQSSC5DDtxq159j8iZ?usp=sharing)
* [**NEW**] We make Mistral NeMo 12B 2x faster and fit in under 12GB of VRAM! [Mistral NeMo notebook](https://colab.research.google.com/drive/17d3U-CAIwzmbDRqbZ9NnpHxCkmXB6LZ0?usp=sharing)

In [None]:
from unsloth import FastLanguageModel
import torch
max_seq_length = 2048 # Choose any! We auto support RoPE Scaling internally!
dtype = None # None for auto detection. Float16 for Tesla T4, V100, Bfloat16 for Ampere+
load_in_4bit = False # Use 4bit quantization to reduce memory usage. Can be False.

# 4bit pre quantized models we support for 4x faster downloading + no OOMs.
fourbit_models = [
    "unsloth/Meta-Llama-3.1-8B-bnb-4bit",      # Llama-3.1 15 trillion tokens model 2x faster!
    "unsloth/Meta-Llama-3.1-8B-Instruct-bnb-4bit",
    "unsloth/Meta-Llama-3.1-70B-bnb-4bit",
    "unsloth/Meta-Llama-3.1-405B-bnb-4bit",    # We also uploaded 4bit for 405b!
    "unsloth/Mistral-Nemo-Base-2407-bnb-4bit", # New Mistral 12b 2x faster!
    "unsloth/Mistral-Nemo-Instruct-2407-bnb-4bit",
    "unsloth/mistral-7b-v0.3-bnb-4bit",        # Mistral v3 2x faster!
    "unsloth/mistral-7b-instruct-v0.3-bnb-4bit",
    "unsloth/Phi-3.5-mini-instruct",           # Phi-3.5 2x faster!
    "unsloth/Phi-3-medium-4k-instruct",
    "unsloth/gemma-2-9b-bnb-4bit",
    "unsloth/gemma-2-27b-bnb-4bit",            # Gemma 2x faster!
] # More models at https://huggingface.co/unsloth

model, tokenizer = FastLanguageModel.from_pretrained(
    model_name = "unsloth/Meta-Llama-3.1-8B-Instruct",
    max_seq_length = max_seq_length,
    dtype = dtype,
    load_in_4bit = load_in_4bit,
    # token = "hf_...", # use one if using gated models like meta-llama/Llama-2-7b-hf
)

We now add LoRA adapters so we only need to update a small fraction of all parameters (roughly 1 to 10% at larger ranks, and less at small ranks such as `r = 8`)!

In [None]:
model = FastLanguageModel.get_peft_model(
    model,
    r = 8, # Choose any number > 0 ! Suggested 8, 16, 32, 64, 128
    target_modules = ["q_proj", "k_proj", "v_proj", "o_proj",
                      "gate_proj", "up_proj", "down_proj",],
    lora_alpha = 8,
    lora_dropout = 0, # Supports any, but = 0 is optimized
    bias = "none",    # Supports any, but = "none" is optimized
    # [NEW] "unsloth" uses 30% less VRAM, fits 2x larger batch sizes!
    use_gradient_checkpointing = "unsloth", # True or "unsloth" for very long context
    random_state = 3407,
    use_rslora = False,  # We support rank stabilized LoRA
    loftq_config = None, # And LoftQ
)
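
To get a feel for how many weights the adapters above add, the count can be worked out by hand: a LoRA adapter of rank `r` on a linear layer with `d_in` inputs and `d_out` outputs adds `r * (d_in + d_out)` parameters. The sketch below assumes the published Llama-3.1-8B shapes (hidden size 4096, MLP size 14336, 32 layers, 8 key/value heads of dimension 128). These shapes are not stated in this notebook, so treat them as assumptions.

```python
# Assumed Llama-3.1-8B shapes (not taken from this notebook).
HIDDEN = 4096
INTERMEDIATE = 14336
NUM_LAYERS = 32
KV_DIM = 8 * 128  # 8 key/value heads of dimension 128

# (d_in, d_out) for each module listed in target_modules
MODULE_SHAPES = {
    "q_proj": (HIDDEN, HIDDEN),
    "k_proj": (HIDDEN, KV_DIM),
    "v_proj": (HIDDEN, KV_DIM),
    "o_proj": (HIDDEN, HIDDEN),
    "gate_proj": (HIDDEN, INTERMEDIATE),
    "up_proj": (HIDDEN, INTERMEDIATE),
    "down_proj": (INTERMEDIATE, HIDDEN),
}

def lora_param_count(r, shapes=MODULE_SHAPES, num_layers=NUM_LAYERS):
    """Each adapter adds an (r x d_in) and a (d_out x r) matrix."""
    per_layer = sum(r * (d_in + d_out) for d_in, d_out in shapes.values())
    return per_layer * num_layers

trainable = lora_param_count(r=8)
print(f"{trainable:,} trainable LoRA parameters")
```

Under these assumptions the count scales linearly with `r`, so doubling the rank doubles the number of trainable parameters.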

In [None]:
from google.colab import drive
drive.mount('/content/drive')

<a name="Data"></a>
### Data Prep
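The original Unsloth notebook uses the Alpaca dataset from [yahma](https://huggingface.co/datasets/yahma/alpaca-cleaned), which is a filtered version of the 52K-example original [Alpaca dataset](https://crfm.stanford.edu/2023/03/13/alpaca.html). This version instead loads a custom Excel file from Google Drive with `instruction`, `input` and `output` columns and formats it with the same Alpaca prompt. You can replace this code section with your own data prep.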

**[NOTE]** To train only on completions (ignoring the user's input) read TRL's docs [here](https://huggingface.co/docs/trl/sft_trainer#train-on-completions-only).

**[NOTE]** Remember to add the **EOS_TOKEN** to the tokenized output!! Otherwise you'll get infinite generations!

If you want to use the `llama-3` template for ShareGPT datasets, try our conversational [notebook](https://colab.research.google.com/drive/1XamvWYinY6FOSX9GLvnqSjjsNflxdhNc?usp=sharing).

For text completions like novel writing, try this [notebook](https://colab.research.google.com/drive/1ef-tab5bhkvWmBOObepl1WgJvfvSzn5Q?usp=sharing).
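
As a minimal sketch of the EOS note above, the snippet below formats one batch and appends an end-of-sequence marker to every example. The `"<EOS>"` string and the shortened `TEMPLATE` are placeholders for illustration; in the notebook the marker comes from `tokenizer.eos_token` and the template is the full Alpaca prompt.

```python
# Placeholder values for illustration only.
EOS_TOKEN = "<EOS>"
TEMPLATE = "### Instruction:\n{}\n\n### Input:\n{}\n\n### Response:\n{}"

def format_batch(examples):
    """Build one training string per row and terminate it with EOS."""
    texts = [
        TEMPLATE.format(instruction, context, response) + EOS_TOKEN
        for instruction, context, response in zip(
            examples["instruction"], examples["input"], examples["output"]
        )
    ]
    return {"text": texts}

batch = {
    "instruction": ["Translate to English."],
    "input": ["Merhaba"],
    "output": ["Hello"],
}
formatted = format_batch(batch)
print(formatted["text"][0])
```

Without the trailing marker the model never sees an example of where a response ends, which is why generation can run on indefinitely.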

In [None]:
!pip install openpyxl
import pandas as pd
from datasets import Dataset

csv_path = "/content/drive/MyDrive/Fine-tune/Dataset_last_version.xlsx"

# Read the Excel file (openpyxl handles .xlsx; no encoding argument is needed)
df = pd.read_excel(
    csv_path,
    engine='openpyxl'
)

# Do not strip all non-ASCII characters: the line below would delete Turkish letters.
# df = df.replace(r'[^\x00-\x7F]+', '?', regex=True)

# Convert every cell to a string so the regex cleanup applies uniformly.
# Empty cells become "" rather than the literal string "nan".
df = df.applymap(lambda x: x if isinstance(x, str) else ("" if pd.isna(x) else str(x)))
# Remove only ASCII control characters; Turkish characters are left untouched
df = df.replace(r'[\x00-\x08\x0B\x0C\x0E-\x1F\x7F]', '', regex=True)

# Select the required columns
df = df[['instruction', 'input', 'output']]

# Convert to a Hugging Face Dataset
dataset = Dataset.from_pandas(df)

# Prompt formatting
alpaca_prompt = """Below is an instruction that describes a task, paired with an input that provides further context. Write a response that appropriately completes the request.

### Instruction:
{}

### Input:
{}

### Response:
{}"""

EOS_TOKEN = tokenizer.eos_token
def formatting_prompts_func(examples):
    instructions = examples["instruction"]
    inputs = examples["input"]
    outputs = examples["output"]
    texts = []
    for instruction, input, output in zip(instructions, inputs, outputs):
        text = alpaca_prompt.format(instruction, input, output) + EOS_TOKEN
        texts.append(text)
    return {"text": texts}

# Apply the formatting function to the whole dataset
dataset = dataset.map(formatting_prompts_func, batched=True)

# Print a few rows as a sanity check
df_dataset = dataset.to_pandas()
print(df_dataset.head())

# Inspect the contents of the file (next cell)


In [None]:
print("First few rows:")
print(df[['instruction', 'input', 'output']].head(30))
print("\nCharacter encoding check:")
for col in ['instruction', 'input', 'output']:
    print(f"\nUnique characters in column {col}:")
    unique_chars = set(''.join(df[col].astype(str).values))
    print(''.join(sorted(unique_chars)))
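
The control-character cleanup used in the data-prep cell can be checked in isolation. The sketch below applies the same character class with the standard `re` module to a made-up sample string and confirms that Turkish letters and ordinary whitespace survive.

```python
import re

# Same character class as in the data-prep cell: ASCII control characters
# except tab (\x09), newline (\x0A) and carriage return (\x0D).
CONTROL_CHARS = re.compile(r'[\x00-\x08\x0B\x0C\x0E-\x1F\x7F]')

def strip_control_chars(text):
    """Remove control characters while leaving all other characters intact."""
    return CONTROL_CHARS.sub('', text)

# Made-up sample containing a NUL byte and a vertical tab.
sample = "Sistem\x00 mühendisliği\x0b: çğıöşü İ\n"
cleaned = strip_control_chars(sample)
print(repr(cleaned))
```

Because the pattern only names code points below `\x20` plus `\x7F`, every non-ASCII character passes through unchanged.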

<a name="Train"></a>
### Train the model
Now let's use Hugging Face TRL's `SFTTrainer`! More docs here: [TRL SFT docs](https://huggingface.co/docs/trl/sft_trainer). The cell below trains for `max_steps = 1000` to keep the run short; for a full pass over the data, set `num_train_epochs = 1` and remove `max_steps`. We also support TRL's `DPOTrainer`!

**[NOTE]** We will choose `max_steps` based on the logged training loss. Setting `num_train_epochs = 1` reveals the total number of steps in one epoch; for this dataset, roughly 10,000 steps cover 100% of the data.
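
The number of optimizer steps in one epoch can also be estimated without starting a run, since each step consumes `per_device_train_batch_size * gradient_accumulation_steps` examples. The sketch below is an approximation (the exact rounding depends on the `transformers` version), and the dataset size of 80,000 rows is a hypothetical value chosen only to illustrate the arithmetic.

```python
import math

def steps_per_epoch(num_examples, per_device_batch_size, grad_accum_steps, num_devices=1):
    """Approximate optimizer steps needed to see every example once."""
    effective_batch = per_device_batch_size * grad_accum_steps * num_devices
    return math.ceil(num_examples / effective_batch)

# Hypothetical dataset size; batch settings match the training cell.
total_steps = steps_per_epoch(num_examples=80_000, per_device_batch_size=2, grad_accum_steps=4)
fraction_seen = 1000 / total_steps  # share of one epoch covered by max_steps = 1000
print(total_steps, fraction_seen)
```

With an effective batch of 8, a fixed `max_steps` covers only `max_steps * 8` examples, which is worth comparing against the dataset size before judging the loss curve.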

In [None]:
from trl import SFTTrainer
from transformers import TrainingArguments
from unsloth import is_bfloat16_supported

trainer = SFTTrainer(
    model = model,
    tokenizer = tokenizer,
    train_dataset = dataset,
    dataset_text_field = "text",
    max_seq_length = max_seq_length,
    dataset_num_proc = 2,
    packing = False, # Can make training 5x faster for short sequences.
    args = TrainingArguments(
        per_device_train_batch_size = 2,
        gradient_accumulation_steps = 4,
        warmup_steps = 5,
        #num_train_epochs = 1, # Set this for 1 full training run.
        max_steps = 1000,
        learning_rate = 2e-4,
        fp16 = not is_bfloat16_supported(),
        bf16 = is_bfloat16_supported(),
        logging_steps = 1,
        optim = "adamw_8bit",
        weight_decay = 0.01,
        lr_scheduler_type = "linear",
        seed = 3407,
        output_dir = "outputs",
    ),
)
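
With `lr_scheduler_type = "linear"` and `warmup_steps = 5`, the learning rate ramps up linearly during warmup and then decays linearly to zero at `max_steps`. The function below is a plain-Python sketch of that shape for the values used above; it is meant for building intuition and is not the scheduler code the trainer itself runs.

```python
def linear_lr(step, base_lr=2e-4, warmup_steps=5, total_steps=1000):
    """Learning rate after `step` scheduler steps: linear warmup, then linear decay."""
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)

for step in (0, 1, 5, 500, 1000):
    print(step, linear_lr(step))
```

One consequence is that the last few hundred steps run at a much smaller learning rate, so changing `max_steps` also changes how fast the rate decays throughout the run.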

In [None]:
#@title Show current memory stats
gpu_stats = torch.cuda.get_device_properties(0)
start_gpu_memory = round(torch.cuda.max_memory_reserved() / 1024 / 1024 / 1024, 3)
max_memory = round(gpu_stats.total_memory / 1024 / 1024 / 1024, 3)
print(f"GPU = {gpu_stats.name}. Max memory = {max_memory} GB.")
print(f"{start_gpu_memory} GB of memory reserved.")

In [None]:
trainer_stats = trainer.train()

# Saving the model: save it as a full model

In [None]:
#@title Show final memory and time stats
used_memory = round(torch.cuda.max_memory_reserved() / 1024 / 1024 / 1024, 3)
used_memory_for_lora = round(used_memory - start_gpu_memory, 3)
used_percentage = round(used_memory         /max_memory*100, 3)
lora_percentage = round(used_memory_for_lora/max_memory*100, 3)
print(f"{trainer_stats.metrics['train_runtime']} seconds used for training.")
print(f"{round(trainer_stats.metrics['train_runtime']/60, 2)} minutes used for training.")
print(f"Peak reserved memory = {used_memory} GB.")
print(f"Peak reserved memory for training = {used_memory_for_lora} GB.")
print(f"Peak reserved memory % of max memory = {used_percentage} %.")
print(f"Peak reserved memory for training % of max memory = {lora_percentage} %.")

### Saving to float16 for vLLM

We also support saving to `float16` directly. Select `merged_16bit` for float16 or `merged_4bit` for int4. We also allow `lora` adapters as a fallback. Use `push_to_hub_merged` to upload to your Hugging Face account! You can go to https://huggingface.co/settings/tokens for your personal tokens.

In [None]:
# Merge to 16bit
if False: model.save_pretrained_merged("model", tokenizer, save_method = "merged_16bit",)
if True: model.push_to_hub_merged("Meta-Llama-3.1-8B-Instruct_syseng_vllm_v3", tokenizer, save_method = "merged_16bit", token = "hf_...") # Paste your own token at run time; never commit a real token

# Merge to 4bit
if False: model.save_pretrained_merged("model", tokenizer, save_method = "merged_4bit",)
if False: model.push_to_hub_merged("hf/model", tokenizer, save_method = "merged_4bit", token = "")

# Just LoRA adapters
if False: model.save_pretrained_merged("model", tokenizer, save_method = "lora",)
if False: model.push_to_hub_merged("hf/model", tokenizer, save_method = "lora", token = "")
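
One way to keep a Hugging Face token out of the notebook itself is to read it from an environment variable at run time. The sketch below assumes the variable is called `HF_TOKEN`; that name and the helper `get_hf_token` are choices made for this example, and the assignment to `os.environ` exists only so the demo is self-contained.

```python
import os

def get_hf_token(env_var="HF_TOKEN"):
    """Return the token stored in the given environment variable, or fail loudly."""
    token = os.environ.get(env_var, "")
    if not token:
        raise RuntimeError(f"Set the {env_var} environment variable before uploading.")
    return token

# Demo only: in practice the variable is set outside the notebook.
os.environ["HF_TOKEN"] = "hf_example_placeholder"
token = get_hf_token()
print(token[:3] + "...")  # avoid printing the full token
```

The returned value can then be passed as the `token` argument of `push_to_hub_merged` in place of a literal string.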

In [None]:
# Redefine the full list of the 1000 training loss values provided by the user
#8_8_true
training_loss_values = """
1	2.175000
2	2.223000
3	2.261000
4	1.943900
5	1.878800
6	1.660700
7	1.518600
8	1.341500
9	1.374400
10	1.146700
11	1.154300
12	1.100100
13	0.992000
14	1.069300
15	1.041100
16	1.152200
17	0.919900
18	1.016800
19	0.970200
20	0.891900
21	1.014700
22	1.052800
23	1.050400
24	1.049200
25	0.910100
26	0.958000
27	0.937500
28	0.895200
29	0.987000
30	0.976400
31	0.905100
32	1.041800
33	0.912500
34	1.008900
35	0.902000
36	0.959000
37	0.895600
38	0.983600
39	0.940700
40	1.048100
41	1.058100
42	0.976400
43	1.009100
44	0.900200
45	0.916800
46	0.916300
47	0.955200
48	0.927200
49	0.986700
50	0.954000
51	0.896500
52	0.975300
53	1.015000
54	0.856200
55	0.836600
56	0.961100
57	0.923400
58	0.935300
59	0.848800
60	0.837900
61	0.826000
62	1.018900
63	0.918500
64	0.955300
65	0.908600
66	0.962600
67	0.907400
68	0.848000
69	0.952300
70	0.861800
71	0.828700
72	0.926900
73	0.846500
74	0.959900
75	0.855200
76	0.791500
77	0.887900
78	0.829500
79	0.829300
80	0.889300
81	0.805800
82	0.815100
83	0.931000
84	0.844700
85	0.959200
86	0.884600
87	0.802800
88	0.941200
89	0.971200
90	0.826200
91	0.927600
92	0.896800
93	0.892300
94	0.919000
95	0.995600
96	0.797000
97	0.994600
98	0.890000
99	0.857600
100	0.814100
101	0.884300
102	0.736400
103	0.887600
104	0.842800
105	0.850100
106	0.889400
107	0.860000
108	0.916600
109	0.796100
110	0.959600
111	0.795400
112	0.786000
113	0.741000
114	0.937000
115	0.806600
116	0.857300
117	0.887800
118	0.849600
119	0.873500
120	0.833800
121	0.788900
122	0.839400
123	0.937200
124	0.863100
125	0.822400
126	0.759100
127	0.911800
128	0.821000
129	0.828200
130	0.884200
131	1.009200
132	0.897600
133	0.927800
134	0.808500
135	0.802100
136	0.791900
137	0.781500
138	0.908500
139	0.823900
140	0.770700
141	0.832100
142	0.931200
143	0.951800
144	0.959100
145	0.836000
146	0.983700
147	1.008100
148	1.002600
149	0.821200
150	0.819400
151	0.815600
152	0.796100
153	0.817800
154	0.878300
155	0.883600
156	0.775500
157	0.849300
158	0.896400
159	0.884800
160	0.731400
161	0.750500
162	0.671600
163	0.913000
164	0.822200
165	0.870700
166	0.747600
167	0.809900
168	0.879600
169	0.906400
170	0.868700
171	0.880800
172	0.722100
173	0.820500
174	0.905700
175	0.931500
176	0.913000
177	1.002900
178	0.868300
179	0.910100
180	0.825900
181	0.721800
182	0.811000
183	0.756100
184	0.804100
185	0.789200
186	0.934500
187	0.908100
188	0.883600
189	0.814400
190	0.849700
191	0.814200
192	0.778300
193	0.863100
194	0.804800
195	0.800700
196	0.860400
197	0.687000
198	0.746000
199	0.730800
200	0.698900
201	0.830700
202	0.710600
203	0.832000
204	0.856600
205	1.050600
206	0.764000
207	0.850900
208	0.820700
209	0.761100
210	0.813300
211	0.823000
212	0.766800
213	0.789400
214	0.798400
215	0.847700
216	0.813400
217	0.862100
218	0.708300
219	0.915100
220	0.761700
221	0.807200
222	0.735500
223	0.852400
224	0.814200
225	0.788200
226	0.770800
227	0.804400
228	0.712300
229	0.819500
230	0.939300
231	0.730300
232	0.860200
233	1.040800
234	0.950300
235	0.813400
236	0.795500
237	0.698700
238	0.755400
239	0.824200
240	0.784400
241	0.820800
242	0.818200
243	0.809000
244	0.761700
245	0.862800
246	0.806900
247	0.790900
248	0.707600
249	0.948600
250	0.856700
251	0.826400
252	0.786000
253	0.776800
254	0.673300
255	0.907800
256	0.723800
257	0.968600
258	0.748400
259	0.731200
260	0.751100
261	0.800700
262	0.741900
263	0.746000
264	0.817200
265	0.853100
266	0.777900
267	0.752200
268	0.659700
269	0.755000
270	0.766400
271	0.872200
272	0.785800
273	0.801200
274	0.687100
275	0.810600
276	0.674400
277	0.705600
278	0.829000
279	0.774400
280	0.779800
281	0.721600
282	0.850200
283	0.689300
284	0.787300
285	0.771700
286	0.798100
287	0.784500
288	0.901400
289	0.686800
290	0.765900
291	0.840100
292	0.892800
293	0.875700
294	0.957800
295	0.837600
296	0.782700
297	0.846900
298	0.711300
299	0.771800
300	0.765100
301	0.774100
302	0.681900
303	0.723800
304	0.851000
305	0.862100
306	0.751600
307	0.927800
308	0.695800
309	0.687900
310	0.840400
311	0.774900
312	0.656300
313	0.836400
314	0.764800
315	0.675300
316	0.807600
317	0.702700
318	0.703700
319	0.683700
320	0.848300
321	0.744900
322	0.689100
323	0.783500
324	0.641200
325	0.823700
326	0.799900
327	0.948900
328	0.812200
329	0.785300
330	0.722100
331	0.721900
332	0.714100
333	0.707300
334	0.813400
335	0.861900
336	0.785600
337	0.748600
338	0.848700
339	0.691400
340	0.719200
341	0.844400
342	0.765100
343	0.723600
344	0.789500
345	0.812900
346	0.640000
347	0.747800
348	0.749500
349	0.854900
350	0.805700
351	0.869500
352	0.715600
353	0.766100
354	0.908300
355	0.783800
356	0.695300
357	0.875300
358	0.775400
359	0.647200
360	0.770100
361	0.698800
362	0.852900
363	0.814800
364	0.814300
365	0.818400
366	0.775700
367	0.767500
368	0.782400
369	0.760200
370	0.947100
371	0.777300
372	0.791800
373	0.784200
374	0.704700
375	0.947400
376	0.776000
377	0.779500
378	0.861200
379	0.795500
380	0.639500
381	0.655800
382	0.841600
383	0.717900
384	0.702700
385	0.810500
386	0.724300
387	0.831900
388	0.770300
389	0.834600
390	0.751000
391	0.852900
392	0.685400
393	0.835000
394	0.851500
395	0.734800
396	0.839900
397	0.704600
398	0.853300
399	0.656500
400	0.753300
401	0.762300
402	0.769000
403	0.753100
404	0.693400
405	0.760500
406	0.660900
407	0.849600
408	0.704500
409	0.757200
410	0.764500
411	0.769400
412	0.763800
413	0.708500
414	0.738500
415	0.708900
416	0.708500
417	0.722900
418	0.735300
419	0.757100
420	0.791800
421	0.712300
422	0.832400
423	0.775300
424	0.746600
425	0.728800
426	0.783300
427	0.689800
428	0.863100
429	0.784300
430	0.696600
431	0.862200
432	0.729800
433	0.713600
434	0.847600
435	0.681700
436	0.791000
437	0.617100
438	0.796900
439	0.770800
440	0.796900
441	0.714500
442	0.665300
443	0.691000
444	0.683200
445	0.647000
446	0.725000
447	0.781700
448	0.676300
449	0.714200
450	0.739800
451	0.843300
452	0.732500
453	0.781300
454	0.756600
455	0.767500
456	0.752700
457	0.683800
458	0.774600
459	0.632800
460	0.873600
461	0.758000
462	0.689200
463	0.864800
464	0.765900
465	0.791700
466	0.817200
467	0.713100
468	0.867200
469	0.678500
470	0.771300
471	0.741300
472	0.677600
473	0.750800
474	0.808800
475	0.760400
476	0.957400
477	0.779800
478	0.713000
479	0.884900
480	0.832600
481	0.722700
482	0.709700
483	0.815400
484	0.727000
485	0.691900
486	0.736800
487	0.676800
488	0.823900
489	0.774500
490	0.781000
491	0.694100
492	0.751000
493	0.691000
494	0.734200
495	0.771900
496	0.767600
497	0.739400
498	0.683200
499	0.747000
500	0.868800
501	0.724900
502	0.760200
503	0.687400
504	0.791800
505	0.699400
506	0.714100
507	0.800500
508	0.610200
509	0.645800
510	0.767400
511	0.741200
512	0.732200
513	0.804700
514	0.703800
515	0.776500
516	0.730200
517	0.803800
518	0.706200
519	0.791800
520	0.707400
521	0.741700
522	0.728400
523	0.780100
524	0.733500
525	0.656500
526	0.698000
527	0.745100
528	0.741500
529	0.860700
530	0.626100
531	0.777600
532	0.682800
533	0.649700
534	0.608200
535	0.756400
536	0.750200
537	0.708400
538	0.691300
539	0.677000
540	0.794700
541	0.780800
542	0.705900
543	0.599200
544	0.729700
545	0.754700
546	0.618900
547	0.775200
548	0.704200
549	0.736600
550	0.777800
551	0.809900
552	0.705300
553	0.694200
554	0.666600
555	0.696300
556	0.707200
557	0.634400
558	0.834700
559	0.724100
560	0.734300
561	0.718200
562	0.724500
563	0.742300
564	0.733200
565	0.667800
566	0.663200
567	0.631800
568	0.768500
569	0.842300
570	0.740600
571	0.799600
572	0.651800
573	0.672200
574	0.836600
575	0.825100
576	0.665300
577	0.746500
578	0.765900
579	0.648100
580	0.735600
581	0.741100
582	0.619500
583	0.885500
584	0.614100
585	0.600900
586	0.739400
587	0.745700
588	0.776400
589	0.724600
590	0.669400
591	0.694800
592	0.809300
593	0.694400
594	0.834500
595	0.696700
596	0.832300
597	0.741500
598	0.818400
599	0.615300
600	0.693400
601	0.677100
602	0.781200
603	0.659200
604	0.762100
605	0.779900
606	0.729300
607	0.800100
608	0.795300
609	0.760200
610	0.726100
611	0.698100
612	0.621500
613	0.771400
614	0.658600
615	0.809400
616	0.856300
617	0.800400
618	0.647000
619	0.745000
620	0.815800
621	0.880500
622	0.678500
623	0.736000
624	0.685600
625	0.691800
626	0.809700
627	0.804200
628	0.627100
629	0.741200
630	0.724000
631	0.847700
632	0.648300
633	0.557700
634	0.595000
635	0.650700
636	0.689400
637	0.796400
638	0.693100
639	0.737600
640	0.704200
641	0.787900
642	0.769100
643	0.671400
644	0.732100
645	0.794300
646	0.846000
647	0.697000
648	0.719400
649	0.701600
650	0.668000
651	0.748900
652	0.653000
653	0.775800
654	0.646500
655	0.619200
656	0.845400
657	0.698800
658	0.661600
659	0.678300
660	0.685700
661	0.742200
662	0.726600
663	0.795200
664	0.758300
665	0.667900
666	0.779300
667	0.852800
668	0.741000
669	0.733300
670	0.796100
671	0.744200
672	0.638500
673	0.737300
674	0.729800
675	0.920600
676	0.693400
677	0.671900
678	0.732500
679	0.790300
680	0.801500
681	0.809400
682	0.789200
683	0.709100
684	0.815200
685	0.777400
686	0.805600
687	0.779700
688	0.773600
689	0.828200
690	0.647100
691	0.672600
692	0.714300
693	0.797200
694	0.678400
695	0.703800
696	0.884100
697	0.718100
698	0.670000
699	0.695000
700	0.722200
701	0.664200
702	0.654700
703	0.835200
704	0.826100
705	0.695400
706	0.633000
707	0.875400
708	0.671800
709	0.653800
710	0.593200
711	0.651300
712	0.670100
713	0.841800
714	0.726800
715	0.765600
716	0.751300
717	0.763200
718	0.688900
719	0.730600
720	0.607100
721	0.648600
722	0.809700
723	0.755100
724	0.798600
725	0.818000
726	0.729600
727	0.658600
728	0.637500
729	0.694100
730	0.707600
731	0.701100
732	0.630600
733	0.688400
734	0.759300
735	0.830900
736	0.705500
737	0.665700
738	0.739800
739	0.719800
740	0.903400
741	0.757400
742	0.672700
743	0.743700
744	0.630700
745	0.810900
746	0.773700
747	0.726900
748	0.680500
749	0.831300
750	0.819100
751	0.673900
752	0.662800
753	0.780600
754	0.665500
755	0.756000
756	0.679700
757	0.655800
758	0.701400
759	0.674200
760	0.621200
761	0.793200
762	0.610300
763	0.682500
764	0.740400
765	0.923600
766	0.703900
767	0.696200
768	0.713700
769	0.698200
770	0.796300
771	0.814700
772	0.736600
773	0.712400
774	0.828400
775	0.706800
776	0.688300
777	0.664400
778	0.777500
779	0.649900
780	0.595400
781	0.741500
782	0.678800
783	0.630400
784	0.723200
785	0.766800
786	0.719100
787	0.781200
788	0.716500
789	0.743700
790	0.681500
791	0.826300
792	0.651600
793	0.746300
794	0.734100
795	0.657200
796	0.605500
797	0.638600
798	0.750400
799	0.619900
800	0.772900
801	0.655400
802	0.740600
803	0.567700
804	0.749800
805	0.707700
806	0.740100
807	0.702900
808	0.648900
809	0.710500
810	0.688200
811	0.688400
812	0.747800
813	0.664400
814	0.778700
815	0.717300
816	0.713400
817	0.612700
818	0.744300
819	0.710200
820	0.791400
821	0.685400
822	0.694000
823	0.674800
824	0.709700
825	0.753000
826	0.795400
827	0.832900
828	0.679400
829	0.611700
830	0.762600
831	0.790000
832	0.776000
833	0.851000
834	0.782100
835	0.709200
836	0.636100
837	0.654700
838	0.743700
839	0.642800
840	0.733500
841	0.692700
842	0.712100
843	0.717600
844	0.740100
845	0.698800
846	0.666500
847	0.700000
848	0.762000
849	0.671400
850	0.678300
851	0.684600
852	0.771900
853	0.761300
854	0.672500
855	0.706400
856	0.740100
857	0.684600
858	0.771300
859	0.696400
860	0.686900
861	0.708900
862	0.692200
863	0.592100
864	0.645000
865	0.712500
866	0.663600
867	0.740300
868	0.615600
869	0.664700
870	0.687900
871	0.730500
872	0.805900
873	0.674200
874	0.822200
875	0.853700
876	0.618600
877	0.773300
878	0.673600
879	0.629900
880	0.712200
881	0.753300
882	0.655100
883	0.718400
884	0.622800
885	0.715300
886	0.563200
887	0.630200
888	0.719700
889	0.662200
890	0.700000
891	0.743800
892	0.600200
893	0.607000
894	0.709000
895	0.718300
896	0.617000
897	0.674200
898	0.671400
899	0.738000
900	0.614200
901	0.605800
902	0.696100
903	0.797400
904	0.763600
905	0.760000
906	0.698100
907	0.771900
908	0.709500
909	0.765200
910	0.668800
911	0.788900
912	0.651500
913	0.690300
914	0.667000
915	0.722200
916	0.635200
917	0.700100
918	0.695400
919	0.698600
920	0.688500
921	0.649900
922	0.618000
923	0.709800
924	0.662200
925	0.763900
926	0.623700
927	0.680000
928	0.902600
929	0.663400
930	0.680200
931	0.683700
932	0.724400
933	0.670300
934	0.809700
935	0.758100
936	0.694100
937	0.669100
938	0.706200
939	0.771200
940	0.660900
941	0.799800
942	0.708600
943	0.738500
944	0.647900
945	0.817300
946	0.670600
947	0.677800
948	0.692200
949	0.671300
950	0.646800
951	0.713600
952	0.665000
953	0.872800
954	0.649100
955	0.772000
956	0.796100
957	0.834500
958	0.667900
959	0.655700
960	0.655600
961	0.697100
962	0.652400
963	0.753700
964	0.707200
965	0.587700
966	0.740600
967	0.610400
968	0.620900
969	0.633800
970	0.621100
971	0.709100
972	0.666800
973	0.918500
974	0.717000
975	0.788400
976	0.720300
977	0.641900
978	0.662600
979	0.824900
980	0.611800
981	0.712500
982	0.711300
983	0.685700
984	0.702800
985	0.630000
986	0.636800
987	0.643800
988	0.791700
989	0.771000
990	0.698900
991	0.687800
992	0.717500
993	0.659900
994	0.811600
995	0.625800
996	0.697100
997	0.696700
998	0.764700
999	0.806000
1000	0.727900
"""

import matplotlib.pyplot as plt
import numpy as np

# Parse the training loss values: each line is "<step>\t<loss>"
lines = training_loss_values.strip().split("\n")
steps = []
loss_values = []

for line in lines:
    step, loss = line.split()
    steps.append(int(step))
    loss_values.append(float(loss))

# Create the plot
plt.figure(figsize=(12, 6))
plt.plot(steps, loss_values, label="Training Loss", color='b')

# Labels and title (the x-axis is optimizer steps, not epochs)
plt.xlabel("Step")
plt.ylabel("Loss")
plt.title("Training Loss Over Steps")
plt.legend()
plt.grid(True)

# Show the plot
plt.show()
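
Per-step loss logged with `logging_steps = 1` is noisy, so the overall trend is easier to read after smoothing. The sketch below is a simple moving average in plain Python; the window size is a free choice, and the short list stands in for the parsed `loss_values`.

```python
def moving_average(values, window):
    """Mean of each consecutive run of `window` values."""
    if window < 1 or window > len(values):
        raise ValueError("window must be between 1 and len(values)")
    running = sum(values[:window])
    averaged = [running / window]
    for i in range(window, len(values)):
        # Slide the window: add the new value, drop the oldest one.
        running += values[i] - values[i - window]
        averaged.append(running / window)
    return averaged

# Stand-in data; in the notebook, pass the parsed loss_values instead.
smoothed = moving_average([1.0, 2.0, 3.0, 4.0, 5.0], window=3)
print(smoothed)
```

The result has `len(values) - window + 1` points, so when plotting it against the logged steps, drop the first `window - 1` step numbers to keep the two lists aligned.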


