<a href="https://colab.research.google.com/github/kmk4444/System_engineering/blob/main/Llama_3_1_8b_%2B_Unsloth_2x_faster_finetuning.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

To run this, press "*Runtime*" and press "*Run all*" on a **free** Tesla T4 Google Colab instance!
<div class="align-center">
  <a href="https://github.com/unslothai/unsloth"><img src="https://github.com/unslothai/unsloth/raw/main/images/unsloth%20new%20logo.png" width="115"></a>
  <a href="https://discord.gg/u54VK8m8tk"><img src="https://github.com/unslothai/unsloth/raw/main/images/Discord button.png" width="145"></a>
  <a href="https://ko-fi.com/unsloth"><img src="https://github.com/unslothai/unsloth/raw/main/images/Kofi button.png" width="145"></a> Join Discord if you need help + ⭐ <i>Star us on <a href="https://github.com/unslothai/unsloth">GitHub</a> </i> ⭐
</div>

To install Unsloth on your own computer, follow the installation instructions on our GitHub page [here](https://github.com/unslothai/unsloth?tab=readme-ov-file#-installation-instructions).

You will learn how to do [data prep](#Data), how to [train](#Train), how to [run the model](#Inference), and [how to save it](#Save) (e.g. for llama.cpp).

[NEW] Llama-3.1 8b, 70b & 405b are trained on a crazy 15 trillion tokens with 128K long context lengths!

**[NEW] Try 2x faster inference in a free Colab for Llama-3.1 8b Instruct [here](https://colab.research.google.com/drive/1T-YBVfnphoVc8E2E854qF3jdia2Ll2W2?usp=sharing)**

In [None]:
%%capture
!pip install unsloth
# Also get the latest nightly Unsloth!
!pip uninstall unsloth -y && pip install --upgrade --no-cache-dir "unsloth[colab-new] @ git+https://github.com/unslothai/unsloth.git"

* We support Llama, Mistral, Phi-3, Gemma, Yi, DeepSeek, Qwen, TinyLlama, Vicuna, Open Hermes etc
* We support 16bit LoRA or 4bit QLoRA. Both 2x faster.
* `max_seq_length` can be set to anything, since we do automatic RoPE Scaling via [kaiokendev's](https://kaiokendev.github.io/til) method.
* [**NEW**] We make Gemma-2 9b / 27b **2x faster**! See our [Gemma-2 9b notebook](https://colab.research.google.com/drive/1vIrqH5uYDQwsJ4-OO3DErvuv4pBgVwk4?usp=sharing)
* [**NEW**] To finetune and auto export to Ollama, try our [Ollama notebook](https://colab.research.google.com/drive/1WZDi7APtQ9VsvOrQSSC5DDtxq159j8iZ?usp=sharing)
* [**NEW**] We make Mistral NeMo 12B 2x faster and fit in under 12GB of VRAM! [Mistral NeMo notebook](https://colab.research.google.com/drive/17d3U-CAIwzmbDRqbZ9NnpHxCkmXB6LZ0?usp=sharing)

In [None]:
from unsloth import FastLanguageModel
import torch
max_seq_length = 2048 # Choose any! We auto support RoPE Scaling internally!
dtype = None # None for auto detection. Float16 for Tesla T4, V100, Bfloat16 for Ampere+
load_in_4bit = False # Use 4bit quantization to reduce memory usage. Can be False.

# 4bit pre quantized models we support for 4x faster downloading + no OOMs.
fourbit_models = [
    "unsloth/Meta-Llama-3.1-8B-bnb-4bit",      # Llama-3.1 15 trillion tokens model 2x faster!
    "unsloth/Meta-Llama-3.1-8B-Instruct-bnb-4bit",
    "unsloth/Meta-Llama-3.1-70B-bnb-4bit",
    "unsloth/Meta-Llama-3.1-405B-bnb-4bit",    # We also uploaded 4bit for 405b!
    "unsloth/Mistral-Nemo-Base-2407-bnb-4bit", # New Mistral 12b 2x faster!
    "unsloth/Mistral-Nemo-Instruct-2407-bnb-4bit",
    "unsloth/mistral-7b-v0.3-bnb-4bit",        # Mistral v3 2x faster!
    "unsloth/mistral-7b-instruct-v0.3-bnb-4bit",
    "unsloth/Phi-3.5-mini-instruct",           # Phi-3.5 2x faster!
    "unsloth/Phi-3-medium-4k-instruct",
    "unsloth/gemma-2-9b-bnb-4bit",
    "unsloth/gemma-2-27b-bnb-4bit",            # Gemma 2x faster!
] # More models at https://huggingface.co/unsloth

model, tokenizer = FastLanguageModel.from_pretrained(
    model_name = "unsloth/Meta-Llama-3.1-8B-Instruct",
    max_seq_length = max_seq_length,
    dtype = dtype,
    load_in_4bit = load_in_4bit,
    # token = "hf_...", # use one if using gated models like meta-llama/Llama-2-7b-hf
)

We now add LoRA adapters so we only need to update 1 to 10% of all parameters!
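
To get a feel for how few parameters LoRA trains, the sketch below counts the adapter weights for a given rank. It relies on the standard LoRA construction, in which a frozen weight of shape `(d_out, d_in)` gets two trainable matrices of shapes `(r, d_in)` and `(d_out, r)`. The layer shapes are assumptions taken from the published Llama-3.1-8B configuration (hidden size 4096, MLP size 14336, 32 layers, 1024-wide key/value projections), and `lora_param_count` is a hypothetical helper, not part of Unsloth.

```python
# Assumed Llama-3.1-8B projection shapes as (d_in, d_out); check your model's config.
LAYER_SHAPES = {
    "q_proj": (4096, 4096),
    "k_proj": (4096, 1024),
    "v_proj": (4096, 1024),
    "o_proj": (4096, 4096),
    "gate_proj": (4096, 14336),
    "up_proj": (4096, 14336),
    "down_proj": (14336, 4096),
}
NUM_LAYERS = 32  # assumed number of transformer blocks


def lora_param_count(r, shapes=LAYER_SHAPES, num_layers=NUM_LAYERS):
    """Count trainable LoRA weights: each adapted matrix adds r * (d_in + d_out)."""
    per_layer = sum(r * (d_in + d_out) for d_in, d_out in shapes.values())
    return per_layer * num_layers


for rank in (8, 16, 64):
    print(f"r = {rank}: {lora_param_count(rank):,} trainable LoRA parameters")
```

The count grows linearly with `r`, so doubling the rank doubles the number of trainable weights.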

In [None]:
model = FastLanguageModel.get_peft_model(
    model,
    r = 8, # Choose any number > 0 ! Suggested 8, 16, 32, 64, 128
    target_modules = ["q_proj", "k_proj", "v_proj", "o_proj",
                      "gate_proj", "up_proj", "down_proj",],
    lora_alpha = 8,
    lora_dropout = 0, # Supports any, but = 0 is optimized
    bias = "none",    # Supports any, but = "none" is optimized
    # [NEW] "unsloth" uses 30% less VRAM, fits 2x larger batch sizes!
    use_gradient_checkpointing = "unsloth", # True or "unsloth" for very long context
    random_state = 3407,
    use_rslora = False,  # We support rank stabilized LoRA
    loftq_config = None, # And LoftQ
)

In [None]:
from google.colab import drive
drive.mount('/content/drive')

<a name="Data"></a>
### Data Prep
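The original notebook uses the Alpaca dataset from [yahma](https://huggingface.co/datasets/yahma/alpaca-cleaned), a cleaned version of the 52K-example original [Alpaca dataset](https://crfm.stanford.edu/2023/03/13/alpaca.html). The code in this section instead loads a custom Excel file from Google Drive that has the same `instruction`, `input`, and `output` columns. You can replace this code section with your own data prep.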

**[NOTE]** To train only on completions (ignoring the user's input) read TRL's docs [here](https://huggingface.co/docs/trl/sft_trainer#train-on-completions-only).
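
The idea behind completion-only training can be sketched without any library: the labels for the prompt tokens are replaced with `-100`, the index that PyTorch's cross-entropy loss ignores by default, so only the response tokens contribute to the loss. The token ids and the helper `mask_prompt_labels` below are made up for illustration; the TRL docs linked above describe the supported way to do this.

```python
IGNORE_INDEX = -100  # default ignore_index of torch.nn.CrossEntropyLoss


def mask_prompt_labels(input_ids, prompt_length):
    """Return labels equal to input_ids, with the first prompt_length tokens ignored."""
    if not 0 <= prompt_length <= len(input_ids):
        raise ValueError("prompt_length must be between 0 and len(input_ids)")
    return [IGNORE_INDEX] * prompt_length + list(input_ids[prompt_length:])


# Hypothetical token ids: 4 prompt tokens followed by 3 response tokens.
input_ids = [101, 7, 8, 9, 42, 43, 2]
labels = mask_prompt_labels(input_ids, prompt_length=4)
print(labels)
```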

**[NOTE]** Remember to add the **EOS_TOKEN** to the tokenized output!! Otherwise you'll get infinite generations!

If you want to use the `llama-3` template for ShareGPT datasets, try our conversational [notebook](https://colab.research.google.com/drive/1XamvWYinY6FOSX9GLvnqSjjsNflxdhNc?usp=sharing).

For text completions like novel writing, try this [notebook](https://colab.research.google.com/drive/1ef-tab5bhkvWmBOObepl1WgJvfvSzn5Q?usp=sharing).

In [None]:
!pip install openpyxl
import pandas as pd
from datasets import Dataset

excel_path = "/content/drive/MyDrive/Fine-tune/Dataset_last_version.xlsx"

# Read the Excel file; the openpyxl engine handles .xlsx workbooks
df = pd.read_excel(
    excel_path,
    engine='openpyxl'
)

# Do not strip non-ASCII characters. The line below stays disabled
# because it deletes Turkish characters:
# df = df.replace(r'[^\x00-\x7F]+', '?', regex=True)

# Convert every cell to a string. Empty cells are read as NaN, so map them
# to "" instead of the literal text "nan"
df = df.applymap(lambda x: x if isinstance(x, str) else ("" if pd.isna(x) else str(x)))
# Remove only control characters; Turkish characters are left untouched
df = df.replace(r'[\x00-\x08\x0B\x0C\x0E-\x1F\x7F]', '', regex=True)

# Select the required columns
df = df[['instruction', 'input', 'output']]

# Convert to a Hugging Face Dataset
dataset = Dataset.from_pandas(df)

# Prompt template
alpaca_prompt = """Below is an instruction that describes a task, paired with an input that provides further context. Write a response that appropriately completes the request.

### Instruction:
{}

### Input:
{}

### Response:
{}"""

EOS_TOKEN = tokenizer.eos_token
def formatting_prompts_func(examples):
    instructions = examples["instruction"]
    inputs = examples["input"]
    outputs = examples["output"]
    texts = []
    for instruction, input, output in zip(instructions, inputs, outputs):
        text = alpaca_prompt.format(instruction, input, output) + EOS_TOKEN
        texts.append(text)
    return {"text": texts}

# Apply the formatting function to the whole dataset
dataset = dataset.map(formatting_prompts_func, batched=True)

# Print the first rows as a sanity check
df_dataset = dataset.to_pandas()
print(df_dataset.head())

# Inspect the contents of the file


In [None]:
print("First few rows:")
print(df[['instruction', 'input', 'output']].head(30))
print("\nCharacter encoding check:")
for col in ['instruction', 'input', 'output']:
    print(f"\nUnique characters in the {col} column:")
    unique_chars = set(''.join(df[col].astype(str).values))
    print(''.join(sorted(unique_chars)))

<a name="Train"></a>
### Train the model
Now let's use Hugging Face TRL's `SFTTrainer`! More docs here: [TRL SFT docs](https://huggingface.co/docs/trl/sft_trainer). We train for `max_steps = 1000` here to keep the run short, but for a full run you can set `num_train_epochs=1` and remove the `max_steps` setting. We also support TRL's `DPOTrainer`!

**[NOTE]** We will choose `max_steps` based on the logged loss. Setting `num_train_epochs=1` here reveals the total number of steps: roughly 10,000 steps are needed to train on 100% of the dataset.
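
The relationship between `max_steps` and a full epoch can be estimated from the batch settings: each optimizer step consumes `per_device_train_batch_size * gradient_accumulation_steps` examples per device. The sketch below is an approximation (the trainer's exact rounding may differ), and the dataset size of 80,000 rows is a hypothetical figure chosen only because it matches the roughly 10,000 steps mentioned above.

```python
import math


def steps_per_epoch(num_examples, per_device_batch_size, grad_accum_steps, num_devices=1):
    """Approximate number of optimizer steps needed to see every example once."""
    effective_batch = per_device_batch_size * grad_accum_steps * num_devices
    return math.ceil(num_examples / effective_batch)


def epoch_fraction(max_steps, num_examples, per_device_batch_size, grad_accum_steps, num_devices=1):
    """Approximate share of one epoch covered by max_steps optimizer steps."""
    total = steps_per_epoch(num_examples, per_device_batch_size, grad_accum_steps, num_devices)
    return max_steps / total


# Batch settings from this notebook; the dataset size is a hypothetical example.
full_epoch = steps_per_epoch(80_000, per_device_batch_size=2, grad_accum_steps=4)
covered = epoch_fraction(1000, 80_000, per_device_batch_size=2, grad_accum_steps=4)
print(f"steps per epoch: {full_epoch}, share covered by 1000 steps: {covered:.0%}")
```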

In [None]:
from trl import SFTTrainer
from transformers import TrainingArguments
from unsloth import is_bfloat16_supported

trainer = SFTTrainer(
    model = model,
    tokenizer = tokenizer,
    train_dataset = dataset,
    dataset_text_field = "text",
    max_seq_length = max_seq_length,
    dataset_num_proc = 2,
    packing = False, # Can make training 5x faster for short sequences.
    args = TrainingArguments(
        per_device_train_batch_size = 2,
        gradient_accumulation_steps = 4,
        warmup_steps = 5,
        #num_train_epochs = 1, # Set this for 1 full training run.
        max_steps = 1000,
        learning_rate = 2e-4,
        fp16 = not is_bfloat16_supported(),
        bf16 = is_bfloat16_supported(),
        logging_steps = 1,
        optim = "adamw_8bit",
        weight_decay = 0.01,
        lr_scheduler_type = "linear",
        seed = 3407,
        output_dir = "outputs",
    ),
)

In [None]:
#@title Show current memory stats
gpu_stats = torch.cuda.get_device_properties(0)
start_gpu_memory = round(torch.cuda.max_memory_reserved() / 1024 / 1024 / 1024, 3)
max_memory = round(gpu_stats.total_memory / 1024 / 1024 / 1024, 3)
print(f"GPU = {gpu_stats.name}. Max memory = {max_memory} GB.")
print(f"{start_gpu_memory} GB of memory reserved.")

In [None]:
trainer_stats = trainer.train()

# Model saving step: save as a full model

In [None]:
#@title Show final memory and time stats
used_memory = round(torch.cuda.max_memory_reserved() / 1024 / 1024 / 1024, 3)
used_memory_for_lora = round(used_memory - start_gpu_memory, 3)
used_percentage = round(used_memory         /max_memory*100, 3)
lora_percentage = round(used_memory_for_lora/max_memory*100, 3)
print(f"{trainer_stats.metrics['train_runtime']} seconds used for training.")
print(f"{round(trainer_stats.metrics['train_runtime']/60, 2)} minutes used for training.")
print(f"Peak reserved memory = {used_memory} GB.")
print(f"Peak reserved memory for training = {used_memory_for_lora} GB.")
print(f"Peak reserved memory % of max memory = {used_percentage} %.")
print(f"Peak reserved memory for training % of max memory = {lora_percentage} %.")

### Saving to float16 for vLLM

We also support saving to `float16` directly. Select `merged_16bit` for float16 or `merged_4bit` for int4. We also allow `lora` adapters as a fallback. Use `push_to_hub_merged` to upload to your Hugging Face account! You can go to https://huggingface.co/settings/tokens for your personal tokens.
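
When pushing to the Hub, it is safer to read the token from the environment than to paste it into a cell that may later be shared. A minimal sketch, assuming the token has been stored in an environment variable named `HF_TOKEN` (the variable name and the helper `read_hf_token` are assumptions for illustration):

```python
import os


def read_hf_token(env_var="HF_TOKEN"):
    """Return the token stored in env_var, or raise if it is missing or empty."""
    token = os.environ.get(env_var, "").strip()
    if not token:
        raise RuntimeError(f"Set the {env_var} environment variable before pushing to the Hub.")
    return token
```

The returned value can then be passed as the `token` argument of `push_to_hub_merged` instead of a literal string.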

In [None]:
# Merge to 16bit
if False: model.save_pretrained_merged("model", tokenizer, save_method = "merged_16bit",)
if True: model.push_to_hub_merged("Meta-Llama-3.1-8B-Instruct_syseng_vllm_v3", tokenizer, save_method = "merged_16bit", token = "hf_...") # Use your own token; never hard-code a real one in a shared notebook

# Merge to 4bit
if False: model.save_pretrained_merged("model", tokenizer, save_method = "merged_4bit",)
if False: model.push_to_hub_merged("hf/model", tokenizer, save_method = "merged_4bit", token = "")

# Just LoRA adapters
if False: model.save_pretrained_merged("model", tokenizer, save_method = "lora",)
if False: model.push_to_hub_merged("hf/model", tokenizer, save_method = "lora", token = "")

In [None]:
# Redefine the full list of the 1000 logged training loss values (step, loss)
# r = 8, lora_alpha = 8
training_loss_values = """
1	2.175000
2	2.223000
3	2.280100
4	2.010100
5	2.059600
6	1.939600
7	1.906600
8	1.685100
9	1.692700
10	1.406800
11	1.336000
12	1.268800
13	1.101700
14	1.211500
15	1.163300
16	1.272500
17	1.002200
18	1.101200
19	1.030000
20	0.975800
21	1.086200
22	1.116400
23	1.096700
24	1.065400
25	0.966200
26	1.002600
27	0.979400
28	0.916800
29	1.009300
30	1.021900
31	0.937800
32	1.072400
33	0.956800
34	1.039800
35	0.926900
36	0.987700
37	0.931600
38	1.025900
39	0.974300
40	1.079500
41	1.086800
42	0.998500
43	1.048500
44	0.913900
45	0.943300
46	0.933800
47	0.994200
48	0.968800
49	1.028800
50	0.974400
51	0.920800
52	1.013900
53	1.040900
54	0.880400
55	0.882300
56	0.980500
57	0.930000
58	0.967400
59	0.863000
60	0.854600
61	0.839800
62	1.038300
63	0.922600
64	0.976700
65	0.929000
66	0.991300
67	0.937600
68	0.878200
69	0.972800
70	0.883800
71	0.837900
72	0.973200
73	0.871500
74	1.000900
75	0.878900
76	0.814200
77	0.909100
78	0.850000
79	0.840300
80	0.925300
81	0.824300
82	0.816000
83	0.949800
84	0.858200
85	0.973900
86	0.897000
87	0.826500
88	0.975200
89	0.994000
90	0.847100
91	0.947300
92	0.910100
93	0.914000
94	0.961700
95	1.023300
96	0.821900
97	1.009600
98	0.921000
99	0.877500
100	0.831800
101	0.910700
102	0.757200
103	0.907400
104	0.858800
105	0.861700
106	0.896200
107	0.879000
108	0.953300
109	0.810200
110	0.978800
111	0.812100
112	0.804300
113	0.761900
114	0.957900
115	0.823400
116	0.878200
117	0.904200
118	0.872800
119	0.878100
120	0.861400
121	0.801100
122	0.853800
123	0.950900
124	0.887000
125	0.839900
126	0.776100
127	0.933800
128	0.830200
129	0.848700
130	0.900600
131	1.027800
132	0.914800
133	0.957300
134	0.819200
135	0.823600
136	0.808300
137	0.818900
138	0.917600
139	0.844400
140	0.791400
141	0.857100
142	0.956200
143	0.957300
144	0.973600
145	0.849700
146	1.001600
147	1.025000
148	1.016500
149	0.835200
150	0.836900
151	0.841800
152	0.818400
153	0.825400
154	0.879900
155	0.897900
156	0.801500
157	0.870200
158	0.922200
159	0.890400
160	0.743400
161	0.764000
162	0.695200
163	0.917700
164	0.851700
165	0.882200
166	0.753700
167	0.825900
168	0.898900
169	0.920600
170	0.884300
171	0.909300
172	0.732600
173	0.847500
174	0.937100
175	0.950600
176	0.929700
177	1.021600
178	0.880400
179	0.927700
180	0.825000
181	0.749900
182	0.825600
183	0.768000
184	0.816200
185	0.796400
186	0.941400
187	0.926100
188	0.897000
189	0.831500
190	0.863500
191	0.824700
192	0.784500
193	0.883100
194	0.811000
195	0.800800
196	0.886700
197	0.707500
198	0.765400
199	0.753300
200	0.714600
201	0.853700
202	0.720100
203	0.826300
204	0.887000
205	1.069500
206	0.777400
207	0.858500
208	0.811900
209	0.774000
210	0.826600
211	0.835300
212	0.785300
213	0.799400
214	0.803600
215	0.868600
216	0.831100
217	0.879400
218	0.735100
219	0.916000
220	0.775100
221	0.818400
222	0.742500
223	0.877300
224	0.826000
225	0.810700
226	0.783700
227	0.823000
228	0.727300
229	0.834300
230	0.957500
231	0.736800
232	0.867200
233	1.053500
234	0.951600
235	0.821500
236	0.812100
237	0.703800
238	0.766800
239	0.842500
240	0.797400
241	0.831700
242	0.834600
243	0.824700
244	0.763500
245	0.874800
246	0.817200
247	0.796100
248	0.714400
249	0.963000
250	0.878400
251	0.839400
252	0.811100
253	0.793000
254	0.688800
255	0.917300
256	0.740900
257	0.982000
258	0.766700
259	0.740800
260	0.763700
261	0.815800
262	0.748300
263	0.757600
264	0.828900
265	0.860100
266	0.800400
267	0.759100
268	0.672900
269	0.774800
270	0.768000
271	0.884800
272	0.801000
273	0.816200
274	0.708900
275	0.822200
276	0.686300
277	0.724100
278	0.848400
279	0.804300
280	0.795600
281	0.747500
282	0.865200
283	0.712500
284	0.803900
285	0.790500
286	0.818200
287	0.796300
288	0.916000
289	0.697400
290	0.773800
291	0.852300
292	0.922800
293	0.894100
294	0.975800
295	0.853300
296	0.806100
297	0.860200
298	0.723400
299	0.783600
300	0.776300
301	0.781700
302	0.695600
303	0.733100
304	0.864400
305	0.882600
306	0.768800
307	0.943500
308	0.722500
309	0.690700
310	0.846000
311	0.792800
312	0.663200
313	0.859700
314	0.779600
315	0.686600
316	0.828700
317	0.690900
318	0.703900
319	0.696100
320	0.860900
321	0.761900
322	0.702300
323	0.803500
324	0.646400
325	0.828700
326	0.816600
327	0.956200
328	0.822800
329	0.802200
330	0.738000
331	0.737600
332	0.728400
333	0.708100
334	0.821500
335	0.867600
336	0.793100
337	0.756000
338	0.849700
339	0.700900
340	0.743500
341	0.853300
342	0.777200
343	0.737600
344	0.797900
345	0.824700
346	0.650300
347	0.754900
348	0.754300
349	0.868600
350	0.817800
351	0.880600
352	0.722700
353	0.767100
354	0.930300
355	0.794600
356	0.703100
357	0.889000
358	0.786100
359	0.660000
360	0.777200
361	0.721400
362	0.856300
363	0.834400
364	0.822400
365	0.836700
366	0.793500
367	0.773300
368	0.799900
369	0.773400
370	0.952400
371	0.789000
372	0.808900
373	0.795800
374	0.716400
375	0.963600
376	0.776100
377	0.793400
378	0.870600
379	0.806600
380	0.645300
381	0.677200
382	0.868700
383	0.728400
384	0.709300
385	0.823300
386	0.734000
387	0.847200
388	0.788700
389	0.847100
390	0.763400
391	0.865300
392	0.704900
393	0.857700
394	0.865500
395	0.752800
396	0.845600
397	0.713800
398	0.863200
399	0.665700
400	0.767500
401	0.781200
402	0.787000
403	0.762100
404	0.709100
405	0.773900
406	0.681300
407	0.866500
408	0.708000
409	0.767600
410	0.773600
411	0.779200
412	0.767500
413	0.718100
414	0.750000
415	0.716900
416	0.720500
417	0.724900
418	0.736300
419	0.765100
420	0.804500
421	0.733500
422	0.836800
423	0.801900
424	0.749300
425	0.736300
426	0.783100
427	0.692200
428	0.881700
429	0.803100
430	0.713600
431	0.891400
432	0.745300
433	0.721800
434	0.859000
435	0.688200
436	0.796300
437	0.638300
438	0.817100
439	0.775900
440	0.793900
441	0.719500
442	0.673500
443	0.707300
444	0.694600
445	0.648500
446	0.732200
447	0.789600
448	0.688900
449	0.734000
450	0.755500
451	0.863800
452	0.753000
453	0.802000
454	0.771800
455	0.777300
456	0.763400
457	0.693300
458	0.791300
459	0.644600
460	0.884700
461	0.778600
462	0.693600
463	0.882000
464	0.787100
465	0.813500
466	0.821600
467	0.727000
468	0.869700
469	0.694700
470	0.794300
471	0.758000
472	0.687000
473	0.762100
474	0.819900
475	0.765400
476	0.971300
477	0.792500
478	0.733300
479	0.900800
480	0.841400
481	0.728000
482	0.720300
483	0.840800
484	0.732800
485	0.707200
486	0.753000
487	0.686000
488	0.835700
489	0.795100
490	0.789300
491	0.709600
492	0.761400
493	0.706800
494	0.738600
495	0.782100
496	0.787200
497	0.758100
498	0.702000
499	0.756600
500	0.877900
501	0.735100
502	0.771700
503	0.690000
504	0.802200
505	0.708100
506	0.730600
507	0.806700
508	0.609600
509	0.660200
510	0.780800
511	0.760200
512	0.752300
513	0.809300
514	0.726300
515	0.786900
516	0.750600
517	0.812200
518	0.717500
519	0.798900
520	0.722500
521	0.755800
522	0.738700
523	0.789400
524	0.747900
525	0.664000
526	0.706500
527	0.750500
528	0.755100
529	0.870700
530	0.634700
531	0.780000
532	0.685000
533	0.668600
534	0.610100
535	0.759100
536	0.760600
537	0.715100
538	0.709200
539	0.688800
540	0.798200
541	0.792300
542	0.724400
543	0.620700
544	0.745600
545	0.765200
546	0.631600
547	0.788100
548	0.721700
549	0.743200
550	0.800500
551	0.824600
552	0.718000
553	0.709900
554	0.676200
555	0.700700
556	0.718200
557	0.649100
558	0.839800
559	0.739100
560	0.742600
561	0.735200
562	0.732000
563	0.755300
564	0.743200
565	0.680000
566	0.674000
567	0.637500
568	0.788100
569	0.872000
570	0.756800
571	0.809200
572	0.660400
573	0.693800
574	0.846300
575	0.820200
576	0.680700
577	0.758900
578	0.775500
579	0.663600
580	0.744300
581	0.751900
582	0.629600
583	0.909800
584	0.625400
585	0.610800
586	0.757600
587	0.753700
588	0.794900
589	0.741000
590	0.675900
591	0.708800
592	0.829300
593	0.703700
594	0.859100
595	0.711600
596	0.846500
597	0.756500
598	0.823400
599	0.625000
600	0.699400
601	0.686900
602	0.792500
603	0.670200
604	0.774400
605	0.789800
606	0.748000
607	0.809700
608	0.805400
609	0.771700
610	0.735500
611	0.702500
612	0.635500
613	0.786700
614	0.673800
615	0.826200
616	0.871100
617	0.811900
618	0.664100
619	0.756400
620	0.839100
621	0.896200
622	0.702600
623	0.756000
624	0.694700
625	0.708300
626	0.819800
627	0.820400
628	0.636000
629	0.753100
630	0.739500
631	0.871600
632	0.657200
633	0.572300
634	0.608500
635	0.661800
636	0.704300
637	0.807600
638	0.703100
639	0.752600
640	0.722300
641	0.797100
642	0.785800
643	0.684500
644	0.742200
645	0.801400
646	0.855200
647	0.705100
648	0.729500
649	0.718300
650	0.681000
651	0.757700
652	0.657800
653	0.798000
654	0.662800
655	0.635200
656	0.865000
657	0.707300
658	0.675700
659	0.684200
660	0.695200
661	0.755400
662	0.739800
663	0.813400
664	0.777500
665	0.683000
666	0.792700
667	0.869400
668	0.758100
669	0.748100
670	0.818800
671	0.747000
672	0.655600
673	0.753200
674	0.741200
675	0.933100
676	0.712900
677	0.682600
678	0.739800
679	0.807300
680	0.823000
681	0.832200
682	0.811100
683	0.708800
684	0.829600
685	0.796900
686	0.820100
687	0.794100
688	0.788300
689	0.843000
690	0.671500
691	0.689700
692	0.728300
693	0.813600
694	0.680700
695	0.710800
696	0.905900
697	0.735600
698	0.694100
699	0.714200
700	0.743300
701	0.685600
702	0.666400
703	0.860900
704	0.842600
705	0.711700
706	0.636500
707	0.886600
708	0.688800
709	0.665100
710	0.614400
711	0.671700
712	0.691500
713	0.854600
714	0.741000
715	0.786300
716	0.770400
717	0.777800
718	0.706500
719	0.743800
720	0.611800
721	0.659500
722	0.820800
723	0.771000
724	0.823200
725	0.831600
726	0.748800
727	0.673800
728	0.646800
729	0.708900
730	0.712700
731	0.723300
732	0.642200
733	0.699100
734	0.781600
735	0.852700
736	0.723300
737	0.678900
738	0.750700
739	0.736900
740	0.928700
741	0.777400
742	0.683000
743	0.759100
744	0.640900
745	0.828400
746	0.789400
747	0.739500
748	0.697200
749	0.859700
750	0.824400
751	0.695500
752	0.677800
753	0.796000
754	0.671800
755	0.772400
756	0.698200
757	0.676200
758	0.723800
759	0.696400
760	0.640500
761	0.809900
762	0.620700
763	0.696800
764	0.756900
765	0.937100
766	0.720100
767	0.713800
768	0.731900
769	0.717300
770	0.818700
771	0.833500
772	0.755400
773	0.735600
774	0.842500
775	0.725700
776	0.705400
777	0.687200
778	0.790400
779	0.670200
780	0.609100
781	0.767500
782	0.696000
783	0.656800
784	0.732700
785	0.777600
786	0.729700
787	0.789800
788	0.731800
789	0.754200
790	0.701100
791	0.843900
792	0.654200
793	0.759700
794	0.757300
795	0.676800
796	0.624200
797	0.652800
798	0.768000
799	0.631800
800	0.793500
801	0.673900
802	0.754700
803	0.579600
804	0.764400
805	0.716600
806	0.756100
807	0.715600
808	0.672300
809	0.731700
810	0.702700
811	0.695300
812	0.766100
813	0.683500
814	0.790800
815	0.724800
816	0.732200
817	0.628700
818	0.761300
819	0.720600
820	0.811000
821	0.703900
822	0.708900
823	0.694900
824	0.719300
825	0.773700
826	0.810500
827	0.850800
828	0.689400
829	0.621900
830	0.781500
831	0.812600
832	0.790300
833	0.855100
834	0.804200
835	0.719700
836	0.651500
837	0.670100
838	0.768600
839	0.669700
840	0.755100
841	0.710100
842	0.716100
843	0.731600
844	0.747200
845	0.716600
846	0.681800
847	0.718100
848	0.792700
849	0.681500
850	0.696900
851	0.698000
852	0.801500
853	0.772000
854	0.687800
855	0.720800
856	0.750600
857	0.698200
858	0.795800
859	0.714900
860	0.697400
861	0.726400
862	0.705300
863	0.610300
864	0.661300
865	0.728300
866	0.676600
867	0.753300
868	0.635100
869	0.685000
870	0.705300
871	0.739300
872	0.830100
873	0.677200
874	0.842800
875	0.870000
876	0.634700
877	0.785400
878	0.687800
879	0.637200
880	0.724700
881	0.773600
882	0.682600
883	0.737500
884	0.641300
885	0.735100
886	0.577400
887	0.654600
888	0.732100
889	0.680800
890	0.715000
891	0.768400
892	0.615500
893	0.617200
894	0.729900
895	0.739600
896	0.634000
897	0.688900
898	0.686400
899	0.749300
900	0.633400
901	0.624500
902	0.706600
903	0.815000
904	0.781800
905	0.768600
906	0.711800
907	0.792500
908	0.726800
909	0.773300
910	0.681200
911	0.797900
912	0.668200
913	0.707600
914	0.679600
915	0.735500
916	0.660500
917	0.714100
918	0.707700
919	0.715200
920	0.711400
921	0.664500
922	0.631900
923	0.739600
924	0.677000
925	0.781900
926	0.649200
927	0.684200
928	0.918500
929	0.683300
930	0.692600
931	0.697700
932	0.740300
933	0.688000
934	0.827200
935	0.775900
936	0.714600
937	0.686900
938	0.719200
939	0.788600
940	0.687700
941	0.827000
942	0.736200
943	0.753500
944	0.665300
945	0.832100
946	0.683900
947	0.697500
948	0.717000
949	0.690100
950	0.665600
951	0.740500
952	0.677300
953	0.887900
954	0.660600
955	0.785100
956	0.809700
957	0.853900
958	0.687900
959	0.683800
960	0.676500
961	0.717600
962	0.668700
963	0.770300
964	0.726000
965	0.607600
966	0.754900
967	0.626500
968	0.635600
969	0.650000
970	0.630700
971	0.723800
972	0.681500
973	0.936000
974	0.736100
975	0.802000
976	0.741100
977	0.651700
978	0.681700
979	0.849300
980	0.624700
981	0.728900
982	0.727200
983	0.701400
984	0.714000
985	0.649300
986	0.651300
987	0.664200
988	0.820200
989	0.790700
990	0.713700
991	0.703100
992	0.732100
993	0.676800
994	0.836000
995	0.647400
996	0.722200
997	0.712400
998	0.783900
999	0.824200
1000	0.747000
"""

import matplotlib.pyplot as plt
import numpy as np

# Parse the training loss values: each line is "<step> <loss>"
lines = training_loss_values.strip().split("\n")
steps = []
loss_values = []

for line in lines:
    step, loss = line.split()
    steps.append(int(step))
    loss_values.append(float(loss))

# Create the plot
plt.figure(figsize=(12, 6))
plt.plot(steps, loss_values, label="Training Loss", color='b')

# Format the y-axis to show four decimal places
plt.gca().yaxis.set_major_formatter(plt.FuncFormatter(lambda x, _: f'{x:.4f}'))

# Labels and title (the x-axis is training steps, since logging_steps = 1)
plt.xlabel("Training step")
plt.ylabel("Loss")
plt.title("Training Loss Over Steps")
plt.legend()
plt.grid(True)

# Show the plot
plt.show()


