# Data Mining Assignment: LLM Testing

## The model we work with:
    dlite-v2-1_5b
### Model location:
    https://huggingface.co/aisquared/dlite-v2-1_5b
### Base model:
    https://huggingface.co/openai-community/gpt2-large
### Training dataset 01:
    https://huggingface.co/datasets/aisquared/databricks-dolly-15k
### Training dataset 02 (for task-specific fine-tuning):
    questions.csv

### Step -1: Local environment
    Windows 11 X64
    Intel I7-10750H
    16GB RAM
    NVIDIA GTX 1650 Ti, 4GB

### Step 0: Prerequisites
    Administrator rights recommended
    Python <= 3.10.11 (with newer versions, the current PyTorch release does not work)
    NVIDIA driver
    CUDA
    CUDA toolkit and devkit

### Step 1: Setting up the runtime environment
##### We work on a separate drive because everything involved is quite large.
    d:
    cd D:\elte-ik-adatbanyaszat

##### We work in a Python virtual environment so that we do not clutter the machine and can retry as often as needed.
    python -m venv venv
    venv\Scripts\activate
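
Before installing anything, it can help to confirm that the interpreter really is the one from the virtual environment. The snippet below is a minimal sketch of that check; the helper name `in_virtualenv` is made up for illustration and is not part of any library.

```python
import sys


def in_virtualenv() -> bool:
    """Return True when the interpreter runs inside a venv-style environment."""
    # Inside a venv, sys.prefix points at the environment directory while
    # sys.base_prefix still points at the base Python installation.
    return sys.prefix != getattr(sys, "base_prefix", sys.prefix)


print("virtual environment active:", in_virtualenv())
print("interpreter:", sys.executable)
```

If the first line prints `False`, the `activate` script was probably not run in the current shell.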

##### Recommended installation according to the article; unfortunately this yields a CPU-only build, so we fix it in the next step.
    pip install "accelerate>=0.16.0,<1" "transformers[torch]>=4.28.1,<5" "torch>=1.13.1,<2"

##### Install torch 2.2 built for CUDA 12.1 on top of it.
    pip install torch==2.2.0+cu121 -f https://download.pytorch.org/whl/torch_stable.html
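
The `+cu121` suffix in the version string is what distinguishes the CUDA build from a plain one. As a rough sketch, the installed build can be inspected without importing torch at all. The helper names `split_build_tag` and `installed_torch_build` are hypothetical, and the reading that a tag starting with `cu` means a CUDA build is an assumption based on the wheel naming used in the command above.

```python
from importlib import metadata
from typing import Optional, Tuple


def split_build_tag(version: str) -> Tuple[str, str]:
    """Split '2.2.0+cu121' into ('2.2.0', 'cu121'); the tag is '' if absent."""
    release, _, tag = version.partition("+")
    return release, tag


def installed_torch_build() -> Optional[Tuple[str, str]]:
    """Return (release, tag) of the installed torch package, or None if missing."""
    try:
        return split_build_tag(metadata.version("torch"))
    except metadata.PackageNotFoundError:
        return None


print(split_build_tag("2.2.0+cu121"))  # ('2.2.0', 'cu121')
print(installed_torch_build())
```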

##### Install Jupyter to make our work easier
    pip install jupyter

##### Start Jupyter
    jupyter notebook

## Step 2: Testing the environment

#### Is the Python version suitable?

In [2]:
!python --version

Python 3.10.11
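
The same check can be made programmatically, so a notebook fails early on an unsuitable interpreter. This is a minimal sketch: the upper bound is copied from the prerequisites, and the function name `python_version_ok` is an illustrative choice.

```python
import sys

# Upper bound taken from the prerequisites (Python <= 3.10.11).
MAX_SUPPORTED = (3, 10, 11)


def python_version_ok(version=None, maximum=MAX_SUPPORTED) -> bool:
    """Return True when the (major, minor, micro) version is <= maximum."""
    current = tuple(version or sys.version_info[:3])
    return current <= maximum


print("running", sys.version_info[:3], "ok:", python_version_ok())
```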


#### Is CUDA available?

In [3]:
!nvidia-smi

Wed Mar 13 18:51:30 2024       
+-----------------------------------------------------------------------------------------+
| NVIDIA-SMI 551.76                 Driver Version: 551.76         CUDA Version: 12.4     |
|-----------------------------------------+------------------------+----------------------+
| GPU  Name                     TCC/WDDM  | Bus-Id          Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |           Memory-Usage | GPU-Util  Compute M. |
|                                         |                        |               MIG M. |
|   0  NVIDIA GeForce GTX 1650 Ti   WDDM  |   00000000:01:00.0 Off |                  N/A |
| N/A   38C    P3             11W /   35W |       0MiB /   4096MiB |      0%      Default |
|                                         |                        |                  N/A |
+-----------------------------------------+------------------------+----------------------+
                                                

#### Is the CUDA devkit available?

In [4]:
!nvcc --version

nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2023 NVIDIA Corporation
Built on Wed_Nov_22_10:30:42_Pacific_Standard_Time_2023
Cuda compilation tools, release 12.3, V12.3.107
Build cuda_12.3.r12.3/compiler.33567101_0
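
Three CUDA versions appear in this setup: 12.4 reported by the driver, 12.3 for the toolkit, and 12.1 in the torch wheel name. As far as we understand, the pip wheels ship their own CUDA runtime libraries, so the toolkit does not have to match exactly, but the driver has to be new enough. The sketch below only extracts the toolkit release from the `nvcc` text so the numbers can be compared; `parse_nvcc_release` is a hypothetical helper.

```python
import re


def parse_nvcc_release(nvcc_output: str):
    """Extract (major, minor) from the 'release X.Y' part of `nvcc --version`."""
    match = re.search(r"release (\d+)\.(\d+)", nvcc_output)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


sample = "Cuda compilation tools, release 12.3, V12.3.107"
print(parse_nvcc_release(sample))  # (12, 3)
```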


#### Is PyTorch available? And does it run with CUDA on the GPU?

In [5]:
import torch
print(torch.cuda.is_available())
device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")
print(device)
torch.zeros(1).cuda()

True
cuda:0


tensor([0.], device='cuda:0')
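
A slightly more detailed report can be useful when the check above prints `False`. The following is a sketch only: it guards the import so that it also runs where torch is missing, and the dictionary keys are arbitrary names chosen for this example.

```python
def describe_torch_environment() -> dict:
    """Collect basic PyTorch/CUDA facts; also works if torch is not installed."""
    info = {"torch_installed": False, "cuda_available": False}
    try:
        import torch
    except ImportError:
        return info
    info["torch_installed"] = True
    info["torch_version"] = torch.__version__
    info["built_for_cuda"] = torch.version.cuda  # None on CPU-only builds
    info["cuda_available"] = torch.cuda.is_available()
    if info["cuda_available"]:
        props = torch.cuda.get_device_properties(0)
        info["gpu_name"] = props.name
        info["gpu_memory_gib"] = round(props.total_memory / 1024**3, 2)
    return info


for key, value in describe_torch_environment().items():
    print(f"{key}: {value}")
```

A `built_for_cuda` value of `None` together with `torch_installed: True` would suggest that the CPU-only wheel is still the one in use.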

## Step 3: Download the model locally

In [6]:
!git clone https://huggingface.co/aisquared/dlite-v2-1_5b

Cloning into 'dlite-v2-1_5b'...
Updating files: 100% (12/12), done.
Filtering content: 100% (2/2), 2.94 GiB | 11.08 MiB/s, done.
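
The "Filtering content" lines indicate that the large files were fetched through Git LFS. Without Git LFS, a clone would probably contain only small pointer files in place of the weights. The sketch below lists the files of the checkout and flags suspiciously small weight files. The suffix set and the size threshold are assumptions; only `instruct_pipeline.py` is a file name known from this document.

```python
from pathlib import Path

# Assumption: weight files use one of these suffixes; adjust to the actual checkout.
WEIGHT_SUFFIXES = {".bin", ".safetensors"}
# Assumption: a weight file smaller than this is an LFS pointer, not real weights.
POINTER_THRESHOLD_BYTES = 1024


def summarize_checkout(model_dir) -> dict:
    """Map each top-level file name in model_dir to its size in bytes."""
    root = Path(model_dir)
    return {p.name: p.stat().st_size for p in sorted(root.iterdir()) if p.is_file()}


def suspicious_weight_files(sizes: dict) -> list:
    """Return weight-like files that are too small to hold real weights."""
    return [
        name
        for name, size in sizes.items()
        if Path(name).suffix in WEIGHT_SUFFIXES and size < POINTER_THRESHOLD_BYTES
    ]


model_dir = Path("dlite-v2-1_5b")
if model_dir.is_dir():
    sizes = summarize_checkout(model_dir)
    for name, size in sizes.items():
        print(f"{size / 1024**2:10.1f} MiB  {name}")
    print("pipeline code present:", "instruct_pipeline.py" in sizes)
    print("possible LFS pointers:", suspicious_weight_files(sizes))
else:
    print(f"{model_dir} not found; run the clone step first")
```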


## Step 4: Test the model from local files on the GPU

In [15]:
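import sys
# Make the pipeline code shipped inside the cloned repository importable.
sys.path.insert(1, './dlite-v2-1_5b/')
from instruct_pipeline import InstructionTextGenerationPipeline
from transformers import AutoModelForCausalLM, AutoTokenizer
import torch

# Load the tokenizer and the weights from the local clone.
tokenizer = AutoTokenizer.from_pretrained("dlite-v2-1_5b", padding_side="left")
# device_map="auto" lets accelerate place the weights; bfloat16 halves the memory needed compared to float32.
model = AutoModelForCausalLM.from_pretrained("dlite-v2-1_5b", device_map="auto", torch_dtype=torch.bfloat16)

# An explicit move is not needed because device_map="auto" already handles placement.
#model.to("cuda:0")

generate_text = InstructionTextGenerationPipeline(model=model, tokenizer=tokenizer)

res = generate_text("Who was George Washington?")
print(res)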



George Washington was a revolutionary who led the American Revolution and became the first President of the United States.
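
Loading in `bfloat16` matters on a 4 GB card. A back-of-the-envelope estimate of the weight size shows why. The parameter count below is an assumption read off the "1_5b" in the model name, and the estimate ignores activations and other runtime overhead, so it is a lower bound at best.

```python
BYTES_PER_PARAM = {"float32": 4, "float16": 2, "bfloat16": 2}


def weight_footprint_gib(num_params: int, dtype: str = "bfloat16") -> float:
    """Approximate size of the model weights alone, in GiB."""
    return num_params * BYTES_PER_PARAM[dtype] / 1024**3


NOMINAL_PARAMS = 1_500_000_000  # assumption: nominal size from the model name
GPU_MEMORY_GIB = 4.0            # from the nvidia-smi output (4096MiB)

for dtype in ("float32", "bfloat16"):
    need = weight_footprint_gib(NOMINAL_PARAMS, dtype)
    print(f"{dtype}: ~{need:.2f} GiB of weights, below 4 GiB: {need < GPU_MEMORY_GIB}")
```

Under these assumptions the `float32` weights alone would exceed the card's memory, while the 16-bit variant comes to roughly 2.8 GiB, which is in the same range as the 2.94 GiB downloaded in Step 3.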


#### +1: Further testing

In [18]:
res = generate_text("Tell me from Isaac Newton.")
print(res)

Isaac Newton is famous for his three laws of classical mechanics.  According to these three laws, all objects at rest remain at rest, and for every action, there is an equal and opposite reaction.  For example, if I throw a ball, I can expect the ball to hit a ceiling.  If the ball is moving forward at a constant velocity, the acceleration due to gravity is equal to the square of the distance moved, or 3 g.

Newton also discovered two other unique laws that gave him the final two laws of classical mechanics.  First, for every action, there is an equal and opposite reaction. For example, if I throw the ball, the air in front of me pushes the ball away. Second, if a system is in motion, then the force of its motion is equal to its mass multiplied by its velocity.


In [19]:
res = generate_text("Who was Isaac Asimow?")
print(res)

Isaac Asimow (July 23, 1924 – September 17, 2013), known professionally as Isaac Klein, was a singer, songwriter and guitarist who was best known for his years as a member of the rock band R.E.M. He is known for his works with R.E.M. on their album Stuck in the Middle with You (1995), their Movie fame with Their Greatest Hits (2001), and his solo work, including the Grammy Award-winning album Of This World (2005).


In [20]:
res = generate_text("Who wrote the Invincible?")
print(res)

The Invincible is a superhero created by writer Robert Kirkman and artist Tony Moore. It first appeared in the second issue of the Invincible comic book series written by Kirkman and published by Image Comics in 2004.


The Invincible is an American teenager with superpowers. Though he is often at odds with society, the two eventually become best friends and form a superhero team called the Team. The character has been influenced by many popular superheroes, including Spider-Man, Superman, and Wonder Woman.


In [21]:
res = generate_text("Who wrote the Invincible SciFi novel?")
print(res)

Russell Blake wrote the novel Invincible.


In [22]:
res = generate_text("Who was Stanislaw Lem?")
print(res)

Stanislaw Lem is a Polish-American author, best known for his science fiction, including such classics as The Stars, a Star, a Nest and The Silent Planet.


In [23]:
res = generate_text("List Stanislaw Lem writings.")
print(res)

Stanislaw Lem's primary literary genres are the speculative fiction and the memoir.
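
Reading the answers by eye shows that several of them are fluent but factually wrong. A crude way to make such checks repeatable is a keyword test per prompt. The sketch below is not part of the original experiment. The expected keywords are our own assumptions (for example, that an answer about the science fiction novel The Invincible should mention Lem), and `canned_model` is a stand-in that replays two answers recorded above so the example runs without the model.

```python
from typing import Callable, Dict, List


def keyword_check(
    generate: Callable[[str], str], expectations: Dict[str, List[str]]
) -> Dict[str, bool]:
    """For each prompt, report whether every expected keyword occurs in the answer."""
    results = {}
    for prompt, keywords in expectations.items():
        answer = str(generate(prompt)).lower()
        results[prompt] = all(keyword.lower() in answer for keyword in keywords)
    return results


# Hypothetical expectations chosen for illustration; substring matching is crude.
EXPECTATIONS = {
    "Who was George Washington?": ["president"],
    "Who wrote the Invincible SciFi novel?": ["lem"],
}


def canned_model(prompt: str) -> str:
    """Stand-in for generate_text that replays answers recorded in this document."""
    recorded = {
        "Who was George Washington?": (
            "George Washington was a revolutionary who led the American "
            "Revolution and became the first President of the United States."
        ),
        "Who wrote the Invincible SciFi novel?": "Russell Blake wrote the novel Invincible.",
    }
    return recorded[prompt]


for prompt, passed in keyword_check(canned_model, EXPECTATIONS).items():
    print("PASS" if passed else "FAIL", "-", prompt)
```

In the notebook, `generate_text` could be passed in place of `canned_model`, assuming it returns the answer as a string as the printed outputs suggest.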
