# PyTorch version for running the Gemma model locally

Copyright 2024 Paul T. Miller > For Academic Use

Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at

http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.

# Get started
## Install the latest NVIDIA system driver https://www.nvidia.com/download/index.aspx
#### If you wish to add CUDA and cuDNN to your system:
##### https://developer.nvidia.com/cuda-downloads
##### https://developer.nvidia.com/cudnn-downloads
  
## Install torch with cuda support (https://pytorch.org/get-started/locally/)
For Windows: 
+ `pip3 install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu121`
+ `pip3 install immutabledict sentencepiece Flask`

For Linux: 
+ `pip3 install torch torchvision torchaudio`
+ `pip3 install immutabledict sentencepiece Flask`

Or install using Conda with: 
+ `conda install pytorch torchvision torchaudio pytorch-cuda=12.1 -c pytorch -c nvidia`

In [None]:
import platform

# Pick the install command that matches the operating system (see the instructions above)
os_type = platform.system()
if os_type == 'Linux':
    !pip3 install torch torchvision torchaudio
else:
    !pip3 install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu121
!pip3 install immutabledict sentencepiece Flask

import torch
from pytorch_gemma import GemmaModel

# True means PyTorch can see a CUDA-capable GPU
torch.cuda.is_available()
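
After installing, it can help to confirm what PyTorch actually sees before loading a model. The following is a minimal sketch, not part of the original notebook: the helper name `describe_torch_setup` and the dictionary keys are made up for illustration, and it assumes the GPU of interest is device 0.

```python
import importlib.util


def describe_torch_setup():
    """Summarise the local PyTorch install without failing if torch is missing."""
    info = {
        "torch_installed": importlib.util.find_spec("torch") is not None,
        "torch_version": None,
        "cuda_available": False,
        "gpu_name": None,
        "vram_gb": None,
    }
    if not info["torch_installed"]:
        return info

    import torch

    info["torch_version"] = torch.__version__
    info["cuda_available"] = torch.cuda.is_available()
    if info["cuda_available"]:
        # Assumption: the first visible GPU (index 0) is the one to use.
        props = torch.cuda.get_device_properties(0)
        info["gpu_name"] = props.name
        info["vram_gb"] = round(props.total_memory / 1024**3, 1)
    return info


print(describe_torch_setup())
```

If `cuda_available` is `False` on a machine with an NVIDIA GPU, the driver or the CUDA-enabled wheel is the first thing to recheck.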

## Add the large language model to the project
+ Version options are '2b', '2b-it', '7b', '7b-it'. The 'it' in the name indicates an instruction-tuned model.
+ Download the desired model from https://www.kaggle.com/models/google/gemma/frameworks/pyTorch. 
<img src="models/model_download.png">
+ <b>Recommend starting with 2b-it</b>. (Worked on a simple laptop with an RTX 2060 and 6GB VRAM.)
+ Extract contents from the downloaded archive.tar.gz and put the files under the 'model/\<version\>/' folder.  
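
The extraction step can also be scripted. This is a rough sketch under stated assumptions: `extract_model_archive` is a hypothetical helper, not part of `pytorch_gemma`, and it assumes the archive unpacks its files directly rather than into a nested folder. If the archive has a top-level directory, the files may need to be moved up one level afterwards.

```python
import tarfile
from pathlib import Path


def extract_model_archive(archive_path, version, root="model"):
    """Extract a downloaded archive into <root>/<version>/ and return that folder."""
    target = Path(root) / version
    target.mkdir(parents=True, exist_ok=True)
    with tarfile.open(archive_path, "r:gz") as tar:
        # Newer Python releases offer extraction filters that reject unsafe paths;
        # fall back to the plain call where the filter is not available.
        if hasattr(tarfile, "data_filter"):
            tar.extractall(target, filter="data")
        else:
            tar.extractall(target)
    return target


# Example call (uncomment once archive.tar.gz has been downloaded):
# extract_model_archive("archive.tar.gz", "2b-it")
```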

### In Jupyter they look something like:
<img src="models/2b_it_pic.png">

### In file viewer they look something like:
<img src="models/files.png"> 

## Modify the parameters in the code as desired
+ `output_len` sets the maximum length allowed for the model output.
+ `machine`: use 'cuda' if you have an NVIDIA GPU with CUDA support installed, otherwise use 'cpu'.
+ Using 'cpu' is not recommended because it will be slow. Reduce `output_len` if you use 'cpu'.
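
One way to apply the advice above is to pick both values together. A minimal sketch: `choose_settings` is a hypothetical helper, and the CPU value of 100 is an arbitrary illustration rather than a tested recommendation (250 is the value used later in this notebook).

```python
def choose_settings(cuda_available, gpu_output_len=250, cpu_output_len=100):
    """Return the machine name and an output length suited to it."""
    if cuda_available:
        return {"machine": "cuda", "output_len": gpu_output_len}
    # CPU inference is slow, so keep the output shorter.
    return {"machine": "cpu", "output_len": cpu_output_len}


# In the notebook this could be used as:
#   settings = choose_settings(torch.cuda.is_available())
#   model = GemmaModel(version='2b-it', **settings)
print(choose_settings(False))
```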

In [None]:
## Originally tested on Nvidia RTX2060 with 6GB VRAM on an Asus laptop
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
print(f"The {device} device is available on this system. Try to use it below for machine.")

In [None]:
model = None
model = GemmaModel(output_len=250, version='2b-it', machine=device)

# Run the following and enter your question

In [None]:
# model.output_len = 1000 ## we can modify on the fly if desired

In [None]:
## You can manually set the request
## text = "Explain Einstein's general theory of relativity"
## print(model.get_response(text))
## or

In [None]:
## use looped input
text = model.get_input()
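
For readers who want to build their own prompt loop on top of `model.get_response`, the following is a minimal sketch. It is not the implementation of `get_input`; the function `chat_loop`, its parameters, and the quit words are assumptions made for illustration.

```python
def chat_loop(respond, read=input, write=print, quit_words=("quit", "exit")):
    """Ask for questions until a quit word is entered; return how many were answered."""
    answered = 0
    while True:
        text = read("Question (or 'quit'): ").strip()
        if text.lower() in quit_words:
            break
        if not text:
            continue  # ignore empty input and ask again
        write(respond(text))
        answered += 1
    return answered


# With a loaded model this could be started as:
#   chat_loop(model.get_response)
```

Passing `read` and `write` as parameters keeps the loop easy to try out without a model or a keyboard.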