<a href="https://colab.research.google.com/github/myrondza10/gemma-llm-genai-/blob/main/gemma.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

### Install Required Libraries & Packages

In [None]:
# Gemma requires transformers >= 4.38.0; the remaining pins are unchanged.
%pip install "transformers>=4.38.0" accelerate==0.21.0 einops==0.6.1 langchain==0.0.240 xformers==0.0.20 bitsandbytes==0.41.0 peft safetensors sentencepiece streamlit sentence-transformers gradio pypdf chromadb==0.4.15 pypdfium2

# Pretrained (Google's) Gemma Model 🤗

<img src= "https://miro.medium.com/v2/resize:fit:1200/1*d0jWRcc5uum17Rhl9x51HA.png" height=200 width=400>

<img src = "https://cdn.wccftech.com/wp-content/uploads/2023/12/What-is-AI-1.jpg" height=400 width=720>



### Import Required Libraries & Packages

In [None]:
import os

import chromadb
import gradio as gr
import torch
import transformers
from transformers import BitsAndBytesConfig

from langchain.chains import ConversationalRetrievalChain, RetrievalQA
from langchain.document_loaders import PyPDFium2Loader, TextLoader
from langchain.embeddings import HuggingFaceEmbeddings
from langchain.llms import HuggingFacePipeline
from langchain.memory import ConversationBufferMemory
from langchain.text_splitter import RecursiveCharacterTextSplitter
from langchain.vectorstores import Chroma

### Import Google's Gemma Model from Hugging Face 🤗

In [None]:
model_id = 'google/gemma-2b'

device = f'cuda:{torch.cuda.current_device()}' if torch.cuda.is_available() else 'cpu'

hf_auth = '<hugging_face_access_token>'
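Pasting an access token directly into a notebook makes it easy to leak when the notebook is shared. A minimal sketch of one alternative is to read the token from an environment variable. The helper name `load_hf_token` and the variable name `HF_TOKEN` are assumptions chosen for illustration, not part of the original notebook.

```python
import os


def load_hf_token(var_name: str = "HF_TOKEN") -> str:
    """Return the access token stored in the given environment variable.

    `var_name` is a hypothetical default; use whatever name your
    environment actually defines.
    """
    token = os.environ.get(var_name, "").strip()
    if not token:
        raise RuntimeError(
            f"Set the {var_name} environment variable to a Hugging Face access token."
        )
    return token
```

With the variable set, `hf_auth = load_hf_token()` could stand in for the hard-coded placeholder above.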

In [None]:
device

'cpu'

In [None]:
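# Fetch the model configuration (architecture hyperparameters) from the Hub.
model_config = transformers.AutoConfig.from_pretrained(
    model_id,
    use_auth_token=hf_auth
)

# Download the weights and build the causal language model (loads on CPU by default).
model = transformers.AutoModelForCausalLM.from_pretrained(
    model_id,
    trust_remote_code=True,
    use_auth_token=hf_auth
)

# Load the matching tokenizer so prompts are encoded with Gemma's vocabulary.
tokenizer = transformers.AutoTokenizer.from_pretrained(
    model_id,
    use_auth_token=hf_auth
)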




Gemma's activation function should be approximate GeLU and not exact GeLU.
Changing the activation function to `gelu_pytorch_tanh`.if you want to use the legacy `gelu`, edit the `model.config` to set `hidden_activation=gelu`   instead of `hidden_act`. See https://github.com/huggingface/transformers/pull/29402 for more details.
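The warning above distinguishes exact GELU from its tanh approximation (`gelu_pytorch_tanh`). As a rough illustration of how close the two are, the sketch below implements both with the standard library only. The function names are made up for this example, and the formulas are the textbook definitions rather than code taken from `transformers`.

```python
import math


def gelu_exact(x: float) -> float:
    # Exact GELU: x * Phi(x), with Phi the standard normal CDF.
    return 0.5 * x * (1.0 + math.erf(x / math.sqrt(2.0)))


def gelu_tanh(x: float) -> float:
    # Tanh approximation of GELU.
    inner = math.sqrt(2.0 / math.pi) * (x + 0.044715 * x ** 3)
    return 0.5 * x * (1.0 + math.tanh(inner))


# Compare the two variants on a few sample inputs.
for x in (-3.0, -1.0, 0.0, 1.0, 3.0):
    diff = abs(gelu_exact(x) - gelu_tanh(x))
    print(f"x={x:+.1f}  exact={gelu_exact(x):+.6f}  tanh={gelu_tanh(x):+.6f}  |diff|={diff:.6f}")
```

On these inputs the two variants differ by well under 0.01, but the difference is not zero, so it is worth running a checkpoint with the activation it was trained with.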



### Gemma's Model Architecture

In [None]:
model

GemmaForCausalLM(
  (model): GemmaModel(
    (embed_tokens): Embedding(256000, 2048, padding_idx=0)
    (layers): ModuleList(
      (0-17): 18 x GemmaDecoderLayer(
        (self_attn): GemmaAttention(
          (q_proj): Linear(in_features=2048, out_features=2048, bias=False)
          (k_proj): Linear(in_features=2048, out_features=256, bias=False)
          (v_proj): Linear(in_features=2048, out_features=256, bias=False)
          (o_proj): Linear(in_features=2048, out_features=2048, bias=False)
          (rotary_emb): GemmaRotaryEmbedding()
        )
        (mlp): GemmaMLP(
          (gate_proj): Linear(in_features=2048, out_features=16384, bias=False)
          (up_proj): Linear(in_features=2048, out_features=16384, bias=False)
          (down_proj): Linear(in_features=16384, out_features=2048, bias=False)
          (act_fn): PytorchGELUTanh()
        )
        (input_layernorm): GemmaRMSNorm()
        (post_attention_layernorm): GemmaRMSNorm()
      )
    )
    (norm): GemmaRMSNorm()
  )
  (lm_head): Linear(in_features=2048, out_features=256000, bias=False)
)
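The layer shapes printed above are enough to estimate the model's parameter count by hand. The sketch below does that arithmetic under two assumptions that the printout does not show: each `GemmaRMSNorm` holds a single weight vector of length 2048, and the output projection reuses the embedding matrix, so it is not counted a second time.

```python
# Dimensions read off the printed architecture.
VOCAB = 256_000   # embed_tokens rows
HIDDEN = 2_048    # model width
KV_DIM = 256      # k_proj / v_proj output features
MLP_DIM = 16_384  # gate_proj / up_proj output features
LAYERS = 18       # decoder layers (0-17)

embed = VOCAB * HIDDEN                           # token embedding table
attn = 2 * HIDDEN * HIDDEN + 2 * HIDDEN * KV_DIM  # q_proj + o_proj, k_proj + v_proj
mlp = 3 * HIDDEN * MLP_DIM                       # gate_proj, up_proj, down_proj
norms = 2 * HIDDEN                               # assumed: one weight vector per RMSNorm

per_layer = attn + mlp + norms
total = embed + LAYERS * per_layer + HIDDEN      # + final norm; tied output head assumed

print(f"per layer: {per_layer:,}")
print(f"total:     {total:,}")
```

The result can be checked against the loaded model with `sum(p.numel() for p in model.parameters())`. Note that the embedding table alone accounts for roughly a fifth of the total, a consequence of the 256,000-token vocabulary.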

### LLM Output

In [None]:
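input_text = "What does the company Peak.ai do?"
# Tokenize the prompt and place the tensors on the same device as the model.
inputs = tokenizer(input_text, return_tensors="pt").to(model.device)

# max_length caps the prompt plus the generated tokens at 500.
outputs = model.generate(**inputs, max_length=500)
print(tokenizer.decode(outputs[0]))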

<bos>What does the company Peak.ai do?

Peak.ai is a company that provides a platform for businesses to automate their marketing and sales processes. The platform uses artificial intelligence (AI) to help businesses identify and target the right customers, as well as to automate the sales process.

What are the benefits of using Peak.ai?

Peak.ai can help businesses save time and money by automating their marketing and sales processes. The platform can also help businesses identify and target the right customers, which can lead to increased sales.

What are the features of Peak.ai?

Peak.ai offers a number of features, including:

- A customer relationship management (CRM) system that helps businesses track and manage their customer interactions.

- A marketing automation platform that helps businesses create and send marketing campaigns.

- A sales automation platform that helps businesses create and send sales campaigns.

- A customer service platform that helps businesses manage cus