<a href="https://colab.research.google.com/github/dream80/TonyColab/blob/master/backup/google_gemma_get_started.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Google Gemma Get Started

`Gemma` is a family of lightweight, state-of-the-art open models built from the same research and technology used to create the `Gemini` models. It was developed by Google DeepMind and other teams across Google.

## Key details

- Two sizes, 2B and 7B, each released in pre-trained and instruction-tuned variants.
- Toolchains for inference and supervised fine-tuning (SFT) across all major frameworks: `JAX`, `PyTorch`, and `TensorFlow` through native **Keras 3.0**.
- Ready-to-use Colab and Kaggle notebooks
  - Colab notebook: https://ai.google.dev/gemma/docs/get_started
- Optimization across multiple AI hardware platforms, including NVIDIA GPUs and Google Cloud TPUs, ensures industry-leading performance.
- Terms of use permit responsible commercial usage and distribution for all organizations, regardless of size.

## Keras 3

Keras 3 is required to run Gemma models with Keras.

Keras is a simple, flexible, and powerful deep learning API written in Python that can run on top of JAX, TensorFlow, or PyTorch.
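
Keras 3 reads its backend from the `KERAS_BACKEND` environment variable, which has to be set before `keras` is imported for the first time. The following is a minimal sketch of that step. The helper name `select_keras_backend` is hypothetical and not part of Keras; the backend identifiers `"jax"`, `"tensorflow"`, and `"torch"` are the values Keras 3 uses for the three frameworks named above.

```python
import os

# Backend identifiers Keras 3 accepts for JAX, TensorFlow, and PyTorch.
SUPPORTED_BACKENDS = ("jax", "tensorflow", "torch")


def select_keras_backend(name: str) -> str:
    """Hypothetical helper: validate the name and export KERAS_BACKEND."""
    if name not in SUPPORTED_BACKENDS:
        raise ValueError(
            f"Unknown backend {name!r}; expected one of {SUPPORTED_BACKENDS}"
        )
    os.environ["KERAS_BACKEND"] = name
    return name


select_keras_backend("jax")

# Import keras only after the variable is set; the choice is fixed at import time.
# import keras
```

In a notebook, this cell should run before the first `import keras`; changing the variable afterwards has no effect until the runtime is restarted.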

## Gemma on Kaggle

`Kaggle` is an online community platform for data scientists and machine learning enthusiasts.

Information about the Gemma models is available at https://www.kaggle.com/models/google/gemma/frameworks/keras.

In [None]:
!pip install -U keras-nlp
!pip install -U keras

In [2]:
import json
import os

from google.colab import userdata

# Expose the Kaggle credentials stored in Colab secrets as environment variables.
os.environ["KAGGLE_USERNAME"] = userdata.get('KAGGLE_USERNAME')
os.environ["KAGGLE_KEY"] = userdata.get('KAGGLE_KEY')

def generate_kaggle_json():
  """Write the Kaggle credentials to ~/.kaggle/kaggle.json."""
  username = userdata.get("KAGGLE_USERNAME")
  key = userdata.get("KAGGLE_KEY")

  dot_kaggle_dir = os.path.join(os.path.expanduser("~"), ".kaggle")
  # exist_ok=True makes this a no-op when the directory is already there.
  os.makedirs(dot_kaggle_dir, exist_ok=True)

  # json.dump escapes special characters that manual string concatenation would not.
  with open(os.path.join(dot_kaggle_dir, "kaggle.json"), "w") as file:
    json.dump({"username": username, "key": key}, file)


In [3]:
# generate_kaggle_json()

In [4]:
import keras
import keras_nlp
import numpy as np

In [5]:
print(keras.__version__)

3.0.5
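
Because Gemma needs Keras 3, it can help to fail early when an older version is installed. A small sketch of such a check follows; `is_keras_3_or_newer` is a hypothetical helper that only inspects a version string like the one printed above.

```python
def is_keras_3_or_newer(version: str) -> bool:
    """Return True when the major version number is 3 or higher."""
    major = int(version.split(".")[0])
    return major >= 3


print(is_keras_3_or_newer("3.0.5"))   # True
print(is_keras_3_or_newer("2.15.0"))  # False
```

In the notebook this could be used as `assert is_keras_3_or_newer(keras.__version__)` right after the imports.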


In [None]:
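# Load the instruction-tuned 2B preset (uses the Kaggle credentials set above).
gemma_lm = keras_nlp.models.GemmaCausalLM.from_preset("gemma_instruct_2b_en")
# Generate a completion; max_length caps the output length in tokens, prompt included.
gemma_lm.generate("Keras is a", max_length=30)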

### Batch Prompts

In [7]:
gemma_lm.generate(["Keras is a", "The sky is blue because"], max_length=30)

['Keras is a deep learning library for Python that provides a wide range of tools and functionalities for building, training, and evaluating deep learning models.',
 'The sky is blue because of Rayleigh scattering. Rayleigh scattering is the scattering of light by particles of a similar size to the wavelength of light. In']

### Different Sampler

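By default, `GemmaCausalLM` uses `greedy` sampling, which always picks the most likely next token. Let's switch to `top_k`, which samples the next token from the k most likely candidates.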

In [8]:
gemma_lm.compile(sampler="top_k")
gemma_lm.generate("Premier league is the best league in Europe because", max_length=64)

'Premier league is the best league in Europe because:\n\n1. The league has more top-tier teams.\n2. The league is more competitive.\n3. The league has a rich history.\n4. The league has a global reach.\n5. The league provides a platform for young players.'
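
To make the difference between the two samplers concrete, here is a plain-Python sketch of how one next-token choice could be made under each strategy. It is an illustration under simplified assumptions, not the KerasNLP implementation: the function names and the toy `logits` list are hypothetical, and a real model produces one score per vocabulary entry.

```python
import math
import random


def greedy_pick(logits):
    """Greedy sampling: always return the index of the highest score."""
    return max(range(len(logits)), key=lambda i: logits[i])


def top_k_pick(logits, k, rng):
    """Top-k sampling: keep the k best scores, then sample among them."""
    # Indices of the k highest-scoring tokens.
    top_ids = sorted(range(len(logits)), key=lambda i: logits[i], reverse=True)[:k]
    # Softmax weights over the kept scores (max subtracted for stability).
    peak = max(logits[i] for i in top_ids)
    weights = [math.exp(logits[i] - peak) for i in top_ids]
    return rng.choices(top_ids, weights=weights, k=1)[0]


logits = [2.0, 0.5, 1.5, -1.0]  # toy scores for a 4-token vocabulary
rng = random.Random(0)          # seeded so runs are repeatable

print(greedy_pick(logits))                           # 0
print([top_k_pick(logits, 2, rng) for _ in range(5)])  # values drawn from {0, 2}
```

Greedy decoding returns the same text on every run, while top-k trades that determinism for variety. With `k=1` the two strategies coincide.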