<a href="https://colab.research.google.com/github/SeoyeonPark1223/Gemma_FineTuning/blob/main/2nd_slang_lora_tuning.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

## Setup

In [2]:
from google.colab import drive
drive.mount('/content/drive')

Drive already mounted at /content/drive; to attempt to forcibly remount, call drive.mount("/content/drive", force_remount=True).


In [3]:
import os
from google.colab import userdata

os.environ["KAGGLE_USERNAME"] = 'trispark'
os.environ["KAGGLE_KEY"] = userdata.get('trispark')

In [4]:
!pip install -q -U keras-nlp
!pip install -q -U "keras>=3"

In [5]:
os.environ["KERAS_BACKEND"]= 'jax'
os.environ["XLA_PYTHON_CLIENT_MEM_FRACTION"]="1.00"

In [6]:
import keras
import keras_nlp

In [7]:
import pandas as pd

## Load Dataset

In [8]:
slang_dataset = pd.read_csv("/content/drive/MyDrive/Colab Notebooks/all_slang_only_words.csv")

In [9]:
slang_data = []

for index, row in slang_dataset.iterrows():
    # Instruction prompts the user to input the context
    instruction = (
        "Given the context below, create a new Gen Z slang term. ",
        "The slang should be catchy, easy to use, and relevant to modern youth culture. ",
        "Make sure it's something that would feel natural in casual conversation:\n\n",
        "Context: " + row['Context'],
        "Make sure that you should provide slang, description, and example as given."
    )

    # Response provides the description and example for the slang
    response = (
        "Slang: {slang}\n\n"
        "Description: {description}\n\n"
        "Example: {example}".format(
            slang=row['Slang'],
            description=row['Description'],
            example=row['Example']
        )
    )

    template = "Instruction:\n{instruction}\n\nResponse:\n{response}"
    slang_data.append(template.format(instruction=instruction, response=response))

## Load Model + LoRA fine-tuning

In [10]:
gemma_lm = keras_nlp.models.GemmaCausalLM.from_preset("gemma2_2b_en")

In [11]:
gemma_lm.backbone.enable_lora(rank=8)

In [12]:
gemma_lm.summary()

In [13]:
# Limit the input sequence length to 256 (to control memory usage)
gemma_lm.preprocessor.sequence_length = 256

# Use AdamW (optimizer for transformer models)
optimizer = keras.optimizers.AdamW(
    learning_rate = 5e-5,
    weight_decay = 0.01,
)

# Exclude layernorm and bias terms from decay
optimizer.exclude_from_weight_decay(var_names=['bias', 'scale'])

gemma_lm.compile(
    loss=keras.losses.SparseCategoricalCrossentropy(from_logits=True),
    optimizer=optimizer,
    weighted_metrics=[keras.metrics.SparseCategoricalAccuracy()],
)

gemma_lm.fit(slang_data, epochs=5, batch_size=1)

Epoch 1/5
[1m1779/1779[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m1523s[0m 833ms/step - loss: 0.5060 - sparse_categorical_accuracy: 0.7883
Epoch 2/5
[1m1779/1779[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m1523s[0m 843ms/step - loss: 0.2676 - sparse_categorical_accuracy: 0.8725
Epoch 3/5
[1m1779/1779[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m1463s[0m 822ms/step - loss: 0.2530 - sparse_categorical_accuracy: 0.8775
Epoch 4/5
[1m1779/1779[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m1463s[0m 822ms/step - loss: 0.2390 - sparse_categorical_accuracy: 0.8822
Epoch 5/5
[1m1779/1779[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m1453s[0m 817ms/step - loss: 0.2235 - sparse_categorical_accuracy: 0.8888


<keras.src.callbacks.history.History at 0x7a7ce9b0ef50>

## Inference (which is soooo bad)

In [16]:
tag = (
    "Given the context below, create a new slang term. "
    "The slang should be catchy, easy to use, and relevant to modern youth culture. "
    "Make sure it's something that would feel natural in casual conversation:\n\n"
)

context = "You're hanging out with friends at school just chatting in recess"

condition = "You should suggest new slang and its definition, also give some examples for clarification. Example should be long and also precise."

prompt = template.format(
    instruction = tag + context + condition,
    response="",
)

sampler = keras_nlp.samplers.TopKSampler(k=5, seed=2)
gemma_lm.compile(sampler=sampler)

output = gemma_lm.generate(prompt, max_length=512)

print(output)

Instruction:
Given the context below, create a new slang term. The slang should be catchy, easy to use, and relevant to modern youth culture. Make sure it's something that would feel natural in casual conversation:

You're hanging out with friends at school just chatting in recessYou should suggest new slang and its definition, also give some examples for clarification. Example should be long and also precise.

Response:
Slang: ZH

Context: Hang out

Description: Hang out.

Usage: I’ll ZH with you.


In [20]:
tag = (
    "Given the context below, create a new slang term. "
    "The slang should be catchy, easy to use, and relevant to modern youth culture. "
    "Make sure it's something that would feel natural in casual conversation:\n\n"
)

context = "You're hanging out with friends at a restaurant drinking wine."

condition = "You should suggest a new slang and its definition, also give one example for clarification. Example should be long and also precise."

prompt = template.format(
    instruction = tag + context + condition,
    response="",
)

sampler = keras_nlp.samplers.TopKSampler(k=5, seed=2)
gemma_lm.compile(sampler=sampler)

output = gemma_lm.generate(prompt, max_length=512)

print(output)

Instruction:
Given the context below, create a new slang term. The slang should be catchy, easy to use, and relevant to modern youth culture. Make sure it's something that would feel natural in casual conversation:

You're hanging out with friends at a restaurant drinking wine.You should suggest a new slang and its definition, also give one example for clarification. Example should be long and also precise.

Response:
Slang: YWIAW

Context: You’re welcome, it’s wine and appetizers.

Example: Let’s order some food—YWIAW.


In [22]:
tag = (
    "Given the context below, create a new slang term. "
    "The slang should be catchy, easy to use, and relevant to modern youth culture. "
    "Make sure it's something that would feel natural in casual conversation:\n\n"
)

context = "You're at your office working on your project with your teammates"

condition = "You should suggest a new slang and its definition, also give one example for clarification. Example should be long and also precise."

prompt = template.format(
    instruction = tag + context + condition,
    response="",
)

sampler = keras_nlp.samplers.TopKSampler(k=5, seed=2)
gemma_lm.compile(sampler=sampler)

output = gemma_lm.generate(prompt, max_length=512)

print(output)

Instruction:
Given the context below, create a new slang term. The slang should be catchy, easy to use, and relevant to modern youth culture. Make sure it's something that would feel natural in casual conversation:

You're at your office working on your project with your teammatesYou should suggest a new slang and its definition, also give one example for clarification. Example should be long and also precise.

Response:
Slang: ZYN

Context: Zero your network

Description: Refers to disconnecting from the internet to prevent any data leaks

Use: Zyn for added security

#
