<a href="https://colab.research.google.com/github/sourcesync/kagglex_gemma/blob/gw%2Finitial/colab/gemma_ft_dolly__with_context_2.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>


# This notebook demonstrates the following:
   * fine-tuning "gemma2_2b_en" on the Dolly dataset
   * comparing prompt completions before and after fine-tuning the model
   * running end to end in Colab


# Get access to Gemma via your Kaggle account:
  * Log into your Kaggle account
  * Request access to Gemma models using your Kaggle account.  You can follow these instructions here: https://www.kaggle.com/code/nilaychauhan/get-started-with-gemma-using-kerasnlp
  * You need to wait for confirmation.  Note that this didn't take too long for me.
  * Create an API key in your Kaggle account; you will need it later.  You can follow these instructions here: https://christianjmills.com/posts/kaggle-obtain-api-key-tutorial/



# Ensure your Colab account can access Gemma:
  * Add the Kaggle API key into your COLAB secrets.  You can follow these instructions here: https://drlee.io/how-to-use-secrets-in-google-colab-for-api-key-protection-a-guide-for-openai-huggingface-and-c1ec9e1277e0



# Select an AI hardware accelerator
  * Select hardware options near the top right of your Colab notebook
  * I tested with A100 and it worked well.  Note that I have a Colab Pro subscription.


# Install required python packages

In [1]:
%%time
!pip install -q -U keras-nlp
!pip install -q -U "keras>=3"

CPU times: user 35.3 ms, sys: 15.2 ms, total: 50.5 ms
Wall time: 7.73 s


# Import required python packages

In [17]:
import os
import keras
import keras_nlp
from keras_nlp.models import GemmaBackbone, BertBackbone
from keras.models import load_model
from IPython.display import Markdown
import textwrap
from google.colab import userdata
import json
import random
import pprint

# Configure this notebook
* set up Keras parameters recommended by Google
* integrate the Kaggle API secret key

In [3]:
# Note: KERAS_BACKEND only takes effect if it is set before keras is first imported.
# If keras was already imported, restart the runtime and set it before the import.
os.environ["KERAS_BACKEND"] = "jax"  # Or "torch" or "tensorflow".
os.environ["XLA_PYTHON_CLIENT_MEM_FRACTION"] = "1.00"  # Avoid memory fragmentation on JAX backend.
os.environ["KAGGLE_USERNAME"] = userdata.get('KAGGLE_USERNAME')  # Kaggle username from Colab secrets
os.environ["KAGGLE_KEY"] = userdata.get('KAGGLE_KEY')  # Kaggle API key from Colab secrets
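Before downloading the model, it can help to confirm that both credentials are actually set. A minimal sketch, assuming the two environment variable names used above; `missing_credentials` is a hypothetical helper and not part of Colab or Kaggle:

```python
import os

REQUIRED_VARS = ("KAGGLE_USERNAME", "KAGGLE_KEY")

def missing_credentials(env=os.environ):
    """Return the names of required variables that are unset or empty."""
    return [name for name in REQUIRED_VARS if not env.get(name)]

missing = missing_credentials()
if missing:
    print("Missing Kaggle credentials:", ", ".join(missing))
```

An empty list means both values are present; it does not prove that the key is valid or that Gemma access has been granted.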

# Retrieve the fine-tuning dataset

In [4]:
%%time
!wget -O databricks-dolly-15k.jsonl https://huggingface.co/datasets/databricks/databricks-dolly-15k/resolve/main/databricks-dolly-15k.jsonl
!pwd
!ls

--2024-09-27 17:15:40--  https://huggingface.co/datasets/databricks/databricks-dolly-15k/resolve/main/databricks-dolly-15k.jsonl
Resolving huggingface.co (huggingface.co)... 13.35.210.114, 13.35.210.66, 13.35.210.61, ...
Connecting to huggingface.co (huggingface.co)|13.35.210.114|:443... connected.
HTTP request sent, awaiting response... 302 Found
Location: https://cdn-lfs.hf.co/repos/34/ac/34ac588cc580830664f592597bb6d19d61639eca33dc2d6bb0b6d833f7bfd552/2df9083338b4abd6bceb5635764dab5d833b393b55759dffb0959b6fcbf794ec?response-content-disposition=inline%3B+filename*%3DUTF-8%27%27databricks-dolly-15k.jsonl%3B+filename%3D%22databricks-dolly-15k.jsonl%22%3B&Expires=1727716541&Policy=eyJTdGF0ZW1lbnQiOlt7IkNvbmRpdGlvbiI6eyJEYXRlTGVzc1RoYW4iOnsiQVdTOkVwb2NoVGltZSI6MTcyNzcxNjU0MX19LCJSZXNvdXJjZSI6Imh0dHBzOi8vY2RuLWxmcy5oZi5jby9yZXBvcy8zNC9hYy8zNGFjNTg4Y2M1ODA4MzA2NjRmNTkyNTk3YmI2ZDE5ZDYxNjM5ZWNhMzNkYzJkNmJiMGI2ZDgzM2Y3YmZkNTUyLzJkZjkwODMzMzhiNGFiZDZiY2ViNTYzNTc2NGRhYjVkODMzYjM5M2I1NTc1OWRmZmI

# Define some useful functions used later
* display_chat() function

In [5]:
def display_chat(prompt, response):
  '''Displays an LLM prompt and response in a pretty way.'''
  prompt = prompt.replace('\n\n','<br><br>')
  prompt = prompt.replace('\n','<br>')
  formatted_prompt = "<font size='+1' color='brown'>🙋‍♂️<blockquote>" + prompt + "</blockquote></font>"
  response = response.replace('•', '  *')
  response = textwrap.indent(response, '', predicate=lambda _: True)
  response = response.replace('\n\n','<br><br>')
  response = response.replace('\n','<br>')
  response = response.replace("```","")
  formatted_text = "<font size='+1' color='teal'>🤖<blockquote>" + response + "</blockquote></font>"
  return Markdown(formatted_prompt+formatted_text)

# Load the fine-tuning dataset and sample a record
* load the dataset into a list
* randomly sample a record for use later

In [26]:
with open("/content/databricks-dolly-15k.jsonl") as file:
    # The file is JSONL: one JSON object per line. Skip any blank lines.
    ft_dataset_all = [json.loads(ln) for ln in file if ln.strip()]
ft_record = random.choice(ft_dataset_all)
pprint.pp(ft_record)

{'instruction': 'As per the passage which languages did Ram Mohan Roy know?',
 'context': 'Ram Mohan Roy was born in\xa0Radhanagar,\xa0Hooghly District,\xa0'
            'Bengal Presidency. His great grandfather Krishnakanta '
            'Bandyopadhyay was a Rarhi\xa0Kulin\xa0(noble)\xa0Brahmin. Among '
            'Kulin Brahmins\xa0– descendants of the six families of Brahmins '
            'imported from\xa0Kannauj\xa0by\xa0Ballal Sen\xa0in the 12th '
            'century\xa0– those from the Rarhi district of West Bengal were '
            'notorious in the 19th century for living off dowries by marrying '
            'several women.\xa0Kulinism\xa0was a synonym for polygamy and the '
            'dowry system, both of which Rammohan campaigned against.\xa0His '
            'father, Ramkanta, was a\xa0Vaishnavite, while his mother, Tarini '
            'Devi, was from a\xa0Shaivite\xa0family. He was a great scholar of '
            'Sanskrit, Persian and English languages and also 
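Because `random.choice` picks a different record on every run, the hard-coded prompts later in this notebook only match the record shown here. A minimal sketch of two ways to make the choice repeatable, assuming a fixed seed or a keyword lookup is acceptable; `sample_record` and `find_record` are hypothetical helpers, and the two inline records are toy data rather than rows from the dataset:

```python
import json
import random

def sample_record(records, seed=0):
    """Pick one record reproducibly using a private, seeded generator."""
    return random.Random(seed).choice(records)

def find_record(records, keyword):
    """Return the first record whose instruction mentions the keyword, else None."""
    for record in records:
        if keyword.lower() in record["instruction"].lower():
            return record
    return None

# Toy JSONL lines standing in for databricks-dolly-15k.jsonl.
lines = [
    '{"instruction": "Name a primary color.", "context": "", "response": "Red"}',
    '{"instruction": "Which languages did Ram Mohan Roy know?", "context": "toy", "response": "toy"}',
]
records = [json.loads(line) for line in lines]

print(sample_record(records, seed=0) == sample_record(records, seed=0))
print(find_record(records, "ram mohan roy")["instruction"])
```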

# Decide on how much fine-tuning data to use
* The right amount is usually determined experimentally
* In my experience, 1000 or more examples are generally enough

In [28]:
ft_data = ft_dataset_all[:1000]
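Note that the slice takes the first 1000 records, while `ft_record` was sampled from the whole file, so the sampled record may not be part of the fine-tuning data at all. A quick check, sketched with toy data; `is_in_subset` is a hypothetical helper:

```python
def is_in_subset(record, subset):
    """True if an identical record appears in the fine-tuning subset."""
    return any(record == candidate for candidate in subset)

# Toy stand-ins for ft_dataset_all, ft_data and ft_record.
all_records = [{"instruction": f"question {i}", "response": f"answer {i}"} for i in range(10)]
subset = all_records[:3]

print(is_in_subset(all_records[1], subset))
print(is_in_subset(all_records[7], subset))
```

In this notebook the call would be `is_in_subset(ft_record, ft_data)`. If it returns `False`, the model never sees that record's context during fine-tuning.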

# Load the Gemma model

In [29]:
%%time
gemma_lm = keras_nlp.models.GemmaCausalLM.from_preset("gemma2_2b_en")
# uncomment the following lines to "sample the softmax probabilities of the model"
#sampler = keras_nlp.samplers.TopKSampler(k=5, seed=2)
#gemma_lm.compile(sampler=sampler)

CPU times: user 9.01 s, sys: 8.9 s, total: 17.9 s
Wall time: 46.9 s


# Ask the model something related to the random record
* If the topic is not general knowledge (or was not in Gemma's pre-training data), we would not expect the model to "know" anything about it.

In [30]:
%%time
prompt = "Who is Ram Mohan Roy?"
completion = gemma_lm.generate(prompt,max_length=1024)
response = completion.replace(prompt, "")
display_chat(prompt, response)

CPU times: user 2min 11s, sys: 1.36 s, total: 2min 12s
Wall time: 1min 2s


<font size='+1' color='brown'>🙋‍♂️<blockquote>Who is Ram Mohan Roy?</blockquote></font><font size='+1' color='teal'>🤖<blockquote><br><br>Ram Mohan Roy was born in 1772 in the village of Radhanagar, in the district of Hooghly, in the province of Bengal, in the Indian subcontinent. He was the son of a Brahmin priest.<br><br>He was a Hindu reformer, who was born in a Brahmin family. He was a great scholar and a great thinker. He was a great reformer and a great social reformer. He was a great reformer of the Hindu religion. He was a great reformer of the Hindu society. He was a great reformer of the Hindu society. He was a great reformer of the Hindu society. He was a great reformer of the Hindu society. He was a great reformer of the Hindu society. He was a great reformer of the Hindu society. He was a great reformer of the Hindu society. He was a great reformer of the Hindu society. He was a great reformer of the Hindu society. He was a great reformer of the Hindu society. He was a great reformer of the Hindu society. He was a great reformer of the Hindu society. He was a great reformer of the Hindu society. He was a great reformer of the Hindu society. He was a great reformer of the Hindu society. He was a great reformer of the Hindu society. He was a great reformer of the Hindu society. He was a great reformer of the Hindu society. He was a great reformer of the Hindu society. He was a great reformer of the Hindu society. He was a great reformer of the Hindu society. He was a great reformer of the Hindu society. He was a great reformer of the Hindu society. He was a great reformer of the Hindu society. He was a great reformer of the Hindu society. He was a great reformer of the Hindu society. He was a great reformer of the Hindu society. He was a great reformer of the Hindu society. He was a great reformer of the Hindu society. He was a great reformer of the Hindu society. He was a great reformer of the Hindu society. He was a great reformer of the Hindu society. 
He was a great reformer of the Hindu society. He was a great reformer of the Hindu society. He was a great reformer of the Hindu society. He was a great reformer of the Hindu society. He was a great reformer of the Hindu society. He was a great reformer of the Hindu society. He was a great reformer of the Hindu society. He was a great reformer of the Hindu society. He was a great reformer of the Hindu society. He was a great reformer of the Hindu society. He was a great reformer of the Hindu society. He was a great reformer of the Hindu society. He was a great reformer of the Hindu society. He was a great reformer of the Hindu society. He was a great reformer of the Hindu society. He was a great reformer of the Hindu society. He was a great reformer of the Hindu society. He was a great reformer of the Hindu society. He was a great reformer of the Hindu society. He was a great reformer of the Hindu society. He was a great reformer of the Hindu society. He was a great reformer of the Hindu society. He was a great reformer of the Hindu society. He was a great reformer of the Hindu society. He was a great reformer of the Hindu society. He was a great reformer of the Hindu society. He was a great reformer of the Hindu society. He was a great reformer of the Hindu society. He was a great reformer of the Hindu society. He was a great reformer of the Hindu society. He was a great reformer of the Hindu society. He was a great reformer of the Hindu society. He was a great reformer of the Hindu society. He was a great reformer of the Hindu society. He was a great reformer of the Hindu society. He was a great reformer of the Hindu society. He was a great reformer of the Hindu society. He was a great reformer of the Hindu society. He was a great reformer of the Hindu society. He was a great reformer of the Hindu society. He was a great reformer of the Hindu society. He was a great reformer of the Hindu society. He was a great reformer of the Hindu society. 
He was a great reformer of the Hindu society. He was a great reformer of the Hindu society. He was a great reformer of the Hindu society. He was a great reformer of the Hindu society. He was a great reformer of the Hindu society. He was a great reformer of the Hindu society. He was a great reformer of the Hindu society. He was a great reformer of the Hindu society. He was a great reformer of the Hindu society. He was a great reformer of the Hindu society. He was a great reformer of the Hindu society. He was a great reformer of the Hindu society. He was a great reformer of the Hindu society. He was a great reformer of the Hindu society. He was a great reformer of the Hindu society. He was a great reformer of the Hindu society. He was a great reformer of the Hindu society. He was</blockquote></font>

# The model seems to know something!  Let's ask it a specific question drawn from the random record's context.

In [31]:
%%time
prompt = "What languages did Ram Mohan Roy know?"
completion = gemma_lm.generate(prompt,max_length=1024)
response = completion.replace(prompt, "")
display_chat(prompt, response)

CPU times: user 12.6 s, sys: 1.25 ms, total: 12.6 s
Wall time: 12.6 s


<font size='+1' color='brown'>🙋‍♂️<blockquote>What languages did Ram Mohan Roy know?</blockquote></font><font size='+1' color='teal'>🤖<blockquote><br><br>[Answer 1]<br><br>He was a polyglot. He knew Bengali, English, Persian, Arabic, Sanskrit, and Urdu.<br><br>    He was a polyglot. He knew Bengali, English, Persian, Arabic, Sanskrit, and Urdu.<br><br>    He was a polyglot. He knew Bengali, English, Persian, Arabic, Sanskrit, and Urdu.<br><br>    He was a polyglot. He knew Bengali, English, Persian, Arabic, Sanskrit, and Urdu.<br><br>    He was a polyglot. He knew Bengali, English, Persian, Arabic, Sanskrit, and Urdu.<br><br>    He was a polyglot. He knew Bengali, English, Persian, Arabic, Sanskrit, and Urdu.<br><br>    He was a polyglot. He knew Bengali, English, Persian, Arabic, Sanskrit, and Urdu.<br><br>    He was a polyglot. He knew Bengali, English, Persian, Arabic, Sanskrit, and Urdu.<br><br>    He was a polyglot. He knew Bengali, English, Persian, Arabic, Sanskrit, and Urdu.<br><br>    He was a polyglot. He knew Bengali, English, Persian, Arabic, Sanskrit, and Urdu.<br><br>    He was a polyglot. He knew Bengali, English, Persian, Arabic, Sanskrit, and Urdu.<br><br>    He was a polyglot. He knew Bengali, English, Persian, Arabic, Sanskrit, and Urdu.<br><br>    He was a polyglot. He knew Bengali, English, Persian, Arabic, Sanskrit, and Urdu.<br><br>    He was a polyglot. He knew Bengali, English, Persian, Arabic, Sanskrit, and Urdu.<br><br>    He was a polyglot. He knew Bengali, English, Persian, Arabic, Sanskrit, and Urdu.<br><br>    He was a polyglot. He knew Bengali, English, Persian, Arabic, Sanskrit, and Urdu.<br><br>    He was a polyglot. He knew Bengali, English, Persian, Arabic, Sanskrit, and Urdu.<br><br>    He was a polyglot. He knew Bengali, English, Persian, Arabic, Sanskrit, and Urdu.<br><br>    He was a polyglot. He knew Bengali, English, Persian, Arabic, Sanskrit, and Urdu.<br><br>    He was a polyglot. 
He knew Bengali, English, Persian, Arabic, Sanskrit, and Urdu.<br><br>    He was a polyglot. He knew Bengali, English, Persian, Arabic, Sanskrit, and Urdu.<br><br>    He was a polyglot. He knew Bengali, English, Persian, Arabic, Sanskrit, and Urdu.<br><br>    He was a polyglot. He knew Bengali, English, Persian, Arabic, Sanskrit, and Urdu.<br><br>    He was a polyglot. He knew Bengali, English, Persian, Arabic, Sanskrit, and Urdu.<br><br>    He was a polyglot. He knew Bengali, English, Persian, Arabic, Sanskrit, and Urdu.<br><br>    He was a polyglot. He knew Bengali, English, Persian, Arabic, Sanskrit, and Urdu.<br><br>    He was a polyglot. He knew Bengali, English, Persian, Arabic, Sanskrit, and Urdu.<br><br>    He was a polyglot. He knew Bengali, English, Persian, Arabic, Sanskrit, and Urdu.<br><br>    He was a polyglot. He knew Bengali, English, Persian, Arabic, Sanskrit, and Urdu.<br><br>    He was a polyglot. He knew Bengali, English, Persian, Arabic, Sanskrit, and Urdu.<br><br>    He was a polyglot. He knew Bengali, English, Persian, Arabic, Sanskrit, and Urdu.<br><br>    He was a polyglot. He knew Bengali, English, Persian, Arabic, Sanskrit, and Urdu.<br><br>    He was a polyglot. He knew Bengali, English, Persian, Arabic, Sanskrit, and Urdu.<br><br>    He was a polyglot. He knew Bengali, English, Persian, Arabic, Sanskrit, and Urdu.<br><br>    He was a polyglot. He knew Bengali, English, Persian, Arabic, Sanskrit, and Urdu.<br><br>    He was a polyglot. He knew Bengali, English, Persian, Arabic, Sanskrit, and Urdu.<br><br>    He was a polyglot. He knew Bengali, English, Persian, Arabic, Sanskrit, and Urdu.<br><br>    He was a polyglot. He knew Bengali, English, Persian, Arabic, Sanskrit, and Urdu.<br><br>    He was a polyglot. He knew Bengali, English, Persian, Arabic, Sanskrit, and Urdu.<br><br>    He was a polyglot. He knew Bengali, English, Persian, Arabic, Sanskrit, and Urdu.<br><br>    He was a polyglot. 
He knew Bengali, English, Persian, Arabic, Sanskrit, and Urdu.<br><br>    He was a polyglot. He knew Bengali, English, Persian, Arabic, Sanskrit, and Urdu.<br><br>    He was a polyglot. He knew Bengali, English, Persian, Arabic, Sanskrit, and Urdu.<br><br>    He was a polyglot. He knew Bengali, English, Persian, Arabic, Sanskrit, and</blockquote></font>

# At this point, the model's answer has some issues:
* The responses aren't well-formatted: the same sentence repeats until the output is cut off.
* The response is wrong (if we take the context data as ground truth).
* Let's try some fine-tuning to fix this.
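The formatting problem in the outputs above is mostly repetition. One rough way to quantify it, sketched under the assumption that splitting on periods is good enough for these outputs; `repetition_ratio` is a hypothetical helper and the two strings are shortened toy examples:

```python
def repetition_ratio(text):
    """Fraction of sentences that repeat an earlier sentence (0.0 = no repeats)."""
    sentences = [s.strip() for s in text.split(".") if s.strip()]
    if not sentences:
        return 0.0
    return 1.0 - len(set(sentences)) / len(sentences)

looping = "He was a polyglot. He knew Bengali. " * 5
clean = "He was a scholar of Sanskrit. He also knew Arabic."

print(repetition_ratio(looping))
print(repetition_ratio(clean))
```

Comparing this number before and after fine-tuning gives a simple, if crude, signal for whether the looping behaviour has gone away.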

# First, let's prepare a fine-tuning dataset to deal with the response formatting issue (so-called "instruction following")
* we won't use the context field


In [33]:
# Define the template once, outside the loop; it is reused later to build prompts.
template = "Instruction:\n{instruction}\n\nResponse:\n{response}"
data = [template.format(**item) for item in ft_data]
pprint.pp(random.choice(data))

('Instruction:\n'
 'tell me whether these are synonyms or antonyms of love: dislike, care, like, '
 'hate, affection, harsh\n'
 '\n'
 'Response:\n'
 'Synonyms: care, like, affection\n'
 'Antonyms: dislike, hate, harsh')


# Fine-tune the model for "instruction following" only

In [34]:
%%time
gemma_lm.backbone.enable_lora(rank=4)

# Limit the input sequence length to 256 (to control memory usage).
gemma_lm.preprocessor.sequence_length = 256
# Use AdamW (a common optimizer for transformer models).
optimizer = keras.optimizers.AdamW(
    learning_rate=5e-5,
    weight_decay=0.01,
)
# Exclude layernorm and bias terms from decay.
optimizer.exclude_from_weight_decay(var_names=["bias", "scale"])

gemma_lm.compile(
    loss=keras.losses.SparseCategoricalCrossentropy(from_logits=True),
    optimizer=optimizer,
    weighted_metrics=[keras.metrics.SparseCategoricalAccuracy()],
)
gemma_lm.fit(data, epochs=1, batch_size=1)

1000/1000 ━━━━━━━━━━━━━━━━━━━━ 160s 58ms/step - loss: 0.8256 - sparse_categorical_accuracy: 0.5422
CPU times: user 4min 40s, sys: 11.7 s, total: 4min 52s
Wall time: 2min 40s


<keras.src.callbacks.history.History at 0x7a1a26042800>
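`enable_lora(rank=4)` freezes the backbone's original weights and trains small low-rank adapters instead. For a single weight matrix of shape `(d_in, d_out)`, a rank-`r` adapter adds two matrices of shapes `(d_in, r)` and `(r, d_out)`. The arithmetic below is a sketch with illustrative dimensions, not the actual layer shapes of `gemma2_2b_en`; `lora_param_count` is a hypothetical helper:

```python
def lora_param_count(d_in, d_out, rank):
    """Trainable parameters added by one LoRA adapter of the given rank."""
    return d_in * rank + rank * d_out

d_in, d_out = 2048, 2048  # illustrative sizes only, not Gemma's real shapes
full = d_in * d_out
for rank in (4, 8, 16):
    added = lora_param_count(d_in, d_out, rank)
    print(f"rank={rank}: {added} adapter params vs {full} in the full matrix")
```

The adapter size grows linearly with the rank, which is why a small rank keeps memory use low on a single accelerator.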

# Now let's ask the fine-tuned model the same question
* we should expect better response formatting (i.e., instruction following)
* we should not expect it to answer correctly from the dataset's context, since the context field was left out of the fine-tuning data

In [35]:
prompt = template.format(
    instruction="What languages did Ram Mohan Roy know?",
    response="",
)
completion = gemma_lm.generate(prompt)
response = completion.replace(prompt, "")
display_chat(prompt, response)

<font size='+1' color='brown'>🙋‍♂️<blockquote>Instruction:<br>What languages did Ram Mohan Roy know?<br><br>Response:<br></blockquote></font><font size='+1' color='teal'>🤖<blockquote>Ram Mohan Roy was a Bengali polymath who was a pioneer of the Indian Renaissance. He was a scholar of Sanskrit, Persian, Arabic, and English.</blockquote></font>

# Now let's create a fine-tuning dataset using the dataset context
* in this case, we assume the dataset context has ground-truth facts that the model should be using

In [39]:
# Same idea as before, but the template now includes the record's context field.
template = "Instruction:\n{instruction}\n\nContext:\n{context}\n\nResponse:\n{response}"
data = [template.format(**item) for item in ft_data]
pprint.pp(random.choice(data))

('Instruction:\n'
 'Given a reference text about Georg Friedrich Parrot, tell me when and where '
 'he was born as well as what he studied.\n'
 '\n'
 'Context:\n'
 'Georg Friedrich Parrot (15 July 1767 – 8 July 1852) was a German scientist, '
 'the first rector of the Imperial University of Dorpat (today Tartu, Estonia) '
 'in what was then the Governorate of Livonia of the Russian Empire.\n'
 '\n'
 'Education\n'
 'Georges-Frédéric Parrot was born in Mömpelgard (now Montbéliard) (then part '
 'of the Duchy of Württemberg, from 1806 in France). His father, a surgeon by '
 "profession and the local duke's physician in ordinary, had a respectable "
 'position in the society becoming the mayor of his hometown. As the family '
 'was Protestants, they sent Georg Friedrich to study physics and mathematics '
 'at the University of Stuttgart in Stuttgart, the capital of the Duchy '
 '(1782–1786).\n'
 '\n'
 'Response:\n'
 'Georg Friedrich Parrot was born on July 15, 1767 in Mömpelgard. He studie
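The two templates differ only in the `Context:` section, so a single function can build both, which keeps training-time and prompt-time formatting identical. A minimal sketch; `build_prompt` is a hypothetical helper that reproduces the strings produced by the templates above:

```python
def build_prompt(instruction, response="", context=None):
    """Build a prompt; the Context section is included only when context is not None."""
    parts = [f"Instruction:\n{instruction}"]
    if context is not None:
        parts.append(f"Context:\n{context}")
    parts.append(f"Response:\n{response}")
    return "\n\n".join(parts)

# Without context: matches the first template.
print(repr(build_prompt("What is JSONL?")))
# With context: matches the second template.
print(repr(build_prompt("What is JSONL?", context="JSONL stores one JSON object per line.")))
```

Leaving `response` empty gives the prompt form used at generation time; passing a response gives a training example.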

# Reload base model and fine-tune on the new dataset which has context included

In [40]:
%%time
gemma_lm = keras_nlp.models.GemmaCausalLM.from_preset("gemma2_2b_en")
gemma_lm.backbone.enable_lora(rank=4)

# Limit the input sequence length to 256 (to control memory usage).
gemma_lm.preprocessor.sequence_length = 256
# Use AdamW (a common optimizer for transformer models).
optimizer = keras.optimizers.AdamW(
    learning_rate=5e-5,
    weight_decay=0.01,
)
# Exclude layernorm and bias terms from decay.
optimizer.exclude_from_weight_decay(var_names=["bias", "scale"])

gemma_lm.compile(
    loss=keras.losses.SparseCategoricalCrossentropy(from_logits=True),
    optimizer=optimizer,
    weighted_metrics=[keras.metrics.SparseCategoricalAccuracy()],
)
gemma_lm.fit(data, epochs=1, batch_size=1)

1000/1000 ━━━━━━━━━━━━━━━━━━━━ 139s 59ms/step - loss: 1.0938 - sparse_categorical_accuracy: 0.5655
CPU times: user 2min 22s, sys: 19.5 s, total: 2min 42s
Wall time: 3min 6s


<keras.src.callbacks.history.History at 0x7a19f0c0e290>

# Now let's ask the new fine-tuned model the same question
* we should expect better response formatting (i.e., instruction following)
* we might expect it to answer correctly based on the fine-tuning context data, since we included it in the fine-tuning dataset

In [42]:
prompt = template.format(
    instruction="What languages did Ram Mohan Roy know?",
    context="",
    response="",
)
completion = gemma_lm.generate(prompt)
response = completion.replace(prompt, "")
display_chat(prompt, response)

<font size='+1' color='brown'>🙋‍♂️<blockquote>Instruction:<br>What languages did Ram Mohan Roy know?<br><br>Context:<br><br><br>Response:<br></blockquote></font><font size='+1' color='teal'>🤖<blockquote>Ram Mohan Roy was a Bengali polymath who was a pioneer of the Indian Renaissance. He was a scholar of Sanskrit, Persian, Arabic, and English. He was also a poet, a translator, a journalist, a social reformer, and a politician.</blockquote></font>

# Fine-tuning with the context "fact" did not seem to help.  Let's instead supply the context in the prompt at inference time.



In [44]:
prompt = template.format(
    instruction="What languages did Ram Mohan Roy know?",
    context=ft_record['context'],
    response="",
)
completion = gemma_lm.generate(prompt)
response = completion.replace(prompt, "")
display_chat(prompt, response)

<font size='+1' color='brown'>🙋‍♂️<blockquote>Instruction:<br>What languages did Ram Mohan Roy know?<br><br>Context:<br>Ram Mohan Roy was born in Radhanagar, Hooghly District, Bengal Presidency. His great grandfather Krishnakanta Bandyopadhyay was a Rarhi Kulin (noble) Brahmin. Among Kulin Brahmins – descendants of the six families of Brahmins imported from Kannauj by Ballal Sen in the 12th century – those from the Rarhi district of West Bengal were notorious in the 19th century for living off dowries by marrying several women. Kulinism was a synonym for polygamy and the dowry system, both of which Rammohan campaigned against. His father, Ramkanta, was a Vaishnavite, while his mother, Tarini Devi, was from a Shaivite family. He was a great scholar of Sanskrit, Persian and English languages and also knew Arabic, Latin and Greek. One parent prepared him for the occupation of a scholar, the Shastri, while the other secured for him all the worldly advantages needed to launch a career in the laukik or worldly sphere of public administration.[citation needed] Torn between these two parental ideals from early childhood, Ram Mohan vacillated between the two for the rest of his life.During his childhood Ram Mohan Roy witnessed death of his sister in law through sati. The seventeen year old girl was dragged towards the pyre where Ram Mohan Roy witnessed her terrified state. He tried to protest but to no avail. She was burned alive. The people chanted "Maha Sati! Maha Sati! Maha Sati!" (great wife) over her painful screams. <br><br>Response:<br></blockquote></font><font size='+1' color='teal'>🤖<blockquote>Instruction:<br>What languages did Ram Mohan Roy know?<br><br>Context:<br>Ram Mohan Roy was born in Radhanagar, Hooghly District, Bengal Presidency. His great grandfather Krishnakanta Bandyopadhyay was a Rarhi Kulin (noble) Brahmin. 
Among Kulin Brahmins – descendants of the six families of Brahmins imported from Kannauj by Ballal Sen in the 12th century – those from the Rarhi district of West Bengal were notorious in the 19th century for living off dowries by marrying several women. Kulinism was a synonym for polygamy and the dowry system, both of which Rammohan campaigned against. His father, Ramkanta, was a Vaishnavite, while his mother, Tarini Devi, was from a Shaivite family. He was a great scholar of Sanskrit, Persian and English languages and also knew Arabic, Latin and Greek. One parent prepared him for the occupation of a scholar, the Shastri, while the other secured for him all the worldly advantages needed to launch a career in the laukik or worldly sphere of public administration.[citation needed] Torn</blockquote></font>
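In the output above, the response panel repeats the prompt and stops partway through the context instead of answering. One plausible explanation, offered as an assumption rather than something verified here: the prompt is longer than the 256-token `sequence_length` set before fine-tuning, so it is truncated, no room remains for generation, and `completion.replace(prompt, "")` no longer finds the full prompt to strip. If so, raising `gemma_lm.preprocessor.sequence_length` (or passing a larger `max_length` to `generate`) before prompting would be the first thing to try. The helper below is a sketch that makes this failure visible; `split_completion` is hypothetical:

```python
def split_completion(prompt, completion):
    """Return (response, prompt_was_intact).

    Strips the prompt only when the completion starts with it. Otherwise the
    completion is returned unchanged and the flag is False, which suggests the
    prompt was truncated or altered before generation.
    """
    if completion.startswith(prompt):
        return completion[len(prompt):], True
    return completion, False

prompt = "Instruction:\nQ\n\nResponse:\n"
print(split_completion(prompt, prompt + "An answer."))
print(split_completion(prompt, prompt[:10]))
```

Checking the flag before calling `display_chat` avoids presenting an echoed, truncated prompt as if it were the model's answer.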