# Lesson 9. Part 1

1. Look at how Gemma 2B works.
1. Solve tasks with Gemma 2B.

## Gemma 2B

1. Kaggle: https://www.kaggle.com/models/google/gemma
1. Hugging Face: https://huggingface.co/google/gemma-1.1-2b-it

In [None]:
from transformers import AutoTokenizer, AutoModelForCausalLM
import torch


# Load the tokenizer and the model
model_path = "/kaggle/input/gemma/transformers/1.1-2b-it/1/"

tokenizer = AutoTokenizer.from_pretrained(model_path)
model = AutoModelForCausalLM.from_pretrained(model_path, torch_dtype=torch.float16, device_map="auto", revision="float16").eval()

# Inference only: freeze all weights so that no gradients are tracked
for param in model.parameters():
    param.requires_grad = False

In [None]:
model

In [None]:
model.lm_head.weight

In [None]:
# Tokenize the text (plain prompt, no chat template)
input_text = "Write me a poem about Machine Learning."
input_ids = tokenizer(input_text, return_tensors="pt").to("cuda")

input_ids

In [None]:
# First text generation
outputs = model.generate(**input_ids, max_new_tokens=20)

print(tokenizer.decode(outputs[0]))

In [None]:
# Tokenize as a chat: the model was trained to converse in a dialogue format
conversation = [{"role": "user", "content": "Write me a poem about Machine Learning."}]
input_ids = tokenizer.apply_chat_template(conversation, add_generation_prompt=True, return_tensors="pt").to("cuda")

print(tokenizer.batch_decode(input_ids)[0])
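
To make the output of the cell above easier to read, here is a minimal sketch of the prompt layout that a Gemma-style chat template produces. The function `format_gemma_prompt` is hypothetical and written only for illustration. It assumes the `<start_of_turn>` / `<end_of_turn>` turn markers and the `model` role name. The template shipped with the tokenizer remains the authoritative version.

```python
def format_gemma_prompt(conversation, add_generation_prompt=True):
    """Hypothetical re-implementation of a Gemma-style chat layout, for illustration only."""
    parts = ["<bos>"]
    for message in conversation:
        # Assumption: the assistant role is called "model" in the prompt.
        role = "model" if message["role"] == "assistant" else message["role"]
        content = message["content"].strip()
        parts.append(f"<start_of_turn>{role}\n{content}<end_of_turn>\n")
    if add_generation_prompt:
        # An open model turn signals that the model should answer next.
        parts.append("<start_of_turn>model\n")
    return "".join(parts)


demo_conversation = [{"role": "user", "content": "Write me a poem about Machine Learning."}]
print(format_gemma_prompt(demo_conversation))
```

Comparing this string with the decoded `input_ids` printed above is a quick way to check what `apply_chat_template` actually adds around the user message.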

In [None]:
# Second text generation, this time from the chat-formatted prompt
generate_ids = model.generate(input_ids, max_new_tokens=20)

print(tokenizer.batch_decode(generate_ids)[0])

In [None]:
# Keep only the newly generated tokens (drop the prompt tokens at the start)
new_tokens = generate_ids[0, input_ids.shape[-1]:]

print(tokenizer.decode(new_tokens, skip_special_tokens=True))

In [None]:
import time
from functools import wraps

def measure_time(func):
    @wraps(func)
    def wrapper(*args, **kwargs):
        start_time = time.perf_counter()  # Monotonic clock, better suited to timing than time.time()
        result = func(*args, **kwargs)  # Call the actual function
        end_time = time.perf_counter()  # Record the end time
        elapsed_time = end_time - start_time  # Calculate the elapsed time
        print(f"Function '{func.__name__}' took {elapsed_time:.4f} seconds to complete.\n")
        return result  # Return the result of the function
    return wrapper

# A helper that generates the model's reply to a user message
@measure_time
@torch.inference_mode()
def generate_text(prompt, **kwargs):
    conversation = [{"role": "user", "content": prompt}]
    input_ids = tokenizer.apply_chat_template(conversation, add_generation_prompt=True, return_tensors="pt").to("cuda")
    generate_ids = model.generate(input_ids, **kwargs)
    new_tokens = generate_ids[0, input_ids.shape[-1]:]
    return tokenizer.decode(new_tokens, skip_special_tokens=True)

In [None]:
print(generate_text("Hello!", max_new_tokens=100))

### Temperature

The temperature parameter of an LLM controls how random its predictions are. The logits are divided by the temperature before the softmax, so low values make the model more deterministic, while high values make it more creative and diverse. General guidelines for choosing the temperature:

1. **Low temperature (0.1 - 0.3)**:
   - Outputs become more deterministic and focused
   - Well suited to tasks that require precision, such as answering factual questions or generating code
   - Unexpected or creative answers are less likely

2. **Medium temperature (0.4 - 0.7)**:
   - Balances randomness and determinism
   - Suitable for most general-purpose tasks
   - Produces coherent yet somewhat varied outputs

3. **High temperature (0.8 - 1.0)**:
   - Increases creativity and diversity of the answers
   - Useful for creative writing, brainstorming, or poetry
   - Answers may be less predictable and more varied

4. **Very high temperature (above 1.0)**:
   - Can produce highly diverse and sometimes nonsensical outputs
   - Generally not recommended for most tasks, but worth experimenting with for highly creative ones

### Practical recommendations:
- **0.7**: Often a good default for balanced outputs
- **0.5**: Good for a mix of creativity and coherence
- **0.2**: A good fit for tasks that require high accuracy and consistency

Tuning the temperature lets you adapt the language model's behavior to a specific task, so experimenting within these ranges helps reach the desired output style.
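
The effect of temperature can be seen on a toy example. The sketch below is not how `model.generate` is implemented internally. It only shows the standard idea of dividing logits by the temperature before the softmax. The function name `softmax_with_temperature` and the logits are made up for illustration.

```python
import math


def softmax_with_temperature(logits, temperature):
    """Turn raw logits into probabilities after dividing them by the temperature."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    scaled = [x / temperature for x in logits]
    # Subtract the maximum for numerical stability; the probabilities are unchanged.
    peak = max(scaled)
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical logits for three candidate tokens.
toy_logits = [2.0, 1.0, 0.1]

for t in (0.2, 0.7, 1.0, 1.5):
    probs = softmax_with_temperature(toy_logits, t)
    print(f"temperature={t}: " + ", ".join(f"{p:.3f}" for p in probs))
```

At low temperature almost all probability mass goes to the top token, while at high temperature the distribution flattens, so sampling picks less likely tokens more often.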

In [None]:
# Generate with different temperatures (do_sample=True is required: with greedy decoding the temperature has no effect)
temperature_values = [0.1, 0.3, 0.5, 0.75, 1.0, 1.5]

for temperature in temperature_values:
    print(f"{temperature=}", end="\n\n")
    print(generate_text("Write me a haiku about deep learning", max_new_tokens=100, temperature=temperature, do_sample=True))
    print("-" * 100)

## Solving tasks with Gemma 2B

### Summarization

In [None]:
text = """
The temperature parameter in language models (LLMs) controls the randomness of the predictions. Lower temperatures make the model more deterministic, while higher temperatures make it more creative and diverse. Here are some general guidelines for setting the temperature:

1. **Low Temperature (0.1 - 0.3)**:
   - Results in more deterministic and focused outputs.
   - Good for tasks requiring precision, such as factual answers or code generation.
   - The model is less likely to produce unexpected or creative responses.

2. **Medium Temperature (0.4 - 0.7)**:
   - Balances between randomness and determinism.
   - Suitable for most general-purpose tasks.
   - Produces coherent yet somewhat varied outputs.

3. **High Temperature (0.8 - 1.0)**:
   - Increases creativity and diversity in responses.
   - Useful for creative writing, brainstorming, or generating poetry.
   - Outputs may be less predictable and more varied.

4. **Very High Temperature (above 1.0)**:
   - Can produce highly diverse and sometimes nonsensical outputs.
   - Generally not recommended for most tasks but can be experimented with for highly creative tasks.

### Practical Recommendations:
- **0.7**: Often a good default choice for balanced outputs.
- **0.5**: Good for a mix of creativity and coherence.
- **0.2**: Ideal for tasks requiring high accuracy and consistency.

Adjusting the temperature allows you to fine-tune the behavior of the language model to suit specific needs, so experimenting within these ranges can help you achieve the desired output style.
"""

prompt = f"""Summarize the following text in 2-3 sentences, capturing the main points and key details while maintaining coherence and accuracy. Ensure the summary is concise and informative.

'''
{text}
'''
"""

In [None]:
print(generate_text(prompt, temperature=0.2, do_sample=True, max_new_tokens=100))

### Sentiment analysis

In [None]:
text = "This new GPT4o is a complete disaster. It's slow, inaccurate, and difficult to use. I hate it very much, the Google's Gemma is sooo better."

prompt = f"""Determine the sentiment of this text. Return only sentiment.

'''
{text}
'''
"""

In [None]:
print(generate_text(prompt, temperature=0.2, do_sample=True, max_new_tokens=100))

### Classification

In [None]:
text = "The latest smartphone from Apple has received super positive reviews for its sleek design and powerful performance. But it is very expensive, so think for yourself!"
prompt = f"""Classify the following text into one of the categories: technology, sports, politics. Return only a category.

'''
{text}
'''
"""

In [None]:
print(generate_text(prompt, temperature=0.2, do_sample=True, max_new_tokens=100))

### Translation

In [None]:
text = "Биршерт Алексей Дмитриевич записывает лекцию и семинар для 9 занятия по курсу Глубинное обучение."
source_language = "russian"
target_language = "spanish"

prompt = f"""Translate this text from {source_language} to {target_language}. Return only translation.

'''
{text}
'''
"""

In [None]:
translation = generate_text(prompt, temperature=0.7, do_sample=True, max_new_tokens=100)

print(translation)

In [None]:
prompt = f"""Translate this text from {target_language} to {source_language}. Return only translation.

'''
{translation}
'''
"""

In [None]:
back_translation = generate_text(prompt, temperature=0.7, do_sample=True, max_new_tokens=100)

print(back_translation)

### Question answering over a text


In [None]:
text = """
Jason Statham is an English actor and former competitive diver, best known for his roles in action-thriller films. Born on July 26, 1967, in Shirebrook, Derbyshire, England, Statham's journey to stardom is as remarkable as the characters he portrays. Before entering the film industry, he was a member of Britain's National Diving Squad for over a decade, competing at world championships and the Commonwealth Games. His rugged good looks and athletic build, combined with his martial arts skills, made him a natural fit for the action genre.

Statham's film career began in 1998 when he was cast in Guy Ritchie's crime comedy "Lock, Stock and Two Smoking Barrels." His performance caught the attention of audiences and critics alike, leading to a follow-up role in Ritchie's "Snatch" (2000), where he starred alongside Brad Pitt and Benicio del Toro. These early roles established him as a reliable actor capable of delivering tough, street-smart characters with a touch of humor.

He gained international fame with his role as Frank Martin in "The Transporter" series (2002-2008), where he performed many of his own stunts, showcasing his skills in martial arts, driving, and combat. This franchise solidified his reputation as a top-tier action star. Statham continued to build on this success with roles in high-profile action films such as "Crank" (2006), "War" (2007), and "Death Race" (2008).

In 2010, Statham joined the ensemble cast of "The Expendables," alongside other action legends like Sylvester Stallone and Arnold Schwarzenegger. The film's success led to two sequels, further cementing his status in Hollywood. He also became part of the "Fast & Furious" franchise, debuting as the villain Deckard Shaw in "Fast & Furious 6" (2013) and reprising the role in subsequent films, including "Furious 7" (2015), "The Fate of the Furious" (2017), and the spin-off "Hobbs & Shaw" (2019).

Statham's appeal lies in his ability to bring authenticity to his roles, performing stunts and fight scenes with a level of realism that resonates with audiences. Off-screen, he is known for his private and low-key lifestyle, a stark contrast to the high-octane characters he portrays. He has been in a long-term relationship with model Rosie Huntington-Whiteley, with whom he shares a son.

Jason Statham's career is a testament to his versatility and dedication to his craft. From his beginnings as a competitive diver to becoming one of Hollywood's most bankable action stars, he has consistently delivered performances that are both compelling and entertaining. His contributions to the action genre have earned him a loyal fan base and a lasting legacy in the film industry.
"""

In [None]:
questions = [
    "What were Jason Statham's professions before he became an actor?",
    "Which film marked the beginning of Jason Statham's film career in 1998?",
    "How did Jason Statham's role in 'The Transporter' series impact his career?",
    "In which film franchise did Jason Statham play the character Deckard Shaw?",
    "Who is Jason Statham's long-term partner, and do they have any children?"
]

In [None]:
for question in questions:
    print(question, end="\n\n")

    prompt = f"""Answer the question based on the context provided:

QUESTION
{question}

'''CONTEXT
{text}
'''
    """
    print(generate_text(prompt, temperature=0.7, do_sample=True, max_new_tokens=100))

    print("-" * 100)

### Question answering PRO MAX ULTRA PLUS

In [None]:
texts = [
    """Cats are fascinating creatures, beloved by millions worldwide for their independent yet affectionate nature. Domesticated over 4,000 years ago in ancient Egypt, cats were initially valued for their ability to control vermin. Over time, they became symbols of grace and mystery, a status they still hold today.

One of the most striking features of cats is their agility. Their bodies are designed for hunting and climbing, with flexible spines and powerful hind legs allowing them to leap great distances. A cat's retractable claws are perfect for capturing prey and climbing, while their keen senses of sight and hearing make them excellent hunters.

Cats are known for their meticulous grooming habits. They spend a significant portion of their day licking their fur to keep it clean and free of parasites. This behavior also serves to regulate their body temperature and reinforce social bonds when they groom each other.

Despite their reputation for independence, many cats form strong attachments to their owners. They communicate through a variety of vocalizations, including purring, meowing, and hissing. Purring, often associated with contentment, can also be a self-soothing mechanism when they are in pain or stressed.

Cats have a unique social structure. Unlike dogs, which are pack animals, cats are solitary hunters. However, they can be quite social when they feel secure, often forming close bonds with other cats and even other species, including humans. Their ability to adapt to various environments makes them popular pets in urban and rural settings alike.

In terms of health, cats are generally robust animals, but they do require regular veterinary care. Vaccinations, flea control, and a balanced diet are crucial for maintaining their health. Indoor cats tend to live longer than their outdoor counterparts, as they are less exposed to diseases, accidents, and predators.

In conclusion, cats are complex, multifaceted animals that bring joy and companionship to many households. Their blend of independence and affection, coupled with their graceful demeanor and playful antics, make them one of the most cherished pets globally.""",

    """The world's oceans, covering more than 70% of the Earth's surface, are essential to life on our planet. They regulate the climate, provide food, and support countless species, from the smallest plankton to the largest whales.

The five main oceans – the Pacific, Atlantic, Indian, Southern, and Arctic – are interconnected and influence global weather patterns. The Pacific Ocean, the largest, spans over 60 million square miles and is home to the Mariana Trench, the deepest point on Earth. The Atlantic Ocean, known for its vital role in trade and history, connects the Americas with Europe and Africa. The Indian Ocean, crucial for monsoon patterns, supports a rich diversity of marine life. The Southern Ocean, encircling Antarctica, plays a critical role in regulating the Earth's temperature. The Arctic Ocean, the smallest and shallowest, is significant for its unique polar ecosystems and rapidly changing ice cover.

Oceans are a major source of biodiversity. Coral reefs, often called the "rainforests of the sea," provide habitat for a quarter of all marine species. Mangroves and seagrass beds are essential for carbon sequestration and serve as nurseries for many fish species. The open ocean, though appearing barren, supports life from microscopic phytoplankton to massive blue whales.

Human activities have increasingly impacted oceans. Overfishing, pollution, and climate change are major threats. Overfishing depletes fish stocks and disrupts marine ecosystems. Pollution, including plastics and chemical runoff, harms marine life and enters the food chain, affecting human health. Climate change causes ocean acidification and warming, which bleach coral reefs and alter the habitats of many species.

Conservation efforts are vital. Marine protected areas (MPAs), sustainable fishing practices, and pollution controls can help preserve ocean health. International cooperation, like the Paris Agreement, aims to address climate change's impacts on oceans. Protecting the oceans is crucial not only for marine life but for the well-being of future generations. Understanding and mitigating human impact on oceans is imperative for maintaining their ecological balance and the planet's health.""",

    """Paris, often referred to as the "City of Light," is renowned for its rich history, stunning architecture, and vibrant cultural scene. As the capital of France, Paris is a major European city and a global center for art, fashion, gastronomy, and culture.

The city's layout is defined by its grand boulevards, iconic landmarks, and the Seine River, which divides Paris into the Left Bank and the Right Bank. The Eiffel Tower, one of the most recognizable structures in the world, offers breathtaking views of the city. Nearby, the Champs-Élysées stretches from the Arc de Triomphe to the Place de la Concorde, lined with shops, theaters, and cafes.

Paris is also home to some of the world's most famous museums. The Louvre, originally a royal palace, houses thousands of works of art, including Leonardo da Vinci's "Mona Lisa" and the ancient Greek statue, "Venus de Milo." The Musée d'Orsay, located in a former railway station, showcases an extensive collection of Impressionist and Post-Impressionist masterpieces by artists such as Monet, Van Gogh, and Degas.

The city's architecture is a blend of historical and contemporary styles. The Gothic Notre-Dame Cathedral, despite the devastating fire in 2019, remains a symbol of French heritage. Modern architectural feats like the glass pyramid entrance to the Louvre and the futuristic design of the La Défense business district highlight Paris's innovative spirit.

Parisian cuisine is celebrated worldwide. From the rustic charm of traditional bistros to the elegance of Michelin-starred restaurants, the city's culinary scene is diverse and exquisite. Delicacies like croissants, escargot, and crème brûlée, paired with fine wines, define the gastronomic experience.

Paris is also a hub of fashion and design. The city's Fashion Week attracts global attention, and its boutiques and ateliers showcase cutting-edge trends. Renowned fashion houses like Chanel, Louis Vuitton, and Dior have their headquarters here, reinforcing Paris's status as a fashion capital.

In essence, Paris is a city that effortlessly combines its historical roots with modern innovation, making it a timeless destination that captivates millions of visitors each year."""
]

Embedding model: https://huggingface.co/intfloat/multilingual-e5-large
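
The cells that follow retrieve context by comparing a query embedding with passage embeddings. As a minimal sketch of that ranking step, the example below uses hand-made 3-dimensional vectors instead of real model outputs. The helper names `l2_normalize` and `top_k_passages` and all numbers are assumptions made for illustration.

```python
import math


def l2_normalize(vector):
    """Scale a vector to unit length."""
    norm = math.sqrt(sum(x * x for x in vector))
    return [x / norm for x in vector]


def top_k_passages(query_vec, passage_vecs, passages, k=2):
    """Rank passages by cosine similarity to the query and return the k best."""
    q = l2_normalize(query_vec)
    scored = []
    for vec, passage in zip(passage_vecs, passages):
        p = l2_normalize(vec)
        # For unit vectors the dot product equals the cosine similarity.
        score = sum(a * b for a, b in zip(q, p))
        scored.append((score, passage))
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [passage for _, passage in scored[:k]]


# Toy "embeddings"; real ones would come from the embedding model.
toy_passages = ["cats", "oceans", "paris"]
toy_vectors = [[1.0, 0.1, 0.0], [0.0, 1.0, 0.2], [0.1, 0.0, 1.0]]
toy_query = [0.2, 0.1, 0.9]

print(top_k_passages(toy_query, toy_vectors, toy_passages, k=2))
```

Because the vectors are normalized first, a plain dot product is enough to rank them, which is why the notebook normalizes its embeddings before the matrix multiplication.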

In [None]:
from torch import Tensor
from transformers import AutoModel

# For this model every input text must start with "query: " or "passage: ", even when the text is not in English (e.g. in Russian).
embedding_tokenizer = AutoTokenizer.from_pretrained('intfloat/multilingual-e5-large')
embedding_model = AutoModel.from_pretrained('intfloat/multilingual-e5-large', device_map="auto")

for param in embedding_model.parameters():
    param.requires_grad = False

In [None]:
def split_and_process_input_texts(input_texts, delimiter='\n\n'):
    splited_texts = []

    for text in input_texts:
        splited_texts.extend([f"passage: {part.strip()}" for part in text.split(delimiter) if part.strip()])

    return splited_texts


splited_texts = split_and_process_input_texts(texts)

batch_dict = embedding_tokenizer(splited_texts, max_length=512, padding=True, truncation=True, return_tensors='pt').to("cuda")

In [None]:
batch_dict

In [None]:
import torch.nn.functional as F


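def average_pool(last_hidden_states: Tensor, attention_mask: Tensor) -> Tensor:
    # Zero out the vectors of padding tokens so they do not contribute to the sum
    last_hidden = last_hidden_states.masked_fill(~attention_mask[..., None].bool(), 0.0)
    # Divide by the number of real (non-padding) tokens to get the mean
    return last_hidden.sum(dim=1) / attention_mask.sum(dim=1)[..., None]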


outputs = embedding_model(**batch_dict)
embeddings = average_pool(outputs.last_hidden_state, batch_dict['attention_mask'])

embeddings = F.normalize(embeddings, p=2, dim=1)
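
The pooling step in `average_pool` is a mean over token vectors that skips padding. A plain-Python sketch of the same idea for a single sequence is shown below. The function `masked_mean` and the toy values are hypothetical and only mirror the tensor version conceptually.

```python
def masked_mean(token_vectors, attention_mask):
    """Average only the token vectors whose mask entry is 1."""
    dim = len(token_vectors[0])
    total = [0.0] * dim
    count = 0
    for vector, keep in zip(token_vectors, attention_mask):
        if keep:
            total = [t + v for t, v in zip(total, vector)]
            count += 1
    return [t / count for t in total]


# The third "token" is padding and must not affect the result.
toy_tokens = [[1.0, 2.0], [3.0, 4.0], [100.0, 100.0]]
toy_mask = [1, 1, 0]
print(masked_mean(toy_tokens, toy_mask))  # [2.0, 3.0]
```

Without the mask, padded positions would pull the sentence embedding toward whatever values the model emits for padding, and the shift would depend on how much padding the batch added.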

In [None]:
def get_context(query: str) -> str:
    query = f"query: {query}"

    query_tokens = embedding_tokenizer([query], max_length=512, padding=True, truncation=True, return_tensors='pt').to("cuda")

    query_outputs = embedding_model(**query_tokens)

    query_embedding = average_pool(query_outputs.last_hidden_state, query_tokens['attention_mask'])

    query_embedding = F.normalize(query_embedding, p=2, dim=1)

    scores = (query_embedding @ embeddings.T) * 100
    top_k_indices = scores[0].topk(5).indices

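    # str.lstrip() strips a set of characters, not a prefix; removeprefix() (Python 3.9+) removes exactly "passage: "
    selected_texts = [splited_texts[idx].removeprefix("passage: ") for idx in top_k_indices.tolist()]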
    context_string = "\n\n".join(selected_texts)

    return context_string

In [None]:
print(get_context("Who lives in Paris?"))

In [None]:
def answer_question(question: str) -> str:
    context = get_context(question)

    prompt = f"""You have access to a set of documents containing relevant information. Use these documents to answer the following question comprehensively and accurately. Ensure your response is detailed, specific, and directly addresses the question. Do not include any information that is not supported by the provided documents.

Question:
{question}

Context:
{context}"""

    return generate_text(prompt, temperature=0.7, do_sample=True, max_new_tokens=100)

In [None]:
cat_questions = [
    "What were cats initially valued for in ancient Egypt?",
    "How do cats' retractable claws benefit them?",
    "What is one reason why indoor cats tend to live longer than outdoor cats?"
]

ocean_questions = [
    "What role does the Pacific Ocean play in global geography?",
    "Why are coral reefs referred to as the 'rainforests of the sea'?",
    "What are some major threats to the world's oceans caused by human activities?"
]

paris_questions = [
    "What are some of the notable landmarks in Paris?",
    "Which famous artworks can be found in the Louvre?",
    "How does Paris combine historical and modern architectural styles?"
]

questions = cat_questions + ocean_questions + paris_questions

questions

In [None]:
for question in questions:
    print(answer_question(question))
    print("-" * 100)

## Summary

1. Loaded the Gemma model with the `Transformers` library
2. Recalled sampling temperature and tried different values
3. Solved a variety of tasks with Gemma
4. Built a prototype of a RAG system