## Documentation

- https://ai.google.dev/gemini-api/docs/quickstart?lang=python
- https://ai.google.dev/gemini-api/docs/sdks
- https://github.com/googleapis/python-genai

## Dependencies

- `pip install google-genai`

In [5]:
from google import genai
import json

In [7]:
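# Load API credentials from a local JSON file (keep this file out of version control).
with open("./credentials", "r", encoding="utf-8") as json_file:
    credentials = json.load(json_file)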

In [9]:
credentials = credentials["GoogleAIStudio"]

In [11]:
client = genai.Client(api_key=credentials["key"])
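
The cells above assume that `./credentials` is a JSON file shaped like `{"GoogleAIStudio": {"key": "..."}}`. As a minimal sketch under that assumption, the same lookup can be wrapped in a helper that fails with a readable message when the file is malformed. The name `load_api_key` and its parameters are hypothetical and not part of `google-genai`.

```python
import json
from pathlib import Path


def load_api_key(path, section="GoogleAIStudio", field="key"):
    """Return the API key stored under credentials[section][field].

    Hypothetical helper; assumes the JSON layout used in the cells above.
    """
    with Path(path).open("r", encoding="utf-8") as json_file:
        credentials = json.load(json_file)
    try:
        key = credentials[section][field]
    except (KeyError, TypeError) as exc:
        # Missing section/field, or the JSON is not a nested object.
        raise KeyError(f"{path} has no '{section}' -> '{field}' entry") from exc
    if not isinstance(key, str) or not key.strip():
        raise ValueError(f"'{section}' -> '{field}' in {path} is empty")
    return key.strip()
```

With this helper the client could be created as `genai.Client(api_key=load_api_key("./credentials"))`.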

In [None]:
models = [
    "gemini-2.0-flash",
    "gemini-2.0-flash-lite",
    "gemini-2.0-flash-001",
]

In [13]:
response = client.models.generate_content(
    model="gemini-2.0-flash",
    contents="How does RLHF work?",
)


In [15]:
dir(response)

['__abstractmethods__',
 '__annotations__',
 '__class__',
 '__class_getitem__',
 '__class_vars__',
 '__copy__',
 '__deepcopy__',
 '__delattr__',
 '__dict__',
 '__dir__',
 '__doc__',
 '__eq__',
 '__fields__',
 '__fields_set__',
 '__format__',
 '__ge__',
 '__get_pydantic_core_schema__',
 '__get_pydantic_json_schema__',
 '__getattr__',
 '__getattribute__',
 '__getstate__',
 '__gt__',
 '__hash__',
 '__init__',
 '__init_subclass__',
 '__iter__',
 '__le__',
 '__lt__',
 '__module__',
 '__ne__',
 '__new__',
 '__pretty__',
 '__private_attributes__',
 '__pydantic_complete__',
 '__pydantic_core_schema__',
 '__pydantic_custom_init__',
 '__pydantic_decorators__',
 '__pydantic_extra__',
 '__pydantic_fields_set__',
 '__pydantic_generic_metadata__',
 '__pydantic_init_subclass__',
 '__pydantic_parent_namespace__',
 '__pydantic_post_init__',
 '__pydantic_private__',
 '__pydantic_root_model__',
 '__pydantic_serializer__',
 '__pydantic_validator__',
 '__reduce__',
 '__reduce_ex__',
 '__repr__',
 '__repr_a

In [58]:
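# Read the field metadata from the model class; recent Pydantic releases
# deprecate accessing model_fields on an instance.
type(response).model_fields["candidates"]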

FieldInfo(annotation=Union[list[Candidate], NoneType], required=False, default=None, alias='candidates', alias_priority=1, description='Response variations returned by the model.\n      ')

## Managing Model Usage

In [60]:
response.usage_metadata

GenerateContentResponseUsageMetadata(cache_tokens_details=None, cached_content_token_count=None, candidates_token_count=1791, candidates_tokens_details=[ModalityTokenCount(modality=<MediaModality.TEXT: 'TEXT'>, token_count=1791)], prompt_token_count=6, prompt_tokens_details=[ModalityTokenCount(modality=<MediaModality.TEXT: 'TEXT'>, token_count=6)], thoughts_token_count=None, tool_use_prompt_token_count=None, tool_use_prompt_tokens_details=None, total_token_count=1797)

In [62]:
response.usage_metadata.candidates_token_count

1791

In [64]:
response.usage_metadata.prompt_token_count

6

In [66]:
response.usage_metadata.total_token_count

1797
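
The three counts above can be gathered in one place. The sketch below is a hypothetical helper (`summarize_usage` is not part of `google-genai`) that reads the same attribute names shown in the `usage_metadata` output and treats `None` as 0. It runs on a stand-in object holding the printed values, so it needs no API call.

```python
from types import SimpleNamespace


def summarize_usage(usage):
    """Collect the three token counts shown above, treating None as 0."""
    prompt = getattr(usage, "prompt_token_count", None) or 0
    output = getattr(usage, "candidates_token_count", None) or 0
    total = getattr(usage, "total_token_count", None) or 0
    return {
        "prompt_tokens": prompt,
        "output_tokens": output,
        "total_tokens": total,
        # Tokens in the total that prompt + output do not account for
        # (for example thinking or tool-use tokens, when present).
        "other_tokens": total - prompt - output,
    }


# Stand-in carrying the values printed above; with a live response,
# pass response.usage_metadata instead.
usage = SimpleNamespace(
    prompt_token_count=6,
    candidates_token_count=1791,
    total_token_count=1797,
)
print(summarize_usage(usage))
```

For the response above, the prompt and candidate counts add up to the total (6 + 1791 = 1797), so `other_tokens` is 0.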

## Response Assessment

### Number of Response Candidates

In [71]:
len(response.candidates)

1
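
The `FieldInfo` output earlier shows that `candidates` is annotated as `Union[list[Candidate], NoneType]`, so `len(response.candidates)` would raise a `TypeError` if the field were `None`. A minimal sketch of a defensive accessor follows. The helper name `candidate_texts` is hypothetical, and the `candidate.content.parts[i].text` layout is an assumption about the response object; the demo uses stand-in objects rather than a live response.

```python
from types import SimpleNamespace


def candidate_texts(response):
    """Return one text string per candidate, or [] when candidates is None."""
    texts = []
    for candidate in getattr(response, "candidates", None) or []:
        content = getattr(candidate, "content", None)
        parts = getattr(content, "parts", None) or []
        # Keep only parts that carry text; other part types are skipped.
        texts.append("".join(p.text for p in parts if getattr(p, "text", None)))
    return texts


# Stand-in objects that mimic the assumed attribute layout.
fake_response = SimpleNamespace(
    candidates=[
        SimpleNamespace(
            content=SimpleNamespace(
                parts=[SimpleNamespace(text="Hello, "), SimpleNamespace(text="world")]
            )
        )
    ]
)
print(len(candidate_texts(fake_response)))
print(candidate_texts(SimpleNamespace(candidates=None)))
```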

### Candidate Confidence

In [77]:
response.candidates[0].avg_logprobs

-0.3898322245450342
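
An average log-probability is easier to read once it is mapped back to a probability scale. The sketch below assumes `avg_logprobs` is the mean natural-log probability per generated token; the function names are hypothetical. Under that assumption, `exp(avg_logprobs)` is the geometric mean of the token probabilities and `exp(-avg_logprobs)` is the perplexity of the generated sequence.

```python
import math


def mean_token_probability(avg_logprob):
    """Geometric-mean token probability, assuming natural-log inputs."""
    return math.exp(avg_logprob)


def perplexity(avg_logprob):
    """Perplexity of the generated sequence, assuming natural-log inputs."""
    return math.exp(-avg_logprob)


avg = -0.3898322245450342  # value printed above
print(round(mean_token_probability(avg), 3))
print(round(perplexity(avg), 3))
```

Values of `avg_logprobs` closer to 0 correspond to a mean token probability closer to 1, meaning the model assigned higher probability to the tokens it emitted. This is a measure of the model's own likelihood, not of factual correctness.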

In [93]:
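# No output is displayed, which suggests token_count is None for this candidate.
response.candidates[0].token_count

In [95]:
# validate is the classmethod inherited from Pydantic's BaseModel,
# not a quality check on the response.
response.candidates[0].validate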

<bound method BaseModel.validate of <class 'google.genai.types.Candidate'>>

In [89]:
dir(response.candidates[0])

['__abstractmethods__',
 '__annotations__',
 '__class__',
 '__class_getitem__',
 '__class_vars__',
 '__copy__',
 '__deepcopy__',
 '__delattr__',
 '__dict__',
 '__dir__',
 '__doc__',
 '__eq__',
 '__fields__',
 '__fields_set__',
 '__format__',
 '__ge__',
 '__get_pydantic_core_schema__',
 '__get_pydantic_json_schema__',
 '__getattr__',
 '__getattribute__',
 '__getstate__',
 '__gt__',
 '__hash__',
 '__init__',
 '__init_subclass__',
 '__iter__',
 '__le__',
 '__lt__',
 '__module__',
 '__ne__',
 '__new__',
 '__pretty__',
 '__private_attributes__',
 '__pydantic_complete__',
 '__pydantic_core_schema__',
 '__pydantic_custom_init__',
 '__pydantic_decorators__',
 '__pydantic_extra__',
 '__pydantic_fields_set__',
 '__pydantic_generic_metadata__',
 '__pydantic_init_subclass__',
 '__pydantic_parent_namespace__',
 '__pydantic_post_init__',
 '__pydantic_private__',
 '__pydantic_root_model__',
 '__pydantic_serializer__',
 '__pydantic_validator__',
 '__reduce__',
 '__reduce_ex__',
 '__repr__',
 '__repr_a

In [36]:
print(response.text)

RLHF (Reinforcement Learning from Human Feedback) is a technique used to fine-tune large language models (LLMs) to better align with human preferences and instructions. In essence, it's a process of training a reward model based on human feedback, and then using that reward model to further train the LLM using reinforcement learning. Here's a breakdown of how it works, step-by-step:

**1. Pre-training the LLM (Optional, but Usually Necessary):**

*   **Purpose:**  Start with a pre-trained LLM.  This is crucial as it provides the model with a foundation of general language understanding and generation capabilities.
*   **Method:** The LLM is typically pre-trained on a massive dataset of text and code using self-supervised learning.  Examples include training it to predict the next word in a sequence (causal language modeling) or to mask and predict missing words.  Examples of these base models include the GPT family from OpenAI, LLaMA from Meta, and others.

**2. Generating Responses an