# ⚒️ GenAI Model Validation Workshop

[Program](https://docs.google.com/document/d/1uqOlTim6czjeK16xXz4tXvYznTxkiy19moeoGNSynSY/edit?tab=t.0#heading=h.8c6jf2k12z8) | [GitHub](https://github.com/h2oai/h2o-genai-model-validation-training)



## Load environment setup

Store the environment variables in a `.env` file in the same directory as this notebook.

For example:

In [1]:
!echo -e "\
OPENAI_API_KEY="..."\n \
H2OGPTE_URL="https://h2ogpte.dev.h2o.ai"\n \
H2OGPTE_API_KEY="..."\n \
TOKENIZERS_PARALLELISM="false"\n \
" > .env

### ☢️ Dry-run specific workaround

🚨🚨🚨 Please do not distribute 🚨🚨🚨 


In [None]:
#!curl -LJO -H 'Accept: application/octet-stream' -H 'Authorization: token <GITHUB_TOKEN>' https://api.github.com/repos/h2oai/h2o-mrm/releases/assets/205220690


In [None]:
#!curl -LJO -H 'Accept: application/octet-stream' -H 'Authorization: token <GITHUB_TOKEN>' https://api.github.com/repos/h2oai/h2o-mrm/releases/assets/205220692

In [None]:
#!wget --no-check-certificate -O t.zip 'https://docs.google.com/uc?export=download&id=1ttBMhiJbH8pfbt2h8G1LwykZEYcND3vV'

In [None]:
#!unzip t.zip

In [None]:
#!wget -q https://www.occ.treas.gov/publications-and-resources/publications/comptrollers-handbook/files/model-risk-management/pub-ch-model-risk.pdf

In [36]:
!ls -la 

total 54264
drwxr-xr-x@ 9 michal  wheel       288 Nov 14 11:21 .
drwxr-xr-x@ 3 michal  staff        96 Nov 12 18:59 ..
-rw-r--r--@ 1 michal  wheel       375 Nov 11 18:34 .env
drwxr-xr-x@ 9 michal  wheel       288 Nov 11 18:25 .venv
drwxr-xr-x@ 4 michal  wheel       128 Nov 13 23:09 h2o_mrm-0.3.0-data
-rw-r--r--@ 1 michal  wheel   4297658 Nov 13 15:26 h2o_mrm-0.3.0.ipynb
-rw-r--r--@ 1 michal  wheel  21340546 Nov 14 11:20 h2o_mrm-0.3.2-cp311-cp311-linux_x86_64.whl
-rw-r--r--@ 1 michal  wheel       166 Nov 14 11:19 local.env
-rw-r--r--@ 1 michal  wheel   2112022 Nov 10 17:56 pub-ch-model-risk.pdf


In [None]:
!pip install h2o_mrm-0.3.2-cp311-cp311-linux_x86_64.whl -c constraints-311.txt

### Workaround
This will be cleaned up...

In [2]:
import ipywidgets as widgets

In [None]:
from IPython.display import display, Javascript
# Refresh the page only if a specific flag is not set
def refresh_page_once():
    display(Javascript("""
    if (!localStorage.getItem('pageRefreshed')) {
        localStorage.setItem('pageRefreshed', 'true');
        window.location.reload();
    } else {
        localStorage.removeItem('pageRefreshed');
    }
    """))
refresh_page_once()

In [3]:
# Suppress Warnings
import warnings
warnings.filterwarnings("ignore")

# Load Environment Variables
from dotenv import load_dotenv

_ = load_dotenv()
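
A missing key tends to surface only later, as an authentication error deep inside a widget or RAG call. As a minimal sketch, the check below reports which of the variables from the `.env` example are unset or empty. The helper `missing_env_vars` is hypothetical and not part of `h2o_mrm`; it only reads `os.environ`, so it is meant to run after `load_dotenv()`.

```python
import os

# Variable names taken from the .env example at the top of this notebook.
REQUIRED_VARS = ("OPENAI_API_KEY", "H2OGPTE_URL", "H2OGPTE_API_KEY")


def missing_env_vars(required=REQUIRED_VARS, env=None):
    """Return the names in `required` that are unset or empty in `env`.

    `env` defaults to os.environ; a plain dict can be passed for testing.
    """
    env = os.environ if env is None else env
    return [name for name in required if not env.get(name)]


# Hypothetical usage: report problems early instead of failing mid-workshop.
missing = missing_env_vars()
if missing:
    print("Missing environment variables:", ", ".join(missing))
```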

In [4]:
# Python packages
from pathlib import Path

# Experiment
from h2o_mrm.experiment import Experiment

# Topic Modeling
from h2o_mrm.widgets import topic_model_widget

# Question Generation
from h2o_mrm.widgets.chunk_nav import create_qa_gen_widget
from h2o_mrm.widgets.chunk_nav.core import create_question_generator, create_summarizer

# Generated Question Evaluation
from h2o_mrm.widgets.aw_data_table import create_genqa_eval_widget

# RAG Models
from h2o_mrm.rag_models import H2OGPTERAG, H2ogpteConfig

# 1. Embedding and Explainability

The goal of this experiment is to analyze the document ["Comptroller’s Handbook: Model Risk Management"](https://www.occ.treas.gov/publications-and-resources/publications/comptrollers-handbook/files/model-risk-management/index-model-risk-management.html) in the context of RAG systems.


Topics covered
 - TODO

TODO
- Add a string representation of `Experiment` that shows what has been computed and what has not.


In [5]:
%whos

Variable                    Type                         Data/Info
------------------------------------------------------------------
Experiment                  type                         <class 'h2o_mrm.experiment.Experiment'>
H2OGPTERAG                  type                         <class 'h2o_mrm.rag_models.H2OGPTERAG'>
H2ogpteConfig               type                         <class 'h2o_mrm.rag_models.H2ogpteConfig'>
Path                        type                         <class 'pathlib.Path'>
create_genqa_eval_widget    cython_function_or_method    <cyfunction create_genqa_<...>al_widget at 0x353d25ff0>
create_qa_gen_widget        cython_function_or_method    <cyfunction create_qa_gen_widget at 0x353d24a00>
create_question_generator   cython_function_or_method    <cyfunction create_questi<...>generator at 0x353d25b10>
create_summarizer           cython_function_or_method    <cyfunction create_summarizer at 0x353d25a40>
load_dotenv                 function                     

In [6]:
%pdef Experiment

Class constructor information:
 Experiment(
    name: 'str',
    max_tokens_per_chunk: 'int | None' = None,
    embedding_model_name: 'str | None' = None,
    reset_collection: 'bool' = False,
    db_path: 'str' = './database.db',
    vector_db_path: 'str' = './vector_db',
) -> 'None'
 

The MRM experiment _OCC Handbook Analysis_ analyzes the downloaded OCC Handbook.

It performs:
 - chunking of the document using the H2OGPTe chunking strategy.
 - embedding of the chunks into vectors using the given embedding model.

TODO: remove unnecessary parameters

TODO: add a parameter to enable or disable use of the cache
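
To give a feel for what a `max_tokens_per_chunk` limit means, here is a minimal sketch of token-bounded chunking. It is an illustration under stated assumptions, not the H2OGPTe chunking strategy: whitespace-separated words stand in for the embedding model's tokens, and the function `chunk_by_tokens` is hypothetical.

```python
def chunk_by_tokens(text, max_tokens_per_chunk):
    """Greedily pack whitespace-separated tokens into chunks of bounded size.

    Assumption: a "token" is a whitespace-separated word. A real pipeline
    would count tokens with the embedding model's own tokenizer.
    """
    if max_tokens_per_chunk < 1:
        raise ValueError("max_tokens_per_chunk must be positive")
    tokens = text.split()
    return [
        " ".join(tokens[i : i + max_tokens_per_chunk])
        for i in range(0, len(tokens), max_tokens_per_chunk)
    ]


sample = "model risk management requires sound governance and validation"
for chunk in chunk_by_tokens(sample, max_tokens_per_chunk=3):
    print(chunk)
```

Every chunk except possibly the last holds exactly `max_tokens_per_chunk` tokens. Production chunkers usually also respect sentence or section boundaries, which this sketch ignores.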

  

In [31]:
exp = Experiment( 
    "OCC Handbook", # Do not change name since it is used for cache look-ups to speed up computation.
    max_tokens_per_chunk=320,
    embedding_model_name="BAAI/bge-m3",
    reset_collection=False,
    db_path="/tmp/cache/database.db",
    vector_db_path="/tmp/cache/chromadb/",
)
exp.add_documents(["./pub-ch-model-risk.pdf"])

In [32]:
exp


Name:            OCC Handbook
Docs:            ['pub-ch-model-risk.pdf']
Embedding model: BAAI/bge-m3
Chunks:          0 (max tokens/chunk: 320)
Topics:


Local cache embeddings: /tmp/cache/chromadb/
Local cache collection: /tmp/cache/database.db


#### Create Chunks

In [33]:
# Create and Save Chunks
exp_chunks = exp.chunk_documents()

TODO: filter out chunks that should not be used for topic modeling


#### Topic Modeling

In [37]:
exp.build_all_topic_models(
    n_neighbors=[40],
    n_components=[15],
    min_cluster_size=[5, 7, 9],
)

Computing all possible topic models to build ...
All possible topic models already built.
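
The call above passes a list of candidate values for each hyperparameter. Assuming one topic model is built per combination (the "all possible topic models" message suggests this, but the source does not confirm it), the grid expands as a Cartesian product. The sketch below uses only the standard library; `expand_grid` is a hypothetical helper, not an `h2o_mrm` function.

```python
from itertools import product

# Same candidate values as in the build_all_topic_models call above.
grid = {
    "n_neighbors": [40],
    "n_components": [15],
    "min_cluster_size": [5, 7, 9],
}


def expand_grid(grid):
    """Return one dict per combination of the candidate values in `grid`."""
    keys = list(grid)
    return [dict(zip(keys, values)) for values in product(*(grid[k] for k in keys))]


configs = expand_grid(grid)
print(len(configs), "configurations")
for config in configs:
    print(config)
```

With these lists the grid has 1 × 1 × 3 = 3 configurations, so adding values to any list multiplies the number of models to build.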


In [39]:
# List topic models for this experiment
exp.list_topic_models()

| | id | name | n_neighbors | n_components | min_cluster_size | silhouette_score | num_clusters |
|---|---|---|---|---|---|---|---|
| 0 | c7202784-9a72-4667-904e-1dfd76daad0a | zealous_einstein | 10 | 2 | 10 | 0.782312 | 7 |
| 1 | adaf4913-a8cd-442d-9c3d-3cbeb26c0133 | ecstatic_albattani | 10 | 2 | 11 | 0.775471 | 6 |
| 2 | d27944a4-3d27-4004-9f45-003d0c7b7889 | mystifying_kare | 25 | 25 | 9 | 0.766555 | 7 |
| 3 | 20f7470a-3a9c-48a8-8cb5-883b1db76e4f | ecstatic_beaver | 15 | 2 | 10 | 0.742892 | 6 |
| 4 | 19cc6e64-9ab6-413c-8584-15975e3a6a3b | stoic_euler | 10 | 2 | 9 | 0.742816 | 9 |
| ... | ... | ... | ... | ... | ... | ... | ... |
| 1083 | 86029502-c100-4fbf-8af8-163d4e0b64ae | ecstatic_yalow | 15 | 27 | 5 | 0.249047 | 9 |
| 1084 | d9ff2fce-f8a9-400c-b325-7c2ad701b3ed | inspiring_morse | 15 | 22 | 5 | 0.229325 | 9 |
| 1085 | e043f74c-8bbd-43a1-a6b0-49d6cea29b8c | quizzical_mcnulty | 15 | 10 | 5 | 0.214644 | 9 |
| 1086 | f313b35f-3098-4b0c-a480-35535d6fc89f | admiring_dijkstra | 20 | 35 | 5 | 0.211012 | 9 |
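
The listing is ordered by `silhouette_score`, highest first. How `h2o_mrm` computes that column is not stated here, so the sketch below only illustrates the standard silhouette definition on toy 1-D data with absolute difference as the distance. For each point, `a` is the mean distance to its own cluster and `b` is the smallest mean distance to another cluster; the score is the mean of `(b - a) / max(a, b)`.

```python
from statistics import mean


def silhouette_score(points, labels):
    """Mean silhouette coefficient for 1-D points (distance = |p - q|)."""
    clusters = {}
    for point, label in zip(points, labels):
        clusters.setdefault(label, []).append(point)
    if len(clusters) < 2:
        raise ValueError("silhouette needs at least two clusters")

    scores = []
    for point, label in zip(points, labels):
        own = clusters[label]
        if len(own) == 1:
            scores.append(0.0)  # usual convention for singleton clusters
            continue
        # a: mean distance to the other members of the point's own cluster
        a = sum(abs(point - q) for q in own) / (len(own) - 1)
        # b: smallest mean distance to the members of any other cluster
        b = min(
            mean(abs(point - q) for q in members)
            for other, members in clusters.items()
            if other != label
        )
        denom = max(a, b)
        scores.append(0.0 if denom == 0 else (b - a) / denom)
    return mean(scores)


points = [0.0, 1.0, 10.0, 11.0]
good = silhouette_score(points, [0, 0, 1, 1])  # tight, well-separated clusters
bad = silhouette_score(points, [0, 1, 0, 1])   # clusters that interleave
print(f"good={good:.3f} bad={bad:.3f}")
```

Scores near 1 indicate compact, well-separated clusters, and negative scores indicate points that sit closer to another cluster than to their own. That is why the models at the top of the listing are the more attractive candidates.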


In [40]:
exp.set_topic_model()

In [None]:
from h2o_mrm.viz import create_topics_distribution_pie

create_topics_distribution_pie(exp.chunks, exp.topic_names)

In [42]:
from h2o_mrm.viz import create_chunk_distribution_map

create_chunk_distribution_map(exp.chunks, exp.topic_names)

In [15]:
exp.set_topic_model()

tmw = topic_model_widget.create_widget(
    tm_config=exp.bertopic_model_config,
    create_topic_cluster_data=exp.build_topic_cluster_creator(
        show_doc_in_tooltip=True,
        show_topic_names=True,
    ),
    interactive=True,
)
tmw

Widget(fig_data={'data': [{'colorscale': [[0, 'rgba(99, 110, 250, 0)'], [1, 'rgba(99, 110, 250, 0.7)']], 'hove…

In [10]:
# Change the hyperparameters for the Topic Model manually to create a new custom config

# my_topic_model_id = exp.add_topic_model(tmw.topic_model_config, name="my_topic_model_1")
# exp.set_topic_model(my_topic_model_id)

In [19]:
exp.get_num_chunks_in_topic_chart()

# 2. Test Generation and Benchmarking

- Automatic Prompt engineering
- Automatic QA generation


In [None]:
llama_summerizer = create_summarizer(
    model_type="h2ogpte",
    model_name="meta-llama/Meta-Llama-3.1-70B-Instruct",
)
llama_question_generator = create_question_generator(
    model_type="h2ogpte",
    model_name="meta-llama/Meta-Llama-3.1-70B-Instruct",
)

#### Interactive Question Generation

In [None]:
question_gen_widget = create_qa_gen_widget(
    exp.chunks,
    fig_data=exp.fig_data,
    summarize_text=llama_summerizer,
    generate_questions=llama_question_generator,
)
question_gen_widget

#### Automatic Question Generation

In [None]:
# exp.generate_questions(
#     topics=[
#         2,
#     ],
#     summarizer=llama_summerizer,
#     question_generator=llama_question_generator,
#     question_generator_name="Meta-Llama-3.1-70B-Instruct",
#     sampling_method="twinning",
# )

In [None]:
generated_questions = exp.list_generated_questions()
print(len(generated_questions))
for x in generated_questions[:5]:
    print(x)

#### Evaluate Generated Questions

In [None]:
exp.validate_generated_questions()

#### Load Validated Questions in a Widget

In [None]:
validated_questions = exp.get_validated_questions()
genq_eval_widget = create_genqa_eval_widget(validated_questions)
genq_eval_widget

# 3. Eval Metrics and RAG

#### Metrics

- [X] Groundedness
- [X] Context Recall
- [X] Context Precision
- [X] Recall Relevancy
- [X] Precision Relevancy
- [X] Answer Relevancy
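
The exact definitions behind these metrics are not given in this notebook, and RAG evaluators often rely on an LLM judge rather than exact matching. Purely as intuition for the precision/recall pair, the sketch below computes set-based context precision and recall from chunk identifiers. The function name, the identifiers, and the assumption that the relevant chunks are known in advance are all hypothetical.

```python
def context_precision_recall(retrieved_ids, relevant_ids):
    """Set-based precision and recall of retrieved chunks.

    precision = share of retrieved chunks that are relevant
    recall    = share of relevant chunks that were retrieved
    """
    retrieved, relevant = set(retrieved_ids), set(relevant_ids)
    hits = retrieved & relevant
    precision = len(hits) / len(retrieved) if retrieved else 0.0
    recall = len(hits) / len(relevant) if relevant else 0.0
    return precision, recall


# Hypothetical chunk ids: 4 chunks retrieved, 3 chunks actually relevant.
precision, recall = context_precision_recall(
    retrieved_ids=["c1", "c2", "c3", "c4"],
    relevant_ids=["c2", "c4", "c7"],
)
print(f"precision={precision:.2f} recall={recall:.2f}")
```

High recall with low precision means the retriever finds the needed context but buries it among irrelevant chunks; the reverse means the retrieved context is clean but incomplete.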



#### Get Answers from RAG

In [None]:
rag_name = "h2ogpte.dev.h2o.ai"
rag_version = "1.6.0-dev28"
llm_name = "meta-llama/Meta-Llama-3.1-70B-Instruct"
llm_args = dict(
    temperature=0.0,
    seed=42,
    max_new_tokens=4096,
)

In [None]:
rag_under_test_id = exp.register_rag_under_test(
    rag_name=rag_name,
    rag_version=rag_version,
    llm_name=llm_name,
    llm_args=llm_args,
    embedding_model_name="BAAI/bge-m3",
)
rag_under_test_id

In [None]:
rag_collection_name = "OCC Handbook 3"
config = H2ogpteConfig.from_env()
rag = H2OGPTERAG(config, rag_collection_name, llm_name, llm_args)

In [None]:
rag.add_documents([Path("./pub-ch-model-risk.pdf")])

In [None]:
exp.get_answers_from_rag(
    rag_under_test_id=rag_under_test_id,
    answer_question=rag.answer_question,
)

In [None]:
exp.add_rag_chunks(rag_under_test_id, rag.get_all_chunks)

In [None]:
exp.evaluate_answers(rag_under_test_id)

In [None]:
exp.plot_metrics(rag_under_test_id)