<a href="https://colab.research.google.com/github/naok-000/lecture-ai-engineering-day3/blob/main/ai_engineering_03_homework.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# AI Engineering Practical Course: Assignment 3

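- Using the RAG (Retrieval-Augmented Generation) technique covered in the lecture, I carried out a hands-on experiment to improve the content generated by an LLM. Starting from the code used in the exercise, I ran experiments with my own questions and reference document, and evaluated the effect of RAG both quantitatively and qualitatively.
- The program was run on a T4 GPU in Google Colab (free tier).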

In [1]:
!pip install --upgrade transformers
!pip install google-colab-selenium
!pip install bitsandbytes

Collecting google-colab-selenium
  Downloading google_colab_selenium-1.0.14-py3-none-any.whl.metadata (2.7 kB)
Collecting selenium (from google-colab-selenium)
  Downloading selenium-4.32.0-py3-none-any.whl.metadata (7.5 kB)
Collecting trio~=0.17 (from selenium->google-colab-selenium)
  Downloading trio-0.30.0-py3-none-any.whl.metadata (8.5 kB)
Collecting trio-websocket~=0.9 (from selenium->google-colab-selenium)
  Downloading trio_websocket-0.12.2-py3-none-any.whl.metadata (5.1 kB)
Collecting outcome (from trio~=0.17->selenium->google-colab-selenium)
  Downloading outcome-1.3.0.post0-py2.py3-none-any.whl.metadata (2.6 kB)
Collecting wsproto>=0.14 (from trio-websocket~=0.9->selenium->google-colab-selenium)
  Downloading wsproto-1.2.0-py3-none-any.whl.metadata (5.6 kB)
Downloading google_colab_selenium-1.0.14-py3-none-any.whl (8.2 kB)
Downloading selenium-4.32.0-py3-none-any.whl (9.4 MB)

In [2]:
# Fetch the exercise materials
!git clone https://github.com/naok-000/lecture-ai-engineering-day3.git

Cloning into 'lecture-ai-engineering-day3'...
remote: Enumerating objects: 20, done.
remote: Counting objects: 100% (20/20), done.
remote: Compressing objects: 100% (17/17), done.
remote: Total 20 (delta 6), reused 0 (delta 0), pack-reused 0 (from 0)
Receiving objects: 100% (20/20), 11.41 MiB | 11.44 MiB/s, done.
Resolving deltas: 100% (6/6), done.


In [3]:
# HuggingFace Login
from huggingface_hub import notebook_login

notebook_login()


In [4]:
# Use the GPU if CUDA is available, otherwise fall back to the CPU
import torch
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

In [5]:
import random
random.seed(0)

In [13]:
# Load the model (Gemma 2)

from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

model_name = "google/gemma-2-2b-jpn-it"
tokenizer = AutoTokenizer.from_pretrained(model_name)

bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_compute_dtype=torch.float16,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_use_double_quant=False,
)

model = AutoModelForCausalLM.from_pretrained(
            model_name,
            device_map="auto",
            quantization_config=bnb_config,
            torch_dtype=torch.bfloat16,
        )


## 1. Baseline Model Evaluation

First, check how much the base model already knows without any reference material.

In [23]:
def generate_output(query):
  messages = [
      {"role": "user", "content": query},
  ]
  input_ids = tokenizer.apply_chat_template(
      messages,
      add_generation_prompt=True,
      return_tensors="pt"
  ).to(model.device)

  # Gemma 2 closes each chat turn with <end_of_turn>; <|eot_id|> is the Llama 3 terminator
  terminators = [
      tokenizer.eos_token_id,
      tokenizer.convert_tokens_to_ids("<end_of_turn>")
  ]

  outputs = model.generate(
      input_ids,
      max_new_tokens=1024,
      eos_token_id=terminators,
      do_sample=False,
      # temperature=0.6, # If do_sample=True
      # top_p=0.9,  # If do_sample=True
  )

  response = outputs[0][input_ids.shape[-1]:]
  return tokenizer.decode(response, skip_special_tokens=True)

In [24]:
questions = [
    "what are the main vulnerabilities of the OBD-II port that make it a critical entry point for vehicle malware attacks?",
    "OBD-IIポートが車両のマルウェア攻撃における重要な侵入口となる主な脆弱性は何ですか？",
    "How do different wireless technologies (such as Bluetooth, Wi-Fi, Cellular, and DSRC) collectively contribute to the increased malware attack surface in modern intelligent vehicles?",
    "Bluetooth、Wi-Fi、セルラー、DSRC などの無線通信技術は、現代のインテリジェント車両におけるマルウェア攻撃の対象領域をどのようにして拡大しているのですか？"
]

for question in questions:
    print(f"question: {question}")
    response = generate_output(question)
    print(f"response: \n{response}")

question: what are the main vulnerabilities of the OBD-II port that make it a critical entry point for vehicle malware attacks?
response: 
The OBD-II port, while a valuable tool for diagnostics and monitoring, presents several vulnerabilities that make it a tempting target for vehicle malware attacks. 

Here's a breakdown of the main vulnerabilities:

**1. Lack of Secure Protocols:**

* **Open Communication:** OBD-II operates on open protocols, meaning any device can connect and send commands. This lack of security makes it easy for attackers to exploit vulnerabilities in the communication channel.
* **No Authentication:**  There's no built-in authentication mechanism, meaning attackers can easily impersonate legitimate devices and gain unauthorized access.
* **Weak Encryption:**  The communication between the OBD-II port and the vehicle's computer is often unencrypted, making it vulnerable to eavesdropping and data interception.

**2. Limited Security Measures:**

* **Lack of Updates:

In [28]:
questions = [
    "What is the primary benefit of deploying Fixed Sensor Nodes (FSNs) in Vehicle-to-Infrastructure (V2I) cooperative systems, especially in complex urban scenarios?",
    "固定センサノード（FSN）をVehicle-to-Infrastructure（V2I）協調システムに導入する主な利点は何ですか？特に複雑な都市環境においては？",
    "How do cooperative object detection and cooperative tracking complement each other in enhancing safety and reliability in V2I autonomous driving systems, and what specific mechanisms enable this synergy?",
    "協調型物体検出と協調型追跡は、V2I型自動運転システムにおける安全性と信頼性の向上にどのように相互補完的に働きますか？また、この相乗効果を可能にする具体的なメカニズムは何ですか？",
    "Who is the author of 'Infrastructure Assisted Autonomous Driving: Research, Challenges, and Opportunities'?",
    "'Infrastructure Assisted Autonomous Driving: Research, Challenges, and Opportunities'の著者は誰ですか？"
]

for question in questions:
    print(f"question: {question}")
    response = generate_output(question)
    print(f"response: \n{response}")

question: What is the primary benefit of deploying Fixed Sensor Nodes (FSNs) in Vehicle-to-Infrastructure (V2I) cooperative systems, especially in complex urban scenarios?
response: 
The primary benefit of deploying Fixed Sensor Nodes (FSNs) in Vehicle-to-Infrastructure (V2I) cooperative systems, especially in complex urban scenarios, is **enhanced situational awareness and improved safety**. 

Here's a breakdown:

**1. Comprehensive Coverage:** FSNs can be strategically placed in urban environments to provide a wide-area view of traffic flow, road conditions, and pedestrian activity. This comprehensive coverage helps to fill gaps in the information provided by individual vehicles, leading to a more complete picture of the environment.

**2. Real-Time Data Collection:** FSNs can continuously collect and transmit data about traffic density, speed, road markings, and other relevant information. This real-time data stream provides valuable insights for both vehicles and infrastructure sys

## 2. Using RAG

In [29]:
from sentence_transformers import SentenceTransformer

emb_model = SentenceTransformer("infly/inf-retriever-v1-1.5b", trust_remote_code=True)
# In case you want to reduce the maximum length:
emb_model.max_seq_length = 4096


A new version of the following files was downloaded from https://huggingface.co/infly/inf-retriever-v1-1.5b:
- modeling_qwen.py
. Make sure to double-check they do not contain any added malicious code. To avoid downloading new versions of the code file, you can pin a revision.



A new version of the following files was downloaded from https://huggingface.co/infly/inf-retriever-v1-1.5b:
- tokenization_qwen.py
. Make sure to double-check they do not contain any added malicious code. To avoid downloading new versions of the code file, you can pin a revision.



Convert the source document from PDF to text.

Reference: https://zenn.dev/quiver/articles/21c2978cf869db
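
The cell below chunks the extracted text with a plain split on periods. As a minimal sketch of that chunking step in isolation: the helper name `split_into_chunks` is hypothetical, and dropping empty fragments is an added assumption that the notebook's own cell does not make.

```python
def split_into_chunks(text: str, sep: str = ".") -> list[str]:
    """Split text on `sep`, strip whitespace, and drop empty fragments."""
    return [chunk.strip() for chunk in text.split(sep) if chunk.strip()]


# Hypothetical sample text, not taken from the PDF
sample = "FSNs extend perception. They sit at intersections.  . "
print(split_into_chunks(sample))
# ['FSNs extend perception', 'They sit at intersections']
```

Splitting on every period also cuts inside decimals and abbreviations such as "e.g.", so chunks from a real paper can be shorter than a sentence.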


In [30]:
!pip install -U pypdfium2
import pypdfium2 as pdfium

def pdf_to_text(pdf):
    all_text = ""
    for page in pdf:
        textpage = page.get_textpage()
        text = textpage.get_text_range()
        all_text += text
    return all_text

pdf = pdfium.PdfDocument('/content/lecture-ai-engineering-day3/data/Infrastructure_Assisted_Autonomous_Driving_Research_Challenges_and_Opportunities.pdf')

raw_writedown = pdf_to_text(pdf)

documents = [text.strip() for text in raw_writedown.split(".")]

print(documents)







Extract reference passages from the document to pass to the LLM.
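
The ranking part of this step can be sketched without the embedding model. The example below is a simplified illustration in pure Python with hand-made 2-D vectors; the names `dot` and `top_k_with_context` are hypothetical, and the defaults `before=2, after=1` are chosen to mirror the slice `documents[i-2 : i+2]` used in the cell that follows.

```python
def dot(u, v):
    """Dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def top_k_with_context(scores, documents, topk=5, before=2, after=1):
    """Return the top-k chunks by score, each joined with its neighbours."""
    # Chunk indices sorted by descending score, truncated to the k best
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    passages = []
    for i in ranked[:topk]:
        start = max(0, i - before)
        stop = min(i + after + 1, len(documents))
        passages.append(".".join(documents[start:stop]).strip())
    return passages


# Toy data: hypothetical chunks and unit-length embeddings
docs = ["a", "b", "c", "d", "e"]
doc_emb = [[1.0, 0.0], [0.8, 0.6], [0.0, 1.0], [0.6, 0.8], [-1.0, 0.0]]
query_emb = [0.0, 1.0]

# For unit vectors the dot product equals the cosine similarity
scores = [dot(query_emb, d) for d in doc_emb]
print(top_k_with_context(scores, docs, topk=2))
# ['a.b.c.d', 'b.c.d.e']
```

Neighbouring hits produce overlapping passages, as in the toy output, so the same sentence can reach the LLM more than once.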

In [38]:
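# Retrieve the chunks relevant to the question and build the reference passages
def generate_ref(question, documents):
  print(f"question: {question}\n")
  # Run retrieval
  query_embeddings = emb_model.encode([question], prompt_name="query")
  document_embeddings = emb_model.encode(documents)

  # Similarity score of every chunk against the question
  scores = (query_embeddings @ document_embeddings.T) * 100
  print(scores.tolist())

  topk = 5
  references = []
  # Indices of the top-k chunks, highest score first
  top_indices = scores.argsort()[0][::-1][:topk]
  for i in top_indices:
    # Join each hit with its neighbours (two chunks before, one after) for context
    ref = ".".join(documents[max(0, i-2): min(i+2, len(documents))]).strip()

    # Japanese prompt: "Answer 'yes' or 'no' whether the given [reference] is directly related to the [question]."
    query = f"与えられた[参考資料]が[質問]に直接関連しているかを、'yes''no'で答えること。[参考資料]\n{ref}\n\n[質問] {question}"
    response = generate_output(query)

    print("\n\nCandidate passage:\n", ref.replace(".", ".\n"))
    print("\nRelevant?: ", response)

    # Keep only the passages the model judged relevant
    if "yes" in response.lower():
      references.append(ref)
  return references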

In [41]:
references_list = []
for question in questions:
  references_list.append(generate_ref(question, documents))

print(references_list)

question: What is the primary benefit of deploying Fixed Sensor Nodes (FSNs) in Vehicle-to-Infrastructure (V2I) cooperative systems, especially in complex urban scenarios?

[[57.01553726196289, 54.25518798828125, 54.95021438598633, 53.27619171142578, 52.318702697753906, 54.0478401184082, 66.72727966308594, 54.08906173706055, 55.33769226074219, 61.63075637817383, 52.77857971191406, 62.31318664550781, 64.13614654541016, 66.03884887695312, 70.3398666381836, 68.09932708740234, 68.29730224609375, 70.71939849853516, 63.60863494873047, 65.78865051269531, 69.48518371582031, 62.953060150146484, 69.99651336669922, 71.75926971435547, 71.32183074951172, 55.88545227050781, 58.53241729736328, 62.874664306640625, 65.29774475097656, 65.49638366699219, 58.1700325012207, 53.96236038208008, 54.184139251708984, 54.25286102294922, 53.04143524169922, 56.91801452636719, 64.88912200927734, 57.16991424560547, 68.82803344726562, 73.40160369873047, 73.63208770751953, 63.358734130859375, 61.35334777832031, 62.003

Pass the reference passages to the model and have it answer the questions.
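
The next cell interpolates the list of passages directly into the f-string, so the prompt contains the list's Python repr (brackets, quotes, escaped newlines). A minimal sketch of an alternative that joins the passages into plain text first is shown here; the helper `build_rag_prompt` and the sample passages are hypothetical, while the `[参考資料]` (reference) and `[質問]` (question) labels are the ones the notebook uses.

```python
def build_rag_prompt(question: str, references: list[str]) -> str:
    """Join reference passages into plain text and prepend them to the question."""
    context = "\n\n".join(references)
    return f"[参考資料]\n{context}\n\n[質問] {question}"


# Hypothetical passages, not retrieved from the PDF
refs = ["FSNs reduce onboard sensing cost", "FSNs see around occlusions"]
print(build_rag_prompt("What is the benefit of FSNs?", refs))
```

Whether the cleaner prompt changes the answers would need to be checked by rerunning the questions.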

In [43]:
for i in range(len(questions)):
  print(f"question: {questions[i]}")
  query =  f"[参考資料]\n{references_list[i]}\n\n[質問] {questions[i]}"
  response = generate_output(query)
  print(f"response: \n{response}")

question: What is the primary benefit of deploying Fixed Sensor Nodes (FSNs) in Vehicle-to-Infrastructure (V2I) cooperative systems, especially in complex urban scenarios?
response: 
FSNs の主要なメリットは、**複雑な都市環境でのV2Iシステムの効率的な動作とコスト削減**です。


**具体的には:**

* **オンボードセンサーと処理能力の削減:**  FSNs を導入することで、車両が複雑な状況を検知する際に、より多くの処理をFSNsに委託できます。これにより、車両の複雑なセンサーと処理能力の必要性を減らし、コストを削減できます。
* **より安全な運転:**  FSNs は、車両と道路の相互連携を可能にすることで、より安全な運転を実現します。




 



question: 固定センサノード（FSN）をVehicle-to-Infrastructure（V2I）協調システムに導入する主な利点は何ですか？特に複雑な都市環境においては？
response: 
## FSNをV2Iシステムに導入する主な利点

**複雑な都市環境においては、FSN導入により、以下の利点が期待できます。**

* **物理的Occlusionの克服:**  V2Iシステムは、車とFSNの協調によって、車に物理的に隠れている人体の検知を可能にします。
* **より高い精度:**  複数のセンサーからの情報を融合することで、より正確な環境モデルを構築できます。
* **コスト削減:**  車に搭載するセンサーや処理能力を削減できるため、コスト削減につながります。
* **安全性向上:**  より正確な環境認識により、危険な状況を早期に検知し、安全な運転を支援できます。
* **効率的な道路管理:**  FSNは道路の状況をリアルタイムで監視し、交通の効率性を向上させることができます。



**具体的には、複雑な都市環境では以下のようなメリットが挙げられます。**

* **密集した道路環境:**  FSNは、車と連携して道路の複雑な状況を把握し、安全な走行を支援できます。
* **視認性の制限