# Gemma Sprint : **Enhancing Emergency Response with AI**

Team Members: 박성재([github](https://github.com/SeongjaeP)), 이현준([github](https://github.com/Hyundduny)), 최상헌([github](https://github.com/csh01470))



## Ⅰ. **Project Specification**

### 01. Description

<img src="https://drive.google.com/uc?id=10_17J6prS7UR7dsAgvSaY5dYu43GIXSY" width="50%" height="auto"></img>

> This project aims to develop an AI-powered system for the automatic classification of emergency calls using the advanced language understanding capabilities of the Gemma language model. By accurately categorizing calls based on the type of emergency (fire, medical, etc.), the system will enable faster and more targeted dispatch of emergency services, potentially saving lives.

### 02. Dataset Information



> This dataset, provided by AI Hub, contains approximately 3,000 hours of emergency call audio together with the corresponding text transcripts. Each transcript is labeled with an urgency level, `상(high)`/`중(medium)`/`하(low)`, and a broad report category, `구급(ambulance services)`/`구조(rescue operations)`/`화재(fire incidents)`. A finer-grained report type is also provided, with categories such as `질병(illness)`/`심정지(cardiac arrest)`/`중상(severe injuries)`/`사고(accidents)`/`자살(suicide attempts)`. The dataset also includes address data.

#### (0) Size

- **Emergency call audio data**: 3,064 hours
- **Text data corresponding to the audio**: 158,973 cases

#### (1) Sample Audio data(.wav)

<audio controls>
  <source src="https://drive.google.com/uc?id=1hqzmqA3VJpNEJqq0nwmWgPMp7bhshjtE" type="audio/mp3">
Your browser does not support the audio element.
</audio>

#### (2) Sample Text data(.json)
```json
{
    'index': 1,
    'text': '
      [] 다.
      [] 여보세요.
      [] 여보세요 네.
      [] 네 여기 산수동 [개인정보] 아버지께서 지금 거, 뭐야 입가에 살짝 거품기 있고 일어나질 않으시거든요. 빨리 좀 와 주시겠어요.
      [] 아이고, 어, 거품 있다고요?
      [] 예 살짝 거품기가 있어요.
      [] 거 깨우면 안 일어난가요? 아버님?
      [] 네?
      [] 깨우면 안, 깨워도 안 일어나요? 반응이 없어요?
      [] 네 원래 새벽부터 일어나시는 분인데,
      [] 숨은 쉰가요?
      [] 예.
      [] 숨은 쉬어요?
      [] 계속, 네 코고 쉬는 것처럼 계속 꺽꺽대고 있어요.
      [] 아후, 뭐 그 당뇨 있어요? 당뇨?
      [] 예, 아니, 당뇨는 자세는 모르겠고, 혈압 혈압이 좀 있으신데,
      [] 아 그래요. 약간 심정지가 의심돼, 네 심정지가 의심되거든요. 잠시만요. 거기가 그 선덕사 아래죠? 선덕사아래?
      [] 네네네 맞아요.
      [] 네 알겠습니다. 그쪽으로 구급차 가기 전에 그 응급처치 안내 받아보세요.
      [] 네.
      [] 여보세요.
    ',
    'disasterLarge': '구급',
    'disasterMedium': '질병(중증)',
    'urgencyLevel': '상',
    'address': '광주광역시 동구 산수동'
}
```

(⚠️) The `text` key does not exist in the actual sample JSON; it is added here only as an **example** for demonstration.
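
As a minimal sketch of how a record like the one above could be turned into classification targets, the snippet below maps each label string to its index in a fixed label list. The label lists are the ones defined in the notebook's "Define labels" step; the helper name `encode_labels` is a hypothetical name used only for illustration.

```python
import json

# Label vocabularies, copied from the notebook's "Define labels" step.
URGENCY_LEVELS = ['상', '중', '하']
DISASTER_LARGE = ['구급', '구조', '기타', '화재']
DISASTER_MEDIUM = [
    '기타', '기타구급', '기타구조', '기타화재', '대물사고', '부상', '사고',
    '산불', '심정지', '안전사고', '약물중독', '일반화재', '임산부', '자살',
    '질병(중증 외)', '질병(중증)',
]

def encode_labels(record: dict) -> dict:
    """Hypothetical helper: map each label string to its integer class index."""
    return {
        'urgencyLevel': URGENCY_LEVELS.index(record['urgencyLevel']),
        'disasterLarge': DISASTER_LARGE.index(record['disasterLarge']),
        'disasterMedium': DISASTER_MEDIUM.index(record['disasterMedium']),
    }

# A valid-JSON version of the label fields from the sample record.
raw = (
    '{"urgencyLevel": "상", "disasterLarge": "구급", '
    '"disasterMedium": "질병(중증)", "address": "광주광역시 동구 산수동"}'
)
record = json.loads(raw)
encoded = encode_labels(record)
print(encoded)
```

Using list positions as class indices keeps the mapping reversible: the predicted index can be turned back into a label with a plain list lookup.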

### 03. Process

<img src="https://drive.google.com/uc?id=1b0qHQI3yf_hyY4LWI7PVRftfKysT3FML" width="auto" height="auto"></img>

#### (1) Set work environment

#### (2) Read (audio) data & STT(Speech-To-Text)

#### (3) Tokenizing

#### (4) Attention Encoder

#### (5-1) Labels Classifier

#### (5-2) Address Text-Generator

#### (6) Predict Data




## Ⅱ. **Code**

### 00. Set work environment

In [None]:
#(1) Install packages
!pip3 install -q -U faster_whisper
!pip3 install -q -U transformers
!pip3 install -q -U datasets
!pip3 install -q -U bitsandbytes
!pip3 install -q -U peft
!pip3 install -q -U sentence-transformers
!pip3 install -q -U faiss-gpu

In [None]:
#(2) Import packages
import os
import warnings
import glob
import requests
import json
import re
from faster_whisper import WhisperModel
import huggingface_hub
import shutil
import numpy as np
import pandas as pd
import sklearn.metrics
import torch
import torch.nn as nn
import datasets
import transformers
import peft
import faiss
import bitsandbytes as bnb
from sentence_transformers import SentenceTransformer
from transformers import AutoTokenizer, AutoModelForCausalLM
from IPython.display import Audio, display

#(3) Set options
warnings.filterwarnings(action='ignore')
np.set_printoptions(precision=4, suppress=True)

In [None]:
#(4) Check cuda(GPU)
print(f'>> cuda available : {torch.cuda.is_available()}')
if torch.cuda.is_available() :
  print(f'>> device list : ')
  for i in range(torch.cuda.device_count()):
      print(f"  - device {i}: {torch.cuda.get_device_name(i)}")

>> cuda available : True
>> device list : 
  - device 0: Tesla T4


In [None]:
#(5) Define `device`
if torch.cuda.is_available() :
  device_type = 'cuda'
else :
  device_type = 'cpu'
device = torch.device(type=device_type)

In [None]:
#(6) Login hugging-face
huggingface_hub.login()

In [None]:
#(7-1) Make user-defined functions for STT
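def split_into_sentences(text):
    # Split after each '?' or '.', consuming any whitespace that follows
    sentences = re.split(r'(?<=[?.])\s*', text)
    return sentences

def extract_text_from_file(file_path: str) :
    # Transcribe a Korean audio file with faster-whisper.
    # Returns a lazy generator of segments; transcription runs as it is iterated.
    model_size = "large-v3"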
    model = WhisperModel(model_size, device='auto', compute_type = "int8")
    segments, info = model.transcribe(
        file_path,
        language="ko",
        beam_size=10,
        word_timestamps=True,
        vad_filter=True,
        vad_parameters=dict(min_silence_duration_ms=500)
    )
    return segments

In [None]:
#(7-2) Make user-defined functions for Labels Classifier
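def download_file(file_id, file_path) :
  # Download a Google Drive file by id unless it already exists locally
  if not os.path.exists(path=file_path):
      url = f'https://drive.google.com/uc?id={file_id}'
      response = requests.get(url)
      response.raise_for_status()  # fail on HTTP errors instead of saving an error page
      with open(file=file_path, mode='wb') as file:
          file.write(response.content)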

def tokenize_txt(example):
    output = tokenizer(text=example, truncation=True, padding='max_length', max_length=128)
    return output

def compute_metrics(preds, labels):
    preds_ul, preds_dl, preds_dm = preds
    labels_ul, labels_dl, labels_dm = labels
    f1_ul = sklearn.metrics.f1_score(y_true=labels_ul, y_pred=preds_ul, average='weighted')
    f1_dl = sklearn.metrics.f1_score(y_true=labels_dl, y_pred=preds_dl, average='weighted')
    f1_dm = sklearn.metrics.f1_score(y_true=labels_dm, y_pred=preds_dm, average='weighted')
    output = {
        'f1_urgencyLevel': f1_ul,
        'f1_disasterLarge': f1_dl,
        'f1_disasterMedium': f1_dm,
    }
    return output

def evaluate_value(model, dataloader, device):
    model.eval()
    all_preds_ul = []
    all_preds_dl = []
    all_preds_dm = []
    all_labels_ul = []
    all_labels_dl = []
    all_labels_dm = []

    with torch.no_grad():
        for batch in dataloader:
            input_ids = batch['input_ids'].to(device)
            attention_mask = batch['attention_mask'].to(device)
            labels_ul = batch['urgencyLevel'].to(device)
            labels_dl = batch['disasterLarge'].to(device)
            labels_dm = batch['disasterMedium'].to(device)

            outputs = model(input_ids=input_ids, attention_mask=attention_mask)
            logits_ul = outputs['logits']['logits_ul']
            logits_dl = outputs['logits']['logits_dl']
            logits_dm = outputs['logits']['logits_dm']

            preds_ul = torch.argmax(input=logits_ul, dim=1)
            preds_dl = torch.argmax(input=logits_dl, dim=1)
            preds_dm = torch.argmax(input=logits_dm, dim=1)

            all_preds_ul.extend(preds_ul.cpu().numpy())
            all_preds_dl.extend(preds_dl.cpu().numpy())
            all_preds_dm.extend(preds_dm.cpu().numpy())
            all_labels_ul.extend(labels_ul.cpu().numpy())
            all_labels_dl.extend(labels_dl.cpu().numpy())
            all_labels_dm.extend(labels_dm.cpu().numpy())

    output = compute_metrics(
        (all_preds_ul, all_preds_dl, all_preds_dm),
        (all_labels_ul, all_labels_dl, all_labels_dm)
    )
    model.train()
    return output
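
For readers unfamiliar with the `average='weighted'` option used in `compute_metrics`, the following is a dependency-free sketch of the same idea: compute an F1 score per class, then average the scores weighted by each class's number of true samples. The function name `weighted_f1` and the toy labels are illustrative assumptions, not part of the project code.

```python
from collections import Counter

def weighted_f1(y_true, y_pred):
    """Support-weighted mean of per-class F1 scores."""
    support = Counter(y_true)          # number of true samples per class
    total = len(y_true)
    score = 0.0
    for cls, n_true in support.items():
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == cls and p == cls)
        fp = sum(1 for t, p in zip(y_true, y_pred) if t != cls and p == cls)
        fn = n_true - tp
        # n_true >= 1 here, so the denominator is always positive.
        f1 = 2 * tp / (2 * tp + fp + fn)
        score += (n_true / total) * f1
    return score

# Toy example with three classes (made-up labels, not project data).
y_true = [0, 0, 1, 1, 1, 2]
y_pred = [0, 1, 1, 1, 2, 2]
result = weighted_f1(y_true, y_pred)
print(round(result, 4))
```

Classes that never occur in `y_true` have zero support and therefore contribute nothing to the weighted average, which is why the loop only visits classes seen in the ground truth.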

In [None]:
#(7-3) Make user-defined functions for Get Address
def create_simple_prompt(user_input):
    # Prompt (Korean): "Output only the caller's address from the emergency call conversation."
    chat = [
        {"role": "user", "content": f"신고 전화 대화에서 신고자 주소만 출력해줘.\n{user_input}"}
    ]
    prompt = tokenizer.apply_chat_template(chat, tokenize=False, add_generation_prompt=True)
    return prompt

def create_rag_prompt(user_input, first_output, embedding_model, index, address_chunks, k=10):
    # Embed the first-pass answer
    answer_embedding = embedding_model.encode([first_output])

    # Search the index for the top-k entries closest to the answer embedding
    _, I = index.search(answer_embedding, k)

    # Collect the retrieved addresses as reference material
    # ("참고 자료" = reference material, "문서" = document)
    context = "참고 자료:\n"
    for i in I[0]:
        context += f"문서 {i+1}: {address_chunks[i]}\n"

    # Build the final prompt
    # (Korean: "For the final output, use the reference documents as they are.")
    chat = [
        {"role": "user", "content": f"최종 출력은 참고자료 문서를 그대로 활용해줘.\n{context}\n신고 전화 대화에서 신고자 주소만 출력해줘.\n{user_input}"}
    ]
    prompt = tokenizer.apply_chat_template(chat, tokenize=False, add_generation_prompt=True)
    return prompt


In [None]:
#(8) Make user-defined classes
class MultiLabelDataset(datasets.Dataset) :
    def __init__(self, dataset) :
        self.dataset = dataset

    def __getitem__(self, idx):
        example = self.dataset[idx]
        input_ids = torch.tensor(data=example['input_ids'])
        attention_mask = torch.tensor(data=example['attention_mask'])
        urgencyLevel = torch.tensor(data=example['urgencyLevel'])
        disasterLarge = torch.tensor(data=example['disasterLarge'])
        disasterMedium = torch.tensor(data=example['disasterMedium'])
        output = {
            'input_ids'      : input_ids,
            'attention_mask' : attention_mask,
            'urgencyLevel'   : urgencyLevel,
            'disasterLarge'  : disasterLarge,
            'disasterMedium' : disasterMedium
        }
        return output

    def __len__(self) :
        output = len(self.dataset)
        return output

class MultiPredictModel(nn.Module):
    def __init__(self, encoder, disasterLarge_labels, disasterMedium_labels, urgencyLevel_labels):
        super(MultiPredictModel, self).__init__()
        self.encoder = encoder
        hidden_size = self.encoder.config.hidden_size

        # self.dropout = nn.Dropout(p=0.1)
        self.classifiers = nn.ModuleDict(modules={
            'urgencyLevel'   : self._build_classifier(input_size=hidden_size, output_size=len(urgencyLevel_labels)),
            'disasterLarge'  : self._build_classifier(input_size=hidden_size, output_size=len(disasterLarge_labels)),
            'disasterMedium' : self._build_classifier(input_size=hidden_size, output_size=len(disasterMedium_labels)),
        })

        self.labels = {
            'urgencyLevel'   : urgencyLevel_labels,
            'disasterLarge'  : disasterLarge_labels,
            'disasterMedium' : disasterMedium_labels
        }

    def _build_classifier(self, input_size, output_size):
        return nn.Sequential(
            nn.Linear(in_features=input_size, out_features=(input_size//2)),
            nn.BatchNorm1d(num_features=(input_size//2)),
            nn.ReLU(),
            nn.Dropout(p=0.1),
            nn.Linear(in_features=(input_size//2), out_features=(input_size//4)),
            nn.BatchNorm1d(num_features=input_size//4),
            nn.ReLU(),
            nn.Dropout(p=0.1),
            nn.Linear(in_features=(input_size//4), out_features=output_size)
        )

    def forward(self, input_ids, attention_mask, disasterLarge=None, disasterMedium=None, urgencyLevel=None):
        outputs = self.encoder(input_ids=input_ids, attention_mask=attention_mask)
        # pooled_output = outputs.last_hidden_state[:, 0, :].float()
        pooled_output = outputs.last_hidden_state.mean(dim=1).float()
        # pooled_output = self.dropout(pooled_output)

        logits = {}
        for key, classifier in self.classifiers.items():
            logits[key] = classifier(pooled_output)

        total_loss = None
        CON = (
            (urgencyLevel is not None) and
            (disasterLarge is not None) and
            (disasterMedium is not None)
        )
        if CON :
            loss_fct = nn.CrossEntropyLoss()
            loss_ul = loss_fct(logits['urgencyLevel'], urgencyLevel.long())
            loss_dl = loss_fct(logits['disasterLarge'], disasterLarge.long())
            loss_dm = loss_fct(logits['disasterMedium'], disasterMedium.long())
            total_loss = loss_ul + loss_dl + loss_dm
        output = {
            'loss'   : total_loss,
            'logits' : {
                'logits_ul': logits['urgencyLevel'],
                'logits_dl': logits['disasterLarge'],
                'logits_dm': logits['disasterMedium']
            }
        }
        return output

    def predict(self, texts, tokenizer, device):
        self.eval()

        if isinstance(texts, str) :
            texts = [texts]

        encodings = tokenizer(
            text=texts,
            truncation=True,
            padding=True,
            max_length=128,
            return_tensors='pt'
        ).to(device=device)

        with torch.no_grad():
            outputs = self.forward(
                input_ids=encodings['input_ids'],
                attention_mask=encodings['attention_mask']
            )

        predictions = []
        batch_size = encodings['input_ids'].shape[0]
        for i in range(batch_size):
            pred = {}
            total_values = {}
            best_values = {}
            for key in self.classifiers.keys():
                if key == 'urgencyLevel':
                    logits = outputs['logits']['logits_ul'][i]
                elif key == 'disasterLarge':
                    logits = outputs['logits']['logits_dl'][i]
                elif key == 'disasterMedium':
                    logits = outputs['logits']['logits_dm'][i]
                else:
                    raise KeyError(f"Unknown key: {key}")

                probs = torch.nn.functional.softmax(input=logits, dim=0).cpu()
                pred_class = self.labels[key][torch.argmax(input=probs).item()]
                best_values[key] = pred_class
                total_values[f'labels_{key}'] = self.labels[key]
                total_values[f'probs_{key}'] = probs.numpy().tolist()

            predictions.append({
                'bestValues'  : best_values,
                'totalValues' : total_values
            })
        return predictions
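
The `predict` method above turns each head's logits into probabilities with a softmax and reports the label with the highest probability. A minimal, dependency-free sketch of that decoding step is shown below; the logits are made-up values (not model output) and `decode_head` is a hypothetical helper name.

```python
import math

def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def decode_head(logits, labels):
    """Return (best_label, probabilities) for one classifier head."""
    probs = softmax(logits)
    best_index = max(range(len(probs)), key=probs.__getitem__)
    return labels[best_index], probs

urgency_labels = ['상', '중', '하']
example_logits = [4.0, 1.0, -2.0]   # made-up values for illustration
best, probs = decode_head(example_logits, urgency_labels)
print(best, [round(p, 4) for p in probs])
```

Subtracting the maximum logit before exponentiating does not change the result but avoids overflow for large logits.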

### 01. Read dataset

In [None]:
#(1) Download audio file (or another sample)
sample_wav_path = 'sample_audio.wav'
sample_wav_id = '1hqzmqA3VJpNEJqq0nwmWgPMp7bhshjtE'
download_file(file_id=sample_wav_id, file_path=sample_wav_path)

#(2) Play audio file
display(Audio(data=sample_wav_path, autoplay=False))

In [None]:
#(3) Define labels
urgencyLevel_unique_lbs = ['상', '중', '하']
disasterLarge_unique_lbs = ['구급', '구조', '기타', '화재']
disasterMedium_unique_lbs = [
  '기타', '기타구급', '기타구조', '기타화재', '대물사고', '부상', '사고',
  '산불', '심정지', '안전사고', '약물중독', '일반화재', '임산부', '자살',
  '질병(중증 외)',  '질병(중증)'
]

#(4) Download text file
sample_txt_path = 'sample_txt.json'
sample_txt_id = '1Himjdq4q8THIWAPPTPwV43KNwgxYU4-x'
download_file(file_id=sample_txt_id, file_path=sample_txt_path)

#(5) Read text file
with open(file=sample_txt_path, mode='r', encoding='utf-8') as file:
    sample_txt = json.load(file)
    sample_txt = {
      'urgencyLevel' : sample_txt['urgencyLevel'],
      'disasterLarge' : sample_txt['disasterLarge'],
      'disasterMedium' : sample_txt['disasterMedium'],
      'address' : sample_txt['address']
    }

#(7) Print text file
sample_txt

{'urgencyLevel': '상',
 'disasterLarge': '구급',
 'disasterMedium': '질병(중증)',
 'address': '광주광역시 동구 산수동'}

In [None]:
#(8) Download address
address_path = 'building_road_address.csv'
address_id = '1V-_U-3Lyjd0AiJQ-UYmXO8qL_h_FlGWf'
download_file(file_id=address_id, file_path=address_path)

#(9) Read address file
address_data = pd.read_csv(address_path, encoding='utf-8')

#(10) Print address data
address_data

| Unnamed: 0 | 도로명주소 |
|---|---|
| 0 | 부산광역시 중구 초량상로 13 우남이채롬 (영주동) |
| 1 | 부산광역시 중구 동영로 52-8 경원빌라 (영주동) |
| 2 | 부산광역시 중구 동영로 56 한솔아트빌라 (영주동) |
| 3 | 부산광역시 중구 동영로 63 동국쉐르빌 (영주동) |
| 4 | 부산광역시 중구 동영로 66 홍조이브빌A (영주동) |
| ... | ... |
| 186231 | 서울특별시 강동구 아리수로97길 19 강일리버파크 (강일동) |
| 186232 | 서울특별시 강동구 아리수로97길 20 강일리버파크 (강일동) |
| 186233 | 서울특별시 강동구 아리수로97길 68 강일리버파크 (강일동) |
| 186234 | 서울특별시 강동구 아리수로98길 25 강일리버파크 (강일동) |


### 02. STT(Speech-To-Text)

In [None]:
#(1) Transcribe the sample audio (returns a lazy generator of segments)
extracted_text = extract_text_from_file(file_path=sample_wav_path)

In [None]:
#(2) Format each transcribed sentence as a '[] sentence' line
stt_txt = ''
for segment in extracted_text :
    sentences = split_into_sentences(segment.text)
    for sentence in sentences :
        if sentence :
            sentence = f'[] {sentence} \n'
            # Collapse runs of spaces into a single space
            sentence = re.sub(pattern=r' {2,}', repl=' ', string=sentence)
            stt_txt += sentence

#(3) Print the formatted transcript
print(stt_txt)

[] 여보세요? 
[] 네, 여기 산수동 아버지께서 지금 거 뭐야 입가에 살짝 거품기 있고 일어나지 않으시거든요. 
[] 빨리 좀 와주시겠어요? 
[] 어 거품이 있다고요? 
[] 네, 살짝 거품기가 있어요. 
[] 그 깨우면 안 일어나는가요? 
[] 아버님? 
[] 네? 
[] 깨우면 아 깨워도 안 일어나요? 
[] 반응이 없어요? 
[] 네, 원래 새벽부터 일어나시는 분인데 숨은 쉰가요? 
[] 네. 
[] 숨은 쉬어요? 
[] 그래서 네, 코 고시는 것처럼 계속 아우 뭐 그럼 당뇨 있어요? 
[] 당뇨? 
[] 예, 아니, 당뇨는 당연히 모르겠고 혈압이 좀 있으신데. 
[] 네, 심정지 의심 되거든요. 
[] 잠시만요. 
[] 거기가 그 선덕사 아래죠? 
[] 선덕사 아래. 
[] 네, 네, 네, 맞아요. 
[] 네, 알겠습니다. 
[] 그쪽으로 구급차 가기 전에 그 응급처치 안내 받아보세요. 
[] 네. 
[] 여보세요? 
[] 네, 알겠습니다. 
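
To make the formatting step easier to follow, here is a small self-contained sketch of how the sentence splitter and whitespace cleanup behave on a short string. `split_into_sentences` mirrors the notebook's function; `format_segment` is a hypothetical helper that bundles the loop body, and the input string is an invented example.

```python
import re

def split_into_sentences(text):
    # Split after '?' or '.', consuming any whitespace that follows.
    return re.split(r'(?<=[?.])\s*', text)

def format_segment(text):
    """Hypothetical helper: one '[] sentence' line per non-empty sentence."""
    lines = []
    for sentence in split_into_sentences(text):
        if sentence:  # the split leaves an empty string after a final '.' or '?'
            line = f'[] {sentence}'
            # Collapse runs of spaces into a single space.
            lines.append(re.sub(r' {2,}', ' ', line))
    return lines

lines = format_segment('여보세요?  네, 알겠습니다. 잠시만요.')
print('\n'.join(lines))
```

The lookbehind keeps the punctuation attached to the sentence it ends, which is why each output line still carries its `?` or `.`.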




### 03. Tokenizing

In [None]:
#(1) Load tokenizer
base_model_path = 'google/gemma-2-2b-it'
tokenizer = transformers.AutoTokenizer.from_pretrained(
    pretrained_model_name_or_path=base_model_path
)
tokenizer.padding_side = 'right'

In [None]:
#(2) Check the tokenizer's vocabulary size
print(f'>> Vocab size of the tokenizer "{base_model_path}" : {len(tokenizer.get_vocab()):,}')

>> Vocab size of the tokenizer "google/gemma-2-2b-it" : 256,000


### 04. Use Labels Classifier

In [None]:
#(1) Configure 4-bit NF4 quantization
bnb_config = transformers.BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_use_double_quant=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16
)

#(2) Load the quantized base model to use as the encoder
base_model = transformers.AutoModel.from_pretrained(
    pretrained_model_name_or_path=base_model_path,
    quantization_config=bnb_config,
    device_map='auto'
)


In [None]:
#(3) Wrap the quantized encoder with the three classification heads
ft_model = MultiPredictModel(
    encoder=base_model,
    urgencyLevel_labels=urgencyLevel_unique_lbs,
    disasterLarge_labels=disasterLarge_unique_lbs,
    disasterMedium_labels=disasterMedium_unique_lbs
).to(device=device)

#(4) Download the fine-tuned weights from the Hugging Face Hub
repo_id = "sanghe0n/Gemma2-multiLabelPredict"
file_nm = 'modelState.bin'
best_model_path = huggingface_hub.hf_hub_download(repo_id=repo_id, filename=file_nm)
state_dict = torch.load(f=best_model_path, map_location=device)

#(5) Load the weights; report a failure instead of silently ignoring it
try :
  ft_model.load_state_dict(state_dict=state_dict, strict=False)
except RuntimeError as error :
  print(f'>> Failed to load state dict : {error}')

In [None]:
#(5) Switch the model to evaluation mode
ft_model.eval()

MultiPredictModel(
  (encoder): Gemma2Model(
    (embed_tokens): Embedding(256000, 2304, padding_idx=0)
    (layers): ModuleList(
      (0-25): 26 x Gemma2DecoderLayer(
        (self_attn): Gemma2Attention(
          (q_proj): Linear4bit(in_features=2304, out_features=2048, bias=False)
          (k_proj): Linear4bit(in_features=2304, out_features=1024, bias=False)
          (v_proj): Linear4bit(in_features=2304, out_features=1024, bias=False)
          (o_proj): Linear4bit(in_features=2048, out_features=2304, bias=False)
          (rotary_emb): Gemma2RotaryEmbedding()
        )
        (mlp): Gemma2MLP(
          (gate_proj): Linear4bit(in_features=2304, out_features=9216, bias=False)
          (up_proj): Linear4bit(in_features=2304, out_features=9216, bias=False)
          (down_proj): Linear4bit(in_features=9216, out_features=2304, bias=False)
          (act_fn): PytorchGELUTanh()
        )
        (input_layernorm): Gemma2RMSNorm((2304,), eps=1e-06)
        (pre_feedforward_layernor

In [None]:
#(6) Predict the labels for the transcript
pred_labels = ft_model.predict(texts=stt_txt, tokenizer=tokenizer, device=device)[0]

#(7) Show the predictions
pred_labels

{'bestValues': {'urgencyLevel': '상',
  'disasterLarge': '구급',
  'disasterMedium': '질병(중증)'},
 'totalValues': {'labels_urgencyLevel': ['상', '중', '하'],
  'probs_urgencyLevel': [0.9999204874038696,
   7.169481978053227e-05,
   7.883282705734018e-06],
  'labels_disasterLarge': ['구급', '구조', '기타', '화재'],
  'probs_disasterLarge': [1.0,
   2.8802134011129965e-09,
   4.288197574808805e-10,
   5.179130901922235e-09],
  'labels_disasterMedium': ['기타',
   '기타구급',
   '기타구조',
   '기타화재',
   '대물사고',
   '부상',
   '사고',
   '산불',
   '심정지',
   '안전사고',
   '약물중독',
   '일반화재',
   '임산부',
   '자살',
   '질병(중증 외)',
   '질병(중증)'],
  'probs_disasterMedium': [3.73436659373283e-09,
   5.879176751477644e-05,
   1.9575259191384475e-09,
   1.661844883926733e-08,
   2.408330423975258e-09,
   1.9992143052149913e-08,
   3.899120581962734e-09,
   3.756706057345127e-09,
   0.008578823879361153,
   2.2477415484445373e-09,
   6.831772338955489e-07,
   1.2230373158672592e-08,
   0.004659168887883425,
   7.623347092478028e-11,
   0

### 05. Use Address-Inferencer By RAG

#### (`PLUS`) Reason for Using a **RAG** Model for Address Estimation
> Emergency calls to 119 are routed from the fire station at the legal-dong (neighborhood) level up to the fire center at the city level. When the fire center receives a transferred call and asks for the caller's location, the caller typically describes it by referring to a building. Because the fire center oversees every local fire station in its area, and several buildings can have similar names, a building name alone is often not enough to pinpoint the exact address.
>
> To address this, we built a Retrieval-Augmented Generation (RAG) system on top of the building road-address database. The system retrieves the five building addresses within the fire station's jurisdiction (legal dong) whose names are most similar to the building name given by the caller and adds them to the prompt, so that the model can report the caller's location in the correct address format.
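
The retrieval idea described above can be sketched without any external libraries. In the sketch below, `toy_embed` (character-bigram counts) is only a stand-in assumption for the sentence-embedding model, and the brute-force L2 ranking stands in for the FAISS index; `top_k_addresses` and `build_context` are hypothetical helper names. The addresses are taken from the prompt output shown later in this notebook.

```python
from collections import Counter
import math

def toy_embed(text):
    """Stand-in for a sentence embedding: character-bigram counts (assumption)."""
    return Counter(text[i:i + 2] for i in range(len(text) - 1))

def l2_distance(a, b):
    """Euclidean distance between two sparse count vectors."""
    keys = set(a) | set(b)
    return math.sqrt(sum((a[k] - b[k]) ** 2 for k in keys))

def top_k_addresses(query, addresses, k=5):
    """Rank addresses by distance between the query and each building name."""
    query_vec = toy_embed(query)
    scored = []
    for address in addresses:
        building = address.split()[-2]   # token just before the '(dong)' suffix
        scored.append((l2_distance(query_vec, toy_embed(building)), address))
    scored.sort(key=lambda pair: pair[0])
    return [address for _, address in scored[:k]]

def build_context(candidates):
    """Format candidates as reference lines ('참고 자료' = reference material)."""
    lines = [f'문서 {i + 1}: {address}' for i, address in enumerate(candidates)]
    return '참고 자료:\n' + '\n'.join(lines)

ADDRESSES = [
    '광주광역시 동구 동계로 39-1 산수빌리지B동 (산수동)',
    '광주광역시 동구 동계로 39 산수빌리지A동 (산수동)',
    '광주광역시 동구 무등로 417-9 동진맨션 (산수동)',
    '광주광역시 동구 경양로367번길 17 산수맨션 (산수동)',
]
top = top_k_addresses('산수맨션', ADDRESSES, k=2)
print(build_context(top))
```

Embedding only the building name, rather than the whole address line, keeps the shared city and district tokens from dominating the distance.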

In [None]:
#(1) Load the quantized causal LM used for address generation
model = AutoModelForCausalLM.from_pretrained(base_model_path, quantization_config=bnb_config, low_cpu_mem_usage=True)

In [None]:
#(2) Load the sentence-embedding model used for retrieval
embedding_model = SentenceTransformer("all-MiniLM-L6-v2")

In [None]:
#(3) Take the conversation and the address from the data
text = stt_txt
address = sample_txt['address']

#(4) Keep only the database addresses that fall within the report center's area
specific_word = address.split(' ')[-1]
filtered_address = address_data[address_data['도로명주소'].str.contains(specific_word, na=False)]

#(5) Chunk the filtered addresses and embed only the building names for retrieval
address_chunks = filtered_address['도로명주소'].tolist()
building_chunks = [chunk.split()[-2] for chunk in address_chunks]
embeddings = embedding_model.encode(building_chunks)
d = embeddings.shape[1]

#(6) Indexing embeddings using FAISS
index = faiss.IndexFlatL2(d)
index.add(embeddings)

In [None]:
#(7) First pass: ask the model for the address without retrieved context
prompt = create_simple_prompt(text)
# print(prompt)
inputs = tokenizer.encode(prompt, add_special_tokens=False, return_tensors="pt")
outputs = model.generate(input_ids=inputs.to(model.device), max_new_tokens=150)
# print(tokenizer.decode(outputs[0]).split("<start_of_turn>model")[1])
first_output = tokenizer.decode(outputs[0]).split('\n')[-2]


In [None]:
#(8) Build the RAG prompt with the top-5 retrieved addresses
prompt = create_rag_prompt(text, first_output, embedding_model, index, address_chunks, k=5)
print(prompt)

<bos><start_of_turn>user
최종 출력은 참고자료 문서를 그대로 활용해줘.
참고 자료:
문서 11: 광주광역시 동구 동계로 39-1 산수빌리지B동 (산수동)
문서 10: 광주광역시 동구 동계로 39 산수빌리지A동 (산수동)
문서 12: 광주광역시 동구 동계로 39-2 산수빌리지C동 (산수동)
문서 2: 광주광역시 동구 무등로 417-9 동진맨션 (산수동)
문서 19: 광주광역시 동구 경양로367번길 17 산수맨션 (산수동)

신고 전화 대화에서 신고자 주소만 출력해줘.
[] 여보세요? 
[] 네, 여기 산수동 아버지께서 지금 거 뭐야 입가에 살짝 거품기 있고 일어나지 않으시거든요. 
[] 빨리 좀 와주시겠어요? 
[] 어 거품이 있다고요? 
[] 네, 살짝 거품기가 있어요. 
[] 그 깨우면 안 일어나는가요? 
[] 아버님? 
[] 네? 
[] 깨우면 아 깨워도 안 일어나요? 
[] 반응이 없어요? 
[] 네, 원래 새벽부터 일어나시는 분인데 숨은 쉰가요? 
[] 네. 
[] 숨은 쉬어요? 
[] 그래서 네, 코 고시는 것처럼 계속 아우 뭐 그럼 당뇨 있어요? 
[] 당뇨? 
[] 예, 아니, 당뇨는 당연히 모르겠고 혈압이 좀 있으신데. 
[] 네, 심정지 의심 되거든요. 
[] 잠시만요. 
[] 거기가 그 선덕사 아래죠? 
[] 선덕사 아래. 
[] 네, 네, 네, 맞아요. 
[] 네, 알겠습니다. 
[] 그쪽으로 구급차 가기 전에 그 응급처치 안내 받아보세요. 
[] 네. 
[] 여보세요? 
[] 네, 알겠습니다.<end_of_turn>
<start_of_turn>model



In [None]:
#(9) Second pass: generate the address from the RAG prompt and clean the output
inputs = tokenizer.encode(prompt, add_special_tokens=False, return_tensors="pt")
outputs = model.generate(input_ids=inputs.to(model.device), max_new_tokens=150)
pred_address = tokenizer.decode(outputs[0]).split("<start_of_turn>model")[1].replace('<end_of_turn>', '')
pred_address = re.sub(pattern=r'\n', repl='', string=pred_address)
pred_address = pred_address.strip()

#(10) Show the predicted address
pred_address

'산수동'
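
The post-processing in step (9) can be illustrated on a plain string. The sketch below is a slightly more defensive variant, not the notebook's exact code: it takes the last model turn rather than the second split element, collapses all whitespace, and also strips `<eos>` (assumed here to be the end-of-sequence marker). `extract_model_reply` is a hypothetical helper name and the decoded string is a shortened, hand-written example.

```python
import re

def extract_model_reply(decoded: str) -> str:
    """Return the text of the last model turn from a decoded chat string."""
    reply = decoded.split('<start_of_turn>model')[-1]
    # Remove turn/sequence terminators (the '<eos>' marker is an assumption).
    reply = reply.replace('<end_of_turn>', '').replace('<eos>', '')
    # Collapse newlines and repeated spaces, then trim.
    return re.sub(r'\s+', ' ', reply).strip()

decoded = (
    '<bos><start_of_turn>user\n'
    '신고 전화 대화에서 신고자 주소만 출력해줘.\n'
    '[] 네, 여기 산수동 아버지께서 일어나지 않으시거든요.<end_of_turn>\n'
    '<start_of_turn>model\n'
    '산수동<end_of_turn>\n'
)
reply = extract_model_reply(decoded)
print(reply)
```

Splitting on the model-turn marker is less brittle than indexing lines from the end of the decoded text, since the number of trailing newlines can vary between generations.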

### 06. Predict Sample Data

In [None]:
print('>> Input ::')
print(stt_txt)
print('>> Labels ::')
print(f'  - 긴급정도 : {sample_txt["urgencyLevel"]}')
print(f'  - 신고 대분류 : {sample_txt["disasterLarge"]}')
print(f'  - 신고 소분류 : {sample_txt["disasterMedium"]}')
print(f'  - 신고 주소 : {sample_txt["address"]}')
print(f' ')
print('>> Best predictions ::')
print(f'  - 긴급정도 : {pred_labels["bestValues"]["urgencyLevel"]}')
print(f'  - 신고 대분류 : {pred_labels["bestValues"]["disasterLarge"]}')
print(f'  - 신고 소분류 : {pred_labels["bestValues"]["disasterMedium"]}')
print(f'  - 신고 주소 : {pred_address}')
print(f' ')
print('>> Total predictions ::')
print(f'  - 긴급정도 라벨   : {pred_labels["totalValues"]["labels_urgencyLevel"]}')
print(f'  - 긴급정도 확률   : {pred_labels["totalValues"]["probs_urgencyLevel"]}')
print(f'  - 신고 대분류 라벨 : {pred_labels["totalValues"]["labels_disasterLarge"]}')
print(f'  - 신고 대분류 확률 : {pred_labels["totalValues"]["probs_disasterLarge"]}')
print(f'  - 신고 소분류 라벨 : {pred_labels["totalValues"]["labels_disasterMedium"]}')
print(f'  - 신고 소분류 확률 : {pred_labels["totalValues"]["probs_disasterMedium"]}')

>> Input ::
[] 여보세요? 
[] 네, 여기 산수동 아버지께서 지금 거 뭐야 입가에 살짝 거품기 있고 일어나지 않으시거든요. 
[] 빨리 좀 와주시겠어요? 
[] 어 거품이 있다고요? 
[] 네, 살짝 거품기가 있어요. 
[] 그 깨우면 안 일어나는가요? 
[] 아버님? 
[] 네? 
[] 깨우면 아 깨워도 안 일어나요? 
[] 반응이 없어요? 
[] 네, 원래 새벽부터 일어나시는 분인데 숨은 쉰가요? 
[] 네. 
[] 숨은 쉬어요? 
[] 그래서 네, 코 고시는 것처럼 계속 아우 뭐 그럼 당뇨 있어요? 
[] 당뇨? 
[] 예, 아니, 당뇨는 당연히 모르겠고 혈압이 좀 있으신데. 
[] 네, 심정지 의심 되거든요. 
[] 잠시만요. 
[] 거기가 그 선덕사 아래죠? 
[] 선덕사 아래. 
[] 네, 네, 네, 맞아요. 
[] 네, 알겠습니다. 
[] 그쪽으로 구급차 가기 전에 그 응급처치 안내 받아보세요. 
[] 네. 
[] 여보세요? 
[] 네, 알겠습니다. 

>> Labels ::
  - 긴급정도 : 상
  - 신고 대분류 : 구급
  - 신고 소분류 : 질병(중증)
  - 신고 주소 : 광주광역시 동구 산수동
 
>> Best predictions ::
  - 긴급정도 : 상
  - 신고 대분류 : 구급
  - 신고 소분류 : 질병(중증)
  - 신고 주소 : 산수동
 
>> Total predictions ::
  - 긴급정도 라벨   : ['상', '중', '하']
  - 긴급정도 확률   : [0.9999204874038696, 7.169481978053227e-05, 7.883282705734018e-06]
  - 신고 대분류 라벨 : ['구급', '구조', '기타', '화재']
  - 신고 대분류 확률 : [1.0, 2.8802134011129965e-09, 4.288197574808805e-10, 5.179130901922235e-09]
  - 신고 소분류 라벨 : ['기타', '기타구급', '기타구조', '기타화재', '대물사


## Ⅲ. **Reference**




### 01. Whisper

- [Introduction](https://openai.com/index/whisper/)

- [Read Paper](https://cdn.openai.com/papers/whisper.pdf)

### 02. Gemma

- [Read Paper](https://arxiv.org/pdf/2408.00118)