# Advanced Prompting for Text-to-SQL: Few-Shot and Divide and Prompt
Using advanced prompting techniques to convert natural-language questions into SQL

---

## Recommended SageMaker Environment
SageMaker image: sagemaker-distribution-cpu

Kernel: Python 3

Instance type: ml.m5.large

---

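## Contents

1. [Install dependencies](#step-1-install-dependencies)
1. [Set up Bedrock and the database](#step-2-set-up-the-bedrock-client-and-database-connection)
1. [Build the database](#step-3-build-the-database)
1. [Create helper functions](#step-4-create-helper-functions)
1. [Run the Few-Shot prompt](#step-5-run-the-few-shot-prompt)
1. [Create the Divide & Prompt prompt](#step-6-create-the-divide--prompt-prompt)
1. [Run the Divide & Prompt prompt](#step-7-run-the-divide--prompt-prompt)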

---

## Learning Objectives
This notebook provides code snippets that help you implement two different approaches to converting a natural-language question into a SQL query that answers it.

---

## Approaches to the Text-to-SQL Problem

This notebook covers two approaches:

### Few-shot text-to-SQL
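Few-shot text-to-SQL converts a natural-language question into a SQL query against a database using only a handful of worked examples.

Given a few examples that pair a natural-language question with its equivalent SQL query, the model can pick up the mapping from natural language to SQL from the prompt alone.

This approach follows the paper "Enhancing Few-shot Text-to-SQL Capabilities of Large Language Models", which notes that in-context learning (ICL) has emerged as a new approach to a wide range of natural language processing tasks: a large language model (LLM) makes predictions based on a context supplemented with a few examples or task-specific instructions.

Reference: https://arxiv.org/abs/2305.12586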
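
As a rough illustration of the idea, the sketch below assembles a few-shot prompt from question/SQL pairs. The example pairs, the `Question:`/`SQL:` layout, and the function name `build_few_shot_prompt` are assumptions made for illustration; they are not taken from the paper. The table name matches the one this notebook creates later.

```python
from typing import List, Tuple

# Hypothetical question/SQL pairs, written against the TEST_EMPLOYEE_LLM table
# that this notebook creates in a later step.
EXAMPLES: List[Tuple[str, str]] = [
    ("How many employees are there?",
     "SELECT COUNT(*) FROM LLM_DEMO.TEST_EMPLOYEE_LLM"),
    ("List the names of employees in Seattle.",
     "SELECT NAME FROM LLM_DEMO.TEST_EMPLOYEE_LLM WHERE CITY = 'Seattle'"),
]


def build_few_shot_prompt(examples: List[Tuple[str, str]], question: str) -> str:
    """Render the examples, then the new question with its SQL slot left open."""
    parts = []
    for example_question, example_sql in examples:
        parts.append(f"Question: {example_question}\nSQL: <sql>{example_sql}</sql>")
    parts.append(f"Question: {question}\nSQL:")
    return "\n\n".join(parts)


few_shot_prompt = build_few_shot_prompt(EXAMPLES, "What is the highest salary?")
print(few_shot_prompt)
```

The model sees two solved examples before the open question, which is what lets it infer both the table conventions and the expected output format.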

### Divide-and-Prompt
In Divide-and-Prompt, the task is first divided into subtasks, and each subtask is then tackled with chain-of-thought (CoT) reasoning. The paper's experiments show that these prompts help LLMs generate more accurate SQL statements.

Reference: https://arxiv.org/abs/2304.11556

![Divide-and-Prompt example](./images/DnP.png)
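
One simple decomposition, and the one this notebook uses, is to draft a query first and then ask the model to check it. The sketch below shows only that control flow. The function names and the two stand-in callables are hypothetical, and a real run would call an LLM in both stages.

```python
from typing import Callable


def divide_and_prompt(question: str,
                      generate_sql: Callable[[str], str],
                      review_sql: Callable[[str, str], str]) -> str:
    """Stage 1 drafts a query; stage 2 checks the draft and may replace it."""
    draft_sql = generate_sql(question)
    reviewed_sql = review_sql(question, draft_sql)
    return reviewed_sql


# Stand-in callables so the sketch runs without a model.
def fake_generate(question: str) -> str:
    # Pretend the first attempt used a wrong table name.
    return "SELECT CITY FROM EMPLOYEES_TYPO"


def fake_review(question: str, draft_sql: str) -> str:
    # Pretend the review stage repaired the table name.
    return draft_sql.replace("EMPLOYEES_TYPO", "LLM_DEMO.TEST_EMPLOYEE_LLM")


final_sql = divide_and_prompt("Which cities do employees work in?",
                              fake_generate, fake_review)
print(final_sql)
```

Passing the two stages in as callables keeps the flow testable without network access.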


### Tools
LangChain, Amazon Bedrock SDK (Boto3)

---

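### Step 1: Install Dependencies

Here we install all the dependencies needed to run this notebook. **You can ignore errors like the following**, which may arise from dependency conflicts with libraries this module does not use: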
```
ERROR: pip's dependency resolver does not currently take into account all the packages that are installed. This behaviour is the source of the following dependency conflicts.
dash 2.14.1 requires dash-core-components==2.0.0, which is not installed.
dash 2.14.1 requires dash-html-components==2.0.0, which is not installed.
dash 2.14.1 requires dash-table==5.0.0, which is not installed.
jupyter-ai 2.5.0 requires faiss-cpu, which is not installed.
amazon-sagemaker-jupyter-scheduler 3.0.4 requires pydantic==1.*, but you have pydantic 2.6.0 which is incompatible.
gluonts 0.13.7 requires pydantic~=1.7, but you have pydantic 2.6.0 which is incompatible.
jupyter-ai 2.5.0 requires pydantic~=1.0, but you have pydantic 2.6.0 which is incompatible.
jupyter-ai-magics 2.5.0 requires pydantic~=1.0, but you have pydantic 2.6.0 which is incompatible.
jupyter-scheduler 2.3.0 requires pydantic~=1.10, but you have pydantic 2.6.0 which is incompatible.
sparkmagic 0.21.0 requires pandas<2.0.0,>=0.17.1, but you have pandas 2.1.2 which is incompatible.
tensorflow 2.12.1 requires typing-extensions<4.6.0,>=3.6.6, but you have typing-extensions 4.9.0 which is incompatible.
```

In [None]:
!python -m ensurepip --upgrade
%pip install -qU boto3
%pip install -qU botocore
%pip install -qU langchain
%pip install -qU sqlalchemy
%pip install -qU mysql-connector-python

#### Now import the modules needed to run the notebook

In [None]:
import json
import time
import os
import sys
import random
import importlib
from functools import partial

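# Import from langchain_core: the top-level `from langchain import PromptTemplate` path is deprecated
from langchain_core.prompts import PromptTemplate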
import boto3
import sqlalchemy
from sqlalchemy import create_engine
import mysql.connector

sys.path.append('../')
import utilities as u

In [None]:
model_id: str = "anthropic.claude-v2"
# model_id: str = "amazon.titan-tg1-large"
temperature: float = 0.2
top_k: int = 200

In [None]:
run_bedrock = partial(u.run_bedrock_simple_prompt,
                      system_prompts=[],
                      model_id=model_id,
                      temperature=temperature,
                      top_k=top_k)
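
The cell above uses `functools.partial` to pin the model settings once, so later cells only pass `prompt=...` to `run_bedrock`. The sketch below shows the same pattern with a hypothetical stand-in function, since the real `u.run_bedrock_simple_prompt` lives in the `utilities` module and is not shown here.

```python
from functools import partial


# Hypothetical stand-in for u.run_bedrock_simple_prompt; it just echoes its arguments.
def fake_run_prompt(prompt, system_prompts, model_id, temperature, top_k):
    return f"[{model_id} T={temperature} k={top_k}] {prompt}"


# Pin every argument except the prompt.
run_fake = partial(fake_run_prompt,
                   system_prompts=[],
                   model_id="anthropic.claude-v2",
                   temperature=0.2,
                   top_k=200)

# Only the prompt has to be supplied at call time.
result = run_fake(prompt="Hello")
print(result)
```

Changing `model_id` or `temperature` in one place then affects every later call made through the partial.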

### Step 2: Set Up the Bedrock Client and Database Connection

In [None]:
bedrock_client = boto3.client(service_name='bedrock-runtime')

In [None]:
# Define variables for database connection details
DB_HOST, DB_USER, DB_PASSWORD = \
    u.extract_CF_outputs("RDSInstanceEndpoint", "DbUser", "DbPassword")

# Display only the host and user; avoid echoing the password into the notebook output
DB_HOST, DB_USER

In [None]:
# Establish the database connection using the variables
mydb = mysql.connector.connect(
    host=DB_HOST,
    user=DB_USER,
    password=DB_PASSWORD)
mydb

#### Use this section to list all databases that already exist on the test database server.

In [None]:
mycursor = mydb.cursor()
mycursor.execute("SHOW DATABASES")
for x in mycursor:
  print(x)

### Step 3: Build the Database
The notebook now drops the test table and the test database if they exist, then creates them again.
Finally, it inserts the test data used in the prompting examples.

In [None]:
mycursor.execute("DROP TABLE IF EXISTS LLM_DEMO.TEST_EMPLOYEE_LLM")

In [None]:
mycursor.execute("DROP DATABASE IF EXISTS LLM_DEMO")

Create the `LLM_DEMO` database

In [None]:
mycursor.execute("CREATE DATABASE LLM_DEMO")

Create the `TEST_EMPLOYEE_LLM` table, which holds fictional employee data

In [None]:
mycursor.execute("""
CREATE TABLE LLM_DEMO.TEST_EMPLOYEE_LLM -- Table name
(EMPID INT(10), -- employee id of the employee
NAME VARCHAR(20), -- name of the employee
SALARY INT(10), -- salary that the employee gets or makes
BONUS INT(10),-- bonus that the employee gets or makes
CITY VARCHAR(20), -- city where employees work from or belongs to
JOINING_DATE TIMESTAMP,-- date of joining for the employee
ACTIVE_EMPLOYEE INT(2), -- whether the employee is active(1) or inactive(0)
DEPARTMENT VARCHAR(20), -- the department name where employee works or belongs to
TITLE VARCHAR(20) -- the title in office which employees has or holds
)
""")

Insert data into the table

In [None]:
mycursor.execute("""INSERT INTO LLM_DEMO.TEST_EMPLOYEE_LLM (EMPID, NAME, SALARY, BONUS, CITY, JOINING_DATE, ACTIVE_EMPLOYEE, DEPARTMENT, TITLE)
VALUES (1, 'Xyon McFluff', 50000, 10000, 'New York', '2020-01-01 10:00:00', 1, 'Engineering', 'Manager');""")

mycursor.execute("""INSERT INTO LLM_DEMO.TEST_EMPLOYEE_LLM (EMPID, NAME, SALARY, BONUS, CITY, JOINING_DATE, ACTIVE_EMPLOYEE, DEPARTMENT, TITLE) 
VALUES (2, 'Twinkle Luna', 60000, 5000, 'Chicago', '2018-05-15 11:30:00', 1, 'Sales', 'Executive');""")

mycursor.execute("""INSERT INTO LLM_DEMO.TEST_EMPLOYEE_LLM (EMPID, NAME, SALARY, BONUS, CITY, JOINING_DATE, ACTIVE_EMPLOYEE, DEPARTMENT, TITLE)
VALUES (3, 'Zorfendorf', 45000, 2000, 'Miami', '2021-09-01 09:15:00', 1, 'Marketing', 'Associate');""")

mycursor.execute("""INSERT INTO LLM_DEMO.TEST_EMPLOYEE_LLM (EMPID, NAME, SALARY, BONUS, CITY, JOINING_DATE, ACTIVE_EMPLOYEE, DEPARTMENT, TITLE)  
VALUES (4, 'Gloobinorg', 72000, 8000, 'Seattle', '2017-04-05 14:20:00', 1, 'IT', 'Manager');""")

mycursor.execute("""INSERT INTO LLM_DEMO.TEST_EMPLOYEE_LLM (EMPID, NAME, SALARY, BONUS, CITY, JOINING_DATE, ACTIVE_EMPLOYEE, DEPARTMENT, TITLE)
VALUES (5, 'Bonkliwop', 65000, 6000, 'Denver', '2020-11-24 08:45:00', 1, 'Sales', 'Associate');""")

mycursor.execute("""INSERT INTO LLM_DEMO.TEST_EMPLOYEE_LLM (EMPID, NAME, SALARY, BONUS, CITY, JOINING_DATE, ACTIVE_EMPLOYEE, DEPARTMENT, TITLE)
VALUES (6, 'Ploopdewoop', 55000, 4000, 'Philadelphia', '2019-03-11 10:25:00', 1, 'Marketing', 'Executive');""")

mycursor.execute("""INSERT INTO LLM_DEMO.TEST_EMPLOYEE_LLM (EMPID, NAME, SALARY, BONUS, CITY, JOINING_DATE, ACTIVE_EMPLOYEE, DEPARTMENT, TITLE)
VALUES (7, 'Flooblelobber', 80000, 9000, 'San Francisco', '2016-08-20 12:35:00', 1, 'Engineering', 'Lead');""")

mycursor.execute("""INSERT INTO LLM_DEMO.TEST_EMPLOYEE_LLM (EMPID, NAME, SALARY, BONUS, CITY, JOINING_DATE, ACTIVE_EMPLOYEE, DEPARTMENT, TITLE)  
VALUES (8, 'Blippitybloop', 57000, 3000, 'Boston', '2018-12-01 15:00:00', 1, 'Finance', 'Analyst');""")

mycursor.execute("""INSERT INTO LLM_DEMO.TEST_EMPLOYEE_LLM (EMPID, NAME, SALARY, BONUS, CITY, JOINING_DATE, ACTIVE_EMPLOYEE, DEPARTMENT, TITLE)
VALUES (9, 'Snorkeldink', 74000, 7000, 'Atlanta', '2015-10-07 16:15:00', 1, 'IT', 'Lead');""")

mycursor.execute("""INSERT INTO LLM_DEMO.TEST_EMPLOYEE_LLM (EMPID, NAME, SALARY, BONUS, CITY, JOINING_DATE, ACTIVE_EMPLOYEE, DEPARTMENT, TITLE)
VALUES (10, 'Wuggawugga', 69000, 5000, 'Austin', '2017-06-19 13:45:00', 1, 'Engineering', 'Manager'); """)

mycursor.execute("""INSERT INTO LLM_DEMO.TEST_EMPLOYEE_LLM (EMPID, NAME, SALARY, BONUS, CITY, JOINING_DATE, ACTIVE_EMPLOYEE, DEPARTMENT, TITLE)
VALUES (11, 'Foofletoot', 62000, 4000, 'San Diego', '2019-02-24 17:30:00', 1, 'Sales', 'Associate');""")

mycursor.execute("""INSERT INTO LLM_DEMO.TEST_EMPLOYEE_LLM (EMPID, NAME, SALARY, BONUS, CITY, JOINING_DATE, ACTIVE_EMPLOYEE, DEPARTMENT, TITLE)
VALUES (12, 'Bonkbonk', 82000, 8000, 'Silicon Valley', '2014-12-05 09:45:00', 1, 'Engineering', 'Director');""")

mycursor.execute("""INSERT INTO LLM_DEMO.TEST_EMPLOYEE_LLM (EMPID, NAME, SALARY, BONUS, CITY, JOINING_DATE, ACTIVE_EMPLOYEE, DEPARTMENT, TITLE)
VALUES (13, 'Zippityzoom', 78000, 7500, 'New York', '2016-03-08 11:00:00', 1, 'IT', 'Manager');""")

mycursor.execute("""INSERT INTO LLM_DEMO.TEST_EMPLOYEE_LLM (EMPID, NAME, SALARY, BONUS, CITY, JOINING_DATE, ACTIVE_EMPLOYEE, DEPARTMENT, TITLE) 
VALUES (14, 'Splatchsplatch', 90000, 9500, 'Chicago', '2013-01-26 13:15:00', 1, 'Marketing', 'Director');""")

mycursor.execute("""INSERT INTO LLM_DEMO.TEST_EMPLOYEE_LLM (EMPID, NAME, SALARY, BONUS, CITY, JOINING_DATE, ACTIVE_EMPLOYEE, DEPARTMENT, TITLE)  
VALUES (15, 'Wuggles', 85000, 8000, 'Seattle', '2018-07-22 15:30:00', 1, 'Finance', 'Manager');""")

mycursor.execute("""INSERT INTO LLM_DEMO.TEST_EMPLOYEE_LLM (EMPID, NAME, SALARY, BONUS, CITY, JOINING_DATE, ACTIVE_EMPLOYEE, DEPARTMENT, TITLE)
VALUES (16, 'Boingboing', 70000, 6000, 'Miami', '2020-04-11 16:45:00', 1, 'Sales', 'Lead');""")

mycursor.execute("""INSERT INTO LLM_DEMO.TEST_EMPLOYEE_LLM (EMPID, NAME, SALARY, BONUS, CITY, JOINING_DATE, ACTIVE_EMPLOYEE, DEPARTMENT, TITLE)
VALUES (17, 'Zipzoom', 62000, 5000, 'Denver', '2017-09-18 18:00:00', 1, 'Engineering', 'Associate');""") 

mycursor.execute("""INSERT INTO LLM_DEMO.TEST_EMPLOYEE_LLM (EMPID, NAME, SALARY, BONUS, CITY, JOINING_DATE, ACTIVE_EMPLOYEE, DEPARTMENT, TITLE)  
VALUES (18, 'Wooglewoogle', 58000, 3500, 'Philadelphia', '2019-12-24 08:20:00', 1, 'IT', 'Analyst');""")

mycursor.execute("""INSERT INTO LLM_DEMO.TEST_EMPLOYEE_LLM (EMPID, NAME, SALARY, BONUS, CITY, JOINING_DATE, ACTIVE_EMPLOYEE, DEPARTMENT, TITLE)
VALUES (19, 'Flipflopglop', 75000, 7200, 'Boston', '2022-02-14 10:35:00', 1, 'Marketing', 'Lead');""")

mycursor.execute("""INSERT INTO LLM_DEMO.TEST_EMPLOYEE_LLM (EMPID, NAME, SALARY, BONUS, CITY, JOINING_DATE, ACTIVE_EMPLOYEE, DEPARTMENT, TITLE) 
VALUES (20, 'Blipblop', 68000, 6500, 'San Francisco', '2021-08-29 11:50:00', 1, 'Finance', 'Executive');""")


mydb.commit()
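
The twenty `INSERT` statements above could also be expressed as one parameterized `executemany` call. The sketch below shows the pattern against an in-memory SQLite database so that it runs without the MySQL instance. SQLite and its simplified column types are stand-ins here, not what the notebook uses.

```python
import sqlite3

# In-memory SQLite stands in for the MySQL instance so the sketch runs anywhere.
conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.execute("""
CREATE TABLE TEST_EMPLOYEE_LLM (
    EMPID INTEGER, NAME TEXT, SALARY INTEGER, BONUS INTEGER, CITY TEXT,
    JOINING_DATE TEXT, ACTIVE_EMPLOYEE INTEGER, DEPARTMENT TEXT, TITLE TEXT)
""")

# First three rows from the notebook, expressed as tuples.
rows = [
    (1, "Xyon McFluff", 50000, 10000, "New York", "2020-01-01 10:00:00", 1, "Engineering", "Manager"),
    (2, "Twinkle Luna", 60000, 5000, "Chicago", "2018-05-15 11:30:00", 1, "Sales", "Executive"),
    (3, "Zorfendorf", 45000, 2000, "Miami", "2021-09-01 09:15:00", 1, "Marketing", "Associate"),
]

# sqlite3 uses "?" placeholders; mysql-connector-python uses "%s" instead.
cur.executemany("INSERT INTO TEST_EMPLOYEE_LLM VALUES (?, ?, ?, ?, ?, ?, ?, ?, ?)", rows)
conn.commit()

row_count = cur.execute("SELECT COUNT(*) FROM TEST_EMPLOYEE_LLM").fetchone()[0]
print(row_count)
```

With `mysql.connector`, the same idea would use `mycursor.executemany(...)` with `%s` placeholders, followed by `mydb.commit()`.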

Verify that the database connection works and that records can be retrieved from the table.

In [None]:
mycursor.execute("SELECT * FROM LLM_DEMO.TEST_EMPLOYEE_LLM")
myresult = mycursor.fetchall()
for x in myresult:
  print(x)

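## Few-Shot Text-to-SQL
With the database and table populated, we are ready to explore the Few-Shot Text-to-SQL approach. First, we build a few helper functions.

### Step 4: Create Helper Functions
The `call_database` function executes a SQL query, typically to retrieve data from the database, and formats the results as a string for further processing or display.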

In [None]:
def call_database(llm_generated_response: str) -> str:
    """
    Interact with a database using an SQL query.

    Arguments:
      llm_generated_response: A string containing an SQL query to execute.
    Returns:
      A formatted string containing the results of the executed SQL query.
    """
    mycursor = mydb.cursor()
    mycursor.execute(llm_generated_response)
    myresult = mycursor.fetchall()
    output_string = ''
    for x in myresult:
        output_string = output_string + str(x) + "\n"
        print(x)
    return output_string
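
Because `call_database` executes whatever SQL the model returns, it may be worth checking the statement before running it. The sketch below is one conservative heuristic, not part of the original notebook: it accepts a single statement that starts with `SELECT` and rejects everything else. The function name `is_single_select` is hypothetical. It also rejects valid queries that contain a semicolon inside a string literal, and it is no substitute for connecting with a read-only database user.

```python
import re


def is_single_select(sql: str) -> bool:
    """Return True only for one statement that starts with SELECT."""
    # Drop surrounding whitespace and any trailing semicolons.
    statement = sql.strip().rstrip(";").strip()
    if ";" in statement:
        # A remaining semicolon suggests more than one statement.
        return False
    return re.match(r"(?i)select\b", statement) is not None


checks = [
    is_single_select("SELECT * FROM LLM_DEMO.TEST_EMPLOYEE_LLM;"),
    is_single_select("DROP TABLE LLM_DEMO.TEST_EMPLOYEE_LLM"),
    is_single_select("SELECT 1; DROP DATABASE LLM_DEMO"),
]
print(checks)  # [True, False, False]
```

A caller could run this check on the extracted query and skip the database call when it returns `False`.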

The `prepare_final_gen_text` function combines a prompt with the database query results and uses the LLM to generate text.

In [None]:
def prepare_final_gen_text(prompt_final: str, output_string: str) -> str:
    """
    Prepare the final generated text by combining a given prompt and
    database query results.

    Arguments:
      prompt_final: A string representing a prompt template for text generation.
      output_string: A string containing formatted database query results.
    Returns:
      The final generated text based on the combined prompt and database query results.
    """
    prompt_template_for_query_response = PromptTemplate.from_template(prompt_final)

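    # NOTE: `question_asked` is read from the notebook's global scope,
    # so it must be set before this function is called.
    prompt_data_for_query_response = prompt_template_for_query_response.format(
        question=question_asked, answer=output_string)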
    #print(prompt_data_for_query_response)
    final_response_text = run_bedrock(prompt=prompt_data_for_query_response)
    return final_response_text
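
`prepare_final_gen_text` reads `question_asked` from the notebook's global scope. A variant that takes the question and the model call as explicit arguments is easier to test in isolation. The sketch below is one way to write it. The names `summarize_result` and `ANSWER_TEMPLATE` are hypothetical, and plain `str.format` stands in for `PromptTemplate` so the block has no third-party dependency.

```python
from typing import Callable

# Hypothetical template; the notebook's own `prompt_final` would be used in practice.
ANSWER_TEMPLATE = "Question: {question}\nRows:\n{answer}\nAnswer in plain English."


def summarize_result(question: str, rows_text: str, llm: Callable[[str], str],
                     template: str = ANSWER_TEMPLATE) -> str:
    """Fill the template with explicit arguments, then hand the prompt to the model."""
    filled_prompt = template.format(question=question, answer=rows_text)
    return llm(filled_prompt)


# An echo stub replaces the model call so the filled prompt can be inspected.
echoed = summarize_result("How many employees are active?", "(20,)", llm=lambda p: p)
print(echoed)
```

In the notebook, `llm` could be a small wrapper such as `lambda p: run_bedrock(prompt=p)`.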

### Step 4 (continued): Create the Few-Shot Prompt
Here we design a prompt template that takes the question and the answer into account and is formatted correctly for use with Claude.

In [None]:
#prompt for the final generated text based on the combined prompt and database query result

prompt_final = """
Human: Based on the question below

{question}

the answer was given below.

{answer}

Provide the answer as a simple English statement and don't include table or schema names.
Assistant: 
"""

Building on the final prompt above, we now add a single-shot example to the context to give the model a better hint of what we expect in its response.

In [None]:
# prompt for in-context SQL generation based on NLP question

prompt = """
You are a mysql query expert whose output is a valid sql query. 

Only use the following tables:

It has the following schema:
<table_schema>
CREATE TABLE LLM_DEMO.TEST_EMPLOYEE_LLM -- Table name
(EMPID INT(10), -- employee id of the employee
NAME VARCHAR(20), -- name of the employee
SALARY INT(10), -- salary that the employee gets or makes
BONUS INT(10),-- bonus that the employee gets or makes
CITY VARCHAR(20), -- city where employees work from or belongs to
JOINING_DATE TIMESTAMP,-- date of joining for the employee
ACTIVE_EMPLOYEE INT(2), -- whether the employee is active(1) or inactive(0)
DEPARTMENT VARCHAR(20), -- the department name where employee works or belongs to
TITLE VARCHAR(20) -- the title in office which employees has or holds
)
</table_schema>

The schema name is LLM_DEMO

And here is a sample insert statement or record for your reference : 

INSERT INTO LLM_DEMO.TEST_EMPLOYEE_LLM (EMPID, NAME, SALARY, BONUS,CITY,JOINING_DATE,ACTIVE_EMPLOYEE,DEPARTMENT,TITLE) VALUES (1, 'Stuart', 25000, 5000, 'Seattle','2023-01-21 00:00:01',1,'Applications','Sr. Developer');

Please construct a valid SQL statement to answer the following question,
return only the MySQL query in between <sql></sql> tags.

Question: {question}
"""

### Step 5: Run the Few-Shot Prompt
The following cells show several questions asked in natural language and the SQL the LLM generates for them. The generated query is enclosed in `<sql></sql>` tags.

In [None]:
question_asked = "What is the total count of employees who are active in each department?"
prompt_template_for_query_generate = PromptTemplate.from_template(prompt)
prompt_data_for_query_generate = prompt_template_for_query_generate.format(
                                        question=question_asked)
llm_generated_response = run_bedrock(prompt=prompt_data_for_query_generate)
print(llm_generated_response)
llm_generated_sql = u.extract_tag(llm_generated_response, "sql")[0]
output_string = call_database(llm_generated_sql)
prepare_final_gen_text(prompt_final, output_string)
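
The cell above relies on `u.extract_tag` from the `utilities` module, whose source is not shown in this notebook. Assuming it returns the contents of every matching tag pair as a list, a minimal regex-based equivalent might look like the sketch below. The name `extract_tag_sketch` is hypothetical, and the real helper may behave differently.

```python
import re
from typing import List


def extract_tag_sketch(text: str, tag: str) -> List[str]:
    """Return the stripped contents of every <tag>...</tag> pair found in text."""
    pattern = rf"<{re.escape(tag)}>(.*?)</{re.escape(tag)}>"
    # DOTALL lets the query span several lines.
    return [match.strip() for match in re.findall(pattern, text, flags=re.DOTALL)]


response = "Here is the query:\n<sql>\nSELECT COUNT(*) FROM LLM_DEMO.TEST_EMPLOYEE_LLM\n</sql>"
queries = extract_tag_sketch(response, "sql")
print(queries)

# With no tags present the list is empty, so indexing [0] would raise IndexError.
no_tags = extract_tag_sketch("no tags here", "sql")
print(no_tags)
```

If the real helper also returns an empty list when the model omits the tags, checking the list before taking `[0]` avoids an `IndexError`.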

In [None]:
question_asked = "What is the average salary of employees in each department?"
prompt_template_for_query_generate = PromptTemplate.from_template(prompt)
prompt_data_for_query_generate = prompt_template_for_query_generate.format(question=question_asked)

llm_generated_response = run_bedrock(prompt=prompt_data_for_query_generate)
print(llm_generated_response)
llm_generated_sql = u.extract_tag(llm_generated_response, "sql")[0]
output_string = call_database(llm_generated_sql)
prepare_final_gen_text(prompt_final, output_string)

## Divide and Prompt: Chain-of-Thought Prompting for Text-to-SQL

### Step 6: Create the Divide & Prompt Prompt
We now move on to the second approach, Divide and Prompt. We start by creating a prompt template that incorporates the techniques discussed in the paper.

In [None]:
# prompt demonstrating Divide and Prompt: Chain of Thought Prompting for Text-to-SQL

prompt_check_modify_sql = """
You are a MySql query expert whose output is a valid SQL query. 

It has the following schema:
<table_schema>
CREATE TABLE LLM_DEMO.TEST_EMPLOYEE_LLM -- Table name
(EMPID INT(10), -- column employee id of the employee
NAME VARCHAR(20), -- name column of the employee
SALARY INT(10), -- salary column that the employee gets or makes
BONUS INT(10),-- bonus column that the employee gets or makes
CITY VARCHAR(20), -- city column where employees work from or belongs to
JOINING_DATE TIMESTAMP,-- date column of joining for the employee
ACTIVE_EMPLOYEE INT(2), -- column indicating whether the employee is active(1) or inactive(0)
DEPARTMENT VARCHAR(20), -- the department name column where employee works or belongs to
TITLE VARCHAR(20) -- the title column in office which employees has or holds
)
</table_schema>

This is the table(columns):
LLM_DEMO.TEST_EMPLOYEE_LLM(EMPID,NAME,SALARY,BONUS,CITY,JOINING_DATE,ACTIVE_EMPLOYEE,DEPARTMENT,TITLE)

This is the text : {question_asked}

This is the reference SQL : {llm_generated_response}
The reference SQL may be correct or incorrect
If the reference SQL is correct, just say 'it is correct'
If the reference SQL is incorrect, modify the reference SQL and output the correct SQL in <sql></sql> tags.
"""

### Step 7: Run the Divide & Prompt Prompt
The following cells demonstrate Divide and Prompt: chain-of-thought prompting for text-to-SQL. First, we make an initial attempt at generating the SQL with the few-shot template.

In [None]:
question_asked = "Which city has the most employees with a title of Lead?"

prompt_template_for_query_generate = PromptTemplate.from_template(prompt)
prompt_data_for_query_generate = \
    prompt_template_for_query_generate.format(question=question_asked)
llm_generated_response = run_bedrock(prompt=prompt_data_for_query_generate)
print(llm_generated_response)
llm_generated_sql = u.extract_tag(llm_generated_response, "sql")[0]
llm_generated_sql

Next we apply the second Divide and Prompt prompt, which reviews the generated query. The response should either confirm the query or return a corrected one.

In [None]:
prompt_template_for_query_generate = PromptTemplate.from_template(prompt_check_modify_sql)
prompt_data_for_query_generate = \
    prompt_template_for_query_generate.format(question_asked=question_asked,
                                              llm_generated_response=llm_generated_sql)
llm_generated_response = run_bedrock(prompt=prompt_data_for_query_generate)
print(llm_generated_response)
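
The check prompt asks the model either to say 'it is correct' or to return a corrected query in `<sql></sql>` tags, so the caller has to handle both outcomes. The sketch below shows one way to do that: use the corrected query when tags are present and fall back to the reference query otherwise. The function name `resolve_reviewed_sql` is hypothetical, and the fallback rule is an assumption about how the response should be interpreted.

```python
import re


def resolve_reviewed_sql(reference_sql: str, review_response: str) -> str:
    """Prefer a corrected query in <sql> tags; otherwise keep the reference query."""
    match = re.search(r"<sql>(.*?)</sql>", review_response, flags=re.DOTALL)
    if match:
        return match.group(1).strip()
    # No tags: treat the response as confirmation of the reference query.
    return reference_sql.strip()


kept = resolve_reviewed_sql("SELECT 1", "it is correct")
fixed = resolve_reviewed_sql("SELECT AGE FROM T", "<sql>SELECT NAME FROM T</sql>")
print(kept)
print(fixed)
```

The returned query can then be passed to `call_database` in the same way as in the few-shot cells.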

Let's try again, end to end, with a different question.

In [None]:
question_asked = "What is the range of joining dates for employees in each department?"

prompt_template_for_query_generate = PromptTemplate.from_template(prompt)
prompt_data_for_query_generate = \
    prompt_template_for_query_generate.format(question=question_asked)

llm_generated_response = run_bedrock(prompt=prompt_data_for_query_generate)
print(llm_generated_response)
llm_generated_sql = u.extract_tag(llm_generated_response, "sql")[0]
prompt_template_for_query_generate = PromptTemplate.from_template(prompt_check_modify_sql)
prompt_data_for_query_generate = \
    prompt_template_for_query_generate.format(question_asked=question_asked,
                                              llm_generated_response=llm_generated_sql)
llm_generated_response = run_bedrock(prompt=prompt_data_for_query_generate)
llm_generated_response

Here we run the check against a hard-coded query that we know to be wrong.

In [None]:
# deliberately wrong: nonexistent table name (TEST_EMPLOYEE_LLM_TEST) and column name (AGE)
question_asked = "What is the range of joining dates for employees in each department?"
llm_generated_sql = """
SELECT DEPARTMENT, AGE, MIN(JOINING_DATE) AS Earliest_Join_Date, MAX(JOINING_DATE) AS Latest_Join_Date
FROM LLM_DEMO.TEST_EMPLOYEE_LLM_TEST
GROUP BY DEPARTMENT
"""
prompt_template_for_query_generate = PromptTemplate.from_template(prompt_check_modify_sql)
prompt_data_for_query_generate = \
    prompt_template_for_query_generate.format(question_asked=question_asked,
                                              llm_generated_response=llm_generated_sql)
llm_generated_response = run_bedrock(prompt=prompt_data_for_query_generate)
print(llm_generated_response)
llm_generated_sql = u.extract_tag(llm_generated_response, "sql")[0]
llm_generated_sql