# Natural-Language BI for Business Data Analysis with Amazon Bedrock and LangChain

---

To run this notebook, set the notebook environment to Data Science 3.0, Python 3, and ml.t3.medium, as shown below.
<img src="images/nb-env.png" width="600"/>

---

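This notebook uses [Amazon Bedrock](https://aws.amazon.com/bedrock/) and [LangChain](https://www.langchain.com/) to provide a generative AI-based Text-to-SQL capability, which is the basis for a natural-language BI for business data analysis.

With Text-to-SQL, you connect an LLM to a SQL database that holds structured data and get answers about specific business data using natural language alone. The LLM focuses on the database to extract the answer, and because the answer comes from running SQL, it can return the exact information stored in the database. LangChain provides the functionality for connecting an LLM to other data sources, which makes it easier for developers to build complex applications.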

<img src="images/sql-agent.png" width="800"/>
Image credits: [LangChain](https://www.langchain.com/)

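### Text-to-SQL can be applied to use cases such as the following.
* Building a chatbot that answers questions based on database data
* Generating complex queries from natural language
* Widening database access for non-technical staff and stakeholders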

This notebook uses a pre-provisioned Amazon Redshift Serverless instance and uses LangChain to connect the database to the LLM.
The notebook is organized as follows.
1. Connecting to Amazon Redshift Serverless and checking the data
2. Using LangChain's SQLDatabaseChain

See the diagram below for the Text-to-SQL architecture built on Amazon Bedrock and LangChain.

<img src="images/rs-architecture.png" width="800"/>

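To have a SQL query written for a question about the data, you pass in an instruction that specifies the data you want and any conditions. The instruction is sent together with a prompt and is converted into a SQL statement suited to the target database. The rows returned by running that SQL statement are passed back to the LLM, which then presents the result in natural language as directed by the prompt. Because natural-language questions can produce even complex joins, generating SQL queries and analyzing their results becomes much easier.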
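The flow described above can be sketched in a few lines. This is only an illustration under stated assumptions: `fake_llm` is a hypothetical stand-in that returns canned text instead of calling a Bedrock model, `answer_question` is a hypothetical helper, and an in-memory SQLite database takes the place of Redshift.

```python
import sqlite3
from typing import Callable


def answer_question(question: str, conn: sqlite3.Connection,
                    llm: Callable[[str], str]) -> str:
    """Run the two LLM calls of a Text-to-SQL flow around one SQL execution."""
    # Step 1: ask the model to turn the question into SQL.
    sql = llm(f"Write one SQL query that answers: {question}")
    # Step 2: run the generated SQL against the database.
    rows = conn.execute(sql).fetchall()
    # Step 3: hand the rows back to the model to phrase the answer.
    return llm(f"Question: {question}\nSQLResult: {rows}\nAnswer:")


def fake_llm(prompt: str) -> str:
    """Hypothetical stand-in for a Bedrock model; returns canned text."""
    if prompt.startswith("Write one SQL query"):
        return "SELECT AVG(s_acctbal) FROM supplier"
    # Echo the result rows so the flow is visible end to end.
    result = prompt.split("SQLResult: ")[1].split("\nAnswer:")[0]
    return f"The average supplier account balance is {result}."


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE supplier (s_suppkey INTEGER, s_acctbal REAL)")
conn.executemany("INSERT INTO supplier VALUES (?, ?)", [(1, 100.0), (2, 300.0)])

answer = answer_question("What is the average supplier account balance?", conn, fake_llm)
print(answer)
```

The SQL execution sits between two model calls, so the number in the final answer comes from the database rather than from the model.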

---

In [2]:
%pip install --upgrade pip
%pip install -qU redshift_connector sqlalchemy-redshift
# %pip install -qU 'sagemaker>=2.15.0' 'PyAthena==1.10.7' 'SQLAlchemy==1.3.13'

Successfully installed pip-23.3.1
Note: you may need to restart the kernel to use updated packages.


In [3]:
import os
import boto3

In [4]:
my_region = os.environ["AWS_DEFAULT_REGION"]   # E.g. "us-east-1"
os.environ["BEDROCK_ENDPOINT_URL"] = f"https://bedrock-runtime.{my_region}.amazonaws.com"  # E.g. "https://..."

STACK_NAME = 'genai-workshop' # Enter the CloudFormation stack name.

In [5]:
# Read the Redshift Serverless endpoint from the CloudFormation stack outputs.

cf_client = boto3.client('cloudformation')
stacks = cf_client.describe_stacks()["Stacks"]

WORK_GROUP = None
NAME_SPACE = None

for stack in stacks:
    stack_name = stack["StackName"]
    # Match any stack whose name contains 'Redshift', including at position 0.
    if 'Redshift' in stack_name:
        detail = cf_client.describe_stacks(StackName=stack_name)
        # A stack without outputs has no "Outputs" key.
        for output in detail["Stacks"][0].get("Outputs", []):
            key_name = output["OutputKey"]
            if key_name == "ServerlessWorkgroup":
                WORK_GROUP = output["OutputValue"]
            if key_name == "ServerlessNamespace":
                NAME_SPACE = output["OutputValue"]

if WORK_GROUP is None:
    raise RuntimeError("No ServerlessWorkgroup output found in a stack whose name contains 'Redshift'.")

rs_client = boto3.client('redshift-serverless')
response = rs_client.get_workgroup(
    workgroupName=WORK_GROUP
)

redshift_endpoint = response['workgroup']['endpoint']['address']

redshift_endpoint

'workshop-workgroup.023751917724.us-east-1.redshift-serverless.amazonaws.com'
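The output lookup in the cell above can also be written as a small pure function, which makes it easy to try without AWS access. The sketch below is an assumption-laden illustration: `stack_outputs` is a hypothetical helper, and `sample_stacks` is made-up data shaped like the `Stacks` list that `describe_stacks()` returns (the stack name `RedshiftStack` is invented).

```python
from typing import Dict, List


def stack_outputs(stacks: List[dict], name_fragment: str) -> Dict[str, str]:
    """Collect OutputKey -> OutputValue for stacks whose name contains name_fragment."""
    found: Dict[str, str] = {}
    for stack in stacks:
        if name_fragment not in stack["StackName"]:
            continue
        # Stacks without outputs have no "Outputs" key.
        for output in stack.get("Outputs", []):
            found[output["OutputKey"]] = output["OutputValue"]
    return found


# Made-up sample, shaped like describe_stacks()["Stacks"].
sample_stacks = [
    {"StackName": "genai-workshop"},
    {"StackName": "RedshiftStack", "Outputs": [
        {"OutputKey": "ServerlessWorkgroup", "OutputValue": "workshop-workgroup"},
        {"OutputKey": "ServerlessNamespace", "OutputValue": "workshop-namespace"},
    ]},
]

outputs = stack_outputs(sample_stacks, "Redshift")
print(outputs["ServerlessWorkgroup"])
```

Using `in` for the name check avoids the edge case where the fragment sits at the very start of the stack name.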

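# Creating tables and loading data in Redshift Serverless

### This notebook uses fictional data of the kind a typical retail company might hold. (P: primary key)
- region (P: R_REGIONKEY): region information; region name and description
- nation (P: N_NATIONKEY): nation information; nation name, region key, description
- customer (P: C_CUSTKEY): customer information; customer name, address, nation key, phone number, account balance
- orders (P: O_ORDERKEY): order information; order status, total price, order date, priority; order key (distkey), order date (sortkey)
- part (P: P_PARTKEY): product information; product name, manufacturer, brand, type, size, price
- supplier (P: S_SUPPKEY): supplier information; supplier name, address, nation key, phone number, account balance
- lineitem (P: L_ORDERKEY): order line item information; ordered product, supplier, line number, quantity, extended price, discount, tax; order key (distkey), receipt date (sortkey)
- partsupp (P: PS_PARTKEY, PS_SUPPKEY): product-supplier relationship information; available quantity of the product, supply cost, description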

## Important: To apply the Redshift Serverless permissions required for this lab, the IAM role you use must be set as the default role.

#### In the AWS console, go to the Redshift menu. As shown below, click `workshop-namespace` in the list of Redshift Serverless namespaces. Then, on the `Security and encryption` tab, click `Manage IAM Role` to make `SageMakerUserExecutionRole`, the role used in this lab, the default role.

<img src="images/namespace.png" width="800"/>

#### Select `SageMakerUserExecutionRole`, then choose `Make default` under `Set default` at the top to set the role type to Default. Be sure to click `Save changes` afterwards.

<img src="images/make-default.png" width="800"/>

In [6]:
import redshift_connector

conn = redshift_connector.connect(
     host=redshift_endpoint,
     database='dev',
     port=5439,
     user="admin",
     password="Admin1234!"
)

conn.autocommit = True
cursor = conn.cursor()
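The cell above keeps the workshop credentials inline. Outside a workshop you would usually read them from the environment instead. The sketch below shows one way to do that; `redshift_connect_kwargs` is a hypothetical helper, and the variable names `REDSHIFT_USER` and `REDSHIFT_PASSWORD` are assumptions, not something the workshop defines.

```python
import os
from typing import Mapping


def redshift_connect_kwargs(host: str, env: Mapping[str, str] = os.environ) -> dict:
    """Build keyword arguments for redshift_connector.connect() from environment variables."""
    # REDSHIFT_USER / REDSHIFT_PASSWORD are hypothetical variable names.
    try:
        user = env["REDSHIFT_USER"]
        password = env["REDSHIFT_PASSWORD"]
    except KeyError as missing:
        raise RuntimeError(f"Set {missing.args[0]} before connecting") from None
    # Database name and port match the connection cell above.
    return {"host": host, "database": "dev", "port": 5439,
            "user": user, "password": password}


kwargs = redshift_connect_kwargs(
    "example-host", {"REDSHIFT_USER": "admin", "REDSHIFT_PASSWORD": "secret"}
)
print(sorted(kwargs))
```

With the variables set, the connection cell would become `conn = redshift_connector.connect(**redshift_connect_kwargs(redshift_endpoint))`.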

## Creating tables in Redshift Serverless

In [7]:
cursor.execute("DROP TABLE IF EXISTS region")

create_statement = """
CREATE TABLE region (
  R_REGIONKEY bigint NOT NULL,
  R_NAME varchar(25),
  R_COMMENT varchar(152))
diststyle all;
"""
cursor.execute(create_statement)


<redshift_connector.cursor.Cursor at 0x7f74416a3d00>

In [8]:
cursor.execute("DROP TABLE IF EXISTS nation")

create_statement = """
CREATE TABLE nation (
  N_NATIONKEY bigint NOT NULL,
  N_NAME varchar(25),
  N_REGIONKEY bigint,
  N_COMMENT varchar(152))
diststyle all;
"""
cursor.execute(create_statement)

<redshift_connector.cursor.Cursor at 0x7f74416a3d00>

In [9]:
cursor.execute("DROP TABLE IF EXISTS customer")

create_statement = """
create table customer (
  C_CUSTKEY bigint NOT NULL,
  C_NAME varchar(25),
  C_ADDRESS varchar(40),
  C_NATIONKEY bigint,
  C_PHONE varchar(15),
  C_ACCTBAL decimal(18,4),
  C_MKTSEGMENT varchar(10),
  C_COMMENT varchar(117))
diststyle all;
"""
cursor.execute(create_statement)

<redshift_connector.cursor.Cursor at 0x7f74416a3d00>

In [10]:
cursor.execute("DROP TABLE IF EXISTS orders")

create_statement = """
create table orders (
  O_ORDERKEY bigint NOT NULL,
  O_CUSTKEY bigint,
  O_ORDERSTATUS varchar(1),
  O_TOTALPRICE decimal(18,4),
  O_ORDERDATE Date,
  O_ORDERPRIORITY varchar(15),
  O_CLERK varchar(15),
  O_SHIPPRIORITY Integer,
  O_COMMENT varchar(79))
distkey (O_ORDERKEY)
sortkey (O_ORDERDATE);
"""
cursor.execute(create_statement)

<redshift_connector.cursor.Cursor at 0x7f74416a3d00>

In [11]:
cursor.execute("DROP TABLE IF EXISTS part")

create_statement = """
create table part (
  P_PARTKEY bigint NOT NULL,
  P_NAME varchar(55),
  P_MFGR  varchar(25),
  P_BRAND varchar(10),
  P_TYPE varchar(25),
  P_SIZE integer,
  P_CONTAINER varchar(10),
  P_RETAILPRICE decimal(18,4),
  P_COMMENT varchar(23))
diststyle all;
"""
cursor.execute(create_statement)

<redshift_connector.cursor.Cursor at 0x7f74416a3d00>

In [12]:
cursor.execute("DROP TABLE IF EXISTS supplier")

create_statement = """
create table supplier (
  S_SUPPKEY bigint NOT NULL,
  S_NAME varchar(25),
  S_ADDRESS varchar(40),
  S_NATIONKEY bigint,
  S_PHONE varchar(15),
  S_ACCTBAL decimal(18,4),
  S_COMMENT varchar(101))
diststyle all;
"""
cursor.execute(create_statement)

<redshift_connector.cursor.Cursor at 0x7f74416a3d00>

In [13]:
cursor.execute("DROP TABLE IF EXISTS lineitem")

create_statement = """
create table lineitem (
  L_ORDERKEY bigint NOT NULL,
  L_PARTKEY bigint,
  L_SUPPKEY bigint,
  L_LINENUMBER integer NOT NULL,
  L_QUANTITY decimal(18,4),
  L_EXTENDEDPRICE decimal(18,4),
  L_DISCOUNT decimal(18,4),
  L_TAX decimal(18,4),
  L_RETURNFLAG varchar(1),
  L_LINESTATUS varchar(1),
  L_SHIPDATE date,
  L_COMMITDATE date,
  L_RECEIPTDATE date,
  L_SHIPINSTRUCT varchar(25),
  L_SHIPMODE varchar(10),
  L_COMMENT varchar(44))
distkey (L_ORDERKEY)
sortkey (L_RECEIPTDATE);
"""
cursor.execute(create_statement)

<redshift_connector.cursor.Cursor at 0x7f74416a3d00>

In [14]:
cursor.execute("DROP TABLE IF EXISTS partsupp")

create_statement = """
create table partsupp (
  PS_PARTKEY bigint NOT NULL,
  PS_SUPPKEY bigint NOT NULL,
  PS_AVAILQTY integer,
  PS_SUPPLYCOST decimal(18,4),
  PS_COMMENT varchar(199))
diststyle even;
"""
cursor.execute(create_statement)

<redshift_connector.cursor.Cursor at 0x7f74416a3d00>

## Loading data into Redshift Serverless

### Note: To copy data from S3, you must first complete the default IAM role setup for Redshift Serverless described above.

In [15]:
load_data_statement = """
COPY region FROM 's3://redshift-immersionday-labs/data/region/region.tbl.lzo'
iam_role default
region 'us-west-2' lzop delimiter '|' COMPUPDATE PRESET;
"""
cursor.execute(load_data_statement)

<redshift_connector.cursor.Cursor at 0x7f74416a3d00>

In [16]:
load_data_statement = """
COPY nation FROM 's3://redshift-immersionday-labs/data/nation/nation.tbl.'
iam_role default
region 'us-west-2' lzop delimiter '|' COMPUPDATE PRESET;
"""
cursor.execute(load_data_statement)

<redshift_connector.cursor.Cursor at 0x7f74416a3d00>

In [17]:
load_data_statement = """
copy customer from 's3://redshift-immersionday-labs/data/customer/customer.tbl.'
iam_role default
region 'us-west-2' lzop delimiter '|' COMPUPDATE PRESET;
"""
cursor.execute(load_data_statement)

<redshift_connector.cursor.Cursor at 0x7f74416a3d00>

In [18]:
load_data_statement = """
copy orders from 's3://redshift-immersionday-labs/data/orders/orders.tbl.'
iam_role default
region 'us-west-2' lzop delimiter '|' COMPUPDATE PRESET;
"""
cursor.execute(load_data_statement)

<redshift_connector.cursor.Cursor at 0x7f74416a3d00>

In [19]:
load_data_statement = """
copy part from 's3://redshift-immersionday-labs/data/part/part.tbl.'
iam_role default
region 'us-west-2' lzop delimiter '|' COMPUPDATE PRESET;
"""
cursor.execute(load_data_statement)

<redshift_connector.cursor.Cursor at 0x7f74416a3d00>

In [20]:
load_data_statement = """
copy supplier from 's3://redshift-immersionday-labs/data/supplier/supplier.json' manifest
iam_role default
region 'us-west-2' lzop delimiter '|' COMPUPDATE PRESET;
"""
cursor.execute(load_data_statement)

<redshift_connector.cursor.Cursor at 0x7f74416a3d00>

In [21]:
load_data_statement = """
copy lineitem from 's3://redshift-immersionday-labs/data/lineitem-part/'
iam_role default
region 'us-west-2' gzip delimiter '|' COMPUPDATE PRESET;
"""
cursor.execute(load_data_statement)

<redshift_connector.cursor.Cursor at 0x7f74416a3d00>

In [22]:
load_data_statement = """
copy partsupp from 's3://redshift-immersionday-labs/data/partsupp/partsupp.tbl.'
iam_role default
region 'us-west-2' lzop delimiter '|' COMPUPDATE PRESET;
"""
cursor.execute(load_data_statement)

<redshift_connector.cursor.Cursor at 0x7f74416a3d00>
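The eight COPY cells above differ only in table name, S3 path, compression, and whether a manifest is used. As a sketch, the statements could be generated from a small mapping. `copy_statement`, `BUCKET`, and `LOADS` are hypothetical names, and only three of the eight tables are listed to keep the example short.

```python
def copy_statement(table: str, s3_path: str, compression: str = "lzop",
                   manifest: bool = False) -> str:
    """Return a COPY statement in the same shape as the cells above."""
    source = f"'{s3_path}'" + (" manifest" if manifest else "")
    return (
        f"COPY {table} FROM {source}\n"
        "iam_role default\n"
        f"region 'us-west-2' {compression} delimiter '|' COMPUPDATE PRESET;"
    )


# Table name -> (S3 path, compression, manifest), taken from the cells above.
BUCKET = "s3://redshift-immersionday-labs/data"
LOADS = {
    "region": (f"{BUCKET}/region/region.tbl.lzo", "lzop", False),
    "supplier": (f"{BUCKET}/supplier/supplier.json", "lzop", True),
    "lineitem": (f"{BUCKET}/lineitem-part/", "gzip", False),
}

statements = [copy_statement(table, *args) for table, args in LOADS.items()]
print(statements[0])
```

Each statement would then be passed to `cursor.execute(...)` in a loop, exactly as the individual cells do.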

## Verifying the data load in Redshift Serverless

In [23]:
%pip install --quiet ipython-sql
%pip install --quiet --upgrade SQLAlchemy==1.*
%pip install -qU psycopg2-binary

import pandas as pd
import sqlalchemy

username = "admin" 
password = "Admin1234!" 
host = redshift_endpoint 
port = "5439"
mydatabase = "dev"

redshift_uri = f"postgresql+psycopg2://{username}:{password}@{host}:{port}/{mydatabase}"

engine = sqlalchemy.create_engine(redshift_uri)
df = pd.read_sql('select * from orders limit 10', engine)
df.head()

ERROR: pip's dependency resolver does not currently take into account all the packages that are installed. This behaviour is the source of the following dependency conflicts.
sqlalchemy-redshift 0.8.14 requires SQLAlchemy<2.0.0,>=0.9.2, but you have sqlalchemy 2.0.23 which is incompatible.
Note: you may need to restart the kernel to use updated packages.
ERROR: pip's dependency resolver does not currently take into account all the packages that are installed. This behaviour is the source of the following dependency conflicts.
ipython-sql 0.5.0 requires sqlalchemy>=2.0, but you have sqlalchemy 1.4.50 which is incompatible.
Note: you may need to restart the kernel to use updated packages.
Note: you may need to restart the kernel to use updated packages.


Unnamed: 0,o_orderkey,o_custkey,o_orderstatus,o_totalprice,o_orderdate,o_orderpriority,o_clerk,o_shippriority,o_comment
0,28009541,13096240,F,118898.15,1992-01-01,3-MEDIUM,Clerk#000014614,0,"beans. express, furious accounts wake slyly a..."
1,216029091,2951965,F,49717.77,1992-01-01,4-NOT SPECIFIED,Clerk#000094912,0,ounts haggle blithely about the multipliers. e...
2,292068259,4419706,F,79728.74,1992-01-01,5-LOW,Clerk#000071174,0,y regular foxes are. quietly ironic notornis n...
3,232075587,9762419,F,202931.54,1992-01-01,5-LOW,Clerk#000087238,0,eposits. slyly ironic ideas use quickly. final
4,124184099,12030896,F,25796.89,1992-01-01,1-URGENT,Clerk#000002857,0,"es. furiously ironic accounts wake. even, even..."
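Looking at ten rows of `orders` confirms that one table has data. To check all eight tables at once, a row count per table is a quick alternative. The sketch below works with any DB-API cursor, so it runs here against an in-memory SQLite stand-in; `row_counts` is a hypothetical helper, not part of the workshop.

```python
import sqlite3


def row_counts(cursor, tables):
    """Return {table: row count} using any DB-API cursor."""
    counts = {}
    for table in tables:
        # Table names come from a fixed list here, not from user input.
        cursor.execute(f"SELECT COUNT(*) FROM {table}")
        counts[table] = cursor.fetchone()[0]
    return counts


# Stand-in database so the sketch runs without Redshift.
demo = sqlite3.connect(":memory:")
demo.execute("CREATE TABLE region (r_regionkey INTEGER)")
demo.execute("CREATE TABLE nation (n_nationkey INTEGER)")
demo.executemany("INSERT INTO region VALUES (?)", [(0,), (1,)])

counts = row_counts(demo.cursor(), ["region", "nation"])
empty = [table for table, n in counts.items() if n == 0]
print(counts, empty)
```

Against Redshift you would pass the `cursor` created earlier and the eight table names; any table listed in `empty` points to a COPY that did not load.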


# Setting up LangChain SQLDatabase

In [24]:
%pip install -qU langchain langchain-experimental

Note: you may need to restart the kernel to use updated packages.


In [35]:
import os
import sys
import boto3

from langchain.llms.bedrock import Bedrock

BEDROCK_ENDPOINT_URL = os.environ["BEDROCK_ENDPOINT_URL"] 

session = boto3.Session(
    profile_name=os.environ.get("AWS_PROFILE")
) 

bedrock = session.client(
    service_name='bedrock-runtime', 
    region_name=os.environ.get("AWS_DEFAULT_REGION"),
    endpoint_url=BEDROCK_ENDPOINT_URL
) 

# - create the Anthropic Model
llm = Bedrock(model_id="anthropic.claude-v2", client=bedrock, model_kwargs={'max_tokens_to_sample':1000, 'temperature':0})

In [26]:
from langchain import SQLDatabase
from langchain_experimental.sql import SQLDatabaseChain

db = SQLDatabase.from_uri(redshift_uri)

In [27]:
print(db.table_info)


CREATE TABLE customer (
	c_custkey BIGINT NOT NULL, 
	c_name VARCHAR(25), 
	c_address VARCHAR(40), 
	c_nationkey BIGINT, 
	c_phone VARCHAR(15), 
	c_acctbal NUMERIC(18, 4), 
	c_mktsegment VARCHAR(10), 
	c_comment VARCHAR(117)
)

/*
3 rows from customer table:
c_custkey	c_name	c_address	c_nationkey	c_phone	c_acctbal	c_mktsegment	c_comment
2600001	Customer#002600001	IDBPyh28e 428wU7rltzJg1AHUPaJIX	18	28-537-886-4232	-71.7600	AUTOMOBILE	s are. regular, regular deposits detect 
2600002	Customer#002600002	GkRZAQNIGrOMY6QsJt9TFC	4	14-843-677-7959	1801.0100	BUILDING	y special realms cajole carefully. slyly special deposits cajole furiousl
2600003	Customer#002600003	xzIw0SH7rePUj3PT	17	27-680-483-8411	6454.3600	BUILDING	ht haggle finally alongside of the requests. slyly bold requests according to the quickly regul
*/


CREATE TABLE lineitem (
	l_orderkey BIGINT NOT NULL, 
	l_partkey BIGINT, 
	l_suppkey BIGINT, 
	l_linenumber INTEGER NOT NULL, 
	l_quantity NUMERIC(18, 4), 
	l_extendedprice NUME

## Creating a chain that writes and runs a SQL query from the user's question, guided by the prompt

In [28]:
PROMPT = """ 
Human: Given an input question, create a syntactically correct postgresql query to run,
then look at the results of the query execution and return only the answer.  
The query: {question}

Assistant:"""

db_chain = SQLDatabaseChain.from_llm(llm=llm, db=db, verbose=True) # verbose=True prints the executed SQL in detail.

question = "공급업체의 평균 계정 잔액은 어떻게 돼?"  # "What is the average account balance of suppliers?"

db_chain.run(PROMPT.format(question=question))

> Entering new SQLDatabaseChain chain...

H: Given an input question, create a syntactically correct postgresql query to run,
then look at the results of the query execution and return only the answer.
The query: 공급업체의 평균 계정 잔액은 어떻게 돼?

A:
SQLQuery:SELECT AVG(s_acctbal)
FROM supplier;
SQLResult: [(Decimal('4500.0568'),)]
Answer:공급업체의 평균 계정 잔액은 4500.0568 돼.
> Finished chain.


'공급업체의 평균 계정 잔액은 4500.0568 돼.'

#### Checking the SQL statement without running the query

In [36]:
from langchain.chains import create_sql_query_chain

chain = create_sql_query_chain(llm, db)
# Question: "What is the average account balance of suppliers?"
response = chain.invoke({"question":"공급업체의 평균 계정 잔액은 어떻게 돼?"})

print(response)

공급업체의 평균 계정 잔액을 구하기 위해 다음과 같은 SQL 쿼리를 사용할 수 있습니다:

```sql
SELECT AVG("s_acctbal") AS "avg_acctbal"
FROM supplier
```

이 쿼리는 supplier 테이블에서 s_acctbal 컬럼의 평균값을 계산합니다.

SQLQuery:
SELECT AVG("s_acctbal") AS "avg_acctbal"
FROM supplier
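The response above mixes explanatory prose, a fenced SQL block, and a trailing `SQLQuery:` section, so it cannot be sent to the database as-is. The sketch below shows one way to pull out just the statement. `extract_sql` is a hypothetical helper and its rules are an assumption based on this single response; other models or prompts may format their output differently.

```python
import re

FENCE = "`" * 3  # built at runtime so this example contains no literal fence


def extract_sql(response: str) -> str:
    """Pull the SQL statement out of a model response that mixes prose and SQL."""
    # Prefer whatever follows the last "SQLQuery:" marker.
    if "SQLQuery:" in response:
        return response.rsplit("SQLQuery:", 1)[1].strip()
    # Otherwise fall back to the first fenced sql block.
    fenced = re.search(FENCE + r"sql\s*(.*?)" + FENCE, response, re.DOTALL)
    if fenced:
        return fenced.group(1).strip()
    # No marker and no fence: assume the whole response is SQL.
    return response.strip()


sample = (
    "You can use the following query:\n\n"
    f"{FENCE}sql\nSELECT 1\n{FENCE}\n\n"
    'SQLQuery:\nSELECT AVG("s_acctbal") AS "avg_acctbal"\nFROM supplier\n'
)
sql = extract_sql(sample)
print(sql)
```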


#### Running the SQL statement against the database

In [37]:
sql_statement = "SELECT AVG(s_acctbal) AS avg_acct_balance FROM supplier"
df = pd.read_sql(sql_statement, engine)
df.head()

Unnamed: 0,avg_acct_balance
0,4500.0568
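When the SQL being executed comes from a model rather than from a person, it can be worth checking that it only reads data before running it. The sketch below is a deliberately conservative filter, not a complete SQL parser: `is_read_only` is a hypothetical helper, it also rejects harmless queries that happen to contain a write keyword inside a string literal, and it is no substitute for connecting with a database user that has read-only permissions.

```python
import re

# Keywords that change data or schema, or move data in and out of the database.
FORBIDDEN = re.compile(
    r"\b(insert|update|delete|drop|alter|create|truncate|grant|revoke|copy|unload)\b",
    re.IGNORECASE,
)


def is_read_only(sql: str) -> bool:
    """Conservative check: one statement, starts with SELECT or WITH, no write keywords."""
    statement = sql.strip().rstrip(";").strip()
    if ";" in statement:  # more than one statement
        return False
    if not re.match(r"(?i)(select|with)\b", statement):
        return False
    return FORBIDDEN.search(statement) is None


print(is_read_only("SELECT AVG(s_acctbal) AS avg_acct_balance FROM supplier"))
print(is_read_only("SELECT 1; DROP TABLE supplier"))
```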


In [38]:
question = "어떤 국가의 고객이 가장 많이 주문했나요?"  # "Customers from which country placed the most orders?"

db_chain.run(PROMPT.format(question=question))

> Entering new SQLDatabaseChain chain...

H: Given an input question, create a syntactically correct postgresql query to run,
then look at the results of the query execution and return only the answer.
The query: 어떤 국가의 고객이 가장 많이 주문했나요?

A:
SQLQuery:SELECT n.n_name as nation, count(o.o_orderkey) as order_count
FROM nation n
JOIN customer c ON n.n_nationkey = c.c_nationkey
JOIN orders o ON c.c_custkey = o.o_custkey
GROUP BY n.n_name
ORDER BY order_count DESC
LIMIT 1
SQLResult: [('INDONESIA', 3049795)]
Answer:인도네시아
> Finished chain.


'인도네시아'

In [39]:
question = "주문 수가 가장 많은 국가는 어디인가요?"  # "Which country has the most orders?"

db_chain.run(PROMPT.format(question=question))

> Entering new SQLDatabaseChain chain...

H: Given an input question, create a syntactically correct postgresql query to run,
then look at the results of the query execution and return only the answer.
The query: 주문 수가 가장 많은 국가는 어디인가요?

A:
SQLQuery:SELECT n.n_name
FROM nation n JOIN customer c ON n.n_nationkey = c.c_nationkey
             JOIN orders o ON c.c_custkey = o.o_custkey
GROUP BY n.n_name
ORDER BY COUNT(*) DESC
LIMIT 1;
SQLResult: [('INDONESIA',)]
Answer:인도네시아
> Finished chain.


'인도네시아'

In [40]:
question = "고객의 평균 계정 잔액은 어떻게 되나요?"  # "What is the average customer account balance?"

db_chain.run(PROMPT.format(question=question))

> Entering new SQLDatabaseChain chain...

H: Given an input question, create a syntactically correct postgresql query to run,
then look at the results of the query execution and return only the answer.
The query: 고객의 평균 계정 잔액은 어떻게 되나요?

A:
SQLQuery:SELECT AVG(c_acctbal) AS avg_acct_bal
FROM customer;
SQLResult: [(Decimal('4498.5519'),)]
Answer:고객의 평균 계정 잔액은 4498.5519 입니다.
> Finished chain.


'고객의 평균 계정 잔액은 4498.5519 입니다.'

In [41]:
question = "어떤 고객이 가장 많은 주문을 했나요?"  # "Which customer placed the most orders?"

db_chain.run(PROMPT.format(question=question))

> Entering new SQLDatabaseChain chain...

H: Given an input question, create a syntactically correct postgresql query to run,
then look at the results of the query execution and return only the answer.
The query: 어떤 고객이 가장 많은 주문을 했나요?

A:
SQLQuery:SELECT c_name
FROM customer c JOIN orders o ON c.c_custkey = o.o_custkey
GROUP BY c_name
ORDER BY COUNT(*) DESC
LIMIT 1
SQLResult: [('Customer#012340771',)]
Answer:Customer#012340771
> Finished chain.


'Customer#012340771'

In [42]:
sql_statement = "select * from customer  where c_name like 'Customer#009100001'"
df = pd.read_sql(sql_statement, engine)
df.head()

Unnamed: 0,c_custkey,c_name,c_address,c_nationkey,c_phone,c_acctbal,c_mktsegment,c_comment
0,9100001,Customer#009100001,"IIGY3g9IyWC,nYmA60Df0T",0,10-520-278-2536,8143.07,HOUSEHOLD,"to the express requests wake regular, f"


In [43]:
query_statement = """
SELECT c_name 
FROM customer JOIN orders ON o_custkey = c_custkey
GROUP BY c_name
ORDER BY COUNT(*) DESC
LIMIT 1
"""
df = pd.read_sql(query_statement, engine)
df.head()

Unnamed: 0,c_name
0,Customer#012340771


In [44]:
question = "마케팅 세그먼트별 고객이 주문하는 양은 어떻게 다른가요? "  # "How does the quantity customers order differ by marketing segment?"

db_chain.run(PROMPT.format(question=question))

> Entering new SQLDatabaseChain chain...

H: Given an input question, create a syntactically correct postgresql query to run,
then look at the results of the query execution and return only the answer.
The query: 마케팅 세그먼트별 고객이 주문하는 양은 어떻게 다른가요?

A:
SQLQuery:SELECT c_mktsegment, AVG(l_quantity) as avg_quantity
FROM customer c JOIN orders o ON c.c_custkey = o.o_custkey
JOIN lineitem l ON o.o_orderkey = l.l_orderkey
GROUP BY c_mktsegment
ORDER BY avg_quantity DESC;
SQLResult: [('BUILDING', Decimal('25.5031')), ('AUTOMOBILE', Decimal('25.4996')), ('FURNITURE', Decimal('25.4990')), ('HOUSEHOLD', Decimal('25.4986')), ('MACHINERY', Decimal('25.4983'))]
Answer:BUILDING 마케팅 세그먼트의 고객이 주문하는 양이 가장 많습니다.
> Finished chain.


'BUILDING 마케팅 세그먼트의 고객이 주문하는 양이 가장 많습니다.'

In [45]:
question = "마케팅 세그먼트별 고객이 주문하는 양을 알려주세요"  # "Tell me the quantity customers order by marketing segment"

db_chain.run(PROMPT.format(question=question))

> Entering new SQLDatabaseChain chain...

H: Given an input question, create a syntactically correct postgresql query to run,
then look at the results of the query execution and return only the answer.
The query: 마케팅 세그먼트별 고객이 주문하는 양을 알려주세요

A:
SQLQuery:SELECT c_mktsegment, SUM(l_quantity) AS total_ordered
FROM customer c
JOIN orders o ON c.c_custkey = o.o_custkey
JOIN lineitem l ON o.o_orderkey = l.l_orderkey
GROUP BY c_mktsegment
ORDER BY total_ordered DESC;
SQLResult: [('FURNITURE', Decimal('1546275558.0000')), ('AUTOMOBILE', Decimal('1545790383.0000')), ('HOUSEHOLD', Decimal('1545155899.0000')), ('BUILDING', Decimal('1544821487.0000')), ('MACHINERY', Decimal('1544594058.0000'))]
Answer:마케팅 세그먼트별로 주문량이 가장 많은 순서는 FURNITURE, AUTOMOBILE, HOUSEHOLD, BUILDING, MACHINERY 입니다.
> Finished chain.


'마케팅 세그먼트별로 주문량이 가장 많은 순서는 FURNITURE, AUTOMOBILE, HOUSEHOLD, BUILDING, MACHINERY 입니다.'

In [47]:
question = "주문 상태별 주문 수와 총 매출은 어떻게 되나요?"  # "What are the order count and total revenue by order status?"

db_chain.run(PROMPT.format(question=question))

> Entering new SQLDatabaseChain chain...

H: Given an input question, create a syntactically correct postgresql query to run,
then look at the results of the query execution and return only the answer.
The query: 주문 상태별 주문 수와 총 매출은 어떻게 되나요?

A:
SQLQuery:SELECT o_orderstatus, COUNT(*) as order_count, SUM(o_totalprice) as total_price
FROM orders
GROUP BY o_orderstatus
ORDER BY o_orderstatus;
SQLResult: [('F', 37024122, Decimal('5562126040598.7000')), ('O', 37030175, Decimal('5563129791682.3000')), ('P', 1945703, Decimal('359572088111.3400'))]
Answer:주문 상태 F의 주문 수는 37024122개,  총 매출은 5562126040598.7000 입니다.
주문 상태 O의 주문 수는 37030175개, 총 매출은 5563129791682.3000 입니다.
주문 상태 P의 주문 수는 1945703개,  총 매출은 359572088111.3400 입니다.
> Finished chain.


'주문 상태 F의 주문 수는 37024122개,  총 매출은 5562126040598.7000 입니다.\n주문 상태 O의 주문 수는 37030175개, 총 매출은 5563129791682.3000 입니다.\n주문 상태 P의 주문 수는 1945703개,  총 매출은 359572088111.3400 입니다.'

In [48]:
query_statement = """
SELECT o_orderstatus, COUNT(*) AS order_count, SUM(o_totalprice) AS total_price
FROM orders 
GROUP BY o_orderstatus;
"""
df = pd.read_sql(query_statement, engine)
df.head()

Unnamed: 0,o_orderstatus,order_count,total_price
0,O,37030175,5563130000000.0
1,P,1945703,359572100000.0
2,F,37024122,5562126000000.0


In [49]:
question = "주문 금액이 가장 큰 주문 금액과 주문 정보 알려주세요"  # "Tell me the largest order amount and the details of that order"

db_chain.run(PROMPT.format(question=question))

> Entering new SQLDatabaseChain chain...

H: Given an input question, create a syntactically correct postgresql query to run,
then look at the results of the query execution and return only the answer.
The query: 주문 금액이 가장 큰 주문 금액과 주문 정보 알려주세요

A:
SQLQuery:SELECT o_totalprice, o_orderkey, o_orderdate, o_orderpriority
FROM orders
ORDER BY o_totalprice DESC
LIMIT 1
SQLResult: [(Decimal('571417.4800'), 84927619, datetime.date(1997, 6, 29), '1-URGENT')]
Answer:주문 금액이 가장 큰 주문은 571417.4800입니다. 해당 주문의 주문번호(o_orderkey)는 84927619이고, 주문일자(o_orderdate)는 1997-06-29이며, 주문우선순위(o_orderpriority)는 1-URGENT입니다.
> Finished chain.


'주문 금액이 가장 큰 주문은 571417.4800입니다. 해당 주문의 주문번호(o_orderkey)는 84927619이고, 주문일자(o_orderdate)는 1997-06-29이며, 주문우선순위(o_orderpriority)는 1-URGENT입니다.'

In [50]:
query_statement = """
SELECT o_orderkey, o_totalprice, o_orderdate, o_orderpriority 
FROM orders
ORDER BY o_totalprice DESC
LIMIT 1;
"""
df = pd.read_sql(query_statement, engine)
df.head()

Unnamed: 0,o_orderkey,o_totalprice,o_orderdate,o_orderpriority
0,84927619,571417.48,1997-06-29,1-URGENT


In [51]:
question = "주문일자에 따른 주문량 및 매출의 월별 추이는 어떻게 되나요?"  # "What is the monthly trend of order volume and revenue by order date?"

db_chain.run(PROMPT.format(question=question))

> Entering new SQLDatabaseChain chain...

H: Given an input question, create a syntactically correct postgresql query to run,
then look at the results of the query execution and return only the answer.
The query: 주문일자에 따른 주문량 및 매출의 월별 추이는 어떻게 되나요?

A:
SQLQuery:SELECT
    date_trunc('month', o_orderdate) AS order_month,
    COUNT(o_orderkey) AS order_count,
    SUM(o_totalprice) AS total_price
FROM orders
GROUP BY order_month
ORDER BY order_month;
SQLResult: [(datetime.datetime(1992, 1, 1, 0, 0), 979560, Decimal('148129874897.0900')), (datetime.datetime(1992, 2, 1, 0, 0), 915335, Decimal('138422683095.2500')), (datetime.datetime(1992, 3, 1, 0, 0), 976463, Decimal('147508636866.2100')), (datetime.datetime(1992, 4, 1, 0, 0), 947568, Decimal('143143147554.5200')), (datetime.datetime(1992, 5, 1, 0, 0), 978514, Decimal('147885626940.1800')), (datetime.datetime(1992, 6, 1, 0, 0), 947846, Decimal('143258491318.7200')), (datetime.datetime(

'1992년 1월에는 979560건, 14.8 억달러의 매출이 발생했고,\n1993년 2월에는 885683건, 13.3 억달러의 매출이 나왔습니다. \n매출과 주문 건수가 가장 많았던 시기는 1992년 1월이고, 가장 적었던 시기는 1998년 8월입니다.'

In [52]:
query_statement = """
SELECT
  DATE_TRUNC('month', o_orderdate) AS month,
  SUM(l_quantity) AS total_qty,
  SUM(l_extendedprice * (1 - l_discount)) AS total_amount
FROM orders o
JOIN lineitem l ON o.o_orderkey = l.l_orderkey  
GROUP BY 1
ORDER BY 1;
"""
df = pd.read_sql(query_statement, engine)
df.head(100)

Unnamed: 0,month,total_qty,total_amount
0,1992-01-01,99663205.0,1.419620e+11
1,1992-02-01,93107547.0,1.326408e+11
2,1992-03-01,99238813.0,1.413739e+11
3,1992-04-01,96293391.0,1.371863e+11
4,1992-05-01,99497085.0,1.417110e+11
...,...,...,...
75,1998-04-01,96304953.0,1.371860e+11
76,1998-05-01,99702770.0,1.420418e+11
77,1998-06-01,96278050.0,1.371642e+11
78,1998-07-01,99558670.0,1.418191e+11


In [53]:
# Question: "What is the monthly trend of order volume and revenue by order date?"
db_chain.run(PROMPT.format(question="주문일자에 따른 주문량 및 매출의 월별 추이는 어떻게 되나요?"))

> Entering new SQLDatabaseChain chain...

H: Given an input question, create a syntactically correct postgresql query to run,
then look at the results of the query execution and return only the answer.
The query: 주문일자에 따른 주문량 및 매출의 월별 추이는 어떻게 되나요?

A:
SQLQuery:SELECT
    EXTRACT(YEAR FROM o_orderdate) AS year,
    EXTRACT(MONTH FROM o_orderdate) AS month,
    COUNT(o_orderkey) AS order_count,
    SUM(o_totalprice) AS total_sales
FROM orders
GROUP BY year, month
ORDER BY year, month
SQLResult: [(1992, 1, 979560, Decimal('148129874897.0900')), (1992, 2, 915335, Decimal('138422683095.2500')), (1992, 3, 976463, Decimal('147508636866.2100')), (1992, 4, 947568, Decimal('143143147554.5200')), (1992, 5, 978514, Decimal('147885626940.1800')), (1992, 6, 947846, Decimal('143258491318.7200')), (1992, 7, 980138, Decimal('148087324674.0900')), (1992, 8, 979530, Decimal('147873822059.1900')), (1992, 9, 946657, Decimal('142945418990.7100')), (199

'1992년 1월 ~ 1998년 12월까지 월별 주문 수와 매출은 꾸준히 증가하고 있습니다. 주문 수는 약 90만 건~98만 건, 매출은 약 1,300억원~1,480억원 범위 내에서 변동하고 있습니다.'

In [54]:
query_statement = """
SELECT EXTRACT(MONTH FROM o_orderdate) AS order_month, 
       SUM(l_extendedprice * (1 - l_discount)) AS revenue,
       COUNT(o_orderkey) AS order_count
FROM orders o
JOIN lineitem l ON o.o_orderkey = l.l_orderkey
GROUP BY order_month
ORDER BY order_month;
"""
df = pd.read_sql(query_statement, engine)
df.head(12)

Unnamed: 0,order_month,revenue,order_count
0,1,993316100000.0,27344530
1,2,906070300000.0,24944177
2,3,992228700000.0,27314034
3,4,960726200000.0,26450594
4,5,992752100000.0,27330612
5,6,961147400000.0,26460707
6,7,992810000000.0,27330157
7,8,860404600000.0,23687868
8,9,822655800000.0,22650852
9,10,850698400000.0,23415496
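The two verification queries above group differently: `DATE_TRUNC('month', ...)` keeps each calendar month separate, while `EXTRACT(MONTH FROM ...)` alone pools the same month from every year, which is why the second result has at most twelve rows. The sketch below reproduces the difference on three made-up orders. It runs on SQLite, which has no `DATE_TRUNC`, so `strftime` stands in for both functions as an assumption of this example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (o_orderkey INTEGER, o_orderdate TEXT)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?)",
    [(1, "1992-01-15"), (2, "1992-02-03"), (3, "1993-01-20")],
)

# Year and month together: one row per calendar month (a time series).
by_year_month = conn.execute(
    "SELECT strftime('%Y-%m', o_orderdate) AS ym, COUNT(*) "
    "FROM orders GROUP BY ym ORDER BY ym"
).fetchall()

# Month number only: January 1992 and January 1993 fall into the same row.
by_month_only = conn.execute(
    "SELECT strftime('%m', o_orderdate) AS m, COUNT(*) "
    "FROM orders GROUP BY m ORDER BY m"
).fetchall()

print(by_year_month)
print(by_month_only)
```

For a month-by-month trend over several years, the year-and-month grouping is the one to compare against the chain's answer.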


In [55]:
# Question: "Which order priority is the most common?"
db_chain.run(PROMPT.format(question="어떤 주문 우선 순위가 가장 일반적인가요?"))

> Entering new SQLDatabaseChain chain...

H: Given an input question, create a syntactically correct postgresql query to run,
then look at the results of the query execution and return only the answer.
The query: 어떤 주문 우선 순위가 가장 일반적인가요?

A:
SQLQuery:SELECT o_orderpriority, COUNT(*) AS order_count
FROM orders
GROUP BY o_orderpriority
ORDER BY order_count DESC
LIMIT 1
SQLResult: [('4-NOT SPECIFIED', 15202181)]
Answer:'4-NOT SPECIFIED'
> Finished chain.


"'4-NOT SPECIFIED'"

In [56]:
# Question: "Which clerk has the most orders?"
db_chain.run(PROMPT.format(question="주문 수가 가장 많은 점원은 어떻게 되나요?"))

> Entering new SQLDatabaseChain chain...

H: Given an input question, create a syntactically correct postgresql query to run,
then look at the results of the query execution and return only the answer.
The query: 주문 수가 가장 많은 점원은 어떻게 되나요?

A:
SQLQuery:SELECT o_clerk
FROM orders
GROUP BY o_clerk
ORDER BY COUNT(*) DESC
LIMIT 1
SQLResult: [('Clerk#000049229',)]
Answer:Clerk#000049229
> Finished chain.


'Clerk#000049229'

In [57]:
# Question: "Which product category was ordered the most?"
db_chain.run(PROMPT.format(question="어떤 제품 카테고리가 가장 많이 주문되었나요?"))

> Entering new SQLDatabaseChain chain...

H: Given an input question, create a syntactically correct postgresql query to run,
then look at the results of the query execution and return only the answer.
The query: 어떤 제품 카테고리가 가장 많이 주문되었나요?

A:
SQLQuery:SELECT p_type, COUNT(*) AS num_orders
FROM lineitem l JOIN part p ON l.l_partkey = p.p_partkey
GROUP BY p_type
ORDER BY num_orders DESC
LIMIT 1;
SQLResult: [('SMALL ANODIZED TIN', 2035649)]
Answer:SMALL ANODIZED TIN
> Finished chain.


'SMALL ANODIZED TIN'

In [58]:
# Question: "What is the most expensive product, and in which order was it sold?"
db_chain.run(PROMPT.format(question="가장 비싼 제품은 무엇이며, 어떤 주문에서 판매되었나요? "))

> Entering new SQLDatabaseChain chain...

H: Given an input question, create a syntactically correct postgresql query to run,
then look at the results of the query execution and return only the answer.
The query: 가장 비싼 제품은 무엇이며, 어떤 주문에서 판매되었나요?

A:
SQLQuery:SELECT l.l_orderkey, p.p_name, p.p_retailprice
FROM part p
JOIN lineitem l ON p.p_partkey = l.l_partkey
ORDER BY p.p_retailprice DESC
LIMIT 1
SQLResult: [(35754244, 'goldenrod drab brown salmon mint', Decimal('2098.9900'))]
Answer:The most expensive product is goldenrod drab brown salmon mint. It was sold in order 35754244.
> Finished chain.


'The most expensive product is goldenrod drab brown salmon mint. It was sold in order 35754244.'

In [59]:
# Question: "How does order volume differ by product size?"
db_chain.run(PROMPT.format(question="제품 크기에 따른 주문량은 어떻게 다른가요?"))

> Entering new SQLDatabaseChain chain...

H: Given an input question, create a syntactically correct postgresql query to run,
then look at the results of the query execution and return only the answer.
The query: 제품 크기에 따른 주문량은 어떻게 다른가요?

A:
SQLQuery:SELECT p_size, COUNT(*) AS num_orders
FROM part p JOIN lineitem l ON p.p_partkey = l.l_partkey
GROUP BY p_size
ORDER BY p_size;
SQLResult: [(1, 6062773), (2, 6043286), (3, 6065119), (4, 6061562), (5, 6063176), (6, 6049574), (7, 6069786), (8, 6070453), (9, 6072133), (10, 6068272), (11, 6063010), (12, 6057022), (13, 6072878), (14, 6055116), (15, 6059402), (16, 6061363), (17, 6048716), (18, 6050287), (19, 6071033), (20, 6063818), (21, 6045115), (22, 6070893), (23, 6073164), (24, 6063351), (25, 6065164), (26, 6049634), (27, 6060175), (28, 6064904), (29, 6066592), (30, 6042047), (31, 6057695), (32, 6052386), (33, 6059987), (34, 6054078), (35, 6075823), (36, 6059881), (37, 6050385), (38, 60

'제품 크기가 작을수록 주문량이 더 많습니다. 제품 크기 1인 제품의 주문량이 가장 많고, 크기가 커질수록 주문량이 감소하는 추세입니다.'
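
The generated answer above can be checked against the rows the query returned. The sketch below uses only the 37 `(p_size, num_orders)` pairs that are visible before the output is cut off, so it says nothing about the remaining sizes. Within those rows, size 35 has the highest count and size 30 the lowest, the two differ by less than 1%, and size 1 is not the maximum, so the "smaller sizes are ordered more" summary is not supported by the visible data. Note also that `COUNT(*)` here counts line items per size rather than summing ordered quantities.

```python
# (p_size, num_orders) pairs copied from the visible part of the SQLResult above.
# The original output is truncated after size 37, so later sizes are not included.
rows = [
    (1, 6062773), (2, 6043286), (3, 6065119), (4, 6061562), (5, 6063176),
    (6, 6049574), (7, 6069786), (8, 6070453), (9, 6072133), (10, 6068272),
    (11, 6063010), (12, 6057022), (13, 6072878), (14, 6055116), (15, 6059402),
    (16, 6061363), (17, 6048716), (18, 6050287), (19, 6071033), (20, 6063818),
    (21, 6045115), (22, 6070893), (23, 6073164), (24, 6063351), (25, 6065164),
    (26, 6049634), (27, 6060175), (28, 6064904), (29, 6066592), (30, 6042047),
    (31, 6057695), (32, 6052386), (33, 6059987), (34, 6054078), (35, 6075823),
    (36, 6059881), (37, 6050385),
]

# Find the sizes with the most and the fewest line items among the visible rows.
largest = max(rows, key=lambda r: r[1])
smallest = min(rows, key=lambda r: r[1])

# Relative gap between the extremes; a small value means the counts are nearly flat.
spread = (largest[1] - smallest[1]) / smallest[1]

print(f"most line items:   size {largest[0]} ({largest[1]})")
print(f"fewest line items: size {smallest[0]} ({smallest[1]})")
print(f"relative spread:   {spread:.2%}")
```

A check like this is cheap to run whenever the chain summarizes a long result set, because the model only sees the result as text and may describe a trend that is not there.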

In [60]:
db_chain.run(PROMPT.format(question="가장 많이 재주문된 제품은?"))



> Entering new SQLDatabaseChain chain...

H: Given an input question, create a syntactically correct postgresql query to run,
then look at the results of the query execution and return only the answer.
The query: 가장 많이 재주문된 제품은?

A:
SQLQuery: SELECT p_name
FROM part
INNER JOIN lineitem ON p_partkey = l_partkey
WHERE l_returnflag = 'R'
GROUP BY p_name
ORDER BY COUNT(*) DESC
LIMIT 1
SQLResult: [('burnished navajo smoke dim maroon',)]
Answer: burnished navajo smoke dim maroon
> Finished chain.


'burnished navajo smoke dim maroon'

In [61]:
db_chain.run(PROMPT.format(question="어떤 공급업체가 가장 많은 제품을 공급하였나요?"))



> Entering new SQLDatabaseChain chain...

H: Given an input question, create a syntactically correct postgresql query to run,
then look at the results of the query execution and return only the answer.
The query: 어떤 공급업체가 가장 많은 제품을 공급하였나요?

A:
SQLQuery: SELECT s.s_name, COUNT(ps.ps_partkey) AS num_parts
FROM partsupp ps JOIN supplier s ON ps.ps_suppkey = s.s_suppkey
GROUP BY s.s_name
ORDER BY num_parts DESC
LIMIT 1;
SQLResult: [('Supplier#000802140', 160)]
Answer: 'Supplier#000802140'이 가장 많은 제품을 공급하였습니다.
> Finished chain.


"'Supplier#000802140'이 가장 많은 제품을 공급하였습니다."

In [62]:
db_chain.run(PROMPT.format(question="공급업체의 평균 계정 잔액은 어떻게 되나요?"))



> Entering new SQLDatabaseChain chain...

H: Given an input question, create a syntactically correct postgresql query to run,
then look at the results of the query execution and return only the answer.
The query: 공급업체의 평균 계정 잔액은 어떻게 되나요?

A:
SQLQuery: SELECT AVG(s_acctbal) AS "Average Account Balance"
FROM supplier;
SQLResult: [(Decimal('4500.0568'),)]
Answer: 공급업체의 평균 계정 잔액은 4500.0568입니다.
> Finished chain.


'공급업체의 평균 계정 잔액은 4500.0568입니다.'

In [63]:
db_chain.run(PROMPT.format(question="공급업체의 평균 공급 비용은 어떻게 되나요?"))



> Entering new SQLDatabaseChain chain...

H: Given an input question, create a syntactically correct postgresql query to run,
then look at the results of the query execution and return only the answer.
The query: 공급업체의 평균 공급 비용은 어떻게 되나요?

A:
SQLQuery: SELECT AVG(ps_supplycost) AS avg_supplycost
FROM partsupp;
SQLResult: [(Decimal('500.4743'),)]
Answer: 평균 공급 비용은 500.4743입니다.
> Finished chain.


'평균 공급 비용은 500.4743입니다.'

In [64]:
db_chain.run(PROMPT.format(question="어떤 주문 라인 항목이 가장 높은 할인을 받았나요?"))



> Entering new SQLDatabaseChain chain...

H: Given an input question, create a syntactically correct postgresql query to run,
then look at the results of the query execution and return only the answer.
The query: 어떤 주문 라인 항목이 가장 높은 할인을 받았나요?

A:
SQLQuery: SELECT MAX(l_discount) AS max_discount
FROM lineitem;
SQLResult: [(Decimal('0.1000'),)]
Answer: 가장 높은 할인율은 0.1 (10%) 입니다.
> Finished chain.


'가장 높은 할인율은 0.1 (10%) 입니다.'
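
The question above asks which line items received the highest discount, while the generated query returns only the maximum discount value. A query that answers the question as asked would filter on that maximum and return the identifying columns. The sketch below illustrates the shape of such a query against a small in-memory SQLite table, not against the Redshift data. The sample rows are made up for the illustration, and `l_linenumber` is assumed from the TPC-H layout because it does not appear in the notebook output.

```python
import sqlite3

# Toy stand-in for the lineitem table (illustrative rows, not the notebook's data).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE lineitem (l_orderkey INTEGER, l_linenumber INTEGER, l_discount REAL)"
)
conn.executemany(
    "INSERT INTO lineitem VALUES (?, ?, ?)",
    [(1, 1, 0.04), (1, 2, 0.10), (2, 1, 0.10), (3, 1, 0.00)],
)

# Return the line items whose discount equals the maximum, not just the maximum itself.
# LIMIT keeps the result small, since many rows can share the top discount.
QUERY = """
SELECT l_orderkey, l_linenumber, l_discount
FROM lineitem
WHERE l_discount = (SELECT MAX(l_discount) FROM lineitem)
ORDER BY l_orderkey, l_linenumber
LIMIT 10
"""

top_items = conn.execute(QUERY).fetchall()
print(top_items)
conn.close()
```

On the full dataset many line items are likely to share the top discount, so a `LIMIT` or an aggregate count is worth keeping in the query to avoid sending a very long result back to the LLM.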

In [65]:
db_chain.run(PROMPT.format(question="주문 라인 항목의 평균 수량은 어떻게 되나요?"))



> Entering new SQLDatabaseChain chain...

H: Given an input question, create a syntactically correct postgresql query to run,
then look at the results of the query execution and return only the answer.
The query: 주문 라인 항목의 평균 수량은 어떻게 되나요?

A:
SQLQuery: SELECT AVG(l_quantity) AS avg_quantity
FROM lineitem;
SQLResult: [(Decimal('25.4997'),)]
Answer: 주문 라인 항목의 평균 수량은 25.4997입니다.
> Finished chain.


'주문 라인 항목의 평균 수량은 25.4997입니다.'

In [66]:
db_chain.run(PROMPT.format(question="어떤 배송 방법이 가장 많이 사용되었나요?"))



> Entering new SQLDatabaseChain chain...

H: Given an input question, create a syntactically correct postgresql query to run,
then look at the results of the query execution and return only the answer.
The query: 어떤 배송 방법이 가장 많이 사용되었나요?

A:
SQLQuery: SELECT l_shipmode, COUNT(*) AS num_shipments
FROM lineitem
GROUP BY l_shipmode
ORDER BY num_shipments DESC
LIMIT 1
SQLResult: [('RAIL', 43294267)]
Answer: RAIL
> Finished chain.


'RAIL'

In [67]:
PROMPT = """ 
Human: Given an input question, create a syntactically correct postgresql query to run,
then look at the results of the query and return the detailed answer in Korean.  
The query: {question}

Assistant:"""

In [68]:
db_chain.run(PROMPT.format(question="어떤 배송 방법이 가장 많이 사용되었나요?"))



> Entering new SQLDatabaseChain chain...

H: Given an input question, create a syntactically correct postgresql query to run,
then look at the results of the query and return the detailed answer in Korean.
The query: 어떤 배송 방법이 가장 많이 사용되었나요?

A:
SQLQuery: SELECT l_shipmode, COUNT(*) AS count
FROM lineitem
GROUP BY l_shipmode
ORDER BY count DESC
LIMIT 1;
SQLResult: [('RAIL', 43294267)]
Answer: 레일 (RAIL) 배송 방법이 가장 많이 사용되었습니다.
> Finished chain.


'레일 (RAIL) 배송 방법이 가장 많이 사용되었습니다.'

In [69]:
PROMPT = """ 
Human: Given an input question, create a syntactically correct postgresql query to run,
then look at the results of the query and return the detailed answer as much as possible based on the result.  
The query: {question}

Assistant:"""

db_chain.run(PROMPT.format(question="어떤 배송 방법이 가장 많이 사용되었나요?"))



> Entering new SQLDatabaseChain chain...

H: Given an input question, create a syntactically correct postgresql query to run,
then look at the results of the query and return the detailed answer as much as possible based on the result.
The query: 어떤 배송 방법이 가장 많이 사용되었나요?

A:
SQLQuery: SELECT l_shipmode, COUNT(*) AS shipmode_count
FROM lineitem
GROUP BY l_shipmode
ORDER BY shipmode_count DESC
LIMIT 1
SQLResult: [('RAIL', 43294267)]
Answer: 가장 많이 사용된 배송 방법은 'RAIL' 입니다. 'RAIL'로 43294267건의 배송이 이루어졌습니다.
> Finished chain.


"가장 많이 사용된 배송 방법은 'RAIL' 입니다. 'RAIL'로 43294267건의 배송이 이루어졌습니다."