# <a id='toc1_'></a>[OpenAI Self-Study Notes](#toc0_)

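**Table of Contents**<a id='toc0_'></a>    
- [OpenAI Self-Study Notes](#toc1_)    
  - [Building a Simple OpenAI Bot](#toc1_1_)    
    - [Creating an OpenAI Conversation](#toc1_1_1_)    
    - [Building a Simple Chat Program](#toc1_1_2_)    
    - [Remembering Conversation History](#toc1_1_3_)    
    - [Adding Web Search](#toc1_1_4_)    
  - [Letting AI Compute Technical Indicators and Visualize Data](#toc1_2_)    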

<!-- vscode-jupyter-toc-config
	numbering=false
	anchor=true
	flat=false
	minLevel=1
	maxLevel=6
	/vscode-jupyter-toc-config -->
<!-- THIS CELL WILL BE REPLACED ON TOC UPDATE. DO NOT WRITE YOUR TEXT IN THIS CELL -->

## <a id='toc1_1_'></a>[Building a Simple OpenAI Bot](#toc0_)

### <a id='toc1_1_1_'></a>[Creating an OpenAI Conversation](#toc0_)

In [5]:
from openai import OpenAI, OpenAIError
import getpass
import os

file_path = os.path.join(os.getcwd(), "Data", "openai_token")
with open(file_path, 'r') as file:
    api_key = file.read().strip()  # strip surrounding whitespace
#api_key = getpass.getpass("請輸入金鑰：")
client = OpenAI(api_key=api_key)

reply = client.chat.completions.create(
    model = "gpt-3.5-turbo",
    messages = [
        {"role":"system", "content":"你是隻住在外太空的猴子"},
        {"role":"user", "content":"你住的地方很亮嗎？reply in 繁體中文"}
    ]
)

print(reply.choices[0].message.content)
print(f"本次的 token 用量為：{reply.usage.total_tokens}")

是的，我住在外太空的地方非常亮，因為我可以看到許多星星和行星在夜空中閃耀。有時候甚至可以看到美麗的星雲和星團。這種光線讓我感覺非常舒服和安祥。
本次的 token 用量為：144
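
The cell above reads the key from a file, with an interactive `getpass` prompt left commented out. As a minimal sketch of one more option, the helper below checks an environment variable first and falls back to the token file. The function name `load_api_key` and its parameters are hypothetical and are not part of the original notebook.

```python
import os
from pathlib import Path


def load_api_key(path, env_var="OPENAI_API_KEY"):
    """Return the API key from the environment, else from a text file."""
    key = os.environ.get(env_var, "").strip()
    if key:
        return key
    # Fall back to the token file; strip the trailing newline editors add.
    return Path(path).read_text(encoding="utf-8").strip()
```

With this helper, the client could be created as `OpenAI(api_key=load_api_key(file_path))`, which keeps the key out of the notebook itself.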


### <a id='toc1_1_2_'></a>[Building a Simple Chat Program](#toc0_)

In [6]:
def get_reply(messages):
    try:
        response = client.chat.completions.create(
            model= "gpt-3.5-turbo",
            messages= messages
        )
        reply = response.choices[0].message.content
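    except OpenAIError as err:
        # OpenAIError is the base class and may lack .type/.message,
        # so report the exception class name and its string form instead.
        reply = f"發生 {type(err).__name__} 錯誤\n{err}"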
    return reply

### <a id='toc1_1_3_'></a>[Remembering Conversation History](#toc0_)

In [7]:
history = []
backtrace = 2

def chat(sys_msg, user_msg):
    history.append({"role":"user", "content":user_msg})
    reply = get_reply(history + [{"role":"system", "content":sys_msg}])

    while len(history) >= 2 * backtrace:
        history.pop(0)  # drop the oldest entry
    history.append({"role":"assistant", "content":reply})
    return reply
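
The manual `pop(0)` loop keeps only the most recent messages. A sketch of the same sliding-window idea using `collections.deque` with `maxlen` is shown below. The helper name `remember` is hypothetical, and this is an alternative formulation, not the notebook's own implementation.

```python
from collections import deque

backtrace = 2
# Each round adds one user message and one assistant message.
history = deque(maxlen=2 * backtrace)


def remember(user_msg, reply):
    """Store one round; the deque drops the oldest entries on overflow."""
    history.append({"role": "user", "content": user_msg})
    history.append({"role": "assistant", "content": reply})


for i in range(5):
    remember(f"question {i}", f"answer {i}")

print(len(history))           # 4
print(history[0]["content"])  # question 3
```

When passing the window to the API, it would need to be converted with `list(history)`, since the message list is expected to be a plain list.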

In [10]:
sys_msg = input("你希望 AI 扮演：")
if not sys_msg.strip():
    sys_msg = "小助理"
print(f"你希望 AI 扮演：{sys_msg}\n")
while True:
    msg = input("你：")
    if not msg.strip():
        break
    reply = chat(sys_msg, msg)
    print(f"你：{msg}")
    print(f"{sys_msg}:{reply}\n")

history=[]

你希望 AI 扮演：地理大師

你：台灣在哪裡
地理大師:台灣位於亞洲東部的西北太平洋中，靠近菲律賓、中國大陸、日本和韓國。台灣的正式名稱是中華民國，首都是台北。

你：面積有多少
地理大師:台灣的總面積約為3.6萬平方公里。



### <a id='toc1_1_4_'></a>[Adding Web Search](#toc0_)

In [24]:
from googlesearch import search

history = []
backtrace = 5

def chat_web(sys_msg, user_msg, is_search = True):
    web_res = []
    if is_search:
        content = "以下為已發生的事實：\n"
        for res in search(user_msg, advanced=True, num_results=5, lang='zh_TW'):
            content += f"標題:{res.title}\n摘要:{res.description}\n\n"
        content += "請依照上述事實回答問題 \n"
        web_res = [{"role":"user", "content": content}]
    web_res.append({"role":"user", "content":user_msg})

    while len(history) >= 2 * backtrace:
        history.pop(0)
    
    reply_full = ""

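    # get_reply returns the whole reply as one string, so this loop
    # yields it one character at a time rather than in streamed chunks.
    for reply in get_reply(
        history         # conversation history
        + web_res       # search results and the current message
        + [{"role":"system", "content": sys_msg}]
    ):
        reply_full += reply     # accumulate what has been received so far
        yield reply     # hand the current piece back to the caller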
    history.append({"role":"user", "content":user_msg})

    while len(history) >= 2 * backtrace:
        history.pop(0)
    history.append({"role":"assistant", "content":reply_full})
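
`chat_web` is written as a generator, but `get_reply` returns a complete string, so nothing is actually streamed. The sketch below shows how a streaming variant might look using the `stream=True` option of the chat completions API. The function name `stream_reply` is hypothetical, the client is passed in as a parameter, and error handling is omitted for brevity.

```python
def stream_reply(client, messages, model="gpt-3.5-turbo"):
    """Yield reply fragments as the API streams them."""
    stream = client.chat.completions.create(
        model=model,
        messages=messages,
        stream=True,  # ask the API for incremental chunks
    )
    for chunk in stream:
        if not chunk.choices:
            continue
        piece = chunk.choices[0].delta.content
        if piece:  # role-only and final chunks carry no text
            yield piece
```

Replacing the `get_reply(...)` call inside `chat_web` with `stream_reply(client, ...)` would let the caller print fragments as they arrive.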

In [28]:
sys_msg = "小助理"

while True:
    msg = input("你：")
    if not msg.strip():
        break
    print(f"你：{msg}")
    print(f"{sys_msg}: ", end="")
    for reply in chat_web(sys_msg, msg, is_search=True):
        print(reply, end="")
    print('\n')
history = []

你：2023 年的 NBA 冠軍是誰？ reply in 繁體中文
小助理: 2023年的NBA冠軍是丹佛金塊。



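In testing, the program above correctly answers questions about events that happened after GPT's training-data cutoff.

However, the search step **does not carry over conversation memory**: only the current message is used as the search query. The planned fix is to have ChatGPT summarize the key terms of the conversation, run the search on those terms, and then feed the results back into ChatGPT.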
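
A minimal sketch of that planned fix is shown below, assuming the model is asked to rewrite the latest question as a standalone query before searching. The function name `build_search_query`, the `summarize` parameter, and the prompt wording are all hypothetical. `summarize` stands for any callable that takes a message list and returns the model's reply as a string, such as `get_reply`.

```python
def build_search_query(history, user_msg, summarize):
    """Turn the conversation so far into a standalone search query.

    `summarize` is a hypothetical callable that sends a message list to
    the model and returns its reply as a string (for example, get_reply).
    """
    if not history:
        return user_msg  # nothing to resolve against, search as typed
    prompt = (
        "Rewrite the last user question as a short, standalone web search "
        "query. Reply with the query only."
    )
    messages = list(history) + [
        {"role": "user", "content": user_msg},
        {"role": "system", "content": prompt},
    ]
    query = summarize(messages).strip()
    return query or user_msg  # fall back if the model returns nothing
```

Inside `chat_web`, the result could replace `user_msg` in the `search(...)` call, so that a follow-up question such as "面積有多少" is searched together with the topic it refers to.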

## <a id='toc1_2_'></a>[Letting AI Compute Technical Indicators and Visualize Data](#toc0_)

### Letting AI Derive Indicator Formulas Automatically

In [29]:
from openai import OpenAI, OpenAIError
import yfinance as yf
import pandas as pd
import datetime as dt

stock_id = "2330.tw"    # TSMC
end = dt.date.today()   # end date of the data
start = end - dt.timedelta(days=180)    # start date of the data
df = yf.download(stock_id, start=start, end=end).reset_index()

print(df)

[*********************100%%**********************]  1 of 1 completed

          Date   Open   High    Low  Close   Adj Close    Volume
0   2023-09-21  530.0  531.0  526.0  527.0  521.823120  32012694
1   2023-09-22  523.0  525.0  522.0  522.0  516.872253  29049309
2   2023-09-25  522.0  529.0  522.0  525.0  519.842834  17116402
3   2023-09-26  521.0  524.0  519.0  519.0  513.901733  26392692
4   2023-09-27  517.0  523.0  516.0  522.0  516.872253  16846401
..         ...    ...    ...    ...    ...         ...       ...
111 2024-03-12  757.0  771.0  754.0  770.0  766.420959  58110339
112 2024-03-13  785.0  785.0  777.0  779.0  775.379150  36754557
113 2024-03-14  779.0  785.0  770.0  784.0  780.355896  42010806
114 2024-03-15  771.0  777.0  753.0  753.0  749.500000  73316437
115 2024-03-18  754.0  765.0  754.0  764.0  764.000000  43589856

[116 rows x 7 columns]





In [31]:
client = OpenAI(api_key=api_key)

def ai_helper(df, user_msg):
    msg = [{
        "role": "system",
        "content": 
        f"As a professional code generation robot, \n\
        I require your assistance in generating Python code \n\
        based on specific user requirements. To proceed, \n\
        I will provide you with a dataframe (df) that follows the \n\
        format {df.columns}. Your task is to carefully analyze the\n\
        user's requirements and generate python code \n\
        accordingly. Please note that your response should solely \n\
        consist of the code itself,\n\
        and no additional information should be included."
    }, {
        "role": "user",
        "content":
        f"The user requirement:{user_msg} \n\
        Your task is to create a function named 'calculate(df)' \n\
        that takes a dataframe as input. The function should process \n\
        the dataframe and return only the processed dataframe. \n\
        Please ensure that your response includes the Python code \n\
        for the 'calculate(df)' function \n\
        and does not include any other content."
    }]
    reply_data = get_reply(msg)
    return reply_data

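### Letting AI Generate Technical Indicators

#### Moving Average (MA)

The **moving average (MA)** is one of the most common technical indicators. It averages the **closing price** over a fixed window, and the window length (5 days for the weekly line, 20 days for the monthly line, 60 days for the quarterly line) determines whether it reflects the short-, medium-, or long-term price trend.

The **exponential moving average (EMA)** is a weighted average that gives more weight to recent prices.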
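
To make the difference concrete, here is a small pure-Python sketch of both averages. The function names `sma` and `ema` are illustrative. The logic is intended to mirror pandas' `rolling(window).mean()` (with `None` in place of `NaN`) and `ewm(span=..., adjust=False).mean()`, which is the form of code the model generates in this section.

```python
def sma(values, window):
    """Simple moving average; None until a full window is available."""
    out = []
    for i in range(len(values)):
        if i + 1 < window:
            out.append(None)
        else:
            out.append(sum(values[i + 1 - window:i + 1]) / window)
    return out


def ema(values, span):
    """Exponential moving average with alpha = 2 / (span + 1)."""
    alpha = 2 / (span + 1)
    out = []
    for x in values:
        # The first value seeds the average; later values are blended in.
        out.append(x if not out else alpha * x + (1 - alpha) * out[-1])
    return out


closes = [10.0, 11.0, 12.0, 13.0]
print(sma(closes, 3))  # [None, None, 11.0, 12.0]
print(ema(closes, 3))  # [10.0, 10.5, 11.25, 12.125]
```

On a steadily rising series the EMA ends slightly above the SMA, because the newest prices carry the largest weights.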

In [43]:
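# Placeholder definition so that linters do not flag `calculate` as undefined
def calculate(df):
    pass

code_str = ai_helper(df, "計算 8 日 MA 與 13 日 MA")
print(code_str)
# Execute the AI-generated code, which redefines calculate()
exec(code_str)
new_df = calculate(df)
# To be safe, new_df should probably be validated here
new_df.tail()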

def calculate(df):
    df['8_MA'] = df['Close'].rolling(window=8).mean()
    df['13_MA'] = df['Close'].rolling(window=13).mean()
    
    return df


| | Date | Open | High | Low | Close | Adj Close | Volume | 8_MA | 13_MA | 12_EMA | 26_EMA | MACD | Signal_Line | MACD Histogram |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 111 | 2024-03-12 | 757.0 | 771.0 | 754.0 | 770.0 | 766.420959 | 58110339 | 744.875 | 725.692308 | 734.594487 | 698.635763 | 35.958724 | 30.55782 | 5.400903 |
| 112 | 2024-03-13 | 785.0 | 785.0 | 777.0 | 779.0 | 775.37915 | 36754557 | 756.125 | 732.384615 | 741.426104 | 704.58867 | 36.837435 | 31.813743 | 5.023691 |
| 113 | 2024-03-14 | 779.0 | 785.0 | 770.0 | 784.0 | 780.355896 | 42010806 | 763.5 | 739.076923 | 747.975934 | 710.47099 | 37.504944 | 32.951983 | 4.552961 |
| 114 | 2024-03-15 | 771.0 | 777.0 | 753.0 | 753.0 | 749.5 | 73316437 | 766.375 | 743.307692 | 748.748868 | 713.621287 | 35.12758 | 33.387103 | 1.740477 |
| 115 | 2024-03-18 | 754.0 | 765.0 | 754.0 | 764.0 | 764.0 | 43589856 | 770.0 | 748.384615 | 751.095196 | 717.353044 | 33.742152 | 33.458113 | 0.284039 |
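
Passing the model's reply straight to `exec` assumes it is bare Python. Replies are sometimes wrapped in a Markdown code fence, and nothing guarantees that `calculate` is actually defined. The sketch below is one possible guard. The helper names `extract_code` and `load_calculate` are hypothetical, and the code still executes untrusted output, so it only reduces accidental failures.

```python
import re

FENCE = "`" * 3  # a Markdown code fence, built here to avoid a literal one
FENCED = re.compile(FENCE + r"(?:python)?\s*\n(.*?)" + FENCE, re.DOTALL)


def extract_code(reply):
    """Return the code inside a Markdown fence, or the reply unchanged."""
    match = FENCED.search(reply)
    return match.group(1) if match else reply


def load_calculate(reply):
    """Run the generated code in its own namespace and return `calculate`."""
    namespace = {}
    exec(extract_code(reply), namespace)  # still runs untrusted code
    func = namespace.get("calculate")
    if not callable(func):
        raise ValueError("generated code does not define calculate(df)")
    return func
```

A call such as `new_df = load_calculate(code_str)(df)` could then be followed by a check that the expected columns (for example `8_MA` and `13_MA`) are present in `new_df.columns`.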


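#### Moving Average Convergence Divergence (MACD)

**MACD** is built on EMAs, so it puts more weight on recent price movement. In technical analysis, the MACD histogram is used to spot possible trend reversals: the points where the histogram turns **from negative to positive** or **from positive to negative** are commonly treated as entry and exit points.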
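
As an illustration of reading those turning points programmatically, the sketch below scans a list of histogram values for sign changes. The function name `histogram_crossovers` and the direction labels are hypothetical. Treating a value of exactly zero as the crossing point is an assumption made here.

```python
def histogram_crossovers(hist):
    """Return (index, direction) for each sign change in the histogram."""
    signals = []
    for i in range(1, len(hist)):
        prev, cur = hist[i - 1], hist[i]
        if prev < 0 <= cur:
            signals.append((i, "negative_to_positive"))
        elif prev > 0 >= cur:
            signals.append((i, "positive_to_negative"))
    return signals


hist = [-1.2, -0.4, 0.3, 0.8, -0.1]
print(histogram_crossovers(hist))
# [(2, 'negative_to_positive'), (4, 'positive_to_negative')]
```

With a DataFrame, the input could be taken from the `MACD Histogram` column via `.tolist()`.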

In [36]:
code_str = ai_helper(df, "計算 MACD, 欄位名稱用 'MACD Histogram' 命名")
print(code_str)
exec(code_str)
new_df = calculate(df)
new_df.tail()

def calculate(df):
    df['12_EMA'] = df['Close'].ewm(span=12, adjust=False).mean()
    df['26_EMA'] = df['Close'].ewm(span=26, adjust=False).mean()
    df['MACD'] = df['12_EMA'] - df['26_EMA']
    df['Signal_Line'] = df['MACD'].ewm(span=9, adjust=False).mean()
    df['MACD Histogram'] = df['MACD'] - df['Signal_Line']
    
    return df


| | Date | Open | High | Low | Close | Adj Close | Volume | 8_MA | 13_MA | 12_EMA | 26_EMA | MACD | Signal_Line | MACD Histogram |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 111 | 2024-03-12 | 757.0 | 771.0 | 754.0 | 770.0 | 766.420959 | 58110339 | 744.875 | 725.692308 | 734.594487 | 698.635763 | 35.958724 | 30.55782 | 5.400903 |
| 112 | 2024-03-13 | 785.0 | 785.0 | 777.0 | 779.0 | 775.37915 | 36754557 | 756.125 | 732.384615 | 741.426104 | 704.58867 | 36.837435 | 31.813743 | 5.023691 |
| 113 | 2024-03-14 | 779.0 | 785.0 | 770.0 | 784.0 | 780.355896 | 42010806 | 763.5 | 739.076923 | 747.975934 | 710.47099 | 37.504944 | 32.951983 | 4.552961 |
| 114 | 2024-03-15 | 771.0 | 777.0 | 753.0 | 753.0 | 749.5 | 73316437 | 766.375 | 743.307692 | 748.748868 | 713.621287 | 35.12758 | 33.387103 | 1.740477 |
| 115 | 2024-03-18 | 754.0 | 765.0 | 754.0 | 764.0 | 764.0 | 43589856 | 770.0 | 748.384615 | 751.095196 | 717.353044 | 33.742152 | 33.458113 | 0.284039 |


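#### Relative Strength Index (RSI)

The **RSI** is typically used to judge whether the market is **overbought** or **oversold**; it ranges from 0 to 100. An RSI above roughly 70 to 80 suggests the asset may be overbought, while an RSI below roughly 20 to 30 suggests it may be oversold. The RSI can therefore be used to check for possible reversal signals.

$RSI = \displaystyle 100 \times \frac{\text{average gain}}{\text{average gain} + |\text{average loss}|}$, computed over a chosen look-back window (for example, 14 days).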
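
On the 0 to 100 scale, this ratio can be rewritten in terms of relative strength, which is the form `100 - (100 / (1 + rs))` seen in the generated code. The symbols $G$ (average gain) and $L$ (magnitude of the average loss) are introduced here only for illustration:

$$RS = \frac{G}{L}, \qquad RSI = 100 \times \frac{G}{G + L} = 100 \times \frac{RS}{1 + RS} = 100 - \frac{100}{1 + RS}$$

For example, with $G = 3$ and $L = 1$, $RS = 3$ and $RSI = 100 - 100/4 = 75$.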

In [44]:
code_str = ai_helper(df, "計算 RSI 指標")
print(code_str)
exec(code_str)
new_df = calculate(df)
new_df.tail()

def calculate(df):
    delta = df['Close'].diff()
    gain = delta.where(delta > 0, 0)
    loss = -delta.where(delta < 0, 0)

    avg_gain = gain.rolling(window=14, min_periods=1).mean()
    avg_loss = loss.rolling(window=14, min_periods=1).mean()

    rs = avg_gain / avg_loss
    df['RSI'] = 100 - (100 / (1 + rs))
    
    return df


| | Date | Open | High | Low | Close | Adj Close | Volume | 8_MA | 13_MA | 12_EMA | 26_EMA | MACD | Signal_Line | MACD Histogram | RSI |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 111 | 2024-03-12 | 757.0 | 771.0 | 754.0 | 770.0 | 766.420959 | 58110339 | 744.875 | 725.692308 | 734.594487 | 698.635763 | 35.958724 | 30.55782 | 5.400903 | 77.852349 |
| 112 | 2024-03-13 | 785.0 | 785.0 | 777.0 | 779.0 | 775.37915 | 36754557 | 756.125 | 732.384615 | 741.426104 | 704.58867 | 36.837435 | 31.813743 | 5.023691 | 82.236842 |
| 113 | 2024-03-14 | 779.0 | 785.0 | 770.0 | 784.0 | 780.355896 | 42010806 | 763.5 | 739.076923 | 747.975934 | 710.47099 | 37.504944 | 32.951983 | 4.552961 | 81.506849 |
| 114 | 2024-03-15 | 771.0 | 777.0 | 753.0 | 753.0 | 749.5 | 73316437 | 766.375 | 743.307692 | 748.748868 | 713.621287 | 35.12758 | 33.387103 | 1.740477 | 66.27907 |
| 115 | 2024-03-18 | 754.0 | 765.0 | 754.0 | 764.0 | 764.0 | 43589856 | 770.0 | 748.384615 | 751.095196 | 717.353044 | 33.742152 | 33.458113 | 0.284039 | 68.131868 |
