In [3]:
%cd /workspace
%pwd

/workspace


'/workspace'

In [21]:
import torch
from transformers import AutoTokenizer, TextStreamer
from unsloth import FastLanguageModel

# Select the device
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

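# Load the tokenizer (FastLanguageModel.from_pretrained below returns its own
# tokenizer, which replaces this one)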
model_name = "deepseek-ai/deepseek-math-7b-base"
tokenizer = AutoTokenizer.from_pretrained(model_name)
tokenizer.pad_token = tokenizer.eos_token

# Load the base model (before fine-tuning)
base_model, tokenizer = FastLanguageModel.from_pretrained(
    model_name=model_name,
    dtype=None,
    load_in_4bit=True
)

# Load the fine-tuned model
fine_tuned_model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="./deepseek_model_weight/deep_v1_instruction",  # path to your fine-tuned model
    dtype=None,
    load_in_4bit=True
)

# Put both models into inference mode
FastLanguageModel.for_inference(base_model)
FastLanguageModel.for_inference(fine_tuned_model)

text_streamer = TextStreamer(tokenizer)

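def generate_answer(model, question):
    """Stream the generation to stdout and return the decoded text (prompt included)."""
    # The tokenizer returns input_ids plus attention_mask, so call the result `inputs`.
    inputs = tokenizer(question, return_tensors="pt").to(device)
    generated_ids = model.generate(**inputs, max_new_tokens=1000, streamer=text_streamer)
    generated_text = tokenizer.batch_decode(generated_ids, skip_special_tokens=True)[0]
    return generated_text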



==((====))==  Unsloth 2025.2.15: Fast Llama patching. Transformers: 4.49.0.
   \\   /|    GPU: NVIDIA GeForce RTX 4090. Max memory: 23.546 GB. Platform: Linux.
O^O/ \_/ \    Torch: 2.6.0+cu124. CUDA: 8.9. CUDA Toolkit: 12.4. Triton: 3.2.0
\        /    Bfloat16 = TRUE. FA [Xformers = 0.0.29.post3. FA2 = False]
 "-____-"     Free Apache license: http://github.com/unslothai/unsloth
Unsloth: Fast downloading is enabled - ignore downloading bars which are red colored!


Loading checkpoint shards: 100%|██████████| 2/2 [00:01<00:00,  1.22it/s]


deepseek-ai/deepseek-math-7b-base does not have a padding token! Will use pad_token = <|PAD_TOKEN|>.
==((====))==  Unsloth 2025.2.15: Fast Llama patching. Transformers: 4.49.0.
   \\   /|    GPU: NVIDIA GeForce RTX 4090. Max memory: 23.546 GB. Platform: Linux.
O^O/ \_/ \    Torch: 2.6.0+cu124. CUDA: 8.9. CUDA Toolkit: 12.4. Triton: 3.2.0
\        /    Bfloat16 = TRUE. FA [Xformers = 0.0.29.post3. FA2 = False]
 "-____-"     Free Apache license: http://github.com/unslothai/unsloth
Unsloth: Fast downloading is enabled - ignore downloading bars which are red colored!


Loading checkpoint shards: 100%|██████████| 2/2 [00:01<00:00,  1.16it/s]


deepseek-ai/deepseek-math-7b-base does not have a padding token! Will use pad_token = <|PAD_TOKEN|>.


In [24]:
def generate_answer(model, question):
    """Same as above, redefined with a shorter generation budget (700 new tokens)."""
    inputs = tokenizer(question, return_tensors="pt").to(device)
    generated_ids = model.generate(**inputs, max_new_tokens=700, streamer=text_streamer)
    generated_text = tokenizer.batch_decode(generated_ids, skip_special_tokens=True)[0]
    return generated_text
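
For decoder-only models, `model.generate` returns the prompt tokens followed by the newly generated tokens, so `generate_answer` returns the prompt echoed back together with the completion. Below is a minimal sketch of slicing off the prompt, assuming the token IDs are available as plain Python lists. `completion_token_ids` is a hypothetical helper, not part of transformers or unsloth.

```python
def completion_token_ids(generated_ids, prompt_length):
    """Drop the first `prompt_length` tokens from each generated sequence."""
    return [sequence[prompt_length:] for sequence in generated_ids]


# Toy example with made-up token IDs: a 3-token prompt followed by 2 new tokens.
print(completion_token_ids([[101, 7, 8, 42, 43]], prompt_length=3))  # [[42, 43]]
```

Inside `generate_answer`, the same idea would mean slicing `generated_ids` at the prompt length before calling `batch_decode`, so that only the model's own continuation is returned.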

In [27]:
prompt = """
    You are a Semantic and Context Analysis Expert skilled in evaluating the relevance of texts.
    I will provide you with a question and three passages (Passage A, Passage B and Passage C).
    Your task is to assess the relevance of each passage to the given question and assign a relevance score between 0 and 10 (10 meaning completely relevant, 0 meaning entirely irrelevant).

    Please return the result in the following format:

    Passage A: [score]
    Passage B: [score]
    Passage C: [score]

    Now, evaluate the following content:

    Question: 王小明在讀碩期間投上了CVPR頂級會議，他可以記大功嗎?

    Passage A: 五、學生有下列情形之一者，予以記小功。

                （一）服行公勤，成績特優，堪為表率者。

                （二）參加校外各種正規比賽，成績特優(前三名)，為校爭光者。

                （三）代表學校參加各種競賽與活動，表現優異，堪為嘉許者。

                （四）熱心助人，扶助同學，有事實證明者。

                （五）熱心公益，增進團體福利，具有確切事實者。

                （六）急公好義，見義勇為，符合公益具有事實者。

                （七）有其他相當於上列各款情事者。

    Passage B: 六、學生有下列情形之一者，予以記大功或其他獎勵(獎章、獎牌、獎狀、獎金)等。

                （一）擔任公勤或社團工作，成績特優，且對樹立校風有特殊貢獻者。

                （二）在校期間，創造發明，或發表甚具價值之學術論文，有特殊貢獻者。

                （三）冒險犯難，捨己救人，堪為他人矜式，有益國家社會者。

                （四）具有傑出表現，有益於國家社會，具確切事實者。

                （五）參加全國性正規比賽，榮獲第一名，爭取校譽者。

                （六）有其他相當於上列各款情事者。

    Passage C: 七、學生有下列情形之一者，予以記申誡。

                （一）擔任公勤幹部或其他工作，無故規避服務。

                （二）故意毀壞或污損公物，情節輕微者。

                （三）應經許可之公物未經許可而擅自使用，或逾期未還公物，情節輕微者。

                （四）違反學生宿舍管理規則，情節輕微者。

                （五）對他人有性騷擾之行為，情節輕微者。

                （六）有其他相當於上列各款情事者。

    Base your evaluation on semantic meaning, contextual alignment, and level of detail matching the question while maintaining consistency.
    Note: Also provide a brief explanation of why you assigned each relevance score.


"""

prompt2 = """
    You are a Semantic and Context Analysis Expert skilled in evaluating the relevance of texts.
    I will provide you with a question and three passages (Passage A, Passage B and Passage C).
    Your task is to assess the relevance of each passage to the given question and assign a relevance score between 0 and 10 (10 meaning completely relevant, 0 meaning entirely irrelevant).

    Please return the result in the following format:

    Passage A: [score]
    Passage B: [score]
    Passage C: [score]

    Now, evaluate the following content:

    Question: 廣州在何年開始啟用天然氣插電式混合動力公共汽車?

    Passage A: 2010年引進的廣州快速公交運輸系統，屬世界第二大快速公交系統，日常載客量可達100萬人次，高峰時期每小時單向客流高達26900人次，僅次於波哥大的快速交通系統，
               平均每10秒鐘就有一輛巴士，每輛巴士單向行駛350小時。
               包括橋樑在內的站台是世界最長的州快速公交運輸系統站台，長達260米。
               目前廣州市區的計程車和公共汽車主要使用液化石油氣作燃料，部分公共汽車更使用油電、氣電混合動力技術。
               2012年底開始投放液化天然氣燃料的公共汽車，2014年6月開始投放液化天然氣插電式混合動力公共汽車，以取代液化石油氣公共汽車。
               2007年1月16日，廣州市政府全面禁止在市區內駕駛摩托車。

    Passage B: 違反禁令的機動車將會予以沒收。
               廣州市交通局聲稱禁令的施行，使得交通擁擠問題和車禍大幅減少。
               廣州白雲國際機場位於白雲區與花都區交界，2004年8月5日正式投入運營，屬中國交通情況第二繁忙的機場。
               該機場取代了原先位於市中心的無法滿足日益增長航空需求的舊機場。
               目前機場有三條飛機跑道，成為國內第三個擁有三跑道的民航機場。
               比鄰近的香港國際機場第三跑道預計的2023年落成早8年。
    
    Passage C: 廣州是京廣鐵路、廣深鐵路、廣茂鐵路、廣梅汕鐵路的終點站。
               2009年末，武廣客運專線投入運營，多單元列車覆蓋980公里的路程，最高時速可達350公里/小時。
               2011年1月7日，廣珠城際鐵路投入運營，平均時速可達200公里/小時。
               廣州鐵路、長途汽車和渡輪直達香港，廣九直通車從廣州東站開出，直達香港九龍紅磡站，總長度約182公里，車程在兩小時內。
               繁忙的長途汽車每年會從城市中的不同載客點把旅客接載至香港。
               在珠江靠市中心的北航道有渡輪線路，用於近江居民直接渡江而無需乘坐公交或步行過橋。
               南沙碼頭和蓮花山碼頭間每天都有高速雙體船往返，渡輪也開往香港中國客運碼頭和港澳碼頭。

    Base your evaluation on semantic meaning, contextual alignment, and level of detail matching the question while maintaining consistency.
    Note: Also provide a brief explanation of why you assigned each relevance score.
"""

prompt3 = """
    You are a Semantic and Context Analysis Expert skilled in evaluating the relevance of texts.
    I will provide you with a question and three passages (Passage A, Passage B and Passage C).
    Your task is to assess the relevance of each passage to the given question and assign a relevance score between 0 and 10 (10 meaning completely relevant, 0 meaning entirely irrelevant).

    Please return the result in the following format:

    Passage A: [score]
    Passage B: [score]
    Passage C: [score]

    Now, evaluate the following content:

    Question: 古典音樂最突出的特徵是什麼

    Passage A: 古典音樂是指根植於西方音樂傳統的藝術音樂，包含了從大約11世紀直至當代的廣大時間範圍。
                這一藝術傳統的中堅時期通常被劃定為從1550年到1900年的這段時間，又稱共曉時期。
                西方音樂記譜法除了為演奏者預設了一段音樂的音高、速度、拍子、節奏，其特別之處在於對細節的處理也進行了描述，例如即興裝飾音與力度變化，
                然而這些即興裝飾音通常是為了尚未精通即興演奏藝術的初學者而寫的，另一方面，力度變化在很多情況下沒有標示，需要仰賴演奏者對音樂的內在了解去做詮釋。
                與其他樂種一樣，古典音樂在器樂方面有相當高度的發展，這與以歌曲為主的流行音樂形頗為不同。
                「古典音樂」一詞最早見於19早期初期，最初是用於凸顯從巴赫到貝多芬的這一段黃金時期。
                牛津英語詞典中「古典音樂」一詞最早出現在1836年。

    Passage B: 由於古典音樂包含了極其眾多的形式、風格、流派和歷史時期，因此很難列舉出適用於全部古典音樂作品的普適特徵。
                然而，仍然可以找到一些其他音樂類型所不具備的，只有古典音樂所獨有的特徵。
                古典音樂最突出的特徵就是作品通常是以記譜法表記，形成一份樂譜。
                樂譜的突出作用是確定節奏、音高的細節，以及在涉及多名表演者時不同聲部間的協調。
                優質的表記方法不僅使得作品得以留存，更使得作品能夠達到相當複雜的程度，但複雜的內容也沒有必要完全受限於譜。
                例如巴赫的賦格，驚為天人地將大膽鮮明的多條旋律線條以對位法編制在一起，同時又不失密切關聯的和聲邏輯，卻也有即興演出的即興賦格。
    
    Passage C: 古典音樂中使用的多數樂器都是在19世紀中葉之前發明，在18、19世紀納入樂隊編制的。
                他們包括管弦樂團和管樂團中的各類樂器，還包括一些獨奏樂器，如鋼琴，大鍵琴和管風琴。
                交響管弦樂團是最為人熟知的古典音樂媒介，它包含了弦樂器組，木管樂器組，銅管樂器組和打擊樂器組。
                管樂團是另一類演奏古典音樂的團體，它包含了木管樂器組，銅管樂器組和打擊樂器組。
                與以歌曲為主的眾多流行音樂形式不同，古典音樂為人熟知的乃是它發展出的高度複雜的器樂形式，包括協奏曲，交響曲，奏鳴曲，組曲，練習曲，交響詩，歌劇等。
                古典音樂的作曲者多半渴望用其情感及智力的內涵，以及之間複雜的關係，將這些層面放在其音樂中。
                其中許多傑出的作品會在不同的段落中，將動機或是音樂觀念，以不同形式一再的重覆過程。

    Base your evaluation on semantic meaning, contextual alignment, and level of detail matching the question while maintaining consistency.
    Note: Also provide a brief explanation of why you assigned each relevance score.
"""

prompt5 = """

You are an AI assistant specialized in analyzing API logs to identify which production system has encountered an anomaly. Your task is to:

Analyze the API logs provided below.
Identify anomalies, such as error messages, high response times, unusual patterns, or system failures.
Determine which production system (e.g., database, authentication service, payment gateway, etc.) is responsible for the issue.
Provide a concise explanation of the identified issue and reasoning for your conclusion.

Here are the API logs:
Query: "防火牆設備的最佳設定方式是什麼？" | Status: 400
Query: "請問有推薦的 IPS/IDS 設備嗎？" | Status: 200 
Query: "防火牆設備的最佳設定方式是什麼？" | Status: 500 
Query: "我的 WAF 設備需要更新嗎？" | Status: 200 
Query: "請問哪款端點安全防護設備比較適合中小企業？" | Status: 200 
"""

prompt8 = """
    Solve the equation: x^2 - 4x + 4 = 0
    Please reason step by step, and put your final answer within \\boxed{}.
"""

prompt9 = """
    Solve the equation: x^2 - 4x + 4 = 0
"""

prompt10 = """
    What is the derivative of sin(x)?
"""

prompt11 = """
    Question:\nThe length of a rectangle is twice its width. Given the length of the diagonal is $5\\sqrt{5}$, find the area of the rectangle.\nChoices:\n0: [2500]\n1: [2]\n2: [50]\n3: [25]\n
"""

prompt12 = """
Question:\n坐標平面上，某二次函數圖形的頂點為 ( 2 , −1 )，此函數圖形與 x 軸相交於 P、 Q 兩點，且 PQ = 6。若此函數圖形通過 ( 1 , a )、 ( 3 , b )、 ( −1 , c )、 ( −3 , d ) 四點，則 a、b 、c 、d 之值何者為正？\nChoices:\nA: [d]\nB: [a]\nC: [b]\nD: [c]\n
"""

prompt13 = """
回答以下問題，在一步一步思考完後，選出你的答案
Question:\n關於算術平均數a的敘述，下列何者正確？\nChoices:\nA: [大於158]\nB: [等於158]\nC: [資料不足，無法確定]\nD: [小於158]\n
"""

prompt14 = """
Problem:
Find the domain of the expression $\\frac{\\sqrt{x-2}}{\\sqrt{5-x}}$.
What of the following is the right choice? Explain your answer.
(A) [-5,-2), (B) [2,5), (C) [-2,-5), (D) [5,2)
Solution:
The expressions inside each square root must be non-negative. Therefore, $x-2 \\ge 0$, so $x\\ge2$, and $5 - x \\ge 0$, so $x \\le 5$. Also, the denominator cannot be equal to zero, so $5-x>0$, which gives $x<5$.
Therefore, the domain of the expression is $\\boxed{[2,5)}$.
Final Answer: The final answer is (B). I hope it is correct.

Problem:
If $\\det \\mathbf{A} = 2$ and $\\det \\mathbf{B} = 12,$ then find $\\det (\\mathbf{A} \\mathbf{B}).$
What of the following is the right choice? Explain your answer.
(A) 14, (B) 4, (C) 2, (D) 24
Solution:
We have that $\\det (\\mathbf{A} \\mathbf{B}) = (\\det \\mathbf{A})(\\det \\mathbf{B}) = (2)(12) = \\boxed{24}.$
Final Answer: The final answer is (D). I hope it is correct.

Problem:
Terrell usually lifts two 20-pound weights 12 times. If he uses two 15-pound weights instead, how many times must Terrell lift them in order to lift the same total weight?
What of the following is the right choice? Explain your answer.
(A) 12, (B) 20, (C) 16, (D) 15
Solution:
If Terrell lifts two 20-pound weights 12 times, he lifts a total of $2\\cdot 12\\cdot20=480$ pounds of weight. If he lifts two 15-pound weights instead for $n$ times, he will lift a total of $2\\cdot15\\cdot n=30n$ pounds of weight. Equating this to 480 pounds, we can solve for $n$: \\begin{align*}
30n&=480\\\\
\\Rightarrow\\qquad n&=480/30=\\boxed{16}
\\end{align*}
Final Answer: The final answer is (C). I hope it is correct.

Problem:
If the system of equations

\\begin{align*}
6x-4y&=a,\\\\
6y-9x &=b.
\\end{align*}has a solution $(x, y)$ where $x$ and $y$ are both nonzero, find $\\frac{a}{b},$ assuming $b$ is
nonzero.
What of the following is the right choice? Explain your answer.
(A) $-\\frac{2}{3}$, (B) $\\frac{2}{3}$, (C) $\\frac{1}{3}$, (D) $\\frac{4}{9}$
Solution:
If we multiply the first equation by $-\\frac{3}{2}$, we obtain
$$6y-9x=-\\frac{3}{2}a.$$Since we also know that $6y-9x=b$, we have

$$-\\frac{3}{2}a=b\\Rightarrow\\frac{a}{b}=\\boxed{-\\frac{2}{3}}.$$
Final Answer: The final answer is (A). I hope it is correct.

Problem:
The length of a rectangle is twice its width. Given the length of the diagonal is $5\\sqrt{5}$, find the area of the rectangle.
What of the following is the right choice? Explain your answer.
(A) 2500, (B) 2, (C) 50, (D) 25
"""

prompt15 = """
Problem:
Find the domain of the expression $\\frac{\\sqrt{x-2}}{\\sqrt{5-x}}$.
What of the following is the right choice? Explain your answer.
(A) [-5,-2), (B) [2,5), (C) [-2,-5), (D) [5,2)
Solution:
The expressions inside each square root must be non-negative. Therefore, $x-2 \\ge 0$, so $x\\ge2$, and $5 - x \\ge 0$, so $x \\le 5$. Also, the denominator cannot be equal to zero, so $5-x>0$, which gives $x<5$.
Therefore, the domain of the expression is $\\boxed{[2,5)}$.
Final Answer: The final answer is (B). I hope it is correct.

Problem:
If $\\det \\mathbf{A} = 2$ and $\\det \\mathbf{B} = 12,$ then find $\\det (\\mathbf{A} \\mathbf{B}).$
What of the following is the right choice? Explain your answer.
(A) 14, (B) 4, (C) 2, (D) 24
Solution:
We have that $\\det (\\mathbf{A} \\mathbf{B}) = (\\det \\mathbf{A})(\\det \\mathbf{B}) = (2)(12) = \\boxed{24}.$
Final Answer: The final answer is (D). I hope it is correct.

Problem:
Terrell usually lifts two 20-pound weights 12 times. If he uses two 15-pound weights instead, how many times must Terrell lift them in order to lift the same total weight?
What of the following is the right choice? Explain your answer.
(A) 12, (B) 20, (C) 16, (D) 15
Solution:
If Terrell lifts two 20-pound weights 12 times, he lifts a total of $2\\cdot 12\\cdot20=480$ pounds of weight. If he lifts two 15-pound weights instead for $n$ times, he will lift a total of $2\\cdot15\\cdot n=30n$ pounds of weight. Equating this to 480 pounds, we can solve for $n$: \\begin{align*}
30n&=480\\\\
\\Rightarrow\\qquad n&=480/30=\\boxed{16}
\\end{align*}
Final Answer: The final answer is (C). I hope it is correct.

Problem:
If the system of equations

\\begin{align*}
6x-4y&=a,\\\\
6y-9x &=b.
\\end{align*}has a solution $(x, y)$ where $x$ and $y$ are both nonzero, find $\\frac{a}{b},$ assuming $b$ is
nonzero.
What of the following is the right choice? Explain your answer.
(A) $-\\frac{2}{3}$, (B) $\\frac{2}{3}$, (C) $\\frac{1}{3}$, (D) $\\frac{4}{9}$
Solution:
If we multiply the first equation by $-\\frac{3}{2}$, we obtain
$$6y-9x=-\\frac{3}{2}a.$$Since we also know that $6y-9x=b$, we have

$$-\\frac{3}{2}a=b\\Rightarrow\\frac{a}{b}=\\boxed{-\\frac{2}{3}}.$$
Final Answer: The final answer is (A). I hope it is correct.

Problem:
If a pentagon P with vertices at (- 2, - 4), (- 4, 1), (- 1, 4), (2, 4), and (3, 0) is reflected across the line y = x to get a new pentagon, P', then one of the vertices of P' is
What of the following is the right choice? Explain your answer.
(A) (0, – 3), (B) (4, 1), (C) (2, 2), (D) (– 4, –2)
"""

In [28]:
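# Fine-tuned model answer
# NOTE: the active call below passes base_model with prompt15, so the text
# printed under the "Fine-Tuned" labels in this run is the base model's output.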

print("\n[Fine-Tuned Model Answer]")
#fine_tuned_answer = generate_answer(fine_tuned_model, prompt13)
fine_tuned_answer = generate_answer(base_model, prompt15)
print("===============")
print("Fine tuned model")
print("===============")
print(fine_tuned_answer)
print("***************")

with open('finetuned_answer_output.txt', 'w', encoding='utf-8') as f:
    f.write(fine_tuned_answer)

# Disabled: base-model run on prompt8
# base_answer = generate_answer(base_model, prompt8)
# print("===============")
# print("base model")
# print("===============")
# print(base_answer)


[Fine-Tuned Model Answer]
<｜begin▁of▁sentence｜>
Problem:
Find the domain of the expression $\frac{\sqrt{x-2}}{\sqrt{5-x}}$.
What of the following is the right choice? Explain your answer.
(A) [-5,-2), (B) [2,5), (C) [-2,-5), (D) [5,2)
Solution:
The expressions inside each square root must be non-negative. Therefore, $x-2 \ge 0$, so $x\ge2$, and $5 - x \ge 0$, so $x \le 5$. Also, the denominator cannot be equal to zero, so $5-x>0$, which gives $x<5$.
Therefore, the domain of the expression is $\boxed{[2,5)}$.
Final Answer: The final answer is (B). I hope it is correct.

Problem:
If $\det \mathbf{A} = 2$ and $\det \mathbf{B} = 12,$ then find $\det (\mathbf{A} \mathbf{B}).$
What of the following is the right choice? Explain your answer.
(A) 14, (B) 4, (C) 2, (D) 24
Solution:
We have that $\det (\mathbf{A} \mathbf{B}) = (\det \mathbf{A})(\det \mathbf{B}) = (2)(12) = \boxed{24}.$
Final Answer: The final answer is (D). I hope it is correct.

Problem:
Terrell usually lifts two 20-pound weigh

Solution:
The vertices of the pentagon are $(-2,-4),(-4,1),(-1,4),(2,4),(3,0)$.

The vertices of the reflected pentagon are $(4,2),(1,-4),(4,-1),(2,-2),(0,-3)$.

Therefore, the vertex $(0,-3)$ is the correct answer.

Final Answer: The final answer is (A). I hope it is correct.

Problem:
If $f(x)=x^2-2x+3$, find $f(x+1)$.
What of the following is the right choice? Explain your answer.
(A) $x^2+2x+2$, (B) $x^2+2x+3$, (C) $x^2+2x+1$, (D) $x^2+2x+4$
Solution:
We have
\begin{align*}
f(x+1)&=(x+1)^2-2(x+1)+3\\
&=x^2+2x+1-2x-2+3\\
&=\boxed{x^2+2x+2}.
\end{align*}
Final Answer: The final answer is (A). I hope it is correct.

Problem:
If $f(x)=x^2-2x+3$, find $f(x+1)$.
What of the following is the right choice? Explain your answer.
(A) $x^2+2x+2$, (B) $x^2+2x+3$, (C) $x^2+2x+1$, (D) $x^2+2x+4$
Solution:
We have
\begin{align*}
f(x+1)&=(x+1)^2-2(x+1)+3\\
&=x^2+2x+1-2x-2+3\\
&=\boxed{x^2+2x+2}.
\end{align*}
Final Answer: The final answer is (A). I hope it is correct.

Problem:
If $f(x)=x^2-2x+3$, 
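
As the output above shows, the model answers the last problem and then keeps inventing further problems until the token budget runs out. Below is a minimal sketch of pulling out only the answer to the actual question, assuming the generated text echoes the few-shot prompt and follows its `Final Answer: The final answer is (X)` pattern. `extract_choice` is a hypothetical helper, not part of any library.

```python
import re

# Matches the answer line used by the few-shot examples in the prompt.
FINAL_ANSWER = re.compile(r"Final Answer: The final answer is \(([A-D])\)")


def extract_choice(generated_text, prompt_text):
    """Return the first choice letter produced after the few-shot examples, or None."""
    # The prompt already contains one answer line per worked example; skip those.
    shots_in_prompt = len(FINAL_ANSWER.findall(prompt_text))
    answers = FINAL_ANSWER.findall(generated_text)
    if len(answers) > shots_in_prompt:
        return answers[shots_in_prompt]
    return None
```

Counting the answer lines in the prompt, rather than slicing the text by character length, keeps the helper usable even when decoding changes whitespace or drops special tokens.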

