In [1]:
from sentence_transformers import SentenceTransformer

# Load a pre-trained embedding model
embedding_model = SentenceTransformer("all-MiniLM-L6-v2")

In [2]:
import os
import json

directory = "data_korean_history"
documents = []

for root, _, files in os.walk(directory):
    for filename in files:
        file_path = os.path.join(root, filename)

        # Use separate names for the path and the handle so the loop variable is not shadowed
        with open(file_path, "r", encoding="utf-8") as f:
            data = json.load(f)

        # Use the first page in the response (pages are keyed by page ID)
        page = next(iter(data['query']['pages'].values()))

        # Split the extract into sections and keep only those longer than 300 characters,
        # prefixing each with the page title so the chunk carries its own context
        for t in page['extract'].split('\n\n\n'):
            if len(t) > 300:
                documents.append(page['title'] + '\n\n' + t)
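
The loader above assumes each JSON file has the shape of a MediaWiki API extract response: a `query.pages` object keyed by page ID, where each page carries a `title` and a plain-text `extract`. A minimal sketch of the chunking step on an in-memory sample follows. The `chunk_page` helper, its `min_chars` parameter, and the sample content are hypothetical and only illustrate the expected structure.

```python
def chunk_page(data, min_chars=300):
    """Split one page's extract into titled chunks longer than min_chars.

    Hypothetical helper mirroring the loader cell; not part of the notebook.
    """
    # Take the first (assumed only) page in the response
    page = next(iter(data["query"]["pages"].values()))
    sections = page["extract"].split("\n\n\n")
    return [page["title"] + "\n\n" + s for s in sections if len(s) > min_chars]


# Made-up sample in the assumed file format
sample = {
    "query": {
        "pages": {
            "12345": {
                "title": "Sample page",
                "extract": "short intro" + "\n\n\n" + "x" * 400,
            }
        }
    }
}

chunks = chunk_page(sample)
print(len(chunks))      # number of chunks that passed the length filter
print(chunks[0][:11])   # each chunk starts with the page title
```

Short sections such as one-line introductions are dropped by the length filter, so a page can contribute fewer chunks than it has sections.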


In [3]:
import faiss
import numpy as np

# Convert documents to embeddings
doc_embeddings = np.array(embedding_model.encode(documents), dtype=np.float32)

# Create a FAISS index
dimension = doc_embeddings.shape[1]
index = faiss.IndexFlatL2(dimension)  # Exact (brute-force) search by L2 distance
index.add(doc_embeddings)
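
`faiss.IndexFlatL2` performs an exact search: it compares the query against every stored vector and returns the squared Euclidean distances of the nearest ones. Below is a rough NumPy equivalent on toy vectors, meant as a sketch of the idea rather than of FAISS internals. The function name `flat_l2_search` and the toy data are made up for illustration.

```python
import numpy as np


def flat_l2_search(doc_vectors, query_vectors, top_k):
    """Brute-force nearest-neighbor search by squared L2 distance (illustrative)."""
    # Pairwise differences: shape (n_queries, n_docs, dim)
    diffs = query_vectors[:, None, :] - doc_vectors[None, :, :]
    sq_dists = np.sum(diffs ** 2, axis=2)
    # Indices of the top_k smallest distances per query, nearest first
    order = np.argsort(sq_dists, axis=1)[:, :top_k]
    return np.take_along_axis(sq_dists, order, axis=1), order


# Toy 2-D vectors standing in for real embeddings
toy_docs = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 3.0]], dtype=np.float32)
toy_query = np.array([[0.9, 0.1]], dtype=np.float32)

toy_distances, toy_indices = flat_l2_search(toy_docs, toy_query, top_k=2)
print(toy_indices)    # row of document positions, nearest first
print(toy_distances)  # matching squared distances
```

This scales linearly with the number of documents, which is acceptable for a small corpus like the one indexed here.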


In [4]:
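def retrieve_similar_documents(query, top_k=5):
    """Return the top_k documents nearest to the query in embedding space."""
    query_embedding = np.array([embedding_model.encode(query)], dtype=np.float32)
    _, indices = index.search(query_embedding, top_k)

    # FAISS pads with -1 when fewer than top_k results exist; skip those slots,
    # otherwise documents[-1] would silently return the last document
    results = [documents[i] for i in indices[0] if i != -1]
    return results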
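
The retrieval function discards the distances returned by `index.search`, although they are useful for inspecting how close a match really is. A hedged sketch of a variant that keeps them follows. The names `retrieve_with_distances` and `TinyIndex` and the toy encoder are hypothetical; `TinyIndex` only imitates the `search(x, k)` return shape so the example runs without FAISS or an embedding model.

```python
import numpy as np


def retrieve_with_distances(query, encode, index, docs, top_k=5):
    """Return (document, distance) pairs, skipping empty result slots.

    Hypothetical variant of retrieve_similar_documents; `encode` and `index`
    are passed in so the sketch can run with stand-ins.
    """
    query_embedding = np.array([encode(query)], dtype=np.float32)
    distances, indices = index.search(query_embedding, top_k)
    # Slots with index -1 mean "no result" and are skipped
    return [(docs[i], float(d)) for d, i in zip(distances[0], indices[0]) if i != -1]


class TinyIndex:
    """Stand-in exposing the same search(x, k) return shape as a FAISS index."""

    def __init__(self, vectors):
        self.vectors = np.asarray(vectors, dtype=np.float32)

    def search(self, x, k):
        sq = np.sum((x[:, None, :] - self.vectors[None, :, :]) ** 2, axis=2)
        order = np.argsort(sq, axis=1)[:, :k]
        dists = np.take_along_axis(sq, order, axis=1)
        # Pad to k columns with -1 indices when there are too few vectors
        pad = k - order.shape[1]
        if pad > 0:
            order = np.pad(order, ((0, 0), (0, pad)), constant_values=-1)
            dists = np.pad(dists, ((0, 0), (0, pad)), constant_values=np.inf)
        return dists, order


def toy_encode(text):
    """Toy encoder: maps text to one of two fixed 2-D vectors."""
    return [1.0, 0.0] if "trade" in text else [0.0, 1.0]


toy_docs = ["doc about kings", "doc about trade"]
toy_index = TinyIndex([[0.0, 1.0], [1.0, 0.0]])

hits = retrieve_with_distances("trade routes", toy_encode, toy_index, toy_docs, top_k=5)
for doc, dist in hits:
    print(f"{dist:.2f}  {doc}")
```

With the real index, the same pairs could be used to drop chunks whose distance exceeds a chosen threshold before they reach the prompt.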

In [5]:
def context_query(query):
    context = "\n\n".join(retrieve_similar_documents(query, top_k=1))
    prompt = f"Context: {context}\n\nQuestion: {query}\nAnswer:"
    return prompt
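
`context_query` passes only the single nearest chunk (`top_k=1`) to the model. If more chunks were retrieved, the prompt would need a size cap so it does not crowd out the question. One possible sketch, assuming a simple character budget: `build_prompt` and `max_context_chars` are hypothetical names, and the budget value is arbitrary.

```python
def build_prompt(query, chunks, max_context_chars=2000):
    """Join retrieved chunks into a prompt, stopping before the budget is exceeded."""
    kept, used = [], 0
    for chunk in chunks:
        # Always keep the first chunk; stop once adding another would pass the budget
        if kept and used + len(chunk) > max_context_chars:
            break
        kept.append(chunk)
        used += len(chunk)
    context = "\n\n".join(kept)
    return f"Context: {context}\n\nQuestion: {query}\nAnswer:"


# Toy chunks standing in for retrieved documents
toy_chunks = ["A" * 50, "B" * 50, "C" * 50]
prompt = build_prompt("What is this?", toy_chunks, max_context_chars=100)
print(prompt)
```

Because the chunks arrive nearest first, cutting from the end drops the least relevant material.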

## RAG with Gemma3-1b

In [6]:
from transformers import pipeline
import torch
pipe = pipeline("text-generation", model="google/gemma-3-1b-it", device="cuda", torch_dtype=torch.bfloat16)

Device set to use cuda


In [None]:
# Chatbot Demonstration
messages = [[]]
query_chain = ""

while True:
    query = input(">>")
    # Type "exit" or "quit" to end the session
    if query.strip().lower() in {"exit", "quit"}:
        break

    # Retrieve with the whole conversation so far, so follow-ups like "its economy" still resolve
    query_chain += query + '\n\n\n'
    messages[0].append({
        "role": "user",
        "content": [{"type": "text", "text": context_query(query_chain)}]})

    output = pipe(messages, max_new_tokens=1000)
    # The last message in the generated conversation is the model's reply
    answer = output[0][0]['generated_text'][-1]['content']

    query_chain += answer + '\n\n\n'
    messages[0].append({
        "role": "assistant",
        "content": [{"type": "text", "text": answer}]})

    print(answer)
    print()

>> Tell me about the Joseon Dynasty


Okay, here’s a breakdown of the Joseon Dynasty, based on the provided context:

The Joseon Dynasty was a powerful and complex state in Joseon Korea. Here’s a summary of what’s key:

*   **Tributary State:** Joseon was a tributary state to the Ming and Qing dynasties, meaning it was ruled by foreign powers.
*   **High Degree of Independence:** Despite being under foreign control, Joseon maintained a significant level of autonomy and sovereignty.
*   **Limited Monarch Power:** The ruling Yi family established institutions that limited the power of the monarchy.
*   **Contradictory Status:** Modern scholars believe this system created a situation where Joseon’s sovereignty seemed to be contradictory – a position of high power within a tributary relationship.
*   **Significant Foreign Relations:** Joseon maintained relationships with numerous neighboring states, including Japan, Vietnam, Ryukyu, Burma, Thailand, Laos, Brunei, and the Philippines.
*   **Tribute and Trade:** It received trib

>> Tell me about its economy


Okay, let’s delve into the Joseon Dynasty’s economy. Here’s a breakdown based on the context:

The Joseon Dynasty’s economy was heavily influenced by its tributary relationships and its position as a regional power. Here’s a look at key aspects:

*   **Tributary Revenue:** A large portion of the Joseon economy derived from tribute payments to the Ming and Qing dynasties. This was a crucial source of revenue, especially during times of peace and prosperity.
*   **Agriculture:** Agriculture remained the backbone of the economy. Rice was the primary crop, and the dynasty cultivated various other grains and vegetables.
*   **Manufacturing:**  Joseon developed a relatively advanced manufacturing sector, particularly in textiles (especially silk), ceramics, and paper.  The silk industry, in particular, was highly regarded.
*   **Trade:**  Joseon engaged in extensive trade with neighboring countries – primarily Japan, Vietnam, Ryukyu, Burma, Thailand, Laos, Brunei, and the Philippines – facil

>> how did the joseon dynasty fall?


Okay, let’s address how the Joseon Dynasty fell.

The Joseon Dynasty ultimately collapsed due to a confluence of factors, primarily stemming from internal weaknesses and external pressures. Here’s a breakdown of the key contributing elements:

*   **Internal Weaknesses:** Over time, the Joseon Dynasty suffered from increasing internal dissent. The Yi family’s centralized rule, while initially successful, eventually alienated local officials and landowners. Disputes over land and resources exacerbated these tensions.
*   **The Rebellion of the White Lotus (1796-1800):** This was arguably the pivotal moment. The White Lotus rebellion, a coalition of Korean rebels, targeted the Joseon court, seizing control of key territories and attempting to overthrow the dynasty. Their actions severely weakened the government and exposed its vulnerabilities.
*   **The Taeguk Rebellion (1894-1895):** This rebellion, led by the powerful Taeguk clan, further destabilized the country. It involved a large-s
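
The chat loop above uses the entire `query_chain` as the retrieval query, and that string grows with every turn. Embedding models truncate input beyond their maximum length (the model card for `all-MiniLM-L6-v2` lists 256 word pieces by default), so in a long session the newest question, which sits at the end, may no longer influence retrieval. A minimal sketch of one alternative is to build the retrieval query from the most recent turns only. `recent_turns_query`, `max_turns`, and `max_chars` are hypothetical names with arbitrary defaults, and the placeholder answers stand in for model output.

```python
def recent_turns_query(turns, max_turns=3, max_chars=1000):
    """Build a retrieval query from the most recent conversation turns only."""
    recent = turns[-max_turns:]
    text = "\n\n".join(recent)
    # Keep the tail so the newest question always survives the cut
    return text[-max_chars:]


# Alternating user questions and placeholder answers
turns = [
    "Tell me about the Joseon Dynasty",
    "<answer 1>",
    "Tell me about its economy",
    "<answer 2>",
    "how did the joseon dynasty fall?",
]

retrieval_query = recent_turns_query(turns, max_turns=2)
print(retrieval_query)
```

Keeping one or two earlier turns preserves enough context for follow-ups that use pronouns, while bounding the query length.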