This is an experiment with a Mixture-of-Agents (MoA) approach to LLM-powered synthetic data generation. The objective is to create a high-quality dataset that can be used to train LLMs for a structured information extraction (IE) task.

The code below uses a basic MoA setup with 2 "small" LLMs as proposers, each generating a synthetic biography from the available JSON record of an elite's biographical information. The two generated biographies are then evaluated and merged by a much larger LLM (the aggregator).
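
The propose-then-aggregate flow described above can be sketched with plain Python callables. This is only an illustration, not the notebook's implementation: the function `mixture_of_agents`, the `Generator` alias, and the stand-in generators are hypothetical names, and the prompts are shortened placeholders.

```python
from typing import Callable, List

# Hypothetical alias: any function that maps a prompt to generated text
Generator = Callable[[str], str]


def mixture_of_agents(record_json: str, proposers: List[Generator], aggregator: Generator) -> str:
    """Ask every proposer for a draft, then let the aggregator merge the drafts."""
    proposer_prompt = f"Write a biography based on this JSON record:\n{record_json}"
    drafts = [propose(proposer_prompt) for propose in proposers]

    # Number the drafts so the aggregator can tell them apart
    numbered = "\n".join(f"Biography {i}: {draft}" for i, draft in enumerate(drafts, start=1))
    aggregation_prompt = "Combine and refine the biographies below into one.\n" + numbered
    return aggregator(aggregation_prompt)


# Stand-in generators so the sketch runs without loading any model
def small_a(prompt: str) -> str:
    return "Draft A"


def small_b(prompt: str) -> str:
    return "Draft B"


def large(prompt: str) -> str:
    return f"[merged from {prompt.count('Biography ')} drafts]"


print(mixture_of_agents('{"Name": "Example"}', [small_a, small_b], large))
# prints: [merged from 2 drafts]
```

In the actual cells below, the two proposers are local `TransformersLLM` models and the aggregator is a hosted `TogetherLLM` model.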

MoA can bring together the collective strengths of multiple LLM agents: it leverages the diversity (and fast inference) of the smaller LLMs while still employing the larger LLM's knowledge base and generative capability to produce final biographical texts that are human-like and of high quality.

Ironically, however, this approach (at least in this basic form) tends to create final documents that are too concise. Because of this linguistic efficiency, the final documents often combine or remove redundant pieces of information, so their structure and/or content no longer match the JSON records. As a result, a dataset that pairs such "too perfect" biographical documents with structured JSON records would not teach LLMs the desired behavior, especially for structured IE tasks.
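
One rough way to spot this mismatch is to check which values of a JSON record no longer appear in the generated text. The sketch below is an assumption-laden illustration, not part of the original experiment: `leaf_values` and `missing_values` are hypothetical helpers, and plain substring matching is crude (a short value such as "Tay" would also match inside an unrelated word).

```python
from typing import Any, Iterator, List


def leaf_values(node: Any) -> Iterator[str]:
    """Yield every non-empty scalar value in a nested JSON structure as a string."""
    if isinstance(node, dict):
        for value in node.values():
            yield from leaf_values(value)
    elif isinstance(node, list):
        for item in node:
            yield from leaf_values(item)
    elif node not in ("", None):
        yield str(node)


def missing_values(record: dict, biography: str) -> List[str]:
    """Return the record values that do not appear (case-insensitively) in the text."""
    text = biography.lower()
    return sorted({value for value in leaf_values(record) if value.lower() not in text})


# Toy record and biography for illustration only
record = {"Name": "Dam Quang Trung", "BirthYear": "1921", "Ethnicity": ["Tay"], "DeathYear": ""}
bio = "Dam Quang Trung was born in 1921."
print(missing_values(record, bio))  # prints: ['Tay']
```

A long list of missing values for an aggregated biography would suggest that the aggregator merged or dropped details that the JSON record still contains.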

**Preliminary Conclusion:** To save time and resources, I will use a multi-LLM approach (2 LLMs) for synthetic data generation. In other words, I will use 2 powerful LLMs to generate synthetic biographies and rely on human evaluation (rather than an LLM aggregator) to select the better version and enhance the documents. <u>The code for 2-LLM data augmentation is found in another notebook</u>.

Although MoA is interesting and has high potential, ways to improve it (or better MoA systems) are left for future exploration.

# Generate Synthetic Biographies Using Mixture of Agents Approach

*   **Proposer models**: Ministral-8B-Instruct-2410 & Llama-3.2-3B-Instruct
*   **Aggregator model**: Llama-3.3-70B-Instruct-Turbo (quantized by Together AI)



#### Setting Up

In [None]:
!pip install distilabel pydantic pandas openpyxl llama-cpp-python

Collecting distilabel
  Downloading distilabel-1.5.3-py3-none-any.whl.metadata (15 kB)
Collecting llama-cpp-python
  Downloading llama_cpp_python-0.3.7.tar.gz (66.7 MB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 66.7/66.7 MB 15.9 MB/s eta 0:00:00
  Installing build dependencies ... done
  Getting requirements to build wheel ... done
  Installing backend dependencies ... done
  Preparing metadata (pyproject.toml) ... done
Collecting datasets>=2.16.0 (from distilabel)
  Downloading datasets-3.3.2-py3-none-any.whl.metadata (19 kB)
Collecting multiprocess>=0.70 (from distilabel)
  Downloading multiprocess-0.70.17-py311-none-any.whl.metadata (7.2 kB)
Collecting portalocker>=2.8.2 (from distilabel)
  Downloading portalocker-3.1.1-py3-none-any.whl.metadata (8.6 kB)
Collecting universal-pathlib>=0.2.2 (from distilabel)
  Downloading universal_pathlib-0.2.6-py3-none-any.whl.metadata (25 kB)
Collecti

In [None]:
import json
import pandas as pd
from pydantic import BaseModel
from typing import List, Optional

# Step 1: Load the JSONL file - file named *final3.jsonl
jsonl_file = "/content/drive/elitenet_vn_cleaned4synthetic_final3.jsonl"

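def load_jsonl(file_path):
    """Read a JSON Lines file and return one parsed record per non-empty line."""
    data = []
    with open(file_path, "r", encoding="utf-8") as f:
        for line in f:
            line = line.strip()
            if line:  # Skip blank lines, which would make json.loads fail
                data.append(json.loads(line))
    return data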

# Load JSON records
elites_data = load_jsonl(jsonl_file)
print(f"Loaded {len(elites_data)} records.")

# Step 2: Create an Excel table with "Input_JSON" and "Bio_Synthetic"
df = pd.DataFrame({
    "Input_JSON": [json.dumps(elite, ensure_ascii=False) for elite in elites_data],
    "Bio_Synthetic": [""] * len(elites_data)  # Empty column for aggregated biographies
})

# Save to an Excel file
excel_path = "/content/drive/elitenet_synthetic2.xlsx"
df.to_excel(excel_path, index=False, engine='openpyxl')
print(f"Excel file saved: {excel_path}")



Loaded 91 records.
Excel file saved: /content/drive/MyDrive/EliteNet_SyntheticData_2025/elitenet_synthetic2.xlsx


In [None]:
# Step 3: Define the Pydantic schema (same as before)
class Birthplace(BaseModel):
    City: str
    City_OtherName: Optional[List[str]]
    Region: str
    Region_OtherName: Optional[List[str]]

class Education(BaseModel):
    School: str
    OtherName: Optional[List[str]]
    Start: str
    End: str
    Level: str
    Field: str
    Location: List[str]  # The records store locations as a list, e.g. ['China']

class MilitaryTitle(BaseModel):
    Title: str
    YearReceived: str

class Organization(BaseModel):
    MainOrg: str
    MainOrg_OtherName: Optional[str]
    SubOrg: Optional[str]
    Unit: Optional[str]

class Job(BaseModel):
    Start: str
    End: str
    Position: str
    Organization: Organization
    Location: Optional[List[str]]

class CareerEvent(BaseModel):
    Date: str
    Detail: str

class PersonRelation(BaseModel):
    Name: str
    Relation: str

class Colleague(BaseModel):
    FirstMention_Year: str
    Name: List[str]

class Rival(BaseModel):
    FirstMention_Year: str
    Name: List[str]

class Biography(BaseModel):
    Name: str
    Name_Other: Optional[List[str]]
    BirthYear: str
    DeathYear: Optional[str]
    Birthplace: Birthplace
    Ethnicity: Optional[List[str]]
    Education: Optional[List[Education]]
    MilitaryTitle: Optional[List[MilitaryTitle]]
    Job: List[Job]
    Retired: Optional[List[CareerEvent]]
    Dismissed: Optional[List[CareerEvent]]
    Resigned: Optional[List[CareerEvent]]
    Arrested: Optional[List[CareerEvent]]
    Exiled: Optional[List[CareerEvent]]
    Killed: Optional[List[CareerEvent]]
    FamilyMember: Optional[List[PersonRelation]]
    Colleague: Optional[List[Colleague]]
    Rival: Optional[List[Rival]]
    synthetic_biography: str  # Final aggregated biography

In [None]:
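# Clear the Hugging Face model cache first: login() stores the token under
# ~/.cache/huggingface by default, so clearing the cache after logging in would delete it
!rm -rf ~/.cache/huggingface

# Logging in Hugging Face
from huggingface_hub import login

# Log in using your Hugging Face token (get it from https://huggingface.co/settings/tokens)
login("yourHFtoken")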

In [None]:
# Step 4: Load the LLMs (Proposers & Aggregator)
from distilabel.models.llms.huggingface import InferenceEndpointsLLM
# from distilabel.models.llms import MistralLLM
from distilabel.models.llms import TransformersLLM
from transformers import AutoTokenizer
import torch

HF_TOKEN = "yourHFtoken"

# Initialize Proposer 1 (Ministral 8B)
proposer1 = TransformersLLM(
    model="mistralai/Ministral-8B-Instruct-2410",
    tokenizer="mistralai/Ministral-8B-Instruct-2410",
    device_map="auto",  # Auto-assign to GPU
    torch_dtype="float16",  # Use 16-bit floating point for efficiency
    token=HF_TOKEN # Force authentication
)


proposer1.load()

# Initialize Proposer 2 (Llama 3.2-3B)
proposer2 = TransformersLLM(
    model="meta-llama/Llama-3.2-3B-Instruct",
    tokenizer="meta-llama/Llama-3.2-3B-Instruct",
    device_map="auto",  # Auto-assign to GPU
    torch_dtype="float16",  # Use 16-bit floating point for efficiency
    token=HF_TOKEN # Force authentication
)
proposer2.load()

You are using the default legacy behaviour of the <class 'transformers.models.llama.tokenization_llama_fast.LlamaTokenizerFast'>. This is expected, and simply means that the `legacy` (previous) behavior will be used so nothing changes for you. If you want to use the new behaviour, set `legacy=False`. This should only be set if you understand what it means, and thoroughly read the reason why this was added as explained in https://github.com/huggingface/transformers/pull/24565 - if you loaded a llama tokenizer from a GGUF file you can ignore this message.
Device set to use cuda:0


Device set to use cuda:0


In [None]:
# Load the aggregator
from distilabel.models.llms import TogetherLLM

aggregator = TogetherLLM(model="meta-llama/Llama-3.3-70B-Instruct-Turbo",
                         api_key="yourTogetherLLM-APIkey")

# Check if initialization is successful
if aggregator is None:
    print("❌ ERROR: Model initialization failed!")
else:
    print("✅ Model initialized successfully.")

# Check if the model loads successfully
try:
    aggregator.load()
    print("✅ Model loaded successfully.")
except Exception as e:
    print(f"❌ ERROR: Model failed to load! \n{e}")

✅ Model initialized successfully.
✅ Model loaded successfully.


In [None]:
import os
import json
import torch

test_json = df["Input_JSON"][4]
test_data = json.loads(test_json)  # Convert string to dictionary
print(test_data)

{'Name': 'Dam Quang Trung', 'Name_Other': ['Dam Ngoc Luu'], 'BirthYear': '1921', 'DeathYear': '1995', 'Birthplace': {'City': 'Ha Quang', 'City_OtherName': [], 'Region': 'Cao Bang', 'Region_OtherName': []}, 'Ethnicity': ['Tay'], 'Education': [{'School': 'Whampoa Military Academy', 'OtherName': [], 'Start': '1941', 'End': '1943', 'Level': '', 'Field': 'Military Studies', 'Location': ['China']}, {'School': 'Frunze Military Academy', 'OtherName': [], 'Start': '1957', 'End': '1957', 'Level': '', 'Field': 'Military Studies', 'Location': ['USSR']}], 'MilitaryTitle': [{'Title': 'Colonel', 'YearReceived': '1958'}, {'Title': 'Major General', 'YearReceived': '1974'}, {'Title': 'Lieutenant General', 'YearReceived': '1980'}, {'Title': 'Senior Lieutenant General', 'YearReceived': '1984'}], 'Job': [{'Start': '1941', 'End': '1943', 'Position': 'Student', 'Organization': {'MainOrg': 'Whampoa Military Academy', 'MainOrg_OtherName': '', 'SubOrg': '', 'Unit': ''}, 'Location': ['China']}, {'Start': '1944',

In [None]:
## Test ##
# Enable PyTorch Memory Optimizations
os.environ["PYTORCH_CUDA_ALLOC_CONF"] = "expandable_segments:True"
test_json = df["Input_JSON"][4]
test_data = json.loads(test_json)  # Convert string to dictionary

testprompt = f"""
You are an expert historian and writer specializing in biographies of Vietnam's political elites.
Generate a detailed biography based on the following elite's JSON record:

{test_data}

Important Notes:
- The biography should be around 400 to 800 words, depending on the JSON record's length and amount of details.
- Write the biography like a Wikipedia entry, with a natural way and neutral tone, but without a References section.
- Use all available information pieces in the JSON record, these JSON fields include: the elite's education, working experiences, dismissal, retirement, titles (if any), their family members, colleagues, rivals, and other details of their relationships (such as their family relation type or when they are first reported to work together and/or become rivals).
- However, do not modify or make up the contents of such JSON fields. If the content of a JSON field (e.g., education, family members, colleague, etc.) is empty, then don't use it and don't make it up. This essentially means copy-pasting all the available content in the JSON fields (if they are not empty) into your sentences and paragraphs when you write the biography.
"""
test_pro1_response = proposer1.generate(
    [[{"role": "user", "content": testprompt}]],
    max_new_tokens=1100,
    temperature=0.7
)

test_pro2_response = proposer2.generate(
    [[{"role": "user", "content": testprompt}]],
    max_new_tokens=1100,
    temperature=0.65
)


aggregation_prompt = f"""
You are an expert historian specializing in biographies of Vietnam's political elites. Combine and refine the two biographies below into a single high-quality and Wikipedia-style biography (but without the References section).
Important Note: Make sure that the output biography has as many biographical details of the elite as possible. This includes any relevant information: birth year (and death year if any), birthplace, ethnicity, education, working experiences, dismissal, retirement, arrest, titles (if any), their family members, colleagues, rivals, and other details of their relationships (such as their family relation type or when they are first reported to work together and/or become rivals).
DO NOT touch any of these pieces of biographical information. You can enrich the biography by further adding context if necessary, but NEVER trim down the existing relevant biographical information for the sake of concision.
The final biography should have a neutral tone.
Biography 1: {test_pro1_response[0]['generations'][0]}
Biography 2: {test_pro2_response[0]['generations'][0]}
"""

test_agg_response = aggregator.generate(
    [[{"role": "user", "content": aggregation_prompt}]],
    max_new_tokens=1200,
    temperature=0.6
)


In [None]:
print(test_pro1_response[0]['generations'][0])

### Dam Quang Trung

**Dam Quang Trung**, born **Dam Ngoc Luu** in **1921** in **Ha Quang**, Cao Bang, was a prominent Vietnamese politician and military leader who served in various positions within the Vietnam People's Army (VPA) and played significant roles throughout the country's history.

#### Early Life and Education
Dam Quang Trung was born into the Tay ethnic minority group. He received his early military training at the **Whampoa Military Academy** in China from **1941** to **1943**. Subsequently, he studied further at the **Frunze Military Academy** in the USSR between **1957** and **1957**.

#### Military Career
His military career began in China during World War II, where he was a student at Whampoa Military Academy from **1941** to **1943**. After the war, he returned to Vietnam and joined the VPA, serving in various leadership and command roles.

From **1944** to **1945**, he held the position of Leadership within the VPA. His role expanded to Head of Hanoi Special Zone 

In [None]:
print(test_pro2_response[0]['generations'][0])

Dam Quang Trung (1921-1995) was a senior Vietnamese military leader who played a significant role in the country's struggle for independence from France and later against the United States.

Early Life and Education

Dam Quang Trung was born in Ha Quang, Cao Bang Province, Vietnam in 1921. His birth name is Dam Ngoc Luu. He came from a Tay ethnic minority background. Quang Trung received his early education at Whampoa Military Academy in China from 1941 to 1943. After completing his studies, he attended Frunze Military Academy in the USSR from 1957.

Career

Quang Trung began his military service in 1941, during the Japanese occupation of northern Vietnam. In 1944, he joined the Viet Minh, a coalition of nationalist groups fighting against French colonial rule. At that time, he worked under the leadership of Ho Chi Minh, the president of the Democratic Republic of Vietnam. 

In 1945, Quang Trung became the head of the Hanoi Special Zone of the Vietnam People's Army. Later, he held vari

In [None]:
print(test_agg_response[0]['generations'][0])

Dam Quang Trung (1921-1995) was a prominent Vietnamese politician and military leader who served in various positions within the Vietnam People's Army (VPA) and played significant roles throughout the country's history. He was born as Dam Ngoc Luu in Ha Quang, Cao Bang Province, Vietnam, and belonged to the Tay ethnic minority group.

Dam Quang Trung received his early military training at the Whampoa Military Academy in China from 1941 to 1943. Subsequently, he studied further at the Frunze Military Academy in the USSR between 1957. His military career began in China during World War II, where he was a student at Whampoa Military Academy from 1941 to 1943. After the war, he returned to Vietnam and joined the VPA, serving in various leadership and command roles.

From 1944 to 1945, he held the position of Leadership within the VPA. His role expanded to Head of Hanoi Special Zone under the VPA from 1945 to 1946. In 1946, he became the Head of the Military Committee for the VPA, statione

In [None]:
!nvidia-smi

Fri Mar  7 06:44:38 2025       
+-----------------------------------------------------------------------------------------+
| NVIDIA-SMI 550.54.15              Driver Version: 550.54.15      CUDA Version: 12.4     |
|-----------------------------------------+------------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id          Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |           Memory-Usage | GPU-Util  Compute M. |
|                                         |                        |               MIG M. |
|   0  NVIDIA A100-SXM4-40GB          Off |   00000000:00:04.0 Off |                    0 |
| N/A   31C    P0             52W /  400W |   22817MiB /  40960MiB |      0%      Default |
|                                         |                        |             Disabled |
+-----------------------------------------+------------------------+----------------------+
                                                

In [None]:
torch.cuda.empty_cache()
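
The generation loop in the next cell collects prompts into batches of `BATCH_SIZE` and detects the last (possibly partial) batch with `index == len(df) - 1`, which only works when the DataFrame keeps its default 0..n-1 index. As a minimal alternative sketch, the items could be sliced into batches up front. The helper name `chunked` is hypothetical and not part of the notebook.

```python
from typing import Iterator, List, Sequence, TypeVar

T = TypeVar("T")


def chunked(items: Sequence[T], size: int) -> Iterator[List[T]]:
    """Yield consecutive batches of at most `size` items; the last batch may be shorter."""
    for start in range(0, len(items), size):
        yield list(items[start:start + size])


# Ten dummy row indices split into batches of four
batches = list(chunked(list(range(10)), 4))
print(batches)  # prints: [[0, 1, 2, 3], [4, 5, 6, 7], [8, 9]]
```

With this shape, each batch can be processed and stored in one place, and no special case is needed for the final batch.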

In [None]:
# Step 5: Generating Synthetic Biographies
import os
import json
import time
import torch

# Enable PyTorch Memory Optimizations
os.environ["PYTORCH_CUDA_ALLOC_CONF"] = "expandable_segments:True"

BATCH_SIZE = 4  # Adjust based on available VRAM
DEBUG = False  # Set to True for detailed logs

print("📌 Initial DataFrame Preview:")
print(df.head())

prompts, batch_indices, failed_indices = [], [], []

def safe_generate(model, prompt, max_tokens, temp):
    """ Wrapper to handle API failures gracefully and extract text properly. """
    try:
        response = model.generate(
            [[{"role": "user", "content": prompt}]],
            max_new_tokens=max_tokens,
            temperature=temp
        )

        # Ensure response is a dictionary and extract text
        if isinstance(response, list) and len(response) > 0 and isinstance(response[0], dict):
            generations = response[0].get("generations", [])
            if isinstance(generations, list) and len(generations) > 0:
                return generations[0]  # Correct extraction
            else:
                print("❌ Empty `generations` list.")
        else:
            print("❌ Unexpected response format:", response)

        return ""  # Return empty string if extraction fails

    except Exception as e:
        print(f"❌ Generation error: {e}")
        return ""

for index, row in df.iterrows():
    elites_json = row["Input_JSON"]
    # Convert string to dictionary (named `elite_record` so the loaded `elites_data` list is not overwritten)
    elite_record = json.loads(elites_json)

    prompt = f"""
    You are an expert historian and writer specializing in biographies of Vietnam's political elites.
    Generate a detailed biography based on the following elite's JSON record:

    {json.dumps(elite_record, indent=2)}

    Important Notes:
    - The biography should be around 400 to 800 words, depending on the JSON record's length and amount of details.
    - Write the biography like a Wikipedia entry, with a natural way and neutral tone, but without a References section.
    - Use all available information pieces in the JSON record, these JSON fields include: the elite's education, working experiences, dismissal, retirement, titles (if any), their family members, colleagues, rivals, and other details of their relationships (such as their family relation type or when they are first reported to work together and/or become rivals).
    - However, do not modify or make up the contents of such JSON fields. If the content of a JSON field (e.g., education, family members, colleague, etc.) is empty, then don't use it and don't make it up. This essentially means copy-pasting all the available content in the JSON fields (if they are not empty) into your sentences and paragraphs when you write the biography.
    """

    prompts.append(prompt)
    batch_indices.append(index)

    if len(prompts) == BATCH_SIZE or index == len(df) - 1:
        try:
            print("\n📌 Processing batch...")

            responses1, responses2 = [], []

            for i, p in enumerate(prompts):
                print(f"\n🔹 Generating biography {i+1}/{len(prompts)}...")

                text1 = safe_generate(proposer1, p, 1000, 0.7)
                text2 = safe_generate(proposer2, p, 1000, 0.6)

                print(f"📌 Proposer 1 Output (First 200 chars): {text1[:200]}" if text1 else "❌ Proposer 1 returned an empty response!")
                print(f"📌 Proposer 2 Output (First 200 chars): {text2[:200]}" if text2 else "❌ Proposer 2 returned an empty response!")

                responses1.append(text1)
                responses2.append(text2)

            # Aggregation Step
            final_responses = []
            for bio1, bio2 in zip(responses1, responses2):
                if not (bio1 or bio2):
                    # Both proposers failed: store an empty result here and let the
                    # "Store results" step below record this index as failed (once)
                    final_responses.append("")
                    continue

                aggregation_prompt = f"""
                You are an expert historian specializing in biographies of Vietnam's political elites.
                Combine and refine the following two biographies into a single high-quality and Wikipedia-style biography (but without the References section).
                Important Note: DO NOT touch any pieces of biographical information of the elite. This includes any relevant information: birth year (and death year if any), birthplace, ethnicity, education, working experiences, dismissal, retirement, arrest, titles (if any), their family members, colleagues, rivals, and other details of their relationships (such as their family relation type or when they are first reported to work together and/or become rivals). These pieces of information were copied verbatim into the two biographies, so you may only rework the language (sentences and paragraphs) used to convey them. You can also enrich the biography by adding context if necessary, but never trim down the existing relevant biographical information.
                The final biography should be around 450 to 1000 words, with a natural but neutral tone, not overly positive or negative.
                Biography 1: {bio1}
                Biography 2: {bio2}
                """

                final_text = safe_generate(aggregator, aggregation_prompt, 1200, 0.6)
                final_responses.append(final_text)

                print(f"📌 Aggregated Biography (First 200 chars): {final_text[:200]}...\n" if final_text else "❌ Aggregation failed.")

            # Store results
            for i, final_bio in enumerate(final_responses):
                if final_bio.strip():
                    df.loc[batch_indices[i], "Bio_Synthetic"] = final_bio
                    print(f"✅ Saved Biography for index {batch_indices[i]}.")
                else:
                    print(f"❌ Biography for index {batch_indices[i]} is empty, skipping save.")
                    failed_indices.append(batch_indices[i])  # Track failures

            print("\n📌 DataFrame Preview After Saving:")
            print(df.loc[batch_indices])

            # Free memory efficiently after each batch
            del responses1, responses2, final_responses

            # Introduce smart delay after processing a batch
            time.sleep(3)  # Wait 3 seconds to avoid rate limits

        except Exception as e:
            print(f"❌ Batch processing error: {e}")
            torch.cuda.empty_cache()

        # Clear prompts and indices for next batch
        prompts, batch_indices = [], []

# Final confirmation
print("\n📌 Final DataFrame Preview:")
print(df.head())

# Log failed attempts for debugging
if failed_indices:
    print(f"⚠️ Failed to generate {len(failed_indices)} biographies. Consider retrying these indices: {failed_indices}")

df.to_excel(excel_path, index=False, engine='openpyxl')
print(f"✅ Updated Excel file saved: {excel_path}")

📌 Initial DataFrame Preview:
                                          Input_JSON Bio_Synthetic
0  {"Name": "Bui Danh Luu", "Name_Other": ["Quoc ...              
1  {"Name": "Bui Thien Ngo", "Name_Other": [], "B...              
2  {"Name": "Cao Si Kiem", "Name_Other": ["Cao Sy...              
3  {"Name": "Chu Tuan Nha", "Name_Other": [], "Bi...              
4  {"Name": "Dam Quang Trung", "Name_Other": ["Da...              

📌 Processing batch...

🔹 Generating biography 1/4...
📌 Proposer 1 Output (First 300 chars): **Bùi Đánh Lư**

**Early Life and Education**
Bùi Đánh Lư was born in Thanh Thùy, Phu Tho Province, Vietnam in 1935. His early life and family background remain relatively obscure. He obtained his Doc
📌 Proposer 2 Output (First 300 chars): Bui Danh Luu (Quoc Linh)

Bui Danh Luu, better known by his pen name Quoc Linh, was a Vietnamese politician who played a significant role in the country's infrastructure development during the late 20

🔹 Generating biography 2/4...
📌 Pr

You seem to be using the pipelines sequentially on GPU. In order to maximize efficiency please use a dataset



📌 Processing batch...

🔹 Generating biography 1/4...
📌 Proposer 1 Output (First 300 chars): ### Do Muoi

**Do Muoi**, born **in 1917** in Hanoi, was a prominent Vietnamese communist leader who played significant roles in the political landscape of Vietnam, particularly during the Vietnam War
📌 Proposer 2 Output (First 300 chars): Do Muoi (Nguyen Duy Cong)

Do Muoi (Nguyen Duy Cong), also known as Nguyen Duy Cong, was born in 1917 in Hanoi, Vietnam. His early life and education are not well-documented.

In 1941, Do Muoi was arr

🔹 Generating biography 2/4...
📌 Proposer 1 Output (First 300 chars): **Do Nguyen Phuong**

**Born:** Hanoi, 1937

**Died:** 2008

**Education:**
- **Hanoi Medical University**, 1955–1960 (University degree in Medicine)
- **USSR Academy of Social Sciences**, 1980–1984 (
📌 Proposer 2 Output (First 300 chars): Do Nguyen Phuong: A Life of Service and Leadership in Vietnam's Modern Era

Do Nguyen Phuong was born in 1937 in Hanoi, Vietnam, where he would go on to lea