In [1]:
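import requests


def get_plaintext_wikipedia_page(title):
    """Return the plain-text extract of an English Wikipedia page."""
    url = "https://en.wikipedia.org/w/api.php"
    params = {
        "action": "query",
        "prop": "extracts",
        "explaintext": True,  # ask for plain text instead of HTML
        "titles": title,
        "format": "json",
    }
    response = requests.get(url, params=params, timeout=30)
    response.raise_for_status()  # fail early on HTTP errors
    data = response.json()
    # "pages" is keyed by page ID; a single title yields a single entry.
    page = next(iter(data["query"]["pages"].values()))
    return page["extract"]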
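
The function above assumes the requested title exists. As a minimal sketch (not part of the original notebook), the parsing step could be split into a helper that reports a missing page explicitly. The name `extract_from_query_response` is hypothetical, and the sketch assumes the default response layout, in which a missing title is returned as a page entry carrying a `missing` key.

```python
def extract_from_query_response(data):
    """Return the plain-text extract from a parsed MediaWiki query response.

    Raises LookupError when the page is missing or carries no extract.
    """
    pages = data.get("query", {}).get("pages", {})
    if not pages:
        raise LookupError("response contains no pages")
    # A single requested title yields a single entry, keyed by page ID.
    page = next(iter(pages.values()))
    if "missing" in page:
        raise LookupError(f"page not found: {page.get('title')!r}")
    if "extract" not in page:
        raise LookupError(f"no extract returned for {page.get('title')!r}")
    return page["extract"]
```

With this helper, the last two lines of `get_plaintext_wikipedia_page` could be replaced by `return extract_from_query_response(data)`.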

In [2]:
plaintext_contents = get_plaintext_wikipedia_page("Nuclear_fusion")
print(plaintext_contents[:250], "...")

Nuclear fusion is a reaction in which two or more atomic nuclei, usually deuterium and tritium (hydrogen isotopes), combine to form one or more different atomic nuclei and subatomic particles (neutrons or protons). The difference in mass between the  ...


In [3]:
import torch
from transformers import (
    AutoModelForCausalLM,
    AutoTokenizer,
    pipeline,
    BitsAndBytesConfig,
)

In [4]:
model_id = "microsoft/Phi-3-mini-128k-instruct"

In [5]:
tokenizer = AutoTokenizer.from_pretrained(model_id)

Special tokens have been added in the vocabulary, make sure the associated word embeddings are fine-tuned or trained.


In [6]:
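quantization_config = BitsAndBytesConfig(
    load_in_4bit=True,  # store the weights in 4-bit precision
    bnb_4bit_compute_dtype=torch.float16,  # dtype used for matrix multiplications
    bnb_4bit_quant_type="nf4",  # NormalFloat4 quantization
    bnb_4bit_use_double_quant=True,  # also quantize the quantization constants
    llm_int8_enable_fp32_cpu_offload=True,  # only matters if modules are offloaded to CPU
)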
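
To get a rough sense of what 4-bit loading saves, here is a back-of-the-envelope sketch. It assumes about 3.8 billion parameters for Phi-3-mini (a figure that should be checked against the model card) and counts the weights only. It ignores the quantization constants, any layers left unquantized, activations, and the KV cache, so real usage will be higher. The function name is hypothetical.

```python
def estimate_weight_memory_gb(num_params, bits_per_param):
    """Approximate memory for the weights alone, in gigabytes (10^9 bytes)."""
    return num_params * bits_per_param / 8 / 1e9


# Assumed approximate parameter count for Phi-3-mini; not taken from the notebook.
PHI3_MINI_PARAMS = 3.8e9

fp16_gb = estimate_weight_memory_gb(PHI3_MINI_PARAMS, 16)
nf4_gb = estimate_weight_memory_gb(PHI3_MINI_PARAMS, 4)
print(f"fp16 weights:  ~{fp16_gb:.1f} GB")
print(f"4-bit weights: ~{nf4_gb:.1f} GB")
```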

In [7]:
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    device_map="cuda",
    torch_dtype="auto",
    trust_remote_code=True,
    attn_implementation="flash_attention_2",
    quantization_config=quantization_config,
)

Loading checkpoint shards: 100%|██████████| 2/2 [00:02<00:00,  1.37s/it]


In [8]:
pipe = pipeline(
    "text-generation",
    model=model,
    tokenizer=tokenizer,
)

In [9]:
result = pipe(
    [
        {"role": "system", "content": "You are a helpful nuclear fusion expert."}, 
        {"role": "user", "content": "\n".join([
            "Summarize the following article succinctly in less than 250 words:",
            "---- BEGIN ARTICLE ----",
            plaintext_contents,
            "---- END ARTICLE ----",
        ])}
    ],
    max_new_tokens=300,
)
print(result[0]["generated_text"][-1]["content"])

 Nuclear fusion is the process where atomic nuclei combine to form heavier nuclei, releasing energy due to the difference in nuclear binding energy. It powers stars and synthesizes elements, with hydrogen fusing into helium in the Sun's core. The process requires overcoming the Coulomb barrier, which can be achieved through high temperatures and quantum tunneling. The most promising reactions for energy production are those that are exothermic, involve low atomic number nuclei, and produce neutrons for tritium breeding. The ITER project aims to create a toroidal reactor for controlled fusion, while private companies are developing commercial technologies. Break-even fusion was achieved in 2022, and research continues to overcome challenges like plasma confinement and energy extraction. Fusion offers a potential reduction in carbon footprint and a high energy density compared to chemical reactions.
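
The prompt asks for fewer than 250 words, but `max_new_tokens=300` limits tokens, not words, so the word limit is not enforced by the generation call itself. A minimal sketch for checking the result afterwards, using whitespace splitting as an approximate word count (the helper names are hypothetical):

```python
def count_words(text):
    """Count whitespace-separated words; a rough proxy for a word count."""
    return len(text.split())


def within_word_limit(text, limit=250):
    """Return True if the text has strictly fewer than `limit` words."""
    return count_words(text) < limit
```

In the notebook, this could be called as `within_word_limit(result[0]["generated_text"][-1]["content"])`.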
