In [1]:
import torch
import transformers

from transformers import LlamaForCausalLM, LlamaTokenizer

In [2]:
model_dir = "/data/yingfei/models/llm/llama2/llama/llama-2-7b-hf"
model = LlamaForCausalLM.from_pretrained(model_dir)

In [3]:
tokenizer = LlamaTokenizer.from_pretrained(model_dir)

In [4]:
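# Note: `model` is already instantiated above. Depending on the transformers
# version, `torch_dtype` and `device_map` may only take effect when `model`
# is given as a path or name rather than as a loaded object.
pipeline = transformers.pipeline(
    "text-generation",
    model=model,
    tokenizer=tokenizer,
    torch_dtype=torch.float16,
    device_map="auto",
)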
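
Both generation cells below repeat the same sampling arguments. A minimal sketch of collecting them in one place, assuming a hypothetical helper named `make_generation_kwargs` (it is not part of transformers). It uses `max_new_tokens`, a standard generation argument that bounds only the continuation, whereas `max_length` also counts the prompt tokens.

```python
def make_generation_kwargs(eos_token_id, max_new_tokens=256, top_k=10):
    """Return sampling arguments for a text-generation call (hypothetical helper)."""
    return {
        "do_sample": True,
        "top_k": top_k,
        "num_return_sequences": 1,
        "eos_token_id": eos_token_id,
        # Limits generated tokens only; the prompt length is not counted.
        "max_new_tokens": max_new_tokens,
    }


# Placeholder id for illustration; in the notebook this would be tokenizer.eos_token_id.
kwargs = make_generation_kwargs(eos_token_id=2, max_new_tokens=300)
print(kwargs)
```

The resulting dictionary could then be unpacked into the call, for example `pipeline(prompt, **kwargs)`.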

In [5]:
sequences = pipeline(
    'I have tomatoes, basil and cheese at home. What can I cook for dinner?\n',
    do_sample=True,
    top_k=10,
    num_return_sequences=1,
    eos_token_id=tokenizer.eos_token_id,
    max_length=400,
)

for seq in sequences:
    print(seq["generated_text"])

Truncation was not explicitly activated but `max_length` is provided a specific value, please use `truncation=True` to explicitly truncate examples to max length. Defaulting to 'longest_first' truncation strategy. If you encode pairs of sequences (GLUE-style) with the tokenizer you can select this strategy more precisely by providing a specific strategy to `truncation`.
Setting `pad_token_id` to `eos_token_id`:2 for open-end generation.


I have tomatoes, basil and cheese at home. What can I cook for dinner?
You can make this pasta dish! And it takes 20-30 minutes to make.
This one looks soooo good! I can’t wait to give it a try.
Oh wow. I love this recipe and the photos! It’s so hard to find great recipes with healthy options, so thank you! Can’t wait to make this!
I’m so excited to try this recipe. I’m always on the look-out for something to cook that I can make in under 30 minutes but that’s actually a good dinner! Thanks.
I’m so excited to try this recipe! I always need a new recipe I can use in my busy schedule! Thanks!
This looks amazingly delicious! My family and I will definitely be trying this one out.
I’m trying to get my husband and kids to eat more veggies so I love this recipe!
I’m going to try this recipe. I think it would be perfect for my family!
This pasta dish looks absolutely incredible! I can’t wait to try it!
This looks delicious! My daughter and I love spaghetti and this looks like the perfect reci
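
The generated text above begins with the prompt itself, because the text-generation pipeline returns the prompt plus the continuation by default. Passing `return_full_text=False` to the pipeline call is one way to get only the continuation. The sketch below does the same thing after the fact with a hypothetical helper, `strip_prompt`, using a shortened stand-in for the real output.

```python
def strip_prompt(prompt, generated_text):
    """Return only the continuation if generated_text begins with prompt (hypothetical helper)."""
    if generated_text.startswith(prompt):
        return generated_text[len(prompt):]
    # Leave the text unchanged if the prompt is not an exact prefix.
    return generated_text


prompt = "I have tomatoes, basil and cheese at home. What can I cook for dinner?\n"
# Stand-in for seq["generated_text"]: the prompt followed by the first generated line.
generated = prompt + "You can make this pasta dish! And it takes 20-30 minutes to make."
continuation = strip_prompt(prompt, generated)
print(continuation)
```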

In [10]:
# Test the cancer drug recommendation prompt.
sequences = pipeline(
    'Think step by step and decide the best drug option for the cell line with given mutations: [Drug Name], [Reasoning].\
Drug 1: The drug is BRYOSTATIN-1. The drug SMILES structure is CCCC=CC=CC(=O)OC1C(=CC(=O)OC)CC2CC(OC(=O)CC(CC3CC(C(C(O3)(CC4CC(=CC(=O)OC)CC(O4)C=CC(C1(O2)O)(C)C)O)(C)C)OC(=O)C)O)C(C)O. Drug target is Unknown. Drug target pathway is Unknown.\
Drug 2: The drug is KIN001-135. The drug SMILES structure is COC1=C(C=C2C(=C1)N=CN2C3=CC(=C(S3)C#N)OCC4=CC=CC=C4S(=O)(=O)C)OC. Drug target is IKK. Drug target pathway is Other, kinases.\
Drug 3: The drug is GSK2606414. The drug SMILES structure is CN1C=C(C2=C(N=CN=C21)N)C3=CC4=C(C=C3)N(CC4)C(=O)CC5=CC(=CC=C5)C(F)(F)F. Drug target is PERK. Drug target pathway is Metabolism.\
The mutations of the cell line are NOTCH1, NOTCH3, PIK3R1, PPP2R1A, TP53, TSC2, WHSC1L1.\
What is the best drug option? Why?',
    do_sample=True,
    top_k=10,
    num_return_sequences=1,
    eos_token_id=tokenizer.eos_token_id,
    max_length=1000,
)

for seq in sequences:
    # The output is not useful: the model repeats the prompt instead of answering.
    print(seq["generated_text"])

Setting `pad_token_id` to `eos_token_id`:2 for open-end generation.


Think step by step and decide the best drug option for the cell line with given mutations: [Drug Name], [Reasoning].Drug 1: The drug is BRYOSTATIN-1. The drug SMILES structure is CCCC=CC=CC(=O)OC1C(=CC(=O)OC)CC2CC(OC(=O)CC(CC3CC(C(C(O3)(CC4CC(=CC(=O)OC)CC(O4)C=CC(C1(O2)O)(C)C)O)(C)C)OC(=O)C)O)C(C)O. Drug target is Unknown. Drug target pathway is Unknown.Drug 2: The drug is KIN001-135. The drug SMILES structure is COC1=C(C=C2C(=C1)N=CN2C3=CC(=C(S3)C#N)OCC4=CC=CC=C4S(=O)(=O)C)OC. Drug target is IKK. Drug target pathway is Other, kinases.Drug 3: The drug is GSK2606414. The drug SMILES structure is CN1C=C(C2=C(N=CN=C21)N)C3=CC4=C(C=C3)N(CC4)C(=O)CC5=CC(=CC=C5)C(F)(F)F. Drug target is PERK. Drug target pathway is Metabolism.The mutations of the cell line are NOTCH1, NOTCH3, PIK3R1, PPP2R1A, TP53, TSC2, WHSC1L1.What is the best drug option? Why?
Drug 1: The drug is BRYOSTATIN-1. The drug SMILES structure is CCCC=CC=CC(=O)OC1C(=CC(=O)OC)CC2CC(OC(=O)CC(CC3CC(C(C(O3)(CC4CC(=CC(=O)OC)CC(O4)C=CC(
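
In the echoed prompt above, the sentences run together ("[Reasoning].Drug 1:", "Unknown.Drug 2:") because a backslash at the end of a line inside a string literal joins the lines without inserting a newline or a space. A minimal sketch of building the same prompt from structured fields with explicit newlines follows. The helper name `build_drug_prompt` and the dictionary keys are assumptions for illustration; the drug data is copied from the cell above.

```python
def build_drug_prompt(drugs, mutations):
    """Assemble the recommendation prompt, one statement per line (hypothetical helper)."""
    lines = [
        "Think step by step and decide the best drug option for the cell line "
        "with given mutations: [Drug Name], [Reasoning]."
    ]
    for i, drug in enumerate(drugs, start=1):
        lines.append(
            f"Drug {i}: The drug is {drug['name']}. "
            f"The drug SMILES structure is {drug['smiles']}. "
            f"Drug target is {drug['target']}. "
            f"Drug target pathway is {drug['pathway']}."
        )
    lines.append(f"The mutations of the cell line are {', '.join(mutations)}.")
    lines.append("What is the best drug option? Why?")
    # Explicit newlines keep the statements from running together.
    return "\n".join(lines)


drugs = [
    {
        "name": "BRYOSTATIN-1",
        "smiles": "CCCC=CC=CC(=O)OC1C(=CC(=O)OC)CC2CC(OC(=O)CC(CC3CC(C(C(O3)(CC4CC(=CC(=O)OC)CC(O4)C=CC(C1(O2)O)(C)C)O)(C)C)OC(=O)C)O)C(C)O",
        "target": "Unknown",
        "pathway": "Unknown",
    },
    {
        "name": "KIN001-135",
        "smiles": "COC1=C(C=C2C(=C1)N=CN2C3=CC(=C(S3)C#N)OCC4=CC=CC=C4S(=O)(=O)C)OC",
        "target": "IKK",
        "pathway": "Other, kinases",
    },
    {
        "name": "GSK2606414",
        "smiles": "CN1C=C(C2=C(N=CN=C21)N)C3=CC4=C(C=C3)N(CC4)C(=O)CC5=CC(=CC=C5)C(F)(F)F",
        "target": "PERK",
        "pathway": "Metabolism",
    },
]
mutations = ["NOTCH1", "NOTCH3", "PIK3R1", "PPP2R1A", "TP53", "TSC2", "WHSC1L1"]

prompt = build_drug_prompt(drugs, mutations)
print(prompt)
```

This only addresses the formatting of the prompt; it makes no claim about whether the model's answer would improve.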