In [1]:
from transformers import AutoTokenizer, AutoModelForCausalLM
import torch

The model initialization is taken from https://github.com/lm-sys/FastChat/blob/0e958b852a14f4bef5f0e9d7a5e7373477329cf2/fastchat/serve/inference.py#L81

The parameter

```
torch_dtype=torch.float16
```

ensures that 16-bit floating-point precision is used instead of 32-bit. At 2 bytes per parameter, the weights of a 7-billion-parameter model take roughly 14 GB, so in theory the model fits in less than 16 GB of GPU RAM. In practice, 16 GB was sometimes not enough, perhaps because the inputs are loaded onto the GPU as well, so move up to two T4s or otherwise raise total GPU RAM above 16 GB. Alternatively, there may be a way to clear the GPU RAM before generating to make space.
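
The memory estimate above can be checked with a quick back-of-the-envelope calculation. This is only a sketch: the parameter count of 7e9 is the nominal "7B" figure, not a measured value, and the helper `weight_memory_gib` is a hypothetical name introduced here. It counts the weights only and ignores activations, the KV cache, and CUDA overhead, which is probably where the extra memory goes in practice.

```python
def weight_memory_gib(num_params: float, bytes_per_param: int) -> float:
    """Approximate memory taken by the model weights alone, in GiB."""
    return num_params * bytes_per_param / 1024**3


NUM_PARAMS = 7e9  # assumption: nominal size of a "7B" model

fp32_gib = weight_memory_gib(NUM_PARAMS, 4)  # float32 uses 4 bytes per parameter
fp16_gib = weight_memory_gib(NUM_PARAMS, 2)  # float16 uses 2 bytes per parameter

print(f"float32 weights: {fp32_gib:.1f} GiB")
print(f"float16 weights: {fp16_gib:.1f} GiB")
```

Halving the precision halves the weight footprint, but the float16 figure already sits close to the 16 GB of a single T4, which leaves little headroom for long prompts.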

In [2]:
tokenizer = AutoTokenizer.from_pretrained("/home/jupyter/vicuna-7b", use_fast=False)
model = AutoModelForCausalLM.from_pretrained("/home/jupyter/vicuna-7b", low_cpu_mem_usage=True, 
                                             torch_dtype=torch.float16,
                                             device_map="auto"
                                            )

The FastChat code does not appear to set many generation parameters besides temperature. However, it may be worth trying the original LLaMA sampling parameters, which are 

```
do_sample=True, top_p=0.95, temperature=0.8
```

The temperature controls the stochasticity, or "creativity", of the model's output. A higher temperature makes the sampling more random, while a lower temperature makes it more deterministic.
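
To make the effect of `temperature` and `top_p` concrete, here is a small pure-Python sketch on a toy set of logits. The function names and the logits are made up for illustration, and the nucleus filter is a simplified version of the idea, not the exact code that `transformers` runs internally.

```python
import math


def softmax_with_temperature(logits, temperature):
    """Divide the logits by the temperature, then apply a softmax."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def top_p_filter(probs, top_p):
    """Keep the most likely tokens until their mass reaches top_p, then renormalize."""
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    kept, mass = [], 0.0
    for i in order:
        kept.append(i)
        mass += probs[i]
        if mass >= top_p:
            break
    filtered = [0.0] * len(probs)
    for i in kept:
        filtered[i] = probs[i] / mass
    return filtered


logits = [3.0, 1.0, 0.0, -2.0]  # toy values, not taken from a real model

cold = softmax_with_temperature(logits, 0.5)  # sharper: favors the top token
warm = softmax_with_temperature(logits, 1.5)  # flatter: more randomness
nucleus = top_p_filter(softmax_with_temperature(logits, 0.8), 0.95)

print("T=0.5:", [round(p, 3) for p in cold])
print("T=1.5:", [round(p, 3) for p in warm])
print("T=0.8, top_p=0.95:", [round(p, 3) for p in nucleus])
```

With the lower temperature almost all of the probability lands on the first token, while the higher temperature spreads it out. The nucleus filter then zeroes out the unlikely tail before sampling.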

In [3]:
def generate(prompt, max_new_tokens=1024, temperature=0.7):
    torch.cuda.empty_cache()
    # Create the prompt according to the pre-trained format: see https://github.com/lm-sys/FastChat/blob/0e958b852a14f4bef5f0e9d7a5e7373477329cf2/fastchat/conversation.py#L187
    system = "A chat between a curious user and an artificial intelligence assistant. The assistant gives helpful, detailed, and polite answers to the user's questions. "
    template = system + "USER: %s "
    input_start = template % prompt
    input_final = input_start + "ASSISTANT:"
    
    inputs = tokenizer(input_final, return_tensors="pt").to("cuda")
    # temperature only takes effect when sampling is enabled, hence do_sample=True
    outputs = model.generate(**inputs, max_new_tokens=max_new_tokens, 
                             do_sample=True, temperature=temperature)
    result = tokenizer.batch_decode(outputs, skip_special_tokens=True, spaces_between_special_tokens=False)
    
    # A causal LM returns the prompt followed by the completion, so slice off the prompt text (everything before "ASSISTANT:").
    
    return result[0][len(input_start):]
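
Slicing the decoded string by the character length of the prompt works only if decoding reproduces the prompt exactly. A possibly more robust alternative is to drop the prompt at the token level before decoding, since `generate` on a decoder-only model returns the prompt tokens followed by the new ones. The helper below is a sketch with a hypothetical name, `new_token_ids`, and it has not been run against this model.

```python
def new_token_ids(output_ids, prompt_length):
    """Return only the ids generated after the prompt.

    output_ids: 1-D sequence of token ids (prompt followed by completion).
    prompt_length: number of tokens in the prompt.
    """
    return output_ids[prompt_length:]


# Toy check with plain lists standing in for tensors.
example_output = [1, 15, 27, 33, 101, 102, 103]
print(new_token_ids(example_output, 4))

# Hypothetical use inside generate(), reusing the notebook's tokenizer and model:
#   prompt_length = inputs["input_ids"].shape[1]
#   completion_ids = new_token_ids(outputs[0], prompt_length)
#   text = tokenizer.decode(completion_ids, skip_special_tokens=True)
```

Note that this variant also drops the leading "ASSISTANT:" marker, because that marker is part of the prompt tokens.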

In [4]:
print(generate("How to make a BLT?"))

ASSISTANT: To make a BLT, you will need the following ingredients:

* 2 slices of bread
* 8 large bacon strips, cooked and crumbled
* 8 large lettuce leaves
* 8 slices of tomato
* 2 tablespoons of mayonnaise

Instructions:

1. Toast the 2 slices of bread until they are golden brown.
2. In a bowl, mix together the crumbled bacon and mayonnaise.
3. Lay one slice of bread on a plate and spoon a small amount of the bacon mixture onto the bread.
4. Top the bacon mixture with a few lettuce leaves, then a few slices of tomato.
5. Repeat the process with the second slice of bread, placing the bacon mixture on top of the bread.
6. Cut the sandwich into diagonal slices and serve.


In [5]:
article = """Alzheimer’s disease treatment stocks are focused on Alzheimer’s disease, a degenerative brain disorder that results in declining memory and thinking skills and typically affects people in their mid-60s.According to the Alzheimer’s Association, neurons in other areas of the brain also begin to deteriorate as Alzheimer’s disease gets worse, resulting in the loss of basic human functions and overall cognitive impairment.This condition affects more than 6.7 million people in the US alone; it’s also the most common form of dementia and is the sixth leading cause of death in America. Treatments are available to alleviate Alzheimer’s disease symptoms, but because they do not affect the underlying causes of this neurodegenerative disease, they’re only a bandaid solution.Therapies approved by the US Food and Drug Administration (FDA) include: rivastigmine by Novartis (NYSE:NVS); galantamine, developed by Janssen, a division of Johnson & Johnson (NYSE:JNJ); donepezil by Pfizer (NYSE:PFE); and AbbVie's (NYSE:ABBV) memantine.Since there is no cure for Alzheimer’s disease, death is often the result for patients as the ailment devastates the brain. And unfortunately, Alzheimer’s disease is rising in prevalence — a report from Research and Markets suggests that the global Alzheimer’s disease treatment market will be worth a significant US$9.64 billion by 2028 as more patients need treatment, and as more investments are made in biomarkers for diagnosis and drug development.Here the Investing News Network takes a look at a few of the biggest Alzheimer’s disease treatment stocks on the NASDAQ. Companies are listed in order of market cap from largest to smallest, and all data was current as of April 19, 2023.. 
Biogen (NASDAQ:BIIB) Company Profile.Market cap: US$42.27 billion; current share price: US$292.80.Massachusetts-based Biogen, a pioneer in the field of neuroscience, is developing, manufacturing and marketing therapies aimed at treating serious neurological, neurodegenerative, autoimmune and rare diseases.The global biotechnology firm’s research areas include Alzheimer's disease and dementia. However, Biogen’s launch of its FDA-approved Alzheimer’s disease drug Aduhelm faced a lot of pushback in 2022, both from the market and from Congress, over what was viewed as a hasty fast-track approval process and exorbitant costs to patients.In early 2023, Biogen is looking to give it another go, this time with LEQEMBI (lecanemab-irmb) for the treatment of Alzheimer's disease, which the FDA has approved under its accelerated approval pathway. The drug was jointly developed by Biogen and Tokyo-based pharmaceutical company Eisai (OTC Pink:ESALF,TSE:4523). Intra-Cellular Therapies (NASDAQ:ITCI).Company Profile.Market cap: US$5.96 billion; current share price: US$62.59.As its name suggests, biopharma company Intra-Cellular Therapies is taking an intracellular approach to developing therapies for patients with complex psychiatric and neurologic diseases, including Parkinson’s disease and Alzheimer’s disease.Intra-Cellular Therapies is currently evaluating ITI-1284-ODT-SL, a deuterated form of lumateperone that comes in an oral disintegrating tablet; ITI-1284-ODT-SL is in Phase 1 studies, including drug-drug interaction studies. The company expects to commence Phase 2 clinical trials in agitation in patients with probable Alzheimer’s disease in 2023. 3. Axsome Therapeutics (NASDAQ:AXSM).Company Profile.Market cap: US$3.16 billion; current share price: US$72.64.New York-based Axsome Therapeutics is developing and commercializing a portfolio of novel therapies for disorders of the central nervous system with limited treatment options. 
Products currently on the market from the biopharmaceutical company include Auvelity, a rapid-acting oral antidepressant for the treatment of major depressive disorder; and Sunosi, a once-daily prescription medicine to treat excessive daytime sleepiness due to obstructive sleep apnea or narcolepsy.Axsome’s product pipeline includes AXS-05, a NMDA receptor antagonist with multimodal activity. It has breakthrough therapy designation from the FDA and is currently in Phase 3 clinical studies for the treatment of Alzheimer’s disease agitation. """

response = generate("You are an equity researcher specialized in the health and pharmaceutical industry. Write a summary for the following article. Focus on the information that will impact investment decisions on the companies mentioned.\n" + article)
print(response)