<a href="https://colab.research.google.com/github/softmurata/colab_notebooks/blob/main/llm/llama_deus.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

Installation

In [None]:
!pip install git+https://github.com/huggingface/transformers.git
!pip install peft gradio accelerate bitsandbytes
!pip install sentencepiece

Load model

In [None]:
import torch
from peft import PeftModel
import transformers
import gradio as gr
from transformers import LlamaTokenizer, LlamaForCausalLM, GenerationConfig

# May not run unless you use an A100 on a Colab Pro plan or higher

BASE_MODEL = "decapoda-research/llama-7b-hf"
# BASE_MODEL = "decapoda-research/llama-13b-hf"
# BASE_MODEL = "decapoda-research/llama-30b-hf"
# BASE_MODEL = "decapoda-research/llama-65b-hf"

# The tokenizer runs on the CPU, so it takes no device_map
tokenizer = LlamaTokenizer.from_pretrained(BASE_MODEL)

LORA_WEIGHTS = "teknium/llama-deus-7b-v3-lora"
# LORA_WEIGHTS ="kunishou/Japanese-Alpaca-LoRA-13b-v0"
# LORA_WEIGHTS = "kunishou/Japanese-Alpaca-LoRA-30b-v0"
# LORA_WEIGHTS = "kunishou/Japanese-Alpaca-LoRA-65b-v0"

if BASE_MODEL == "decapoda-research/llama-7b-hf":
  model_param = "7B"
elif BASE_MODEL == "decapoda-research/llama-13b-hf":
  model_param = "13B"
elif BASE_MODEL == "decapoda-research/llama-30b-hf":
  model_param = "30B"
else:
  model_param = "65B"

if torch.cuda.is_available():
    device = "cuda"
else:
    device = "cpu"

try:
    # torch.backends.mps is missing in older PyTorch builds
    if torch.backends.mps.is_available():
        device = "mps"
except AttributeError:
    pass

if device == "cuda":
    model = LlamaForCausalLM.from_pretrained(
        BASE_MODEL,
        load_in_8bit=True,
        torch_dtype=torch.float16,
        # device_map="auto",
        device_map={'': 0},
    )
    model = PeftModel.from_pretrained(model, LORA_WEIGHTS, torch_dtype=torch.float16, device_map={'': 0},)
elif device == "mps":
    # Index 0 refers to the first CUDA GPU, so map the whole model
    # to the detected device instead
    model = LlamaForCausalLM.from_pretrained(
        BASE_MODEL,
        device_map={"": device},
        torch_dtype=torch.float16,
    )
    model = PeftModel.from_pretrained(
        model,
        LORA_WEIGHTS,
        device_map={"": device},
        torch_dtype=torch.float16,
    )
else:
    model = LlamaForCausalLM.from_pretrained(
        BASE_MODEL,
        device_map={"": device},
        low_cpu_mem_usage=True,
    )
    model = PeftModel.from_pretrained(
        model,
        LORA_WEIGHTS,
        device_map={"": device},
    )


def generate_prompt(instruction, input=None):
    if input:
        return f"""Below is an instruction that describes a task, paired with an input that provides further context. Write a response that appropriately completes the request.
### Instruction:
{instruction}
### Input:
{input}
### Response:"""
    else:
        return f"""Below is an instruction that describes a task. Write a response that appropriately completes the request.
### Instruction:
{instruction}
### Response:"""


model.eval()
if torch.__version__ >= "2":
    model = torch.compile(model)


def evaluate(
    instruction,
    input=None,
    temperature=0.1,
    top_p=0.75,
    top_k=40,
    num_beams=4,
    max_new_tokens=256,
    **kwargs,
):
    prompt = generate_prompt(instruction, input)
    inputs = tokenizer(prompt, return_tensors="pt")
    input_ids = inputs["input_ids"].to(device)
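    # Beam search is used (num_beams > 1, do_sample left at its default of False),
    # so temperature, top_p and top_k only take effect if do_sample=True is passed
    # through **kwargs.
    generation_config = GenerationConfig(
        temperature=temperature,
        top_p=top_p,
        top_k=top_k,
        num_beams=num_beams,
        no_repeat_ngram_size=3,
        **kwargs,
    )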

    with torch.no_grad():
        generation_output = model.generate(
            input_ids=input_ids,
            generation_config=generation_config,
            return_dict_in_generate=True,
            output_scores=True,
            max_new_tokens=max_new_tokens,
        )
    s = generation_output.sequences[0]
    output = tokenizer.decode(s)
    return output.split("### Response:")[1].strip()
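
The prompt template and the response extraction above can be tried without loading the model. The following is a minimal sketch, not part of the original notebook: `build_prompt` mirrors the layout of `generate_prompt`, while `RESPONSE_MARKER`, `build_prompt`, and `extract_response` are hypothetical names introduced here for illustration. Unlike `split("### Response:")[1]`, the helper keeps any text that follows a repeated marker, and it falls back to the whole string when the marker is missing instead of raising `IndexError`.

```python
RESPONSE_MARKER = "### Response:"  # hypothetical constant, same text as in generate_prompt


def build_prompt(instruction, input_text=None):
    """Build an Alpaca-style prompt with the same layout as generate_prompt."""
    if input_text:
        return (
            "Below is an instruction that describes a task, paired with an input "
            "that provides further context. Write a response that appropriately "
            "completes the request.\n"
            f"### Instruction:\n{instruction}\n"
            f"### Input:\n{input_text}\n"
            f"{RESPONSE_MARKER}"
        )
    return (
        "Below is an instruction that describes a task. Write a response that "
        "appropriately completes the request.\n"
        f"### Instruction:\n{instruction}\n"
        f"{RESPONSE_MARKER}"
    )


def extract_response(decoded):
    """Return the text after the first response marker, or the whole string if absent."""
    _, marker, tail = decoded.partition(RESPONSE_MARKER)
    return tail.strip() if marker else decoded.strip()


# Simulate a decoded generation: the prompt followed by the model's continuation.
prompt = build_prompt("Name the capital of France.")
decoded = prompt + " Paris is the capital of France."
print(extract_response(decoded))
```

Because the decoded sequence always starts with the prompt itself, splitting on the marker is what separates the model's answer from the echoed instruction.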


In [5]:
instruction = """
I want you act as a professional scenario writer. 
"""
input = """
Please write 20 minutes presentations about the latest generative AI.
"""

response = evaluate(instruction, input, max_new_tokens=1024)
print(response)

Title: Generative Artificial Intelligence: The Future of Innovation

Synopsis:
In the 21st century, advancements in technology have revolutionized the way we live, work, and interact with one another. Among these innovations, generative artificial intelligence (GAI) has emerged as a game-changing technology, poised to reshape industries and redefine the very essence of what it means to be intelligent. In this presentation, we will explore the latest developments in GAI, examining its potential applications, benefits, and challenges. We will delve into the fascinating world of machine learning, deep learning, and neural networks, uncovering how these cutting-edge technologies are pushing the boundaries of human creativity and ingenuity. Whether you are a seasoned professional, a budding entrepreneur, or simply someone curious about the future of innovation, this presentation is sure to leave you inspired and invigorated by the limitless possibilities that GAI holds for us all.


In [7]:
instruction = """
### Instruction
I want you act as a professional scenario writer. Please write 20 minutes presentations about the latest generative AI.
"""


response = evaluate(instruction, max_new_tokens=2048)
print(response)

Slide 1: Introduction to Generative Artificial Intelligence (GAI)
- Definition of GAI
- Explanation of how GAI works
- Brief overview of its applications

Slides 2-5: Key Features and Capabilities of Generative GAIs
- Discussion of unsupervised and supervised learning
- Description of the use of neural networks and deep learning techniques
- Overview of the ability of GAEs to generate new and creative content
- Example use cases and applications of generative GAI in fields such as art, music, writing, and design.
- Implications and potential future developments in the realm of GGAI. 
- Conclusion and summary of the key features and capabilities of a generative artificial intelligence.
