# Install Dependencies and Import Modules

Install the Python dependencies with pip, then build AutoGPTQ from source by cloning its repository.

In [None]:
!pip install transformers accelerate einops sentencepiece

In [None]:
!git clone https://github.com/PanQiWei/AutoGPTQ
!pip install ./AutoGPTQ/
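
Before importing, it can help to confirm that the packages installed above can be found. This is a minimal sketch using only the standard library. The helper name `missing_packages` is hypothetical and not part of any of these libraries. The list assumes that AutoGPTQ installs under the module name `auto_gptq`, as the import cell in this notebook suggests.

```python
import importlib.util


def missing_packages(names):
    """Return the top-level module names that cannot be found on this system."""
    return [name for name in names if importlib.util.find_spec(name) is None]


# Module names as they are imported, not necessarily the pip package names.
required = ["torch", "transformers", "accelerate", "einops", "sentencepiece", "auto_gptq"]
print("Missing:", missing_packages(required) or "none")
```

If anything is listed as missing, re-running the install cells (and restarting the runtime) is usually the first thing to try.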

In [None]:
import torch
from transformers import AutoTokenizer
from auto_gptq import AutoGPTQForCausalLM

# Load the Model and Tokenizer

This may take a while. Loading requires around 3.5 GB of RAM and 7.5 GB of GPU memory, and it uses 28 GB of disk space.
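
Since the disk requirement is substantial, a quick check of free space before loading may save a failed run. This is a rough sketch with the standard library. The helper names `free_disk_gb` and `has_room` are hypothetical, and the path `"."` is an assumption: the model files may be cached on a different filesystem (for example under the home directory), in which case that path should be passed instead.

```python
import shutil


def free_disk_gb(path="."):
    """Free space on the filesystem containing `path`, in GB (10**9 bytes)."""
    return shutil.disk_usage(path).free / 1e9


def has_room(required_gb, path="."):
    """True if at least `required_gb` GB are free on the filesystem of `path`."""
    return free_disk_gb(path) >= required_gb


# 28 is the disk figure quoted in this notebook.
print(f"Free disk: {free_disk_gb():.1f} GB; enough for 28 GB: {has_room(28)}")
```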

In [None]:
model_path = "TheBloke/WizardLM-Uncensored-Falcon-7B-GPTQ"

In [None]:
tokenizer = AutoTokenizer.from_pretrained(model_path, use_fast=False)

In [None]:
model = AutoGPTQForCausalLM.from_quantized(
    model_path,
    device="cuda:0",
    use_triton=False,
    use_safetensors=True,
    torch_dtype=torch.float32,
    trust_remote_code=True,
)

Explicitly passing a `revision` is encouraged when loading a model with custom code to ensure no malicious code has been contributed in a newer revision.


# Your Prompt

You only need to change the prompt below.

In [None]:
prompt = "Describe a painting of a falcon hunting a llama in a very detailed way." #@param
prompt_template = f"### Instruction: {prompt}\n### Response:"
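
To try several prompts without editing the cell each time, the same template can be wrapped in a small function. This is only a sketch. `build_prompt` is a hypothetical helper name, and stripping surrounding whitespace from the instruction is an added assumption, not something the original cell does.

```python
def build_prompt(instruction: str) -> str:
    """Wrap an instruction in the Instruction/Response template used in this notebook."""
    return f"### Instruction: {instruction.strip()}\n### Response:"


for text in ["Describe a falcon.", "Describe a llama."]:
    print(repr(build_prompt(text)))
```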

The next cell tokenizes the prompt, moves it to the GPU, and samples up to 256 new tokens:

In [None]:
tokens = tokenizer(prompt_template, return_tensors="pt").input_ids.to("cuda:0")
output = model.generate(input_ids=tokens, max_new_tokens=256, do_sample=True, temperature=0.8)

The attention mask and the pad token id were not set. As a consequence, you may observe unexpected behavior. Please pass your input's `attention_mask` to obtain reliable results.
Setting `pad_token_id` to `eos_token_id`:2 for open-end generation.
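
The warning above appears because only `input_ids` was passed to `generate`. One way to address it is sketched below, assuming the tokenizer returns an `attention_mask` alongside `input_ids` (the usual behavior for Hugging Face tokenizers) and that `generate` forwards these keyword arguments to the underlying model. `generate_with_mask` is a hypothetical helper name. `model` and `tokenizer` are expected to be the objects loaded earlier in this notebook.

```python
def generate_with_mask(model, tokenizer, prompt_text, device="cuda:0", **gen_kwargs):
    """Tokenize `prompt_text` and call `model.generate` with an explicit attention mask."""
    encoded = tokenizer(prompt_text, return_tensors="pt")
    return model.generate(
        input_ids=encoded["input_ids"].to(device),
        attention_mask=encoded["attention_mask"].to(device),
        # Setting the pad token explicitly mirrors what the warning says is done by default.
        pad_token_id=tokenizer.eos_token_id,
        **gen_kwargs,
    )


# Usage with the objects defined in this notebook (not executed here):
# output = generate_with_mask(model, tokenizer, prompt_template,
#                             max_new_tokens=256, do_sample=True, temperature=0.8)
```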


In [None]:
# Print the generated text
print(tokenizer.decode(output[0]))

### Instruction: Describe a painting of a falcon hunting a llama in a very detailed way.
### Response:The painting depicts a falcon perched on a rocky outcropping, with a mountainous landscape stretching out behind it. The falcon is perched on a dead llama carcass, its beady black eyes scanning the horizon for prey. The llama's fur is a dull brown color, and its eyes are half-closed in death.
The falcon's feathers are ruffled in the wind, and its wings are spread wide, ready for flight. Its beak is sharp and curved, and its talons are poised to strike. The falcon's head is turned towards the llama, its gaze fierce and determined.
The falcon's body is sleek and muscular, and its wingspan is immense. A long, curved talon is extended from one of its wings, ready to grab hold of its prey. The falcon's feathers are a mix of shades of brown and black, blending seamlessly with the rocky outcropping behind it.
In the distance, the peaks of the mountains loom large, their snow-capped tops glist
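
The decoded text echoes the prompt before the model's answer. To keep only the answer, the string can be split at the response marker. This is a sketch in plain Python. `extract_response` is a hypothetical helper, and the end-of-sequence string is left as an optional argument because its exact value depends on the tokenizer (passing `tokenizer.eos_token` is one option).

```python
def extract_response(decoded, marker="### Response:", eos_token=None):
    """Return the text after the first response marker, without a trailing EOS token."""
    _, found, tail = decoded.partition(marker)
    # Fall back to the full string if the marker is absent.
    text = (tail if found else decoded).strip()
    if eos_token and text.endswith(eos_token):
        text = text[: -len(eos_token)].rstrip()
    return text


print(extract_response("### Instruction: Say hi.\n### Response:Hi!"))  # prints: Hi!
```

With the notebook's variables, this would be called as `extract_response(tokenizer.decode(output[0]), eos_token=tokenizer.eos_token)`.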