# Running Qwen1.5 on Google Colab

The Qwen1.5 models run on Google Colab without any special setup. Here, we provide a simple example of running Qwen1.5 on a free T4 GPU.
First, make sure that ``transformers>=4.37.0`` is installed.


In [1]:
!pip install "transformers>=4.37.0"
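
After the install cell finishes, it can be worth confirming that the running kernel actually sees a new enough version. The following is a minimal sketch using only the standard library. `meets_minimum` is a hypothetical helper written for this example, and it compares only the leading numeric components of each version string, so a pre-release such as `4.37.0.dev0` is treated as `4.37.0`.

```python
import re
from importlib import metadata


def _numeric_parts(version: str) -> list:
    # Keep only the leading "N.N.N" portion, e.g. "4.37.0.dev0" -> [4, 37, 0].
    match = re.match(r"\d+(?:\.\d+)*", version)
    return [int(part) for part in match.group(0).split(".")] if match else []


def meets_minimum(installed: str, minimum: str) -> bool:
    """Return True if `installed` is at least `minimum` (numeric parts only)."""
    a, b = _numeric_parts(installed), _numeric_parts(minimum)
    # Pad the shorter list with zeros so "4.37" compares equal to "4.37.0".
    width = max(len(a), len(b))
    a += [0] * (width - len(a))
    b += [0] * (width - len(b))
    return a >= b


try:
    installed_version = metadata.version("transformers")
except metadata.PackageNotFoundError:
    installed_version = None

if installed_version is None:
    print("transformers is not installed in this environment")
elif meets_minimum(installed_version, "4.37.0"):
    print(f"transformers {installed_version} is new enough")
else:
    print(f"transformers {installed_version} is older than 4.37.0; upgrade it")
```

If the check reports an older version right after an upgrade, restarting the Colab runtime usually makes the new version visible.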

Next, you can try generating some text with ``transformers``.
Free T4 instances have limited RAM and GPU memory, so we test a small version of the Qwen1.5 models here. To try larger models, we suggest selecting a more powerful GPU instance.
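
As a rough way to judge whether a given model size fits, one can compare the memory needed for the weights alone with what the GPU reports. The sketch below is a back-of-the-envelope estimate, not a measurement. `estimate_weight_memory_gib` is a hypothetical helper, the parameter counts are the nominal figures implied by the model names (sizes other than 0.5B are included only for comparison), and activations, the KV cache, and framework overhead are ignored, so real usage is higher.

```python
def estimate_weight_memory_gib(num_params: float, bytes_per_param: int) -> float:
    """Memory needed to hold the weights alone, in GiB."""
    return num_params * bytes_per_param / 1024**3


# Nominal parameter counts taken from the model names (assumption: the true
# counts differ somewhat, for example because of the embedding layers).
nominal_sizes = {"0.5B": 0.5e9, "1.8B": 1.8e9, "7B": 7e9}

for name, num_params in nominal_sizes.items():
    fp32 = estimate_weight_memory_gib(num_params, 4)  # 4 bytes per float32 weight
    fp16 = estimate_weight_memory_gib(num_params, 2)  # 2 bytes per float16 weight
    print(f"Qwen1.5-{name}: ~{fp32:.1f} GiB in float32, ~{fp16:.1f} GiB in float16")

# Optionally compare against the GPU actually attached to the runtime.
try:
    import torch
except ImportError:
    torch = None

if torch is not None and torch.cuda.is_available():
    props = torch.cuda.get_device_properties(0)
    print(f"{props.name}: {props.total_memory / 1024**3:.1f} GiB total GPU memory")
else:
    print("No CUDA GPU visible; skipping the device query")
```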

In [None]:
from transformers import AutoModelForCausalLM, AutoTokenizer

device = "cuda"
model_name = "Qwen/Qwen1.5-0.5B-Chat"

# Download the checkpoint, move the weights to the GPU, and load the matching tokenizer.
model = AutoModelForCausalLM.from_pretrained(model_name).to(device)
tokenizer = AutoTokenizer.from_pretrained(model_name)

prompt = "Give me a short introduction to the Qwen model."
messages = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": prompt}
]
text = tokenizer.apply_chat_template(
    messages,
    tokenize=False,
    add_generation_prompt=True
)
model_inputs = tokenizer([text], return_tensors="pt").to(device)
# Pass input_ids and attention_mask together so generate() does not have to infer the mask.
generated_ids = model.generate(
    **model_inputs,
    max_new_tokens=512
)
# generate() returns the prompt followed by the new tokens; keep only the new tokens.
generated_ids = [
    output_ids[len(input_ids):] for input_ids, output_ids in zip(model_inputs.input_ids, generated_ids)
]

response = tokenizer.batch_decode(generated_ids, skip_special_tokens=True)[0]
print(response)

The response should be something like ``Qwen is a large language model developed by Alibaba Cloud. It was designed to be powerful, efficient, and user-friendly, allowing users to interact with it using natural language processing techniques. Qwen uses advanced machine learning algorithms and a combination of deep neural networks to generate text that is both human-like and grammatically correct. With its ability to understand multiple languages, it can be used in various fields such as customer service, education, and entertainment.``.
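
The list comprehension that runs just before decoding relies on `generate` returning each prompt followed by its continuation, so slicing off the first `len(input_ids)` positions leaves only the new tokens. Here is a toy illustration with plain Python lists; the token IDs are made up for the example, and `strip_prompt` is a hypothetical helper name.

```python
def strip_prompt(batch_input_ids, batch_output_ids):
    """Remove the echoed prompt from the front of each generated sequence."""
    return [
        output_ids[len(input_ids):]
        for input_ids, output_ids in zip(batch_input_ids, batch_output_ids)
    ]


prompt_ids = [[101, 7, 42]]            # made-up IDs standing in for the prompt
generated = [[101, 7, 42, 900, 901]]   # prompt echoed back, then two new tokens

print(strip_prompt(prompt_ids, generated))  # [[900, 901]]
```

Without this step, the decoded string would begin with the system and user messages instead of the model's reply alone.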