# Qwen - Getting Started and Fine-Tuning

Qwen is a series of large language and multimodal models developed by Alibaba Cloud and trained on large multilingual and multimodal datasets. The family covers natural language processing, computer vision, and audio understanding.



These models are capable of performing a wide range of tasks, including:

- Text generation and understanding
- Question answering
- Image captioning and analysis
- Visual question answering
- Audio processing
- Tool use and task planning

The Qwen models are pre-trained on diverse data sources and further refined through post-training on high-quality data.


In this tutorial, we will:

1. Explore the key features of Qwen models, including their multilingual capabilities and multimodal processing abilities.
2. Guide you through the process of accessing and installing Qwen models.
3. Demonstrate practical examples of using Qwen for tasks such as text generation and question answering.
4. Explore fine-tuning Qwen models on custom datasets for specialized applications.

By the end of this tutorial, you'll have a comprehensive understanding of Qwen's capabilities and how to use it for your own projects and research.


## Installation Steps and Getting Started
In this section, we will walk through using the Qwen-7B language model via Hugging Face. We'll cover setting up your environment, logging in to Hugging Face, and running the model.

### Prerequisites
- Python 3.7 or later
- pip (Python package installer)

### Step 1: Install Required Libraries
First, install the necessary libraries:

```
pip install transformers torch huggingface_hub
```
### Step 2: Log in to Hugging Face
To access the Qwen-7B model, log in to your Hugging Face account. Some models have usage restrictions, and logging in lets the Hub confirm that you are allowed to download them. It also gives you access to private models belonging to you or your organization, and it lets Hugging Face monitor API usage and apply rate limits where necessary.

To log in, follow these steps:

1. Go to the Hugging Face website (https://huggingface.co) and sign in or create an account.
2. Go to your profile settings and create an access token.
3. In your terminal, run:
```
huggingface-cli login
```

4. When prompted, enter your access token.


## Getting Started

### Import Packages
Now, we will create a Python file or Jupyter Notebook file where we will start writing our code.

First, we will import two classes from the `transformers` library:

- `AutoModelForCausalLM`: This class automatically selects the appropriate model architecture for the model name you provide, based on that model's configuration.
- `AutoTokenizer`: This class loads the correct tokenizer for the model you're using.

In [None]:
from transformers import AutoModelForCausalLM, AutoTokenizer

### Specify the Model Name
Now, we're specifying which model we want to use. "Qwen/Qwen-7B" refers to the 7 billion parameter version of the Qwen model hosted on the Hugging Face model hub, but feel free to use other Qwen models.


In [None]:
model_name = "Qwen/Qwen-7B"

### Load the Tokenizer
Next, we need to load the tokenizer.


In [None]:
tokenizer = AutoTokenizer.from_pretrained(model_name, trust_remote_code=True)

This line loads the tokenizer associated with the Qwen-7B model. The `trust_remote_code=True` parameter is necessary because Qwen models may require custom code to be executed during loading.

### Load the Model
The next step is to load the model.


In [None]:
model = AutoModelForCausalLM.from_pretrained(model_name, trust_remote_code=True)

This loads the actual Qwen-7B model. Again, we use `trust_remote_code=True` to allow any custom code associated with the model to run.

### Example Test
Once the model is loaded, we can check that it works by generating some text from an input prompt.


In [None]:
input_text = "Once upon a time"
inputs = tokenizer(input_text, return_tensors="pt")

Here, we're preparing the input for the model:
- We define a string `input_text` as our prompt.
- We use the tokenizer to convert this text into a format the model can understand. The `return_tensors="pt"` argument specifies that we want the output in PyTorch tensor format.

Now, we generate the text:


In [None]:
outputs = model.generate(**inputs, max_new_tokens=50)

The `generate` call tells the model to produce text:
- We pass our tokenized `inputs` to the `generate` method.
- `**inputs` unpacks the dictionary returned by the tokenizer.
- `max_new_tokens=50` limits the generation to a maximum of 50 new tokens.

In [None]:
generated_text = tokenizer.decode(outputs[0], skip_special_tokens=True)

This step converts the model's output (a tensor of token IDs) back into readable text:
- `outputs[0]` selects the first (and in this case, only) generated sequence.
- `skip_special_tokens=True` tells the decoder to drop special tokens such as padding or end-of-sequence tokens.

Finally, we print the generated text to see the result.

In [None]:
print(generated_text)

### Notes and Tips
1. The Qwen-7B model is large (7 billion parameters). Ensure your system has sufficient RAM and preferably a GPU for faster processing.
2. If you encounter memory issues, consider using a smaller model.
3. The `trust_remote_code=True` parameter is necessary for Qwen models as they require custom code to run properly.
4. Always review the model's license and usage restrictions on its Hugging Face page.
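
To make the memory note above concrete, here is a rough back-of-the-envelope sketch. It assumes memory is dominated by the weights alone and uses a nominal count of 7 billion parameters. Activations, the generation cache, and framework overhead are ignored, so real usage will be higher.

```python
# Approximate size of the model weights at different numeric precisions.
# Assumption: only weights are counted; runtime overhead is ignored.
BYTES_PER_PARAM = {"float32": 4, "float16": 2, "int8": 1}

def weight_memory_gib(num_params: int, dtype: str) -> float:
    """Return the approximate size of the weights in GiB."""
    return num_params * BYTES_PER_PARAM[dtype] / (1024 ** 3)

num_params = 7_000_000_000  # nominal "7B"; the exact count differs slightly
for dtype in BYTES_PER_PARAM:
    print(f"{dtype}: ~{weight_memory_gib(num_params, dtype):.1f} GiB")
```

If the estimate for your chosen precision is close to or above your available memory, a smaller model is the safer choice.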

You have successfully set up and run the Qwen model! Feel free to experiment with different input texts and explore the capabilities of the model.

## Example Usage

Let's explore some practical examples of using Qwen for text generation and question answering tasks.

### Text Generation

Qwen generates coherent, contextually relevant text from a given prompt. Let's look at some examples:


#### Basic Text Completion

In [None]:
prompt = "The future of artificial intelligence is"
inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=50)  # unpack input_ids and attention_mask
generated_text = tokenizer.decode(outputs[0], skip_special_tokens=True)

print(generated_text)

#### Creative Writing

In [None]:
prompt = "Write a short poem about the changing seasons:"
inputs = tokenizer(prompt, return_tensors="pt")
# temperature only takes effect when sampling is enabled with do_sample=True
outputs = model.generate(**inputs, max_new_tokens=100, do_sample=True, temperature=0.7)
generated_text = tokenizer.decode(outputs[0], skip_special_tokens=True)

print(generated_text)

#### Code Generation

In [None]:
prompt = "Write a Python function to calculate the Fibonacci sequence:"
inputs = tokenizer(prompt, return_tensors="pt")
# A low temperature keeps the sampled code close to the most likely tokens
outputs = model.generate(**inputs, max_new_tokens=200, do_sample=True, temperature=0.2)
generated_text = tokenizer.decode(outputs[0], skip_special_tokens=True)

print(generated_text)

### Question Answering
Qwen can be used to answer a wide range of questions, from factual queries to more open-ended or analytical questions. Here are some examples:


#### Factual Question

In [None]:
question = "What is the capital of France?"
inputs = tokenizer(question, return_tensors="pt")
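outputs = model.generate(**inputs, max_new_tokens=50)
# Decode only the newly generated tokens so the answer does not repeat the question
answer = tokenizer.decode(outputs[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True)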

print(f"Q: {question}\nA: {answer}")


#### Open-ended Question

In [None]:
question = "What are the potential ethical concerns surrounding artificial intelligence?"
inputs = tokenizer(question, return_tensors="pt")
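outputs = model.generate(**inputs, max_new_tokens=200, do_sample=True, temperature=0.7)
# Decode only the newly generated tokens so the answer does not repeat the question
answer = tokenizer.decode(outputs[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True)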

print(f"Q: {question}\nA: {answer}")


These examples demonstrate how Qwen can handle various types of text generation and question answering tasks. You can adjust parameters such as `max_new_tokens` and `temperature` to control the length and randomness of the generated responses. Note that `temperature` only has an effect when sampling is enabled with `do_sample=True`.
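
To build intuition for what `temperature` does, the following sketch applies temperature scaling to a softmax over a few made-up scores. The scores are hypothetical and stand in for the model's raw outputs for three candidate tokens. This illustrates the general technique, not the library's internal code.

```python
import math

def softmax_with_temperature(logits, temperature=1.0):
    """Convert raw scores to probabilities, scaled by a sampling temperature."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    scaled = [x / temperature for x in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]

logits = [2.0, 1.0, 0.1]  # hypothetical scores for three candidate tokens
for t in (0.2, 0.7, 1.0):
    probs = softmax_with_temperature(logits, t)
    print(t, [round(p, 3) for p in probs])
```

Lower temperatures concentrate probability on the top-scoring token, which suits tasks like code generation. Higher temperatures spread it out, which suits creative writing.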

## Advanced Usage and Fine-tuning
Fine-tuning Qwen models allows you to adapt them to specific tasks, potentially improving their performance for your particular use case. This process involves training the pre-trained model on a custom dataset, so that it learns task-specific knowledge while retaining its general language understanding capabilities.

In this section, we'll walk through the process of fine-tuning the Qwen-7B model on a custom dataset. We'll use a parameter-efficient fine-tuning technique (LoRA) to make this process manageable even for large models. In our example, we fine-tune the model to improve its performance on translation tasks and on answering factual questions.


### Prerequisites

Before we begin, make sure you have the following installed:

```
pip install datasets torch accelerate peft
```


### Step 1: Prepare Your Dataset

First, prepare your dataset in JSON format. Each entry should have a "prompt" field and a "completion" field. Save the file as `custom_dataset.json`. We will use this example:


In [1]:
[
  {
    "prompt": "Translate to French: 'Hello, how are you?'",
    "completion": "Bonjour, comment allez-vous?"
  },
  {
    "prompt": "What is the capital of Spain?",
    "completion": "The capital of Spain is Madrid."
  }
]
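
If you prefer to create the file from Python, a minimal sketch follows. It writes the same two records to `custom_dataset.json` and checks that every entry has a non-empty "prompt" and "completion". The `validate` helper is a hypothetical convenience, not part of any library.

```python
import json

# Same two records as above; extend this list with your own examples.
records = [
    {"prompt": "Translate to French: 'Hello, how are you?'",
     "completion": "Bonjour, comment allez-vous?"},
    {"prompt": "What is the capital of Spain?",
     "completion": "The capital of Spain is Madrid."},
]

def validate(records):
    """Raise ValueError if any record lacks a non-empty prompt or completion."""
    for i, rec in enumerate(records):
        for key in ("prompt", "completion"):
            if not isinstance(rec.get(key), str) or not rec[key].strip():
                raise ValueError(f"record {i} has a missing or empty '{key}'")

validate(records)
with open("custom_dataset.json", "w", encoding="utf-8") as f:
    # ensure_ascii=False keeps accented characters readable in the file
    json.dump(records, f, ensure_ascii=False, indent=2)
```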



### Step 2: Set Up the Fine-tuning Script

We now add the following imports:


In [None]:
import torch
from transformers import TrainingArguments, Trainer
from datasets import load_dataset
from peft import LoraConfig, get_peft_model

### Step 3: Load and Preprocess the Dataset
Next, we'll define a function to preprocess our dataset and load it using the Hugging Face `datasets` library:

In [None]:
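def preprocess_function(examples):
    # For causal LM training, prompt and completion form one sequence, and
    # labels must line up with input_ids token for token.
    texts = [f"{p}\n{c}\n" for p, c in zip(examples["prompt"], examples["completion"])]
    # Assumes the tokenizer has a pad token; set one first if yours does not.
    model_inputs = tokenizer(texts, max_length=512, truncation=True, padding="max_length")
    # Copy input_ids as labels, using -100 so padding is ignored by the loss
    model_inputs["labels"] = [
        [tok if keep else -100 for tok, keep in zip(ids, mask)]
        for ids, mask in zip(model_inputs["input_ids"], model_inputs["attention_mask"])
    ]
    return model_inputs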

dataset = load_dataset("json", data_files="custom_dataset.json")
tokenized_dataset = dataset["train"].train_test_split(test_size=0.1)
tokenized_dataset = tokenized_dataset.map(preprocess_function, batched=True, remove_columns=dataset["train"].column_names)
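
In causal language model fine-tuning, `labels` normally has the same length as `input_ids`, and positions that should not contribute to the loss are set to -100. The toy sketch below shows one common variant that masks both the prompt and the padding, so only completion tokens are learned. The token IDs and the `build_example` helper are made up for illustration.

```python
IGNORE_INDEX = -100  # default ignore_index of PyTorch's cross-entropy loss

def build_example(prompt_ids, completion_ids, max_length, pad_id):
    """Concatenate prompt and completion, pad, and mask non-completion labels."""
    input_ids = (prompt_ids + completion_ids)[:max_length]
    labels = ([IGNORE_INDEX] * len(prompt_ids) + completion_ids)[:max_length]
    pad = max_length - len(input_ids)
    return {
        "input_ids": input_ids + [pad_id] * pad,
        "attention_mask": [1] * len(input_ids) + [0] * pad,
        "labels": labels + [IGNORE_INDEX] * pad,
    }

# Hypothetical token IDs, purely for illustration.
example = build_example([11, 12, 13], [21, 22], max_length=8, pad_id=0)
for key, value in example.items():
    print(key, value)
```

Whether to mask the prompt is a design choice. Masking it focuses training on the answers, while leaving it unmasked also trains the model to reproduce the prompts.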

### Step 4: Set up LoRA for Efficient Fine-tuning

To make fine-tuning more efficient, we'll use LoRA (Low-Rank Adaptation). This technique freezes the original weights and trains only a small set of added low-rank matrices, so far fewer parameters need to be updated:

In [None]:
lora_config = LoraConfig(
    r=8,
    lora_alpha=32,
    # Layer names in Qwen-7B; newer Qwen releases use names such as "q_proj" and "v_proj"
    target_modules=["c_attn", "c_proj", "w1", "w2"],
    lora_dropout=0.05,
    bias="none",
    task_type="CAUSAL_LM"
)
model = get_peft_model(model, lora_config)
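
To see why LoRA trains so few parameters, consider a single square linear layer. The sketch below assumes a hidden size of 4096 purely for illustration. It uses `r=8` and `lora_alpha=32` from the configuration above and the standard LoRA formulation, in which two low-rank matrices are learned and their product is scaled by `lora_alpha / r`.

```python
def lora_params(d_in: int, d_out: int, r: int) -> int:
    """Trainable parameters LoRA adds to one d_out x d_in linear layer."""
    # LoRA learns two low-rank matrices: A (r x d_in) and B (d_out x r).
    return r * d_in + d_out * r

hidden = 4096      # assumed hidden size, for illustration only
r = 8              # rank from the LoraConfig above
lora_alpha = 32    # scaling numerator from the LoraConfig above

full = hidden * hidden
added = lora_params(hidden, hidden, r)
scaling = lora_alpha / r  # factor applied to the low-rank update

print(f"full layer:   {full:,} weights")
print(f"LoRA adapter: {added:,} weights ({added / full:.2%} of the layer)")
print(f"update scaling factor: {scaling}")
```

Raising `r` gives the adapter more capacity at the cost of more trainable parameters.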

### Step 5: Define Training Arguments
Now, we'll set up the training arguments that control various aspects of the fine-tuning process:

In [None]:
training_args = TrainingArguments(
    output_dir="./results",
    num_train_epochs=3,
    per_device_train_batch_size=4,
    per_device_eval_batch_size=4,
    warmup_steps=500,
    weight_decay=0.01,
    logging_dir="./logs",
    logging_steps=10,
    evaluation_strategy="steps",
    eval_steps=500,
    save_strategy="steps",
    save_steps=1000,
    learning_rate=1e-4,
    fp16=True,
    gradient_checkpointing=True,
    gradient_accumulation_steps=4,
)
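
Several of these arguments interact, so it helps to estimate the schedule before training. The sketch below computes the effective batch size and an approximate number of optimizer steps. The dataset size of 1,000 examples and the single-device default are hypothetical, and the Trainer's own rounding may differ slightly.

```python
import math

def training_schedule(num_examples, per_device_batch, grad_accum, epochs, num_devices=1):
    """Return (effective_batch_size, optimizer_steps_per_epoch, total_steps)."""
    effective_batch = per_device_batch * grad_accum * num_devices
    steps_per_epoch = math.ceil(num_examples / effective_batch)
    return effective_batch, steps_per_epoch, steps_per_epoch * epochs

# 1,000 training examples is a hypothetical dataset size
eff, per_epoch, total = training_schedule(1000, per_device_batch=4, grad_accum=4, epochs=3)
print(f"effective batch size: {eff}")
print(f"steps per epoch: {per_epoch}, total steps: {total}")
```

If the total is smaller than `warmup_steps=500`, `eval_steps=500`, or `save_steps=1000`, the learning rate never finishes warming up and no evaluation or checkpoint is triggered at those intervals. For small datasets, consider lowering these values.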

### Step 6: Create Trainer and Start Fine-tuning
With our dataset and training arguments prepared, we can now create a Trainer object and start the fine-tuning process:


In [None]:
trainer = Trainer(
    model=model,
    args=training_args,
    train_dataset=tokenized_dataset["train"],
    eval_dataset=tokenized_dataset["test"],
    tokenizer=tokenizer,
)

trainer.train()

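### Step 7: Save the Fine-tuned Model
After fine-tuning is complete, we'll save our model so we can use it later. Because the model is wrapped with LoRA, this saves the adapter weights rather than a full copy of the base model: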

In [None]:
trainer.save_model("./fine_tuned_qwen")

Fine-tuning the Qwen-7B model on your custom dataset can improve its performance on tasks similar to those in your training data. In this case, given enough examples, the model should become better at translating English to French and answering factual questions about capital cities.

## Using the Fine-tuned Model

After fine-tuning, you can use the model for inference:


In [None]:
fine_tuned_model = AutoModelForCausalLM.from_pretrained("./fine_tuned_qwen", trust_remote_code=True)
fine_tuned_tokenizer = AutoTokenizer.from_pretrained("./fine_tuned_qwen", trust_remote_code=True)

def generate_response(prompt):
    inputs = fine_tuned_tokenizer(prompt, return_tensors="pt").to(fine_tuned_model.device)
    outputs = fine_tuned_model.generate(**inputs, max_new_tokens=50)
    return fine_tuned_tokenizer.decode(outputs[0], skip_special_tokens=True)

# Example usage
prompt = "Translate to French: 'Good morning, have a nice day!'"
response = generate_response(prompt)
print(f"Prompt: {prompt}\nResponse: {response}")
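
A decoder-only model returns the prompt tokens followed by the newly generated ones, so the decoded response typically begins with the prompt itself. The helper below is a hypothetical convenience for showing only the new text. It works on plain strings and assumes the decoded text reproduces the prompt exactly.

```python
def strip_prompt(prompt: str, decoded: str) -> str:
    """Return only the newly generated text if `decoded` starts with `prompt`."""
    if decoded.startswith(prompt):
        return decoded[len(prompt):].lstrip()
    # Fall back to the full text when the prompt is not reproduced verbatim
    return decoded.strip()

# Made-up decoded output, for illustration only
decoded = "What is the capital of Spain? The capital of Spain is Madrid."
print(strip_prompt("What is the capital of Spain?", decoded))
# prints: The capital of Spain is Madrid.
```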

## Tips for Effective Fine-tuning

- Ensure your custom dataset is high-quality and representative of the task you're targeting.
- While larger datasets generally yield better results, even a few hundred examples can be beneficial for specific tasks.
- Experiment with different learning rates, batch sizes, and number of epochs to optimize performance.
- Regularly evaluate your model on a held-out validation set to prevent overfitting.
- If you're working with limited GPU memory, consider using gradient accumulation to simulate larger batch sizes.
- Use mixed precision training to speed up the fine-tuning process and reduce memory usage.
- For domain-specific tasks, consider continued pre-training on domain-specific data before fine-tuning on your task-specific dataset.

Fine-tuning Qwen on your custom dataset allows you to create a specialized model that excels at your specific task while retaining the broad knowledge and capabilities of the original pre-trained model. This approach uses the power of large language models for your unique applications and domains.
