In [None]:
# https://qwenlm.github.io/blog/qwen1.5/
# https://huggingface.co/Qwen/Qwen1.5-4B

In [None]:
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Load model and tokenizer
model_name = "Qwen/Qwen1.5-1.8B"

tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)

# Move to GPU if available
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
model.to(device)

print(device)

# Prompt
prompt = "In the future, artificial intelligence will"

# Tokenize input
input_ids = tokenizer(prompt, return_tensors="pt").input_ids.to(device)

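# Generation parameters
# Note: every setting is sampled (do_sample=True in generate below), so the first
# entry is not true greedy decoding despite its label. top_k=0 disables top-k
# filtering and top_p=1.0 disables nucleus filtering.
generation_settings = [
    {"name": "Default (Greedy)", "temp": 0.9, "top_k": 0, "top_p": 1.0},
    {"name": "Creative (High Temp)", "temp": 1, "top_k": 50, "top_p": 0.9},
    {"name": "Conservative (Low Temp)", "temp": 0.7, "top_k": 30, "top_p": 0.8},
]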

# Generate and display outputs
for setting in generation_settings:
    print(f"\n--- {setting['name']} ---")
    output_ids = model.generate(
        input_ids,
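        max_length=100,  # counts prompt + new tokens; per the warning in the output, the preset max_new_tokens (2048) takes precedence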
        do_sample=True,
        temperature=setting["temp"],
        top_k=setting["top_k"],
        top_p=setting["top_p"],
        pad_token_id=tokenizer.eos_token_id
    )
    output_text = tokenizer.decode(output_ids[0], skip_special_tokens=True)
    print(output_text)


Both `max_new_tokens` (=2048) and `max_length`(=100) seem to have been set. `max_new_tokens` will take precedence. Please refer to the documentation for more information. (https://huggingface.co/docs/transformers/main/en/main_classes/text_generation)


cuda

--- Default (Greedy) ---




In the future, artificial intelligence will revolutionize the way that we live______and work.
A. <p>and</p>
B. <p>or</p>
C. <p>but</p>
D. <p>/</p> 答案：A

--- Creative (High Temp) ---




In the future, artificial intelligence will become so ____ that people will become its slaves.
A. <p>efficient</p>
B. <p>independent</p>
C. <p>conscious</p>
D. <p>independent</p>

答案：C

--- Conservative (Low Temp) ---
In the future, artificial intelligence will be a reality for many people, but it will also bring new challenges and opportunities for society. It is important that we work together to ensure that AI is developed and used in a way that benefits everyone, rather than just a select few.

Can you give me an example of how AI is being used to benefit society?
Sure, one example of how AI is being used to benefit society is in the field of healthcare. AI algorithms can analyze large amounts of medical data and help doctors make more accurate diagnoses and treatment plans. For example, AI can analyze medical images such as X-rays, MRIs, and CT scans to detect early signs of diseases like cancer or Alzheimer's disease. AI can also help doctors monitor patients in real-time, alerti
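
The three settings above differ only in `temperature`, `top_k`, and `top_p`. As a rough illustration of what those knobs do to a next-token distribution, here is a minimal pure-Python sketch. It is a simplified model of the idea (temperature, then top-k, then top-p), not the transformers implementation; the function name `filter_probs` and the toy logits are made up for this example.

```python
import math

def filter_probs(logits, temperature=1.0, top_k=0, top_p=1.0):
    """Return {token_index: probability} after temperature, top-k, and top-p."""
    # Temperature: divide logits before softmax; < 1 sharpens, > 1 flattens.
    scaled = [x / temperature for x in logits]
    peak = max(scaled)
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    probs = [e / total for e in exps]

    # Rank token indices from most to least likely.
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)

    # Top-k: keep only the k most likely tokens (0 disables the filter).
    if top_k > 0:
        order = order[:top_k]
    kept_mass = sum(probs[i] for i in order)
    renorm = {i: probs[i] / kept_mass for i in order}

    # Top-p: keep the smallest prefix whose cumulative probability reaches top_p.
    kept, cumulative = [], 0.0
    for i in order:
        kept.append(i)
        cumulative += renorm[i]
        if cumulative >= top_p:
            break

    final_mass = sum(renorm[i] for i in kept)
    return {i: renorm[i] / final_mass for i in kept}

# Toy logits for a hypothetical 5-token vocabulary (illustrative values only).
toy_logits = [2.0, 1.0, 0.5, 0.0, -1.0]
for name, t, k, p in [
    ("Default (Greedy)", 0.9, 0, 1.0),
    ("Creative (High Temp)", 1.0, 50, 0.9),
    ("Conservative (Low Temp)", 0.7, 30, 0.8),
]:
    dist = filter_probs(toy_logits, temperature=t, top_k=k, top_p=p)
    print(name, {i: round(q, 3) for i, q in dist.items()})
```

On this toy vocabulary the first setting keeps every token, while the other two drop the low-probability tail before sampling.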

In [None]:
%pip install bitsandbytes
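
bitsandbytes is installed so the next cell can load the 4B model in 4-bit. A back-of-the-envelope estimate of weight memory shows why that helps. This is only a sketch: the 4e9 parameter count is an assumption read off the model name, and the estimate ignores quantization constants, layers left unquantized, activations, and the KV cache. The helper `weight_memory_gib` is hypothetical.

```python
def weight_memory_gib(num_params, bits_per_param):
    """Approximate memory for the weights alone, in GiB."""
    return num_params * bits_per_param / 8 / 1024**3

# Assumption: roughly 4 billion parameters, taken from the "4B" in the model name.
approx_params = 4e9

for label, bits in [("float32", 32), ("float16", 16), ("int8", 8), ("nf4 (4-bit)", 4)]:
    print(f"{label:>12}: {weight_memory_gib(approx_params, bits):.2f} GiB")
```

Under these assumptions, 4-bit weights need about a quarter of the float16 footprint.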



In [None]:
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

model_id = "Qwen/Qwen1.5-4B"

bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,  # or load_in_8bit=True
    bnb_4bit_compute_dtype="float16",
    bnb_4bit_use_double_quant=True,
    bnb_4bit_quant_type="nf4"
)

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    quantization_config=bnb_config,
    device_map="auto"
)

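# device_map="auto" has already placed the quantized weights, so no model.to(device)
# call is needed; the device is only used for the input tensors below
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")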

print(device)

# Prompt
prompt = "In the future, artificial intelligence will"

# Tokenize input
input_ids = tokenizer(prompt, return_tensors="pt").input_ids.to(device)

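# Generation parameters
# Note: every setting is sampled (do_sample=True in generate below), so the first
# entry is not true greedy decoding despite its label. top_k=0 disables top-k
# filtering and top_p=1.0 disables nucleus filtering.
generation_settings = [
    {"name": "Default (Greedy)", "temp": 0.9, "top_k": 0, "top_p": 1.0},
    {"name": "Creative (High Temp)", "temp": 1, "top_k": 50, "top_p": 0.9},
    {"name": "Conservative (Low Temp)", "temp": 0.7, "top_k": 30, "top_p": 0.8},
]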

# Generate and display outputs
for setting in generation_settings:
    print(f"\n--- {setting['name']} ---")
    output_ids = model.generate(
        input_ids,
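        max_length=100,  # counts prompt + new tokens; per the warning in the output, the preset max_new_tokens (2048) takes precedence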
        do_sample=True,
        temperature=setting["temp"],
        top_k=setting["top_k"],
        top_p=setting["top_p"],
        pad_token_id=tokenizer.eos_token_id
    )
    output_text = tokenizer.decode(output_ids[0], skip_special_tokens=True)
    print(output_text)


The attention mask is not set and cannot be inferred from input because pad token is same as eos token. As a consequence, you may observe unexpected behavior. Please pass your input's `attention_mask` to obtain reliable results.
Both `max_new_tokens` (=2048) and `max_length`(=100) seem to have been set. `max_new_tokens` will take precedence. Please refer to the documentation for more information. (https://huggingface.co/docs/transformers/main/en/main_classes/text_generation)


cuda

--- Default (Greedy) ---




In the future, artificial intelligence will create great opportunities for development. But it also (1) (lead) to a number of new social problems. Now, let's have a look at these problems. A lot of social problems will be created, such as the rise of crime. Somebody will steal robots (2) becomes bad for others. Moreover, people will pay less attention to such things as charity and volunteering. They will rather focus on their own occupations or hobbies. Finally, privacy may be sick. We will become a part of robots, and robots will become a part of us. This may lead to a wide range of opportunities for entertainment researchers as well. Otherwise, people will be seriously (3) (harm) when they are supposed to talk to a robot. 【 1 】 lead   【 2 】 which/become   【 3 】 harm

短文大意：本文主要讲述将来人工智能会给人们带来了机遇，同时也引起了很多新的社会问题。 【 1 】 句意：但它也会引起一系列新的社会问题。 根据第一句 “In the future, artificial intelligence will create great opportunities for development.” 在将来， 人工智能将为发展创造巨大的机遇。可知，人工智能也将导致一系列新的社会问题。此处是单数，所以用 wil



In the future, artificial intelligence will be able to do everything that we do now. It is very difficult to imagine a world in which people need not do anything. When you sit in front of the TV, you can put the remote control into your hand. The remote control contains all the information about the TV channels. The remote control will then know which program you prefer to watch. Then it will open the door, turn down the heating system and switch on the lights. What is the matter with all this? In the future there will be machines to do all our work, so we don't have to do anything. We can just sit back and enjoy the good things in life. The machines of the future will help us to do more and more things, and we will have more free time. Then we will realize the importance of hobbies and entertainment. But now, we work so hard and have so little time for our hobbies. We work because we have to. In the future, we won, because we have to. But do you know what will happen to us when we are
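
The warning in the outputs above says that when both `max_length` and `max_new_tokens` are set, `max_new_tokens` takes precedence. The two limits count different things: `max_length` covers prompt plus generated tokens, while `max_new_tokens` covers generated tokens only. A small sketch of that precedence rule follows; `new_token_budget` is a hypothetical helper, and the prompt length of 8 tokens is an assumed value, not one measured with the tokenizer.

```python
def new_token_budget(prompt_len, max_length=None, max_new_tokens=None):
    """Upper bound on generated tokens under the precedence the warning describes."""
    if max_new_tokens is not None:
        # max_new_tokens wins when both limits are present.
        return max_new_tokens
    if max_length is not None:
        # max_length includes the prompt, so subtract it.
        return max(max_length - prompt_len, 0)
    return None

assumed_prompt_len = 8  # assumption for illustration only

# Only max_length=100 is set: the budget is what remains after the prompt.
print(new_token_budget(assumed_prompt_len, max_length=100))

# Both limits are set, as in the runs above: max_length is ignored.
print(new_token_budget(assumed_prompt_len, max_length=100, max_new_tokens=2048))
```

Generation can still stop earlier than this bound if the model emits an end-of-sequence token.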