# Data model generation and structured output (Camel Agent using Qwen)

LLMs speak natural language, as humans do, while applications speak structured formats such as JSON. It is therefore important to equip LLMs with the ability to produce structured output, so that they can communicate with both humans and other applications.

There are a few high-level strategies for doing this:

- Prompting: This is when you ask the LLM (very nicely) to return output in the desired format (JSON, XML). This is nice because it works with all LLMs. It is not nice because there is no guarantee that the LLM returns the output in the right format.
- Function calling: This is when the LLM is fine-tuned to be able to not just generate a completion, but also generate a function call. The functions the LLM can call are generally passed as extra parameters to the model API. The function names and descriptions should be treated as part of the prompt (they usually count against token counts, and are used by the LLM to decide what to do).
- JSON mode: This is when the LLM is guaranteed to return JSON.
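
To make the function calling strategy more concrete, here is a minimal sketch of what a function definition might look like in the JSON-schema style accepted by OpenAI-compatible chat APIs. The tool name `create_student` and its fields are hypothetical and chosen only to match the student example used later; this is not taken from Camel's code.

```python
import json

# Hypothetical tool definition in the JSON-schema style used by
# OpenAI-compatible chat APIs; the name and fields are made up.
create_student_tool = {
    "type": "function",
    "function": {
        "name": "create_student",
        "description": "Record one student's basic information.",
        "parameters": {
            "type": "object",
            "properties": {
                "name": {"type": "string"},
                "age": {"type": "string"},
                "email": {"type": "string"},
            },
            "required": ["name", "age", "email"],
        },
    },
}

# The definition travels with the request next to the messages, so its
# name and description consume tokens much like prompt text does.
payload_fragment = json.dumps({"tools": [create_student_tool]}, indent=2)
print(payload_fragment)
```

Because the name and description are read by the model when it decides whether to call the function, they deserve the same care as any other prompt text.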

In Camel, the agent uses both prompting and function calling. For models that do not support tool calling, Camel falls back on prompt engineering to make them generate valid structured output.

## Qwen data generation

[Qwen](https://www.alibabacloud.com/help/en/model-studio/developer-reference/use-qwen-by-calling-api) is a good example in Camel of using prompt engineering for structured output. It offers powerful models such as **Qwen-max** and **Qwen-coder**, but does not yet support structured output natively. We can instead rely on the model's own ability to generate structured data.

Import the necessary libraries, define the Qwen agent, and define the Pydantic classes.

In [4]:
from pydantic import BaseModel, Field

from camel.agents import ChatAgent
from camel.messages import BaseMessage
from camel.models import ModelFactory
from camel.types import ModelPlatformType, ModelType
from camel.configs import QwenConfig

from dotenv import load_dotenv
import os

# Load environment variables (such as the API key) from a local .env file
load_dotenv()

# Define Qwen model
qwen_model = ModelFactory.create(
    model_platform=ModelPlatformType.QWEN,
    model_type=ModelType.QWEN_PLUS,
    model_config_dict=QwenConfig().as_dict(),
)

qwen_agent = ChatAgent(
    model=qwen_model,
    message_window_size=10,
)


# Define Pydantic models
class Student(BaseModel):
    name: str
    age: str
    email: str




First, let's see what happens when we describe the desired format only in the prompt.

In [5]:
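# Note: this system message is defined here but is not passed to
# qwen_agent, which was created above without a system message.
assistant_sys_msg = BaseMessage.make_assistant_message(
    role_name="Assistant",
    content="You are a helpful assistant in helping user to generate necessary data information.",
)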

user_msg = """Help me 1 student info in JSON format, with the following format:
{
    "name": "string",
    "age": "string",
    "email": "string"
}"""

response = qwen_agent.step(user_msg)
print(response.msgs[0].content)




2024-12-18 16:13:25,265 - httpx - INFO - HTTP Request: POST https://dashscope.aliyuncs.com/compatible-mode/v1/chat/completions "HTTP/1.1 200 OK"
2024-12-18 16:13:25,267 - camel.agents.chat_agent - INFO - Model qwen-plus, index 0, processed these messages: [{'role': 'user', 'content': 'Help me 1 student info in JSON format, with the following format:\n{\n    "name": "string",\n    "age": "string",\n    "email": "string"\n}'}]
Certainly! Below is an example of a student's information formatted in JSON as you requested:

```json
{
    "name": "John Doe",
    "age": "20",
    "email": "johndoe@example.com"
}
```

If you need more specific details or another example, feel free to let me know!


It worked, but we had to spell out the format in the prompt, the result still contains extra text around the JSON, and we still have to parse it into a valid JSON object ourselves.
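
To illustrate that manual parsing step, here is a minimal stdlib-only sketch that pulls the JSON object out of a chatty reply like the one above. The helper name `extract_json` is hypothetical, and the approach (look for a fenced block first, then fall back to the outermost braces) is only one of several reasonable heuristics.

```python
import json
import re

FENCE = "`" * 3  # the three-backtick markdown code fence

# Matches a fenced block (optionally tagged "json") that wraps one object.
FENCED_OBJECT = re.compile(
    FENCE + r"(?:json)?\s*(\{.*?\})\s*" + FENCE, re.DOTALL
)

def extract_json(reply: str) -> dict:
    """Pull a JSON object out of a model reply that may contain extra text."""
    fenced = FENCED_OBJECT.search(reply)
    if fenced:
        candidate = fenced.group(1)
    else:
        # Fall back to the span from the first "{" to the last "}".
        start, end = reply.find("{"), reply.rfind("}")
        if start == -1 or end <= start:
            raise ValueError("no JSON object found in reply")
        candidate = reply[start : end + 1]
    return json.loads(candidate)

# A reply shaped like the model output shown above.
reply = (
    "Certainly! Below is an example of a student's information:\n\n"
    + FENCE + "json\n"
    + '{\n    "name": "John Doe",\n    "age": "20",\n'
    + '    "email": "johndoe@example.com"\n}\n'
    + FENCE
    + "\n\nIf you need more specific details, feel free to let me know!"
)
student = extract_json(reply)
print(student["name"])
```

Heuristics like this are brittle: they break as soon as the model changes how it wraps its answer, and they do nothing to check that the fields match what we asked for.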

A more elegant way is to use the `response_format` argument of the `.step()` method:

In [9]:
qwen_agent.reset()
user_msg = "Help me 1 student info in JSON format"
response = qwen_agent.step(user_msg, response_format=Student)
print(response.msgs[0].content)

2024-12-18 16:20:15,988 - httpx - INFO - HTTP Request: POST https://dashscope.aliyuncs.com/compatible-mode/v1/chat/completions "HTTP/1.1 200 OK"
2024-12-18 16:20:15,991 - camel.agents.chat_agent - INFO - Model qwen-plus, index 0, processed these messages: [{'role': 'user', 'content': "\n    Given the user message, please generate a JSON response adhering to the following JSON schema:\n{'properties': {'name': {'title': 'Name', 'type': 'string'}, 'age': {'title': 'Age', 'type': 'string'}, 'email': {'title': 'Email', 'type': 'string'}}, 'required': ['name', 'age', 'email'], 'title': 'Student', 'type': 'object'}\nMake sure the JSON response is valid and matches the EXACT structure defined in the schema. Your result should only be a valid json object, without any other text or comments.\n\n    User message: Help me 1 student info in JSON format\n\n    "}]
{
  "name": "John Doe",
  "age": "20",
  "email": "johndoe@example.com"
}
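
The log line above shows that the user message was wrapped in schema instructions before being sent to the model. As a rough sketch of that idea, the snippet below rebuilds a similar prompt from a schema. The helper `build_structured_prompt` is hypothetical and is not Camel's actual implementation; the schema is copied from the log, and it is serialized with `json.dumps` here, whereas the log shows a Python dict representation.

```python
import json

# Schema copied from the log above (the JSON schema of `Student`).
STUDENT_SCHEMA = {
    "properties": {
        "name": {"title": "Name", "type": "string"},
        "age": {"title": "Age", "type": "string"},
        "email": {"title": "Email", "type": "string"},
    },
    "required": ["name", "age", "email"],
    "title": "Student",
    "type": "object",
}

def build_structured_prompt(user_message: str, schema: dict) -> str:
    """Wrap a user message with schema instructions (hypothetical helper)."""
    return (
        "Given the user message, please generate a JSON response "
        "adhering to the following JSON schema:\n"
        f"{json.dumps(schema)}\n"
        "Make sure the JSON response is valid and matches the EXACT "
        "structure defined in the schema. Your result should only be a "
        "valid json object, without any other text or comments.\n\n"
        f"User message: {user_message}\n"
    )

prompt = build_structured_prompt(
    "Help me 1 student info in JSON format", STUDENT_SCHEMA
)
print(prompt)
```

With `response_format`, this wrapping happens inside the agent, so none of it has to appear in user code.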


We can then read the parsed Pydantic object directly from the `response.msgs[0].parsed` field:

In [11]:
print(type(response.msgs[0].parsed))
print(response.msgs[0].parsed)


<class '__main__.Student'>
name='John Doe' age='20' email='johndoe@example.com'
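
To give a feel for the work the `parsed` field saves us, here is a stdlib-only sketch of turning the JSON string into a typed object by hand. `StudentRecord` and `parse_student` are hypothetical stand-ins written for this illustration; Pydantic's real validation is richer and differs in its details.

```python
import json
from dataclasses import dataclass, fields

@dataclass
class StudentRecord:
    # Hypothetical stand-in for the Pydantic `Student` model.
    name: str
    age: str
    email: str

def parse_student(raw: str) -> StudentRecord:
    """Parse a JSON string and check it has exactly the expected string fields."""
    data = json.loads(raw)
    expected = {f.name for f in fields(StudentRecord)}
    if not isinstance(data, dict) or set(data) != expected:
        raise ValueError(f"expected keys {sorted(expected)}, got {data!r}")
    for key, value in data.items():
        if not isinstance(value, str):
            raise ValueError(f"field {key!r} must be a string")
    return StudentRecord(**data)

student = parse_student(
    '{"name": "John Doe", "age": "20", "email": "johndoe@example.com"}'
)
print(student.name, student.age, student.email)
```

Even this small version needs key checks and type checks for every field, which is the boilerplate a schema-driven model class removes.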


Hooray, we have successfully generated one student entry. Suppose we want to generate more: this is just as easy.

In [13]:
class StudentList(BaseModel):
    studentList: list[Student]

user_msg = "Help me 5 random student info in JSON format"
response = qwen_agent.step(user_msg, response_format=StudentList)
print(response.msgs[0].content)
print(response.msgs[0].parsed)



2024-12-18 16:24:04,985 - httpx - INFO - HTTP Request: POST https://dashscope.aliyuncs.com/compatible-mode/v1/chat/completions "HTTP/1.1 200 OK"
2024-12-18 16:24:04,988 - camel.agents.chat_agent - INFO - Model qwen-plus, index 0, processed these messages: [{'role': 'user', 'content': 'Help me 1 student info in JSON format'}, {'role': 'assistant', 'content': '{\n  "name": "John Doe",\n  "age": "20",\n  "email": "johndoe@example.com"\n}'}, {'role': 'user', 'content': 'Help me 3 random student info in JSON format'}, {'role': 'assistant', 'content': '{\n  "studentList": [\n    {\n      "name": "Alice Johnson",\n      "age": "22",\n      "email": "alice.johnson@example.com"\n    },\n    {\n      "name": "Bob Smith",\n      "age": "21",\n      "email": "bob.smith@example.com"\n    },\n    {\n      "name": "Charlie Brown",\n      "age": "23",\n      "email": "charlie.brown@example.com"\n    }\n  ]\n}'}, {'role': 'user', 'content': "\n    Given the user message, please generate a JSON response

That's it! We just generated 5 random students from scratch using a Camel agent backed by Qwen!