[Feature Request] Support Synthetic Data Generation via Self-Instruct #2012

### Required prerequisites

- [x] I have searched the [Issue Tracker](https://github.com/camel-ai/camel/issues) and [Discussions](https://github.com/camel-ai/camel/discussions) and confirmed that this hasn't already been reported. (+1 or comment there if it has.)
- [ ] Consider asking first in a [Discussion](https://github.com/camel-ai/camel/discussions/new).

### Motivation

To improve training data diversity, we need a robust synthetic data generation pipeline. The goal is to use the Self-Instruct methodology to automatically create new, high-quality datapoints from a combination of human-provided seed examples and machine-generated content.

This request proposes a `SelfInstructGenerator` that supports:
- Few-shot prompting for novel question generation
- Code-based rationale generation

### Solution

_No response_

### Alternatives

_No response_

### Additional context

_No response_
