
How to generate code-trajectory data with GPT4? #1

Open
SeungyounShin opened this issue Jul 23, 2023 · 5 comments

@SeungyounShin (Owner)

Creation of SFT data in the form:

User :
Assistant :
<Thinking, GPT4>
<Debug...>
...

How can this process be automated with GPT-4 so that data can be collected efficiently?
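
For concreteness, here is a minimal sketch of the loop I have in mind, assuming the legacy openai<1.0 Python SDK; `extract_code`, `run_code`, and the unsandboxed `exec()` call are illustrative stand-ins, not code from this repo:

```python
import contextlib
import io
import re
import traceback

import openai  # legacy openai<1.0 SDK


def extract_code(reply: str):
    """Pull the first python-fenced code block out of a model reply."""
    m = re.search(r"```python\n(.*?)```", reply, re.DOTALL)
    return m.group(1) if m else None


def run_code(code: str) -> str:
    """Execute generated code, returning stdout or the traceback."""
    buf = io.StringIO()
    try:
        with contextlib.redirect_stdout(buf):
            # NOTE: exec() here is unsandboxed; real collection should
            # isolate execution (container, subprocess, timeout).
            exec(code, {})
        return buf.getvalue()
    except Exception:
        return traceback.format_exc()


def collect_trajectory(instruction: str, max_rounds: int = 5):
    """Run the generate -> execute -> debug loop, returning the transcript."""
    messages = [
        {"role": "system",
         "content": "Solve the task by writing Python code in fenced blocks. "
                    "Fix any errors until the code runs, then give a final answer."},
        {"role": "user", "content": instruction},
    ]
    for _ in range(max_rounds):
        reply = openai.ChatCompletion.create(
            model="gpt-4", messages=messages,
        )["choices"][0]["message"]["content"]
        messages.append({"role": "assistant", "content": reply})
        code = extract_code(reply)
        if code is None:  # no code block -> treat the reply as the final answer
            break
        # Feed the execution result (or traceback) back as the next user turn.
        messages.append({"role": "user",
                         "content": f"Execution result:\n{run_code(code)}"})
    return messages  # the whole transcript becomes one SFT trajectory
```

Each finished `messages` list would be one SFT sample: instruction, thinking/code turns, execution feedback, and final answer.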

@SeungyounShin SeungyounShin added documentation Improvements or additions to documentation enhancement New feature or request labels Jul 23, 2023
@SeungyounShin SeungyounShin self-assigned this Jul 23, 2023
@theblackcat102

@SeungyounShin Hi, I am currently working on a very similar project, mainly generating a dataset for tool use. One of the datasets I am working on involves a code-interpreter tool. My method was basically to start with a few dozen instructions and ask GPT-4 to generate more similar ones. Using this slightly larger instruction set, I then apply the Evol-Instruct [1] method to generate more instructions. So far I have only 4,628 instructions about using the code interpreter.

[1] WizardLM: Empowering Large Language Models to Follow Complex Instructions
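
Roughly, the augmentation loop might look like the sketch below — this is a simplification, not my exact pipeline; the mutation prompts are paraphrased from the WizardLM paper, and the legacy openai<1.0 SDK is assumed:

```python
import random

import openai  # legacy openai<1.0 SDK

# In-depth evolution: make an instruction harder (paraphrased from WizardLM).
DEPTH_PROMPTS = [
    "Rewrite the instruction below to add one more constraint or requirement.",
    "Rewrite the instruction below so that it requires more steps of reasoning.",
    "Replace a general concept in the instruction below with a more specific one.",
]
# In-breadth evolution: create a new instruction in the same domain.
BREADTH_PROMPT = ("Create a brand-new instruction in the same domain as the one "
                  "below, about a different topic but of similar difficulty.")


def evolve(instruction: str) -> str:
    """Apply one randomly chosen evolution prompt to an instruction."""
    template = random.choice(DEPTH_PROMPTS + [BREADTH_PROMPT])
    resp = openai.ChatCompletion.create(
        model="gpt-4",
        messages=[{"role": "user",
                   "content": f"{template}\n\n#Instruction#:\n{instruction}"}],
    )
    return resp["choices"][0]["message"]["content"].strip()


# Start from a handful of seeds and grow the pool over a few rounds.
pool = ["Plot TSLA's closing price over the last 90 days and mark its mean."]
for _ in range(3):
    pool += [evolve(p) for p in pool]  # each round roughly doubles the pool
```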

@SeungyounShin (Owner, Author)

SeungyounShin commented Jul 24, 2023

[Image: TSLA_90days — the plot produced for the task described below]

Here's an output of the code generated by GPT-4 from my repository. The task was: "Can you plot Tesla's 90-day volume with the mean of the closing price, and a marker at 't' where the mean until 't-1' plus the standard deviation until 't-1' is less than the price at 't'?" GPT-4's performance is impressive, but the data collection process tends to be slow. This is primarily because it operates iteratively — generating code, executing it, debugging and modifying the code, and repeating the process — which can lead to considerable latency.

Your method is a valuable alternative, but I believe the real-time execution of code between GPT-4 calls is critical for this task. I've also encountered a second challenge: GPT-4 is effective at debugging but often struggles with generating the final answer (#2). I'm not entirely sure why this happens. I would appreciate any thoughts or suggestions on how to improve this process. Thank you so much! @theblackcat102

I would greatly appreciate any further discussion on this topic. Please feel free to share your insights or suggestions.
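
One possible mitigation for the latency — just a suggestion, not something implemented in the repo — is to collect many trajectories concurrently, since each loop spends most of its wall-clock time waiting on GPT-4. A sketch, reusing the illustrative `collect_trajectory` from the opening post:

```python
from concurrent.futures import ThreadPoolExecutor

# `collect_trajectory` is the illustrative loop sketched earlier in this
# thread, not code from this repository.
instructions = ["Plot Tesla's 90-day volume with the mean closing price."]

with ThreadPoolExecutor(max_workers=8) as executor:
    # Each loop mostly waits on API calls, so threads overlap that
    # latency across many trajectories at once.
    trajectories = list(executor.map(collect_trajectory, instructions))
```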

@theblackcat102

@SeungyounShin Oh, I have a code execution module as well; just the initial questions are generated via augmentation. Each round typically took me 20–120 seconds, depending on complexity. My progress usually slows down because of a bad for loop, or because I'm training a 500M Hugging Face model on my Mac.

What's the exact issue in #2? Could you provide more insight into the weird-answer problem? An example would be nice 😊

@SeungyounShin (Owner, Author)

@theblackcat102

I recently explored the concept of Evol-Instruct and found it quite fascinating. Inspired by it, I crafted my own version. In the process, I observed that a significant number of human-engineered prompts are required. I also noticed that GPT-4 often produces instructions like "Write ~" asking for a Python function, but does not actively check the result or implement it itself; it then appears to congratulate itself on completing the task.

One thing that stood out to me was that Evol-Instruct seems to perform better than Self-Instruct: it produces not only higher-quality prompts but also a more diverse range of them. While generating high-quality prompts is comparatively simple (for instance, we could just request a "more difficult one"), generating diverse prompts is quite challenging. Transitioning from one topic to another can lead to significant deviations, such as moving from a simple '1+1=?' to a complex 'Use CAD to...'.

Considering these observations, it seems that maintaining a balance between diversity and quality could be an interesting research topic.
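
As one hedged illustration of enforcing that balance: keep the quality pressure from the in-depth mutations, and enforce diversity by rejecting evolved prompts whose embeddings sit too close to prompts already accepted. The 0.9 threshold and text-embedding-ada-002 model below are illustrative choices, not part of any pipeline in this thread:

```python
import numpy as np
import openai  # legacy openai<1.0 SDK


def embed(text: str) -> np.ndarray:
    """Return a unit-normalized embedding for one prompt."""
    resp = openai.Embedding.create(model="text-embedding-ada-002", input=[text])
    v = np.array(resp["data"][0]["embedding"])
    return v / np.linalg.norm(v)


def keep_if_diverse(candidate: str, kept: list, threshold: float = 0.9) -> bool:
    """Accept a prompt only if it is not too similar to anything already kept."""
    v = embed(candidate)
    # Dot product of unit vectors = cosine similarity.
    if any(float(v @ k) > threshold for k in kept):
        return False
    kept.append(v)
    return True
```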

@SeungyounShin (Owner, Author)

[Still in progress]

How can we enhance the generation of trajectories (code generation, execution, and debugging based on the results)?
