-
The approaches you suggest seem reasonable. I think OpenAI used a similar approach for ChatGPT, and it seems to be working for them.
-
Fine-tuned models can be costly 🤔 not only in training but also in usage. Have we explored creating a proper prompt instead? For example, we could use an algorithm like this:
Sometimes this works quite well, and GPT-3 responds in the given format fairly consistently. If we can define item 4 (our ideal structure to generate the diagram from), I can try to experiment with items 2-3; see the sketch below.
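As a concrete illustration of the prompt-based alternative: the sketch below asks a GPT-3 completion model to answer in a fixed, machine-parsable structure. It assumes the legacy pre-1.0 `openai` Python SDK, `text-davinci-003` as the model, and a made-up JSON target format standing in for item 4.

```python
import os

import openai  # legacy pre-1.0 SDK, e.g. `pip install openai==0.28`

openai.api_key = os.environ["OPENAI_API_KEY"]

# Few-shot prompt that pins the output to a fixed, parsable JSON shape.
# The target format here is an assumption standing in for "item 4".
PROMPT_TEMPLATE = """Convert the description into a diagram definition.
Respond ONLY with JSON of the form {{"nodes": [...], "edges": [...]}}.

Description: a web server sends requests to a database
JSON: {{"nodes": ["web server", "database"], "edges": [["web server", "database"]]}}

Description: {description}
JSON:"""


def describe_to_diagram(description: str) -> str:
    response = openai.Completion.create(
        model="text-davinci-003",  # completion model, as in the MVP
        prompt=PROMPT_TEMPLATE.format(description=description),
        max_tokens=256,
        temperature=0,  # deterministic output keeps the format stable
    )
    return response["choices"][0]["text"].strip()


print(describe_to_diagram("a user logs in through an auth service"))
```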
-
Context
The MVP revolves around the completion model, which requires a bulky input including a "train sample" and the prompt for prediction. This approach has several disadvantages:
Fine-tuning the model on a significant number of training samples is the way to address these bottlenecks.
Problem
The OpenAI documentation suggests a few hundred data points as a reasonable training set size:
This imposes a bottleneck: how do we generate that amount of quality data?
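For reference, the legacy GPT-3 fine-tuning endpoint expects training data as JSONL, one `{"prompt": ..., "completion": ...}` object per line, with a fixed separator ending each prompt and a stop sequence ending each completion. Whichever generation approach we pick, the data should converge on this shape; the two lines below are hypothetical diagram examples, not real training data.

```jsonl
{"prompt": "a web server sends requests to a database\n\n###\n\n", "completion": " {\"nodes\": [\"web server\", \"database\"], \"edges\": [[\"web server\", \"database\"]]} END"}
{"prompt": "a user logs in through an auth service\n\n###\n\n", "completion": " {\"nodes\": [\"user\", \"auth service\"], \"edges\": [[\"user\", \"auth service\"]]} END"}
```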
Proposed Solutions
Internally
Every contributor commits to generating a few dozen training data points; a validation script like the one sketched below could keep the contributions consistent.
Pros:
Cons:
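To make the internal contributions uniform, each contributor could drop their examples into a JSONL file and run a check like the hypothetical script below (the file layout and field names are assumptions, matching the legacy fine-tune format above).

```python
import json
import sys


def validate_jsonl(path: str) -> bool:
    """Check that every line is a JSON object with non-empty
    'prompt' and 'completion' string fields (legacy fine-tune format)."""
    ok = True
    with open(path, encoding="utf-8") as f:
        for lineno, line in enumerate(f, start=1):
            line = line.strip()
            if not line:
                continue  # allow blank lines between records
            try:
                record = json.loads(line)
            except json.JSONDecodeError as exc:
                print(f"{path}:{lineno}: invalid JSON ({exc})")
                ok = False
                continue
            for key in ("prompt", "completion"):
                value = record.get(key)
                if not isinstance(value, str) or not value.strip():
                    print(f"{path}:{lineno}: missing or empty '{key}'")
                    ok = False
    return ok


if __name__ == "__main__":
    # Usage: python validate.py contributions/*.jsonl
    sys.exit(0 if all(validate_jsonl(p) for p in sys.argv[1:]) else 1)
```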
Crowdsourcing
Involve the community in generating data and sharing it with us through PRs and other communication channels.
Requirements:
Pros:
Cons:
User input
It's an extension of the crowdsourcing approach, leveraging users' incentive to improve the product they use; a sketch of how such examples could be captured follows below.
Requirements:
Pros:
Cons:
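One possible shape for the user-input pipeline: when a user confirms that a generated diagram is correct, store the prompt/result pair as a candidate training example, gated on explicit consent. The helper below is a hypothetical sketch; the function and file names are made up.

```python
import json
import time


def record_training_example(prompt: str, completion: str,
                            user_consented: bool,
                            path: str = "user_examples.jsonl") -> None:
    """Append a user interaction as a candidate training example,
    but only when the user explicitly opted in."""
    if not user_consented:
        return
    example = {
        "prompt": prompt,
        "completion": completion,
        "collected_at": int(time.time()),  # for later auditing/dedup
    }
    with open(path, "a", encoding="utf-8") as f:
        f.write(json.dumps(example, ensure_ascii=False) + "\n")
```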
Summary
I'd like to propose a combination of the "Internally" and the "User input" approaches: