How are you going about cleaning? #31

cbsudux · 2023-03-29T10:15:01Z

How are you going about cleaning this?

Manually or with GPT-4?

HideLord · 2023-03-29T11:32:38Z

I use a regex to search for a specific subset of instructions/inputs/outputs that are highly likely to be wrong. After that, I manually verify that each flagged entry is indeed incorrect, then either query gpt-3.5-turbo for a correction or fix it by hand.
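A minimal sketch of the regex pass described above, assuming each entry is a dict with `instruction`, `input`, and `output` keys. The patterns and the `flag_suspects` helper are hypothetical; the actual expressions used are not given in this thread.

```python
import re

# Hypothetical patterns for illustration only; swap in whatever
# signals tend to mark bad entries in your data.
SUSPECT_PATTERNS = [
    re.compile(r"as an ai language model", re.IGNORECASE),
    re.compile(r"<noinput>", re.IGNORECASE),
    re.compile(r"https?://"),
]

FIELDS = ("instruction", "input", "output")


def flag_suspects(entries):
    """Return (index, entry) pairs where any field matches a suspect pattern."""
    flagged = []
    for i, entry in enumerate(entries):
        # Join the three fields so one search covers the whole entry.
        text = " ".join(entry.get(field, "") for field in FIELDS)
        if any(pattern.search(text) for pattern in SUSPECT_PATTERNS):
            flagged.append((i, entry))
    return flagged


entries = [
    {"instruction": "Add 2 and 3.", "input": "", "output": "5"},
    {"instruction": "Summarize the article.", "input": "<noinput>",
     "output": "The article says..."},
]

for index, entry in flag_suspects(entries):
    print(index, entry["instruction"])
```

The flagged list is only a set of candidates; each one still needs the manual check before anything is changed.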

gururise · 2023-03-29T17:43:48Z

> How are you going about cleaning this?
>
> Manually or with GPT-4?

Much of it has been hand-curated with the help of various tools and regular expressions (see the tools directory). There is also a tool that can use GPT-3.5/4 to double-check the answers.

ehartford · 2023-03-29T21:45:31Z

I bet one could convince the gpt-3.5 API to take in a line, decide whether it's acceptable or needs fixing, and, if it needs fixing, suggest a fix, which a human could then quick-review (accept or reject).

gururise · 2023-03-29T22:39:21Z

> I bet one could convince the gpt-3.5 API to take in a line, decide whether it's acceptable or needs fixing, and, if it needs fixing, suggest a fix, which a human could then quick-review (accept or reject).

We have this capability already built out in one of the tools. I think the first 1000 entries have been checked in this manner.
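For anyone curious what such a loop might look like, here is a rough sketch. It is not the repo's actual tool. The `judge` and `ask_human` callables and the verdict shape (`{"ok": bool, "fix": str or None}`) are assumptions made for illustration; in practice `judge` would wrap a call to a model such as gpt-3.5-turbo, and `ask_human` would prompt a reviewer.

```python
def review(entries, judge, ask_human):
    """Run each entry past a judge; apply a suggested fix only if a human accepts it.

    judge(entry) -> {"ok": bool, "fix": str or None}   (hypothetical shape)
    ask_human(entry, fix) -> bool                       (True means accept)
    """
    cleaned = []
    for entry in entries:
        verdict = judge(entry)
        fix = verdict.get("fix")
        if verdict["ok"] or fix is None:
            # Acceptable as-is, or flagged with no suggestion: keep unchanged.
            cleaned.append(entry)
        elif ask_human(entry, fix):
            # Reviewer accepted: replace only the output field.
            cleaned.append({**entry, "output": fix})
        else:
            # Reviewer rejected: keep the original entry.
            cleaned.append(entry)
    return cleaned


# Stubs standing in for the model call and the human reviewer.
def stub_judge(entry):
    if entry["instruction"] == "Add 2 and 3." and entry["output"] != "5":
        return {"ok": False, "fix": "5"}
    return {"ok": True, "fix": None}


def accept_all(entry, fix):
    return True


sample = [
    {"instruction": "Add 2 and 3.", "input": "", "output": "6"},
    {"instruction": "Name a primary color.", "input": "", "output": "Red"},
]

result = review(sample, stub_judge, accept_all)
print(result[0]["output"])
```

Passing the judge and the reviewer in as functions keeps the loop testable without network access and lets the model or the review interface be swapped freely.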

gururise mentioned this issue Apr 2, 2023

good job #45

Closed
