
Legit real-world use cases? #155

Open
Choms opened this issue Jul 8, 2019 · 17 comments

Comments

@Choms

Choms commented Jul 8, 2019

Hello,

I've been reading about this project, and many of the articles referring to it discuss the potential for abuse and whether it represents an actual risk. I also read the mega-thread in #16 and drew some conclusions for myself:

  1. Most people who want the full model released argue it's "for the sake of knowledge"
  2. I feel like a sizable percentage of those are actually internet trolls who want a fun and easy-to-use tool for generating scam emails and the like
  3. Some people are genuinely concerned about the potential for abuse and understand the caution about not releasing the full model

Now, what I didn't see, either in that thread or in the articles discussing this project, are actual legitimate use cases for this technology, beyond the obvious "research purposes".

So let's set aside fake news and internet trolling. I honestly don't see a situation where this would be of any help, or let me rephrase that: where this should be of any help. I've seen commercial offerings that pretty much boil down to "too lazy to interpret your own data? Let our bot write reports for your stakeholders so they feel your project is going somewhere even if you don't know what you are doing at all".

The other real-world use cases I can think of would be (non)writers who, instead of paying others to write their books as they do currently, would use some sort of AI to bake standardized best-sellers for their own profit (with a huge marketing effort, of course; this could also help in that regard), or crappy news agencies that have laid off all their reporters in favor of interns who can type one paragraph so the text auto-generation tool can fill in the rest of the made-up article.

To sum up, I'd really love to hear some legitimate real-world use cases for this technology which don't completely suck, from people who are actually working on it.

Cheers!

@ddugovic

I imagine this tool could proofread emails, articles, or source code (and possibly auto-complete the same, with the understanding that the human may need to review and revise afterward).

@leejason

I'm experimenting with whether the "creative" nature of GPT-2 can be "innovative" in the patent sense. The "fake news" issue should be a lesser concern here, since it's unreasonable to pay significant money for a fake patent (even if granted) that does not work at all.

Fine-tuning with different corpora might also shed more light on understanding GPT-2 better.

In this work, we focus on fine-tuning an OpenAI GPT-2 pre-trained model for generating patent claims. GPT-2 has demonstrated impressive efficacy of pre-trained language models on various tasks, particularly coherent text generation. Patent claim language itself has rarely been explored in the past and poses a unique challenge. We are motivated to generate coherent patent claims automatically so that augmented inventing might be viable someday. In our implementation, we identified a unique language structure in patent claims and leveraged its implicit human annotations. We investigated the fine-tuning process by probing the first 100 steps and observing the generated text at each step. Based on both conditional and unconditional random sampling, we analyze the overall quality of generated patent claims. Our contributions include: (1) being the first to generate patent claims by machines and being the first to apply GPT-2 to patent claim generation, (2) providing various experiment results for qualitative analysis and future research, (3) proposing a new sampling approach for text generation, and (4) building an e-mail bot for future researchers to explore the fine-tuned GPT-2 model further.
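The "unique language structure" the abstract mentions can be pictured concretely: a patent claim typically has a preamble followed by elements separated by semicolons, and one way to expose that structure to a language model is to mark the boundaries with special tokens before fine-tuning. A minimal data-prep sketch, assuming that approach; the separator tokens and helper names below are hypothetical, not taken from the paper:

```python
# Hypothetical preprocessing sketch: split a patent claim into its
# implicitly annotated spans (preamble + elements, delimited by
# semicolons) and join them with made-up separator tokens so a model
# could learn the span boundaries during fine-tuning.

def segment_claim(claim_text: str) -> list:
    """Split a claim into preamble and elements at semicolons."""
    spans = [s.strip() for s in claim_text.split(";")]
    return [s for s in spans if s]

def to_training_example(claim_text: str) -> str:
    """Join the spans with separator tokens a model could learn from."""
    spans = segment_claim(claim_text)
    return " <SEP> ".join(spans) + " <END>"

claim = ("A widget comprising: a base member; "
         "a handle coupled to the base member; "
         "and a sensor mounted on the handle.")
print(to_training_example(claim))
```

The real paper's tokenization scheme may differ; this only illustrates the idea of leveraging the claim's built-in delimiters as free structural annotations.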

@Choms
Author

Choms commented Jul 11, 2019

@leejason thanks, that is actually quite imaginative, though personally I'm completely anti-patent. I think patents are one of the worst problems in our society; they only help patent trolls, and at best they discourage innovation and make it impossible to use certain knowledge to help people. As they say, sharing is caring, and knowledge cannot be "owned" ;)

@danuker

danuker commented Jul 12, 2019

I imagine it could eventually learn to write source code (perhaps based on a natural language requirement text), not just proofread.

@merltron

merltron commented Jul 12, 2019

This article explains the business cases for the more general field of Natural Language Generation (NLG) pretty well: https://medium.com/sciforce/a-comprehensive-guide-to-natural-language-generation-dd63a4b6e548
(it mentions GPT-2)

As a tech writer for API docs, I would love something that recognizes and generates boilerplate sentence patterns which differ only by a few nouns: "Sets the configuration to enable or disable X for Y". If it could analyze code and generate API/library documentation that a human could edit and tweak, that would be awesome. You would definitely have to tone down the temperature, or "creativity", though...
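Toning down the temperature is just a rescaling of the model's output logits before sampling: dividing by a temperature below 1 sharpens the distribution, so the model sticks closer to its top choice. A minimal self-contained sketch of the mechanism (toy scores, no real model):

```python
import math

def softmax_with_temperature(logits, temperature=1.0):
    """Scale logits by 1/temperature before softmax. Lower temperature
    concentrates probability on the top token (less 'creative')."""
    scaled = [x / temperature for x in logits]
    m = max(scaled)                       # subtract max for numerical stability
    exps = [math.exp(x - m) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]

logits = [2.0, 1.0, 0.5]                  # toy next-token scores
creative = softmax_with_temperature(logits, temperature=1.0)
conservative = softmax_with_temperature(logits, temperature=0.3)
# At temperature 0.3 the top token takes nearly all the probability mass.
print(max(creative), max(conservative))
```

This is why low temperature suits boilerplate documentation: the sampler rarely wanders off the most likely phrasing.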

@KoolenDasheppi

To be honest, I just want the full model to mess around with. I sure as hell don't have the hardware to train a 1.5B-parameter model, and I don't have the knowledge to replicate the model in the first place. I just want a pretrained 1.5B model, as do others. I mean, there are plenty of ways to determine a fake. First of all, where was the text posted? On an official website/account? If not, then it's fake. Plus, there are also people developing AI that will detect fake news and generated text. So why haven't they released the full model? My theory is that it doesn't exist. It's all a facade, and I cannot wait to see the consequences for OpenAI over this obvious lie. Prove that I'm wrong by releasing it. No? Didn't think so.

@Choms
Author

Choms commented Jul 18, 2019

In fairness, I just saw this, which I'll take as a valid real-world use case: https://tabnine.com/blog/deep 🙂

Still interested, though, if someone wants to share more use cases in other fields; most examples seem to be focused on software development itself.

@julien-c

julien-c commented Jul 26, 2019

To run GPT-2 on-device (on iOS using Swift and CoreML), you can take a look at https://github.com/huggingface/swift-coreml-transformers

cc @LysandreJik

@MrKrzYch00

I see it more as a toy to play with. Kind of like, "AI, tell me about X," "Oh, that's interesting, tell me more..."
Making up stories like a small kid, or creating dream-like sci-fi material.
For writing books or whatever, it's good for generating ideas if you are stuck, lost in thought, or indecisive about what to do next; nonetheless, you should mostly write it yourself. (This works quite well for next-sentence suggestions with a small output-to-input ratio, like 64 output tokens against 960 input tokens; for dialogue, not so much.)
But other than that, I see it as a tool that completes the token graph by predicting the possible outcome: how the line will continue, with some randomness going on at the same time.

@danuker

danuker commented Aug 22, 2019

You could train it on tech-support chat logs and let it walk the customer through trying different things.

@dji-transpire

You could argue that https://tabnine.com/ is a real use case that hopefully saves time and produces some revenue for the developer.

@DAMO238

DAMO238 commented Sep 21, 2019

I could see uses for this in making automated responses seem more natural. For example, how many times have you gotten an email from a large company, like Google, that is so obviously just a template with your name filled in? Now imagine that email looked hand-crafted just for you while still relaying all the same information; would you be more inclined to read it? I know I would.

@danuker

danuker commented Sep 26, 2019

Have the model read lots of medical papers; then you can use it to suggest conditions that might be responsible (to be confirmed by real doctors, of course):

The patient reported cough, chest pain, fever, and trouble breathing. The most likely condition is ->
pneumonia.

Or check for similar conditions (samples generated by Talk to Transformer):

Input:
Often misdiagnosed as influenza,
Outputs:
erythema multiforme...
erythema migrans...

Often misdiagnosed as diabetes, -> ileitis...

Of course, the suggestions from the 774M model are not very helpful.
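Mechanically, this use case amounts to conditioning the model on a prompt and reading off the highest-probability continuations rather than sampling just one. A toy sketch of that top-k selection step, with made-up probabilities standing in for real model output (the condition names and scores are invented for illustration):

```python
def top_k_suggestions(candidate_probs: dict, k: int = 3) -> list:
    """Return the k highest-probability candidates, best first."""
    ranked = sorted(candidate_probs.items(), key=lambda kv: kv[1], reverse=True)
    return [name for name, _ in ranked[:k]]

prompt = ("The patient reported cough, chest pain, fever, and trouble "
          "breathing. The most likely condition is")

# Invented continuation probabilities; a real system would get these
# from the language model's output distribution for the prompt above.
fake_model_output = {"pneumonia": 0.46, "bronchitis": 0.21,
                     "influenza": 0.18, "asthma": 0.09}

print(top_k_suggestions(fake_model_output, k=2))  # ['pneumonia', 'bronchitis']
```

Presenting a ranked shortlist instead of a single sampled answer fits the "suggestions for a doctor to confirm" framing better than free-form generation.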

Edit:

Another similar throw-documents-at-it case would be legal work. You could train the model on lots of winning defense statements, start writing a defense, and guide the algorithm through it.

@corasundae

It helps generate interesting writing ideas when you're stuck.

@alexa-ai

There are some real-world use cases (it can give authors ideas for expanding the visual description of a place) and a lot of potential for abuse. I would guess all the search engines have added algorithms by now to detect GPT-2-generated articles. For me, the idea itself was fascinating... until I saw the actual output :-) It's nowhere near an originally written article.

@lbatteau

We are experimenting with GPT-2 to generate autocomplete suggestions for tech support agents. It could save them a lot of typing. See also https://medium.com/@lukas_1583/serving-gpt-2-in-google-cloud-platform-9ea07a69c87d.

@IveJ

IveJ commented Feb 15, 2020 via email
