Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add: Generate Image by DALL·E 3 #493

Merged
merged 8 commits into from
Mar 15, 2024
Merged

Add: Generate Image by DALL·E 3 #493

merged 8 commits into from
Mar 15, 2024

Conversation

mingming-ma
Copy link
Collaborator

@mingming-ma mingming-ma commented Mar 13, 2024

Description

The function generateImage calls the dall-e-3 model api to get results. Add a new command image on user side to display the image generated.

How to test

Enter prompt after command image in the input area: /image ${prompt}
Example: /image Happy White Border Collie. Close up

Fixes: #124

@mingming-ma mingming-ma self-assigned this Mar 13, 2024
@mingming-ma mingming-ma added this to the Release 1.5 milestone Mar 13, 2024
Copy link

cloudflare-pages bot commented Mar 13, 2024

Deploying chatcraft-org with  Cloudflare Pages  Cloudflare Pages

Latest commit: be3d7e9
Status: ✅  Deploy successful!
Preview URL: https://c90437cd.console-overthinker-dev.pages.dev
Branch Preview URL: https://mingming-generateimage.console-overthinker-dev.pages.dev

View logs

@rjwignar rjwignar self-requested a review March 13, 2024 15:02
@tarasglek
Copy link
Owner

we need to change code so commands added automatically show up in /commands atm this command is impossible to discover
image

@tarasglek
Copy link
Owner

also really confusing that /dalle3 doesn't work

@humphd
Copy link
Collaborator

humphd commented Mar 13, 2024

I think it should be /image ...

src/lib/ai.ts Outdated Show resolved Hide resolved
Copy link
Collaborator

@rjwignar rjwignar left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One thing I noticed is that once the chat exceeds the screen height, the scrollbar doesn't automatically move down to show the user the generated image:
dalle3Scrollbar

However, this is something to be investigated in a follow-up, and should not block this from being merged.

src/lib/ai.ts Outdated Show resolved Hide resolved
Copy link
Collaborator

@rjwignar rjwignar left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@mingming-ma These changes look good! It was a good idea to rename the command to "/image".

At the time of my review, @tarasglek and @humphd's suggestions have been addressed and incorporated into this PR:
image

I left ~3 observations/concerns that are either non-issues or can be addressed in follow-up PRs.
They shouldn't prevent this from being merged. @humphd do you agree?

@tarasglek
Copy link
Owner

tarasglek commented Mar 14, 2024

I think we should also try to store the image similarly to how we upload images into chatcraft, so we can then ask gpt vision to analyze it(this can be done in followup)

Copy link
Collaborator

@Amnish04 Amnish04 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@mingming-ma This is a pretty cool feature. Just left a few comments after reviewing.

src/lib/ai.ts Outdated Show resolved Hide resolved
src/lib/ai.ts Show resolved Hide resolved
src/lib/commands/ImageCommand.ts Show resolved Hide resolved
src/lib/ai.ts Outdated Show resolved Hide resolved
@Amnish04
Copy link
Collaborator

Amnish04 commented Mar 14, 2024

@mingming-ma This might be unrelated to this PR, but I feel like the images should have a cursor: pointer for the user to know something happens on clicking them.

Can we make that change in this PR or a follow up?

Another thing I wanted to suggest is that since image generation takes a long time, should we have an info alert as soon as the command is executed saying something like

{
    title: "Generating...",
    message: "Please wait while the image gets generated."
}

@mingming-ma
Copy link
Collaborator Author

I think we should also try to store the image similarly to how we upload images into chatcraft, so we can then ask gpt vision to analyze it(this can be done in followup)

I filed an issue to fix store problem #496 , just want to inform you that at the moment user is able use vision to analyze the generated image

@mingming-ma
Copy link
Collaborator Author

One thing I noticed is that once the chat exceeds the screen height, the scrollbar doesn't automatically move down to show the user the generated image:

@rjwignar I just saw @Amnish04 fixed an issue related the auto scroll, #495, I don't know it is related or not, let's see how it is after this be merged, sounds good?

@Amnish04
Copy link
Collaborator

Amnish04 commented Mar 15, 2024

@rjwignar I just saw @Amnish04 fixed an issue related the auto scroll, #495, I don't know it is related or not, let's see how it is after this be merged, sounds good?

@mingming-ma I just undid that fix in #497 as that was causing nested submenus to break (more details here). But nonetheless, I don't think that was related to this issue

src/lib/ai.ts Outdated
try {
const response = await openai.images.generate({
model: "dall-e-3",
prompt: prompt,
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: when you have the same key/value names, only include it once:

{
  model: "dall-e-e",
  prompt,
  n,
  size,
}

@humphd
Copy link
Collaborator

humphd commented Mar 15, 2024

In a follow-up, I wonder if we should add a menu item to the Options menu to allow you to do the same thing, but pop-up a little model where you type your prompt? Or maybe make it so entering a prompt and asking the dall-e model does this too?

@mingming-ma
Copy link
Collaborator Author

mingming-ma commented Mar 15, 2024

In a follow-up, I wonder if we should add a menu item to the Options menu to allow you to do the same thing, but pop-up a little model where you type your prompt? Or maybe make it so entering a prompt and asking the dall-e model does this too?

@humphd Just guessing the main concern here, let user know this feature more easily?
Maybe we could also add the Commands in the Options menu, and list them as submenus.

@humphd
Copy link
Collaborator

humphd commented Mar 15, 2024

In a follow-up, I wonder if we should add a menu item to the Options menu to allow you to do the same thing, but pop-up a little model where you type your prompt? Or maybe make it so entering a prompt and asking the dall-e model does this too?

@humphd Just guessing the main concern here, let user know this feature more easily? Maybe we could also add the Commands in the Options menu, and list them as submenus.

Lots of users will never use/know about /slash commands. It's pretty advanced.

@Amnish04
Copy link
Collaborator

@mingming-ma Do we have an issue to track this #493 (comment)

@Amnish04
Copy link
Collaborator

Amnish04 commented Mar 15, 2024

@mingming-ma One more thing, I feel like the image preview modal doesn't help much right now as most of the times we get a larger image in the message view itself.

313337019-b83d2269-a991-4deb-a82e-b6fd68731d20

As a user, I would love to have an image zoom feature on click in the preview modal. We should do it in a separate issue.
I found this library that could help with it
https://www.npmjs.com/package/react-medium-image-zoom

Examples at https://rpearce.github.io/react-medium-image-zoom/?path=/story/galleries--image-gallery

@humphd Do we have anything already in our tree to help with it, or should we pull this library I mentioned.

@mingming-ma
Copy link
Collaborator Author

mingming-ma commented Mar 15, 2024

@mingming-ma Do we have an issue to track this #493 (comment)

@Amnish04 Just filed #504 to track this

@mingming-ma One more thing, I feel like the image preview modal doesn't help much right now as most of the times we get a larger image in the message view itself.
As a user, I would love to have an image zoom feature on click in the preview modal. We should do it in a separate issue.
I found this library that could help with it
https://www.npmjs.com/package/react-medium-image-zoom

That's great! I'll investigate that later.

@mingming-ma
Copy link
Collaborator Author

mingming-ma commented Mar 15, 2024

I think we can merge this PR for now, @humphd @Amnish04 I'll create another issue follow up the react-medium-image-zoom

Update: issue -> #505

@mingming-ma mingming-ma merged commit d09c947 into main Mar 15, 2024
4 checks passed
@mingming-ma mingming-ma deleted the mingming/generateimage branch April 10, 2024 16:32
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Consider supporting Dall-E for image generation in chats
5 participants