Allow using images as input with gpt-4 vision #286

Merged on Feb 9, 2024 (55 commits)

Commits
b5da133
rebase origin/main
mingming-ma Nov 9, 2023
c146714
add image_url to ChatCraftHumanMessage, update onSendClick to send th…
mingming-ma Nov 9, 2023
fd4d5df
Update store format to use Base64 data
mingming-ma Nov 9, 2023
a009a66
rename image_url to image and remove debug log
mingming-ma Nov 9, 2023
8791b35
remove deprecated interface and rename content to textAndImage
mingming-ma Nov 9, 2023
5b73645
rename image_url to image
mingming-ma Nov 9, 2023
40340d5
fix cutoff test
mingming-ma Nov 11, 2023
ff7de7f
fix type mismatch by using ChatCompletionContentPart
mingming-ma Nov 19, 2023
0cace53
add supportsImages to ChatCraftModel and set max_tokens for gpt-4-vis…
mingming-ma Nov 21, 2023
0c4d571
add clip icon to upload files
mingming-ma Nov 21, 2023
22fda70
now click icon support upload images
mingming-ma Nov 21, 2023
c41c148
preview image only cap height, so can preview full
mingming-ma Nov 21, 2023
3373039
reset input, user can attach same files if they want
mingming-ma Nov 21, 2023
6121261
update model when user select images
mingming-ma Nov 23, 2023
855ce84
try openai lasted version
mingming-ma Nov 23, 2023
7d2fdbb
db store images
mingming-ma Nov 23, 2023
95903f6
rebase on latest main
mingming-ma Nov 23, 2023
af2fcc0
move close X circle to top-right corner
mingming-ma Nov 23, 2023
aca69c9
bugfix, should also clear input images after send transcript
mingming-ma Nov 23, 2023
3ff2898
click small image shows larger in the middle
mingming-ma Nov 23, 2023
a7e3db4
better look of small image's close button
mingming-ma Nov 23, 2023
a429d58
Images in messages width 100%
mingming-ma Nov 29, 2023
2378dac
re-focus the prompt
mingming-ma Nov 30, 2023
a611c05
past image file to attach
mingming-ma Dec 5, 2023
427ca59
use chakra-ui Image component
mingming-ma Dec 11, 2023
874d9d7
rebase on main
mingming-ma Dec 11, 2023
63cd21a
refactor with ImageModal component
mingming-ma Dec 11, 2023
9f7eb0b
remove unused variable in the comment
mingming-ma Dec 12, 2023
5101e66
set image optional in ChatCraftMessageTable
mingming-ma Dec 12, 2023
af0d8f6
rename Clip everywhere to Attach
mingming-ma Dec 12, 2023
8ec864f
rename textAndImage to content
mingming-ma Dec 12, 2023
bd3b179
optimize: remove this.image check as not undefined; shorthand assign …
mingming-ma Dec 12, 2023
87e3147
catch processing images' error and show the popup
mingming-ma Dec 12, 2023
cc975e2
Support adding more than 1 image at a time in the file picker
mingming-ma Jan 16, 2024
8050611
add an index number to the top left corner of each image
mingming-ma Jan 16, 2024
bad9188
fix number outline shape oval to round, add margin
mingming-ma Jan 28, 2024
1a3f24f
fix broken pnpm-lock
mingming-ma Feb 1, 2024
ae80a97
only image also get response
mingming-ma Feb 1, 2024
3cc95c7
Re-focus after closing the full image
mingming-ma Feb 3, 2024
d870c8c
update comment about cutoff bug
mingming-ma Feb 3, 2024
eb817cf
rebase on main
mingming-ma Feb 6, 2024
11a9e18
use new menu component, no long space after message
mingming-ma Feb 6, 2024
3e795b4
image in modal enlarge
mingming-ma Feb 6, 2024
8c1d624
support continue use non vision model when history has images
mingming-ma Feb 8, 2024
725f5cf
Not being able to send empty messages
mingming-ma Feb 8, 2024
de8b5fb
Fix image styling when using gpt4-vision (#419)
Amnish04 Feb 8, 2024
7f1109c
remove the unused import
mingming-ma Feb 8, 2024
0e8d50d
fix share chat image missing
mingming-ma Feb 8, 2024
68830a6
refact useEffect dependencies
mingming-ma Feb 8, 2024
6f725f5
rename Attach to AttachFileButton
mingming-ma Feb 8, 2024
1a592b6
rename image to imageUrls
mingming-ma Feb 8, 2024
1f023ea
fix typo
mingming-ma Feb 8, 2024
11ebc37
use maxHeight for responsive modal
mingming-ma Feb 8, 2024
3a27281
refactor code
mingming-ma Feb 9, 2024
f679e4e
update dependencies
mingming-ma Feb 9, 2024
8 changes: 6 additions & 2 deletions src/Chat/ChatBase.tsx
@@ -146,7 +146,7 @@ function ChatBase({ chat }: ChatBaseProps) {

// Handle prompt form submission
const onPrompt = useCallback(
-async (prompt?: string) => {
+async (prompt?: string, imageUrls?: string[]) => {
setLoading(true);

// Special-case for "help", to invoke /help command
@@ -198,7 +198,11 @@ function ChatBase({ chat }: ChatBaseProps) {
// If the prompt text exist, package it up as a human message and add to the chat
if (prompt) {
// Add this prompt message to the chat
-promptMessage = new ChatCraftHumanMessage({ text: prompt, user });
+promptMessage = new ChatCraftHumanMessage({ text: prompt, imageUrls, user });
await chat.addMessage(promptMessage);
+} else if (imageUrls?.length) {
+// Add only image to the chat
+promptMessage = new ChatCraftHumanMessage({ text: "", imageUrls, user });
+await chat.addMessage(promptMessage);
} else {
// If there isn't any prompt text, see if the final message in the chat was a human
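The branching in this hunk (text with optional images, images only, or neither) can be sketched in isolation. This is a minimal illustration, not the PR's code: `HumanMessageInit` and `buildPromptInit` are hypothetical names, and the real code constructs a `ChatCraftHumanMessage` and adds it to the chat instead of returning a plain object.

```typescript
// Hypothetical stand-in for the constructor arguments used in the hunk above.
type HumanMessageInit = { text: string; imageUrls: string[] };

// Returns the fields for a new human message, or null when there is nothing
// to send (the case the original code handles in its final `else` branch).
function buildPromptInit(
  prompt?: string,
  imageUrls: string[] = []
): HumanMessageInit | null {
  if (prompt) {
    // Text is present: attach whatever images came with it (possibly none).
    return { text: prompt, imageUrls };
  }
  if (imageUrls.length > 0) {
    // Image-only message: send an empty text alongside the images.
    return { text: "", imageUrls };
  }
  return null;
}
```

Since both non-empty branches build the same kind of message, they could arguably be merged into a single `if (prompt || imageUrls.length)` check with `text: prompt || ""`; the diff keeps them separate.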
40 changes: 40 additions & 0 deletions src/components/ImageModal.tsx
@@ -0,0 +1,40 @@
import React from "react";
import {
Modal,
ModalOverlay,
ModalContent,
ModalCloseButton,
ModalBody,
Image,
Flex,
} from "@chakra-ui/react";

interface ImageModalProps {
isOpen: boolean;
onClose: () => void;
imageSrc: string;
}

const ImageModal: React.FC<ImageModalProps> = ({ isOpen, onClose, imageSrc }) => (
<Modal isOpen={isOpen} onClose={onClose} size="2xl" isCentered>
<ModalOverlay />
<ModalContent maxW="90vw" maxHeight="90vh">
<ModalCloseButton />
<ModalBody>
<Flex height={"100%"} justifyContent={"center"} alignItems={"center"}>
<Image
maxWidth="100%"
maxHeight="70vh"
overflow={"auto"}
src={imageSrc}
alt="Selected Image"
m="auto"
objectFit="contain"
/>
</Flex>
</ModalBody>
</ModalContent>
</Modal>
);

export default ImageModal;
24 changes: 23 additions & 1 deletion src/components/Message/MessageBase.tsx
@@ -21,6 +21,7 @@ import {
Flex,
Heading,
IconButton,
+Image,
Link,
Tag,
Text,
@@ -53,6 +54,7 @@ import { ChatCraftModel } from "../../lib/ChatCraftModel";
import { useModels } from "../../hooks/use-models";
import { useSettings } from "../../hooks/use-settings";
import { useAlert } from "../../hooks/use-alert";
+import ImageModal from "../ImageModal";

// Styles for the message text are defined in CSS vs. Chakra-UI
import "./Message.css";
@@ -103,7 +105,7 @@ function MessageBase({
disableFork,
disableEdit,
}: MessageBaseProps) {
-const { id, date, text } = message;
+const { id, date, text, imageUrls } = message;
const { models } = useModels();
const { onCopy } = useClipboard(text);
const { info, error } = useAlert();
@@ -114,6 +116,8 @@ function MessageBase({
const messageForm = useRef<HTMLFormElement>(null);
const messageContent = useRef<HTMLDivElement>(null);
const meta = useMemo(getMetaKey, []);
+const [imageModalOpen, setImageModalOpen] = useState<boolean>(false);
+const [selectedImage, setSelectedImage] = useState<string>("");
const { isOpen, onToggle: originalOnToggle } = useDisclosure();
const isLongMessage = text.length > 5000;
const displaySummaryText = !isOpen && (summaryText || isLongMessage);
@@ -269,6 +273,12 @@ function MessageBase({
[message, onResubmitClick, chatId, error, onEditingChange]
);

+const openModalWithImage = (imageSrc: string) => {
+setSelectedImage(imageSrc);
+setImageModalOpen(true);
+};
+const closeModal = () => setImageModalOpen(false);
+
return (
<Box
id={id}
@@ -481,6 +491,17 @@ function MessageBase({
// Add a single pixel of offset for rendering to canvas (offset handled above with m=-1)
p={1}
>
+{imageUrls.map((imageUrl, index) => (
+<Box key={`${id}-${index}`}>
+<Image
+src={imageUrl}
+alt={`Images# ${index}`}
+margin={"auto"}
+maxWidth={"100%"}
+onClick={() => openModalWithImage(imageUrl)}
+/>
+</Box>
+))}
<Markdown
previewCode={!hidePreviews && !displaySummaryText}
isLoading={isLoading}
@@ -497,6 +518,7 @@ function MessageBase({
</Box>
)}
</Box>
+<ImageModal isOpen={imageModalOpen} onClose={closeModal} imageSrc={selectedImage} />
</Flex>
</CardBody>
{footer && <CardFooter py={2}>{footer}</CardFooter>}
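MessageBase only displays `imageUrls`. For a request to a vision model, the same text and URLs would presumably be combined into content parts (one commit in this PR mentions `ChatCompletionContentPart`). The sketch below assumes the OpenAI chat completions content-part shapes (`text` and `image_url`). The local type names and `toContentParts` are hypothetical and are not taken from this PR.

```typescript
// Local copies of the content-part shapes, declared here so the sketch is
// self-contained; the real code would import the types from the openai package.
type TextPart = { type: "text"; text: string };
type ImagePart = { type: "image_url"; image_url: { url: string } };
type ContentPart = TextPart | ImagePart;

// Hypothetical helper: plain string when there are no images, otherwise an
// array with an optional text part followed by one part per image.
function toContentParts(
  text: string,
  imageUrls: string[] = []
): string | ContentPart[] {
  if (imageUrls.length === 0) {
    return text;
  }
  const parts: ContentPart[] = [];
  if (text) {
    parts.push({ type: "text", text });
  }
  for (const url of imageUrls) {
    // Base64 data URLs (as produced by the file picker) go in the `url` field.
    parts.push({ type: "image_url", image_url: { url } });
  }
  return parts;
}
```

Returning a plain string when no images are attached keeps text-only messages in the same form as before for models that do not accept images.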
61 changes: 61 additions & 0 deletions src/components/PromptForm/AttachFileButton.tsx
@@ -0,0 +1,61 @@
import { useState, useRef } from "react";
import { IconButton, Input } from "@chakra-ui/react";
import { BsPaperclip } from "react-icons/bs";

import useMobileBreakpoint from "../../hooks/use-mobile-breakpoint";

type AttachProps = {
isDisabled: boolean;
onFileSelected: (base64: string) => void;
};

export default function AttachFileButton({ isDisabled = false, onFileSelected }: AttachProps) {
const isMobile = useMobileBreakpoint();
const [colorScheme] = useState<"blue" | "red">("blue");
const clipIconRef = useRef<HTMLButtonElement | null>(null);
const fileInputRef = useRef<HTMLInputElement>(null);

const handleFileChange = (event: React.ChangeEvent<HTMLInputElement>) => {
const files = event.target.files;
if (files) {
for (let i = 0; i < files.length; i++) {
const reader = new FileReader();
reader.onload = (e) => {
onFileSelected(e.target?.result as string);
};
reader.readAsDataURL(files[i]);
}
// Reset the input value after file read
event.target.value = "";
}
};

const handleClick = () => {
fileInputRef.current?.click();
};

return (
<>
<Input
multiple
type="file"
ref={fileInputRef}
hidden
onChange={handleFileChange}
accept="image/*"
/>
<IconButton
onClick={handleClick}
isRound
isDisabled={isDisabled}
colorScheme={colorScheme}
variant={isMobile ? "outline" : "ghost"}
icon={<BsPaperclip />}
aria-label="Attach file"
size={isMobile ? "lg" : "md"}
fontSize="18px"
ref={clipIconRef}
/>
</>
);
}
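`handleFileChange` starts one `FileReader` per file and calls `onFileSelected` from each `onload`. When several files are picked at once, the reads may finish in a different order than the files were selected. The sketch below shows one way to keep selection order. It is not part of this PR, and `createOrderedCollector` is a hypothetical helper.

```typescript
// Callback that records the result for the file at a given position.
type Collect = (index: number, dataUrl: string) => void;

// Hypothetical helper: gathers results that may arrive in any order and
// reports them once, in the original file order, after all have arrived.
function createOrderedCollector(
  count: number,
  onComplete: (dataUrls: string[]) => void
): Collect {
  const slots = new Array<string | undefined>(count).fill(undefined);
  let remaining = count;

  return (index, dataUrl) => {
    if (index < 0 || index >= count || slots[index] !== undefined) {
      return; // ignore out-of-range or duplicate deliveries
    }
    slots[index] = dataUrl;
    remaining -= 1;
    if (remaining === 0) {
      onComplete(slots.slice() as string[]);
    }
  };
}

// Simulate three reads finishing out of order (2, 0, 1).
let orderedUrls: string[] = [];
const collect = createOrderedCollector(3, (urls) => {
  orderedUrls = urls;
});
collect(2, "data:image/png;base64,CCC");
collect(0, "data:image/png;base64,AAA");
collect(1, "data:image/png;base64,BBB");
// orderedUrls now lists AAA, BBB, CCC in selection order.
```

In the component, each `reader.onload` could call `collect(i, e.target?.result as string)`, and the completion callback could then call `onFileSelected` once per URL. The cost is that no preview appears until every file has been read.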