To aid in the design for both of these:
I'm going to gather a bunch of examples of how different LLMs accept multi-modal inputs. I'm particularly interested in the following:
- What kind of files do they accept?
- Do they accept file uploads, base64 inline files, URL references or a selection?
- How are these interspersed with text prompts? This will help inform the database schema design for #556 (Design new LLM database schema).
- If files are included with a text prompt, does the prompt go before or after the files?
- How many files can be attached at once?
- Is extra information such as the mimetype needed? If so, this helps inform how the CLI design looks (can I do `--file filename.ext` or do I need some other mechanism that helps provide the type as well?)
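
To make the mimetype question concrete, here is a rough sketch of one possible approach: guess the type from the file extension using Python's standard `mimetypes` module, and allow an explicit override when the guess is missing or wrong. The `Attachment` class and the `guess_mimetype` and `attachment_from_bytes` helpers are hypothetical names made up for illustration, not an existing API or a decided design.

```python
import base64
import mimetypes
from dataclasses import dataclass
from typing import Optional


@dataclass
class Attachment:
    # Hypothetical container for illustration only, not an existing API.
    filename: str
    mimetype: str
    data_base64: str


def guess_mimetype(filename: str, override: Optional[str] = None) -> str:
    """Return the override if given, else guess from the file extension."""
    if override:
        return override
    guessed, _encoding = mimetypes.guess_type(filename)
    # Fall back to a generic binary type when the extension is unknown.
    return guessed or "application/octet-stream"


def attachment_from_bytes(
    filename: str, content: bytes, mimetype: Optional[str] = None
) -> Attachment:
    """Build a base64 inline attachment from raw bytes."""
    return Attachment(
        filename=filename,
        mimetype=guess_mimetype(filename, mimetype),
        data_base64=base64.b64encode(content).decode("ascii"),
    )
```

If extension-based guessing turns out to be good enough for the APIs surveyed, `--file filename.ext` alone could work, with an explicit type only needed as an escape hatch. That is an assumption to check against the examples gathered, not a conclusion.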