
[iOS] support for multimodal #524

Merged: 15 commits merged into mlc-ai:main on Jul 17, 2023

Conversation

@Kathryn-cat (Contributor) commented on Jul 10, 2023

This PR introduces multimodal support for iOS. Specifically, below is a demo of running MiniGPT on iOS.

[demo video: minigpt_ios.mov]

Changes:

  • standalone image_embed.cc and image_module.py for image-module-related functionality
  • support for uploading images or taking photos in iOS
  • a prefillImage function in LLMChat.mm that handles the conversion from UIImage* to void* to tvm::runtime::NDArray
  • an image pre-processing module in relax_model (see the sketch below)
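
To make that last item concrete, here is a rough sketch of the kind of pre-processing an image module for a MiniGPT-style model typically performs (resize to the vision encoder's input resolution, CLIP-style per-channel normalization, NCHW layout). The function name and constants below are illustrative assumptions, not the actual relax_model implementation.

# Hypothetical sketch only; the actual relax_model module is compiled into the
# model and may differ in detail.
import numpy as np
from PIL import Image

# CLIP-style normalization constants commonly used by MiniGPT-like vision encoders.
CLIP_MEAN = np.array([0.48145466, 0.4578275, 0.40821073], dtype=np.float32)
CLIP_STD = np.array([0.26862954, 0.26130258, 0.27577711], dtype=np.float32)

def preprocess_image(path: str, size: int = 224) -> np.ndarray:
    """Load an image and return a (1, 3, size, size) float32 tensor."""
    img = Image.open(path).convert("RGB").resize((size, size), Image.BICUBIC)
    arr = np.asarray(img, dtype=np.float32) / 255.0   # HWC, values in [0, 1]
    arr = (arr - CLIP_MEAN) / CLIP_STD                # per-channel normalization
    return arr.transpose(2, 0, 1)[np.newaxis, ...]    # NCHW with a batch dimension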

Update:

  • did not add the MiniGPT model to the app-config.json file because it would affect users; let's add it in a follow-up PR after we upload the tuned MiniGPT model to Hugging Face

@Kathryn-cat force-pushed the pr-ios-multimodal branch 7 times, most recently from 8487d88 to 6e24278, on July 11, 2023 at 17:56
@Kathryn-cat marked this pull request as ready for review on July 11, 2023 at 18:06
@Kathryn-cat force-pushed the pr-ios-multimodal branch 6 times, most recently from ed37c7c to c5f1116, on July 14, 2023 at 21:49
@tqchen (Contributor) commented on Jul 17, 2023

cc @MasterJH5574 @yzh119 @cyx-6, it would be great if you could help review.

@Kathryn-cat (Contributor, Author) commented on Jul 17, 2023

Just a heads up, I didn't change the app-config.json file because I would like to formally add minigpt once we upload the tuned module to HF, so that it won't affect users right now.
If you would like to do some local testing, I can share the commands.

@Kathryn-cat (Contributor, Author) commented

I'm planning to introduce a high-level chat.generate() Python API in a follow-up PR, because the API might need some further discussion. Let me know if everything is good in this PR.
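
Purely as an illustration of what such a high-level API could look like, a minimal toy sketch follows; the class name, method signatures, and behavior are hypothetical and not part of this PR.

# Hypothetical sketch of a high-level generate() API; names and behavior are
# illustrative only and may differ from whatever the follow-up PR introduces.
from typing import List, Optional, Tuple

class ChatModule:
    """Toy stand-in for a chat module with optional image conditioning."""

    def __init__(self, model: str):
        self.model = model
        self.history: List[Tuple[str, str]] = []

    def _prefill_image(self, image_path: str) -> None:
        # A real implementation would run the image encoder and prefill the
        # resulting embeddings before the text prompt.
        print(f"[prefill image] {image_path}")

    def _decode(self, prompt: str, max_new_tokens: int) -> str:
        # Placeholder for autoregressive decoding.
        return f"(reply to {prompt!r}, up to {max_new_tokens} tokens)"

    def generate(self, prompt: str, image: Optional[str] = None,
                 max_new_tokens: int = 256) -> str:
        """One chat turn: optionally prefill an image, then decode a reply."""
        if image is not None:
            self._prefill_image(image)
        reply = self._decode(prompt, max_new_tokens)
        self.history.append((prompt, reply))
        return reply

chat = ChatModule("minigpt4-7b-q3f16_0")
print(chat.generate("What is in this photo?", image="photo.jpg"))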

@tqchen merged commit 6cf8d4f into mlc-ai:main on Jul 17, 2023
@dylanbeadle commented on Aug 20, 2023

@Kathryn-cat when (or where) can we expect to find minigpt4-7b-q3f16_0? I would like to test it if possible.

edit: found this, so I'll keep an eye on it there: #679

@Kathryn-cat (Contributor, Author) commented

Hey @dylanbeadle, minigpt4-7b-q3f16_0 is just for testing purposes. You can build MiniGPT-4 with q3f16_0 (which might be a slightly outdated quantization method) and test it on iOS. I'll work on testing the image-multimodal part on iOS soon too.

@dylanbeadle commented

Hi @Kathryn-cat, I tried to build MiniGPT-4 from https://huggingface.co/wangrongsheng/MiniGPT-4-LLaMA-7B using python3 build.py --model ../../mlc/MiniGPT-4-LLaMA-7B --quantization q3f16_0 --target metal but got "Unsupported model: {model_name}" (presumably because the model name does not start with 'minigpt'; see relax_model/minigpt.py, line 509). Any suggestions or pointers on the correct(?) base MiniGPT-4 model I should start with?
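
For context, the error message and the referenced line suggest the build script dispatches on a model-name prefix. The snippet below is only a guess at the shape of that check (not the actual mlc-llm code), together with the workaround it implies: giving the local model directory a name that starts with "minigpt".

# Guess at the kind of model-name check behind "Unsupported model"; the real
# dispatch in mlc-llm may differ in detail.
def pick_model_family(model_name: str) -> str:
    if model_name.startswith("minigpt"):
        return "minigpt"
    raise ValueError(f"Unsupported model: {model_name}")

# Implied workaround: copy or rename the local checkout to a directory whose
# name starts with "minigpt" (e.g. a hypothetical dist/models/minigpt4-7b) and
# point --model at that directory instead.
print(pick_model_family("minigpt4-7b"))      # -> "minigpt"
# pick_model_family("MiniGPT-4-LLaMA-7B")    # would raise ValueError: Unsupported model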

@Lurrobert commented

Great work! 🎉

@kumar-abhi commented

Hi @Kathryn-cat, I tried to build MiniGPT-4 from https://huggingface.co/wangrongsheng/MiniGPT-4-LLaMA-7B but am getting the error shown in the attached screenshot. Any suggestions or pointers for resolving this, please?
[attached screenshot: Screenshot 2023-11-14 at 3 10 39 PM]

@amol-prakash commented on Nov 15, 2023

Hey @Kathryn-cat, I tried building MiniGPT-4 using the URLs below:

URL 1: https://huggingface.co/Vision-CAIR/MiniGPT-4
URL 2: https://huggingface.co/wangrongsheng/MiniGPT-4-LLaMA-7B

I'm getting the errors below during this process. Any suggestions or pointers for resolving this, please?

<username>@USER mlc-llm % python3 -m mlc_llm.build --hf-path /Vision-CAIR/MiniGPT-4 --target iphone --max-seq-len 768 --quantization q4f16_1
git: 'lfs' is not a git command. See 'git --help'.

The most similar command is
	log
Cloning into 'dist/models/MiniGPT-4'...
fatal: repository 'https://huggingface.co//Vision-CAIR/MiniGPT-4/' not found
Downloaded weights to dist/models/MiniGPT-4
Traceback (most recent call last):
  File "<frozen runpy>", line 198, in _run_module_as_main
  File "<frozen runpy>", line 88, in _run_code
  File "/Users/<username>/Projects/Training Projects/mlc-llm/mlc-llm/mlc_llm/build.py", line 47, in <module>
    main()
  File "/Users/<username>/Projects/Training Projects/mlc-llm/mlc-llm/mlc_llm/build.py", line 41, in main
    parsed_args = core._parse_args(parsed_args)  # pylint: disable=protected-access
                  ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/<username>/Projects/Training Projects/mlc-llm/mlc-llm/mlc_llm/core.py", line 429, in _parse_args
    parsed = _setup_model_path(parsed)
             ^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/<username>/Projects/Training Projects/mlc-llm/mlc-llm/mlc_llm/core.py", line 471, in _setup_model_path
    validate_config(args.model_path)
  File "/Users/<username>/Projects/Training Projects/mlc-llm/mlc-llm/mlc_llm/core.py", line 514, in validate_config
    assert os.path.exists(
AssertionError: Expecting HuggingFace config, but file not found: dist/models/MiniGPT-4/config.json.
<username>@USER mlc-llm % python3 -m mlc_llm.build --hf-path /wangrongsheng/MiniGPT-4-LLaMA-7B --target iphone --max-seq-len 768 --quantization q4f16_1
git: 'lfs' is not a git command. See 'git --help'.

The most similar command is
	log
Cloning into 'dist/models/MiniGPT-4-LLaMA-7B'...
fatal: repository 'https://huggingface.co//wangrongsheng/MiniGPT-4-LLaMA-7B/' not found
Downloaded weights to dist/models/MiniGPT-4-LLaMA-7B
Traceback (most recent call last):
  File "<frozen runpy>", line 198, in _run_module_as_main
  File "<frozen runpy>", line 88, in _run_code
  File "/Users/<username>/Projects/Training Projects/mlc-llm/mlc-llm/mlc_llm/build.py", line 47, in <module>
    main()
  File "/Users/<username>/Projects/Training Projects/mlc-llm/mlc-llm/mlc_llm/build.py", line 41, in main
    parsed_args = core._parse_args(parsed_args)  # pylint: disable=protected-access
                  ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/<username>/Projects/Training Projects/mlc-llm/mlc-llm/mlc_llm/core.py", line 429, in _parse_args
    parsed = _setup_model_path(parsed)
             ^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/<username>/Projects/Training Projects/mlc-llm/mlc-llm/mlc_llm/core.py", line 471, in _setup_model_path
    validate_config(args.model_path)
  File "/Users/<username>/Projects/Training Projects/mlc-llm/mlc-llm/mlc_llm/core.py", line 514, in validate_config
    assert os.path.exists(
AssertionError: Expecting HuggingFace config, but file not found: dist/models/MiniGPT-4-LLaMA-7B/config.json.

@dylanbeadle commented

@amol-prakash it seems you have an extra leading "/" in your --hf-path parameter, which is why the repos are not found. But even after fixing that, you may run into a similar issue to the one I posted above (#524 (comment)).
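
For reference, the same invocation with the leading slash removed (all other flags exactly as in the original command) would be:

python3 -m mlc_llm.build --hf-path wangrongsheng/MiniGPT-4-LLaMA-7B --target iphone --max-seq-len 768 --quantization q4f16_1

The log also shows git: 'lfs' is not a git command, which suggests Git LFS is not installed; since the model weights are fetched via Git LFS, installing it (and running git lfs install) is likely also needed. Even then, the build may still hit the model-name issue described in the earlier comment.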
