
Added input channel pre-processing to enable vision models to function on MNIST #64

Closed
wants to merge 6 commits

Conversation

@mau-mar (Contributor) commented Mar 17, 2024

I'm Mauro Marino from Group 0 (Mauro Marino, William Powell), working on Project 1 (TensorRT integration in MASE).

An issue was opened reporting errors when processing the MNIST dataset with vision models. This is traceable to MNIST being a grayscale dataset, which provides only one input channel, while MASE's convolutional models expect 3 channels.

The proposed fix applies a series of checks to ascertain whether the model-dataset combination requires intervention (some feedforward neural network models can run on MNIST without further action). If needed, it overrides the model architecture when using MNIST by prepending a single Conv2d, mapping the single input channel to 3 output channels, before the first convolutional layer.
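As a rough sketch of the idea (the helper name and structure below are hypothetical illustrations, not the PR's actual code), the fix amounts to prepending a 1-to-3-channel Conv2d in front of the model:

```python
import torch
import torch.nn as nn

def adapt_grayscale_input(model: nn.Module, expected_channels: int = 3) -> nn.Module:
    """Hypothetical helper: wrap a vision model that expects 3-channel input
    so it accepts 1-channel (grayscale) inputs such as MNIST batches."""
    # A 1x1 convolution learns a per-pixel mapping from the single grayscale
    # channel to the number of channels the model's first conv layer expects.
    adapter = nn.Conv2d(in_channels=1, out_channels=expected_channels, kernel_size=1)
    return nn.Sequential(adapter, model)

# Usage sketch: an MNIST batch has shape (N, 1, 28, 28); the wrapped model accepts it.
# wrapped = adapt_grayscale_input(some_cnn_expecting_rgb)
# logits = wrapped(torch.randn(8, 1, 28, 28))
```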

bakhtiarZ pushed a commit to bakhtiarZ/mase that referenced this pull request Mar 18, 2024
* print args in cli

* print args, remove redundant args

* rename quantize *_bits to *_width, *_fraction_bits to *_frac

* add MaseTracer and mark_as_leaf

* replace "bits" with "width" when specifying bit width

* add integer matmul

* create MaseTracer

* only sync_dist on epoch end

* on_epoch=False for wrapper training step

* add new modifier and tracer. support custom func/module as leaf node

* remove old modifier

* supported save_name specified by users

* remove redundant comments

* get_dummy_inputs for bert-base-uncased and roberta-base

* save modified as pickle

* fixed bugs in MaseTracer

* OPTAttention mode 1 and 3 work

* traceable OPTDecoderLayer

* add func get_patched_nlp_model, which works in a way similar to get_nlp_model

* more facebook/opt models supported, not tested yet

* remove "mase_output" dir and use args.save instead

* fixed bugs in modifier, new get_dummy_inputs

* support training quantized facebook/opt

* use a unified cache dir

now we have
software/
  |-- cache
       |-- model_cache_dir
       |-- dataset_cache_dir
       |-- tokenizer_cache_dir

* update README.md and gitignore

* use user's output dir

* agreed --project_dir and --project for saving generated files

KelseyJing pushed a commit to KelseyJing/mase that referenced this pull request Mar 18, 2024
@jianyicheng (Collaborator)

@mau-mar

Hi, we found a bug in the YAML files that stops CI from running on forked repos. It has now been fixed and merged into the upstream.
Could you merge the upstream's main branch into this PR? That should trigger the CI properly.

Thanks,
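For reference, one standard way to do this (the remote and branch names below are assumptions, not taken from the PR):

```sh
# Add the upstream repository once, then merge its main branch into the PR branch.
git remote add upstream https://github.com/DeepWok/mase.git   # assumed upstream URL
git fetch upstream
git checkout <pr-branch>        # the branch this PR is opened from
git merge upstream/main
git push origin <pr-branch>     # pushing should re-trigger CI on the PR
```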

removed docker credentials (from upstream/main)
@mau-mar (Contributor, Author) commented Mar 26, 2024

@jianyicheng

Hi, I just merged upstream changes into the PR branch. Does it work now?

@jianyicheng deleted the branch DeepWok:x April 3, 2024 15:40
@jianyicheng closed this Apr 3, 2024