
Issue about training configuration on Binary Adapter with VTAB-1k Dataset #19

Open
leoli646 opened this issue Apr 24, 2024 · 1 comment


@leoli646

Hello,
When I tried to replicate your binary_adapter experiment on the VTAB-1k dataset, I was unable to reproduce the results you reported. I would like to discuss some potential issues with the training configuration that might be causing this discrepancy.

In similar works such as VPT and SSF, different hyperparameters (learning rate, weight decay, drop-path rate, etc.) are used for the various datasets within VTAB-1k. However, the train.sh script in the binary_adapter codebase doesn't seem to account for these variations and applies the same default hyperparameters to every dataset.

Could you advise on whether I should:

  1. Conduct a grid search to find the best hyperparameter set for each dataset?
  2. Or adopt the hyperparameter settings from a published work such as SSF?
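For option 1, a per-dataset grid search just enumerates one training run per (dataset, hyperparameter) combination. The sketch below is purely illustrative: the dataset names, script name, and CLI flags are assumptions, not the actual binary_adapter interface.

```python
from itertools import product

# Hypothetical search space -- dataset names and flag names are
# illustrative stand-ins, not the real binary_adapter train.sh arguments.
datasets = ["cifar100", "caltech101", "dtd"]
lrs = [1e-3, 5e-3, 1e-2]
weight_decays = [1e-4, 1e-5]

def build_commands(datasets, lrs, weight_decays):
    """Enumerate one training command per (dataset, lr, wd) combination."""
    cmds = []
    for ds, lr, wd in product(datasets, lrs, weight_decays):
        cmds.append(
            f"python train.py --dataset {ds} --lr {lr} --weight_decay {wd}"
        )
    return cmds

commands = build_commands(datasets, lrs, weight_decays)
print(len(commands))  # 3 datasets x 3 lrs x 2 wds = 18 runs
```

Each command would then be run (e.g. via a job scheduler), and the setting with the best validation accuracy kept per dataset. Note the cost grows multiplicatively with each hyperparameter added to the grid.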

Your insights would be greatly appreciated as I continue my experiments.

Looking forward to your reply!

@JieShibo
Owner

In our experiments, we only searched over the scale factor. All experiments were conducted on RTX 3090 GPUs and may show slight variations in results when run on different devices. Further exploration of hyperparameters such as learning rate and weight decay could potentially improve performance.
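A search over just the scale factor, as described above, can be sketched as a simple one-dimensional sweep. Here `train_and_eval` is a hypothetical stand-in for the actual training-plus-validation loop, and the candidate scales are illustrative, not the values used in the paper.

```python
def search_scale(train_and_eval, scales=(0.01, 0.1, 1.0, 10.0)):
    """Try each candidate adapter scale and keep the best validation accuracy.

    train_and_eval: callable taking scale=... and returning a validation
    accuracy; it is a placeholder for the real training loop.
    """
    best_scale, best_acc = None, float("-inf")
    for s in scales:
        acc = train_and_eval(scale=s)
        if acc > best_acc:
            best_scale, best_acc = s, acc
    return best_scale, best_acc
```

Because only one hyperparameter is swept, the cost is linear in the number of candidates, which is why this is much cheaper than a full grid over learning rate and weight decay as well.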
