Remove n_ctx from configs #14165
Conversation
yo, you probably meant to tag @patrickvonplaten :)
Oops! Thanks!
Force-pushed from c6d49b3 to d34f577
Thank you, Thomas! I haven't had a chance to confirm that it's truly identical. Perhaps someone could do it, since I won't be able to do that now. But a question: how are we going to deal with backward compatibility if we are removing a config key? Assume that an existing config on the Hub still sets it. @sgugger, what do you think?
In terms of backward compatibility, it won't change anything to have this attribute or not: the config will still set the attribute if it appears in a saved config file, since unrecognized keys are kept as plain attributes. cc @LysandreJik
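To illustrate, a minimal sketch relying on PretrainedConfig's behavior of storing unrecognized keys as plain attributes (not code from this PR):

from transformers import GPT2Config

# Even with `n_ctx` removed from the config class, passing it (or loading an old
# config.json that still contains it) keeps it available as an attribute.
config = GPT2Config(n_ctx=1024)
print(getattr(config, "n_ctx", "missing"))  # -> 1024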
What about the function arguments? This is part of this PR:
Changing the insides of the blocks is fine I think, since those are internal, although I'm not sure the slight breaking change is worth it. However, changing the config arguments used here and there (e.g. using n_positions instead of n_ctx) is riskier. All in all, I'm not sure this is worth changing anything compared to what we are potentially breaking.
So that means that we will never be able to clean it up, not even in the proverbial v5. I suppose in v5 we could add an assert that n_ctx == n_positions if both are set. Or alternatively, perhaps asking a different question: is it even possible for a config to have different values for n_ctx and n_positions?
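A hypothetical sketch of the kind of check being floated here (not code from this PR), operating on a raw config dict:

def validate_legacy_n_ctx(config_dict: dict) -> None:
    # If a legacy config still carries `n_ctx`, require it to agree with `n_positions`.
    n_ctx = config_dict.get("n_ctx")
    n_positions = config_dict.get("n_positions")
    if n_ctx is not None and n_positions is not None and n_ctx != n_positions:
        raise ValueError(f"Deprecated `n_ctx` ({n_ctx}) does not match `n_positions` ({n_positions}).")

validate_legacy_n_ctx({"n_ctx": 1024, "n_positions": 1024})  # passes silently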
A major release won't change anything, as breaking changes to existing models on the Hub will never be accepted. I have no idea why the original model design has those two attributes, so I can't answer your second question. I guess the best path forward is to study all configs on the Hub with a script, and check if there is one with two different values for those attributes. If there are none, we can proceed with the PR as it is.
Agreed! Is there an easy way to download all config files? Does the Hub have a sort of index with all files and their locations? (Assuming config.json is not in LFS, as it's a tiny file.)
I think @patrickvonplaten might have a script to help on this :-)
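For reference, a sketch of one way to build such an index with the huggingface_hub client; list_models enumerating all public models and the modelId attribute are assumptions based on the client at the time (newer releases expose id instead), and model_ids.json is an arbitrary filename:

import json
from huggingface_hub import HfApi

# Enumerate public model ids and dump them in the shape the scanning script
# below expects: a json list of objects with a "modelId" key.
models = HfApi().list_models()
with open("model_ids.json", "w") as f:
    json.dump([{"modelId": getattr(m, "modelId", None) or m.id} for m in models], f)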
Okay, as far as I see the only model where anything could potentially break is …
GPT2 can be cleaned up for sure IMO
… exists should still be able to work
… are no configs such that it breaks
Okay, so the only "breaking change" config-wise concerns the GPTJForSequenceClassification and OpenAIGPT* models. I scanned the hub for configs of these architectures where n_ctx != n_positions; see the script below. Subblocks are not considered a breaking change. cc @patrickvonplaten
Script used to scan the hub:

import argparse
import json
from multiprocessing import Pool
from typing import Tuple

from transformers import AutoConfig


def get_args():
    parser = argparse.ArgumentParser()
    # Required parameters
    parser.add_argument(
        "--model-ids-file", default=None, type=str, required=True, help="Path to the json file containing all model ids."
    )
    parser.add_argument(
        "--procs", default=1, type=int, help="Number of processes."
    )
    return parser.parse_args()


def check_config(model_id) -> Tuple[str, bool, str]:
    model_id = model_id.strip()
    try:
        config = AutoConfig.from_pretrained(model_id)
    except Exception:
        return model_id, False, f"{model_id} cannot load config"

    if isinstance(config.architectures, list) and len(config.architectures) > 0:
        # here we need to check for all `OpenAIGPT...` class names and filter
        # (the same pass was also run with `GPTJForSequenceClassification`):
        # for architecture in config.architectures:
        #     if architecture == "GPTJForSequenceClassification":
        #         if config.n_ctx != config.n_positions:
        #             return model_id, True, f"{config.model_type}, n_ctx != n_positions with n_ctx={config.n_ctx} n_positions={config.n_positions}"
        # return model_id, False, "No architecture matched GPTJForSequenceClassification or all have matching n_ctx n_positions"
        for architecture in config.architectures:
            if architecture.startswith("OpenAIGPT"):
                if config.n_ctx != config.n_positions:
                    return model_id, True, f"{config.model_type}, n_ctx != n_positions with n_ctx={config.n_ctx} n_positions={config.n_positions}"
        return model_id, False, "No architecture matched OpenAIGPT* or all have matching n_ctx n_positions"
        # a variant filtering on model_type instead of architectures:
        # model_type_filter = ["gpt", "gpt2", "ctrl"]
        # if config.model_type in model_type_filter:
        #     if config.n_ctx != config.n_positions:
        #         return model_id, True, f"{config.model_type}, n_ctx != n_positions with n_ctx={config.n_ctx} n_positions={config.n_positions}"
        #     return model_id, False, f"{config.model_type}, n_ctx == n_positions with n_ctx={config.n_ctx} n_positions={config.n_positions}"
        # return model_id, False, f"{config.model_type} not in {model_type_filter}"
    else:
        return model_id, False, f"{model_id} is BAD!"


def main():
    args = get_args()
    with open(args.model_ids_file, "r") as f:
        lines = json.load(f)
    model_ids = [line["modelId"] for line in lines]
    print(model_ids)

    if args.procs > 1:
        pool = Pool(args.procs)
        model_ids_and_reasons = pool.imap(check_config, model_ids)
    else:
        model_ids_and_reasons = [check_config(model_id) for model_id in model_ids]

    all_matches = []
    for i, model_ids_and_reason in enumerate(model_ids_and_reasons):
        if i % 1000 == 0:
            print(i)
        if model_ids_and_reason[1] is False:
            continue
        else:
            all_matches.append(model_ids_and_reason)

    for match in all_matches:
        print(f"{match[0]} is not safe {match[2]}")


if __name__ == "__main__":
    main()
Thanks for working on it! I verified that CTRL and GPT2 actually never make use of n_ctx, so it's safe to remove it. The only models where n_ctx has an effect are all OpenAIGPT* and GPTJForSequenceClassification. All those classes are rarely used and there is not a single config on the hub that has n_ctx != n_positions => so IMO this breaking change in the PR is ok!
Thanks a lot for checking the existing configs, @patrickvonplaten!
Could we please save the scanning scripts?
Tests currently fail on master as well. I think this needs to be fixed and then the PR should be rebased :-)
Fine by me, thank you for working on this!
FWIW, the run_tests_hub CI is currently broken; the failure is not related to this PR.
Reran the tests and they seem to pass now. The previous failures seemed unrelated to this PR. Merging then.
What does this PR do?
Remove n_ctx from configs as it's just a duplicate of n_positions.
GPTJ was left unchanged because it has a linear layer that depends on n_ctx. I'm unclear why; is it a bug and the author meant n_embed?

Before submitting
- Did you read the contributor guideline, Pull Request section?
- Was this discussed/approved via a GitHub issue or the forum? Please add a link to it if that's the case.
- Did you make sure to update the documentation with your changes? Here are the documentation guidelines, and here are tips on formatting docstrings.
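A minimal sketch of what the change means for downstream code, assuming only that n_positions remains on the config:

from transformers import GPT2Config

config = GPT2Config()
# Code that used to read `config.n_ctx` should read `config.n_positions` instead;
# the two attributes carried the same value.
max_positions = config.n_positions
print(max_positions)  # 1024 by default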
Who can review?
Anyone in the community is free to review the PR once the tests have passed. Feel free to tag
members/contributors who may be interested in your PR.
@patrickvonplaten @stas00