from_pretrained making internet connection if internet turned on #2867
Comments
I might be mistaken, but it seems that See transformers/src/transformers/file_utils.py Lines 330 to 336 in 0dbddba
|
Is there any way to turn that off? |
Not as far as I can see. What is your use-case? Why do you need this? |
A use case where validating against external servers is not ideal is if the network is behind a firewall and/or is a containerized microservice, and you want to avoid pinging outside the firewall as much as possible. I would appreciate a config flag that disables all external pinging. |
It's not comfortable for development - I'm doing many tests with the pretrained model and it's pretty annoying as it slows down my experiments considerably. I guess I could just save and load the model myself but I was curious why |
I think it should be possible by skipping this block (and setting transformers/src/transformers/file_utils.py Lines 399 to 409 in 0dbddba
which will then fallback to transformers/src/transformers/file_utils.py Lines 418 to 430 in 0dbddba
A flag should be added to the signature, something like: I might be able to work on this in the future, but it's not high on my priority list. Opinions? @minimaxir @Swarzkopf314 |
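A minimal sketch of the idea in the comment above: skip the network step and fall back to whatever is already in the cache. This is a simplified stand-in, not the code in file_utils.py. The names `get_from_cache_sketch`, `fetch_etag`, and `skip_network` are hypothetical, and the `<url hash>.<etag hash>` file naming is an assumption made for illustration.

```python
import hashlib
import os


def get_from_cache_sketch(url, cache_dir, fetch_etag, skip_network=False):
    """Return the cache path for `url`, optionally without any network access.

    `fetch_etag` is a caller-supplied callable standing in for the remote
    lookup; `skip_network` is a hypothetical flag name.
    """
    url_hash = hashlib.sha256(url.encode("utf-8")).hexdigest()

    etag = None
    if not skip_network:
        try:
            etag = fetch_etag(url)  # the only step that touches the network
        except OSError:
            etag = None  # offline: fall through to the cache lookup

    if etag is not None:
        etag_hash = hashlib.sha256(etag.encode("utf-8")).hexdigest()
        # A full implementation would download to this path if it is missing.
        return os.path.join(cache_dir, url_hash + "." + etag_hash)

    # No ETag available: reuse any cached version of this URL.
    matches = sorted(
        name for name in os.listdir(cache_dir)
        if name.startswith(url_hash + ".")
    )
    if not matches:
        raise FileNotFoundError("no cached copy of " + url)
    return os.path.join(cache_dir, matches[-1])
```

With `skip_network=True` the `fetch_etag` callable is never invoked, so a cache miss surfaces as an error instead of a download.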
Yeah that would be great :) |
@Swarzkopf314 Can you tell me how you made the graphs in OP? (some library, I presume) So I can use them for testing. |
I made a wrapper for pyinstrument:
import pyinstrument

# Usage:
# with TreeProfiler(show_all=True):
#     # code to profile...
class TreeProfiler(object):
    def __init__(self, show_all=False):
        self.profiler = pyinstrument.Profiler()
        self.show_all = show_all  # verbose output of pyinstrument profiler

    def __enter__(self):
        print("WITH TREE_PROFILER:")
        self.profiler.start()

    def __exit__(self, *args):
        self.profiler.stop()
        print(self.profiler.output_text(unicode=True, color=True, show_all=self.show_all))
|
You can try out my PR #2930 if you want.
import pyinstrument
from transformers import DistilBertConfig, DistilBertModel, DistilBertTokenizer

class TreeProfiler():
    def __init__(self, show_all=False):
        self.profiler = pyinstrument.Profiler()
        self.show_all = show_all  # verbose output of pyinstrument profiler

    def __enter__(self):
        print("WITH TREE_PROFILER:")
        self.profiler.start()

    def __exit__(self, *args):
        self.profiler.stop()
        print(self.profiler.output_text(unicode=True, color=True, show_all=self.show_all))

def main():
    with TreeProfiler(show_all=True):
        config = DistilBertConfig.from_pretrained('distilbert-base-uncased', disable_outgoing=True)
        model = DistilBertModel.from_pretrained('distilbert-base-uncased', disable_outgoing=True)
        tokenizer = DistilBertTokenizer.from_pretrained('distilbert-base-uncased', disable_outgoing=True)

if __name__ == '__main__':
    main()
The above snippet will throw an error message when the expected files are not present in the cache. When they are, though, everything is loaded fine without the need of any additional lookups. |
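One way to check that a load really makes no outgoing connections is to make socket connections fail while it runs. The sketch below is an assumption-laden helper, not part of transformers or of the PR: `no_network` is a hypothetical name, and it only patches `socket.socket.connect`, so DNS lookups and `connect_ex` calls are not intercepted.

```python
import contextlib
import socket


@contextlib.contextmanager
def no_network():
    """Make any socket.connect() call raise while the block is active."""
    original_connect = socket.socket.connect

    def blocked_connect(self, address):
        raise RuntimeError("outgoing connection attempted: %r" % (address,))

    socket.socket.connect = blocked_connect
    try:
        yield
    finally:
        # Always restore the real method, even if the block raised.
        socket.socket.connect = original_connect
```

Wrapping the `from_pretrained` calls in `with no_network():` should then raise a `RuntimeError` if anything tries to connect, assuming the HTTP client in use goes through `socket.socket.connect`.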
Amazing, thanks a lot! <3 |
No problem. Note that I have not written tests for this functionality yet. I don't think it should break the library, but if you do find some inconsistencies, please let me know. |
Excellent! :D |
Note that the parameter name has been changed to |
Note that in practice, I think a parameter such as "local_files_first" would resolve this issue even further. As the name suggests, it would first check whether the model is cached. If not, it would connect to the internet and download the model. I find this useful for production and testing, so I might write a pull request for this new feature. |
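The "local files first" behaviour described above could look roughly like this. It is only a sketch under stated assumptions: `load_local_first` and the `download` callable are hypothetical names, and the cache layout (one file per name) is simplified for illustration.

```python
import os


def load_local_first(name, cache_dir, download):
    """Return the cached path for `name`, downloading only on a cache miss.

    `download(name, path)` is a caller-supplied callable (hypothetical here)
    that writes the file to `path`.
    """
    path = os.path.join(cache_dir, name)
    if os.path.exists(path):
        return path  # cache hit: no network access at all
    os.makedirs(cache_dir, exist_ok=True)
    download(name, path)  # cache miss: the only place the network is used
    return path
```

The first call for a given name pays the download cost; later calls return immediately from the cache without contacting any server.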
I'd like to ask why model.from_pretrained makes an SSL connection even though I provide cache_dir. If I turn off the internet, everything works just fine.
and here's the output with internet turned off