
Accessing model after pre-training #401

Closed

uconnectbrown opened this issue Jun 28, 2023 · 1 comment

Comments

@uconnectbrown

Following the instructions laid out in the README, I have run pre-training with the Mosaic BERT model architecture, pointing save_folder at a directory in an S3 bucket. Checkpointing has stored a .pt file as desired, but I was wondering whether there is a recommended method for loading the model back into SageMaker to perform inference and confirm that it is performing as expected. Any recommendations or suggestions would be immensely appreciated!

@dakinggg
Collaborator

Hi, I'm not able to help with SageMaker, but https://github.com/mosaicml/composer/blob/457717427e4d84f645e04fd801e79ab45fd26877/composer/models/huggingface.py#L541 can be used to extract a config.json and pytorch_model.bin from the .pt file. Those files should load back in via the BertForMaskedLM and BertConfig classes, and should generally be compatible with the code in the examples repo. If you'd like to package the model code in the checkpoint as Hugging Face generally does, we unfortunately don't have a script for that for BERT, but you can roughly just copy the modeling files into the same folder. We do have such a script in our llm-foundry repo that should be a good starting point if you are trying to automate this (https://github.com/mosaicml/llm-foundry/blob/main/scripts/inference/convert_composer_to_hf.py).
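
For reference, here is a minimal sketch of that extract-and-reload flow. It assumes the helper at the linked line is Composer's write_huggingface_pretrained_from_composer_checkpoint (the exact name and signature may vary across Composer versions), and the checkpoint and output paths shown are placeholders:

```python
# Sketch: extract HF-format files from a Composer .pt checkpoint,
# then reload them with transformers. Paths below are placeholders.
from composer.models import write_huggingface_pretrained_from_composer_checkpoint
from transformers import BertConfig, BertForMaskedLM

# Writes config.json and pytorch_model.bin into the output folder.
write_huggingface_pretrained_from_composer_checkpoint(
    checkpoint_path='checkpoints/ep0-ba10000-rank0.pt',  # placeholder path
    output_folder='hf_model/',
)

# Reload with stock transformers classes. Note: Mosaic BERT uses custom
# modeling code, so for exact behavior you may need to load with the
# modeling files/classes from this examples repo instead of stock
# BertForMaskedLM.
config = BertConfig.from_pretrained('hf_model/')
model = BertForMaskedLM.from_pretrained('hf_model/', config=config)
model.eval()
```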
