Support for ktrain v0.26.x with tf_model and tf_preproc #369
As mentioned in this post, the 404 error is simply saying that there is no TensorFlow version of the model available on the Hugging Face hub. Also, the 404 error is completely unrelated to your problem. Are you able to run the following?

import ktrain
from ktrain import text
TDATA = 'data/conll2003/train.txt'
VDATA = 'data/conll2003/valid.txt'
(trn, val, preproc) = text.entities_from_conll2003(TDATA, val_filepath=VDATA)
model = text.sequence_tagger('bilstm-bert', preproc, bert_model='emilyalsentzer/Bio_ClinicalBERT')
learner = ktrain.get_learner(model, train_data=trn, val_data=val, batch_size=128)
learner.fit(0.01, 1, cycle_len=1)
predictor = ktrain.get_predictor(learner.model, preproc)
predictor.predict('As of 2019, Donald Trump was still the President of the United States.')
predictor.save('/tmp/mypred')
reloaded_predictor = ktrain.load_predictor('/tmp/mypred')
reloaded_predictor.predict('Paul Newman is my favorite actor.')

Although you will see a 404 client error (from transformers) because the PyTorch version of Bio_ClinicalBERT is downloaded/loaded (no TF version is available), all of the above should work using the latest version of ktrain.
Thanks for the quick response - much appreciated!! The code you shared works. It does throw a 404 error at the model selection and prediction steps. So the issue with my app (the one that worked in February and is now not loading) is something else. In addition to the 404, I'm also seeing "ValueError('model must be of instance Model')" on the line 'predictor = ktrain.get_predictor(loaded_model, features).'
For clarity, the above code uses the original [February 2021] model.json, preproc.sav, and model.h5 objects, per this article.
There are probably several things happening in your code.

Unable to Load Old Predictor in New Version of ktrain
The newest version of ktrain (v0.26.x) uses a newer version of the transformers library, which reorganized its internal module paths (e.g., transformers.models.xlm.configuration_xlm), so predictor files saved with much older versions may not load directly.

How to Deploy
The blog article you're following is quite old and was written before TF2 was released. Also, it is not necessary to manually save and load the model and preprocessor yourself: use predictor.save and ktrain.load_predictor instead.
Following the approach of the FAQ entry, I'm still getting an error.
throws: Exception: Failed to load .preproc file in either the post v0.16.x loction (./model/tf_model.preproc) or pre v0.16.x location (./model.preproc)

The tf_model.preproc and tf_model.h5 files are in the model folder, which is in the root directory of the Streamlit app where the call to load_predictor is made. I've tried supplying the direct file path as well, and that similarly didn't work. The .preproc file was generated last night after upgrading ktrain. Just in case, I also tried the suggestion here to check the tf_model.preproc file to make sure it contains the line 'transformers.models.xlm.configuration_xlm', but when I open the .preproc file in a text editor it looks like gibberish. Sorry if I'm asking extremely obvious questions, but is there another way I should be doing this?
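The .preproc file is a binary pickle, so it will look like gibberish in a text editor. Pickles store the dotted module paths they reference as plain bytes, so one way to check for the expected transformers path is a substring search over the raw bytes. A small sketch (the helper name and file path are illustrative, not part of ktrain):

```python
def preproc_mentions(path, dotted_name):
    """Return True if the pickled file references the given module path.

    Pickled objects embed the dotted path of the module that defines
    their class as plain bytes, so a substring search works even
    though the file itself is binary.
    """
    with open(path, 'rb') as f:
        return dotted_name.encode() in f.read()

# e.g. preproc_mentions('tf_model.preproc',
#                       'transformers.models.xlm.configuration_xlm')
```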
I just tested it with Streamlit and everything worked for me.

Step 1: I trained an NER predictor using the code I posted above, which saves the predictor to /tmp/mypred.

Step 2: Create the Streamlit app with:

# file: main.py
import streamlit as st
import ktrain
st.title('Using ktrain with streamlit')
st.subheader('Make a prediction from the supplied sentence:')
@st.cache(allow_output_mutation=True)
def load_model():
    model = ktrain.load_predictor('/tmp/mypred')
    return model

with st.spinner('Loading Model Into Memory...'):
    model = load_model()

input = st.text_area('Enter your text', 'Type here')
if input != 'Type here':
    with st.spinner('Doing AI things...'):
        output = model.predict(input)
        st.success("Raw output of predictor: %s" % (output))

Step 3: Run the app with:

streamlit run main.py
Use the app at the URL printed by the above command. If you follow the same steps and experience problems, then it may be some sort of issue on your end.

You can also check that the .preproc file loads on its own:

import pickle
with open('/tmp/mypred/tf_model.preproc', 'rb') as f:
    preproc = pickle.load(f)
I'm mystified as to why this isn't working. It's not an issue with Streamlit: the load_predictor() function throws the "Failed to load .preproc file" error from the terminal as well. From the terminal, trying the last line you suggested throws "_pickle.UnpicklingError: invalid load key, '\xef'." Here's the exact code in the app.py file (with the tf_model.preproc and tf_model.h5 files in the same directory as the app):

which results in "Failed to load .preproc file in either the post v0.16.x loction (tf_model.preproc) or pre v0.16.x location (.preproc)"

I also tried unpickling my February 2021 preproc and model files. It seems to work for preproc, but for model.h5 I'm getting "_pickle.UnpicklingError: invalid load key, 'H'."
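One quick diagnostic for the "invalid load key, '\xef'" error: a pickle written with protocol 2 or higher starts with the byte 0x80, while the bytes 0xEF 0xBB 0xBF are a UTF-8 byte-order mark, which would mean the file on disk is text (for example, an HTML page saved during the download) rather than the original binary. A sketch of that check (the helper name is made up for illustration):

```python
def looks_like_pickle(path):
    """Heuristic: does this file start like a binary pickle?

    Pickle protocol 2+ files begin with b'\\x80'; a leading UTF-8 BOM
    (b'\\xef\\xbb\\xbf') is a sign the file was saved or transferred
    as text rather than binary.
    """
    with open(path, 'rb') as f:
        head = f.read(3)
    return head.startswith(b'\x80')
```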
I feel like the last thing left to do is just go retrain the model to generate new (post-Sunday) tf_model.preproc and tf_model.h5 files. On Sunday, I did the training in Colab, so I'm not sure if the predictor files got corrupted somehow as I was downloading them from that environment.
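One way to rule out corruption during the download from Colab is to compute a checksum of each file in Colab and again on the local machine; mismatched digests mean the bytes changed in transit. A minimal sketch:

```python
import hashlib

def md5sum(path, chunk_size=1 << 20):
    """Compute the MD5 digest of a file, reading in chunks so large
    model files don't need to fit in memory at once."""
    h = hashlib.md5()
    with open(path, 'rb') as f:
        for block in iter(lambda: f.read(chunk_size), b''):
            h.update(block)
    return h.hexdigest()

# Run md5sum('tf_model.preproc') in Colab and locally; the digests
# should match if the download was clean.
```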
You said earlier that, using the code I provided above to train an NER model, everything worked. If that's the case, the problem sounds like something on your end related to 1) corrupted files downloaded from Google Colab or 2) something weird with your environment. I would do the following:
The version of TensorFlow shouldn't matter, but, as an extra measure, you might want to use the same version of TensorFlow both on Google Colab and on your local machine.
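To compare the two environments, you can print the installed versions of the relevant packages in both Colab and the local environment (the package names below are the usual PyPI names):

```python
from importlib.metadata import version, PackageNotFoundError

# Print installed versions so the Colab and local environments can be
# compared side by side.
for pkg in ('tensorflow', 'ktrain', 'transformers'):
    try:
        print(pkg, version(pkg))
    except PackageNotFoundError:
        print(pkg, 'not installed')
```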
Or is there some way I can make the code here backward-compatible? This throws "model must be of instance Model" on the line predictor = ktrain.get_predictor(loaded_model, features); loaded_model is of class 'keras.engine.functional.Functional'.
And thank you for your suggestions for further troubleshooting with Colab. Sorry I didn't see your post before my last comment.
I looked at your GitHub repository. I think your
After that, things seem to work:

import ktrain
p = ktrain.load_predictor('./fixed_predictor')
p.predict('Paul Newman is a great actor.')
# output
[('paul', 'B-person'),
('newman', 'I-person'),
('is', 'O'),
('a', 'O'),
('great', 'O'),
('actor', 'O'),
('.', 'O')]

Also, you said earlier that you saw a 404 client error. The reason for this is that there is no TensorFlow version of the model on the hub, so the PyTorch weights are downloaded instead. I will close this issue, but feel free to reply if you have further issues.
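The predictor's raw output is a list of (token, IOB-tag) pairs like the one shown above. A small illustrative helper (not part of ktrain) to group those pairs into labeled entities:

```python
def group_entities(pairs):
    """Group (token, IOB-tag) pairs into (entity_text, label) tuples.

    A 'B-' tag starts a new entity, an 'I-' tag continues the current
    one, and 'O' closes any open entity.
    """
    entities, current, label = [], [], None
    for token, tag in pairs:
        if tag.startswith('B-'):
            if current:
                entities.append((' '.join(current), label))
            current, label = [token], tag[2:]
        elif tag.startswith('I-') and current:
            current.append(token)
        else:
            if current:
                entities.append((' '.join(current), label))
            current, label = [], None
    if current:
        entities.append((' '.join(current), label))
    return entities

pairs = [('paul', 'B-person'), ('newman', 'I-person'), ('is', 'O'),
         ('a', 'O'), ('great', 'O'), ('actor', 'O'), ('.', 'O')]
print(group_entities(pairs))  # [('paul newman', 'person')]
```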
Hey @amaiya, hoping you can help me refactor my NER app so it works with ktrain v0.26.x.
Background: in February 2021, I had a Streamlit app that was using a model.h5, model.json, and preproc.sav file from training an NER model based on Bio_ClinicalBERT. Unfortunately, when I run the model this month, I get the following error: 404 Client Error: Not Found for url: https://huggingface.co/emilyalsentzer/Bio_ClinicalBERT/resolve/main/tf_model.h5.
Approach: in response, I thought retraining with the upgraded ktrain v0.26.x might be required to get the app to work. I created tf_model.h5 and tf_model.preproc files and updated the app.
Current issue: now when I run the app, I get the following error: invalid load key, '\xef'. It comes from the line: features = pickle.load(open('tf_model.preproc', 'rb'))
Thoughts for how to proceed?