get_result_from_job_id AssertionError while initializing redeployed LS/ML NER backend #117
Hi @wojnarabc
and the error should disappear. If you are using TransformersBasedTagger, please check that the fit method ends with an appropriate result.
Hello @KonstantinKorotaev, there is a fit method in the example.
I run into the exact same problem with my custom backend. I am in the process of upgrading my system to the latest LS and backend. Everything was working fine with LS 1.1.1 and the backend from a year ago. After training, another job is sent for some reason, and then I already set …
So, to be more clear, I based my custom backend on pytorch_transfer_learning.py. I had to set … Now, when I monitor …, I made an ugly workaround for this issue using an OS environment variable that saves the location of the last locally trained model. Note that this causes issues if users switch between projects, and it won't redeploy an existing model.

# ...
from label_studio_ml import model

model.LABEL_STUDIO_ML_BACKEND_V2_DEFAULT = True
# ...

class ImageClassifierAPI(model.LabelStudioMLBase):
    def __init__(self, label_config=None, train_output=None, **kwargs):
        super(ImageClassifierAPI, self).__init__(**kwargs)
        # ...
        if self.train_output:
            print(f"trying to use {self.train_output['model_path']} as model path")
            self.model = ImageClassifier(self.classes, self.boxType)
            self.model.load(self.train_output['model_path'])
        elif os.environ.get("LAST_TRAINED_MODEL") is not None:
            model_path = os.environ.get("LAST_TRAINED_MODEL")
            print(f"trying to use {model_path} as model path")
            self.model = ImageClassifier(self.classes, self.boxType)
            self.model.load(model_path)
        else:
            self.model = ImageClassifier(self.classes, self.boxType)
    # ...

    def fit(self, annotations, workdir=None, batch_size=12, num_epochs=100, **kwargs):
        self.model.train(annotations, workdir, self.boxType, batch_size=batch_size, num_epochs=num_epochs)
        # Save the workdir in the environment so a restarted backend can find the last trained model
        os.environ["LAST_TRAINED_MODEL"] = workdir
        return {'model_path': workdir}

Having to manually change … is not ideal.
@jrdalenberg
Your fit method seems to expect annotations, but from version 1.4.1 there will be an event instead of annotations. Maybe this is what leads to the empty train_output. If you want to switch this logic off, disable the webhook in LS.
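A minimal sketch of what that version difference can look like in a custom backend (the payload shapes below are assumptions drawn from this discussion, not the library's actual schema): normalizing both the old annotations list and the new event-style payload before training avoids returning an empty train_output by accident.

```python
# Hypothetical payload shapes, for illustration only.
def extract_annotations(payload):
    """Accept either an old-style list of annotations or a new event-style dict."""
    if isinstance(payload, dict) and 'annotation' in payload:
        return [payload['annotation']]   # event payload wraps a single annotation
    if isinstance(payload, list):
        return payload                   # old-style: already a list
    return []

def fit(payload, workdir='/tmp/model'):
    annotations = extract_annotations(payload)
    if not annotations:
        return {}                        # nothing to train on -> empty train_output
    # ... real training would happen here ...
    return {'model_path': workdir}
```

With this shape, both calling conventions produce the same dict-valued train_output.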
I see. It’s still odd that train_output is returned once and then cleared right after, though.
Thanks! I did not see this in the object recognition examples. I'll test it out after my vacation 😄
same bug
[2022-08-19 03:17:21,961] [ERROR] [label_studio_ml.model::get_result_from_last_job::131] 1660820149 job returns exception:
Load new model from: /opt/mmdetection/configs/faster_rcnn/faster_rcnn_r50_fpn_1x_coco.py /opt/mmdetection/checkpoints/faster_rcnn/faster_rcnn_r50_fpn_1x_coco_20200130-047c8118.pth

It seems the return values do not work.
Hi @idreamerhx
I don't see any error in your log; what do you mean by "not work"?
This means that your training procedure didn't return results. Could you please check what your fit method returns?
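To see what "didn't return results" turns into downstream, here is a rough reconstruction of the check behind the AssertionError quoted elsewhere in this thread (simplified from the traceback, not the exact library source): the last job's result is re-loaded from a job_result.json file and must be a dict.

```python
import json
import os

def get_result_from_job_id(job_dir):
    """Simplified reconstruction: reload a finished job's training result."""
    result_file = os.path.join(job_dir, 'job_result.json')
    with open(result_file) as f:
        result = json.load(f)
    # This is the assertion from the traceback: anything other than a dict
    # (a list, a string, null from a fit that returned nothing) raises.
    assert isinstance(result, dict)
    return result
```

So a fit that returns None, or that never writes job_result.json at all, surfaces later as exactly this AssertionError.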
This looks like a bug. The result should be allowed to be a list of dicts, but the assertion tells me it is wrong.
@sjtuytc could you please clarify what you mean?
I mean this assertion is too restrictive. A dummy result would trigger the assertion error; that is not desirable.
I'm also getting the same assertion error problem. Does the model always have to return a dictionary? I'm following the examples provided in Label Studio and it seems like lists of dicts are also allowed: https://github.com/heartexlabs/label-studio-ml-backend/blob/db6d1d6d3efde1db503532f1f77a6977a7f100d2/label_studio_ml/examples/simple_text_classifier/simple_text_classifier.py#L93
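For what it's worth, my reading of the thread (an assumption, not an authoritative statement of the API) is that the two return shapes belong to different methods: predict() produces a list of dicts, one per task, while fit() must produce a single dict, because that dict is what later gets written to job_result.json and re-loaded through the isinstance(result, dict) assertion.

```python
def predict(tasks):
    # One prediction dict per task: a list of dicts is the expected shape here.
    return [{'result': [], 'score': 0.0} for _ in tasks]

def fit(annotations, workdir='/tmp/model'):
    # The training result must be a single dict, not a list.
    return {'model_path': workdir}
```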
Hi @seanswyi |
Errors after setting …
How should we work around this?
Do you have folders with job results 1678761463 and 1678761465? |
Hello, I have the same problem. I've created a custom backend based on the mmdetection.py example. I don't use active learning (I think; I did not set that up). Every time I switch to the next image for annotation, I get this output in the console:
The exception is caused by _get_result_from_job_id returning None, because os.path.exists(result_file) returns False.
What is strange to me is that I do have a job_result.json file in the required directory; perhaps it is not there yet when the check occurs and is only created later? The contents of the file is an empty JSON object. And here is my predict() method:
But I don't think this is due to predict(); as I said earlier, it is the result_file existence check that fails.
Hi @TrueWodzu
This file is created during a training session.
Hi @KonstantinKorotaev, thank you for your answer. So is this a bug? It seems the file is required, but at the same time some of us don't want to train. How can I prevent this from happening?
@KonstantinKorotaev I don't have training turned on, so … A simple change in the code solves the exception problem:
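The actual change is collapsed out of the quoted comment above, so what follows is only a guess at the kind of guard it may have introduced (assumed names, not the real patch): treat a missing or empty job_result.json as "no trained model yet" instead of asserting.

```python
import json
import os

def load_job_result(result_file):
    """Return the training result dict, or None when no usable result exists."""
    if not os.path.exists(result_file) or os.path.getsize(result_file) == 0:
        return None                      # training never ran or hasn't finished
    with open(result_file) as f:
        result = json.load(f)
    # An empty JSON object ({}) also means there is nothing to load.
    return result if isinstance(result, dict) and result else None
```

A backend that never trains then simply starts with no train_output, instead of crashing on startup.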
Hi @TrueWodzu |
Hi @KonstantinKorotaev, many thanks for the change! While it definitely works for me, I am wondering whether this is the correct approach. What I mean is: is it really required right now to have …? If I have training turned off, then there should be no exception about result_file, right?
Hi @TrueWodzu
Yes, but the intention of this error is to get a message in case anybody tries to load a model that wasn't trained successfully.
I am trying to deploy the NER example model trained on my local machine, along with the Label Studio project, to another machine. I've gone through the following steps:
...export?exportType=JSON&download_all_tasks=true
command). When trying to initialize and pair LS and the ML backend on the new machine, I am getting:
[2022-05-30 10:18:56,133] [ERROR] [label_studio_ml.model::get_result_from_last_job::128] 1647350146 job returns exception:
Traceback (most recent call last):
  File "/Users/user/Projects/label-studio-ml-backend/label_studio_ml/model.py", line 126, in get_result_from_last_job
    result = self.get_result_from_job_id(job_id)
  File "/Users/user/Projects/label-studio-ml-backend/label_studio_ml/model.py", line 108, in get_result_from_job_id
    assert isinstance(result, dict)
AssertionError
and it keeps repeating for each job.
Should any additional steps be performed during deployment of the project/model to other environments?
I've tried the following LS versions (1.1.1, my initial one, and 1.4.1post1, the most recent one) and the most current ML backend code base, using Python 3.8 and macOS for both source and target environments.