transformers NER model service start failure #39

Closed
jamnicki opened this issue Dec 29, 2022 · 3 comments
Labels
bug Something isn't working

jamnicki commented Dec 29, 2022

Starting the service with the following config fails with the error below. Services configured with other Hugging Face NER models fail the same way.

name: "default_app"
version: 0.1
active_learning:
  strategy:
    type: "RandomSampling"
    model:
      name: "dslim/bert-base-NER"
      hub: "huggingface"
      model: "bert-base-NER"
      tokenizer: "dslim/bert-base-NER"
      transformers_task: "ner"
      batch_size: 1
      device: "cpu"
  al_worker:
    protocol: "http"
    host: "0.0.0.0"
    port: 8081
    replicas: 1
  Waiting default_app... ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 2/1 0:00:02
CRITI… default_app/rep-0@2592 can not load the executor from TorchALWorker                                                                                                 [12/29/22 19:05:32]
ERROR  default_app/rep-0@2592 TypeError("_sanitize_parameters() got an unexpected keyword argument 'return_all_scores'") during <class                                     [12/29/22 19:05:32]
       'jina.serve.runtimes.worker.WorkerRuntime'> initialization
        add "--quiet-error" to suppress the exception details
       Traceback (most recent call last):
         File "C:\Users\jedrz\anaconda3\envs\active_learner\lib\site-packages\jina\orchestrate\pods\__init__.py", line 79, in run
           runtime = runtime_cls(
         File "C:\Users\jedrz\anaconda3\envs\active_learner\lib\site-packages\jina\serve\runtimes\worker\__init__.py", line 39, in __init__
           super().__init__(args, **kwargs)
         File "C:\Users\jedrz\anaconda3\envs\active_learner\lib\site-packages\jina\serve\runtimes\asyncio.py", line 77, in __init__
           self._loop.run_until_complete(self.async_setup())
         File "C:\Users\jedrz\anaconda3\envs\active_learner\lib\asyncio\base_events.py", line 616, in run_until_complete
           return future.result()
         File "C:\Users\jedrz\anaconda3\envs\active_learner\lib\site-packages\jina\serve\runtimes\worker\__init__.py", line 104, in async_setup
           self._request_handler = WorkerRequestHandler(
         File "C:\Users\jedrz\anaconda3\envs\active_learner\lib\site-packages\jina\serve\runtimes\worker\request_handling.py", line 54, in __init__
           self._load_executor(
         File "C:\Users\jedrz\anaconda3\envs\active_learner\lib\site-packages\jina\serve\runtimes\worker\request_handling.py", line 204, in _load_executor
           self._executor: BaseExecutor = BaseExecutor.load_config(
         File "C:\Users\jedrz\anaconda3\envs\active_learner\lib\site-packages\jina\jaml\__init__.py", line 766, in load_config
           obj = JAML.load(tag_yml, substitute=False, runtime_args=runtime_args)
         File "C:\Users\jedrz\anaconda3\envs\active_learner\lib\site-packages\jina\jaml\__init__.py", line 174, in load
           r = yaml.load(stream, Loader=get_jina_loader_with_runtime(runtime_args))
         File "C:\Users\jedrz\anaconda3\envs\active_learner\lib\site-packages\yaml\__init__.py", line 81, in load
           return loader.get_single_data()
         File "C:\Users\jedrz\anaconda3\envs\active_learner\lib\site-packages\yaml\constructor.py", line 51, in get_single_data
           return self.construct_document(node)
         File "C:\Users\jedrz\anaconda3\envs\active_learner\lib\site-packages\yaml\constructor.py", line 55, in construct_document
           data = self.construct_object(node)
         File "C:\Users\jedrz\anaconda3\envs\active_learner\lib\site-packages\yaml\constructor.py", line 100, in construct_object
           data = constructor(self, node)
         File "C:\Users\jedrz\anaconda3\envs\active_learner\lib\site-packages\jina\jaml\__init__.py", line 582, in _from_yaml
           return get_parser(cls, version=data.get('version', None)).parse(
         File "C:\Users\jedrz\anaconda3\envs\active_learner\lib\site-packages\jina\jaml\parsers\executor\legacy.py", line 46, in parse
           obj = cls(
         File "C:\Users\jedrz\anaconda3\envs\active_learner\lib\site-packages\jina\serve\executors\decorators.py", line 60, in arg_wrapper
           f = func(self, *args, **kwargs)
         File "C:\Users\jedrz\anaconda3\envs\active_learner\lib\site-packages\jina\serve\helper.py", line 73, in arg_wrapper
           f = func(self, *args, **kwargs)
         File "C:\Users\jedrz\anaconda3\envs\active_learner\lib\site-packages\alaas\server\executors\al_torch.py", line 104, in __init__
           self._model = pipeline(self._task,
         File "C:\Users\jedrz\anaconda3\envs\active_learner\lib\site-packages\transformers\pipelines\__init__.py", line 870, in pipeline
           return pipeline_class(model=model, framework=framework, task=task, **kwargs)
         File "C:\Users\jedrz\anaconda3\envs\active_learner\lib\site-packages\transformers\pipelines\token_classification.py", line 126, in __init__
           super().__init__(*args, **kwargs)
         File "C:\Users\jedrz\anaconda3\envs\active_learner\lib\site-packages\transformers\pipelines\base.py", line 788, in __init__
           self._preprocess_params, self._forward_params, self._postprocess_params = self._sanitize_parameters(**kwargs)
       TypeError: _sanitize_parameters() got an unexpected keyword argument 'return_all_scores'
ERROR  Flow@16320 Flow is aborted due to ['default_app'] can not be started.                                                                                               [12/29/22 19:05:32]
WARNI… gateway/rep-0@16320 Pod was forced to close after 1 second. Graceful closing is not available on Windows.                                                           [12/29/22 19:05:33]
Traceback (most recent call last):                                                                                                                                                            
  File ".\server.py", line 9, in <module>
    main()
  File ".\server.py", line 5, in main
    Server.start_by_config('al-server.yml')
  File "C:\Users\jedrz\anaconda3\envs\active_learner\lib\site-packages\alaas\server\server.py", line 67, in start_by_config
    Flow(protocol=_proto, port=_port, host=_host) \
  File "C:\Users\jedrz\anaconda3\envs\active_learner\lib\site-packages\jina\orchestrate\flow\builder.py", line 33, in arg_wrapper
    return func(self, *args, **kwargs)
  File "C:\Users\jedrz\anaconda3\envs\active_learner\lib\site-packages\jina\orchestrate\flow\base.py", line 1782, in start
    self._wait_until_all_ready()
  File "C:\Users\jedrz\anaconda3\envs\active_learner\lib\site-packages\jina\orchestrate\flow\base.py", line 1913, in _wait_until_all_ready
    raise RuntimeFailToStart
jina.excepts.RuntimeFailToStart

AlaaS version: 0.2.0
System OS version: Win 11 22H2

Btw, does the model really need to be initialized in the service when the random sampling strategy is used?
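
The traceback above ends in the `transformers` token-classification pipeline rejecting `return_all_scores`, a keyword that the text-classification pipeline accepts. A minimal sketch of one way to avoid this is to build the `pipeline(...)` keyword arguments per task. The helper name `build_pipeline_kwargs` and the task set are hypothetical and are not taken from the AlaaS code base; this is not necessarily how the project fixed it.

```python
# Hypothetical helper: only forward `return_all_scores` to tasks whose
# pipeline accepts it. "sentiment-analysis" is an alias of
# "text-classification" in transformers.
TASKS_WITH_ALL_SCORES = {"text-classification", "sentiment-analysis"}


def build_pipeline_kwargs(task: str, model: str, tokenizer: str) -> dict:
    """Return keyword arguments for transformers.pipeline(task, **kwargs)."""
    kwargs = {"model": model, "tokenizer": tokenizer}
    if task in TASKS_WITH_ALL_SCORES:
        # Assumption: the caller wants per-class scores for classification.
        kwargs["return_all_scores"] = True
    return kwargs


ner_kwargs = build_pipeline_kwargs(
    "ner", "dslim/bert-base-NER", "dslim/bert-base-NER"
)
cls_kwargs = build_pipeline_kwargs("text-classification", "some-model", "some-model")

# Usage (requires transformers and a model download, so left commented out):
# from transformers import pipeline
# ner = pipeline("ner", **ner_kwargs)
```

With the config from this issue (`transformers_task: "ner"`), the resulting kwargs contain only `model` and `tokenizer`, so the pipeline constructor never sees `return_all_scores`.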

huangyz0918 added the bug label Dec 30, 2022

huangyz0918 (Contributor) commented:

Thank you for your feedback! @jamnicki

For random sampling, model loading is skipped as of the latest commit.

For NER tasks, the server can be started after the latest update. However, because of the nature of the task, the transformer model sometimes returns an empty list when no entity is detected. In such cases, the AL strategy has no values to rank. This is an interesting issue, since our system currently relies only on the scores returned by the Hugging Face models.

If you have any ideas about that, feel free to discuss them here.

huangyz0918 (Contributor) commented:

Here is an example of this model's output:

Input texts:

The movie itself was to me a huge disappointment.
My name is Wolfgang and I live in Berlin

Model outputs:

[[]]
[[{'entity': 'B-PER', 'score': 0.9990139, 'index': 4, 'word': 'Wolfgang', 'start': 11, 'end': 19}, {'entity': 'B-LOC', 'score': 0.999645, 'index': 9, 'word': 'Berlin', 'start': 34, 'end': 40}]]

For the first text, the model detects no entities and returns an empty list, so there is no score to rank and data selection fails.
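
One possible way to handle this, sketched below, is to collapse each text's entity list into a single confidence value and fall back to a fixed score when the list is empty. The function name `text_confidence`, the `empty_score` default, and the choice of the minimum entity score are all assumptions made for illustration; they are not part of AlaaS. The sample data reuses the outputs shown above, with the outer batch list removed.

```python
from typing import Dict, List


def text_confidence(entities: List[Dict], empty_score: float = 0.0) -> float:
    """Reduce one text's NER predictions to a single confidence score.

    Uses the least confident entity. When no entity was detected, returns
    `empty_score` (an assumed fallback) so every text still gets a value.
    """
    if not entities:
        return empty_score
    return min(float(e["score"]) for e in entities)


# Per-text outputs from the example in this thread.
outputs = [
    [],
    [
        {"entity": "B-PER", "score": 0.9990139, "index": 4,
         "word": "Wolfgang", "start": 11, "end": 19},
        {"entity": "B-LOC", "score": 0.999645, "index": 9,
         "word": "Berlin", "start": 34, "end": 40},
    ],
]

scores = [text_confidence(o) for o in outputs]

# Rank texts from least to most confident, as an uncertainty-based
# strategy might; every text now has exactly one score.
ranked = sorted(range(len(scores)), key=scores.__getitem__)
```

Setting `empty_score` to 0.0 treats a text with no detected entities as maximally uncertain, which pushes it to the front of the ranking. Whether that is desirable depends on the dataset; a caller could instead pass a neutral value or filter such texts out before ranking.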

jamnicki commented Jan 3, 2023

After the 0.2.1 release, the service starts properly with HF NER models, both with the random sampling strategy and with the others. I'll open a new issue about the missing scores and unequal array sizes.
