Incompatibility of spacy models as ray reference #13506

Open
suneeta-mall opened this issue May 23, 2024 · 0 comments

How to reproduce the behaviour

When running spaCy with Ray for inference, I get ValueError: buffer source array is read-only. The full stack trace is shown below:

  File "~/.venv/lib/python3.8/site-packages/ray/_private/worker.py", line 2380, in get
    raise value.as_instanceof_cause()
ray.exceptions.RayTaskError(ValueError): ray::generate() (pid=596931, ip=172.16.147.156)
  File "test.py", line 9, in generate
    doc = list(nlp.pipe([corpus]))[0]
  File "~/.venv/lib/python3.8/site-packages/spacy/language.py", line 1618, in pipe
    for doc in docs:
  File "~/.venv/lib/python3.8/site-packages/spacy/util.py", line 1703, in _pipe
    yield from proc.pipe(docs, **kwargs)
  File "spacy/pipeline/transition_parser.pyx", line 245, in pipe
  File "~/.venv/lib/python3.8/site-packages/spacy/util.py", line 1650, in minibatch
    batch = list(itertools.islice(items, int(batch_size)))
  File "~/.venv/lib/python3.8/site-packages/spacy/util.py", line 1703, in _pipe
    yield from proc.pipe(docs, **kwargs)
  File "spacy/pipeline/transition_parser.pyx", line 245, in pipe
  File "~/.venv/lib/python3.8/site-packages/spacy/util.py", line 1650, in minibatch
    batch = list(itertools.islice(items, int(batch_size)))
  File "~/.venv/lib/python3.8/site-packages/spacy/util.py", line 1703, in _pipe
    yield from proc.pipe(docs, **kwargs)
  File "spacy/pipeline/pipe.pyx", line 55, in pipe
  File "~/.venv/lib/python3.8/site-packages/spacy/util.py", line 1703, in _pipe
    yield from proc.pipe(docs, **kwargs)
  File "spacy/pipeline/pipe.pyx", line 55, in pipe
  File "~/.venv/lib/python3.8/site-packages/spacy/util.py", line 1703, in _pipe
    yield from proc.pipe(docs, **kwargs)
  File "spacy/pipeline/trainable_pipe.pyx", line 73, in pipe
  File "~/.venv/lib/python3.8/site-packages/spacy/util.py", line 1650, in minibatch
    batch = list(itertools.islice(items, int(batch_size)))
  File "~/.venv/lib/python3.8/site-packages/spacy/util.py", line 1703, in _pipe
    yield from proc.pipe(docs, **kwargs)
  File "spacy/pipeline/trainable_pipe.pyx", line 79, in pipe
  File "~/.venv/lib/python3.8/site-packages/spacy/util.py", line 1722, in raise_error
    raise e
  File "spacy/pipeline/trainable_pipe.pyx", line 75, in spacy.pipeline.trainable_pipe.TrainablePipe.pipe
  File "~/.venv/lib/python3.8/site-packages/spacy/pipeline/tok2vec.py", line 126, in predict
    tokvecs = self.model.predict(docs)
  File "~/.venv/lib/python3.8/site-packages/thinc/model.py", line 334, in predict
    return self._func(self, X, is_train=False)[0]
  File "~/.venv/lib/python3.8/site-packages/thinc/layers/chain.py", line 54, in forward
    Y, inc_layer_grad = layer(X, is_train=is_train)
  File "~/.venv/lib/python3.8/site-packages/thinc/model.py", line 310, in __call__
    return self._func(self, X, is_train=is_train)
  File "~/.venv/lib/python3.8/site-packages/thinc/layers/chain.py", line 54, in forward
    Y, inc_layer_grad = layer(X, is_train=is_train)
  File "~/.venv/lib/python3.8/site-packages/thinc/model.py", line 310, in __call__
    return self._func(self, X, is_train=is_train)
  File "~/.venv/lib/python3.8/site-packages/thinc/layers/with_array.py", line 36, in forward
    return cast(Tuple[SeqT, Callable], _ragged_forward(model, Xseq, is_train))
  File "~/.venv/lib/python3.8/site-packages/thinc/layers/with_array.py", line 91, in _ragged_forward
    Y, get_dX = layer(Xr.dataXd, is_train)
  File "~/.venv/lib/python3.8/site-packages/thinc/model.py", line 310, in __call__
    return self._func(self, X, is_train=is_train)
  File "~/.venv/lib/python3.8/site-packages/thinc/layers/concatenate.py", line 57, in forward
    Ys, callbacks = zip(*[layer(X, is_train=is_train) for layer in model.layers])
  File "~/.venv/lib/python3.8/site-packages/thinc/layers/concatenate.py", line 57, in <listcomp>
    Ys, callbacks = zip(*[layer(X, is_train=is_train) for layer in model.layers])
  File "~/.venv/lib/python3.8/site-packages/thinc/model.py", line 310, in __call__
    return self._func(self, X, is_train=is_train)
  File "~/.venv/lib/python3.8/site-packages/thinc/layers/chain.py", line 54, in forward
    Y, inc_layer_grad = layer(X, is_train=is_train)
  File "~/.venv/lib/python3.8/site-packages/thinc/model.py", line 310, in __call__
    return self._func(self, X, is_train=is_train)
  File "~/.venv/lib/python3.8/site-packages/thinc/layers/hashembed.py", line 72, in forward
    output = model.ops.gather_add(vectors, keys)
  File "thinc/backends/numpy_ops.pyx", line 460, in thinc.backends.numpy_ops.NumpyOps.gather_add
  File "stringsource", line 660, in View.MemoryView.memoryview_cwrapper
  File "stringsource", line 350, in View.MemoryView.memoryview.__cinit__
ValueError: buffer source array is read-only
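
A possible explanation, offered as an assumption rather than something confirmed in this report: Ray hands NumPy arrays back from its object store as read-only views, and the Cython memoryview in the last frames of the trace asks for a writable buffer. The sketch below does not use Ray or spaCy; it only simulates a read-only array with plain NumPy to show the flag involved and how a copy restores writability. The helper names make_readonly_view and ensure_writable are hypothetical.

```python
import numpy as np


def make_readonly_view(arr: np.ndarray) -> np.ndarray:
    """Return a read-only view of arr (a stand-in for an array fetched from a shared store)."""
    view = arr.view()
    view.flags.writeable = False
    return view


def ensure_writable(arr: np.ndarray) -> np.ndarray:
    """Return arr unchanged if it is writable, otherwise a writable copy."""
    return arr if arr.flags.writeable else arr.copy()


weights = np.arange(6, dtype="float32").reshape(3, 2)
shared = make_readonly_view(weights)

# Writing through the read-only view fails with a ValueError.
try:
    shared[0, 0] = 1.0
    write_failed = False
except ValueError:
    write_failed = True

# A copy owns its own memory, so it is writable and leaves the source untouched.
local = ensure_writable(shared)
local[0, 0] = 1.0
```

If this is the cause, the model weights would need to be copied inside the worker before use, or the model would need to be loaded there in the first place.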

Code to reproduce is:

from typing import List

import ray
import spacy


@ray.remote
def generate(nlp, corpus: str) -> List[str]:
    doc = list(nlp.pipe([corpus]))[0]
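    # noun_chunks is a generator of Span objects; return plain strings to match List[str]
    return [chunk.text for chunk in doc.noun_chunks]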


if __name__ == "__main__":
    ray.init()
    nlp = spacy.load(name="en_core_sci_sm")
    nlp_ref = ray.put(nlp)
    texts = ["sfdfdl?", "dgfhgfhjgj"] 
    ref_ids = [generate.remote(nlp_ref, text) for text in texts]

    while len(ref_ids):
        processed, unprocessed = ray.wait(ref_ids)
        ref_ids = unprocessed
        if processed:
            print(ray.get(processed))

Note: The error disappears when the model is loaded inside the remote task instead of being passed in as a Ray object reference.
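
The workaround in the note above could look roughly like this: pass the model name to the task and load the model inside the worker, keeping it in a small per-process cache. This is a minimal sketch, not a confirmed fix. get_model, extract_noun_chunks and the _MODELS cache are hypothetical names, and spaCy is imported lazily so that only the worker needs it.

```python
from typing import Any, Callable, Dict, List

# Per-process cache so a worker loads each model at most once (hypothetical helper).
_MODELS: Dict[str, Any] = {}


def get_model(name: str, loader: Callable[[str], Any]) -> Any:
    """Load the model called name with loader on first use, then reuse it."""
    if name not in _MODELS:
        _MODELS[name] = loader(name)
    return _MODELS[name]


def extract_noun_chunks(model_name: str, corpus: str) -> List[str]:
    """Load the model in the current process and return noun chunk texts."""
    import spacy  # imported here so the load happens inside the worker process

    nlp = get_model(model_name, spacy.load)
    doc = nlp(corpus)
    return [chunk.text for chunk in doc.noun_chunks]
```

To run it with Ray, wrap the function with ray.remote(extract_noun_chunks) and call .remote("en_core_sci_sm", text) for each text, so that only strings cross the process boundary. Whether the cache saves repeated loads depends on how Ray reuses worker processes.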

I expected it to just work?

Your Environment

  • Operating System: Linux/mac
  • Python Version Used: 3.8.10
  • spaCy Version Used: '3.7.4'
  • Environment Information: