WIP: Add spacy #368

Closed
wants to merge 5 commits into from

msabramo commented Mar 26, 2019

From https://spacy.io/:

spaCy is the best way to prepare text for deep learning. It interoperates seamlessly with TensorFlow, PyTorch, scikit-learn, Gensim and the rest of Python's awesome AI ecosystem. With spaCy, you can easily construct linguistically sophisticated statistical models for a variety of NLP problems.

@astrojuanlu

Weird errors coming from NumPy:

/home/circleci/repo/packages/spacy/build/spacy-2.1.3/include/numpy/npy_cpu.h:72:6: error: Unknown CPU, please report this to numpy maintainers with     information about your platform (OS, CPU and compiler)
    #error Unknown CPU, please report this to numpy maintainers with \
     ^
In file included from spacy/_align.cpp:608:
In file included from /home/circleci/repo/packages/spacy/build/spacy-2.1.3/include/numpy/arrayobject.h:15:
In file included from /home/circleci/repo/packages/spacy/build/spacy-2.1.3/include/numpy/ndarrayobject.h:17:
In file included from /home/circleci/repo/packages/spacy/build/spacy-2.1.3/include/numpy/ndarraytypes.h:8:
/home/circleci/repo/packages/spacy/build/spacy-2.1.3/include/numpy/npy_endian.h:42:10: error: Unknown CPU: can not set endianness
        #error Unknown CPU: can not set endianness
         ^

For comparison, I checked the logs for the pandas build (#150). The only differences I see are that pandas lists numpy as a dependency in its meta.yaml and that numpy itself is older (1.14.5).

I have no idea what is wrong, but in general I would love to see guidelines for contributing packages to pyodide.

msabramo commented Aug 8, 2019

I suspect that the weird NumPy errors about endianness are happening because spaCy bundles its own copies of some NumPy headers (including npy_endian.h) and those headers are missing the patch https://github.com/iodide-project/pyodide/blob/master/packages/numpy/patches/add-emscripten-cpu.patch

I'm experimenting with adding that patch for spaCy.
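
One way to test that hypothesis before patching anything is to scan the unpacked source tree for bundled copies of the NumPy CPU-detection header and check whether they mention Emscripten at all. The sketch below is hypothetical and not part of this PR. The function name `find_unpatched_headers` is made up. It assumes that a patched `npy_cpu.h` references the `__EMSCRIPTEN__` macro defined by the Emscripten compiler, and that the `npy_endian.h` error is a follow-on from `npy_cpu.h` not recognizing the CPU. Neither assumption has been verified against the patch itself.

```python
from pathlib import Path

# Assumption: a patched header mentions this macro somewhere in its text.
MARKER = "__EMSCRIPTEN__"

# Assumption: only npy_cpu.h needs the patch; pass other names to widen the scan.
HEADER_NAMES = ("npy_cpu.h",)


def find_unpatched_headers(root, names=HEADER_NAMES, marker=MARKER):
    """Return bundled header files under `root` whose text never mentions `marker`."""
    unpatched = []
    for path in sorted(Path(root).rglob("*.h")):
        if path.name not in names:
            continue
        # Headers may contain odd bytes; replace them instead of failing.
        text = path.read_text(encoding="utf-8", errors="replace")
        if marker not in text:
            unpatched.append(path)
    return unpatched
```

Calling `find_unpatched_headers("packages/spacy/build/spacy-2.1.3")` from the repository root would list any bundled `npy_cpu.h` that lacks the marker. An empty list would suggest the cause lies elsewhere.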

The [thinc](https://github.com/explosion/thinc) package is a dependency of [spaCy](https://spacy.io/).

msabramo commented Feb 7, 2020

I haven't made any progress on this. I'm going to close it; someone is welcome to pick up the work if they are interested.

msabramo closed this Feb 7, 2020
rth mentioned this pull request Jul 28, 2020

ghost commented Oct 23, 2021

Please do not close important, unresolved tickets.
