
Tokenization not working using v2.1 #3356

@rulai-huajunzeng

Description

How to reproduce the behaviour

I found a bug where tokenization is completely broken with version 2.1.0a10 on Python 2.7: the input is split into individual characters instead of words. I have reproduced this on three of my machines.

$ conda create -n py27_spacy2 python=2.7
$ source activate py27_spacy2
$ pip install -U spacy-nightly
$ python -m spacy download en_core_web_sm
$ python -c "import spacy; nlp=spacy.load('en_core_web_sm'); doc=nlp(u'hello world'); print ','.join([t.text for t in doc])"
h,e,ll,o,w,o,r,l,d
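
For comparison, a minimal sketch of the expected behaviour (assuming a working installation of the same en_core_web_sm model, shown here on Python 3 where print is a function):

$ python3 -c "import spacy; nlp = spacy.load('en_core_web_sm'); doc = nlp(u'hello world'); print(','.join([t.text for t in doc]))"
hello,world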

Your Environment

  • Operating System: Ubuntu
  • Python Version Used: 2.7
  • spaCy Version Used: 2.1.0a10
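
For reference, the installed spaCy and model versions can be double-checked with spaCy's own CLI (both commands exist in the 2.x line):

$ python -m spacy info        # prints spaCy version, platform and installed pipelines
$ python -m spacy validate    # checks installed models against the installed spaCy version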

Labels

  • bug: Bugs and behaviour differing from documentation
  • compat: Cross-platform and cross-Python compatibility
  • feat / tokenizer: Feature: Tokenizer
  • help wanted: Contributions welcome!
  • upgrade: Issues related to upgrading spaCy
