Segmentation fault (core dumped) #47

Closed
theSage21 opened this issue Jun 13, 2018 · 13 comments

Comments

@theSage21

theSage21 commented Jun 13, 2018

With Python 3.6.2, neuralcoref-3.0, and en_coref_sm, the following code produces a segfault.

import spacy
nlp = spacy.load('en_coref_sm')
nlp('''Although the Drive moved to Massachusetts for the 1994 season, the AFL had a number of other teams which it considered "dynasties", including the Tampa Bay Storm (the only team that has existed in some form for all twenty-eight contested seasons), their arch-rival the Orlando Predators, the now-defunct San Jose SaberCats of the present decade, and their rivals the Arizona Rattlers. Where did the Drive franchise relocate to?''')

What can I do to help?

@maxindelicato

I can confirm the issue too.

@thomwolf
Member

Can you give me some additional information?
What system are you using: Windows, Mac, or Linux?
Does it work on other examples?

@maxindelicato

maxindelicato commented Jun 13, 2018

I'm using exactly the same versions of Python, neuralcoref, and the model as @theSage21, on Ubuntu 16.04. The failure seems intermittent: it works on many examples but not on others. I can get more failing examples with some digging (I'm currently parsing a large file, sentence by sentence).

@thomwolf
Member

Can you share the file (or a part of it that doesn't work) with me? For example at thomas[at]huggingface[dot]co?

@theSage21
Author

My Ubuntu version is 18.04 LTS.

In addition, I'm parsing SQuAD 2.0 as the dataset. Download link is here

A sample script to generate text that causes the segfault:

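import json
import spacy

nlp = spacy.load('en_coref_sm')
# Skip every text up to and including this index
SKIP = 1039

with open('../data/train-v2.0.json', 'r') as f:
    df = json.load(f)

# Build one text per question: the paragraph context followed by the question
texts = []
for wiki in df['data']:
    for para in wiki['paragraphs']:
        for qas in para['qas']:
            text = para['context'] + ' ' + qas['question']
            texts.append(text)

for index, text in enumerate(texts):
    if index <= SKIP:
        continue
    # Print (and flush) before parsing so the crashing text is not lost in a buffer
    print(index, text, flush=True)
    nlp(text)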

This prints each text before parsing it, so the last text printed is the one that segfaulted. You'll have to run it again (with SKIP raised past that index) to find others, though.
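
One way to collect every failing index in a single pass, rather than rerunning the script after each crash, is to parse each text in a fresh interpreter and check how the child process exited. The following is a minimal sketch, not something posted in this thread: `CHILD_CODE`, `crashes`, and `find_crashing_indexes` are hypothetical names, and it assumes a POSIX system, where a child killed by a signal reports a negative return code.

```python
import subprocess
import sys

# Hypothetical child program: loads the model and parses one text read from stdin.
CHILD_CODE = (
    "import sys, spacy\n"
    "nlp = spacy.load('en_coref_sm')\n"
    "nlp(sys.stdin.read())\n"
)


def crashes(text, child_code=CHILD_CODE):
    """Return True if running child_code on `text` ends with the child killed by a signal."""
    result = subprocess.run(
        [sys.executable, "-c", child_code],
        input=text,
        universal_newlines=True,  # send `text` as str rather than bytes
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    # On POSIX, a negative return code is the negated signal number (-11 for SIGSEGV).
    return result.returncode < 0


def find_crashing_indexes(texts, child_code=CHILD_CODE):
    """Return the indexes of all texts whose child process was killed by a signal."""
    return [i for i, text in enumerate(texts) if crashes(text, child_code)]
```

Reloading the model for every text is slow, so this is probably only practical on a slice of the data, for example `find_crashing_indexes(texts[1040:1100])`.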

@petermartigny

I have the same issue on OS X 10.13.4, using Python 3.6.4, neuralcoref-3.0, and en_coref_md.
After removing the first token ("Although") of the failing sentence given above, it works normally.

On my side, I first noticed the error when using "As" as the first token:

import spacy
import en_coref_md
nlp = en_coref_md.load()
nlp("As")
Segmentation fault: 11
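
A segfault kills the interpreter without a Python traceback. Python's standard `faulthandler` module can print the Python-level stack at the moment of the crash, which may help narrow down a report like this one. A minimal sketch, where `parse_with_fault_traceback` is a hypothetical helper and the model loader is passed in as a callable:

```python
import faulthandler


def parse_with_fault_traceback(load_model, text):
    """Enable faulthandler, then load the model and parse `text`."""
    # On a fatal signal such as SIGSEGV, dump the Python traceback to stderr.
    faulthandler.enable()
    nlp = load_model()
    return nlp(text)


# Assumed usage with the model from this report:
#   import en_coref_md
#   parse_with_fault_traceback(en_coref_md.load, "As")
```

The same effect is available without code changes by running the script with `python -X faulthandler`. The dump lists Python frames only, so it shows which call was active rather than the exact location inside the compiled extension.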

@kyoungrok0517

I experience the same problem. Hope to see this solved.

@thomwolf
Member

I am working on it.

@thomwolf
Member

Thanks to everyone for helping here and giving examples (in particular @petermartigny and @theSage21).

I think the segfault bugs are now fixed, as well as the install scripts for the models. I am uploading new models (sm and md are already up; lg is still uploading). You can try them by following the install instructions in the README.

There should be no need to install NeuralCoref (or even spaCy) separately; everything is bundled in the models. The quickest way to install them is to run pip install MODEL_URL in a clean environment and then use them with:

import en_coref_sm # or en_coref_md or en_coref_lg
nlp = en_coref_sm.load()
doc = nlp("I have a dog, he is very smart")

Please tell me if there is an issue.

@theSage21
Author

I can verify that the entire SQuAD dataset passes through without problems.

Super fast too. 😄

Thanks for your work @thomwolf

@kyoungrok0517

Mine works too. Thanks!

@akshayparakh25

Hello @thomwolf ,

Environment details: Ubuntu 18 and Python 3.6.
The neuralcoref sample code from the website causes a segmentation fault. Could you please help?

@svlandeg
Collaborator

Hi @akshayparakh25, it's more convenient if you open a new issue, as this one is 1.5 years old and closed. From the log above, it looks like this was an issue in neuralcoref-3.0 that got fixed. You haven't mentioned the version you're using, but if it's 3.0, could you retry with 4.0?

If you keep running into issues, please open a new issue and provide more details (setup, versions, example data, example code).
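
The setup and version details requested here can be gathered with a few lines of standard-library Python. This is only a sketch: `environment_report` is a hypothetical helper, the package names are assumed to match how the packages were installed, and `importlib.metadata` requires Python 3.8 or later.

```python
import platform
import sys
from importlib import metadata


def environment_report(packages=("spacy", "neuralcoref")):
    """Return OS, Python, and package versions as a multi-line string."""
    lines = [
        "OS: " + platform.platform(),
        "Python: " + sys.version.split()[0],
    ]
    for name in packages:
        try:
            lines.append(name + ": " + metadata.version(name))
        except metadata.PackageNotFoundError:
            lines.append(name + ": not installed")
    return "\n".join(lines)


print(environment_report())
```

Pasting the printed report into a new issue, together with the shortest text that triggers the crash, covers most of the details asked for above.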
