Added NER Support for Yoruba #2230

Merged
merged 4 commits into from Apr 19, 2021


paula813
Contributor

NER Support for Yoruba (Introduction task for NLP lecture)

@alanakbik
Collaborator

Hello @paula813, running the following code:

from flair.datasets import YORUBA_NER

corpus = YORUBA_NER()

throws the following error:

Traceback (most recent call last):
  File "/home/alan/PycharmProjects/flair/local_load_datasets.py", line 10, in <module>
    corpus = YORUBA_NER()
  File "/home/alan/PycharmProjects/flair/flair/datasets/sequence_labeling.py", line 1632, in __init__
    cached_path(f"{yoruba_path}test.txt", Path("datasets") / dataset_name)
  File "/home/alan/PycharmProjects/flair/flair/file_utils.py", line 91, in cached_path
    return get_from_cache(url_or_filename, dataset_cache)
  File "/home/alan/PycharmProjects/flair/flair/file_utils.py", line 220, in get_from_cache
    f"HEAD request failed for url {url} with status code {response.status_code}."
OSError: HEAD request failed for url https://github.com/masakhane-io/masakhane-ner/tree/main/data/yortest.txt with status code 404.

I think the yoruba_path variable is wrong or missing a trailing slash: the requested URL ends in yortest.txt rather than yor/test.txt.
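
To illustrate the diagnosis: the traceback builds the URL with f"{yoruba_path}test.txt", so a base path without a trailing slash fuses the directory name and the file name. The sketch below reproduces that with the base value inferred from the failing URL (the URL minus its "test.txt" suffix). join_url is a hypothetical helper, not part of flair, and the sketch does not check whether the joined URL actually serves the file.

```python
# Base value inferred from the traceback: the failing URL minus "test.txt".
yoruba_path = "https://github.com/masakhane-io/masakhane-ner/tree/main/data/yor"


def join_url(base: str, filename: str) -> str:
    """Hypothetical helper: join base and filename with exactly one slash."""
    return f"{base.rstrip('/')}/{filename.lstrip('/')}"


# Plain concatenation, as in the traceback: directory and file name fuse.
broken = f"{yoruba_path}test.txt"

# Joining with an explicit separator keeps them apart, whether or not
# the base already ends with a slash.
joined = join_url(yoruba_path, "test.txt")

print(broken)
print(joined)
```

Normalizing the separator in one place means the base path can be written with or without a trailing slash and still produce the same URL for every split file.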

Collaborator

@alanakbik left a comment

Fix download bug

@paula813
Contributor Author

Thanks for the information, I fixed the URL.

@alanakbik
Collaborator

The URL is fixed, but the code still does not run:

from flair.datasets import YORUBA_NER

corpus = YORUBA_NER()

You can check #2227 for a related PR that works. (And please test your code before putting in a PR.)
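
One way to test a dataset loader before opening a PR is a small smoke check that constructs the corpus and confirms every split is non-empty. The sketch below is an assumption about what such a check could look like, not something taken from this PR: check_corpus_splits is a hypothetical helper, and it relies only on the corpus object exposing train, dev and test attributes. A stand-in object is used so the example runs without flair or network access.

```python
from types import SimpleNamespace


def check_corpus_splits(load_corpus, splits=("train", "dev", "test")):
    """Call load_corpus() and return {split: size}.

    Raises ValueError if any split is missing or empty.
    """
    corpus = load_corpus()
    sizes = {}
    for name in splits:
        data = getattr(corpus, name, None)
        if data is None or len(data) == 0:
            raise ValueError(f"split '{name}' is missing or empty")
        sizes[name] = len(data)
    return sizes


# Stand-in corpus (hypothetical data) so the sketch runs on its own.
fake = SimpleNamespace(train=[1, 2, 3], dev=[4], test=[5, 6])
print(check_corpus_splits(lambda: fake))
```

With flair installed, the same check would be check_corpus_splits(YORUBA_NER) after from flair.datasets import YORUBA_NER, which also exercises the download step that failed above.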

@paula813
Contributor Author

paula813 commented Apr 19, 2021

I tested it, and I think training a model works now. However, once training has finished, the process reports:
Process finished with exit code 139 (interrupted by signal 11: SIGSEGV)
I have no idea why that happens.
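
For reference, exit code 139 follows the common shell convention of reporting 128 plus the signal number when a process is killed by a signal, so 139 = 128 + 11, which is SIGSEGV. The sketch below decodes such a code with Python's standard signal module; describe_exit_code is a hypothetical helper, and it says nothing about what caused the crash in this case.

```python
import signal


def describe_exit_code(code: int) -> str:
    """Hypothetical helper: decode a shell-style exit status."""
    if code > 128:
        signum = code - 128  # shells report 128 + signal number
        try:
            name = signal.Signals(signum).name
        except ValueError:
            # Not a signal number known on this platform.
            return f"terminated by unknown signal {signum}"
        return f"terminated by {name} (signal {signum})"
    return f"exited with status {code}"


print(describe_exit_code(139))
```

To locate the crash itself, calling the standard-library faulthandler.enable() at the start of the training script makes Python dump its traceback when the interpreter receives SIGSEGV.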

@alanakbik
Collaborator

@paula813 thanks, not sure about the error but the dataset seems to work now!

@alanakbik merged commit 1fa14e4 into flairNLP:master on Apr 19, 2021