Error while running NER #321

Closed
yogeshhk opened this issue Jun 15, 2021 · 12 comments
Labels
bug Something isn't working

Comments

@yogeshhk
Contributor

🐛 Bug

To Reproduce

Steps to reproduce the behavior:

  1. Run
    python examples/pytorch/name_entity_recognition/main.py --graph_type dependency_graph --gpu 0 --init_hidden_size 400 --hidden_size 128 --lr 0.01 --batch_size 100 --gnn_type graphsage --direction_option undirected

  2. Getting
    TypeError: Can't instantiate abstract class ConllDataset with abstract methods download

Expected behavior

Environment

  • Graph4NLP Version (e.g., 0.4.1):
  • Backend Library & Version (e.g., PyTorch 1.6.0): 1.8.1
  • OS (e.g., Linux): Windows 10
  • How you installed Graph4NLP (pip, source): source
  • Build command you used (if compiling from source): python setup.py install
  • Python version: 3.6.5
  • CUDA/cuDNN version (if applicable): CPU
  • GPU models and configuration (e.g. 2080Ti):
  • Any other relevant information:

Additional context

@AlanSwift
Contributor

Thank you for the feedback.
We will look into this issue as soon as possible.
@xguo7 will follow up on it.

@AlanSwift
Contributor

Could you please check whether the raw data exists on your computer? (Please refer to https://github.com/graph4ai/graph4nlp/tree/master/examples/pytorch/name_entity_recognition/conll/raw.)
Currently, the download function is not implemented, so the raw data must already be present in your copy of the repo. (The download function will be implemented in a future version.) We are sorry for the inconvenience.
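
A quick way to run that check is sketched below. It is a minimal sketch, not part of the library: the directory is taken from the URL above, the three file names are assumed to be the ones shipped in that folder, and `missing_raw_files` is a hypothetical helper name. Run it from the root of the cloned repo.

```python
from pathlib import Path

# Directory taken from the URL above. The file names are an assumption:
# they are the files expected in the repo's conll/raw folder.
RAW_DIR = Path("examples/pytorch/name_entity_recognition/conll/raw")
RAW_FILES = ("eng.train", "eng.testa", "eng.testb")


def missing_raw_files(raw_dir=RAW_DIR, names=RAW_FILES):
    """Return the expected raw files that are not present under raw_dir."""
    raw_dir = Path(raw_dir)
    return [name for name in names if not (raw_dir / name).is_file()]


if __name__ == "__main__":
    missing = missing_raw_files()
    print("Raw data directory:", RAW_DIR.resolve())
    print("Missing files:", missing if missing else "none")
```

An empty list means all three files were found relative to the current working directory, which is why the directory the command is run from matters.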

@yogeshhk
Contributor Author

yogeshhk commented Jun 17, 2021

I see 3 files there:
eng.train
eng.testa
eng.testb

The files and their contents look fine; they contain IOB-tagged data.

@AlanSwift added the bug label Jun 17, 2021
@AlanSwift
Contributor

We have run several tests on different computers running Windows 10 and cannot reproduce this problem.
From which directory did you run this command?

@yogeshhk
Contributor Author

I ran it from the root of the graph4nlp folder, which is a clone of my fork. From that same path, the text classification examples work. Here is the call stack:

(graph4nlp) graph4nlp>python examples/pytorch/name_entity_recognition/main.py --graph_type dependency_graph --gpu 0 --init_hidden_size 400 --hidden_size 128 --lr 0.01 --batch_size 100 --gnn_type graphsage --direction_option undirected
Using backend: pytorch
starting build the dataset
Traceback (most recent call last):
  File "examples/pytorch/name_entity_recognition/main.py", line 547, in <module>
    runner = Conll()
  File "examples/pytorch/name_entity_recognition/main.py", line 319, in __init__
    self._build_dataloader()
  File "examples/pytorch/name_entity_recognition/main.py", line 342, in _build_dataloader
    tag_types=self.tag_types)
  File "C:\Users\yogesh.kulkarni\AppData\Local\Continuum\anaconda3\envs\graph4nlp\lib\typing.py", line 1231, in __new__
    return _generic_new(cls.__next_in_mro__, cls, *args, **kwds)
  File "C:\Users\yogesh.kulkarni\AppData\Local\Continuum\anaconda3\envs\graph4nlp\lib\typing.py", line 1186, in _generic_new
    return base_cls.__new__(cls)
TypeError: Can't instantiate abstract class ConllDataset with abstract methods download
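
For context, this `TypeError` comes from Python's `abc` machinery and is raised at instantiation time, before any method of the dataset runs. It means the class being instantiated has not overridden an abstract method. The sketch below reproduces the behavior with hypothetical stand-in classes (`BaseDataset`, `BrokenDataset`, `WorkingDataset` are illustrative names, not graph4nlp classes). The thread does not confirm why the imported `ConllDataset` ended up in that state.

```python
from abc import ABC, abstractmethod


class BaseDataset(ABC):
    """Hypothetical stand-in for an abstract dataset base class."""

    @abstractmethod
    def download(self):
        """Fetch the raw data; subclasses must override this."""


class BrokenDataset(BaseDataset):
    """Does not override download, so the class is still abstract."""


class WorkingDataset(BaseDataset):
    def download(self):
        # Overriding is enough to make the class instantiable, even if the
        # body only raises when the method is actually called.
        raise NotImplementedError("Please prepare the raw data yourself.")


def try_instantiate(cls):
    """Return an instance of cls, or the TypeError message if cls is abstract."""
    try:
        return cls()
    except TypeError as exc:
        return str(exc)


print(try_instantiate(BrokenDataset))                  # message from abc
print(type(try_instantiate(WorkingDataset)).__name__)  # class name of the instance
```

The exact wording of the message varies between Python versions, but it always names the abstract method that was not overridden.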

@AlanSwift
Contributor

AlanSwift commented Jun 17, 2021

This looks weird.
Could you please add the following code

    import os
    print("The raw data's path is", self.raw_dir)
    print(os.path.exists(self.raw_dir))

right after the line

    def _download(self):

and check whether the raw data exists?

@yogeshhk
Contributor Author

It's not reaching that code. Let me debug further and I will keep you posted.

@yogeshhk
Contributor Author

In conll.py:

    # Needs `import os` at the top of the file if it is not already there.
    def download(self):
        print("The raw data's path is", self.raw_dir)
        print(os.path.exists(self.raw_dir))
        # raise NotImplementedError(
        #     'This dataset is now under test and cannot be downloaded. Please prepare the raw data yourself.')

With this change it works, but I am not sure it is a good change. I will debug this further.

@AlanSwift
Contributor


Actually, the download function should never be executed when the raw data is present, so I guess something else is going wrong. Since we can't reproduce this problem, I suggest you debug it further. Thank you!

@SaizhuoWang
Contributor

SaizhuoWang commented Jun 17, 2021

To make it clearer: when a Dataset is instantiated (in this case the ConllDataset), the library checks whether the raw data is present, that is, whether the raw directory exists and contains the files listed in the raw_file_names property. If the raw data is not present, the download method is called to fetch it. ConllDataset does not implement download, which means the raw data must already be present, as it is in the GitHub repo. Otherwise, a NotImplementedError is raised when the unimplemented download method is called.
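
The flow described above can be sketched as follows. This is a simplified illustration under stated assumptions, not the library's code: `SketchDataset`, `_raw_files_present`, and `build` are hypothetical names, and only `raw_dir`, `raw_file_names`, and `download` mirror names mentioned in this thread.

```python
import os
import tempfile


class SketchDataset:
    """Hypothetical dataset illustrating the check; not the graph4nlp class."""

    raw_file_names = ("eng.train", "eng.testa", "eng.testb")

    def __init__(self, root):
        self.raw_dir = os.path.join(root, "raw")
        # Only fall back to download() when some raw file is missing.
        if not self._raw_files_present():
            self.download()

    def _raw_files_present(self):
        return all(
            os.path.isfile(os.path.join(self.raw_dir, name))
            for name in self.raw_file_names
        )

    def download(self):
        raise NotImplementedError(
            "This dataset cannot be downloaded. Please prepare the raw data yourself."
        )


def build(root):
    """Return 'ok' if construction succeeds, else the exception's class name."""
    try:
        SketchDataset(root)
        return "ok"
    except NotImplementedError as exc:
        return type(exc).__name__


with tempfile.TemporaryDirectory() as root:
    print(build(root))  # raw files missing, so download() is called and raises
    os.makedirs(os.path.join(root, "raw"))
    for name in SketchDataset.raw_file_names:
        open(os.path.join(root, "raw", name), "w").close()
    print(build(root))  # raw files present, so download() is skipped
```

Note that in this flow a missing file surfaces as `NotImplementedError` from the call to `download`, which is a different failure from a `TypeError` raised while instantiating an abstract class.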

@AlanSwift
Contributor

I will close this issue.

@vinven7

vinven7 commented Mar 5, 2022

@AlanSwift @SaizhuoWang @yogeshhk I am having this exact issue. I raised a new issue before I saw this one. Could you please help?


4 participants