This repository has been archived by the owner on Jan 15, 2024. It is now read-only.

test datasets issue with unittest #60

Closed
cgraywang opened this issue Apr 22, 2018 · 1 comment

@cgraywang
Contributor

This caused PR #53 to fail.

It might be the same issue as #56.

The error message:

tests/unittest/test_datasets.py::test_rare_words Downloading tests/data/rarewords/rw.zip from http://www-nlp.stanford.edu/~lmthang/morphoNLM/rw.zip...

FAILED

=================================== FAILURES ===================================

_______________________________ test_rare_words ________________________________

def test_rare_words():
  data = nlp.data.RareWords(root=os.path.join('tests', 'data', 'rarewords'))

tests/unittest/test_datasets.py:172:


gluonnlp/data/word_embedding_evaluation.py:379: in __init__

super(RareWords, self).__init__(root=root)

gluonnlp/data/word_embedding_evaluation.py:163: in __init__

super(WordSimilarityEvaluationDataset, self).__init__(root=root)

gluonnlp/data/word_embedding_evaluation.py:120: in __init__

self._download_data()

gluonnlp/data/word_embedding_evaluation.py:131: in _download_data

verify=self._verify_ssl)

url = 'http://www-nlp.stanford.edu/~lmthang/morphoNLM/rw.zip'

path = 'tests/data/rarewords', overwrite = False

sha1_hash = 'bf9c5959a0a2d7ed8e51d91433ac5ebf366d4fb9', verify = True

def download(url, path=None, overwrite=False, sha1_hash=None, verify=True):

    """Download an given URL



    Parameters

    ----------

    url : str

        URL to download

    path : str, optional

        Destination path to store downloaded file. By default stores to the

        current directory with same name as in url.

    overwrite : bool, optional

        Whether to overwrite destination file if already exists.

    sha1_hash : str, optional

        Expected sha1 hash in hexadecimal digits. Will ignore existing file when hash is specified

        but doesn't match.

    verify : bool

        Toggle verification of SSL certificates.



    Returns

    -------

    str

        The file path of the downloaded file.

    """

    if path is None:

        fname = url.split('/')[-1]

    else:

        path = os.path.expanduser(path)

        if os.path.isdir(path):

            fname = os.path.join(path, url.split('/')[-1])

        else:

            fname = path



    if overwrite or not os.path.exists(fname) or (

            sha1_hash and not check_sha1(fname, sha1_hash)):

        dirname = os.path.dirname(os.path.abspath(os.path.expanduser(fname)))

        if not os.path.exists(dirname):

            os.makedirs(dirname)



        print('Downloading %s from %s...' % (fname, url))

        r = requests.get(url, stream=True, verify=verify)

        if r.status_code != 200:
          raise RuntimeError('Failed downloading url %s' % url)

E RuntimeError: Failed downloading url http://www-nlp.stanford.edu/~lmthang/morphoNLM/rw.zip

@leezu
Contributor

leezu commented Apr 22, 2018

It's related to #56, but this time a different external server is at fault. We can't redistribute this dataset; however, #58 and #62 should solve the issue.
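
The thread does not say what #58 and #62 actually change, so the following is only a general illustration of one way a test can tolerate a flaky external server, not the fix that was merged. It retries a callable a few times when it raises `RuntimeError`, which is the exception the `download` function in the traceback above raises on a non-200 response. The helper name `call_with_retries` and its parameters are hypothetical and are not part of gluonnlp.

```python
import time


def call_with_retries(func, retries=3, delay=1.0, exceptions=(RuntimeError,)):
    """Call func() and retry when it raises one of `exceptions`.

    Hypothetical helper for illustration only; not part of gluonnlp.
    The last failure is re-raised once `retries` attempts are used up.
    """
    if retries < 1:
        raise ValueError('retries must be at least 1')
    for attempt in range(1, retries + 1):
        try:
            return func()
        except exceptions:
            if attempt == retries:
                # Out of attempts: surface the original error to the caller.
                raise
            # Wait before trying again so a briefly unavailable server can recover.
            time.sleep(delay)


# Possible use in the failing test (assumes `nlp` and `os` are imported there):
# data = call_with_retries(
#     lambda: nlp.data.RareWords(root=os.path.join('tests', 'data', 'rarewords')))
```

Retrying only helps with short outages. If the server stays down, the test still fails, so this is a mitigation rather than a replacement for caching the data or otherwise removing the dependency on the external host.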

@leezu leezu closed this as completed Apr 22, 2018