This repository has been archived by the owner on Nov 22, 2022. It is now read-only.

Support read file from http URL (#1317)
Summary:
## Motivation and Context

1) We want to read pre-trained model files from a public http/https URL using PathManager.
2) Since fvcore is available on PyPI, we should add it as a dependency so that we can use HTTPURLHandler to read http/https URLs.
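
As a rough illustration of what reading a model file from a URL involves, the sketch below downloads a URL into a local cache directory once and then opens the cached copy read-only. It uses only the standard library; `get_local_path`, `open_url`, and the `./url_cache` default are hypothetical names chosen for this example, and the caching scheme is an assumption, not fvcore's actual code.

```python
import hashlib
import os
import shutil
import urllib.request


def get_local_path(url: str, cache_dir: str) -> str:
    """Download `url` into `cache_dir` once and return the cached file path.

    Hypothetical helper: the cache key layout is an assumption for this sketch.
    """
    os.makedirs(cache_dir, exist_ok=True)
    # Prefix with a hash of the URL so equal file names from different URLs do not clash.
    digest = hashlib.sha256(url.encode("utf-8")).hexdigest()[:16]
    filename = os.path.basename(url.rstrip("/")) or "download"
    local_path = os.path.join(cache_dir, f"{digest}_{filename}")
    if not os.path.exists(local_path):
        # Write to a temporary name first so an interrupted download is never
        # mistaken for a complete cached file.
        tmp_path = local_path + ".tmp"
        with urllib.request.urlopen(url) as response, open(tmp_path, "wb") as out:
            shutil.copyfileobj(response, out)
        os.replace(tmp_path, local_path)
    return local_path


def open_url(url: str, mode: str = "r", cache_dir: str = "./url_cache"):
    """Open a remote file for reading via the local cache."""
    if mode not in ("r", "rb"):
        # A remote URL cannot be written to through this helper.
        raise ValueError(f"unsupported mode for a URL: {mode!r}")
    return open(get_local_path(url, cache_dir), mode)
```

A second `open_url` call with the same URL and cache directory reuses the cached file instead of downloading again.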

#1220

## How Has This Been Tested
I was able to download a model file in the Python console. I am not adding a unit test, because the test would have to be excluded for the internal use case, which adds complexity.

>>> with PathManager.open("https://dl.fbaipublicfiles.com/pytext/models/roberta/roberta_public.pt1", "rb") as f:
...   print(len(f.read()))
...
roberta_public.pt1:   9%|█████▊                                                           | 44.4M/497M [00:07<01:22, 5.47MB/s]
roberta_public.pt1: 497MB [01:19, 6.25MB/s]
497003753
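
For reference, a network-free check of an HTTP read could look roughly like the sketch below, which serves a temporary directory over a loopback HTTP server and reads one file back. This is only one possible approach, not a test that exists in the repository; `serve_directory`, `fetch_bytes`, and `check_http_read` are hypothetical names, and `fetch_bytes` uses `urllib` as a stand-in for a `PathManager.open(url, "rb")` call.

```python
import http.server
import os
import tempfile
import threading
import urllib.request
from functools import partial


class QuietHandler(http.server.SimpleHTTPRequestHandler):
    """Serves files from a directory without logging each request."""

    def log_message(self, format, *args):
        pass


def serve_directory(directory: str) -> http.server.ThreadingHTTPServer:
    """Start a background HTTP server on an OS-assigned loopback port."""
    handler = partial(QuietHandler, directory=directory)
    server = http.server.ThreadingHTTPServer(("127.0.0.1", 0), handler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server


def fetch_bytes(url: str) -> bytes:
    """Read the whole response body; bypasses any proxy set in the environment."""
    opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))
    with opener.open(url) as response:
        return response.read()


def check_http_read(payload: bytes = b"fake model weights") -> bytes:
    """Write `payload` to a temp file, serve it locally, and read it back over HTTP."""
    with tempfile.TemporaryDirectory() as tmp:
        with open(os.path.join(tmp, "model.pt1"), "wb") as f:
            f.write(payload)
        server = serve_directory(tmp)
        try:
            port = server.server_address[1]
            return fetch_bytes(f"http://127.0.0.1:{port}/model.pt1")
        finally:
            server.shutdown()
            server.server_close()
```

A test would then assert that `check_http_read(payload)` returns the same bytes that were written.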

## Types of changes

- [ ] Docs change / refactoring / dependency upgrade
- [ ] Bug fix (non-breaking change which fixes an issue)
- [x] New feature (non-breaking change which adds functionality)
- [ ] Breaking change (fix or feature that would cause existing functionality to change)

## Checklist

- [x] My code follows the code style of this project.
- [ ] My change requires a change to the documentation.
- [ ] I have updated the documentation accordingly.
- [x] I have read the **CONTRIBUTING** document.
- [x] I have completed my CLA (see **CONTRIBUTING**)
- [ ] I have added tests to cover my changes.
- [x] All new and existing tests passed.
Pull Request resolved: #1317

Reviewed By: m3rlin45

Differential Revision: D21011330

Pulled By: hudeven

fbshipit-source-id: 6a7696f460aaa26b2d6fc9d4ae431502efb23875
hudeven authored and facebook-github-bot committed Apr 15, 2020
1 parent 96ebe79 commit ba1b16e
Showing 4 changed files with 10 additions and 54 deletions.
1 change: 1 addition & 0 deletions docs_requirements.txt
@@ -2,6 +2,7 @@ https://download.pytorch.org/whl/cpu/torch-1.2.0%2Bcpu-cp37-cp37m-manylinux1_x86
click
fairseq
future
fvcore
hypothesis<4.0
mock
numpy
3 changes: 2 additions & 1 deletion pytext/__init__.py
@@ -12,7 +12,7 @@
from pytext.data.sources.data_source import DataSource
from pytext.task import load
from pytext.task.new_task import NewTask
from pytext.utils.file_io import PathManager
from pytext.utils.file_io import PathManager, register_http_url_handler
from pytext.workflow import _set_cuda

from .builtin_task import register_builtin_tasks
@@ -21,6 +21,7 @@


register_builtin_tasks()
register_http_url_handler()


Predictor = Callable[[Mapping[str, str]], Mapping[str, np.array]]
59 changes: 6 additions & 53 deletions pytext/utils/file_io.py
@@ -1,57 +1,10 @@
#!/usr/bin/env python3
# Copyright (c) Facebook, Inc. and its affiliates. All Rights Reserved
from fvcore.common.file_io import HTTPURLHandler, PathManager

"""
TODO: @stevenliu Deprecate this file after borc available in PyPI
"""
import os
import shutil
from typing import List


try: # noqa
    from fvcore.common.file_io import PathManager

except ImportError:

    class PathManager:
        @staticmethod
        def open(*args, **kwargs):
            return open(*args, **kwargs)

        @staticmethod
        def copy(*args, **kwargs) -> bool:
            try:
                shutil.copyfile(*args, **kwargs)
                return True
            except Exception as e:
                print("Error in file copy - {}".format(str(e)))
                return False

        @staticmethod
        def get_local_path(path: str) -> str:
            return path

        @staticmethod
        def exists(path: str) -> bool:
            return os.path.exists(path)

        @staticmethod
        def isfile(path: str) -> bool:
            return os.path.isfile(path)

        @staticmethod
        def isdir(path: str) -> bool:
            return os.path.isdir(path)

        @staticmethod
        def ls(path: str) -> List[str]:
            return os.listdir(path)

        @staticmethod
        def mkdirs(*args, **kwargs):
            os.makedirs(*args, exist_ok=True, **kwargs)

        @staticmethod
        def rm(*args, **kwargs):
            os.remove(*args, **kwargs)
def register_http_url_handler():
    """
    support reading file from url starting with "http://", "https://", "ftp://"
    """
    PathManager.register_handler(HTTPURLHandler(), allow_override=True)
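
To make the role of `register_handler(..., allow_override=True)` concrete, here is a minimal sketch of prefix-based handler dispatch. `SketchHandler` and `SketchPathManager` are hypothetical stand-ins written for this example; they mimic the general idea only and are not fvcore's implementation.

```python
from typing import IO, Callable, Dict, List


class SketchHandler:
    """Hypothetical handler: a list of path prefixes plus a function that opens a path."""

    def __init__(self, prefixes: List[str], opener: Callable[[str, str], IO]) -> None:
        self.prefixes = prefixes
        self.opener = opener


class SketchPathManager:
    """Hypothetical manager that routes a path to the handler owning its prefix."""

    def __init__(self) -> None:
        self._handlers: Dict[str, SketchHandler] = {}

    def register_handler(self, handler: SketchHandler, allow_override: bool = False) -> None:
        for prefix in handler.prefixes:
            # Without allow_override, a second registration for a prefix is an error.
            if prefix in self._handlers and not allow_override:
                raise KeyError(f"prefix {prefix!r} is already registered")
            self._handlers[prefix] = handler

    def open(self, path: str, mode: str = "r") -> IO:
        # Check longer prefixes first so the most specific handler wins.
        for prefix in sorted(self._handlers, key=len, reverse=True):
            if path.startswith(prefix):
                return self._handlers[prefix].opener(path, mode)
        # No handler matched: treat the path as a local file.
        return open(path, mode)
```

In this sketch, passing `allow_override=True` lets a registration replace an earlier handler for the same prefix instead of raising, which is presumably why the commit sets it when registering at import time.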
1 change: 1 addition & 0 deletions requirements.txt
@@ -1,6 +1,7 @@
click
fairseq
future
fvcore
hypothesis<4.0
joblib
numpy