Added error handling #18

alireza1111 · 2023-11-17T14:51:57Z

No description provided.

borsna · 2023-11-20T08:54:34Z

daget/__main__.py

@@ -22,22 +22,22 @@ def main():
  # get doi/url and resolve to landing page
  try:
    url = get_redirect_url(args.url)
+    res = [ele for ele in ["dataverse.harvard.edu", "dataverse.no", "snd.se/catalogue", "snd.gu.se", "su.figshare.com", "figshare.scilifelab.se", "zenodo.org"] if(ele in url)]


Checking the URL against a hardcoded list will not cover all cases: any repository whose schema.org metadata contains distribution information will work, and the list of figshare repositories is quite long: https://knowledge.figshare.com/type-of-client/institutions

A more generic solution would be to check if get_file_list_from_repo() returns any values.
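
A rough sketch of what that could look like, not a definitive implementation. The signature and return type of `get_file_list_from_repo()` are assumptions here (it is stubbed out below), `require_file_list` is a hypothetical helper name, and `RepoError` is a stand-in for the class in `daget.exceptions`:

```python
class RepoError(Exception):
    """Stand-in for daget.exceptions.RepoError (its real constructor may differ)."""


def get_file_list_from_repo(url):
    # Hypothetical stub: the real function lives in daget and talks to the repository.
    # It is assumed here to return a list of file URLs, empty when nothing is found.
    known = {"https://example.org/dataset/1": ["https://example.org/files/a.csv"]}
    return known.get(url, [])


def require_file_list(url, fetch=get_file_list_from_repo):
    """Return the file list for url, or raise RepoError if the repository yields none."""
    files = fetch(url)
    if not files:
        # No hardcoded host list: a repository is unsupported only if no files are found.
        raise RepoError(f"{url}: no downloadable files found")
    return files
```

With this shape, any repository that exposes a file list works without being named in the source, and unsupported ones fail with a specific error.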

borsna · 2023-11-20T08:57:08Z

daget/utils.py

-  except urllib.error.HTTPError:
-    raise ResolveError(f"{url} not found") 
+
+  try:


Looks good; raising more precise errors is an improvement.

borsna · 2023-11-20T09:00:48Z

daget/__main__.py

    exit(1)

  print(f'landing page: {url}')

  # get desitnation directory and create directory
  desitnation = os.path.realpath(args.destination)

-  if not os.path.exists(desitnation):


Not sure why this was removed. Checking for an empty destination directory was a feature I added to make sure the downloaded dataset is exactly the same as the remote source.
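
For reference, a minimal sketch of the kind of check meant here, assuming the intended behaviour is "create the directory if missing, refuse if it already has content". `prepare_destination` is a hypothetical helper name and the exception types are illustrative; the removed code may have handled this differently:

```python
import os


def prepare_destination(path):
    """Create path if it is missing; refuse to use it if it already contains entries."""
    destination = os.path.realpath(path)
    if not os.path.exists(destination):
        os.makedirs(destination)
    elif not os.path.isdir(destination):
        raise NotADirectoryError(f"{destination} is not a directory")
    elif os.listdir(destination):
        # Existing files would make the local copy differ from the remote dataset.
        raise FileExistsError(f"{destination} is not empty")
    return destination
```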

borsna · 2023-12-15T10:17:29Z

daget/exceptions.py

+        super().__init__(message)
+        self.url = url
+        self.supported_urls = supported_urls or ["dataverse.harvard.edu", "dataverse.no", "snd.se/catalogue", "su.figshare.com", "figshare.scilifelab.se", "zenodo.org"]


The hardcoded list of repository URLs should be removed.
daget should try to get a file list via schema.org distribution (if it's not figshare or zenodo), and if this fails it should throw the error instead. Keeping a list of all supported URLs in the source code is not a sustainable solution.
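
Something along these lines, as a sketch only. It assumes the landing page's schema.org record has already been parsed into a dict; `files_from_schema_org` is a hypothetical name, and `RepoError` below is a stand-in whose constructor is an assumption, not the class from this PR:

```python
class RepoError(Exception):
    """Stand-in for daget.exceptions.RepoError; keeps the URL, no list of supported hosts."""

    def __init__(self, url, message=None):
        super().__init__(message or f"{url}: no schema.org distribution found")
        self.url = url


def files_from_schema_org(url, metadata):
    """Collect contentUrl values from the distribution of a parsed schema.org record."""
    distribution = metadata.get("distribution") or []
    if isinstance(distribution, dict):
        # A single DataDownload is not always wrapped in a list.
        distribution = [distribution]
    files = [
        entry["contentUrl"]
        for entry in distribution
        if isinstance(entry, dict) and entry.get("contentUrl")
    ]
    if not files:
        # The failure itself signals an unsupported repository.
        raise RepoError(url)
    return files
```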

borsna · 2023-12-15T10:17:46Z

daget/utils.py

 import urllib, urllib.error
 from daget.exceptions import RepoError, ResolveError


 def get_redirect_url(url):
  # if url provided is a shorthand doi (TODO: check with regex)
-  if not url.startswith(('http://', 'https://')):
+  if not re.match(r'^https?://', url):


borsna reviewed Nov 20, 2023

Added error handling

8482899

alireza1111 force-pushed the main branch from b802c8c to 8482899 · December 15, 2023 09:58

borsna reviewed Dec 15, 2023
