Errors while running cell5 In V2 #10

Closed
JohnDavid07 opened this issue Jul 28, 2020 · 9 comments

Comments

@JohnDavid07

err

@JohnDavid07
Author

```python
exclusions = ['__MACOSX/']

destination = "/content/drive/My Drive/"
download_tasks = [
    {
        'folder': 'gdppci',
        # private URL, redacted by the poster
        'url': 'https://........................workers.dev/0:/..................................'
    },
]

print('##################################')
print('# Crawling all downloadable urls #')
print('##################################', end='\n\n')
tasks = []
for task in download_tasks:
    tasks += crawler_v2(task['url'], [], os.path.join(destination, task['folder']), 0, exclusions, verbose=False)

print(json.dumps(tasks, indent=2), end='\n\n')

total_size = get_filesize(sum([int(task['size']) for task in tasks]))

print(json.dumps(tasks, indent=2))
print('\nTotal Task:', len(tasks))
print('Total size: %.3fGB' % total_size, end='\n\n')
```

@JohnDavid07
Author

Can you please post a guide on how to use this?

@NullBruce

NullBruce commented Aug 22, 2020

@atlonxp can you please look into this? i can't find the problem with "tasks"

@atlonxp
Owner

atlonxp commented Aug 22, 2020

@JohnDavid07 @NullBruce could you provide the GoIndex link? I will try it when I have time.

@NullBruce

@atlonxp literally any link.

##################################
# Crawling all downloadable urls #
##################################

https://tutnetflix.mlwdl.workers.dev/FrontEndMasters%20-%20Complete%20Intro%20to%20Containers/
retry #2 https://tutnetflix.mlwdl.workers.dev/FrontEndMasters%20-%20Complete%20Intro%20to%20Containers/
retry #3 https://tutnetflix.mlwdl.workers.dev/FrontEndMasters%20-%20Complete%20Intro%20to%20Containers/
retry #4 https://tutnetflix.mlwdl.workers.dev/FrontEndMasters%20-%20Complete%20Intro%20to%20Containers/
retry #5 https://tutnetflix.mlwdl.workers.dev/FrontEndMasters%20-%20Complete%20Intro%20to%20Containers/

  • Data is missing! change a plan -
  • use terminal CURL -
    Nah, something went wrong!


JSONDecodeError Traceback (most recent call last)

in crawler_v2(url, downloading_dict, path, level, exclusions, verbose)
55 response = os.popen("curl --globoff {} -d ''".format(url.geturl())).read()
---> 56 response_json = json.loads(response)
57 except Exception as e:

4 frames

JSONDecodeError: Expecting value: line 1 column 1 (char 0)

During handling of the above exception, another exception occurred:

TypeError Traceback (most recent call last)

in crawler_v2(url, downloading_dict, path, level, exclusions, verbose)
57 except Exception as e:
58 print('Nah, something went wrong!')
---> 59 print(e.args())
60 return []
61 except Exception as e:

TypeError: 'tuple' object is not callable
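
Note on the traceback above: the second error is a side effect of the first. `e.args` is a tuple attribute, not a method, so `print(e.args())` raises `TypeError` inside the handler and hides the original `JSONDecodeError` (which itself means the response was empty or not JSON). A minimal sketch of a safer parse step, assuming the response arrives as a string; `parse_listing` is a hypothetical helper name used for illustration, not a function from the project:

```python
import json


def parse_listing(response_text):
    """Return the decoded JSON, or None if the body is empty or not JSON.

    Hypothetical helper for illustration; not part of the project's code.
    """
    if not response_text.strip():
        # Nothing came back, so json.loads would fail at line 1 column 1
        print('Empty response: nothing to parse')
        return None
    try:
        return json.loads(response_text)
    except json.JSONDecodeError as e:
        # e.args is a tuple attribute: read it, do not call it
        print('Nah, something went wrong!')
        print(e.args)
        return None
```

With this shape, an empty or HTML reply yields `None` and a readable message instead of a chained exception.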

@atlonxp
Owner

atlonxp commented Aug 23, 2020

Huh! You don't seem to be aware that tutflix (aka tutnetflix) has been banned from Cloudflare. The links you provided stopped being available a long time ago.

An easy way to check whether a link is working is to visit the GoIndex website in a browser.

  • if it displays its contents --> it is working
  • if it displays nothing except a loading progress bar --> it is not working at all.
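
The same check can be roughly approximated in code. This is not the browser check described above but a programmatic stand-in, and it rests on an assumption: that a working index answers the crawler's empty POST (the `curl ... -d ''` call in the traceback) with JSON, which is what `json.loads(response)` implies. The names `is_json_listing` and `link_is_working` are hypothetical:

```python
import json
import urllib.request


def is_json_listing(body):
    """True if the body decodes as a JSON object or array."""
    try:
        return isinstance(json.loads(body), (dict, list))
    except ValueError:
        # json.JSONDecodeError is a subclass of ValueError
        return False


def link_is_working(url, timeout=10):
    """POST an empty body, as the crawler's curl call does, and check for JSON."""
    request = urllib.request.Request(url, data=b"", method="POST")
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            body = response.read().decode("utf-8", errors="replace")
    except OSError:
        # URLError, HTTPError and socket timeouts all derive from OSError
        return False
    return is_json_listing(body)
```

Calling `link_is_working(url)` on an index URL before crawling would separate dead links from parsing problems, under the assumption stated above.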

@atlonxp closed this as completed Aug 23, 2020

@NullBruce

@atlonxp here's a link that doesn't work; I also tried multiple others that are up.

##################################
# Crawling all downloadable urls #
##################################

https://manga.td-index.workers.dev/0:/
retry #2 https://manga.td-index.workers.dev/0:/
retry #3 https://manga.td-index.workers.dev/0:/
retry #4 https://manga.td-index.workers.dev/0:/
retry #5 https://manga.td-index.workers.dev/0:/

  • Data is missing! change a plan -
  • use terminal CURL -
    Nah, something went wrong!


JSONDecodeError Traceback (most recent call last)

in crawler_v2(url, downloading_dict, path, level, exclusions, verbose)
55 response = os.popen("curl --globoff {} -d ''".format(url.geturl())).read()
---> 56 response_json = json.loads(response)
57 except Exception as e:

4 frames

JSONDecodeError: Expecting value: line 1 column 1 (char 0)

During handling of the above exception, another exception occurred:

TypeError Traceback (most recent call last)

in crawler_v2(url, downloading_dict, path, level, exclusions, verbose)
57 except Exception as e:
58 print('Nah, something went wrong!')
---> 59 print(e.args())
60 return []
61 except Exception as e:

TypeError: 'tuple' object is not callable

@Rudo2204
Contributor

@NullBruce Acrous index is not yet supported. See #7
I tried to poke around a bit but nothing seems to work :(

@atlonxp
Owner

atlonxp commented Aug 24, 2020

@Rudo2204 @NullBruce I need to look into how Acrous works. I think it is just a theme, but there may well be some script for dynamic content generation (which would be causing the problem).
