Errors while running cell5 In V2 #10

JohnDavid07 · 2020-07-28T16:00:53Z

JohnDavid07 · 2020-07-28T16:06:00Z

```python
exclusions = ['__MACOSX/']

destination = "/content/drive/My Drive/"
download_tasks = [
    {
        'folder': 'gdppci',
        # private URL, redacted
        'url': 'https://........................workers.dev/0:/..................................'
    },
]

print('##################################')
print('# Crawling all downloadable urls #')
print('##################################', end='\n\n')
tasks = []
for task in download_tasks:
    tasks += crawler_v2(task['url'], [], os.path.join(destination, task['folder']), 0, exclusions, verbose=False)

print(json.dumps(tasks, indent=2), end='\n\n')

total_size = get_filesize(sum([int(task['size']) for task in tasks]))

print(json.dumps(tasks, indent=2))
print('\nTotal Task:', len(tasks))
print('Total size: %.3fGB' % total_size, end='\n\n')
```

JohnDavid07 · 2020-07-28T16:08:35Z

Can you please post a guide on how to use this?

NullBruce · 2020-08-22T14:34:20Z

@atlonxp can you please look into this? I can't find the problem with "tasks".

atlonxp · 2020-08-22T20:49:52Z

@JohnDavid07 @NullBruce could you provide the GoIndex link? I will try it when I have time.

NullBruce · 2020-08-23T13:45:17Z

@atlonxp literally any link.

##################################
# Crawling all downloadable urls #
##################################

https://tutnetflix.mlwdl.workers.dev/FrontEndMasters%20-%20Complete%20Intro%20to%20Containers/
retry #2 https://tutnetflix.mlwdl.workers.dev/FrontEndMasters%20-%20Complete%20Intro%20to%20Containers/
retry #3 https://tutnetflix.mlwdl.workers.dev/FrontEndMasters%20-%20Complete%20Intro%20to%20Containers/
retry #4 https://tutnetflix.mlwdl.workers.dev/FrontEndMasters%20-%20Complete%20Intro%20to%20Containers/
retry #5 https://tutnetflix.mlwdl.workers.dev/FrontEndMasters%20-%20Complete%20Intro%20to%20Containers/

Data is missing! change a plan -
use terminal CURL -
Nah, something went wrong!

JSONDecodeError Traceback (most recent call last)

in crawler_v2(url, downloading_dict, path, level, exclusions, verbose)
55 response = os.popen("curl --globoff {} -d ''".format(url.geturl())).read()
---> 56 response_json = json.loads(response)
57 except Exception as e:

4 frames

JSONDecodeError: Expecting value: line 1 column 1 (char 0)

During handling of the above exception, another exception occurred:

TypeError Traceback (most recent call last)

in crawler_v2(url, downloading_dict, path, level, exclusions, verbose)
57 except Exception as e:
58 print('Nah, something went wrong!')
---> 59 print(e.args())
60 return []
61 except Exception as e:

TypeError: 'tuple' object is not callable
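
A note on the traceback above: the second exception is a separate bug from the failed request. `e.args` is a tuple attribute on the exception, not a method, so `print(e.args())` raises the `TypeError` and hides the original `JSONDecodeError`. Below is a minimal sketch of a handler without that call. `parse_listing` is a hypothetical name used only for illustration and is not a function from the notebook; the real `crawler_v2` does more than this.

```python
import json


def parse_listing(response: str):
    """Parse a raw response string; return [] when it is not valid JSON.

    Hypothetical helper mirroring lines 55-60 of the traceback, not the
    notebook's actual code.
    """
    try:
        return json.loads(response)
    except json.JSONDecodeError as e:
        print('Nah, something went wrong!')
        # e.args is a tuple attribute, so it is printed without calling it.
        print(e.args)
        return []
```

With this change an empty response still ends in `return []`, but the message that gets printed is the JSON decoding error itself rather than `'tuple' object is not callable`.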

atlonxp · 2020-08-23T16:54:47Z

Huh! You don't seem to be aware that tutflix (aka tutnetflix) has been banned from Cloudflare. The links you provided have been unavailable for a long time.

An easy way to check whether a link is working is to open the GoIndex site in a browser.

if it displays its contents --> it is working
if it displays nothing except the loading progress bar --> it is not working at all.
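
The browser check has a rough programmatic counterpart, sketched here under one assumption taken from the traceback: the crawler sends an empty POST (`curl ... -d ''`) and expects JSON back. `fetch_listing_body` and `looks_crawlable` are hypothetical names, not part of the notebook, and a site that renders in the browser may still fail this check if it does not answer that request with JSON.

```python
import json
import urllib.request


def fetch_listing_body(url: str, timeout: float = 10.0) -> str:
    """POST an empty body to the index URL and return the raw response text.

    Raises urllib.error.URLError (or its subclass HTTPError) when the
    request fails or the server answers with an error status.
    """
    request = urllib.request.Request(url, data=b"", method="POST")
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.read().decode("utf-8", errors="replace")


def looks_crawlable(body: str) -> bool:
    """Return True when the body parses as a JSON object or array."""
    try:
        parsed = json.loads(body)
    except json.JSONDecodeError:
        return False
    return isinstance(parsed, (dict, list))
```

Usage would be along the lines of `looks_crawlable(fetch_listing_body(url))`, wrapped in a `try`/`except urllib.error.URLError` for links that are down.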

NullBruce · 2020-08-24T12:43:16Z

@atlonxp here's a link that doesn't work. I also tried multiple others that are up.

##################################
# Crawling all downloadable urls #
##################################

https://manga.td-index.workers.dev/0:/
retry #2 https://manga.td-index.workers.dev/0:/
retry #3 https://manga.td-index.workers.dev/0:/
retry #4 https://manga.td-index.workers.dev/0:/
retry #5 https://manga.td-index.workers.dev/0:/

Data is missing! change a plan -
use terminal CURL -
Nah, something went wrong!

JSONDecodeError Traceback (most recent call last)

in crawler_v2(url, downloading_dict, path, level, exclusions, verbose)
55 response = os.popen("curl --globoff {} -d ''".format(url.geturl())).read()
---> 56 response_json = json.loads(response)
57 except Exception as e:

4 frames

JSONDecodeError: Expecting value: line 1 column 1 (char 0)

During handling of the above exception, another exception occurred:

TypeError Traceback (most recent call last)

in crawler_v2(url, downloading_dict, path, level, exclusions, verbose)
57 except Exception as e:
58 print('Nah, something went wrong!')
---> 59 print(e.args())
60 return []
61 except Exception as e:

TypeError: 'tuple' object is not callable

Rudo2204 · 2020-08-24T12:49:40Z

@NullBruce Acrous index is not yet supported. See #7
I tried to poke around a bit but nothing seems to work :(

atlonxp · 2020-08-24T12:58:57Z

@Rudo2204 @NullBruce I need to look into how Acrous works. I think it is just a theme, but there may well be some script for dynamic content generation, which would be what is causing the problem.

atlonxp closed this as completed Aug 23, 2020
