Issue with downloader #888

Closed
fusedragon opened this issue May 20, 2021 · 1 comment

Environment

Hydrus version: 440 executable
Platform: Windows 10

What happens

Not sure if this is the right place to ask, but I'm having an issue with a downloader. I tried editing the Instagram downloader to parse post captions as tags: caption instead of tags: title, but I was unable to download pictures after that. I tried re-importing the downloader and using a clean copy of Hydrus in a different directory, but I keep encountering the same issue.

Error message / Log file / Screenshots

Page Parser instagram user gallery api parser: Content Parser first page next gallery page url: Unable to parse that JSON: Expecting value: line 1 column 1 (char 0). JSON sample:

    <title>

Login • Instagram

</title>
    <meta name="robots" content="noimageindex, noarchive">
    <meta name="apple-mobile-web-app-status-bar-style" content="default">
    <meta name="mobile-web-app-capable" content="yes">
    <meta name="theme-color" content="#ffffff">
    <meta id="viewport" name="viewport" content="width=device-width, initial-scale=1, minimum-scale=1, maximum-scale=1, viewport-fit=cover">
    <link rel="manifest" href="/data/manifest.json">

    <link rel="preload" href="/static/bundles/metro/ConsumerUICommons.css/9253cd2478eb.css" as="style" type="text/css" crossorigin="anonymous" />
During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "hydrus\client\ClientParsing.py", line 2097, in Parse
    parsed_texts = list( self._formula.Parse( parsing_context, parsing_text ) )
  File "hydrus\client\ClientParsing.py", line 622, in Parse
    raw_texts = self._ParseRawTexts( parsing_context, parsing_text )
  File "hydrus\client\ClientParsing.py", line 731, in _ParseRawTexts
    stream = formula.Parse( parsing_context, parsing_text )
  File "hydrus\client\ClientParsing.py", line 622, in Parse
    raw_texts = self._ParseRawTexts( parsing_context, parsing_text )
  File "hydrus\client\ClientParsing.py", line 1711, in _ParseRawTexts
    raise HydrusExceptions.ParseException( message )
hydrus.core.HydrusExceptions.ParseException: Unable to parse that JSON: Expecting value: line 1 column 1 (char 0). JSON sample:

    <title>

Login • Instagram

</title>
    <meta name="robots" content="noimageindex, noarchive">
    <meta name="apple-mobile-web-app-status-bar-style" content="default">
    <meta name="mobile-web-app-capable" content="yes">
    <meta name="theme-color" content="#ffffff">
    <meta id="viewport" name="viewport" content="width=device-width, initial-scale=1, minimum-scale=1, maximum-scale=1, viewport-fit=cover">
    <link rel="manifest" href="/data/manifest.json">

    <link rel="preload" href="/static/bundles/metro/ConsumerUICommons.css/9253cd2478eb.css" as="style" type="text/css" crossorigin="anonymous" />

fusedragon added the bug label May 20, 2021

hydrusnetwork (Owner) commented Oct 26, 2021

EDIT: Damn, I just realised this is super old! Sorry, I am catching up with my bug reports. I regret the delay.

Thank you for this report. From that error, it looks like the parser was expecting some JSON (i.e. there was a JSON formula), but it received some HTML instead. I am not familiar with the Instagram parser and cannot talk cleverly about it, but if you were altering things, is there any chance that adding a Content Parser somehow reset its formula to JSON? Or could this HTML be a login page or similar, where the site is redirecting to a normal web page instead of an API endpoint?
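
For anyone reading along, the same error text can be reproduced outside Hydrus with only the Python standard library. This is a minimal sketch and not Hydrus code: `try_parse_json` is a hypothetical helper, and `html_body` is a made-up stand-in for whatever page Instagram actually returned.

```python
import json

# Hypothetical stand-in for the login page body; not the real response.
html_body = "<html><head><title>Login • Instagram</title></head></html>"

def try_parse_json(text):
    """Return (parsed, None) on success or (None, error message) on failure."""
    try:
        return json.loads(text), None
    except json.JSONDecodeError as e:
        return None, str(e)

parsed, error = try_parse_json(html_body)
print(error)  # Expecting value: line 1 column 1 (char 0)
```

The decoder gives up at the very first character because `<` cannot start a JSON value, which is why the message points at line 1, column 1.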

I think it is probably worth editing that parser again and doing some tests with the URL here to try to figure out whether it should be pulling HTML or what. help->debug->report modes->network report mode can help here, if you want to be certain about which URLs the downloader is actually pulling.

In any case, I will improve the error text here so that it tries to notice when it actually got HTML and says that instead.
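
A rough sketch of what such a check might look like, under stated assumptions: the `ParseError` class, the function names, and the list of HTML markers are all hypothetical and are not taken from ClientParsing.py. The idea is only to test the start of the text for common HTML tags before falling back to the generic JSON message.

```python
import json

class ParseError(Exception):
    """Hypothetical stand-in for a parser exception type."""

# Assumed markers: lowercase prefixes that suggest an HTML document.
HTML_MARKERS = ("<!doctype html", "<html", "<head", "<title", "<meta")

def looks_like_html(text):
    """Heuristic: does the text, ignoring leading whitespace, start with an HTML tag?"""
    start = text.lstrip()[:256].lower()
    return start.startswith(HTML_MARKERS)

def parse_json_or_explain(text):
    """Parse JSON, raising a more specific error when the input looks like HTML."""
    try:
        return json.loads(text)
    except json.JSONDecodeError as e:
        if looks_like_html(text):
            raise ParseError(
                "Expected JSON but received what looks like HTML "
                "(possibly a login or redirect page)."
            ) from e
        raise ParseError(f"Unable to parse that JSON: {e}") from e
```

A prefix check like this is deliberately cheap and will miss unusual pages, but it would catch a response like the one in this report, which begins with a `<title>` tag.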
