Handle parsing of the new Google UI #1609

Open
unixfox opened this issue May 30, 2019 · 1 comment

@unixfox
Contributor

commented May 30, 2019

In a previous issue, #1596, we discussed Google forcing its new UI on every Searx instance, which breaks the current parser so that searches return no results. This was temporarily fixed in PR #1597 by changing the user agent to Internet Explorer.

But this workaround might stop working in the future, for example if Google drops support for Internet Explorer or finds a way to make the new UI compatible with it.

So we need to rework google.py by introducing a new parser that can handle the new Google UI.
PR #1603 has been opened for this, but it is still a work in progress (see this review).

@rachmadaniHaryono

Contributor

commented May 30, 2019

Hi, I am the one who opened PR #1603. This case does not happen to me often, so I have difficulty testing it.

If anyone wants to help me, you can add this code to that PR:

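def response(resp):
    ...
    from datetime import datetime
    # one timestamp for both dump files, so the HTML and YAML can be paired later
    dts = datetime.now().strftime('%Y%m%d_%H%M%S')
    is_new_parser = False
    if not results:
        # the first parser found nothing: save the raw HTML, then try the new parser
        is_new_parser = True
        with open('google_{}.html'.format(dts), 'w') as f:
            f.write(resp.text)
        ...
    ...
    # return results
    if is_new_parser:
        # optional: dump whatever the new parser produced (needs PyYAML)
        import yaml
        with open('google_{}.yaml'.format(dts), 'w') as f:
            yaml.dump(results, f)
    return results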

This creates an HTML file and a YAML file from the engine's response. They can be used on #1606 to compare against a different result set.
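
To make the comparison step a bit more concrete, here is a rough sketch of how two result sets could be compared once they are loaded back from the dumped files. The helper name `diff_results` and the choice of comparing by the `url` key are my own assumptions for illustration; this is not code from #1603 or #1606.

```python
def diff_results(old_results, new_results, key='url'):
    """Compare two lists of result dicts by one key (hypothetical helper).

    Assumes each result is a dict and that `key` (here 'url') identifies it.
    Results that lack the key are ignored.
    """
    old_keys = {r[key] for r in old_results if key in r}
    new_keys = {r[key] for r in new_results if key in r}
    return {
        'only_old': sorted(old_keys - new_keys),   # found by the old parser only
        'only_new': sorted(new_keys - old_keys),   # found by the new parser only
        'common': sorted(old_keys & new_keys),     # found by both
    }


if __name__ == '__main__':
    old = [{'url': 'https://a.example', 'title': 'A'}]
    new = [{'url': 'https://a.example', 'title': 'A'},
           {'url': 'https://b.example', 'title': 'B'}]
    print(diff_results(old, new))
```

An empty `only_old` list would suggest the new parser is not losing anything the old one found, at least for that saved page.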

Edit: the YAML part is optional. You can skip it and generate the YAML later, or you can change the `response` snippet so that it writes YAML only when the first parser returns no results.
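
One possible way to keep the YAML part optional is to move the dumping into a small standalone helper. This is only a sketch under a few assumptions: Python 3, a hypothetical helper name `dump_debug_files`, and a JSON fallback (my addition, not part of the PR) for when PyYAML is not installed.

```python
import json
import os
from datetime import datetime

try:
    import yaml  # PyYAML; optional
except ImportError:
    yaml = None


def dump_debug_files(html_text, results, out_dir='.'):
    """Write the raw HTML and the parsed results; return both file paths.

    Hypothetical helper: uses the same google_<timestamp> naming as the
    snippet in this comment, but falls back to JSON without PyYAML.
    """
    dts = datetime.now().strftime('%Y%m%d_%H%M%S')
    html_path = os.path.join(out_dir, 'google_{}.html'.format(dts))
    with open(html_path, 'w', encoding='utf-8') as f:
        f.write(html_text)

    if yaml is not None:
        results_path = os.path.join(out_dir, 'google_{}.yaml'.format(dts))
        with open(results_path, 'w', encoding='utf-8') as f:
            yaml.safe_dump(results, f)
    else:
        results_path = os.path.join(out_dir, 'google_{}.json'.format(dts))
        with open(results_path, 'w', encoding='utf-8') as f:
            # default=str keeps non-JSON values (e.g. datetimes) from raising
            json.dump(results, f, indent=2, default=str)
    return html_path, results_path
```

Inside `response` this would be called as `dump_debug_files(resp.text, results)` in the branch where the first parser returned nothing.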

Edit 3: removed the original code and fixed the second snippet.
