[pull] main from benbusby:main (#8)
* Render error message w/o `safe` filter

The error message shown in the error template does not need to be
rendered using the `safe` filter; doing so opens up an XSS
vulnerability.

* Only create ip card if main result div is found

The IP address card created for searches like "my ip" only needs to be
created and inserted if the main result div (`#main`) is found.

Fixes benbusby#735

* Remove unused `/url` endpoint

The `/url` endpoint was previously used as a way of mirroring the
`/url?q=<result domain>` formatting of locations in search results from
Google. Rather than have this unnecessary intermediary step, the result
path was extracted and used as the immediate path for each result item
instead.

This endpoint hasn't been in use for many versions and has been in need
of removal for quite some time.

* Bump version to 0.7.2

* Fix pipx dependencies (benbusby#738)

Missing cssutils

* Remove "/" before endpoints & tags (benbusby#734)

Removes the leading slash before imgres and other endpoints

Fix benbusby#733

* Add `WHOOGLE_URL_PREFIX` to app.json (benbusby#737)

* Update zh-tw translation (benbusby#736)

* Fix German translation error (benbusby#742)

"Nachrichten" is the correct translation of "News"

* Replace public instance url

s.alefvanoon.xyz -> s.tokhmi.xyz

Fixes benbusby#743

* Use `window` from Endpoint enum for anon view (benbusby#748)

Removes previously hardcoded "/window" from anon view links

* Update and add instances [skip ci] (benbusby#750)

Updates Garudalinux instance
Adds dr460nf1r3.org instance

* Use `lax` for session `SameSite` value (not `strict`)

SESSION_COOKIE_SAMESITE must be set to 'lax' to allow the user's
previous session to persist when accessing the instance from an external
link. Setting this value to 'strict' causes Whoogle to revalidate a new
session, and fail, resulting in cookies being disabled.

This could be re-evaluated if Whoogle ever switches to client side
configuration instead.

Fixes benbusby#749

* Improve G page distinction between footer and results

Links in the Whoogle footer that by default route to Google pages were
previously being removed, but this caused results that also routed to
similar pages to no longer be accessible. This was due to the removal of
the '/url' endpoint that Google uses for each result.

To fix this, the result link is now parsed so that the domain of the
result can be checked against the disallowed G page list. Since results
are delivered in a "/url?q=<domain>" format -- even for pages to
Google's own products -- and the footer links are formatted as
"<product>.google.com", footer links are removed and result links are
parsed correctly.

Fixes benbusby#747

* Replace leading slash for image links (benbusby#762)

The leading slash was previously removed without noticing it was part of a
string replacement in benbusby#734. This caused the href of "View Image" to
contain a leading "/", which is incorrect.

* Remove duplicated handling of /url result links (benbusby#769)

It appears that the handling of result links beginning with '/url' was
mistakenly committed with an inefficient filtering process in its place.
With the way the code is structured, this less effective '/url' link
filter took precedence over the previous link filter, and also caused
users with the "open link in new tab" config enabled to no longer have
access to that feature.

Fixes benbusby#769

Co-authored-by: Ben Busby <contact@benbusby.com>
Co-authored-by: Sandro <sandro.jaeckel@gmail.com>
Co-authored-by: invis-z <22781620+invis-z@users.noreply.github.com>
Co-authored-by: xatier <xatierlike@gmail.com>
Co-authored-by: hoschi1337 <58056262+hoschi1337@users.noreply.github.com>
Co-authored-by: Nico <njcrypted@protonmail.com>
Co-authored-by: Joao A. Candido Ramos <joao.candido@etu.unige.ch>
8 people committed May 26, 2022
1 parent 234fdc0 commit 21a6913
Showing 12 changed files with 71 additions and 66 deletions.
5 changes: 3 additions & 2 deletions README.md
@@ -514,9 +514,10 @@ A lot of the app currently piggybacks on Google's existing support for fetching
| Website | Country | Language | Cloudflare |
|-|-|-|-|
| [https://search.albony.xyz](https://search.albony.xyz/) | 🇮🇳 IN | Multi-choice | |
| [https://search.garudalinux.org](https://search.garudalinux.org) | 🇩🇪 DE | Multi-choice | |
| [https://search.garudalinux.org](https://search.garudalinux.org) | 🇫🇮 FI | Multi-choice ||
| [https://search.dr460nf1r3.org](https://search.dr460nf1r3.org) | 🇩🇪 DE | Multi-choice ||
| [https://whooglesearch.net](https://whooglesearch.net) | 🇩🇪 DE | Spanish | |
| [https://s.alefvanoon.xyz](https://s.alefvanoon.xyz) | 🇺🇸 US | Multi-choice ||
| [https://s.tokhmi.xyz](https://s.tokhmi.xyz) | 🇺🇸 US | Multi-choice ||
| [https://www.whooglesearch.ml](https://www.whooglesearch.ml) | 🇺🇸 US | English | |
| [https://search.sethforprivacy.com](https://search.sethforprivacy.com) | 🇩🇪 DE | English | |
| [https://whoogle.dcs0.hu](https://whoogle.dcs0.hu) | 🇭🇺 HU | Multi-choice | |
5 changes: 5 additions & 0 deletions app.json
@@ -15,6 +15,11 @@
],
"stack": "container",
"env": {
"WHOOGLE_URL_PREFIX": {
"description": "The URL prefix to use for the whoogle instance (i.e. \"/whoogle\")",
"value": "",
"required": false
},
"WHOOGLE_USER": {
"description": "The username for basic auth. WHOOGLE_PASS must also be set if used. Leave empty to disable.",
"value": "",
12 changes: 10 additions & 2 deletions app/__init__.py
@@ -26,16 +26,24 @@
load_dotenv(os.path.join(os.path.dirname(os.path.abspath(__file__)),
dotenv_path))

# Session values
# NOTE: SESSION_COOKIE_SAMESITE must be set to 'lax' to allow the user's
# previous session to persist when accessing the instance from an external
# link. Setting this value to 'strict' causes Whoogle to revalidate a new
# session, and fail, resulting in cookies being disabled.
#
# This could be re-evaluated if Whoogle ever switches to client side
# configuration instead.
app.default_key = generate_user_key()
app.config['SECRET_KEY'] = os.urandom(32)
app.config['SESSION_TYPE'] = 'filesystem'
app.config['SESSION_COOKIE_SAMESITE'] = 'strict'
app.config['SESSION_COOKIE_SAMESITE'] = 'Lax'

if os.getenv('HTTPS_ONLY'):
app.config['SESSION_COOKIE_NAME'] = '__Secure-session'
app.config['SESSION_COOKIE_SECURE'] = True

app.config['VERSION_NUMBER'] = '0.7.1'
app.config['VERSION_NUMBER'] = '0.7.2'
app.config['APP_ROOT'] = os.getenv(
'APP_ROOT',
os.path.dirname(os.path.abspath(__file__)))
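
The `SameSite` note in this hunk can be modelled in a few lines. This is a simplified sketch of browser behaviour, not Whoogle or Flask code; the function name `cookie_is_sent` and its parameters are hypothetical. It assumes the standard semantics: `Strict` withholds the cookie on every cross-site request, `Lax` still sends it on top-level GET navigations (such as following an external link), and `None` always sends it.

```python
def cookie_is_sent(samesite: str, cross_site: bool, top_level_get: bool) -> bool:
    """Simplified model of whether a browser attaches a cookie to a request."""
    policy = samesite.lower()
    if not cross_site:
        # Same-site requests always carry the cookie.
        return True
    if policy == 'none':
        return True
    if policy == 'lax':
        # Only top-level navigations using a safe method (e.g. GET).
        return top_level_get
    # 'strict': never sent on cross-site requests.
    return False


# Following an external link to the instance is a cross-site, top-level GET.
print(cookie_is_sent('Lax', cross_site=True, top_level_get=True))     # True
print(cookie_is_sent('Strict', cross_site=True, top_level_get=True))  # False
```

Under this model, the `strict` value drops the existing session cookie exactly in the external-link case the comment describes, which is consistent with the reported failure.
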
9 changes: 6 additions & 3 deletions app/filter.py
@@ -72,7 +72,7 @@ def clean_css(css: str, page_url: str) -> str:
continue
css = css.replace(
url,
f'/element?type=image/png&url={abs_url}'
f'{Endpoint.element}?type=image/png&url={abs_url}'
)

return css
@@ -410,8 +410,10 @@ def update_link(self, link: Tag) -> None:
None (the tag is updated directly)
"""
link_netloc = urlparse.urlparse(link['href']).netloc

# Remove any elements that direct to unsupported Google pages
if any(url in link['href'] for url in unsupported_g_pages):
if any(url in link_netloc for url in unsupported_g_pages):
# FIXME: The "Shopping" tab requires further filtering (see #136)
# Temporarily removing all links to that tab for now.
parent = link.parent
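
To illustrate why checking the parsed domain rather than the whole `href` separates footer links from result links, here is a minimal sketch. The list contents and the helper name `links_to_unsupported_page` are hypothetical, and it assumes footer links are absolute URLs with a scheme while result links arrive in the relative `/url?q=<domain>` form described in the commit message.

```python
from urllib.parse import urlparse

# Hypothetical subset of the disallowed list; the real entries may differ.
UNSUPPORTED_G_PAGES = ['support.google.com', 'accounts.google.com']


def links_to_unsupported_page(href: str) -> bool:
    """Check only the domain (netloc) of the link, not the whole href."""
    netloc = urlparse(href).netloc
    return any(page in netloc for page in UNSUPPORTED_G_PAGES)


footer_link = 'https://support.google.com/websearch'
result_link = '/url?q=https://support.google.com/websearch/answer/1'

print(links_to_unsupported_page(footer_link))  # True: netloc is support.google.com
print(links_to_unsupported_page(result_link))  # False: relative link, empty netloc
```

A plain substring test on the full `href` would match both links, since the result link carries the same domain inside its query string.
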
@@ -466,7 +468,8 @@ def update_link(self, link: Tag) -> None:
if href.startswith(MAPS_URL):
# Maps links don't work if a site filter is applied
link['href'] = MAPS_URL + "?q=" + clean_query(q)
elif href.startswith('/?') or href.startswith('/search?'):
elif (href.startswith('/?') or href.startswith('/search?') or
href.startswith('/imgres?')):
# make sure that tags can be clicked as relative URLs
link['href'] = href[1:]
elif href.startswith('/intl/'):
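
The effect of stripping the leading slash can be sketched with `urljoin`. The prefixes come from the hunk above; the helper name `make_relative` and the base URL are hypothetical, and the link to `WHOOGLE_URL_PREFIX` is an assumption about why relative links are wanted here.

```python
from urllib.parse import urljoin

# Prefixes taken from the hunk above.
RELATIVE_PREFIXES = ('/?', '/search?', '/imgres?')


def make_relative(href: str) -> str:
    """Drop the leading slash so the link resolves against the current base."""
    if href.startswith(RELATIVE_PREFIXES):
        return href[1:]
    return href


base = 'https://example.com/whoogle/'  # hypothetical instance behind a prefix

# A root-relative link escapes the prefix; a relative one stays under it.
print(urljoin(base, '/imgres?imgurl=a.png'))
print(urljoin(base, make_relative('/imgres?imgurl=a.png')))
```

Passing a tuple to `str.startswith` is equivalent to the chained `or` conditions in the diff.
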
18 changes: 1 addition & 17 deletions app/routes.py
@@ -432,22 +432,6 @@ def config():
return redirect(url_for('.index'), code=403)


@app.route(f'/{Endpoint.url}', methods=['GET'])
@session_required
@auth_required
def url():
if 'url' in request.args:
return redirect(request.args.get('url'))

q = request.args.get('q')
if len(q) > 0 and 'http' in q:
return redirect(q)
else:
return render_template(
'error.html',
error_message='Unable to resolve query: ' + q)


@app.route(f'/{Endpoint.imgres}')
@session_required
@auth_required
@@ -536,7 +520,7 @@ def window():

# Use anonymous view for all links on page
for a in results.find_all('a', {'href': True}):
a['href'] = '/window?location=' + a['href'] + (
a['href'] = f'{Endpoint.window}?location=' + a['href'] + (
'&nojs=1' if 'nojs' in request.args else '')

# Remove all iframes -- these are commonly used inside of <noscript> tags
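
For an enum member to interpolate as a bare path segment in an f-string, as `f'{Endpoint.window}?location='` does above, the enum needs a `__str__` that returns its value. The sketch below uses a hypothetical stand-in for the app's `Endpoint` enum, which is not shown in this diff, along with a hypothetical helper `anon_view_href`.

```python
from enum import Enum


class Endpoint(Enum):
    """Hypothetical stand-in for the app's Endpoint enum."""
    window = 'window'
    element = 'element'

    def __str__(self) -> str:
        # Without this override, str() would give 'Endpoint.window'.
        return self.value


def anon_view_href(location: str, nojs: bool = False) -> str:
    """Build a relative anonymous-view link, as in the hunk above."""
    return f'{Endpoint.window}?location=' + location + ('&nojs=1' if nojs else '')


print(anon_view_href('https://example.com', nojs=True))
# window?location=https://example.com&nojs=1
```
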
32 changes: 16 additions & 16 deletions app/static/settings/translations.json
@@ -133,7 +133,7 @@
"images": "Bilder",
"maps": "Maps",
"videos": "Videos",
"news": "Nieuws",
"news": "Nachrichten",
"books": "Bücher",
"anon-view": "Anonyme Ansicht"
},
@@ -554,27 +554,27 @@
"lang_zh-TW": {
"search": "搜尋",
"config": "設定",
"config-country": "設置國家",
"config-lang": "界面語言",
"config-country": "設定國家",
"config-lang": "介面語言",
"config-lang-search": "搜尋語言",
"config-near": "接近",
"config-near-help": "城市名",
"config-block": "排除",
"config-block-help": "網址列表,以逗號分隔",
"config-block-title": "按標題屏蔽",
"config-block-title-help": "使用正則表達式",
"config-block-url": "按網址屏蔽",
"config-block-url-help": "使用正則表達式",
"config-near-help": "城市名稱",
"config-block": "封鎖",
"config-block-help": "以逗號分隔之網址列表",
"config-block-title": "按標題封鎖",
"config-block-title-help": "使用正規表達式",
"config-block-url": "按網址封鎖",
"config-block-url-help": "使用正規表達式",
"config-theme": "主題",
"config-nojs": "在匿名視圖中刪除 Javascript",
"config-anon-view": "顯示匿名查看鏈接",
"config-nojs": "於匿名檢視中刪除 JavaScript",
"config-anon-view": "顯示匿名檢視鏈接",
"config-dark": "深色模式",
"config-safe": "安全搜尋",
"config-alts": "將社群網站連結換掉",
"config-alts-help": "將 Twitter/YouTube/Instagram 等網站的連結替換為尊重隱私的第三方網站",
"config-alts": "將社群網站連結替換",
"config-alts-help": "將 Twitter/YouTube/Instagram 等網站之連結替換為尊重隱私的第三方網站",
"config-new-tab": "以新分頁開啟連結",
"config-images": "完整尺寸圖片搜尋",
"config-images-help": "(實驗性)在桌面版圖片搜尋中增加「查看圖片」選項。這會使搜尋結果圖片解析度降低",
"config-images-help": "(實驗性)在桌面版圖片搜尋中增加「檢視圖片」選項。這會使搜尋結果圖片解析度降低",
"config-tor": "使用 Tor",
"config-get-only": "僅限於 GET 要求",
"config-url": "首頁網址",
@@ -595,7 +595,7 @@
"videos": "影片",
"news": "新聞",
"books": "書籍",
"anon-view": "匿名視圖"
"anon-view": "匿名檢視"
},
"lang_bg": {
"search": "Търсене",
2 changes: 1 addition & 1 deletion app/templates/error.html
@@ -16,7 +16,7 @@
<div>
<h1>Error</h1>
<p>
{{ error_message|safe }}
{{ error_message }}
</p>
<hr>
<p>
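
The XSS risk removed by this change can be sketched with the standard library. Here `html.escape` is a stand-in for Jinja's autoescaping, which Flask enables for `.html` templates by default; the exact entities Jinja emits may differ slightly. The payload is a hypothetical user-supplied query of the kind the removed `/url` route concatenated into `error_message`.

```python
from html import escape

# User-controlled query, as in the removed `/url` route's error message.
q = '<script>alert(1)</script>'
error_message = 'Unable to resolve query: ' + q

unsafe_html = f'<p>{error_message}</p>'        # roughly what `|safe` emits
safe_html = f'<p>{escape(error_message)}</p>'  # roughly what autoescaping emits

print(unsafe_html)
print(safe_html)
```

With the `safe` filter the script tag reaches the browser intact; without it the markup is rendered as inert text.
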
46 changes: 24 additions & 22 deletions app/utils/results.py
@@ -186,7 +186,7 @@ def append_nojs(result: BeautifulSoup) -> None:
"""
nojs_link = BeautifulSoup(features='html.parser').new_tag('a')
nojs_link['href'] = f'/{Endpoint.window}?nojs=1&location=' + result['href']
nojs_link['href'] = f'{Endpoint.window}?nojs=1&location=' + result['href']
nojs_link.string = ' NoJS Link'
result.append(nojs_link)

@@ -206,7 +206,7 @@ def append_anon_view(result: BeautifulSoup, config: Config) -> None:
av_link = BeautifulSoup(features='html.parser').new_tag('a')
nojs = 'nojs=1' if config.nojs else 'nojs=0'
location = f'location={result["href"]}'
av_link['href'] = f'/{Endpoint.window}?{nojs}&{location}'
av_link['href'] = f'{Endpoint.window}?{nojs}&{location}'
translation = current_app.config['TRANSLATIONS'][
config.get_localization_lang()
]
@@ -227,26 +227,28 @@ def add_ip_card(html_soup: BeautifulSoup, ip: str) -> BeautifulSoup:
BeautifulSoup
"""
# HTML IP card tag
ip_tag = html_soup.new_tag('div')
ip_tag['class'] = 'ZINbbc xpd O9g5cc uUPGi'

# For IP Address html tag
ip_address = html_soup.new_tag('div')
ip_address['class'] = 'kCrYT ip-address-div'
ip_address.string = ip

# Text below the IP address
ip_text = html_soup.new_tag('div')
ip_text.string = 'Your public IP address'
ip_text['class'] = 'kCrYT ip-text-div'

# Adding all the above html tags to the IP card
ip_tag.append(ip_address)
ip_tag.append(ip_text)

# Insert the element at the top of the result list
html_soup.select_one('#main').insert_before(ip_tag)
main_div = html_soup.select_one('#main')
if main_div:
# HTML IP card tag
ip_tag = html_soup.new_tag('div')
ip_tag['class'] = 'ZINbbc xpd O9g5cc uUPGi'

# For IP Address html tag
ip_address = html_soup.new_tag('div')
ip_address['class'] = 'kCrYT ip-address-div'
ip_address.string = ip

# Text below the IP address
ip_text = html_soup.new_tag('div')
ip_text.string = 'Your public IP address'
ip_text['class'] = 'kCrYT ip-text-div'

# Adding all the above html tags to the IP card
ip_tag.append(ip_address)
ip_tag.append(ip_text)

# Insert the element at the top of the result list
main_div.insert_before(ip_tag)
return html_soup


2 changes: 1 addition & 1 deletion charts/whoogle/Chart.yaml
@@ -3,7 +3,7 @@ name: whoogle
description: A self hosted search engine on Kubernetes
type: application
version: 0.1.0
appVersion: 0.7.1
appVersion: 0.7.2

icon: https://github.com/benbusby/whoogle-search/raw/main/app/static/img/favicon/favicon-96x96.png

3 changes: 2 additions & 1 deletion misc/instances.txt
@@ -1,7 +1,8 @@
https://gowogle.voring.me
https://s.alefvanoon.xyz
https://s.tokhmi.xyz
https://search.albony.xyz
https://search.garudalinux.org
https://search.dr460nf1r3.org
https://search.sethforprivacy.com
https://whoogle.fossho.st
https://whooglesearch.net
1 change: 1 addition & 0 deletions setup.cfg
@@ -18,6 +18,7 @@ packages = find:
include_package_data = True
install_requires=
beautifulsoup4
cssutils
cryptography
defusedxml
Flask
2 changes: 1 addition & 1 deletion setup.py
@@ -5,4 +5,4 @@
if os.getenv('DEV_BUILD'):
optional_dev_tag = '.dev' + os.getenv('DEV_BUILD')

setuptools.setup(version='0.7.1' + optional_dev_tag)
setuptools.setup(version='0.7.2' + optional_dev_tag)
