Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[BUG] Save config http only #17

Closed
gripped opened this issue May 10, 2020 · 12 comments
Closed

[BUG] Save config http only #17

gripped opened this issue May 10, 2020 · 12 comments
Assignees
Labels
bug Something isn't working Fixed (Pending PR Merge) A pull request is open for this issue

Comments

@gripped
Copy link
Contributor

gripped commented May 10, 2020

Describe the bug
I've installed this both in an AWS free instance manually and also using your heroku quick deploy
At first, in the case of AWS, saving the config settings worked because I'd opened port 80 to get ssl certs with certbot. Once I closed port 80 again the config changes won't save . No biggie I can open port 80 ! My nginx settings redirect back to https anyway.
However with heroku I start with a https page eg https://mywhoogle.herokuapp.com. Saving the config settings take me to the http page http://mywhoogle.herokuapp.com where it will stay for any subsequent searches

To Reproduce
Steps to reproduce the behavior: AWS

  1. Only allow traffic on port 443 https
  2. Go to https://mywhoogle.domain.com
  3. Edit settings and click save
  4. Page hangs

Steps to reproduce the behavior: Heroku

  1. https://mywhoogle.herokuapp.com
  2. Edit settings and click save
  3. Page is now http://mywhoogle.herokuapp.com

Expected behavior
AWS: allow save via https
heroku: remain on https after save if that's what we we on before saving.

Nice work :)

@gripped gripped added the bug Something isn't working label May 10, 2020
benbusby added a commit that referenced this issue May 10, 2020
Config options now allow setting a "root url", which defaults to the
request url root. Saving a new url in this field will allow for proper
redirects and usage of the opensearch element.

Also provides a possible solution for #17, where the default flask redirect method redirects to
http instead of https.
@benbusby
Copy link
Owner

I pushed a new update that should address this issue. The config section on the main page now has an option to set the instance's root url. When you get a chance, try pulling in the new changes and make sure the app url is using https before saving your config.

@gripped
Copy link
Contributor Author

gripped commented May 11, 2020

This does work to allow saving via https. If the Root URL: in the settings is changed to https://
However in the first instance, when first visiting via https:// the Root URL: is pre-populated with the http:// form of the url. I tested several times by restarting the heroku dyno.

Therefore someone not knowing they had to edit this to https:// does still get the hanging page on the AWS install (with port 80 blocked), when they click save. But on the heroku install the save will complete, and the session will be http until manually reverting to https.

So fixed for me, able to save via https, but not closing as I don't think it's fixed entirely ? A user shouldn't be getting dropped down to http because they didn't know to add an 's' to a url before saving. (not that I'll have any users, private search)

Still great work :)

@gripped
Copy link
Contributor Author

gripped commented May 11, 2020

Sorry more http woes. Adding whoogle as a search engine in firefox and either using it as the default engine, or from the search choices displayed in the address bar dropdown results in an http search.

In ~/.mozilla/firefox/xxxxxxxx.default/search.json.mozlz4

{
	"_name": "Whoogle",
	"_shortName": "whoogle",
	"_loadPath": "[http]whoogle.informationhouse.co.uk/whoogle.xml",
	"description": "Whoogle: A lightweight, deployable Google search proxy for desktop/mobile that removes Javascript, AMP links, and ads",
	"__searchForm": "http://whoogle.informationhouse.co.uk/search",
	"_iconURL": "",
	"_metaData": {
		"loadPathHash": "LfkXFjdLUC1r4N4dBxzEqrJqSF0ABhb2hvLzIN8gZs0=",
		"order": 13
	},
	"_urls": [
		{
			"params": [
				{
					"name": "q",
					"value": "{searchTerms}"
				}
			],
			"rels": [],
			"resultDomain": "whoogle.informationhouse.co.uk",
			"template": "http://whoogle.informationhouse.co.uk/search",
			"method": "POST"
		},
		{
			"params": [],
			"rels": [],
			"resultDomain": "whoogle.informationhouse.co.uk",
			"template": "http://whoogle.informationhouse.co.uk/search",
			"type": "application/x-suggestions+json"
		}
	],
	"_isBuiltin": false,
	"_orderHint": null,
	"_telemetryId": null,
	"_hasPreferredIcon": false,
	"queryCharset": "UTF-8"
}

Strangely editing the http's to https's in this file not working for me. Still http search from the added engine ?

All my other added search engines seem to default to https.

@benbusby
Copy link
Owner

So this is a bit of a complicated issue. As far as the Flask server is aware, it's only running http (which is why the root url field is prepopulated the way it is -- this is technically how the Flask server is actually being run), even though requests are being proxied through https either through heroku, aws, etc. Another weird thing is that the past 4 out of 5 heroku instances I've spun up have handled https redirects properly, but one did not. So with the 4 that did work, POST requests to /config routes over http were properly forwarded to https and I never encountered any problems.

Beyond what I already pushed for a partial fix, I suppose I could add a section to the README with a note about how https traffic is occasionally not properly rerouted, so ensure that the root url under config matches the protocol you'd like to use. I can't do a blanket fix for rerouting all http to https, since this would break instances that aren't being proxied. Just adding a README fix also doesn't quite feel like a good solution, so I'm still trying to think of a better alternative. One possible alternative is just using javascript to send the POST request instead of relying on just the HTML form, where it'd be easier to specify the full url (with https protocol) as the proper endpoint.

Regarding your issues with setting the default search engine, I've had issues in the past where Firefox caches the opensearch template even when I change some small aspect about it. In the past I've just removed the Whoogle search template from the list in search preferences, and sometimes it's required clearing data in Firefox before it'll accept the new template. This may or may not be the issue you're seeing, just wanted to offer what I've experienced in the past.

@gripped
Copy link
Contributor Author

gripped commented May 12, 2020

Thanks for your explanation.
I think you probably should make this limitation clear in the readme. I expect most people making the effort to use a privacy enhanced google search will want to be using https all of the time.
With that in mind my own opinion is that long term you should make the app https only.
With the case of heroru that should be seamless as a certificate is in place automatically. But a bit harder to automatically setup with the other deployment methods. I get that you want this to be easy to install. Thinking outside the box maybe caddy instead of flask ?
Anyhow I'm glad I came across this project on reddit, as it's inspired me to finally also get my own private searx instance going. But I intend to use both going forward . I like the way whoogle gives me my own frontend to results very similar to what I'd get going straight through google.com

@benbusby
Copy link
Owner

With that in mind my own opinion is that long term you should make the app https only.

Agreed. I have a couple ideas for configuring https enforcement at runtime, but need to try them out and see how well they integrate with the rest of the codebase. Will let you know once I reach a proper solution.

@benbusby
Copy link
Owner

Fixed in #48 (pending merge). HTTPS is now enforced in all Docker instances unless otherwise specified, and there's an easy --https-only flag for instances running through pip/pipx. Let me know if you want to take a look or not, otherwise I'll probably merge this in the next few hours. Thanks again for the discussion, it's helpful to get feedback on stuff like this.

@gripped
Copy link
Contributor Author

gripped commented May 15, 2020

Testing it now. Back to you soon

@gripped
Copy link
Contributor Author

gripped commented May 15, 2020

Sorry multiple issues.
In the PR the readme says add --https-only to the command line

cd /home/whoogle/whoogle-search
source venv/bin/activate
./whoogle-search --https-only

This has the effect of causing config.json to get stored in whoogle-search/--https-only/static/ instead of whoogle-search/app/static/ (more on this in a bit)

Tried instead

HTTPS_ONLY=1 ./whoogle-search

But still wasn't working
And maybe I've lost the plot but isn't the logic wrong here
route.py line 24

 if os.getenv('HTTPS_ONLY',False) and request.url.startswith('http://'):

Shouldn't that be True ? I've tried both though and no effect with either

( --debug set by default in whoogle-search )

back to config.json
It's probable I'm missing something but where multiple users are using the same instance aren't they overwriting each others settings ?

Maybe I've done something wrong ?

git clone --single-branch --branch feature/https-only https://github.com/benbusby/whoogle-search.git

is the code I was testing.

@benbusby
Copy link
Owner

benbusby commented May 15, 2020

Regarding the logic in routes on line 24, it's falling back to false if that environment variable isn't set (which is what it should default to, only the Dockerfile build arg should ever set that to True).

Also in the readme changes, I should update that to be a bit more clear. The --https-only flag should only be applied directly to the run command for the Flask app, not the executable itself. I will make a note to update the executable name to reduce some confusion -- when it's installed through pip, the command to run the server is the same as the executable, but they function differently. Just unfortunate naming conventions on my part, but easy to fix.

Regarding the use case of multiple users using the same instance, that is an issue, but the (currently) intended use case is only ever supposed to be individually run instances only. With the introduction of basic auth in #51, I could start to visit the idea of separating config directories to be dependent on the active user, but otherwise the project would need a not-insignificant redesign to be used by multiple people with separate config files.

For testing the feature, I've been using the pip installed method of running the app, but the same effect can be accomplished by updating the executable to run with the --https-only flag at the end of the python3 -um app ... command.

@gripped
Copy link
Contributor Author

gripped commented May 15, 2020

I'll take another look tomorrow. I was tired when testing before.
Though I believe I did also try --https-only at the end of python3 -um app ...
The way it's written up in the readme is a definite red herring though.
Tried lots of things and it didn't seem to work.
Deployed on heroku as well.
This in the build log implies the env was setup correctly I think ?

Step 10/14 : ARG use_https=1

 ---> Running in cbbdb4d6226b

Removing intermediate container cbbdb4d6226b

 ---> de508992e5a7

Step 11/14 : ENV HTTPS_ONLY=$use_https

But https was not forced. Same behaviour as before.

@benbusby
Copy link
Owner

Okay so (presumably) final solution, which is all on master now:

  • Updated README with clear instructions on enforcing HTTPS on various deployment options
  • Renamed executable to run to avoid confusion with the pip installed whoogle-search executable
  • Updated heroku deployment button to use a separate branch (named heroku-app) that by default enforces HTTPS, since the deployments there are guaranteed to always support HTTPS.

I'm going to close this issue for now, but we can keep discussing here if you run into any issues with the new changes. I believe this is now a good compromise for giving the option to enforce HTTPS for some deployments, and not completely break other deployments that rely on HTTPS through a reverse proxy setup (where Flask should remain in HTTP anyways).

Just to make sure, I deployed two different Heroku versions of the app and manually verified that all routes are being served over HTTPS only, and that the opensearch template is correctly populated with the HTTPS version of the url.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working Fixed (Pending PR Merge) A pull request is open for this issue
Projects
None yet
Development

No branches or pull requests

2 participants