Skip to content
This repository has been archived by the owner on Sep 7, 2023. It is now read-only.

Add searx/settings.py ? #2252

Open
dalf opened this issue Oct 9, 2020 · 11 comments
Open

Add searx/settings.py ? #2252

dalf opened this issue Oct 9, 2020 · 11 comments
Labels

Comments

@dalf
Copy link
Contributor

dalf commented Oct 9, 2020

Currently searx/__init__.py reads searx/settings.yml into searx.settings.

Then searx.settings is read directly in numerous places (not exhaustive) :

  • most of https://github.com/searx/searx/blob/master/searx/__init__.py
  • default_theme = settings['ui']['default_theme']
  • app.secret_key = settings['server']['secret_key']
  • searx/searx/preferences.py

    Lines 318 to 375 in da8b227

    self.key_value_settings = {
    'categories': MultipleChoiceSetting(
    ['general'], choices=categories + ['none']
    ),
    'language': SearchLanguageSetting(
    settings['search'].get('default_lang', ''),
    choices=list(LANGUAGE_CODES) + ['']
    ),
    'locale': EnumStringSetting(
    settings['ui'].get('default_locale', ''),
    choices=list(settings['locales'].keys()) + ['']
    ),
    'autocomplete': EnumStringSetting(
    settings['search'].get('autocomplete', ''),
    choices=list(autocomplete.backends.keys()) + ['']
    ),
    'image_proxy': MapSetting(
    settings['server'].get('image_proxy', False),
    map={
    '': settings['server'].get('image_proxy', 0),
    '0': False,
    '1': True,
    'True': True,
    'False': False
    }
    ),
    'method': EnumStringSetting(
    settings['server'].get('method', 'POST'),
    choices=('GET', 'POST')
    ),
    'safesearch': MapSetting(
    settings['search'].get('safe_search', 0),
    map={
    '0': 0,
    '1': 1,
    '2': 2
    }
    ),
    'theme': EnumStringSetting(
    settings['ui'].get('default_theme', 'oscar'),
    choices=themes
    ),
    'results_on_new_tab': MapSetting(
    settings['ui'].get('results_on_new_tab', False),
    map={
    '0': False,
    '1': True,
    'False': False,
    'True': True
    }
    ),
    'doi_resolver': MultipleChoiceSetting(
    ['oadoi.org'], choices=DOI_RESOLVERS
    ),
    'oscar-style': EnumStringSetting(
    settings['ui'].get('theme_args', {}).get('oscar_style', 'logicodev'),
    choices=['', 'logicodev', 'logicodev-dark', 'pointhi']),
    }
  • searx/searx/utils.py

    Lines 484 to 496 in da8b227

    def get_engine_from_settings(name):
    """Return engine configuration from settings.yml of a given engine name"""
    if 'engines' not in settings:
    return {}
    for engine in settings['engines']:
    if 'name' not in engine:
    continue
    if name == engine['name']:
    return engine
    return {}
  • connect = settings['outgoing'].get('pool_connections', 100) # Magic number kept from previous code
    maxsize = settings['outgoing'].get('pool_maxsize', requests.adapters.DEFAULT_POOLSIZE) # Picked from constructor
  • searx/searx/search.py

    Lines 38 to 48 in 474d56c

    max_request_timeout = settings.get('outgoing', {}).get('max_request_timeout' or None)
    if max_request_timeout is None:
    logger.info('max_request_timeout={0}'.format(max_request_timeout))
    else:
    if isinstance(max_request_timeout, float):
    logger.info('max_request_timeout={0} second(s)'.format(max_request_timeout))
    else:
    logger.critical('outgoing.max_request_timeout if defined has to be float')
    from sys import exit
    exit(1)

Wouldn't it be better to:

@dalf dalf added the core label Oct 9, 2020
@return42
Copy link
Contributor

return42 commented Oct 9, 2020

@kvch
Copy link
Member

kvch commented Oct 9, 2020

I agree, it would be nice to have some mechanism to check settings.yml. I think it could be something similar to the current preferences.py. We could even reuse the *Setting classes implemented there.

If we add the ability to check the settings.yml, we could also add a subcommand or whatever to let instance admins check the configuration without starting searx.

@asciimoo
Copy link
Member

asciimoo commented Oct 9, 2020

I don't fully agree. It's true, it would make settings handling better, but I think, there are much more important tasks to solve like engine fixes or other open bugs/enhancements. It generated quite a few issues over the years, so this is a low priority task for me.

If we add the ability to check the settings.yml, we could also add a subcommand or whatever to let instance admins check the configuration without starting searx.

Yeah, this could be a handy feature, but nobody asked for it in the past 6-7yrs, so I don't think that it would have so much added value.

nearly all application setup is spread around the project and is done while importing modules

I still don't get why is it a problem and how would it be solved by the proposed functionality. This is a single entry point application where every module initializes itself at import time. What is wrong with that in case of searx and what would be the added value of changing this behavior?

@return42
Copy link
Contributor

return42 commented Oct 9, 2020

and how would it be solved by the proposed functionality.

By ..

add searx/settings.py

and

write a loader

I still don't get why is it a problem

IMO it is conceptional wrong;

a. reading or writing files just because a module was imported at some point is programming by side effects (For me, it is a kind of antipattern)

b. settings.yml are the settings of the (web) application, not the settings of ever single module. Each python module should stand for its own and the application assembles and parametrize them explicit, not by uncalled site effects (while importing).

single entry point application where every module initializes itself at import time.

This is what I mean in my other comment; its all backed together and not really modular ... I can't use a module stand alone, I always have to use the whole application .. and on the opposite .. I can't replace one component (e.g. how to handle the settings) in the application without touching various modules.

This might be OK for small applications, but from my POV searx becomes more and more a real "server".

@asciimoo
Copy link
Member

asciimoo commented Oct 9, 2020

a. reading or writing files just because a module was imported at some point is programming by side effects (For me, it is a kind of antipattern)

Why? Print or socket write is also a side effect but both are very useful. From this point of view any kind of module initialization (like defining functions/classes/variables in a module) is also a side effect, it would be hard to avoid it. Also, I don't understand, that why is it a problem in case of searx that initialization happens in import time? Why would be better to create an init(*args) function in every module and call it manually after importing it? What is the disadvantage of the current behavior?

b. settings.yml are the settings of the (web) application, not the settings of ever single module. Each python module should stand for its own and the application assembles and parametrize them explicit, not by uncalled site effects (while importing).

Settings is an importable module which is used by multiple other modules. How would you handle it? Would you write a config file for each module?

This is what I mean in my other comment; its all backed together and not really modular ... I can't use a module stand alone, I always have to use the whole application

Yes, and it is perfectly fine, searx is an application it isn't a framework or a lib, modules are internal and there is no intent to use them outside of the application.

I can't replace one component (e.g. how to handle the settings) in the application without touching various modules.

That's how dependency work, there are multiple modules depend on the settings module, just like there are multiple modules that depend on built-in modules or other external ones. If you want to replace a module that is used by an other module as a dependency, then you have to modify the importing module too, but this is true for every single dependency of every single module.
This situation wouldn't be different if we had a settings.py.

Probably it is my fault, but I still can't see the added value.

@kvch
Copy link
Member

kvch commented Oct 9, 2020

I don't fully agree. It's true, it would make settings handling better, but I think, there are much more important tasks to solve like engine fixes or other open bugs/enhancements. It generated quite a few issues over the years, so this is a low priority task for me.

It is true, engine fixes should be the most important things on our TODO lists. I mean everyone can write a basic meta search engine from scratch. The real value of searx is the 70-80 engines which has been developed over the years. We should focus on that, because this sets us apart from other search engines.

If we add the ability to check the settings.yml, we could also add a subcommand or whatever to let instance admins check the configuration without starting searx.

Yeah, this could be a handy feature, but nobody asked for it in the past 6-7yrs, so I don't think that it would have so much added value.

It is indeed not an urgent/important feature.

This is what I mean in my other comment; its all backed together and not really modular ... I can't use a module stand alone, I always have to use the whole application

I am not sure why this is a problem. Searx is a standalone web application. Why would someone want to use it as a library or framework or whatever? These use cases are not supported.

@return42
Copy link
Contributor

Also, I don't understand, that why is it a problem in case of searx that initialization happens in import time?

See #2256 (comment)

@return42
Copy link
Contributor

return42 commented Oct 11, 2020

Probably it is my fault, but I still can't see the added value.

Not a fault, we have different views on this subject for practical reasons

Even if we think there is only one single app today, we already have more. Building the docs is another application (better called "use case") and we do not know what our need is tomorrow. The future might introduce complete new use cases, we can't think of today. So it is good to decouple tasks from each other. (importing is one task, initializing is another task).

In the following I will try to explain how I would decouple import from initialize ..

Settings is an importable module which is used by multiple other modules. How would you handle it? Would you write a config file for each module?

No. From my view, settings.yml is the settings file of the web-application. These settings should be loaded when the Flask App is initialized just before the app is started. With the current implementation, the searx.webapp.run function might be good place for such a task:

searx/searx/webapp.py

Lines 1035 to 1043 in e78bfd4

def run():
logger.debug('starting webserver on %s:%s', settings['server']['bind_address'], settings['server']['port'])
app.run(
debug=searx_debug,
use_debugger=searx_debug,
port=settings['server']['port'],
host=settings['server']['bind_address'],
threaded=True
)

Why would be better to create an init(*args) function in every module and call it manually after importing it?

The (new) settings.py should contain a loader function for tasks like this

searx/searx/__init__.py

Lines 50 to 51 in e78bfd4

with open(settings_path, 'r', encoding='utf-8') as settings_yaml:
settings = safe_load(settings_yaml)

And other modules like autocomplete (to take just one simple example) should have a init method. Since there is no need for OOP in autocomplete and to keep things simple, this init is done by a function, which inits a global variable. To illustrate what I mean:

settings=None

def get(*args, **kwargs):
    if 'timeout' not in kwargs:
        kwargs['timeout'] = settings['outgoing']['request_timeout']
    return http_get(*args, **kwargs)

def init(app_settings):
    global settings
    settings = app_settings

@dalf
Copy link
Contributor Author

dalf commented Jan 21, 2021

related to the issue, static_path is defined two different locations:

  • static_path = abspath(join(dirname(__file__), 'static'))

    Here the path is absolute
  • static_path = get_resources_directory(searx_dir, 'static', settings['ui']['static_path'])

    searx/searx/webutils.py

    Lines 50 to 55 in f310305

    def get_resources_directory(searx_directory, subdirectory, resources_directory):
    if not resources_directory:
    resources_directory = os.path.join(searx_directory, subdirectory)
    if not os.path.isdir(resources_directory):
    raise Exception(resources_directory + " is not a directory")
    return resources_directory

    Here the path may be relative

The only usage of searx.static_path :

target_dir = join(static_path, 'plugins/external_plugins', plugin_dir)


This value is never used:

engine_dir = dirname(realpath(__file__))


resp = make_response(redirect(urljoin(settings['server']['base_url'], url_for('index'))))

settings['server']['base_url'] should be get_base_url() ?

@dalf
Copy link
Contributor Author

dalf commented Jan 24, 2021

For reference, see
https://github.com/dalf/searx/tree/add_settings_default
more precisely this commit: dalf@23b6659

@return42
Copy link
Contributor

more precisely this commit: dalf/searx@23b6659

I cloned your add_settings_default branch and made a first review and some small tests .. looks good to me, what I really like is the cleanup of the static_path mess .. do you like to rebase and send PR?

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
Projects
None yet
Development

No branches or pull requests

4 participants