Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Import: option to disable package tree traversal #1182

Closed
Andrei-Pozolotin opened this issue Aug 8, 2019 · 16 comments
Closed

Import: option to disable package tree traversal #1182

Andrei-Pozolotin opened this issue Aug 8, 2019 · 16 comments

Comments

@Andrei-Pozolotin
Copy link

  1. when there is a convention:
    a) all user provided __init__.py are empty
    b) import a.b.c only means GET /a/b/c.py
    c) stdlib is pre-loaded and is not subject to this

  2. then there is no need for package tree traversal and endless 404:

a.js -> 404
a.pyc.js -> 404
a.py -> 404
a.__init__.js -> 404
a.__init__.pyc.js -> 404
a.__init__.py -> 404
a/b.js -> 404
...
...
...

  1. looking at py_import.js, etc, there seems no easy way to have that functionality:
    https://github.com/brython-dev/brython/blob/master/www/src/py_import.js
@PierreQuentel
Copy link
Contributor

With this tree structure :

src
|-- brython.js
tests
|-- test.html
    testa
    |-- __init__.py
    |-- testb
        |-- __init__.py
        |-- testc
            |-- __init__.py
            |-- testc.py

and this in test.html:

<!doctype html>
<html>
<head>
<meta charset="utf-8">
<script type="text/javascript" src="/src/brython.js"></script>
</head>
<body onLoad="brython(1)">
<script type="text/python">
import testa.testb.testc
</script>
</body>
</html>

the only Ajax calls when loading http://localhost:8000/tests/test.html are:

GET http://localhost:8000/tests/testa.py?v=1565431472291
[HTTP/1.0 404 File not found 4ms]

GET http://localhost:8000/tests/testa/__init__.py?v=1565431472324
[HTTP/1.0 200 OK 7ms]

GET http://localhost:8000/tests/testa/testb.py?v=1565431472411
[HTTP/1.0 404 File not found 5ms]

GET http://localhost:8000/tests/testa/testb/__init__.py?v=1565431472476
[HTTP/1.0 200 OK 5ms]

GET http://localhost:8000/tests/testa/testb/testc.py?v=1565431472508
[HTTP/1.0 200 OK 6ms]

I am suprised by so many tries for .js files. What is your tree structure ?

@Andrei-Pozolotin
Copy link
Author

thank you for looking into this.

  1. I have a similar structure in this experiment:
    see gist , page, python, console

  2. bottom line, experiment result:
    single line from healer_test.station.base.arkon_test import *
    results in nine 404, thus producing noisy unusable misleading console logs

  3. regardless: my point is that all 404 are needless,
    given brython user has a little more control over the import process

@Andrei-Pozolotin
Copy link
Author

Andrei-Pozolotin commented Aug 10, 2019

proposed change to brython() parameters: pythonpath_config

  • use_js - try to load *.js resources during import
  • use_pyc - try to load *.pyc resources during import
  • use_init - try to load __init__.py resources during import
  • use_tree - try to traverse package tree during import

so the desired functionality for #1182 :
import a.b.c only means GET /a/b/c.py

would be expressed as per-path config entry :

<body onload="brython({ 
   debug : 1, 
   pythonpath_config : {
      '/import' : { use_js: false, use_pyc: false, use_init: false, use_tree: false }
   }
})">

@Andrei-Pozolotin
Copy link
Author

Andrei-Pozolotin commented Aug 14, 2019

alternatively, new api in python:

brython = globals('__BRYTHON__')
brython.pythonpath_config = {
   '/import' : dict( use_js=False, use_pyc=False, use_init=False, use_tree=False )
}

PierreQuentel pushed a commit that referenced this issue Aug 15, 2019
@PierreQuentel
Copy link
Contributor

Files with extension .js or .pyc.js were a way to reduce import time before the indexedDB cache was introduced, but they should not be used now, and are not documented anyway. The commit referenced above removes search for these extensions, which should solve the issue you report.

@Andrei-Pozolotin
Copy link
Author

Andrei-Pozolotin commented Aug 16, 2019

  1. yes, that commit partially removes needless imports, thank you

  2. still, what do you think about the original idea:

when there is a convention:
a) all user provided __init__.py are empty
b) import a.b.c only means GET /a/b/c.py and nothing else

then there is no need for package tree traversal
meaning that there is no need to try all these permutations:

a.__init__.py
a.py
a.b.__init__.py
a.b.py
a.b.c.__init__.py

@PierreQuentel
Copy link
Contributor

With the commit above, the number of useless Ajax calls is now reduced to the minimum. There still are useless calls, but apart from annoying messages in the browser console, their impact on performance is minimum.

As you certainly know, the standard solution to improve importing is to build brython_modules.js with python -m brython --modules ; this way there is no Ajax call at all.

The use case you describe (only empty __init__.py and import a.b.c only means "get from a/b/c.py") is too specific to deserve an entry in the options of brython(). It could lead to bugs difficult to solve : if for some reason the developer later adds a package in a directory d in a/b, he won't be able to import a.b.d. And even finding a name for this options would be difficult.

@Andrei-Pozolotin
Copy link
Author

thank you for thinking about this, and let me try to address your concerns:

apart from annoying messages in the browser console, their impact on performance is minimum

well, their impact on my performance is maximum - annoying messages are a major work disruption

solution to improve importing is to build brython_modules.js

that would destroy the initial attractiveness of brython, i.e. immediate feedback during development, and would make it just another transpiler

could lead to bugs difficult to solve

well, since this feature requires active opt-in, the responsibility stays with the developer

a package in a directory d in a/b, he won't be able to import a.b.d

sure, but this scenario only means that the convention was broken, again,
import a.b.d means http/get /a/b/d.py and nothing else, hence: no contradiction

finding a name for this options would be difficult

current interface is not much different from the proposed:

current interface:

<body onload="brython({ debug:1, pythonpath:[ '/import' ] })">

proposed interface in js:

<body onload="brython({ debug:1, pythonpath_config: { '/import': {use_tree=false} } })">

proposed interface in python:

brython = globals('__BRYTHON__')
brython.pythonpath_config = {
   '/import' : dict( use_tree=False )
}

@PierreQuentel
Copy link
Contributor

If I understand your message correctly, the useless Ajax calls have no impact on the response time of your application (this is what I mean with "performance"), it's just that you would prefer not to see them in the console. Am I correct ?

@Andrei-Pozolotin
Copy link
Author

yes, correct

@PierreQuentel
Copy link
Contributor

Instead of adding a new option to brython() for this use case, another solution would be to use the standard Python mechanism to customize imports defined in PEP 451, ie define a "meta path finder" that replaces the one used by Brython to find modules and packages in the application directory.

Here is an example of a path finder that does the same as the current default finder, but written in Python (I had to make a few changes in commit 561b452 to make it work):

import sys
from browser import ajax

req = ajax.Ajax()

class Finder:
    """Meta path finder, searches Python modules or packages in the
    application directory."""

    @classmethod
    def find_spec(cls, fullname, path=None):
        url = fullname.replace(".", "/")
        for end in ["", "/__init__"]:
            # blocking Ajax call
            req.open("GET", url + end + ".py", False)
            req.send()
            if req.readyState == 4 and req.status == 200:
                spec = ModuleSpec(fullname, Finder)
                spec.cached = False
                spec.has_location = True
                spec.loader_state = {
                    "req": req,
                    "path": fullname,
                    "is_package": end != ""
                }
                spec.origin = url
                spec.parent = fullname
                spec.submodule_search_locations = None
                return spec

    @classmethod
    def create_module(cls, spec):
        pass

    @classmethod
    def exec_module(cls, module):
        spec = module.__spec__
        if spec.loader_state["is_package"]:
            module.__path__ = [spec.loader_state["path"]]
        sys.modules[spec.name] = module
        ns = {}
        # execute source code of Python module
        exec(spec.loader_state["req"].responseText, ns)
        for key, value in ns.items():
            setattr(module, key, value)


class ModuleSpec:

    def __init__(self, name, loader):
        self.name = name
        self.loader = loader

# replace default path finder
sys.meta_path[-1] = Finder

You can adapt it to avoid searching __init__.py when necessary.

@Andrei-Pozolotin
Copy link
Author

  1. great idea

  2. except import sys needs stdlib and that is too heavy (so far managed to stay away from it)

@PierreQuentel
Copy link
Contributor

You can avoid importing sys and replace

sys.modules[spec.name] = module

by

__BRYTHON__.imported[spec.name] = module

but I'm afraid you will still have an objection ;-)

@Andrei-Pozolotin
Copy link
Author

no, no objection (for now), sorry! :-) looking forward for the release to try this out

@PierreQuentel
Copy link
Contributor

This is included in release 3.7.5. Can we close the issue ?

@Andrei-Pozolotin
Copy link
Author

yes, thank you

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants