Namespaces Support #503

Closed
wants to merge 10 commits into from


JeffersGlass
Member

Addressing #166, this adds support for namespaces for any tag with an eval() or evaluate() method (py-script, py-repl, py-button, etc.). These tags can have a new attribute pys-namespace to specify which namespace they execute in. This replaces PR #407.

A very bare-bones example:

<py-script pys-namespace="my-first-namespace">
print(x := "First Namespace")
</py-script>

<py-script pys-namespace="my-second-namespace">
print(x := "Second Namespace")
</py-script>

<py-script pys-namespace="my-first-namespace">
print(x)  # Should be 'First Namespace'
</py-script>

There is a new namespaces.html that demonstrates the basic namespace functionality.

Under the hood: After pyodide is initialized, the new initNamespaces() initializer makes a copy of pyodide.globals for every distinct value of pys-namespace found in the document. This is so the new namespaces will have access to the standard lib, PyScript, etc.

These copies are stored in a new dictionary in the pyodide.globals space called "pyscript_namespaces". Each element has a namespace attribute that stores the name of its namespace (as a string), which are the keys to the pyscript_namespaces dict.

When a pyscript element is evaluate()'d or eval()'d, that element uses its pys-namespace value as a key into the pyscript_namespaces dict and uses that as its __globals__ namespace. If the element has no pys-namespace value, the usual pyodide.globals dict is used.
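
The mechanism described above can be sketched in plain Python, with exec standing in for Pyodide's evaluation. This is only an illustration of the idea: base_globals and run_block are hypothetical names, a plain dict stands in for pyodide.globals, and none of this is the PR's actual TypeScript code.

```python
# Stand-in for pyodide.globals right after initialization (assumption:
# a plain dict is enough to illustrate the idea).
base_globals = {"__name__": "__main__", "greeting": "hello"}

# One shallow copy of the base globals per distinct namespace name.
pyscript_namespaces = {
    name: dict(base_globals)
    for name in ("my-first-namespace", "my-second-namespace")
}


def run_block(src, namespace=None):
    """Run src in the named namespace, or in the base globals if none is given."""
    target = base_globals if namespace is None else pyscript_namespaces[namespace]
    exec(src, target)
    return target


run_block('x = "First Namespace"', "my-first-namespace")
run_block('x = "Second Namespace"', "my-second-namespace")

# A later block in the first namespace still sees that namespace's x.
first = run_block("y = x + '!'", "my-first-namespace")
print(first["y"])
```

Because each namespace starts as a copy, names that existed at startup (greeting here) are visible everywhere, while later assignments stay inside the namespace that made them.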

One thought for future improvement: right now, all the namespaces are created when pyodide is initialized. It might be nice if there were the option to initialize them on demand if they don't exist yet, so that pages could dynamically add new namespaces to elements. This is probably easiest if (2) is implemented, since the usual pyodide.globals namespace would only ever contain the standard pyodide startup modules, standard lib, and whatever PyScript initializes at init time.

JeffersGlass and others added 2 commits June 8, 2022 14:12
All executable tags (py-script, py-button, py-repl, etc)
can now take a 'pys-namespace' attribute, which will cause
the code to run in a namespace identified by the attribute's value.
Tags without this attribute run in the default (global) namespace;
other namespaces are stored in a dictionary at
pyodide.globals["pyscript_namespaces"].

When a new namespace is created, the global dictionary is copied to
that namespace. This preserves access to the standard library, Pyodide,
pyscript, etc.

Currently, all namespaces are instantiated on page-load. It would
be better if there was an option to create them at execution time.

Refactor namespaces into new getNamespace function

Fix linting errors with const v let

Run format
@JeffersGlass
Member Author

For what it's worth, I'm not married to namespaces having their own bespoke example - it was useful for testing and demonstration, but it may be too niche to merit its own entire example.

@fpliger
Contributor

fpliger commented Jun 9, 2022

Great!

Finally back from travels, conferences, etc.. and will get on this today.

Contributor

@fpliger fpliger left a comment


@JeffersGlass thanks for the PR! Overall it looks good but I think we should change the attribute name (and probably support it in other PyScript components, but this can come in follow up PRs). What do you think?

pyscriptjs/src/components/base.ts Outdated Show resolved Hide resolved
pyscriptjs/src/components/base.ts Outdated Show resolved Hide resolved
pyscriptjs/src/components/pyscript.ts Outdated Show resolved Hide resolved
@fpliger
Contributor

fpliger commented Jun 10, 2022

@rth Do you mind a quick review to see if it's the best way to manage pyodide namespaces?

Contributor

@rth rth left a comment


Thanks @JeffersGlass ! Overall this approach is reasonable I think. There was a previous discussion about this in pyodide/pyodide#703 and we have added

  • pyodide.state._save_state
  • pyodide.state._restore_state

which also records JS packages loaded to the global scope pyodide/pyodide#1349 but you probably don't need that here.

The caveat is that sys.modules and any changes to modules (e.g. extra path in sys.path) will be shared between namespaces. I think it should be OK, but hard to say what side effects this could have until someone starts using it extensively.
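
The caveat can be seen in a small plain-Python sketch, where two separate globals dicts stand in for two namespaces. The dict names and the path string are made up for illustration; the point is only that interpreter-wide state such as sys.path is shared even when variables are not.

```python
import sys

ns_a = {}
ns_b = {}

# Code in namespace "a" mutates interpreter-wide state and sets a variable.
exec(
    "import sys\n"
    "sys.path.append('/tmp/only-for-a')\n"
    "local_value = 1\n",
    ns_a,
)

# Code in namespace "b" sees the sys.path change but not a's variable.
exec(
    "import sys\n"
    "path_leaked = '/tmp/only-for-a' in sys.path\n"
    "sees_local = 'local_value' in globals()\n",
    ns_b,
)

print(ns_b["path_leaked"], ns_b["sees_local"])

sys.path.remove("/tmp/only-for-a")  # clean up the shared state
```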

IPython's %reset magic command also does something similar (cf code) which is encouraging.

Maybe @antocuni would also have an opinion on this.

pyscriptjs/src/components/base.ts Show resolved Hide resolved
pyscriptjs/src/utils.ts Outdated Show resolved Hide resolved
@JeffersGlass
Member Author

Thanks @rth! Am I right in thinking that pyodide.state._save_state and pyodide.state._restore_state specifically restore the pyodide globals dict, and wouldn't be able to essentially copy a dict for use in a separate namespace?

Right now, this PR just makes a copy of the globals dict immediately after it's initialized, which feels just a little clunky, but functional.

I see what you're saying about sys.modules being shared between namespaces. As you say, not sure if that'd be an issue, but curious to see how that manifests in projects.

@fpliger added the labels "tag: interpreter", "tag: component", and "waiting on feedback" on Jun 10, 2022
@antocuni
Contributor

The caveat is that sys.modules and any changes to modules (e.g. extra path in sys.path) will be shared between namespaces. I think it should be OK,

I see what you're saying about sys.modules being shared between namespaces. As you say, not sure if that'd be an issue,

I think that in order to answer that, we need to decide what we want from namespaces:

  1. do we want fully separated environments which are independent from each other?
  2. OR, do we simply want namespaces so that each <py-script> can avoid cluttering the global namespace?

(1) is very hard to implement. In addition to sys.modules there is tons of global state in Python which makes namespaces not truly independent, including but not limited to static variables in C modules. The only way to achieve true independence would be to have two separate instances of pyodide.

The good news is that I don't think we want (1) :).
I think that we want (2), but better to say it explicitly to avoid confusion; let's also see what @fpliger thinks since he's the one who opened the original issue.

That said, I think it would be also nice to add a way to access other namespaces. We need:

  1. a way to access a named namespace from another namespace
  2. a way to access the default namespace from a named one

An example showing (1), in which all the named namespaces are automatically available

<py-script namespace="aaa">
x = 42
</py-script>

<py-script>
y = aaa.x + 1
</py-script>

But (2) becomes more difficult: if I am in a <py-script namespace="bbb">, how to access the value of y? One option is something like default_namespace.y, but it's a bit ugly. Or maybe we could call it __main__, which is similar to how Python calls the default module when you execute a script:

<py-script namespace="aaa">
x = 42
</py-script>

<py-script>
y = aaa.x + 1
</py-script>

<py-script namespace="bbb">
print(__main__.y)

# which one of the following should work? Or both?
print(__main__.aaa.x)
print(aaa.x)
</py-script>

However, my favorite solution is slightly different: Python already has a well-known construct to separate code into multiple namespaces and a way to reference them from each other: it's called modules! So, what about the following:

<py-script module="aaa">
x = 42
</py-script>

<py-script>
import aaa
y = aaa.x + 1
</py-script>

<py-script module="bbb">
import __main__
import aaa
print(__main__.y)
print(aaa.x)
</py-script>

Advantages of this solution:

  • in order to reference another namespace/module, you have to explicitly import it. Explicit is better than implicit
  • the default <py-script> is executed inside a module called __main__, which sounds nicely similar to normal scripts; bonus point, if you copy&paste code from the internet which says if __name__ == '__main__', it still works :)
  • it makes it very easy to refactor by moving the code from a <py-script> tag into its own file

If we decide to go with this approach, there are still open questions though. In particular:

  1. what happens if you declare multiple <py-script> with the same module? Personally, I think it should be possible, and the semantics should be the same as we have in this PR
  2. what happens if you declare <py-script module='foo'> and there is also a file called foo.py? I think this should be forbidden.

@JeffersGlass
Member Author

@antocuni thank you for this! Still digesting all of those very good thoughts (and similarly want to hear from @fpliger) - in the meantime, I've added some commits to address the original comments from @fpliger, namely:

  • The tag attribute which determines the namespace is now called just namespace (though of course this can change again).
  • To clarify: any element which inherits from BaseEvalElement or PyWidget is able to use the namespace functionality. I've added examples using a py-button and a py-repl to the example. I also corrected an issue where the namespace was not being propagated when REPLs are autogenerated.
  • The log message about using the default namespace has been removed.

Additionally, I've fixed an issue where a namespace was being created/recreated for every tag with a namespace attribute, even if it had already been created by a previous tag with the same namespace.

@fpliger
Contributor

fpliger commented Jun 10, 2022

@JeffersGlass thanks for addressing the comments (doh, I totally overlooked the attribute being managed in the base classes :) )

@antocuni thanks for the nice overview, a lot of good food for thought.

In general, I agree that in PyScript we first simply want namespaces so that each <py-script> can avoid cluttering the global namespace, and do not want fully separated environments (runtimes) which are independent from each other (although the latter will most likely be a future feature; these are really 2 different problems though).

I also agree that explicit is better than implicit and that modules are a good way to think about a first implementation. That said, there are a few things to consider:

  • accessibility: taking my Pythonista hat off and thinking of non-technical users, I feel like namespace is more semantically helpful to understand the concept here than module (an easy solution could be to implement it as modules but name the attribute namespace).
  • security: there might be use cases where the user or the author of a PyScript app doesn't want namespaces to be accessible, for security reasons. We need to think of a way to support that as well.
  • naming convention: with regard to using __main__ as the default namespace ..... my brain needs to digest it 😄 I kind of like the idea but haven't thought of the impact on PyScript users that are not pythonistas.

@JeffersGlass
Member Author

I agree that module semantics are a slick and approachable way of handling this issue, but I think by allowing multiple py-script tags to exist within the same module, there's a conflict created with the top-to-bottom-of-the-webpage execution of code. For example, what should the following code output?

<py-script module="aaa">
    x = 1
</py-script>
<py-script> # module == '__main__'
    import aaa
    print(aaa.x)
</py-script>
<py-script module="aaa">
    x = 2
</py-script>
<py-script>  # module == '__main__'
    print(aaa.x)
</py-script>

If scripts are executed top-to-bottom, we would presumably get 1 2 as the output. However, outside of Pyscript, when I import aaa from my main script, I'd expect it to execute all the lines of code in that module sequentially and return to me the resulting locals in a namespace, in which case the result would be 2 2.

This also probably has ramifications for working with REPLs, or with dynamically-generated code, if we want it to be possible for those to work with namespaces. If a py-repl is evaluated in a namespace/module that already exists, we would presumably want to execute that code as if it were executing "at the end" of all existing code in that module.

Doing some tinkering, it seems possible to mess with a module's dict in such a way that this is possible, but I'm not entirely sure. We'd also want to investigate whether that caching of loaded modules messes with an approach like this - in the example above, even if aaa.__dict__['x'] is set to 2 at some point, do we need to re-import aaa (or force a re-import) so that the second print statement prints 2 and not the cached value 1? That's where I'm unclear at what level module caching happens.

Tl;dr It feels like "top-to-bottom-of-the-page-script execution" and "namespaces as modules" are somewhat at odds with each other. Maybe there's a way of "appending" to modules that I'm missing?
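
On the caching question, a small experiment in plain CPython may help. It is only a sketch: the module aaa is created by hand with types.ModuleType (an assumption about how a block-backed module could be registered), and exec stands in for running a py-script block.

```python
import sys
import types

# Register a hand-made module, then run a first "block" in its namespace.
aaa = types.ModuleType("aaa")
sys.modules["aaa"] = aaa
exec("x = 1", aaa.__dict__)

import aaa as first_import      # finds the object already in sys.modules
from aaa import x as snapshot   # copies the binding as it is right now

exec("x = 2", aaa.__dict__)     # a later "block" in the same module

import aaa as second_import     # same cached object, nothing is re-executed

print(first_import is second_import)
print(first_import.x)
print(snapshot)

del sys.modules["aaa"]          # clean up
```

Attribute access through the module object reflects the later assignment without any re-import, while a name bound earlier with from aaa import x keeps the old value.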


Calling the default namespace '__main__' I think implies that it is the entry point to all of the code on the page, or that the default, __main__ namespace scripts would run first. That could be another way of thinking about execution order, but it would be a significant change from how things currently execute, it seems? Though I think the advantages of being able to re-use if __name__ == '__main__' are a good argument for using it...


As an alternative to standard Python import (and I haven't thought this all the way through), what about something like Pyscript.import('aaa', 'x'), which loads a value from another namespace specifically at the time it's called? I imagine this as a wrapper for something like pyodide.globals.get("pyscript_namespaces")["aaa"]["x"].

This feels a bit like reinventing the wheel as far as import goes, but it clarifies that it would work differently than import (no caching). We could tweak the semantics to allow imports of the whole module namespace as well, similar to from aaa import * - perhaps with Pyscript.import('aaa'), and perhaps the function could take an iterable of arguments to import, to replace from aaa import b, c, d?
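
A rough sketch of what such a helper could look like, under stated assumptions: import is a reserved word in Python, so a method literally named Pyscript.import could not be called with that syntax, and the hypothetical name import_from is used instead. A plain dict stands in for pyodide.globals["pyscript_namespaces"].

```python
# Hypothetical registry, standing in for pyodide.globals["pyscript_namespaces"].
pyscript_namespaces = {"aaa": {"x": 1, "b": 2, "c": 3}}


def import_from(namespace, *names):
    """Return current values from a namespace dict, looked up at call time.

    With no names, return a shallow copy of the whole namespace.
    """
    ns = pyscript_namespaces[namespace]
    if not names:
        return dict(ns)
    values = tuple(ns[name] for name in names)
    return values[0] if len(values) == 1 else values


print(import_from("aaa", "x"))
pyscript_namespaces["aaa"]["x"] = 2   # a later block rebinds x
print(import_from("aaa", "x"))        # the new value is picked up

b, c = import_from("aaa", "b", "c")   # several names at once
```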

@fpliger
Contributor

fpliger commented Jun 11, 2022

Yeah, agree that there's a lot to think about. Probably the best option is to open a discussion issue and work on a proposal.

@antocuni
Contributor

Wow, lots of things to discuss. Let's start by answering @fpliger's remarks:

  • accessibility: taking my Pythonista hat off and thinking of non-technical users, I feel like namespace is more semantically helpful to understand the concept here than module. (an easy solution could be to implement it as modules but the attribute name as namespace

I see your point here, and I agree that in a hypothetical world it would be nice to use "namespace", but I politely disagree with your conclusions.

The main point of my idea is to be able to do import aaa; as a Python user, being able to import modules is expected, but the concept of importing "namespaces" is totally unexpected, surprising and probably unpythonic.
One of the biggest points that makes Python easy for newcomers is internal consistency: we should be VERY careful about breaking it, because it's a slippery slope.

About non-technical users: if they are non-technical, they don't know what a namespace is anyway :).
Also, namespaces/modules seem to be a semi-advanced feature, as most newcomers would just use plain <py-script>.

  • security: there might be use cases where the user or the author of a PyScript app doesn't want namespaces to be accessible, for security reasons. We need to think a way to support that as well.

this is basically impossible to do in Python inside the same process. There have been attempts in the past to offer sandboxed/restricted python execution but they all ultimately failed (e.g. prior to Python 2.3 the stdlib included rexec and Bastion but they were removed because of too many issues and security bugs).

E.g. PyPy offers secure sandboxed execution by running the code inside a special subprocess where all the syscalls are intercepted and implemented by an outside "controller" process.

  • naming convention: in regards of using __main__ as default namespace ..... my brain needs to digest it 😄 I kind of like the idea but haven't thought of the impact on PyScript users that are not pythonistas

__main__ is just a suggestion, could be any other name. But any choice here is purely arbitrary, so why not?
For non-pythonistas it's just as arbitrary as default, pyscript_main, __pyscript__, or anything else.


Answering @JeffersGlass

I agree that module semantics are a slick and appraochable way of handling this issue, but I think by allowing multiple py-script tags to exist within the same module, there's a conflict created with the top-to-bottom-of-the-webpage execution of code. For example, what should the following code output?
[cut]

It would be executed top-to-bottom and print 1, 2.
Really, the semantics which I am suggesting are the very same as yours; the only difference is that in your implementation the global dictionaries are stored inside pyscript_namespaces, while in my idea the dicts are attached to modules.
Basically, it would be something like this:

import sys
import types

class PyScriptModule(types.ModuleType):
    pass

def eval_pyscript_block(src, modname='__main__'):
    mod = sys.modules.get(modname)
    if mod:
        if not isinstance(mod, PyScriptModule):
            raise Exception(f'Cannot mix <py-script> with regular modules; {modname} already exists')
    else:
        mod = PyScriptModule(modname)
        sys.modules[modname] = mod
    exec(src, mod.__dict__)

### example of usage
eval_pyscript_block('x = 42', modname='hello')
import hello
print(hello.x)

We'd also want to investigate whether that caching of loaded modules messes with an approach like this - in the example above, even if aaa.__dict__['x'] is set to 2 at some point, do we need to re-import aaa (or force a re-import) so that the second print statement prints 2 and not the cached value 1? That's where I'm unclear at what level module caching happens.

if implemented with the logic above, reloading would not be allowed nor possible:

>>> eval_pyscript_block('x = 42', modname='hello')
>>> import hello
>>> import importlib
>>> importlib.reload(hello)
Traceback (most recent call last):
  File "/usr/lib/python3.8/code.py", line 90, in runcode
    exec(code, self.locals)
  File "<stdin>", line 1, in <module>
  File "/usr/lib/python3.8/importlib/__init__.py", line 168, in reload
    raise ModuleNotFoundError(f"spec not found for the module {name!r}", name=name)
ModuleNotFoundError: spec not found for the module 'hello'

The semantics of module "caching" is very simple in Python: when you do import x, Python checks whether 'x' is in sys.modules: if it's there, it just uses it, without doing anything else (you can also put arbitrary objects there -- try sys.modules['hello'] = 42 and then import hello).
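
The experiment suggested above can be written out as a few lines of plain CPython; nothing here is PyScript-specific.

```python
import sys

sys.modules["hello"] = 42   # any object can be registered under a module name
import hello                # found in sys.modules, so no file lookup happens

print(hello)

del sys.modules["hello"]    # clean up the fake entry
```

The import statement simply binds the name hello to whatever object was stored under that key.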

As an alternative to standard Python import (and I haven't thought this all the way through), what about something like Pyscript.import('aaa', 'x'), which loads a value from another namespace specifically at the time it's called?

I don't understand what you mean. A normal from aaa import x also loads a value specifically at the time it's called, so I don't understand what would be the difference with your Pyscript.import.

@JeffersGlass
Member Author

JeffersGlass commented Jun 13, 2022

@antocuni thank you for your thorough explanation, and especially the example code on how a PyScriptModule might be implemented. I think we agree on 95% of the semantics and purposes here, and we just differ somewhat on whether import is the best nomenclature for it.

What I'm getting hung up on is the following situation, where (1) a module imported from an earlier py-script tag and (2) a module imported from a file, behave somewhat differently:

<py-script module="module_a">
  print("This is printing from module A")
  x = 42
</py-script>
<py-script>
    print("The ultimate answer is:")
    import module_a
    print(module_a.x)
</py-script>
----- OUTPUT ------
This is printing from module A
The ultimate answer is:
42

But if we move the first module to an external file:

### module_a.py
print("This is printing from module A")
x = 42

### index.html
<py-env>
- paths:
    - ./module_a.py
</py-env>
<py-script>
    print("The ultimate answer is:")
    import module_a
    print(module_a.x)
</py-script>
----- OUTPUT ------
The ultimate answer is:
This is printing from module A
42

Which is to say - it seems to me that the import statement is doing two slightly different things here:

  • When applied to an external file/module/package, it runs the source code and binds the resulting locals to a namespace (or, granted, uses a cached version of the same.)
  • When applied to a module created by a py-script tag, it simply references an extant namespace, as that code has already been run.
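
The two cases can be reproduced outside PyScript with a short sketch. The module names file_module_a and tag_module_a are made up, a temporary directory stands in for the external file, and a hand-registered types.ModuleType stands in for a module created by a py-script tag (an assumption based on the PyScriptModule idea earlier in the thread).

```python
import importlib
import io
import sys
import tempfile
import types
from contextlib import redirect_stdout
from pathlib import Path

log = io.StringIO()
with tempfile.TemporaryDirectory() as tmp, redirect_stdout(log):
    # Case 1: a module backed by a file runs when it is first imported.
    Path(tmp, "file_module_a.py").write_text('print("from file module")\nx = 42\n')
    sys.path.insert(0, tmp)
    importlib.invalidate_caches()
    print("before import (file)")
    import file_module_a
    sys.path.remove(tmp)

    # Case 2: a pre-registered module has already run; import only looks it up.
    tag_module = types.ModuleType("tag_module_a")
    sys.modules["tag_module_a"] = tag_module
    exec('print("from tag module")\nx = 42', tag_module.__dict__)
    print("before import (tag)")
    import tag_module_a

# The order of the captured lines shows when each module body actually ran.
events = log.getvalue().splitlines()
print(events)
```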

While the side-effects of this tiny example are trivial, one could imagine a situation where they could be significant.

Perhaps this is a minor distinction, but the fact that this behavior is different from typical Python behavior gives me pause. Personally, I think that adding a new method of namespace access is preferable to overloading an existing term with an additional meaning in a non-explicit way - that is, I think we should have different nomenclature for referencing the namespace created by a py-script tag.

Hopefully I've done a better job explaining this time around - and as always, happy to be disagreed with.

@JeffersGlass
Member Author

After sleeping on it, I'm wondering if I'm making a mountain out of a molehill - probably using import and clarifying in the documentation is sufficient to clear up any ambiguities for most users.

@fpliger Interested to know if this suits your use case, but I'm happy to revise this PR (or open a new one?) with a version of namespaces based on modules, along the lines of @antocuni's method above. I know there are some outstanding questions about the name of the default namespace itself and the tag attribute that sets the namespace/module; those seem like things we can continue discussing?

@antocuni
Contributor

@antocuni thank you for your thorough explanation, and especially the example code on how a PyScriptModule might be implemented. I think we agree on 95% of the semantics and purposes here, and we just differ somewhat on whether import is the best nomenclature for it.

What I'm getting hung up on is the following situation, where (1) a module imported from an earlier py-script tag and (2) a module imported from a file, behave somewhat differently:

[cut]

Ah ok, I see what you mean and it's an interesting point. I don't agree that they have different semantics; from my POV, executing a <py-script> block is semantically equivalent to defining a module and implicitly importing it (because that's the only way to execute code inside modules in Python).
But the fact that you are confused by this is already a data point, because it might indicate that what is "obvious" to me is not necessarily obvious to others. The following example might indeed be confusing, because someone might think that aaa is not executed until it is imported:

<py-script module="aaa">
print('aaa')
</py-script>

<py-script>
print('main')
import aaa
</py-script>

From this point of view, using namespace instead of module might lead to less surprise, so now I'm starting to lean towards using namespace:

<py-script namespace="aaa">
print('aaa')
</py-script>

<py-script>
print('main')
import aaa
</py-script>

@fpliger
Contributor

fpliger commented Jun 14, 2022

@antocuni @JeffersGlass thanks, there's a lot of good stuff in this thread! :)

accessibility: taking my Pythonista hat off and thinking of non-technical users, I feel like namespace is more semantically helpful to understand the concept here than module. (an easy solution could be to implement it as modules but the attribute name as namespace
I see your point here, and I agree that in a hypothetical world it would be nice to use "namespace", but I politely disagree with your conclusions.

The main point of my idea is to be able to do import aaa; as a Python user, being able to import modules is expected, but the concept of importing "namespaces" is totally unexpected, surprising and probably unpythonic.
One of the biggest point which makes Python easy for newcomers is internal consistency: we should be VERY careful at breaking it, because it's a dangerous slope.

I've been trying to understand where we disagree/diverge and I think it's on the level at which namespaces operate. You are proposing to look at them just as a module (yes, one of [the main] Python constructs that manages a namespace) while I'm seeing them as the top-level context where your single script is running. In fact, a Pyodide namespace lives at the TypeScript (runtime) level, unlike a module. One could argue that dicts are also namespaces (and that's what a Pyodide namespace really is), that almost "everything in Python is a dict", and that there's a lot of flexibility in the choices. Picking modules as the namespace implementation implies picking a lower-level concept that exposes an implementation detail that is much more opinionated.

Additionally, PyScript is a framework on top of Pyodide and a lot of what it is, is providing APIs/Accessibility to things that would otherwise be more complicated. Hopefully Pyodide will not be the only runtime and Python will not be the only language. How would interop between these lang/runtimes/namespaces be?

Just having dicts that we pass around seem more flexible, easier to maintain and simpler to me.

About non-technical users: if they are non-technical, they don't know what is a namespace anyway :).
Also, namespaces/modules seems to be a semi-advanced feature, as most newcomers would just use plain <py-script>.

😅 very good point. In that I agree with you, just disagree that it can be an excuse to not be thinking about making it more accessible and easier to use anyway.

security: there might be use cases where the user or the author of a PyScript app doesn't want namespaces to be accessible, for security reasons. We need to think a way to support that as well.
this is basically impossible to do in Python inside the same process. There have been attempts in the past to offer sandboxed/restricted python execution but they all ultimately failed (e.g. prior to Python 2.3 the stdlib included rexec and Bastion but they were removed because of too many issues and security bugs).

E.g. PyPy offers secure sandboxed execution by running the code inside a special subprocess where all the syscalls are intercepted and implemented by an outside "controller" process.

(It'd probably be good to have @rth opinion on a few of these actually... this one included)

I think the issue here is what I mentioned earlier. We are seeing things at different levels. If we use modules as implementation, I agree with you, if we use a mapping/dict at the Typescript/JS level we can decide what to pass and what to expose to the interpreter. There's more opportunity to control things. (@rth , please keep me honest here, I may be wrong :) ). Ultimately, since you can also access JS from Python you could still hack things but it's less "easy" (and I think we can do more).

In addition to the above, the choice of import to access namespaces feels like a stretch, like trying to fit an oval in a circle. They are not the same thing and trying to use a construct already existing in Python might end up creating more confusion than not.

@antocuni
Contributor

I've been trying to understand where we disagree/diverge and I think it's on the level of where namespace operate. You are proposing to look them just as a module (yes, one of [the main] Python constructs that manages a namespace) while I'm seeing them as top level context where your single script is running. In fact, a Pyodide namespace lives in the Typescript (runtime) level, unlike a module

I'm confused. In the original @JeffersGlass's implementation, namespaces are python dictionaries, not JS dictionaries.
If you want to keep pyscript namespaces separated at the JS level then I agree that python modules are not a good idea, but then it probably means that I'm missing the real point of what you are trying to achieve.

very good point. In that I agree with you, just disagree that it can be an excuse to not be thinking about making it more accessible and easier to use anyway.

100% agree. My proposal started because I wanted to find a good way to access a namespace from another one, and import seemed to fit very well.
I'm happy to listen to alternative proposals.

I think the issue here is what I mentioned earlier. We are seeing things at different levels. If we use modules as implementation, I agree with you, if we use a mapping/dict at the Typescript/JS level we can decide what to pass and what to expose to the interpreter.

another thing to consider is that doing too many JS->Python->JS->Python->... jumps can cause problems, because if I understand correctly how JsProxy and PyProxy work, you need to manually manage their lifetime, but @rth surely knows better than me.

@fpliger
Contributor

fpliger commented Jun 14, 2022

I'm confused. In the original @JeffersGlass's implementation, namespaces are python dictionaries, not JS dictionaries.
If you want to keep pyscript namespaces separated at the JS level then I agree that python modules are not a good idea, but then it probably means that I'm missing the real point of what you are trying to achieve.

https://github.com/pyscript/pyscript/pull/503/files#diff-e5b66d49faff67e7097dcdde889c4da7fc9f6b3803eba6eec4581d3bbfedcf04R299-R302 for instance. The namespaces are created on TS side and passed explicitly to the runtime when running runPythonAsync. The fact that this PR proposes to store them in runtime.globals is an implementation detail. We could be storing them anywhere.... We could be storing them in JS memory as well as a DB.

another thing to consider is that doing too many JS->Python->JS->Python->... jumps can cause problems, because if I understand correctly how JsProxy and PyProxy work, you need to manually manage their lifetime, but @rth surely knows better than me.

Yeah, that's a concern and I would love to hear @rth's thoughts here. I'm not sure how we can avoid jumps whenever we have an explicit namespace though (at least for Pyodide).

@antocuni
Contributor

https://github.com/pyscript/pyscript/pull/503/files#diff-e5b66d49faff67e7097dcdde889c4da7fc9f6b3803eba6eec4581d3bbfedcf04R299-R302 for instance. The namespaces are created on TS side and passed explicitly to the runtime when running runPythonAsync.

yes but the actual namespace is a python dictionary which is created here, if I'm not mistaken:
https://github.com/pyscript/pyscript/pull/503/files#diff-ebdbc7740c6ae4c0c09fea0c8f472c1f73f8020cdcc313edfff35dd354d4657eR27

The fact that this PR proposes to store them in runtime.globals is an implementation detail. We could be storing them anywhere.... We could be storing them in JS memory as well as a DB.

I don't think this is actually possible: storing them in a DB would mean that you have to e.g. pickle/unpickle (or any other kind of serialization) the objects, which is not something that you can do generally and transparently in Python.

But again, now I start to think that I might be missing the whole point of the exercise. I thought that pyscript namespaces were mostly a way to avoid cluttering the global namespace; if you have other use cases in mind please tell me :).

@JeffersGlass
Member Author

JeffersGlass commented Jun 14, 2022

The namespaces are created on TS side and passed explicitly to the runtime when running runPythonAsync
...
yes but the actual namespace is a python dictionary which is created here, if I'm not mistaken:

Some clarification - in the PR, namespaces are stored as Python dicts, within a Python dict called pyscript_namespaces, which are then fed to runPythonAsync as necessary.

In more detail: the code at the end of loadInterpreter() creates a Python dict called pyscript_namespaces (using pyodide.globals.get('dict')() to instantiate it). The keys are strings (taken from the namespace attribute at the moment) and the values are dictionaries: the 'namespaces' themselves, if you will, that ultimately get passed to runPythonAsync. Those namespace dictionaries are all currently initialized right after Pyodide is loaded, by the initNamespaces function. When a tag is to be eval'd or evaluate'd, getNamespace() gets the proper namespace dictionary from pyscript_namespaces and passes it to runPythonAsync.


It's worth pointing out that the reason all namespace dicts are created immediately after Pyodide loads is so that we can copy pyodide.globals into each namespace dict at that point, preserving access to globals and installed libraries. This is a hack around the fact that runPython/Async only takes a globals parameter and has no concept of global vs. local scope, so we have to copy all the modules/functions/builtins we want access to into every namespace.
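For readers following along, here is a minimal pure-Python sketch of the copy-the-globals approach described above, with the built-in exec standing in for runPythonAsync. The names base_globals, init_namespaces, and run_in_namespace are hypothetical and are not the PR's actual code.

```python
# Stand-in for pyodide.globals after PyScript has finished initializing.
base_globals = {"greeting": "hello"}

# Plays the role of pyscript_namespaces: namespace name -> namespace dict.
pyscript_namespaces = {}


def init_namespaces(names):
    """Create one shallow copy of the base globals per distinct namespace name."""
    for name in names:
        pyscript_namespaces.setdefault(name, dict(base_globals))


def run_in_namespace(source, name=None):
    """Run source with the named dict as its globals; default to the base globals."""
    target = base_globals if name is None else pyscript_namespaces[name]
    exec(source, target)


init_namespaces(["first", "second"])
run_in_namespace("x = greeting + ' from first'", "first")
run_in_namespace("x = greeting + ' from second'", "second")
```

Each namespace ends up with its own x, while anything placed in base_globals before the copy is visible everywhere. Anything added to base_globals after the copy is not, which is the limitation discussed above.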

I think if we end up managing namespaces as a TS mapping or similar, this kind of copying-builtins-and-libraries-hackery is somewhat inevitable, but: if namespaces are handled as some kind of Python dictionary, there's a better option:

Making use of Python's object lookup process (allowing built-ins and modules). Using pyodide.eval_code, which accepts both globals and locals, should allow us to keep the builtins (and packages loaded in py-env) in a single global scope and run the code of each namespace within its own local scope, without having to make copies of the builtins. (This is similar to @antocuni's PyScriptModule example.)

This approach should work whether we use flat dicts or modules as the namespace dictionaries, but it's a point in favor of implementing multiple namespaces as some kind of Python construct and not as a TS mapping or otherwise.
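A rough sketch of that idea in plain Python, using exec (which also accepts separate globals and locals) as a stand-in for eval_code. The dict names are made up for illustration, and the sketch only exercises top-level statements.

```python
# One shared dict acts as the global scope for every namespace.
shared_globals = {"answer": 42}

# Each namespace is just its own dict, passed as the locals argument.
namespace_a = {}
namespace_b = {}

# Top-level names are read from the locals first, then from the shared globals;
# top-level assignments are written to the locals only.
exec("x = answer + 1", shared_globals, namespace_a)
exec("x = answer + 2", shared_globals, namespace_b)
```

After this runs, namespace_a and namespace_b each hold their own x, and shared_globals has gained no x at all, so nothing had to be copied into the per-namespace dicts.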

At least, that's true in my conception of what 'namespaces' are, but as @antocuni says, maybe we should clarify our goals and purposes a bit first.

@antocuni
Contributor

Also, I think there are two different topics at hand, and some of the confusion arises because sometimes we are thinking of one or the other:

  • implementation: what to use to implement namespaces? I think that the only reasonable choice is "a python dict for every pyscript namespace"

  • Python API: how do you access a namespace from another? My proposal was to use import.

The implementation is -- well -- an implementation detail. But the API is what we have to get right so let's focus on this for now.
I thought that import would be a good idea and very pythonic, but it seems that not everybody agrees, so let's try to think about alternatives and see what's the best.

Another "obvious" alternative which comes to my mind is to have a sort of global object which represents the "page" or the "document" and which is accessible from all namespaces. This is similar to the window object in js, but I think it should be python-only to avoid confusion. Let's call it pyscript for now (but we should think of a name). We could imagine to have something like this:

<py-script>
a = 1
</py-script>
<py-script namespace="xxx">
print(pyscript.__main__.a) # 1
b = 2
</py-script>
<py-script namespace="yyy">
print(pyscript.xxx.b) # 2
</py-script>
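One way such a shared object could behave, sketched in plain Python. NamespaceRegistry, NamespaceView, and run are hypothetical names invented for this illustration, and exec stands in for however the tags would actually be evaluated.

```python
class NamespaceView:
    """Live, attribute-style view over a namespace dict."""

    def __init__(self, namespace):
        # Sharing the dict means attribute reads always see current values.
        self.__dict__ = namespace


class NamespaceRegistry:
    """Hypothetical page-level object exposing each namespace as an attribute."""

    def __init__(self):
        self._namespaces = {"__main__": {}}

    def get_dict(self, name):
        """Return the dict for a namespace, creating it on first use."""
        return self._namespaces.setdefault(name, {})

    def __getattr__(self, name):
        # Only reached when normal attribute lookup fails.
        try:
            namespace = self.__dict__["_namespaces"][name]
        except KeyError:
            raise AttributeError(name) from None
        return NamespaceView(namespace)


pyscript = NamespaceRegistry()


def run(source, name="__main__"):
    """Execute source in the named namespace, with the registry made visible."""
    namespace = pyscript.get_dict(name)
    namespace.setdefault("pyscript", pyscript)
    exec(source, namespace)


run("a = 1")
run("seen_a = pyscript.__main__.a\nb = 2", "xxx")
run("seen_b = pyscript.xxx.b", "yyy")
```

The design choice here is that cross-namespace access is always spelled out through the registry object, so a reader can tell at a glance when code reaches outside its own namespace.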

This is probably less controversial than modules. The question is:

  • do we want to have this global pyscript object?
  • if yes, what does it represent? The window? The document? The page? The interpreter?
  • depending on the answer to the previous question, we might want to name it differently. But we need to avoid things like document or window to avoid confusion with their js equivalent.

@JeffersGlass
Member Author

In answer to the second question - "how do you access a namespace from another?"

To be honest @antocuni - when I first read @fpliger's conception of namespaces, I didn't imagine there being any way of importing objects between them. Since the original issue #166 referenced providing minimal isolation and using different execution scope - and having just completed a project with multiple discrete onscreen parts where knowing code in different chunks of the interface couldn't interact would have been very helpful - I was imagining ways of separating on-page scripts to prevent interaction, as opposed to looking for ways they could potentially connect.

So perhaps it's worth taking even a step further back and asking - should namespaces provide a (reasonable) way to access each other?

(Being in the same runtime, the real answer is that there's probably always a way to access other namespaces, via the pyodide or js global scope or somesuch, as you discussed previously. So perhaps the question is 'should they be able to do so easily/as a documented feature?').


Semi-relatedly, there's been some back-and-forth about a proposal for a <py-module> tag (#323) - which in my understanding achieves a similar code-organization purpose as the 'namespaces-are-modules' idea we were circling earlier, but code within a <py-module> wouldn't be executed until it was imported by another piece of code.

Whether adopting the idea or not, it feels like figuring out what is desirable in terms of code organization may be separate from what's desirable in terms of code isolation, if that makes any sense. I recognize the two could be quite intertwined, but I wonder if thinking about those concepts separately may help shape the scope of what's desirable here.


On a personal note, I'm leaving for a long-delayed honeymoon about 12 hours after posting this, and so will have to drop out of the conversation for a couple weeks. But I look forward to seeing where it's gone in the meantime, or to picking up the conversation down the road!

@fpliger
Contributor

fpliger commented Jun 16, 2022

So... this thread has been really good/helpful. Lots of food for thought.

I agree with @JeffersGlass on the fact that maybe it's worth taking a step (or two) back. After all, in general it's usually a better design to have smaller/more modular features that can be integrated together than a "one size fits all" larger monolith that does everything. I'm +10 on starting small and not thinking of namespaces as a way to interop between languages (as I first hinted at) and +0 on even keeping this first PR leaner and not adding a way for namespaces to interact with each other right away... and adding that feature with a little more thought/design. This also helps to think about security.

In that sense, I agree that there's probably going to always be a hack users can do to access other namespaces anyway but at least we can think of a nice API to make things easy, explicit, etc...

Also, good point on code organization and overall. :)

@JeffersGlass CONGRATS on your honeymoon! I hope you enjoy and have the most wonderful time! I'm bummed (and personally apologize) for the slow progress/conversation on this very good PR... but if you are ok with it we can do the final work to bring it to the finish line while you enjoy your honeymoon and have a wonderful time not thinking about PyScript for a split second 😉 . Just let us know since you did a lot of great work here 🙏

@JeffersGlass
Member Author

Thanks @fpliger! I haven't felt it was slow progress at all - it just turned out to be a bigger bite of the apple than I could have imagined, and well worth taking the time to knock around options and ramifications.

Absolutely feel free to take this PR to the finish line while I'm away - whatever route you take, I feel great having been part of this excellent discussion. Thank you and @antocuni for all your brainpower on this - I'm excited to see the results when I get back, and to see what comes next for Pyscript as well.

@antocuni
Contributor

Replying to @JeffersGlass

So perhaps it's worth taking even a step further back and asking - should namespaces provide a (reasonable) way to access each other?

right, I think this is the correct question to ask. I assumed that the answer would be "yes", because in my mind a namespace is merely a way to organize names, not a sandbox/isolation/security feature.
If we want to have a way to separate parts of the page which are not supposed to communicate, then I am -10 to call it "namespace".

the real answer is that there's probably always a way to access other namespaces, via the pyodide or js global scope or somesuch, as you discussed previously. So perhaps the question is 'should they be able to do so easily/as a documented feature

Agreed. Even if we decide that we don't want "namespaces" to be able to see each other, we need to be super explicit to say that this is not a secure sandbox.

On a personal note, I'm leaving for a long-delayed honeymoon

woooo, congrats! I hope you are enjoying it without thinking of pyscript 🎉


replying to @fpliger

I'm +10 on starting small and not thinking of namespaces as a way to interop between languages

agreed, let's keep this discussion python-only

+0 on even keeping this first PR leaner and not adding a way for namespaces to interact with each other right away... and add that feature with a little more thought/design. This also helps to think about security. [....]
In that sense, I agree that there's probably going to always be a hack users can do to access other namespaces anyway but at least we can think of a nice API to make things easy, explicit, etc...

I'm -0 on this. I see your point, but as you are saying people would find undocumented ways to achieve the goal anyway. If we don't design the inter-namespace communication now, we will surely break someone's code when we refactor it "properly" later.

@JeffersGlass
Member Author

JeffersGlass commented Jul 2, 2022

For the sake of moving this ahead, I'd like to propose that this PR implement the most basic form of code/variable isolation, in ways that allow the possibilities we've discussed to become future improvements:

  • 'Namespaces' are implemented as a simple dict (which could later be the __dict__ of a module, or another mapping).
  • 'Namespaces' do not provide a (reasonable) way of loading/importing objects between them (but that functionality may be added later).
  • The default 'namespace' is called something generic and unique to Pyscript (ie. not __main__, but that generic name could become an alias of __main__ or something else later).

On the implementation end, I'd like to do three things:

  1. Modify the way I handled creation of namespaces in the initial PR - there is a cleaner way using collections.ChainMap that (1) allows namespaces to be determined at runtime and not just when Pyodide is initialized, and (2) eliminates the hack of making a copy of __builtins__ for each namespace.
  2. Add some additional documentation explaining/clarifying this new attribute, its purpose and limitations.
  3. Clean up some conflicts caused by commits in the main branch since this PR was created.

This leaves the hardest question to last: the naming of this concept and the associated HTML attribute. To revisit and propose some possible names:

  • Namespace: I personally like this name; per the Python Glossary: "The place where a variable is stored. Namespaces are implemented as dictionaries... Namespaces support modularity by preventing naming conflicts." That said, the same definition goes on to discuss how "namespaces make clear which module implements a function," and if we're not using modules, perhaps this is the wrong name, as you say @antocuni
  • Environment: From the Python Execution Model 4.2.2: "When a name is used in a code block, it is resolved using the nearest enclosing scope. The set of all such scopes visible to a code block is called the block’s environment." This seems to describe our situation, however I think the name may be too similar to the idea of virtual environments and the py-env system of importing modules and files.
  • Local(s): This name highlights the idea that the associated code is run with/defines a specific set of local names.
  • Scope: Again, emphasizes that the given code is run within a specific scope of names.

@JeffersGlass
Member Author

JeffersGlass commented Jul 5, 2022

A couple updates:

  • Using collections.ChainMap proved to be unnecessary. Using pyodide.pyodide_py.eval_code() instead of pyodide.runPython() allows one to pass in a separate dictionary to hold local variables.
  • I've hit a bit of a snag with the way proxied functions interact with namespaces. In short, it seems proxied functions always execute with a global scope, and the only available local variable is the context that triggered the function. Here's a minimal example (written with the Pyodide execution commands, just to be explicit about the scopes these are executing with). The relevant function signature is pyodide.pyodide_py.eval_code(expr, globals, locals):
// This code runs in the global namespace:
pyodide.pyodide_py.eval_code(`
  from js import console, document
  from pyodide import create_proxy, to_js

  my_next_dict = {'x': 42}  # Create a rudimentary namespace dict
  `,
pyodide.globals); // globals argument to eval_code


// This code will use my_next_dict as its locals dictionary
pyodide.pyodide_py.eval_code(`
  console.log(f"Hey, the value of x at the start of this block is {x}")  # Outputs 42

  def x_value(_):
    console.log(f"The value of x is {x}")  # Should be 42... but instead raises NameError when executed by event

  document.getElementById("btn-x").addEventListener('click', create_proxy(x_value))
  `,
pyodide.globals, // globals argument to eval_code
pyodide.globals.get('my_next_dict')); // locals argument to eval_code

This makes a certain amount of sense from a JavaScript perspective, but it feels like a sharp corner from a Python perspective.

It's possible to work around this, I suppose, by dynamically recreating the appropriate local variables when the function is called, but that feels a bit sloppy... another option would be a big warning in the documentation that says Proxied Functions Always Execute with Global Scope, but again, not ideal.

I've reached out on the Pyodide community gitter to see if there are other workarounds.


One current workaround is:

def x_value(_):
    x = globals()['my_next_dict']['x']
    console.log(f"The value of x is {x}") # Really does output 42

One could imagine a function like pyscript.create_proxy that acts like pyodide.create_proxy, but takes an additional namespace argument that dynamically copies the variables from the given namespace into the local scope for the purposes of function execution. Again, not super duper clean, just riffing on options.
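As a rough illustration of that kind of helper, here is a pure-Python sketch. Note that it swaps in a different technique from the one described: instead of copying variables into the local scope, it builds a copy of the function whose globals include the namespace's names. bind_namespace is a hypothetical name, no Pyodide proxying is involved, and exec stands in for eval_code.

```python
import types


def bind_namespace(func, namespace):
    """Return a copy of func whose global lookups also see namespace's names.

    The merged dict is a snapshot: later changes to namespace are not
    visible to the returned function.
    """
    merged = {**func.__globals__, **namespace}
    bound = types.FunctionType(
        func.__code__, merged, func.__name__, func.__defaults__, func.__closure__
    )
    bound.__kwdefaults__ = func.__kwdefaults__
    return bound


shared_globals = {}
my_next_dict = {"x": 42}

# Define the handler with my_next_dict as locals, as in the example above.
exec(
    "def x_value(_):\n    return f'The value of x is {x}'",
    shared_globals,
    my_next_dict,
)
x_value = my_next_dict["x_value"]

# x_value(None) on its own raises NameError; the rebound copy can see x.
bound_x_value = bind_namespace(x_value, my_next_dict)
```

The snapshot behaviour is a real limitation: if the namespace changes after binding, the handler keeps seeing the old values, so this is a sketch of the trade-off rather than a recommendation.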

Edit: Another idea - since this function receives the PointerEvent from the object that triggers it, it should be possible to get the namespace attribute from that object and use that as the namespace? That way the end user doesn't have to add any additional code themselves, but every user-event code will end up getting wrapped by this handler. I think this is promising.

@antocuni
Contributor

@JeffersGlass first of all, sorry for the long delay. Both @fpliger and I had many things going on, including the Anaconda homecoming and our participation in SciPy, but now we are back. And I hope you had an amazing honeymoon :).

That said, let's go back to the namespaces. We have many things to discuss, so I'll try to keep the discussion ordered. The ToC is:

  • what is the expected semantics
  • how to call them
  • how to implement them

What is the expected semantics

  • 'Namespaces' are implemented as a simple dict (which could later be the __dict__ of a module, or another mapping).
  • 'Namespaces' do not provide a (reasonable) way of loading/importing objects between them (but that functionality may be added later).

I agree. Since it's clear that there is no obvious agreement upon what a namespace should do, let's keep it simple, do the bare minimum for now and re-think about if and how to communicate between namespaces later.

  • The default 'namespace' is called something generic and unique to Pyscript (ie. not __main__, but that generic name could become an alias of __main__ or something else later).

Note that if we don't provide a way to communicate between namespaces, the name of the default one is irrelevant.
Also note that the global default namespace provided by pyodide has a __name__ which is equal to __main__.
So I propose to:

  1. keep the name of the default namespace as __main__ (as I said it is irrelevant for now, so let's not bother to change it)
  2. make sure that all namespaces have a __name__ attribute which contains their name; this is very pythonic IMHO
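A small plain-Python sketch of point 2, with exec standing in for the Pyodide call and new_namespace as a hypothetical helper name. It also shows what happens without an explicit entry: the lookup of __name__ falls through to the builtins module, which reports builtins.

```python
def new_namespace(name):
    """Hypothetical helper: a fresh globals dict that knows its own name."""
    return {"__name__": name}


unnamed = {}
named = new_namespace("my-first-namespace")

# With no __name__ key, the lookup falls back to the builtins module.
exec("reported = __name__", unnamed)

# With the key present, code sees the namespace's own name.
exec("reported = __name__", named)
```

Here unnamed["reported"] is "builtins" while named["reported"] is "my-first-namespace", which is why setting __name__ explicitly seems worthwhile.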

How to call them

  • Namespace: I personally like this name; per the Python Glossary: "The place where a variable is stored. Namespaces are implemented as dictionaries... Namespaces support modularity by preventing naming conflicts." That said, the same definition goes on to discuss how "namespaces make clear which module implements a function," and if we're not using modules, perhaps this is the wrong name, as you say @antocuni
  • Environment: From the Python Execution Model 4.2.2: "When a name is used in a code block, it is resolved using the nearest enclosing scope. The set of all such scopes visible to a code block is called the block’s environment." This seems to describe our situation, however I think the name may be too similar to the idea of virtual environments and the py-env system of importing modules and files.
  • Local(s): This name highlights the idea that the associated code is run with/defines a specific set of local names.
  • Scope: Again, emphasizes that the given code is run within a specific scope of names.

Thanks for the nice summary!
My opinion:

  • -1 for environment, because as you said it's too tied with the concept of virtualenvs and py-env
  • -1 for locals: python has already a concept of locals() which is different than what we are trying to achieve here: our namespaces are substitutes for the globals() (see later for an explanation of the difference and why you had troubles with proxies)
  • -0.5 for scope, but I cannot really explain why :)
  • +1 for namespace: my main issue with this name is if we want to use them as a code isolation feature. But the more I think about it the more I am convinced that we cannot isolate code in this way (for that, you really need two separate pyodide instances). So, I'm happy to call them namespaces as long as we don't advertise it as a security/safety feature, because that would be a lie. We should describe it for what it is, i.e. a way to organize your code, and for that namespace is a good name.

How to implement them part 1: why locals() is a bad idea

Using pyodide.pyodide_py.eval_code() instead of pyodide.runPython() allows one to pass in a separate dictionary to hold local variables.
[cut]

  • I've hit a bit of a snag with the way proxied functions interact with namespaces. In short, it seems proxied functions always execute with a global scope, and the only available local variable is the context that triggered the function.
    [cut]
    This makes a certain amount of sense from a JavaScript perspective, but it feels like a sharp corner from a Python perspective.

This is actually the correct/expected behavior even from the point of view of Python. In short, every python function is associated to its global scope, which is "the dictionary where to search for names which are not local variables".
From within a function, you can get it by calling globals(). From the outside, you can get it using myfunc.__globals__. By default, the __globals__ of a func is the same as the globals() at the place where it is defined.

So, when you do a name lookup it first checks the locals(), and then the globals().
When you define a new function, it keeps the globals() but gets its own locals() at every invocation.

So basically, what you are doing is equivalent to the following pure python code (no pyodide needed):

myglobals = {'x': 'inside myglobals'}
#myglobals = {}
mylocals = {'x': 'inside mylocals'}

exec("""
print(f"[1] at global level:  x = {x}")

def x_value():
    print(f"[2] inside x_value(): x = {x}")
""",
     myglobals,
     mylocals)

x_value = mylocals['x_value']
x_value()
print(x_value.__globals__ is myglobals)

Which has the following output:

$ python3 /tmp/foo.py
[1] at global level:  x = inside mylocals
[2] inside x_value(): x = inside myglobals
True

At point [1], python tries to search for x in mylocals first, and it finds it.
At point [2], python tries to search for x in the function's own locals() (which is empty), and when it doesn't find it searches inside its globals(), which is myglobals.
If you uncomment myglobals = {}, you will get a NameError inside x_value(), exactly as it happens in your example.
I agree that the end result is a bit confusing, but it's the Python semantics :).

Long story short: we don't want to use locals() for our namespaces because as you noticed it's not what you would expect. What we want is really that each namespace has its own globals().
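To make the contrast concrete, here is a sketch of the same kind of handler when the namespace dict is passed as the globals instead. It is plain Python with exec standing in for Pyodide; get_namespace is a hypothetical helper that also creates namespaces on first use.

```python
# Hypothetical registry: namespace name -> dict used as that namespace's globals.
namespaces = {}


def get_namespace(name):
    """Return the globals dict for a namespace, creating it on first use."""
    return namespaces.setdefault(name, {})


source = '''
x = 42

def x_value(_event=None):
    return f"The value of x is {x}"
'''

ns = get_namespace("demo")

# Only globals is passed, so top-level code and functions share one dict.
exec(source, ns)
callback = ns["x_value"]
```

Because callback.__globals__ is the namespace dict itself, calling the function later (for example from an event handler) still finds x.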

How to implement them part 2

I think that a lot of the complexity of this PR comes from the fact that you try hard to copy pyodide's globals into each namespace, but this is not really needed, because the pyodide.globals is mostly empty. Moreover, the pyodide FAQs contain the answer to our very precise question: How can I execute code in a custom namespace?

I tried the proposed solution in a simple HTML and it seems to just work:

    <script type="text/javascript">
      async function main(){
          console.log('start!')
          let pyodide = await loadPyodide();

          pyodide.runPython(`
              print("Default global namespace", __name__)
              print("id(globals()) =", id(globals()))
              for key in sorted(globals().keys()):
                  print("   ", key)
          `);

          console.log("---");
          let my_namespace = pyodide.globals.get("dict")();
          my_namespace.set('x', 42);
          pyodide.runPython(`
              print("Custom global namespace", __name__)
              print("id(globals()) =", id(globals()))
              for key in sorted(globals().keys()):
                  print("   ", key)

              def x_value( _ ):
                 print("click!")
                 print(f"The value of x is {x}")

              import js
              from pyodide import create_proxy
              js.document.getElementById("btn-x").addEventListener('click', create_proxy(x_value))
          `, { globals: my_namespace });
      }
      main();
    </script>

If I run it (and click on the button), I get the following output:

Python initialization complete
Default global namespace __main__
id(globals()) = 9269112
    __annotations__
    __builtins__
    __doc__
    __loader__
    __name__
    __package__
    __spec__
    pyversion
    version_info
---
Custom global namespace builtins
id(globals()) = 9632880
    __builtins__
    x
click!
The value of x is 42

So, I suggest to just use this solution to implement namespaces. As a side effect, we also get the bonus point that we can get rid of initNamespaces(), because it's no longer needed to initialize them at startup.

@JeffersGlass
Member Author

JeffersGlass commented Jul 28, 2022

Thank you @antocuni! I think we're getting quite close on this. And thank you for the very thorough explanation of the functionality of locals - now that you've explained it so clearly, I would agree that the functionality we want is for each namespace to have its own globals and to leave locals out of it.

And similarly, I think referring to the concept as Namespaces and using the attribute namespace=... on Pyscript specific tags seems like a decent way to move forward. I'll ensure that's the case across this PR.

Regardless of the solution, I do think initNamespaces() can and should be done away with - especially since it created all its namespaces at page-load, and the ability to create new namespaces at runtime is probably desirable. It was part of the original hack I was using to try to solve the following problem...


Implementation: Objects from Pyscript.py

To clarify the implementation issue: the issue I'm having is not one of wanting to copy pyodide.globals into each namespace, but of wanting access to the objects that Pyscript.py defines and initializes when Pyscript.js runs, and make them available in every namespace. (Element, PyScript, PyListTemplate, add_classes(), create(), etc..). Copying pyodide.globals after those objects were defined was just the original way I tried to achieve that.

Since on startup/pyodide load we immediately run pyodide.runPythonAsync(pyscript); (where "pyscript" is the contents of pyscript.py), those classes/functions/objects exist in the global namespace. What I've been struggling to find a solution for is how to allow code in other namespaces (i.e. using other dicts as their "globals" dict) to access those objects, so that a user can write, say, my_new_tag = Element('tag_id') in any namespace.

I had hoped that it would be possible to use a ChainMap or similar for the globals dict, to allow for chained lookup in both the unique "globals" dictionary of a namespace and the original global namespace, but from what I understand only a true dict can be used as the globals argument to pyodide.runPython(), as with exec.
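The restriction is easy to reproduce in plain CPython, shown here with exec rather than pyodide.runPython. The dict names and the string stand-in for Element are made up for the illustration.

```python
from collections import ChainMap

# Stand-in for the names that pyscript.py leaves in the original global scope.
page_globals = {"Element": "stand-in for pyscript.Element"}
namespace = {}
chained = ChainMap(namespace, page_globals)

# A ChainMap is rejected as the globals argument: exec requires a real dict.
try:
    exec("tag = Element", chained)
    globals_error = None
except TypeError as exc:
    globals_error = exc

# As the locals argument any mapping is accepted, so the chained lookup works,
# and the assignment is written to the first map only.
exec("tag = Element", {}, chained)
```

The second call only helps for top-level statements; functions defined in that code would still resolve names through their globals, which is the locals problem discussed earlier in the thread.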


Further Possible Solutions

I suppose one solution would be: whenever a new namespace needs to be created (i.e. evaluate() is called with a new namespace attribute), we could do pyodide.runPythonAsync(pyscript); in that new namespace before executing user code. This would allow the end user to make use of the classes/functions/variables that are in Pyscript.py.

Another might be adding those classes/functions to builtins, so that each namespace has access to them?
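A minimal sketch of the builtins idea in plain Python, using a stand-in Element class rather than the real one from pyscript.py, and exec in place of the Pyodide call.

```python
import builtins


class Element:
    """Stand-in for the real pyscript Element class."""

    def __init__(self, element_id):
        self.id = element_id


# Publishing the class on the builtins module makes it resolvable from any
# globals dict, because failed global lookups fall back to the builtins.
builtins.Element = Element

fresh_namespace = {}
exec("my_new_tag = Element('tag_id')", fresh_namespace)

# Undo the interpreter-wide side effect once the demo is done.
del builtins.Element
```

The catch is visible in the cleanup line: names added this way are seen by every module in the interpreter, not only by PyScript namespaces.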

Another might be (and would want you to weigh in @fpliger, since this very much stretches the boundaries of this PR) that Pyscript.py doesn't get run unless imported by the user in a particular namespace. The user would have to use, for example, from pyscript import Element before using the Element class, and so on for the other classes/functions in Pyscript.py. This feels like an interesting route to me, since it makes it explicit that there are additional classes/functions to be used in this context. Though, some parts of that script (OutputManager?) might still need to run to ensure the desired behavior of things like stdout?


Does this all make sense? I feel I've done an imperfect job of laying out the problem I've been trying to solve - and perhaps there is a cleaner or more Pythonic way to do this.

@antocuni
Contributor

antocuni commented Aug 3, 2022

To clarify the implementation issue: the issue I'm having is not one of wanting to copy pyodide.globals into each namespace, but of wanting access to the objects that Pyscript.py defines and initializes when Pyscript.js runs, and make them available in every namespace. (Element, PyScript, PyListTemplate, add_classes(), create(), etc..). Copying pyodide.globals after those objects were defined was just the original way I tried to achieve that.

ah ok, I understand what the problem is now, and I understand why you wanted to fix it with initNamespaces(). It makes sense, given the current state of things.

Since on startup/pyodide load we immediately run pyodide.runPythonAsync(pyscript); (where "pyscript" is the contents of pyscript.py), those classes/functions/objects exist in the global namespace.

yes, I think that the current way of evaluating pyscript is problematic and should be improved.
For example, I faced a similar problem in #642: the contents of the pyscript module are available inside the global namespace (aka __main__) but I wanted them to be available as import pyscript, so I added a hack:
https://github.com/pyscript/pyscript/pull/642/files#diff-f6c598af7fa6ff27144b5cc53370b3af217597392db136c6c14eb1c7a14bd529R455-R460

But here you are facing a similar problem, so I start to think that we should solve the problem once and for all: we should evaluate the content of the source file into a proper pyscript module.
Then, we could have a function which does the equivalent of from pyscript import * when we create a new namespace. So, something along these lines (untested, but you get the idea):

import pyscript from './pyscript.py';

async function loadInterpreter() {
    await pyodide.runPythonAsync(`
def _create_module(src):
    import sys
    import types
    mod = types.ModuleType('pyscript')
    exec(src, mod.__dict__)
    sys.modules['pyscript'] = mod

def new_namespace():
    import pyscript
    ns = {}
    ns.update(pyscript.__dict__)
    return ns
`)
    let _create_module = pyodide.globals.get('_create_module');
    _create_module(pyscript);
}

Another might be (and would want you to weigh in @fpliger, since this very much stretches the boundaries of this PR) that Pyscript.py doesn't get run unless imported by the user in a particular namespace. The user would have to use, for example, from pyscript import Element before using the Element class, and so on for the other classes/functions in Pyscript.py. This feels like an interesting route to me, since it makes it explicit that there are additional classes/functions to be used in this context.

You are pushing at an open door here. I also think that pyscript puts too many things in the global namespace and that we should do less.
We are trying to strike a balance between cleanliness and ease of use, and the hard part is to find the right tradeoff; usually @fpliger and I are on opposite sides of the balance 😂

I think a good starting point could be:

  • all APIs are inside the pyscript module by default
  • import pyscript is done automatically, so that people can do e.g. pyscript.Element without having to import it explicitly
  • for a selected small set of names, we also put them in the global namespace, so that you can access them directly without using pyscript.; for example, we are going to have a function called render() to be used instead of print(), and this should probably be in the global namespace.

Anyway, this is probably OT with this PR and warrants its own discussion.

Does this all make sense? I feel I've done an imperfect job of laying out the problem I've been trying to solve - and perhaps there is a cleaner or more Pythonic way to do this.

It makes a lot of sense, thank you again for all the efforts and useful insights.

@fpliger
Contributor

fpliger commented Aug 3, 2022

Thanks @JeffersGlass and @antocuni for moving the conversation ahead. This is really important and forces us to start taking some design decisions that, honestly, are needed asap anyway.

Focusing on one of the major sources of problems, pyscript.py and what to [force] load into the scope: as @antocuni mentioned, he and I are often on opposite sides of the balance 😆 but what I think we can agree on is that we need to improve it. I think @antocuni is of the view that it should be fully removed from the scope and that it should be a module. I'm convinced that, yes, a bunch of things in there should be moved away and probably scoped in their own module, but others should already be loaded in the scope (to focus on novice users). We should also give users the possibility to opt out of that and not have them loaded by default, through a config option.

Imho, the exercise we should do here is to sit down, think about what the layout of this higher-level API should be, and make a concrete proposal. With that down, the work on the namespaces design/implementation will be easier. I think @antocuni's proposal above is in the right direction.

@antocuni
Contributor

antocuni commented Aug 3, 2022

Focusing on one of the major sources of problems, pyscript.py and what to [force] load into the scope: as @antocuni mentioned, he and I are often on opposite sides of the balance 😆 but what I think we can agree on is that we need to improve it. I think @antocuni is of the view that it should be fully removed from the scope and that it should be a module. I'm convinced that, yes, a bunch of things in there should be moved away and probably scoped in their own module, but others should already be loaded in the scope (to focus on novice users).

Strangely enough we agree here :).
I propose the following:

  1. move all the content which is currently in the globals inside a proper pyscript module
  2. make the pyscript module available by default
  3. make a few selected functions/classes of pyscript also available globally.

I bet the disagreement will come when we have to decide what to include in (3). I propose to start from nothing, and then add things when and only when we find an example use case which hugely benefits from it. E.g., render() will surely need to be a global, as soon as we introduce it.
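The three-step plan above can be sketched in plain Python. This is only an illustration under stated assumptions, not PyScript's implementation: the `pyscript` module object, the `Element` and `render` stand-ins, the `EXPOSED_GLOBALS` tuple, and the `make_user_globals` helper are all hypothetical names made up for this example.

```python
import types

# Hypothetical stand-in for the real pyscript module (step 1: everything lives here).
pyscript = types.ModuleType("pyscript")


class Element:
    """Placeholder for PyScript's Element class; only stores an id."""

    def __init__(self, element_id):
        self.id = element_id


def render(value):
    """Placeholder for the proposed render(); here it just returns a string."""
    return str(value)


pyscript.Element = Element
pyscript.render = render

# Step 3: the short, deliberately small list of names that are also made global.
EXPOSED_GLOBALS = ("render",)


def make_user_globals(module, exposed=EXPOSED_GLOBALS):
    """Build the globals dict that user code would be executed in."""
    # Step 2: the module itself is available without an explicit import.
    namespace = {"__name__": "__main__", module.__name__: module}
    for name in exposed:
        namespace[name] = getattr(module, name)
    return namespace


user_globals = make_user_globals(pyscript)
# User code reaches Element through the module, and render() directly.
exec("el = pyscript.Element('out'); text = render('hi')", user_globals)
```

With this layout, adding a name to the global scope is a one-line change to `EXPOSED_GLOBALS`, which keeps the "start from nothing, add only when justified" policy easy to audit.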

We should also give users the possibility to opt out of that and not have them loaded by default, through a config option.

I'm unsure about this. If we keep the list at point (3) small enough, we probably don't need it.

Imho, the exercise we should do here is to sit down, think about what the layout of this higher-level API should be, and make a concrete proposal. With that down, the work on the namespaces design/implementation will be easier. I think @antocuni's proposal above is in the right direction.

If you agree with the plan above, I will start a PR

@fpliger
Contributor

fpliger commented Aug 3, 2022

Yeah, agree with the above (and that 3 is probably where we need to open a beer to ease the conversation 😆 )

I'd be +1 on starting this and only propose to start with exposing Element, since it's probably the main class that may be exposed and used around. I'd also add a deprecation warning in the console, saying that it will not be available in globals in the next release and that users will need to import Element from something
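One way the deprecation warning could look, as a rough sketch only: the global `Element` is replaced by a thin wrapper that warns whenever it is used. The `deprecated_global` helper, the `Element` stand-in, and the `pyscript` import path in the message are assumptions for illustration, and Python's `warnings` module is used here in place of whatever console logging PyScript would actually choose.

```python
import warnings


class Element:
    """Placeholder for PyScript's Element class; only stores an id."""

    def __init__(self, element_id):
        self.id = element_id


def deprecated_global(obj, name, new_home):
    """Return a callable that forwards to obj but warns about the global name."""

    def wrapper(*args, **kwargs):
        warnings.warn(
            f"global '{name}' is deprecated and will be removed from globals; "
            f"use 'from {new_home} import {name}' instead",
            DeprecationWarning,
            stacklevel=2,
        )
        return obj(*args, **kwargs)

    return wrapper


# The user's global scope gets the warning wrapper instead of the class itself.
user_globals = {"Element": deprecated_global(Element, "Element", "pyscript")}

with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")  # make sure the warning is not filtered out
    exec("el = Element('output')", user_globals)
```

Because the wrapper is a function rather than the class, checks such as `isinstance(x, Element)` against the global name would fail during the deprecation window; that tradeoff would need to be weighed in a real implementation.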

I think you also proposed a better modularization of what pyscript provides, which I agree with. So it may even make sense to move Element somewhere else better thought/designed (i.e. a new pyhtml or pyscript.html or something similar)

Opened #659 to track this work/discussion

@fpliger fpliger mentioned this pull request Aug 3, 2022
3 tasks
@JeffersGlass
Member Author

Thank you both very much - agreed, with some definitive answers around what is and isn't included in scope by default, the implementation and interface of namespaces should become easier to specify. I have my own opinions on the [approachable for novices] vs [cleanliness of scope] spectrum, but I'll leave that to the discussion in #659.

Since it's possible that the new namespace implementation looks nothing like what's in this PR, I'll close this, in preference to centering the discussion in the other issue.

@antocuni
Contributor

antocuni commented Aug 4, 2022

Thank you both very much - agreed, with some definitive answers around what is and isn't included in scope by default, the implementation and interface of namespaces should become easier to specify. I have my own opinions on the [approachable for novices] vs [cleanliness of scope] spectrum, but I'll leave that to the discussion in #659.

I would love to hear your thoughts about it.

Since it's possible that the new namespace implementation looks nothing like what's in this PR, I'll close this, in preference to centering the discussion in the other issue.

thank you for pushing this conversation. I think it helped a lot to clarify what we want, what we don't and what are the various tradeoffs.
It might be the "best non-merged PR ever" :)

@fpliger
Contributor

fpliger commented Aug 4, 2022

Thank you @JeffersGlass , looking forward to your feedback in #659 !

It might be the "best non-merged PR ever" :)

lol, totally agree!

@pauleveritt

I know this is closed and #659 is the new home. But I wanted to argue about something left behind. 😀

I'm approaching this from a premise: Python code should look and run like Python code. I worry that we're creating a Python-inspired thing called PyScript.

Thus, I like the idea of magic globals disappearing, and you have to import stuff to access it. No more red-squiggles in my editor.

This means that I also like the discarded idea of modules. IMO, we should try harder to stick to module semantics, and possibly implementation. The more we can move from the JS side to the Python side, the better. PyScript code should "run on the server" as well as in the browser.

Sharing state might be a different thing. But web frameworks have thought about this forever due to threading. I think we'll ultimately come up with a channel/state system that is well above both modules and namespaces. #642 by @antocuni matches some of my experiments and notes.

Last point: let's keep an open mind about "module" and "package". Wheels, zipapps, even importing from database files. I watched this PyCon talk and the last sentence in that writeup was pretty mind-blowing. But it still stayed inside Python semantics.

I will re-re-read this conversation as I realize there are a TON of gotchas that @JeffersGlass and @antocuni have been considering, so I'm likely naive.

Labels
tag: component Related to PyScript components tag: interpreter Related to the Python interpreter configuration waiting on feedback Issue or PR waiting on feedback from core team

6 participants