Skip to content
This repository has been archived by the owner on Mar 23, 2023. It is now read-only.

'print' statement ignores 'sys.stdout' #290

Open
alanjds opened this issue Apr 22, 2017 · 29 comments
Open

'print' statement ignores 'sys.stdout' #290

alanjds opened this issue Apr 22, 2017 · 29 comments

Comments

@alanjds
Copy link
Contributor

alanjds commented Apr 22, 2017

Replacing sys.stdout have no effect over print statement.

This test script outputs AssertionError: 0 chars printed, instead of 3, after printing 'foo' in the stdout.

import StringIO
import sys

sio = StringIO()
stdout = sys.stdout
sys.stdout = sio

print 'foo'

sys.stdout = stdout

chars = sio.tell()
assert chars == 3, '%s chars printed, instead of 3' % chars
@trotterdylan
Copy link
Contributor

Effectively, the sys module does this:

from __go__.grumpy import Stdout

stdout = Stdout

The Print() function in the Go runtime writes directly to grumpy.Stdout. It does not get it from the sys module. So things work fine until you re-bind sys.stdout, as you've found.

There are a couple ways to fix this:

  1. Write the sys module in Go and have the grumpy package import it at package init time so that the Print() function can access the sys module and write to the stdout attribute instead of a global Go variable. This is close to how CPython works. However, I think it would require adding some special casing into the import system, which could get ugly.

  2. Make the sys module an instance of some class where stdout is a property whose setter updates the value of Stdout in the grumpy runtime package. Because there's no way to assign to Go package variables from Python at the moment, this would require adding a SetStdout() function to the grumpy runtime package, which feels a little hacky.

In short, there's no great way to fix it at the moment, however, I think some avenues will open up once Grumpy is self-hosted.

@alanjds
Copy link
Contributor Author

alanjds commented Apr 23, 2017

What about implementing the print statement as a pure-python function, and hook the golang Print function to the python one?

I will need to reimplement the print stuff very soon anyway...

@trotterdylan
Copy link
Contributor

trotterdylan commented Apr 23, 2017

If, for example, you define print() in __builtins__.py then you have the problem of accessing the __builtins__ Go package from within the grumpy runtime package, because they each would depend on each other.

@alanjds
Copy link
Contributor Author

alanjds commented Apr 23, 2017 via email

@trotterdylan
Copy link
Contributor

trotterdylan commented Apr 23, 2017 via email

@alanjds
Copy link
Contributor Author

alanjds commented Apr 23, 2017

I just found the Print() function, and it should not rely on 'file' arg being a File anyway, otherwise "print" will never accept a duck-typed file-like, in my opinion.

For now, my plan is to factor it out from builtins, even if it is always loaded

Will draw some sketches on dead-trees to help with the circular dependencies. Maybe lazy-importing sys.Stdout, with a local cache to not penalize the most used case. I like it not, however.

@trotterdylan
Copy link
Contributor

trotterdylan commented Apr 23, 2017 via email

@cclauss
Copy link
Contributor

cclauss commented Apr 24, 2017

Are there a test cases for Python scripts that start with from future import print_function? That import makes the printer statement, the >> and the trailing comma all invalid. It also enables the end function parameter.

@alanjds
Copy link
Contributor Author

alanjds commented Apr 24, 2017

@cclauss Yeah, but is a flag already handled by the compiler. I remember to saw some ifs in the compiler code about this. Print from "print statement" and from "print function" are different code paths, per file compiled, IIUC.

@alanjds
Copy link
Contributor Author

alanjds commented Apr 24, 2017

@trotterdylan: Thinking in Print() check for 'sys' in SysModules and, if found, use 'sys.stdout'. If not found, fallback to Stdout as the 'sys' was not imported and sys.stdout is sure to be Stdout anyway.

@trotterdylan
Copy link
Contributor

Although that solution should work (if sys was not imported, then there's no way for someone to override it from Python), it's not very satisfying IMO. It's pretty easy to imagine a scenario where this logic would diverge from CPython (probably involving messing around with sys.modules). Although these scenarios are far fetched, I'm more worried that this implementation divergence from CPython will cause other, more subtle problems. I think the right approach is to make the sys module "built in" in a similar way to CPython.

A variation on option 1. above is to do exactly what CPython does and define the sys module within the runtime itself: https://github.com/python/cpython/blob/2.7/Python/sysmodule.c

This would still require some changes to the import system to special case sys, but it's probably the best option we have available in the short term.

@corona10
Copy link
Contributor

corona10 commented Apr 25, 2017

@trotterdylan I definitely agree with your idea. If we need modules of python which are implemented by C-level API. At that case, We should implement it by Go also.

@alanjds
Copy link
Contributor Author

alanjds commented Apr 25, 2017

I do not know if I like this path.

I remember of news of some conference (could not find the reference, sorry) where people of different Python "runtimes" including CPython agreed that the best would be to write the stdlib in pure-python if possible, falling back to C-API only for serious speed needs or low-level access needs.

In the long term, do you expect to be easier to maintain a pure-python sys or a Go sys?
For now, the pure-python code have 73 LOC. The CPython one have 1801 LOC.
And I do not see a huge speed improvement on this path.

Another point is that, as we are borrowing PyPy and Ouroboros code, they could take it back and share the maintenance burden with us, but only if our modules are useful to him.

For now, my view of the thing is to implement whatever possible in pure-python (will be transpiled to Go anyway...), not worrying with implementation divergence from CPython. The result could be simpler and easier to maintain. And more useful to other pals too (PyPy for exemple).

@alanjds
Copy link
Contributor Author

alanjds commented Apr 25, 2017

Be clear that it is my philosophical perspective, more aligned with PyPy "view". But a view like "lets transpile CPython code to Go, whatever it is a good code or not" will sure work too, and will easily emulate the goods and bads there. Including maintenance ones.

@alanjds
Copy link
Contributor Author

alanjds commented Apr 25, 2017

Please confirm your spirits. I can wait or can give a shot, even if to change it later. just confirm.

@trotterdylan
Copy link
Contributor

I agree with the spirit of your argument, but our sys.py would not be portable Python anyway, since it has to use __go__ imports to function properly. The sys module is kind of special in that it is tightly integrated with the interpreter state (default encoding, sys.modules, stdin/out/err, etc.)

That said, I believe that with more advanced native integration we can achieve what we want here while keeping the sys module in Python. Option 2. above is a start in this direction. Once we can assign to Go package variables from Python, we could get rid of the SetStd*() functions in the grumpy package and assign directly to the handles in the Go os package. That's a really nice end state because it means that Go and Python code will share the same std* handles.

@alanjds
Copy link
Contributor Author

alanjds commented Apr 25, 2017

our sys.py would not be portable Python anyway, since it has to use __ go__ imports to function properly

PyPy solved it by having non-portable modules "native", something alike:

# sys.py (portable)
from _sys import stdout

# _sys.py (non-portable)
from __go__.os import Stdout
from __go__.grumpy import NewFileFromFD
stdout = NewFileFromFD(Stdout)

Now going on Option 2) variant...

Once we can assign to Go package variables from Python...

I do not see a good interface for it to happen. Go variables are not ObjectType instances. The far I can see is Grumpy package variables being assigned from Python, as you said.

import __go__.os as go_os
import __go__.grumpy.sys as grumpy_sys

go_os.Stdout = StringIO()   # Will never work I guess?
grumpy_sys.Stdout = StringIO()   # May work.

This way, everything in Grumpy (including Print()) should use grumpy.sys.Stdout, and users would overwrite grumpy.sys.Stdout. Still do not know how to proxy changes to sys.stdout to grumpy.sys.Stdout. Some hack would be needed.

@alanjds
Copy link
Contributor Author

alanjds commented Apr 26, 2017

For "some hack", I tried to overwrite setattr of a module in CPython and it let me but have no effect in attribution. Seems as a limitation of CPython that Grumpy could ignore and be more flexible.

@trotterdylan
Copy link
Contributor

Re: go_os.Stdout = StringIO()

Yes, this is a tricky case that you'd probably need to jump through hoops to make work with os.Pipe() or something like that. That perhaps creates more problems than it solves.

Re: __setattr__ for modules:

One approach that should work is to a) define a descriptor (i.e. a property) for std* attributes and b) swap out sys.modules['sys'] with a custom class. Something along these lines:

# sys.py
from __go__.grumpy import GetStdout, SetStdout, SysModules

class _SysModule(object):
  def _get_stdout(self):
    return GetStdout()
  def _set_stdout(self, f):
    SetStdout(f)
  stdout = property(_get_stdout, _set_stdout)

modules = SysModules
modules['sys'] = _SysModule()

Then basically everything in the sys module would have to live on the _SysModule class.

@alanjds
Copy link
Contributor Author

alanjds commented Apr 26, 2017

...And that is the hack we need to grumpy.sys map sys.py !!

But after attr_mode:rw got merged, GetStdout/SetStdout would become not needed. Or even better, maybe a __dict__ on grumpy.sys would allow this (and any) attributions transparently.

# sys.py
from __go__.grumpy import SysDict, SysModules

__all__ = ['every', 'func', 'implemented', 'in', 'python']

class _SysModule(object):
  def __init__(self):
    for k in __all__:
      SysDict[k] = locals()[k]
  def __setattr__(self, name, value):
    SysDict[name] = value
  def __getattr__(self, name):
    return SysDict.get(name)

modules = SysModules
modules['sys'] = _SysModule()

@trotterdylan
Copy link
Contributor

Cool, yeah, something like that would work.

I wonder though whether this is getting more hacky than just implementing the whole thing in Go. The Python code in sys.py is pretty far from portable at that point.

@alanjds
Copy link
Contributor Author

alanjds commented Apr 27, 2017

I could accept a 10-line hack for every hybrid module in the core... Maybe factored out to the importer if occur too often.

Will give a shot and you can decline the PR later if it gets too messy, ok?

@trotterdylan
Copy link
Contributor

trotterdylan commented Apr 27, 2017 via email

@alanjds
Copy link
Contributor Author

alanjds commented Apr 28, 2017

https://github.com/google/grumpy/blob/master/runtime/object.go#L127

Let me guess: __getattr__ and __setattr__ support are yet to be implemented.

Will open another issue/branch and freeze this one.

@trotterdylan
Copy link
Contributor

You are correct. Those have not yet been implemented. It should be pretty straightforward to add though.

@alanjds
Copy link
Contributor Author

alanjds commented Apr 28, 2017

(after send a WIP PR here)

@alanjds
Copy link
Contributor Author

alanjds commented Apr 28, 2017

Yes. Will just do in another branch/issue

@alanjds
Copy link
Contributor Author

alanjds commented Apr 29, 2017

...and what is my surprise to discover that __getattribute__ is indeed implemented!
Changed to use it instead of __getattr__ and it worked.

Now will continue the fix from the #302 base branch.

@alanjds
Copy link
Contributor Author

alanjds commented May 1, 2017

Fix proposal: #304.

Needs #302 (Replaceable sys.stdout via SysmoduleDict) merged first.

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

No branches or pull requests

4 participants