Write http.client in terms of Web APIs #140

Open
mdboom opened this issue Sep 5, 2018 · 21 comments
Labels
enhancement New feature or request help wanted Extra attention is needed roadmap

Comments

@mdboom
Collaborator

mdboom commented Sep 5, 2018

This might not even be possible given blocking issues.

However, if we could write http.client in terms of Web APIs, we might be able to get things like pip partially working. As it stands, Python libraries built on top of raw sockets don't (can't) work.
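
A rough sketch of what "http.client in terms of Web APIs" could look like, assuming a blocking transport built on synchronous `XMLHttpRequest` reached through Pyodide's `js` module. The names `SketchHTTPConnection`, `SketchResponse`, and `xhr_transport` are hypothetical and cover only a tiny subset of the real `http.client` interface; binary responses, streaming, and browser restrictions on synchronous requests are not handled here.

```python
from typing import Callable, Dict, Optional, Tuple

# Hypothetical transport signature: (method, url, headers, body) -> (status, text)
Transport = Callable[[str, str, Dict[str, str], Optional[str]], Tuple[int, str]]


def xhr_transport(method, url, headers, body):
    """Blocking transport using synchronous XMLHttpRequest (Pyodide only)."""
    from js import XMLHttpRequest  # deferred import: exists only in the browser

    xhr = XMLHttpRequest.new()
    xhr.open(method, url, False)  # third argument False = synchronous request
    for name, value in headers.items():
        xhr.setRequestHeader(name, value)
    xhr.send(body)
    return xhr.status, xhr.responseText


class SketchResponse:
    """Hypothetical stand-in for http.client.HTTPResponse (text bodies only)."""

    def __init__(self, status: int, text: str):
        self.status = status
        self._text = text

    def read(self) -> bytes:
        return self._text.encode("utf-8")


class SketchHTTPConnection:
    """Hypothetical subset of http.client.HTTPConnection that uses no sockets."""

    def __init__(self, host: str, transport: Transport = xhr_transport,
                 scheme: str = "https"):
        self.host = host
        self._transport = transport
        self._scheme = scheme
        self._pending = None

    def request(self, method, path, body=None, headers=None):
        # Only record the request; it is sent when getresponse() is called.
        url = f"{self._scheme}://{self.host}{path}"
        self._pending = (method, url, dict(headers or {}), body)

    def getresponse(self) -> SketchResponse:
        if self._pending is None:
            raise RuntimeError("request() must be called before getresponse()")
        status, text = self._transport(*self._pending)
        self._pending = None
        return SketchResponse(status, text)
```

Making the transport injectable keeps the class testable outside the browser: any callable with the same signature can stand in for `xhr_transport`.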

@mdboom mdboom added the enhancement New feature or request label Sep 5, 2018
@anshuldutt21

Hi, I would like to contribute to this issue.

@rth
Member

rth commented Jun 5, 2021

If I'm not mistaken, @kikocorreoso mentioned that another possibility could be to look at the work done in Brython, where some of these standard library modules might be re-implemented in JavaScript.

I quickly checked: according to the git commit messages, http.client there is identical to upstream, while _socket.py doesn't do anything, but maybe I'm missing something. Also, brython-dev/brython#1032 (comment) suggests that there are no replacements for low-level network connectivity. I also couldn't find anything about it in Skulpt.

Anyway it would indeed be a good idea to read earlier discussion on this subject in the issue tracker of these projects.

@kikocorreoso

@rth my comment was more in the vein of using JS libs instead of Python libs when it makes sense. Some of them may already be implemented in Brython, Batavia, Skulpt, etc., and could be reused in some way.

For instance, re was very slow as it was implemented in pure Python in Brython so @PierreQuentel reimplemented the functionality in JS.
brython-dev/brython#1519

I suppose re is in WASM in Pyodide, so maybe this example is not very useful. I was thinking more of pure Python libs that have been rewritten in JS to adapt some behaviour to the browser, for performance reasons, etc.

I don't know if this could help in terms of "DoNotReinventTheWheel", performance,...

@rth
Member

rth commented Jun 8, 2021

> this could help in terms of "DoNotReinventTheWheel"

Yes, absolutely. Thanks for your comment! We should definitely look at what could be used/adapted in JS before implementing stuff :)

@hoodmane
Member

hoodmane commented Jun 8, 2021 via email

@datakurre

@hoodmane I’d like to check out and try the part of the http.client API that you implemented, but I was unable to find the correct branch. Would you be able to link your version here?

@hoodmane
Member

Yeah, the actual partial http_client implementation is here:
https://github.com/hoodmane/pyodide/blob/comlink-demo/src/pyodide-py/pyodide/http_client.py

It uses a comlink fork which I have here:
https://github.com/hoodmane/pyodide/tree/comlink-demo/comlink

The actual demo is here:
https://github.com/hoodmane/pyodide/tree/comlink-demo/demos/syncio

I can't remember how well this stuff works. My plan is to work on the comlink port in this separate repository:
https://github.com/hoodmane/synclink
I suppose it would be good to make a comparable demo that uses that repo for the comlink fork.

@datakurre

@hoodmane Thank you for the links. Unfortunately, it ended up being too much for me to get working within the time I had, but at least it took me through learning how to build a working pyodide from source 💪

@rth
Member

rth commented Nov 13, 2021

Interesting work @hoodmane ! So what do you think should be next steps on this?
Threading #237 doesn't look that far away, most major browsers now support it I think.

For this comlink demo, I think it would help to make this a bit more visible? Maybe move some of it to the pyodide org?

Otherwise, taking a different approach: isn't there some proxy that could change the MIME type of a response from binary to text/plain, so that we could still fetch it with pyodide.open_url? It could be either an external proxy or even a service worker. Though I guess the latter, even if it works, is not very different in complexity from running a web worker.

@ricardoprins
Contributor

I'm just too lazy to read everything - I confess.

Since this (and, consequently, #398) has a significant impact on pyscript (I'm surfing the hype as well), I want to help get this done. So, what are the necessary steps to finish this task (and thereby indirectly solve the requests issue)?

I wanna help, but I want to understand the "bigger picture" first.

@hoodmane
Member

@ricardoprins can we set up a meeting?

@ricardoprins
Contributor

Sure, that would be great.

@hoodmane
Member

hoodmane commented May 23, 2022

It's weird that GitHub has no DM feature. I guess you could use private repos for that purpose as a hack.

@iuriguilherme

I have a question. Why make it blocking when there's aiohttp?

@rth
Member

rth commented Jun 10, 2022

Because there are a lot of libraries that are sync and use http.client (either directly or via requests, etc.) and won't be able to use aiohttp or another async function as a replacement. That would change if #2664 were implemented, but it would take time.

For the cases where async use is possible, we have added pyodide.http.pyfetch, which has a somewhat similar API (but without the session context manager).
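
The sync/async mismatch described above can be illustrated with plain CPython asyncio. This is only an analogy, not Pyodide's actual event loop: `fake_fetch` is a hypothetical stand-in for an async HTTP call such as pyfetch, and `sync_library_call` plays the role of a synchronous API like http.client that has to return a value rather than a coroutine.

```python
import asyncio


async def fake_fetch(url: str) -> str:
    """Hypothetical async HTTP call; stands in for something like pyfetch."""
    await asyncio.sleep(0)
    return f"response from {url}"


def sync_library_call(url: str) -> str:
    """A synchronous API that tries to block until the async call finishes."""
    coro = fake_fetch(url)
    loop = asyncio.get_running_loop()
    try:
        # Blocking on a loop that is already running is not allowed.
        return loop.run_until_complete(coro)
    except RuntimeError:
        coro.close()  # avoid a "coroutine was never awaited" warning
        raise


async def main() -> str:
    # The caller is already inside the event loop, as browser code always is.
    try:
        return sync_library_call("https://example.com")
    except RuntimeError as exc:
        return f"RuntimeError: {exc}"


result = asyncio.run(main())
print(result)
```

A synchronous function called from inside a running loop has no way to wait for async work without giving control back to that loop, which is why sync libraries cannot simply swap in an async client.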

@rtpg

rtpg commented Sep 2, 2022

Serious question: while synchronicity is a problem for JS because of the interaction model with the top level, if Pyodide's code evaluation entrypoints were all async (that is to say, runCode is also async), then the Python-level code could all be synchronous and things could be papered over with Asyncify at the Python/C FFI layer, maybe? After all, CPython itself already has a similar yielding concept in place.

Given there is in theory full control of the Python VM here I want to believe there is a way forward that doesn't involve too much pain.

@hoodmane
Member

hoodmane commented Sep 2, 2022

In order for Asyncify to work, all C, C++, Rust, Fortran, etc. code would have to be linked with it, both in the main module and in side modules. I think the performance cost would be significant, and we would probably have to find and fix bugs in Asyncify. If someone does this and profiles it to be okay for performance, we might consider it. But I think the costs are too high.

@twinsant

So, what's the progress?

@ross-spencer

ross-spencer commented Jan 27, 2023

Can I ask a question about security?

My understanding currently is that if you try and POST data you'll get an error such as:

Traceback (most recent call last):
  File "<console>", line 1, in <module>
  File "/lib/python3.10/urllib/request.py", line 216, in urlopen
    return opener.open(url, data, timeout)
  File "/lib/python3.10/urllib/request.py", line 519, in open
    response = self._open(req, data)
  File "/lib/python3.10/urllib/request.py", line 536, in _open
    result = self._call_chain(self.handle_open, protocol, protocol +
  File "/lib/python3.10/urllib/request.py", line 496, in _call_chain
    result = func(*args)
  File "/lib/python3.10/urllib/request.py", line 1377, in http_open
    return self.do_open(http.client.HTTPConnection, req)
  File "/lib/python3.10/urllib/request.py", line 1351, in do_open
    raise URLError(err)
urllib.error.URLError: <urlopen error [Errno 23] Host is unreachable>

If this feature is implemented, will it be entirely up to the underlying library to "promise" not to forward data loaded into the browser to another source? Are there other controls on this type of issue?

@rth
Member

rth commented Jan 27, 2023

> Can I ask a question about security?

Sure. That error doesn't mean you cannot make that POST request, only that you can't make it with urllib. Making it with pyodide.http.pyfetch (or via JS functions) would work. So generally libraries can make arbitrary network connections both when running on host Python and in the browser.

You can whitelist the allowed domains in the browser with CORS apparently.
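
A minimal sketch of the pyfetch route mentioned above, assuming that pyfetch forwards its keyword arguments (`method`, `headers`, `body`) to the browser's `fetch()`. The helper names `build_post_options` and `post_json` are hypothetical, and `post_json` can only run inside Pyodide, so the import is deferred.

```python
import asyncio
import json


def build_post_options(payload: dict) -> dict:
    """Keyword arguments intended to be forwarded to the browser's fetch()."""
    return {
        "method": "POST",
        "headers": {"Content-Type": "application/json"},
        "body": json.dumps(payload),
    }


async def post_json(url: str, payload: dict):
    """POST a JSON payload from Pyodide and return (status, decoded JSON)."""
    from pyodide.http import pyfetch  # deferred import: Pyodide only

    response = await pyfetch(url, **build_post_options(payload))
    return response.status, await response.json()
```

In a Pyodide session this would be used as `status, data = await post_json(url, {"key": "value"})`; whether the request succeeds still depends on the target server's CORS configuration.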

@alekssamos

alekssamos commented May 20, 2023

Since PHP and WordPress can run this way, maybe sockets are possible?

I found a project where they compile PHP and SQLite and run WordPress:
https://wordpress.wasmlabs.dev/

I think PHP can use sockets somehow.
Otherwise, how does the browser interact with this PHP?
Maybe sockets can still be added to pyodide?
Then libraries such as urllib, requests, aiohttp, and httpx would work.
Or is that impossible, so that it can only be done through JS?
And can WebSockets (ws) be used?

What do you think about it?

Yes, I read the FAQ (1, 2, 3) where this is mentioned, but since I found PHP + WordPress, I wanted to ask again.
I searched here, and there have already been similar topics.
So, are these limitations of the WebAssembly virtual machine itself, of the browser, or only of pyodide?
