Add a ProxyResponse class to fastapi_proxy_lib #15

Closed
gjmhmm8 opened this issue Dec 2, 2023 · 21 comments · Fixed by #22
Labels
enhancement New feature or request

Comments

@gjmhmm8

gjmhmm8 commented Dec 2, 2023

I would like to request a new feature for fastapi_proxy_lib, which is a library that allows fastapi users to easily create proxy endpoints. The feature is a ProxyResponse class that can be used as a return value for fastapi routes. The ProxyResponse class would take care of sending a request to a target URL, receiving the response, and forwarding it to the client. The ProxyResponse class would also allow the user to customize the request and response headers, as well as the response content, by passing optional arguments or a callback function.

Example

Here is an example of how the ProxyResponse class could be used in a fastapi app:

import fastapi
from fastapi_proxy_lib.fastapi import ProxyResponse
app=fastapi.FastAPI()
def my_respfun(headers,status_code,content):
    # do something with headers and status_code, such as parsing, modifying, filtering, etc.
    yield {
        'headers':headers,
        'status_code':status_code
    }
    yield from content

@app.get('/foo')
def foo():
    return ProxyResponse(
        url="http://www.example.com/foo/",
        method='GET',
        reqheaders={"User-Agent": "My Custom Agent"},
        respheaders={"Content-Type": "text/plain"},
        respfun=my_respfun
    )

In this example, the /foo endpoint would proxy the request to http://www.example.com/foo/, using the GET method and the custom User-Agent header. The response would be forwarded to the client, with the Content-Type header set to text/plain. The response content would also be processed by the my_respfun function, which could modify the headers, status code, or content as needed.

The ProxyResponse class would have the following constructor parameters:

  • url: The target URL to proxy the request to. Required.
  • method: The HTTP method to use for the proxy request. Optional. Default: The same method as the original request.
  • params: The query parameters to use for the proxy request. Optional. Default: The same parameters as the original request.
  • data: The request body to use for the proxy request. Optional. Default: The same body as the original request.
  • json: The JSON data to use for the proxy request. Optional. Default: None. If provided, it will override the data parameter and set the Content-Type header to application/json.
  • files: The files to use for the proxy request. Optional. Default: None. If provided, it will override the data parameter and set the Content-Type header to multipart/form-data.
  • reqheaders: A dictionary of custom headers to use for the proxy request. Optional. Default: The same headers as the original request, except for the Host header, which will be set to the target URL's host.
  • respheaders: A dictionary of custom headers to use for the proxy response. Optional. Default: The same headers as the target response, except for the Transfer-Encoding header, which will be removed if the response is not streaming.
  • respfun: A callback function to process the response headers, status code, and content. Optional. Default: None. If provided, it should take three arguments: headers, status_code, and content, and return a generator that yields a dictionary with the keys headers and status_code, followed by the modified content. The headers and status_code values will be used for the proxy response, and the content will be streamed to the client.
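
The respfun protocol described above (first yield a metadata dict, then yield body chunks) could be consumed roughly as follows. This is only a sketch of the proposed behaviour: `apply_respfun` is a hypothetical helper name, nothing here exists in fastapi_proxy_lib, and the upstream body is simulated with a plain list of byte chunks.

```python
from typing import Callable, Dict, Iterable, Iterator, Tuple

Headers = Dict[str, str]
RespFun = Callable[[Headers, int, Iterable[bytes]], Iterator[object]]


def apply_respfun(
    respfun: RespFun,
    headers: Headers,
    status_code: int,
    content: Iterable[bytes],
) -> Tuple[Headers, int, Iterator[bytes]]:
    """Run a respfun-style generator and split its output.

    The first yielded item must be a dict with 'headers' and 'status_code';
    everything yielded afterwards is treated as body chunks.
    """
    gen = respfun(headers, status_code, content)
    meta = next(gen)  # first item: the (possibly modified) metadata
    if not isinstance(meta, dict):
        raise TypeError("respfun must yield a metadata dict first")
    # The generator now only has body chunks left to produce.
    return meta["headers"], meta["status_code"], gen


def my_respfun(headers, status_code, content):
    # Example callback: force a header, keep the status, upper-case the body.
    new_headers = {**headers, "Content-Type": "text/plain"}
    yield {"headers": new_headers, "status_code": status_code}
    for chunk in content:
        yield chunk.upper()


new_headers, new_status, body = apply_respfun(
    my_respfun, {"Server": "upstream"}, 200, [b"hello ", b"world"]
)
result = b"".join(body)
```

One consequence of this design is that the headers and status code are fixed as soon as the first item is yielded, so the callback cannot change them after it has started inspecting the body.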
gjmhmm8 added the enhancement label Dec 2, 2023
@WSH032
Owner

WSH032 commented Dec 2, 2023

Thank you very much for your valuable advice.


Can you provide more specific use cases? For example, what specific functionality do you intend to implement using this feature?

Can fastapi_proxy_lib.core.http fulfill your requirements?


It seems like you've read the source code of fastapi_proxy_lib, right? (just out of curiosity)

@gjmhmm8
Author

gjmhmm8 commented Dec 3, 2023

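Those are more or less the use cases; something roughly along those lines would do. As for fastapi_proxy_lib.core.http, its documentation is incomplete and I haven't looked at it closely. I'd suggest completing the docs first.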

@gjmhmm8
Author

gjmhmm8 commented Dec 3, 2023

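The most important things for a proxy are probably authentication and modifying traffic in both directions. The example has none of that; in that case you might as well use nginx or a 302 redirect.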

@WSH032
Owner

WSH032 commented Dec 3, 2023

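What you describe are advanced use cases; they need to be done together with httpx.

For authentication, take a look at this: https://www.python-httpx.org/advanced/#customizing-authentication

Every proxy in fastapi_proxy_lib takes a client parameter; just pass in your customized httpx.AsyncClient.

For example: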

import httpx
from fastapi_proxy_lib.fastapi.app import reverse_http_app

class MyCustomAuth(httpx.Auth):
    def __init__(self, token):
        self.token = token

    def auth_flow(self, request):
        # Send the request, with a custom `X-Authentication` header.
        request.headers['X-Authentication'] = self.token
        yield request

base_url = "http://www.example.com/"  # placeholder target URL
app = reverse_http_app(client=httpx.AsyncClient(auth=MyCustomAuth("mytoken")), base_url=base_url)

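As for two-way modification, I can understand modifying request and response headers (including cookies), but modifying the body seems like a strange use case. I don't know what the real-world scenario for it would be.

If you want a callback like respfun, see https://www.python-httpx.org/advanced/#writing-custom-transports

Just pass in a custom transport.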

import httpx

class MyTransport(httpx.AsyncHTTPTransport):

    async def handle_async_request(self, request):
        request.headers["foo"] = "bar"

        target_resp = await super().handle_async_request(request)

        target_resp.headers["a"] = "b"
        return target_resp

reverse_http_app(client=httpx.AsyncClient(transport=MyTransport()), base_url=base_url)

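I strongly advise against modifying the body-related parts of httpx.Response:

  • The whole body has to be loaded before it can be modified, which adds I/O time.

The nginx and 302 options you mention target somewhat different scenarios than this project.

  • nginx needs another port opened for it, doesn't it? (I haven't used it, so I'm not sure.) Some people don't want to expose any additional ports; see this.
  • A 302 redirect cannot proxy a local target server such as 127.0.0.1.

As for the docs, my English isn't great, so I haven't been keen on writing documentation for the advanced parts.

PRs for the docs are welcome; I will merge them.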

@gjmhmm8
Author

gjmhmm8 commented Dec 3, 2023

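I still think a ProxyResponse is worth adding.
Modifying the body is normally unlikely; mostly it's changes before and after the request. But it's good to leave a hook for it, for example transcoding a video stream while proxying it. Also, using yield shouldn't add I/O; any large-scale modification would be the business of the proxy function itself.
Modifying headers and status_code before and after the request is quite common: bilibili checks the Referer, for instance, and some APIs validate a cookie or key that you don't want to expose. There are all sorts of odd authentication schemes out there, not just 'X-Authentication', so it's best to leave plenty of room for customization in code.
I haven't looked much into streaming cases such as file uploads, so I'm not sure of the best approach there.
For the docs, if English is a problem, write them in Chinese; at worst, others can machine-translate them.
For WebSocket proxying, if modification is needed, a ws client is normally enough; with no or few modifications, calling the ws proxy function directly is probably the better way.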
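
The two positions on body modification (it forces the whole body to be loaded, versus yield adds no extra I/O) can both be true depending on the kind of change. The sketch below uses only asyncio, with a simulated upstream body; `fake_upstream`, `transform_stream`, and `transform_buffered` are hypothetical names for illustration, not httpx or fastapi-proxy-lib APIs.

```python
import asyncio
from typing import AsyncIterator, Callable


async def fake_upstream() -> AsyncIterator[bytes]:
    # Stand-in for a streamed upstream body (made-up data).
    for chunk in (b"abc", b"def", b"ghi"):
        await asyncio.sleep(0)
        yield chunk


async def transform_stream(
    body: AsyncIterator[bytes], fn: Callable[[bytes], bytes]
) -> AsyncIterator[bytes]:
    # Chunk-local change: each chunk is forwarded as soon as it arrives.
    async for chunk in body:
        yield fn(chunk)


async def transform_buffered(
    body: AsyncIterator[bytes], fn: Callable[[bytes], bytes]
) -> bytes:
    # Whole-body change: nothing can be sent until the last chunk is in.
    data = b"".join([chunk async for chunk in body])
    return fn(data)


async def main():
    streamed = [c async for c in transform_stream(fake_upstream(), bytes.upper)]
    buffered = await transform_buffered(fake_upstream(), lambda b: b[::-1])
    return streamed, buffered


streamed, buffered = asyncio.run(main())
```

A change that only needs the current chunk can stay streaming, while one that needs the full payload (reversing it here, or re-serializing JSON in practice) has to buffer first.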

@gjmhmm8
Author

gjmhmm8 commented Dec 3, 2023

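Writing a separate class for this feels like a lot more hassle. Sometimes all I want is to tweak a few request headers or adjust the returned headers, but your approach only lets me change the client. There's also no guarantee that httpx will keep being maintained; if it isn't, everyone depending on your library will have to do a major refactor.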

@gjmhmm8
Author

gjmhmm8 commented Dec 3, 2023

import logging
from typing import Union

from fastapi import Request, Response
from fastapi.responses import FileResponse, StreamingResponse
from starlette.background import BackgroundTask

# `app`, `client` (an httpx.AsyncClient, judging by the calls below)
# and `get_bili_url` are defined elsewhere.

@app.get("/proxy/{vid}/{video}.mp4")
async def resp(vid: str, video: Union[str, None], req: Request):
    hd = dict(req.headers)
    del hd['host']
    ua = 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/80.0.3970.5 Safari/537.36'

    headers = {
        'referer': 'https://www.bilibili.com/video/',
        'user-agent': ua
    }
    hd.update(headers)
    url = get_bili_url(vid, video)
    if not url:
        return FileResponse('404.mp4')
    if url == 404:
        logging.warning('get url failed')
        return Response(status_code=500)
    rp_req = client.build_request("GET", url, headers=hd)
    rp_resp = await client.send(rp_req, stream=True)
    return StreamingResponse(
        rp_resp.aiter_raw(),
        status_code=rp_resp.status_code,
        headers=rp_resp.headers,  # type: ignore
        background=BackgroundTask(rp_resp.aclose),
    )

I used this to proxy bilibili URLs before, and it kept spamming 'WARNING:asyncio:socket.send() raised exception.'. On top of that, lots of players were incompatible, and then there's the Referer and UA validation. It wore me out.
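
The header handling in the handler above (drop host, then overlay a fixed Referer and User-Agent) can be isolated into a small helper. The following is only a sketch: `build_upstream_headers` and its parameters are hypothetical names, not part of fastapi-proxy-lib or httpx, and the header values are made up.

```python
from typing import Dict, Iterable, Mapping


def build_upstream_headers(
    incoming: Mapping[str, str],
    overrides: Mapping[str, str],
    drop: Iterable[str] = ("host",),
) -> Dict[str, str]:
    """Copy client headers, remove unwanted ones, then apply overrides."""
    dropped = {name.lower() for name in drop}
    # HTTP header names are case-insensitive, so normalise to lower case.
    merged = {
        name.lower(): value
        for name, value in incoming.items()
        if name.lower() not in dropped
    }
    merged.update({name.lower(): value for name, value in overrides.items()})
    return merged


upstream = build_upstream_headers(
    {"Host": "my-proxy.local", "User-Agent": "curl/8.0", "Range": "bytes=0-"},
    {"Referer": "https://www.bilibili.com/video/", "User-Agent": "My User Agent 1.0"},
)
```

Unlike `del hd['host']`, this version does not raise if the header is missing, and headers such as Range pass through untouched, which may matter for video players that seek.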

@WSH032
Owner

WSH032 commented Dec 3, 2023

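"But your approach only lets me change the client. There's also no guarantee that httpx will keep being maintained; if it isn't, everyone depending on your library will have to do a major refactor."

This library was designed with support for the httpx ecosystem as a first-class citizen. The examples above all use httpx's stable public API, so you can rely on them.

Even if httpx did change its API as you describe, fastapi-proxy-lib currently does not constrain the httpx version (although the sub-dependency httpx-ws seems to), so you can simply pin its version at install time: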

pip install fastapi-proxy-lib httpx==0.25

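I could provide a HelpTransport to make it easier to modify the response's status_code and headers.

But for the body, it would probably be hard to design an API that satisfies most needs.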

@WSH032
Owner

WSH032 commented Dec 3, 2023

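I still don't quite understand the real-world use case for the ProxyResponse you originally asked for.

Proxy authentication is indeed a very common scenario, and httpx.Auth supports it thoroughly. Some people may not be familiar with advanced features such as httpx.Auth or httpx.Transport; I can improve that part of the docs or provide a helper class.

But heavily modifying the body and changing the status_code are both strange use cases; they seem to go beyond what a transparent proxy does.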

@gjmhmm8
Author

gjmhmm8 commented Dec 4, 2023

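I took a look. Modifying the body can indeed be done with httpx.AsyncHTTPTransport (preferably wrapped in a helper class, to guard against API changes). It's uncommon after all, so a note in the docs is enough. Modifying headers before and after the request and custom authentication are still necessary, though. Take proxying a bilibili video download URL: the most convenient way is to write it like an API endpoint, authenticate the frontend directly, and return a ProxyResponse while changing the headers.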

@WSH032
Owner

WSH032 commented Dec 5, 2023

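I thought it over; a compromise that can meet your needs is to use fastapi_proxy_lib.core.http.BaseHttpProxy.

To modify the request headers or body, use the httpx.Auth approach I mentioned earlier.

To modify the response headers, status code, or body, just modify the starlette.Response that is finally returned.

I don't recommend using httpx.AsyncHTTPTransport, because handling the response it returns touches on fastapi-proxy-lib internals.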


Example:

import httpx
from fastapi import FastAPI, Request
from fastapi.responses import StreamingResponse
from fastapi_proxy_lib.core.http import BaseHttpProxy


class MyCustomAuth(httpx.Auth):

    def auth_flow(self, request: httpx.Request):
        request.headers.update({
            "referer": "https://www.bilibili.com/video/",
            "user-agent": "My User Agent 1.0",
        })
        yield request

proxy = BaseHttpProxy(httpx.AsyncClient(auth=MyCustomAuth()))

app = FastAPI()

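@app.get("/proxy/{vid}/{video}.mp4")
async def _(request: Request, vid: str, video: str):
    # get_bili_url is the helper from the earlier snippet (not part of fastapi-proxy-lib)
    target_url = httpx.URL(get_bili_url(vid, video))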
    target_resp = await proxy.send_request_to_target(request=request, target_url=target_url)

    if isinstance(target_resp, StreamingResponse):
        # do some processing, whatever you want
        new_resp =  StreamingResponse(
            content=target_resp.body_iterator,
            status_code=target_resp.status_code,
            headers=target_resp.headers,
            media_type=target_resp.media_type,
        )
    else:
        new_resp = target_resp

    return new_resp

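It's hard to provide a helper API, because there is too much to consider:

  • Before modifying the request body, how do we know whether there is one, and what format it is in? JSON? A file? What if the modification depends on other context?
    • So it's simpler to let httpx.Auth hand you the whole request; modify it and put it back.
  • Likewise, when modifying the response, how do we know the target server's response was received correctly? Does it have a body? What if response context is needed?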
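
To give a sense of the questions in the list above, here is a minimal sketch of guessing whether a request carries a body, and in what format, from its headers alone. `describe_request_body` is a hypothetical helper, not part of httpx or fastapi-proxy-lib; it relies only on the general HTTP convention that a request body is signalled by Content-Length or Transfer-Encoding.

```python
from typing import Mapping, Optional, Tuple


def describe_request_body(headers: Mapping[str, str]) -> Tuple[bool, Optional[str]]:
    """Return (has_body, media_type) guessed from the headers alone."""
    lowered = {k.lower(): v for k, v in headers.items()}
    length = lowered.get("content-length", "").strip()
    has_body = "transfer-encoding" in lowered or (
        length.isdigit() and int(length) > 0
    )
    media_type = None
    if has_body and "content-type" in lowered:
        # Drop parameters such as "; charset=utf-8" or "; boundary=...".
        media_type = lowered["content-type"].split(";", 1)[0].strip().lower()
    return has_body, media_type


json_case = describe_request_body(
    {"Content-Length": "17", "Content-Type": "application/json; charset=utf-8"}
)
upload_case = describe_request_body(
    {"Transfer-Encoding": "chunked", "Content-Type": "multipart/form-data; boundary=x"}
)
empty_case = describe_request_body({"Accept": "*/*"})
```

Even this only answers the first question; decoding the body, handling malformed input, and consulting other request context would each need further decisions, which is the difficulty the list points at.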

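As for your suggestion of a wrapper layer to guard against API changes:
The examples I gave above all use public APIs; and even where they might not, I didn't touch internals or private variables.
fastapi-proxy-lib's type annotations are very thorough, with 100% coverage; if the API changes, static checking will report the errors before you even run the code.
If I implemented this officially, I would also have to rely on low-level details, which would be even less stable (at least with your own implementation you can pin dependency versions).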

@gjmhmm8
Author

gjmhmm8 commented Dec 5, 2023

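Modifications mostly happen before and after the request. The body can be handled with a class: for example, provide a helper class; to modify things, subclass it, make the changes, call the request method via super(), then modify what comes back. If only pre-request changes are needed, headers and body parameters would suffice: take them and update directly. ProxyResponse could subclass StreamingResponse and override its initializer. Also, when proxying file-transfer URLs, 'WARNING:asyncio:socket.send() raised exception.' appears intermittently; it would be good to check whether that is a problem.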

@WSH032
Owner

WSH032 commented Dec 7, 2023

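Sorry for the late reply; I was a bit tired the last few days.

"Modifications mostly happen before and after the request. The body can be handled with a class: for example, provide a helper class; to modify things, subclass it, make the changes, call the request method via super(), then modify what comes back. If only pre-request changes are needed, headers and body parameters would suffice: take them and update directly. ProxyResponse could subclass StreamingResponse and override its initializer."

My feeling is that even if I provided a helper class, it wouldn't reduce the amount of code much. Look at this answer: it only adds one class declaration.

What matters is that I write documentation explaining this.

As for ProxyResponse, isn't modifying the response directly, the way that answer does, more readable and more flexible than a callback style?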

@app.get("/proxy/{vid}/{video}.mp4")
async def _(request: Request, vid: str, video: str):
    target_url = httpx.URL(get_bili_url(vid, video))
    target_resp = await proxy.send_request_to_target(request=request, target_url=target_url)

    if isinstance(target_resp, StreamingResponse):
        # do some processing, whatever you want
        new_resp =  StreamingResponse(
            content=target_resp.body_iterator,
            status_code=target_resp.status_code,
            headers=target_resp.headers,
            media_type=target_resp.media_type,
        )

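Subclassing StreamingResponse directly doesn't get you away from httpx.AsyncClient either, because the client's connection pool is needed for performance.

Your original request can be fully implemented with existing features (the example I gave above), with better flexibility, at the cost of only a few more lines of code.

Introducing something new means more maintenance cost and more tests.

"Also, when proxying file-transfer URLs, 'WARNING:asyncio:socket.send() raised exception.' appears intermittently; it would be good to check whether that is a problem."

This is covered by tests. If you find a problem when proxying bilibili, open another issue and I'll look into it.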

# Test that file forwarding to the target service works correctly
file_str = "你好"
r = await client_for_conn_to_proxy_server.put(
    proxy_server_base_url + f"put/echo_file?content={file_str}",
)
assert r.content.decode("utf-8") == file_str

@gjmhmm8
Author

gjmhmm8 commented Dec 7, 2023

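It's mainly the hassle, and there were always weird bugs before, such as 'WARNING:asyncio:socket.send() raised exception.' flooding the log; if that isn't handled properly, the server can crash in no time.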

@gjmhmm8
Author

gjmhmm8 commented Dec 7, 2023

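It would still be best to write a dedicated ProxyResponse for this: first, it cuts down the code when requirements aren't that complex; second, it avoids assorted messy bugs.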

@gjmhmm8
Author

gjmhmm8 commented Dec 7, 2023

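For constructor parameters, request, target_url, and transport should be about enough; then make the headers attribute and status_code mutable. Simple cases just modify the request and the resulting response; complex cases can be handled by defining a transport.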

@gjmhmm8
Author

gjmhmm8 commented Dec 7, 2023

For the transport, it would be best to export a helper class to inherit from, to guard against later API changes.

@gjmhmm8
Author

gjmhmm8 commented Dec 8, 2023

core.zip
Please test its robustness; if there are no problems, consider merging it.

@WSH032
Owner

WSH032 commented Dec 17, 2023

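Sorry for the late reply.

Please don't send files directly; it's best to follow the contributing guide.

I took a quick look at the zip; parts of it deal with the low-level ASGI protocol. I don't like handling low-level details, since that makes maintenance painful.

It also lacks tests.

In PR #22, I documented how to adjust the request or the response.
Combined with BaseHttpProxy, that should cover your needs.

As for the class ProxyResponse you originally proposed, I'm inclined not to implement it.

This library was designed with a strong emphasis on proxy transparency (lossless forwarding); modifications should be kept to a minimum (e.g. proxy authentication). ProxyResponse modifies too much; at that point you might as well issue a new request with httpx.

If there are no further questions, I suggest closing this issue; if others have similar needs, I will reopen it.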

@WSH032
Owner

WSH032 commented Dec 18, 2023

Closing as planned. Reopen it if new needs or ideas come up.

@gjmhmm8
Author

gjmhmm8 commented Dec 20, 2023

I'd still recommend adding ProxyResponse. Honestly, that zip only adds ProxyResponse; nothing else was changed.
