Async support #1538
Comments
In what way would you add in async support? Given most operations are done remotely on GitHub and our code is waiting for a response or a JSON blob back, how would it help?
That's a good reason for async introduction! I mean... PyGithub should handle API access as coroutines; otherwise, most of the time it's waiting for the server's response instead of doing something useful. It's a common concept in Node.js, and it's currently supported in @octokit/rest.js. Note: maybe I misunderstood how PyGithub currently works, but I think sequential, synchronous API wrappers are less efficient than async ones.
I could add a new AsyncRequester class and add asynchronous methods for interacting with it to all the other classes, while preserving the existing logic. Asynchronous code could help in tasks where you need to issue a large number of queries, for example in a search. It would also let you work quickly with multiple accounts on GitHub.
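The idea above can be sketched in a few lines. Note that `AsyncRequester`, `request`, and `search_many` are hypothetical names, not part of PyGithub; the HTTP round trip is simulated with `asyncio.sleep` so the sketch runs offline (a real version would use a library like aiohttp or httpx).

```python
import asyncio

# Hypothetical sketch, NOT PyGithub's actual API: a requester whose
# request() is a coroutine, so many API calls can be in flight at once.
class AsyncRequester:
    async def request(self, path: str) -> dict:
        # Stand-in for an HTTP round trip; a real implementation would
        # await an aiohttp/httpx call here.
        await asyncio.sleep(0.01)
        return {"path": path, "status": 200}

async def search_many(paths):
    requester = AsyncRequester()
    # All requests run concurrently; total time is roughly one round
    # trip rather than the sum of all round trips.
    return await asyncio.gather(*(requester.request(p) for p in paths))

results = asyncio.run(search_many([f"/repos/org/repo/issues/{i}" for i in range(10)]))
print(len(results), results[0]["status"])  # → 10 200
```

The same `gather` pattern is what would make the "large number of queries" case (e.g. a search) fast.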
Which sounds like a complete redesign, along with adding support for utilising multiple accounts. I love your enthusiasm, but I think it's an awful lot of work for not enough gain.
It might involve a lot of work, but I'd love to see it implemented. Currently I'm stuck with JS in my research because @octokit/rest.js's performance overcomes PyGithub's. If help is wanted, I would be glad to work on this too. Btw, I think multiple-accounts support would be too much! Isn't async alone a tremendous first step towards performance gains?
Maybe I could make some edits and show them in a pull request, as a test?
Asyncio sounds like a good idea given my use case. I am trying to read all files in a repository recursively, and synchronous requests are just too slow (I might be missing something like rate limiting on GitHub's API, but we could definitely make such operations faster).
I would strongly suggest using something like GitPython for that rather than requesting everything via the GitHub API. |
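The suggestion above can be sketched as follows: clone once with GitPython (shown as a comment, since it needs network access and a real URL), then read every file from the local working tree instead of issuing one API call per file. The directory contents below are fabricated purely for illustration.

```python
import tempfile
from pathlib import Path

# Clone step (commented out to keep this runnable offline):
#
#   from git import Repo
#   workdir = Path("repo-checkout")
#   Repo.clone_from("https://github.com/org/repo.git", workdir)
#
# For illustration we build a fake working tree instead.
workdir = Path(tempfile.mkdtemp())
(workdir / "src").mkdir()
(workdir / "README.md").write_text("hello")
(workdir / "src" / "main.py").write_text("print('hi')")

# Recursive traversal is now a local filesystem walk, no API calls.
files = sorted(
    p.relative_to(workdir).as_posix()
    for p in workdir.rglob("*")
    if p.is_file()
)
print(files)  # → ['README.md', 'src/main.py']
```

One clone, then local reads, sidesteps both the latency and the rate limiting of per-file API requests.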
Thanks for the interesting suggestion, I'll give it a try as it does make sense to do it that way. |
I don't think this issue should be closed. Using […]
This short YouTube video demonstrates how asyncio can improve HTTP request performance in Python programs: https://youtu.be/m_a0fN48Alw Please @OlegYurchik, reopen this issue!
For my use case, I'm creating a Discord bot that has a few commands that interact with GitHub. discord.py has an event-based asyncio system, so any library that doesn't support async/await is not usable.

Even without this kind of use case, asyncio's advantage is no smaller. For example, if getting the content of the first comment of an issue takes 1 s, it would take 100 s to process 100 issues sequentially; with asyncio, 100 such requests would take 2–3 s, which is about 98 s faster than the current implementation, with no extra resource bottleneck.

That said, I really don't have much of a performance problem in my case. It's the fact that we can't use blocking implementations of packages in async frameworks like discord.py and FastAPI, because at a higher scale, doing so might cause the program to miss events while it's in the blocked state.
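The timing claim above is easy to verify with simulated latency (real round trips replaced by `asyncio.sleep`, so the demo runs offline; the numbers are scaled down to keep it quick):

```python
import asyncio
import time

LATENCY = 0.02  # simulated per-request round trip, in seconds

async def fetch_comment(i: int) -> str:
    await asyncio.sleep(LATENCY)  # stand-in for one HTTP request
    return f"comment-{i}"

async def sequential(n: int):
    # One request at a time: total ≈ n * LATENCY
    return [await fetch_comment(i) for i in range(n)]

async def concurrent(n: int):
    # All requests at once: total ≈ LATENCY
    return await asyncio.gather(*(fetch_comment(i) for i in range(n)))

start = time.perf_counter()
asyncio.run(sequential(25))
seq_time = time.perf_counter() - start

start = time.perf_counter()
results = asyncio.run(concurrent(25))
conc_time = time.perf_counter() - start

print(f"sequential: {seq_time:.2f}s, concurrent: {conc_time:.2f}s")
```

With 25 simulated requests at 20 ms each, the sequential run takes roughly half a second while the concurrent run finishes in a few tens of milliseconds, mirroring the 100 s vs 2–3 s figures above.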
In particular, the PaginatedLists of various objects (issues, pull requests, etc.) could be made faster with HTTPX and asyncio.
Aren't PaginatedLists necessarily sequential? I mean, the point of paginated lists is querying only the pages whose data you intend to use. How would you implement async PaginatedLists? What do you do when the pages you will need aren't known ahead of time? And what about situations where you want to iterate in a specific order? In my opinion, methods returning PaginatedLists could take an optional parameter giving the number of pages to pre-fetch asynchronously. The returned object could be of another type (let's say it's named AsyncList) that implements PaginatedList's interface and hence could be used interchangeably. This solves the aforementioned problem by letting the user decide between performance and predictability.
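The proposed `AsyncList` could look something like the toy sketch below. Every name here (`AsyncList`, `fetch_page`, `prefetch`) is hypothetical, not an existing PyGithub API; page fetches are simulated with `asyncio.sleep`. The first `prefetch` pages are requested concurrently, and the remaining pages fall back to on-demand sequential fetching, so iteration order is preserved.

```python
import asyncio

async def fetch_page(n: int) -> list:
    await asyncio.sleep(0.01)  # stand-in for one paginated API request
    return [f"item-{n}-{i}" for i in range(3)]

class AsyncList:
    """Hypothetical PaginatedList variant with async pre-fetching."""

    def __init__(self, total_pages: int, prefetch: int = 0):
        self.total_pages = total_pages
        self.prefetch = min(prefetch, total_pages)

    async def items(self):
        # Fetch the first `prefetch` pages concurrently...
        pages = await asyncio.gather(*(fetch_page(n) for n in range(self.prefetch)))
        for page in pages:
            for item in page:
                yield item
        # ...then fetch remaining pages lazily, one at a time, preserving
        # iteration order and avoiding requests for pages never reached.
        for n in range(self.prefetch, self.total_pages):
            for item in await fetch_page(n):
                yield item

async def main():
    return [item async for item in AsyncList(total_pages=4, prefetch=2).items()]

items = asyncio.run(main())
print(len(items), items[0], items[-1])  # → 12 item-0-0 item-3-2
```

`prefetch=0` degenerates to today's fully sequential behaviour, which is what makes the two types interchangeable.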
Yes, that's a good idea. Since this library is slow for big GitHub projects, I am additionally using ghapi, where we can query issues and commits using the page= and per_page= parameters. Just in case it helps anyone.
I'd like to see this implemented and would be interested in helping. |
There seem to be some alternatives now for async: |
I have been relying on https://github.com/yanyongyu/githubkit, which is the most fully featured of the Python GitHub libraries. It has a fully typed API using pydantic and supports both sync and async requests using httpx. It works wonderfully with something like FastAPI.
Hello! There's another option worth considering: you could generate the async code from your existing sync code, thus avoiding the "extra" maintenance burden.
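For a sense of what such generation involves, here is a deliberately naive toy transform. Real tools like unasync apply the same token-substitution idea (unasync generates sync code from an async codebase, i.e. the opposite direction of the suggestion above); a production version would need a proper token- or AST-level pass rather than string replacement. The `request`/`_do_http` names below are fabricated for the example.

```python
# Toy source-to-source transform: rewrite a sync method into its async
# counterpart by substituting known sync tokens with async ones.
# String replacement is fragile; this only illustrates the concept.
SYNC_TO_ASYNC = [
    ("def request(", "async def request("),
    ("self._do_http(", "await self._do_http("),
]

def asyncify(source: str) -> str:
    for old, new in SYNC_TO_ASYNC:
        source = source.replace(old, new)
    return source

sync_src = "def request(self, url):\n    return self._do_http(url)\n"
async_src = asyncify(sync_src)
print(async_src)
# → async def request(self, url):
#       return await self._do_http(url)
```

The appeal is that maintainers keep editing one (sync) codebase, and the async variant is regenerated mechanically at build time.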
What do you think about PyGithub async support? If I implement async support for PyGithub, would you accept it?