New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
fix: scrape new expanded_assets page #83
Conversation
I'm really not sure that this is the best approach, but for now it fixes the issue: https://leanprover.zulipchat.com/#narrow/stream/113488-general/topic/binary.20package.20was.20not.20provided.20for.20'linux' |
A quick fix is needed and what you did here works for now, and I tested these changes on my windows machine and they work fine, but long term we should chat about switching to the
|
Elan switched from using the GitHub API to scraping the web page in 2018 due to the API rate limit: dab74012. I'm not speaking for the devs here, but not forcing the user to create a GH API token (and still staying under the API request limit) is an important factor to have in mind. |
That looks like a different thing going through a web interface at https://api.github.com with a |
@lovettchris Surely |
The code is here: https://github.com/cli/cli but what kind of throttling issues did you run into before? Were you hitting it during CI jobs because every CI job shared the same "api_key: $GH_TOKEN" ? I have never seen any throttling issues with |
@collares already explained why we can't use
|
I see, well I doubt a normal user that has no github account would need to use elan more than 60 times per hour. So then perhaps it's just a matter of ensuring the CI build uses an authenticated header using the $GITHUB_TOKEN than comes with the CI environment. Then for power users that are running boat loads of tests locally that are calling elan all day long, perhaps we could discover those users are authenticated using |
A normal university for sure could need to use elan more than 60 times per hour from the same IP though |
Ok, you are talking about a "university" it sounds like you are saying the rate limit applies to 'elan' and not the user. Perhaps that is the problem we need to fix here, but I guess the only way to fix it is through user authentication with github which you are explicitly trying to avoid? I'm not sure why we need to avoid that... but I notice this tidbit on the rate-limits, I wonder if we can tap into this:
Could it be as simple as me getting you a token for elan since I'm in a github enterprise organization? It says "The request is from a GitHub App that's owned by a GitHub Enterprise Cloud organization." -- so I wonder how they verify ownership? |
@lovettchris This is not a reply to your previous message, but just a random comment: You're coming from Lean 4, which is currently aimed at programmers, so I think you (understandably) have a programmer-oriented perspective. I'd guess that a plurality of Lean users today are not programmers and probably will never have the need for a GitHub account. Even if that's not the case, I think people like that (e.g., a group of first-year math undergrads installing Lean for their "Introduction to Proof" class, or someone who just completed the Natural Number Game and wants to explore further) will still be an important user demographic for the foreseeable future. |
Sure, I definitely have a CS lens on all this, for example, I was thinking about how every CS class these days publishes the class assignments in a github repo... (or gitlab or something similar). I was also thinking about how even web designers who have no CS background are using git to share their designs, same with documentation writers, game designers. I've even worked with mechanical engineers in the Chicago water department who use github for all kinds of things... Perhaps we are bending over backwards just a little too far. That's all. Happiness is known when to say "no", and perhaps flakey HTML scaping hits that limit :-) |
@lovettchris I think it is important to consider different perspectives here. You seem to have encountered Git in a lot of contexts outside developing CS projects (which is quite cool 😄). I, despite having been through almost 9 years of CS at a reasonably well ranked university, have not (none of my CS classes even ever used it -- though it was discussion topic among students and there are other classes in the department which do). Also, even as a longtime Git/GitHub user myself, I did not have As a side note, the one time I did encounter a class using version control outside CS was a professor in mathematics who was using Subversion, not Git. 😆 |
No description provided.