New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Made download_blogs requests async #19
Conversation
Oh wow that was quick, here you go https://antoniosbarotsis.github.io/index.xml
# Youtubers
https://www.youtube.com/feeds/videos.xml?channel_id=UCiSIL42pQRpc-8JNiYDFyzQ # Chris Biscardi
https://www.youtube.com/feeds/videos.xml?channel_id=UCUMwY9iS8oMyWDYIe6_RmoA # NoBoilerplate
https://www.youtube.com/feeds/videos.xml?channel_id=UC8ENHE5xdFSwx71u3fDH5Xw # ThePrimeagen
https://www.youtube.com/feeds/videos.xml?channel_id=UCsBjURrPoezykLs9EqgamOA # Fireship
https://www.youtube.com/feeds/videos.xml?channel_id=UC2Xd-TjJByJyK2w1zNwY0zQ # Beyond Fireship
# Rust stuff
https://blog.rust-lang.org/feed.xml
https://blog.rust-lang.org/inside-rust/feed.xml
https://this-week-in-rust.org/rss.xml
https://rust.libhunt.com/newsletter/feed
https://rustsec.org/feed.xml
# Console.dev
https://console.dev/tools/rss.xml
https://console.dev/betas/rss.xml
https://console.dev/articles/rss.xml
https://console.dev/interviews/rss.xml
https://console.dev/profiles/rss.xml
https://blog.console.dev/rss/
https://blog.jetbrains.com/feed/
https://github.blog/feed/
https://stackoverflow.blog/feed/
# Random
https://dusted.codes/feed/rss
https://vladmihalcea.com/blog/feed/
https://arraying.de/index.xml
https://ossinsight.io/blog/rss.xml
https://www.smartercode.io/feed
https://fasterthanli.me/index.xml
https://raphlinus.github.io/feed.xml
https://www.huy.rocks/rss.xml
https://gideonwolfe.com/index.xml
https://blog.m-ou.se/index.xml
https://liam.rs/index.xml
https://qristin.wordpress.com/feed/
https://wakatime.com/blog/all.atom
https://developerlife.com/feed.xml
# Other
https://grafana.com/blog/index.xml
https://www.elastic.co/blog/feed
https://redis.com/blog/feed
https://www.stephanboyer.com/rss
https://medium.com/feed/@clockwork-labs Unrelated but let me know if you think the getting started docs could be improved in some way :) |
It seems to be a bit slower than the master branch (for me at least) # Yours
Benchmark 1: cargo run
Time (mean ± σ): 20.506 s ± 3.317 s [User: 3.331 s, System: 0.333 s]
Range (min … max): 16.724 s … 22.921 s 3 runs
# Master
Benchmark 1: cargo run
Time (mean ± σ): 14.180 s ± 0.784 s [User: 4.654 s, System: 0.145 s]
Range (min … max): 13.630 s … 15.078 s 3 runs If I use the 4 feeds that you used I get very comparable results # Yours
Benchmark 1: cargo run
Time (mean ± σ): 747.2 ms ± 47.1 ms [User: 445.7 ms, System: 170.0 ms]
Range (min … max): 715.5 ms … 801.4 ms 3 runs
# Master
Benchmark 1: cargo run
Time (mean ± σ): 922.0 ms ± 75.0 ms [User: 507.6 ms, System: 115.7 ms]
Range (min … max): 869.8 ms … 1008.0 ms 3 runs |
Oh, thank you for your feedback, I am testing it also, but I get varying results, not sure why, maybe ratelimiting or something. |
I ran mine a few times and I seem to be getting roughly the same as the ones I posted (for the big input). It seems that the pipeline is failing, this seems to be an issue with OpenSSL and can likely be solved by adding [target.'cfg(unix)'.dependencies]
openssl = "0.10.42" to Edit: Actually not sure since the |
This is definitely something introduced with the new HTTP library you added, I didn't get these in master. |
I didn't try this but it seems the issue is here. The inputs include say |
Yest, it seems that reqwest shows the content type header in this format, |
I guess we could also try splitting, either works so up to you. The library itself is probably fine, it's just a matter of getting the string to cooperate 😄 |
Now I am getting good results:
But one request is still failing due to bad Content Type header, this one Is it fast for you now? |
I am getting all sorts of errors. The requests seem to be too quick for my laptop to keep up
I am also getting this?
I got that by adding I am not sure if these issues are because of my internet connection, OS or laptop but considering you have been getting better results, it's probably not the code. The extra dependency doesn't seem to be working, as I mentioned above;
Maybe adding Considering that this seems to have external issues that we can't control, would it make sense to make this a feature? Similar to how the I seem to be still getting worse performance + some connection errors but this could be potentially faster than the current master branch on say AWS Lambda. |
Ahhh, I still think I can fix this. It is possible to limmit the number of concurrent connections, see the first answer here, I went with The jetbrains site has something weird, I think we could add a request header for the accepted content types like this, maybe it will fix it, again I will check tomorrow. Regarding openssl, in my opinion the best we can do is move to another async http client implementation, see hyper, using unlike reqwests can use rust-tls, I think this is better because of the reduced number of external dependencies. Thanks for your help, I am sorry you could not see the results I was getting. |
Great to hear! I was also about to go as it is getting late here as well.
If it works, it works. I am hosting this on Linux anyway, Windows is weird sometimes. Even if we can't make this better on Windows, it will still be faster for 2 other major platforms and that's still useful. Thanks a lot for your contributions! |
Hello again, I have fixed the jetbrains page issue, they were rejecting requests without a UserAgent header, so i just added a mock one. I also added
Edit: Removed unwrap to fix the linting error. It seems that openssl is still being build despite using hyper-tls. |
Sorry for all these commits, I did not know clippy was that strict! |
Don't worry about it, I just defined an absurd amount of rules. I'll review this later today, thanks for the commits! Also the build seems to be passing which is great. |
I am getting this now Benchmark 1: cargo run
Time (mean ± σ): 5.037 s ± 0.052 s [User: 4.535 s, System: 0.293 s]
Range (min … max): 4.977 s … 5.069 s 3 runs so I guess you did manage to make this work on windows as well after all, good job! I'll try this on AWS real quick and if that works fine (which it should since the pipeline is passing) I'll merge. |
Looking at the code, I think I actually managed to make this work yesterday when I was trying stuff out on my own before opening the issue. I had tried this general pattern of |
I think it was the logger but we could try to revert that line and see if it crashes
Good idea haha |
I think the problem is with my computer, I am getting this error now on your master 😮💨
Dont really know what can I have done to mess up even the currently working one! |
Why did this become so cursed. Both branches work fine for me now. ?????????????????? |
I do not now hahaha I am removing all cargo packages and installing again, just to be sure. I think we both deserve the tshirt after this is working. |
I have tested this on another machine and it also works for me, I believe I messed something up in my rust installation while testing. If you still want to merge I would recommend further testing, maybe a friend or something because I cannot confidently say this will not fail. |
Appreciate the concern ♥ I will force my friends to run this tomorrow after running it myself a few times. |
I'll try to get at least one person to run the code on each platform, are you using Mac or Linux? |
Great! I am using linux. |
The pipeline passing is definitely reassuring here, made it build, test and run the code in the same I think I will wait until #20 is resolved at least partially (just the extra Arm image part) and if that seems to also be working without issues then I'd say it is safe to assume this is working fine and I'll finally merge. |
The GitHub Runners have pretty fast internet connections but probably some weird hardware resources so the speedup is smaller (from 10 to 7 seconds) but that is expected and looks fine to me. It also seems to work fine even if ran multiple times (the benchmark runs the code 5 times) which is good to know. Are you still getting those errors locally? |
That is great to hear!
Unfortunately, yes, at this point I think my internet connection is just bad, I tested with my uni connection this morning and everything ran well, I also tested it in a uni server and it ran well. Running the current master version (still with ureq and no async) it gives me the same errors at home, but if I activate my uni vpn it works without a problem. This is two runs of the current master, one without and one with VPN.
Sometimes it runs without errors and sometimes I get those, not always on the same domains. I have also tested this at home with a Raspberry Pi 3 and the same thing happens. I just do not know what is the problem, but whatever it is I think is beyond my control and not related to tokio / async / reqwest. |
Yes seems like it's your wifi then. I didn't find much regarding the error with one Google search but I'll ask around in case anyone has an idea about this out of curiosity. Only thing I can think of is packet loss. This again means that the PR is very likely fine but I'll wait as I mentioned above regardless. Worst case scenario, if it is indeed buggy then I'll have more time to encounter the issue. |
Great, let me know if you find something out!
Sure, sounds good also I am sorry this took so long, in my mind it was a simple change. I do not study / work in the software engineering field so this was definitely a learning experience for me, I have loved seeing the CI workflow you have implemented, it works really well. This is also my first hacktoberfest PR so there is that.
To be fair I do not know, TCP protocol should handle packet retransmissions and timeouts, but it could be something like that. |
I really do not mind especially seeing as how I would likely not do this myself, at least not now.
That is great to hear :)
This is definitely way above whatever I did for my first Hacktoberfest pr 🗿 |
I have tested the code manually as well and it seems fine, performance went from 20 to 12 seconds! |
Thanks a lot! |
Hello, I saw your message about issue #18 on Hacktoberfest's Discord. I have tried to address it, I have set tokio as a non optional dependency.
Using this feeds.txt:
This is the output of hyperfine:
Please feel free to correct anything, this is my first time working with async and tokio in rust.
It would be helpful if you posted your feeds.txt, so i could test it more thoroughly. Thanks.