Add scraping functions#21
Conversation
donaldong
left a comment
There was a problem hiding this comment.
It's a good start!
To make it better, I think it would be valuable to provide some unit tests along with this script.
For example, assert(get_points('donaldong') > 100) would be true, get_points('a') should raise Error404, get_points('xiaowuc1') should raise ErrorInvalid because Nick Wu is not at CSUMB.
Maybe we sould move this file as backend/scoreboard/kattis_scrapping.py. Then we can have backend/scoreboard/test_kattis_scrapping.py, then the test will be automatically discovered by Django. Here's more information: https://docs.djangoproject.com/en/2.1/topics/testing/overview/
| @@ -0,0 +1,86 @@ | |||
| import requests | |||
There was a problem hiding this comment.
I think this is a duplicated file, perhaps checked in by accident.
| @@ -0,0 +1,2 @@ | |||
| from django.test import TestCase | |||
|
|
|||
There was a problem hiding this comment.
Looks like this file is still under construction?
There was a problem hiding this comment.
I probably should have left that off the commit. I wanted to show that I hadn't ignored the test cases, but its a work in progress.
There was a problem hiding this comment.
Oh btw, I think the file(s) are still in the scripts/ dir.
I think it would better to place them in backend/scoreboard because those functions will be used by the backend API.
There was a problem hiding this comment.
oh, yes, I agree. Another woops on my part when I placed the folder
There was a problem hiding this comment.
I think that's fixed now
There was a problem hiding this comment.
Cool! Thanks for adding the tests!
Please rebase on top of the master branch, and do a force push to your master branch. Now we have the python linter, I think there are some lint errors here.
We're missing some docstrings. Here is an example: the test I wrote earlier https://github.com/progteam/parrot/blob/7865b948f6d7a8973c4f982c404f6c522842a4d2/backend/test_redirect_404.py
|
|
||
|
|
||
| class KattisScrapeTestCase(TestCase): | ||
| def setUp(self): |
There was a problem hiding this comment.
If there's nothing to do here, we don't have to have this method.
| try: | ||
| kattis_scraping.get_points("ryan-beck") | ||
| except kattis_scraping.Error404: | ||
| self.fail() |
There was a problem hiding this comment.
I think it's better to assert it gonna give us as a valid number
|
|
||
| def test_get_points_not_throws_exception_invalid(self): | ||
| try: | ||
| kattis_scraping.get_points("ryan-beck") |
There was a problem hiding this comment.
This would be a duplicate test case. I think we can combine them.
|
|
||
| def test_get_rank_not_throws_exception_404(self): | ||
| try: | ||
| kattis_scraping.get_rank("ryan-beck") |
There was a problem hiding this comment.
Similar to the previous comment.
|
|
||
| def test_get_rank_throws_exception_invalid(self): | ||
| with self.assertRaises(kattis_scraping.ErrorInvalid): | ||
| kattis_scraping.get_rank("xiaowuc1") |
There was a problem hiding this comment.
Maybe it's more readable to group get_points and get_rank together?
I'm thinking about test_scraping_throws_invalid and test_scraping_throws_404.
| from bs4 import BeautifulSoup | ||
|
|
||
|
|
||
| class Error(Exception): |
There was a problem hiding this comment.
Hmm... This is a little redundant to me
|
Oh, another thing to consider: |
|
I think |
|
The previous bug is caused by missing cities since we're identifying the school name by the index for the anchor tag. |
Implemented two functions for retrieving data from Kattis: - get_score() scrapes the score of the profile that matches passed user_handle - get_rank() scrapes the rank of the profile that matches passed user_handle Both functions use is_student() to determine if passed profile is from CSUMB. PR: #21 Resolves: #17
Implemented two functions for retrieving data from Kattis: - get_score() scrapes the score of the profile that matches passed user_handle - get_rank() scrapes the rank of the profile that matches passed user_handle Both functions use is_student() to determine if passed profile is from CSUMB. PR: #21 Resolves: #17
|
Fixed these issues and pushed. |
Implemented two functions for retrieving data from Kattis:
get_score() scrapes the score of the profile that matches passed user_handle
get_rank() scrapes the rank of the profile that matches passed user_handle
Both functions use is_student() to determine if passed profile is from CSUMB.