New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

unbundle requests and use requests.get instead of urllib.urlretrieve #5138

Merged
merged 3 commits into from Jun 7, 2016

Conversation

Projects
None yet
2 participants
@benjaoming
Member

benjaoming commented Jun 6, 2016

Summary

Fixes issues with m2crypto not able to handle https connections.

TODO

If not all TODOs are marked, this PR is considered WIP (work in progress)

  • Have tests been written for the new code? If you're fixing a bug, write a regression test (or have a really good reason for not writing one... and I mean really good!)
  • Has documentation been written/updated?
  • New dependencies (if any) added to requirements file

Reviewer guidance

If there's any concern about including this PR in 0.16.6, please state it in the discussion, but I would caution that the old download method was probably less reliable than using the latest version of requests @jamalex

Issues addressed

#5126

@benjaoming benjaoming added the bug label Jun 6, 2016

@benjaoming benjaoming added this to the 0.16.6 milestone Jun 6, 2016

@benjaoming benjaoming added the has PR label Jun 6, 2016

# (this is done in this file because putting it in __init__.py causes circular imports,
# and putting it in urls.py doesn't get loaded by management commands)
base_headers = requests.defaults.defaults["base_headers"]
base_headers["User-Agent"] = ("ka-lite/%s " % VERSION) + base_headers["User-Agent"]

This comment has been minimized.

@benjaoming

benjaoming Jun 6, 2016

Member

What components of KA Lite should report the version in the user agent? Monkey-patching requests is impossible in recent versions as the default headers aren't accessible but hard-coded @jamalex

@benjaoming

benjaoming Jun 6, 2016

Member

What components of KA Lite should report the version in the user agent? Monkey-patching requests is impossible in recent versions as the default headers aren't accessible but hard-coded @jamalex

This comment has been minimized.

@jamalex

jamalex Jun 6, 2016

Member

This is to help with diagnosing errors (and calculating version stats) based on our log files on the central servers. Please see if there's some alternative way to set the default, even if it involves some monkey-patching.

@jamalex

jamalex Jun 6, 2016

Member

This is to help with diagnosing errors (and calculating version stats) based on our log files on the central servers. Please see if there's some alternative way to set the default, even if it involves some monkey-patching.

This comment has been minimized.

@benjaoming

benjaoming Jun 6, 2016

Member

I can't set the default, but I can add the headers explicitly where we use requests. I'm guessing this is only relevant for requesting pointing to our own servers?

@benjaoming

benjaoming Jun 6, 2016

Member

I can't set the default, but I can add the headers explicitly where we use requests. I'm guessing this is only relevant for requesting pointing to our own servers?

This comment has been minimized.

@jamalex

jamalex Jun 6, 2016

Member

Yeah, only relevant for stuff going to our servers.

@jamalex

jamalex Jun 6, 2016

Member

Yeah, only relevant for stuff going to our servers.

@benjaoming

This comment has been minimized.

Show comment
Hide comment
@benjaoming

benjaoming Jun 6, 2016

Member

This is where the requests library is used currently, I don't see any reason of concern, but should we change the User-Agent on any of these requests @jamalex ?

kalite/distributed/static/js/distributed/perseus/ke/build/update_title_tags.py
9:import requests
13:    r = requests.get("http://www.khanacademy.org/api/internal/exercises")

kalite/distributed/static/js/distributed/perseus/ke/build/close-old-issues.py
10:import requests
39:        r = requests.get(url, auth=github_auth)
52:    r = requests.get("https://api.github.com/repos/Khan/khan-exercises/issues/"
65:    r = requests.post("https://api.github.com/repos/Khan/khan-exercises/"
73:    except requests.HTTPError:
83:    r = requests.post("https://api.github.com/repos/Khan/khan-exercises/"
91:    except requests.HTTPError:

kalite/i18n/management/commands/create_dummy_language_pack.py
13:import requests
57:        resp = requests.get(url)
59:    except requests.ConnectionError as e:

kalite/i18n/management/commands/setup_in_context_translations.py
5:import requests
40:        resp = requests.get(url)

kalite/i18n/tests/create_dummy_language_pack_tests.py
1:import requests
17:    @patch.object(requests, "get", autospec=True)

kalite/static/js/distributed/perseus/ke/build/update_title_tags.py
9:import requests
13:    r = requests.get("http://www.khanacademy.org/api/internal/exercises")

kalite/static/js/distributed/perseus/ke/build/close-old-issues.py
10:import requests
39:        r = requests.get(url, auth=github_auth)
52:    r = requests.get("https://api.github.com/repos/Khan/khan-exercises/issues/"
65:    r = requests.post("https://api.github.com/repos/Khan/khan-exercises/"
73:    except requests.HTTPError:
83:    r = requests.post("https://api.github.com/repos/Khan/khan-exercises/"
91:    except requests.HTTPError:

kalite/contentload/management/commands/unpack_assessment_zip.py
1:import requests
47:            r = requests.get(ziplocation, prefetch=False)

kalite/contentload/tests/unpack_assessment_zip.py
4:import requests
52:    @patch.object(requests, "get")
60:        self.assertEqual(get_method.call_count, 0, "requests.get was called even if we should've skipped!")
66:    @patch.object(requests, "get", autospec=True)

Member

benjaoming commented Jun 6, 2016

This is where the requests library is used currently, I don't see any reason of concern, but should we change the User-Agent on any of these requests @jamalex ?

kalite/distributed/static/js/distributed/perseus/ke/build/update_title_tags.py
9:import requests
13:    r = requests.get("http://www.khanacademy.org/api/internal/exercises")

kalite/distributed/static/js/distributed/perseus/ke/build/close-old-issues.py
10:import requests
39:        r = requests.get(url, auth=github_auth)
52:    r = requests.get("https://api.github.com/repos/Khan/khan-exercises/issues/"
65:    r = requests.post("https://api.github.com/repos/Khan/khan-exercises/"
73:    except requests.HTTPError:
83:    r = requests.post("https://api.github.com/repos/Khan/khan-exercises/"
91:    except requests.HTTPError:

kalite/i18n/management/commands/create_dummy_language_pack.py
13:import requests
57:        resp = requests.get(url)
59:    except requests.ConnectionError as e:

kalite/i18n/management/commands/setup_in_context_translations.py
5:import requests
40:        resp = requests.get(url)

kalite/i18n/tests/create_dummy_language_pack_tests.py
1:import requests
17:    @patch.object(requests, "get", autospec=True)

kalite/static/js/distributed/perseus/ke/build/update_title_tags.py
9:import requests
13:    r = requests.get("http://www.khanacademy.org/api/internal/exercises")

kalite/static/js/distributed/perseus/ke/build/close-old-issues.py
10:import requests
39:        r = requests.get(url, auth=github_auth)
52:    r = requests.get("https://api.github.com/repos/Khan/khan-exercises/issues/"
65:    r = requests.post("https://api.github.com/repos/Khan/khan-exercises/"
73:    except requests.HTTPError:
83:    r = requests.post("https://api.github.com/repos/Khan/khan-exercises/"
91:    except requests.HTTPError:

kalite/contentload/management/commands/unpack_assessment_zip.py
1:import requests
47:            r = requests.get(ziplocation, prefetch=False)

kalite/contentload/tests/unpack_assessment_zip.py
4:import requests
52:    @patch.object(requests, "get")
60:        self.assertEqual(get_method.call_count, 0, "requests.get was called even if we should've skipped!")
66:    @patch.object(requests, "get", autospec=True)

@benjaoming

This comment has been minimized.

Show comment
Hide comment
@benjaoming

benjaoming Jun 6, 2016

Member

Confirmed working on Raspberry Pi with m2crypto 0.21.1 installed as a dependency of ka-lite-raspberry-pi.

Member

benjaoming commented Jun 6, 2016

Confirmed working on Raspberry Pi with m2crypto 0.21.1 installed as a dependency of ka-lite-raspberry-pi.

response = None
last_error = None
for retries in range(1, 1 + max_retries):

This comment has been minimized.

@jamalex

jamalex Jun 6, 2016

Member

You seem to have deleted all the error handling code here? That stuff was pretty critical, added to handle very common cases of 1) connections dropping out, and 2) video URLs actually being 404's, despite being in the KA API. We need to keep these checks, as otherwise it will become very brittle.

@jamalex

jamalex Jun 6, 2016

Member

You seem to have deleted all the error handling code here? That stuff was pretty critical, added to handle very common cases of 1) connections dropping out, and 2) video URLs actually being 404's, despite being in the KA API. We need to keep these checks, as otherwise it will become very brittle.

This comment has been minimized.

@benjaoming

benjaoming Jun 6, 2016

Member

make_request isn't called anywhere!

@benjaoming

benjaoming Jun 6, 2016

Member

make_request isn't called anywhere!

This comment has been minimized.

@jamalex

jamalex Jun 6, 2016

Member

Ah, cool, looks like it was made redundant at some point but never cleaned up... until now.

@jamalex

jamalex Jun 6, 2016

Member

Ah, cool, looks like it was made redundant at some point but never cleaned up... until now.

@jamalex

This comment has been minimized.

Show comment
Hide comment
@jamalex

jamalex Jun 6, 2016

Member

This is where the requests library is used currently, I don't see any reason of concern, but should we change the User-Agent on any of these requests @jamalex ?

Most of the ones you list there don't seem important (as they're not hitting our central server, so we won't be getting logs with the user agent in it for those), but that doesn't seem to be an exhaustive list. Here are some that are more important:
https://github.com/learningequality/ka-lite/blob/develop/python-packages/fle_utils/internet/functions.py#L22
https://github.com/learningequality/ka-lite/blob/develop/python-packages/securesync/api_client.py#L14
https://github.com/learningequality/ka-lite/blob/develop/python-packages/securesync/devices/views.py#L100
(I know these are under python-packages, but these aren't actually external python packages; that code is only used in the context of KA Lite)

Might be good to pull out this header calculation logic into a reusable function:
https://github.com/learningequality/ka-lite/pull/5138/files#diff-5e841025f58a948893e991da20eeebaaR64
(though not sure the best way to avoid having securesync and fle_utils depend on kalite...)

Member

jamalex commented Jun 6, 2016

This is where the requests library is used currently, I don't see any reason of concern, but should we change the User-Agent on any of these requests @jamalex ?

Most of the ones you list there don't seem important (as they're not hitting our central server, so we won't be getting logs with the user agent in it for those), but that doesn't seem to be an exhaustive list. Here are some that are more important:
https://github.com/learningequality/ka-lite/blob/develop/python-packages/fle_utils/internet/functions.py#L22
https://github.com/learningequality/ka-lite/blob/develop/python-packages/securesync/api_client.py#L14
https://github.com/learningequality/ka-lite/blob/develop/python-packages/securesync/devices/views.py#L100
(I know these are under python-packages, but these aren't actually external python packages; that code is only used in the context of KA Lite)

Might be good to pull out this header calculation logic into a reusable function:
https://github.com/learningequality/ka-lite/pull/5138/files#diff-5e841025f58a948893e991da20eeebaaR64
(though not sure the best way to avoid having securesync and fle_utils depend on kalite...)

"Content-Length" not in response or
not str(len(open(thumb_filepath, "rb").read())) == response["Content-Length"]):
"content-length" not in response.headers or
not len(open(thumb_filepath, "rb").read()) == int(response.headers['content-length'])):

This comment has been minimized.

@benjaoming

benjaoming Jun 6, 2016

Member

Expand and look at line 55-68, that's where all the error handling essentially takes place, even a catch-all-exceptions block :/

@benjaoming

benjaoming Jun 6, 2016

Member

Expand and look at line 55-68, that's where all the error handling essentially takes place, even a catch-all-exceptions block :/

This comment has been minimized.

@jamalex

jamalex Jun 6, 2016

Member

Cool, looks like it should cover the bases, though it seems it doesn't do "retries" at all, but that wasn't due to your changes. We maintain a list of failed_youtube_ids, but don't use it for anything other than getting a count of failures. Not ideal, but not a big problem.

@jamalex

jamalex Jun 6, 2016

Member

Cool, looks like it should cover the bases, though it seems it doesn't do "retries" at all, but that wasn't due to your changes. We maintain a list of failed_youtube_ids, but don't use it for anything other than getting a count of failures. Not ideal, but not a big problem.

@benjaoming

This comment has been minimized.

Show comment
Hide comment
@benjaoming

benjaoming Jun 6, 2016

Member

(though not sure the best way to avoid having securesync and fle_utils depend on kalite...)

They already did when kalite was monkey-patching requests :)

Member

benjaoming commented Jun 6, 2016

(though not sure the best way to avoid having securesync and fle_utils depend on kalite...)

They already did when kalite was monkey-patching requests :)

@benjaoming

This comment has been minimized.

Show comment
Hide comment
@benjaoming

benjaoming Jun 6, 2016

Member

Thanks for searching through fle_utils btw, adding that change as well.

Member

benjaoming commented Jun 6, 2016

Thanks for searching through fle_utils btw, adding that change as well.

@jamalex

This comment has been minimized.

Show comment
Hide comment
@jamalex

jamalex Jun 6, 2016

Member

They already did when kalite was monkey-patching requests :)

Kind of soft dependency; implicit, and optional. They would still function fine without kalite. More like a "hook" style "dependency". :D

Member

jamalex commented Jun 6, 2016

They already did when kalite was monkey-patching requests :)

Kind of soft dependency; implicit, and optional. They would still function fine without kalite. More like a "hook" style "dependency". :D

@jamalex

This comment has been minimized.

Show comment
Hide comment
@jamalex

jamalex Jun 6, 2016

Member

Thanks for searching through fle_utils btw, adding that change as well.

Cool -- there were a bunch of other uses of requests in there, but they didn't seem so critical. The ones I listed are the places that hit the central server in ways we may want to examine the logs for.

Member

jamalex commented Jun 6, 2016

Thanks for searching through fle_utils btw, adding that change as well.

Cool -- there were a bunch of other uses of requests in there, but they didn't seem so critical. The ones I listed are the places that hit the central server in ways we may want to examine the logs for.

@benjaoming

This comment has been minimized.

Show comment
Hide comment
@benjaoming

benjaoming Jun 6, 2016

Member

They would still function fine without kalite.

If the intention was to have correct user agent stats, you would be failing but much more severely than a circular import, you'd have wrong stats and think they were right! So all this functionality really depends on the monkey patch firing.

Member

benjaoming commented Jun 6, 2016

They would still function fine without kalite.

If the intention was to have correct user agent stats, you would be failing but much more severely than a circular import, you'd have wrong stats and think they were right! So all this functionality really depends on the monkey patch firing.

@benjaoming

This comment has been minimized.

Show comment
Hide comment
@benjaoming
Member

benjaoming commented Jun 6, 2016

@jamalex done in 5884508

@jamalex

This comment has been minimized.

Show comment
Hide comment
@jamalex

jamalex Jun 7, 2016

Member

Ok, looks good to me! Merging now, but please make sure data syncing also still works (with a basic check -- registering and creating a local record and syncing it up, with no errors), in addition to the video downloading, as both depend on requests.

Member

jamalex commented Jun 7, 2016

Ok, looks good to me! Merging now, but please make sure data syncing also still works (with a basic check -- registering and creating a local record and syncing it up, with no errors), in addition to the video downloading, as both depend on requests.

@jamalex jamalex merged commit 1164dc6 into learningequality:0.16.x Jun 7, 2016

1 check passed

ci/circleci Your tests passed on CircleCI!
Details

@jamalex jamalex removed the has PR label Jun 7, 2016

@benjaoming

This comment has been minimized.

Show comment
Hide comment
@benjaoming

benjaoming Jun 8, 2016

Member

@jamalex

Tested:

  • One-click Device registration OK
  • Sync'ing OK
  • Video download OK

screenshot from 2016-06-08 13 43 24

Member

benjaoming commented Jun 8, 2016

@jamalex

Tested:

  • One-click Device registration OK
  • Sync'ing OK
  • Video download OK

screenshot from 2016-06-08 13 43 24

@jamalex

This comment has been minimized.

Show comment
Hide comment
@jamalex

jamalex Jun 8, 2016

Member

💥 👍 ⚡️

Member

jamalex commented Jun 8, 2016

💥 👍 ⚡️

@benjaoming benjaoming deleted the benjaoming:m2crypto-dl branch Jun 9, 2016

@benjaoming benjaoming referenced this pull request Jul 26, 2016

Closed

Fix Unclean shutdown in OS X installer. #5212

14 of 14 tasks complete

@benjaoming benjaoming referenced this pull request Jul 27, 2016

Merged

Disable unpack assessment zip command tests #5219

3 of 3 tasks complete
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment