Add browser header for podcast extraction #2514
Copilot encountered an error and was unable to review this pull request. You can try again by re-requesting a review.
Pull Request Overview
Copilot reviewed 1 out of 1 changed files in this pull request and generated 1 comment.
"User-Agent": "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_10_1) "
"AppleWebKit/537.36 (KHTML, like Gecko) "
"Chrome/39.0.2171.95 Safari/537.36"
Copilot AI, Sep 17, 2025
The user-agent string references Chrome version 39.0.2171.95 from 2014, which is extremely outdated. Consider using a more recent user-agent string to avoid potential blocking by servers that filter out very old browsers.
Suggested change:
- "User-Agent": "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_10_1) "
- "AppleWebKit/537.36 (KHTML, like Gecko) "
- "Chrome/39.0.2171.95 Safari/537.36"
+ "User-Agent": "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) "
+ "AppleWebKit/537.36 (KHTML, like Gecko) "
+ "Chrome/120.0.0.0 Safari/537.36"
LGTM
@abeglova Any advice on verifying this guy after merge prior to release? It looks like
Yeah, after this gets to RC you should log in to the RC web server and follow the steps you described for local testing to test it on RC. No need to wait until the job runs.
What are the relevant tickets?
Fixes https://github.com/mitodl/hq/issues/4803
Description (What does it do?)
Adds a browser user agent to the extract task so we can access the lockthequill rss feed.
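As a rough illustration of the change described above, here is a minimal sketch of attaching a browser-style User-Agent to a feed request. It uses the standard library's urllib rather than whatever HTTP client the extract task actually uses (the PR page does not show that), and the names BROWSER_HEADERS and build_feed_request are hypothetical. The header value is the one quoted in the review comment.

```python
from urllib.request import Request

# Header value as quoted in the review thread; the constant name is hypothetical.
BROWSER_HEADERS = {
    "User-Agent": (
        "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_10_1) "
        "AppleWebKit/537.36 (KHTML, like Gecko) "
        "Chrome/39.0.2171.95 Safari/537.36"
    )
}


def build_feed_request(rss_url: str) -> Request:
    """Return a GET request for rss_url that carries the browser User-Agent."""
    return Request(rss_url, headers=BROWSER_HEADERS)


if __name__ == "__main__":
    request = build_feed_request("https://lockthequill.buzzsprout.com/")
    # urllib normalizes header names with str.capitalize(), hence "User-agent".
    print(request.get_header("User-agent"))
```

Building the request is separate from sending it here, so the header can be inspected without any network access.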
How can this be tested?
Set OPEN_PODCAST_DATA_BRANCH=master and GITHUB_ACCESS_TOKEN to your access token value in backend.local.env. This will allow you to pull the config values from the yaml files in https://github.com/mitodl/open-podcast-data/tree/master.
Run ./manage.py backpopulate_podcast_data from your web container.
Observe that the https://lockthequill.buzzsprout.com/ resource is present. If it hits an HTTPError, it will simply skip processing that entry.
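The skip-on-HTTPError behavior mentioned in the last step can be sketched roughly as follows. This is an assumption-laden illustration, not the project's extract task: extract_feeds and the injected fetch callable are hypothetical, and urllib.error.HTTPError stands in for whichever HTTPError class the real client raises.

```python
from typing import Callable, Iterable
from urllib.error import HTTPError


def extract_feeds(
    rss_urls: Iterable[str], fetch: Callable[[str], bytes]
) -> dict[str, bytes]:
    """Fetch each feed, skipping any entry whose request raises HTTPError."""
    results: dict[str, bytes] = {}
    for url in rss_urls:
        try:
            results[url] = fetch(url)
        except HTTPError:
            # Skip this entry and carry on with the remaining feeds.
            continue
    return results


def fake_fetch(url: str) -> bytes:
    """Stand-in fetcher for the demo: rejects any URL containing 'blocked'."""
    if "blocked" in url:
        raise HTTPError(url, 403, "Forbidden", None, None)
    return b"<rss/>"


if __name__ == "__main__":
    feeds = extract_feeds(
        ["https://a.example/ok", "https://a.example/blocked"], fake_fetch
    )
    print(sorted(feeds))
```

With this shape, a feed that rejects the request is simply absent from the result, which is why the test step checks that the lockthequill resource is present.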
Additional Questions
I noticed after running backpopulate_podcast_data --delete (in order to ensure a clean testing environment) that I started getting a LOT of errors out of learning_resources_search.tasks.upsert_learning_resource on my worker. Is that expected?