From 920efd7a9559cb14b5dd367910e5f9a4fe67a58d Mon Sep 17 00:00:00 2001
From: Honza Javorek
Date: Thu, 4 Sep 2025 11:12:39 +0200
Subject: [PATCH 1/2] feat: use the tip admonition for exercise hints

---
 .../01_devtools_inspecting.md | 8 +++++++-
 .../02_devtools_locating_elements.md | 6 +++++-
 .../06_locating_elements.md | 8 +++++++-
 .../scraping_basics_javascript2/07_extracting_data.md | 4 +++-
 .../scraping_basics_javascript2/10_crawling.md | 10 ++++++++--
 .../scraping_basics_javascript2/12_framework.md | 8 +++++++-
 .../scraping_basics_python/01_devtools_inspecting.md | 8 +++++++-
 .../02_devtools_locating_elements.md | 6 +++++-
 .../scraping_basics_python/06_locating_elements.md | 8 +++++++-
 .../scraping_basics_python/07_extracting_data.md | 4 +++-
 .../webscraping/scraping_basics_python/10_crawling.md | 10 ++++++++--
 .../scraping_basics_python/11_scraping_variants.md | 4 ++++
 .../webscraping/scraping_basics_python/12_framework.md | 8 +++++++-
 13 files changed, 78 insertions(+), 14 deletions(-)

diff --git a/sources/academy/webscraping/scraping_basics_javascript2/01_devtools_inspecting.md b/sources/academy/webscraping/scraping_basics_javascript2/01_devtools_inspecting.md
index 6b14e144e3..d1ae82dfc6 100644
--- a/sources/academy/webscraping/scraping_basics_javascript2/01_devtools_inspecting.md
+++ b/sources/academy/webscraping/scraping_basics_javascript2/01_devtools_inspecting.md
@@ -144,7 +144,13 @@ We're not here for playing around with elements, though—we want to create a sc
 ### Find FIFA logo
-Open the [FIFA website](https://www.fifa.com/) and use the DevTools to figure out the URL of FIFA's logo image file. Hint: You're looking for an [`img`](https://developer.mozilla.org/en-US/docs/Web/HTML/Element/img) element with a `src` attribute.
+Open the [FIFA website](https://www.fifa.com/) and use the DevTools to figure out the URL of FIFA's logo image file.
+
+:::tip Hint
+
+You're looking for an [`img`](https://developer.mozilla.org/en-US/docs/Web/HTML/Element/img) element with a `src` attribute.
+
+:::
 Solution
diff --git a/sources/academy/webscraping/scraping_basics_javascript2/02_devtools_locating_elements.md b/sources/academy/webscraping/scraping_basics_javascript2/02_devtools_locating_elements.md
index f148552fcb..ac29662770 100644
--- a/sources/academy/webscraping/scraping_basics_javascript2/02_devtools_locating_elements.md
+++ b/sources/academy/webscraping/scraping_basics_javascript2/02_devtools_locating_elements.md
@@ -188,7 +188,11 @@ Go to Shein's [Jewelry & Accessories](https://shein.com/RecommendSelection/Jewel
 Go to Guardian's [page about F1](https://www.theguardian.com/sport/formulaone). Use the **Console** to find all HTML elements representing the articles.
-Hint: Learn about the [descendant combinator](https://developer.mozilla.org/en-US/docs/Web/CSS/Descendant_combinator).
+:::tip Hint
+
+Learn about the [descendant combinator](https://developer.mozilla.org/en-US/docs/Web/CSS/Descendant_combinator).
+
+:::
 ![Articles on Guardian's page about F1](./images/devtools-exercise-guardian1.png)
diff --git a/sources/academy/webscraping/scraping_basics_javascript2/06_locating_elements.md b/sources/academy/webscraping/scraping_basics_javascript2/06_locating_elements.md
index 9b210d5274..451da415ea 100644
--- a/sources/academy/webscraping/scraping_basics_javascript2/06_locating_elements.md
+++ b/sources/academy/webscraping/scraping_basics_javascript2/06_locating_elements.md
@@ -273,11 +273,17 @@ Djibouti
 ### Use CSS selectors to their max
-Simplify the code from previous exercise. Use a single for loop and a single CSS selector. You may want to check out the following pages:
+Simplify the code from previous exercise. Use a single for loop and a single CSS selector.
+
+:::tip Hints
+
+You may want to check out the following pages:
 - [Descendant combinator](https://developer.mozilla.org/en-US/docs/Web/CSS/Descendant_combinator)
 - [`:nth-child()` pseudo-class](https://developer.mozilla.org/en-US/docs/Web/CSS/:nth-child)
+:::
+
 Solution
diff --git a/sources/academy/webscraping/scraping_basics_javascript2/07_extracting_data.md b/sources/academy/webscraping/scraping_basics_javascript2/07_extracting_data.md
index 7ca821eef8..bd511ca2a6 100644
--- a/sources/academy/webscraping/scraping_basics_javascript2/07_extracting_data.md
+++ b/sources/academy/webscraping/scraping_basics_javascript2/07_extracting_data.md
@@ -349,13 +349,15 @@ Hamilton reveals distress over ‘devastating’ groundhog accident at Canadian
 ...
 ```
-Hints:
+:::tip Hints
 - HTML's `time` element can have an attribute `datetime`, which [contains data in a machine-readable format](https://developer.mozilla.org/en-US/docs/Web/HTML/Element/time), such as the ISO 8601.
 - Cheerio gives you [.attr()](https://cheerio.js.org/docs/api/classes/Cheerio#attr) to access attributes.
 - In JavaScript you can use an ISO 8601 string to create a [`Date`](https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/Date) object.
 - To get the date, you can call `.toDateString()` on `Date` objects.
+:::
+
 Solution
diff --git a/sources/academy/webscraping/scraping_basics_javascript2/10_crawling.md b/sources/academy/webscraping/scraping_basics_javascript2/10_crawling.md
index 513873f98a..7ef838b678 100644
--- a/sources/academy/webscraping/scraping_basics_javascript2/10_crawling.md
+++ b/sources/academy/webscraping/scraping_basics_javascript2/10_crawling.md
@@ -226,7 +226,11 @@ https://en.wikipedia.org/wiki/Cameroon +237
 ...
 ```
-Hint: Locating cells in tables is sometimes easier if you know how to [filter](https://cheerio.js.org/docs/api/classes/Cheerio#filter) or [navigate up](https://cheerio.js.org/docs/api/classes/Cheerio#parent) in the HTML element tree.
+:::tip Hint
+
+Locating cells in tables is sometimes easier if you know how to [filter](https://cheerio.js.org/docs/api/classes/Cheerio#filter) or [navigate up](https://cheerio.js.org/docs/api/classes/Cheerio#parent) in the HTML element tree.
+
+:::
 Solution
@@ -290,11 +294,13 @@ PA Media: Lewis Hamilton reveals lifelong battle with depression after school bu
 ...
 ```
-Hints:
+:::tip Hints
 - You can use [attribute selectors](https://developer.mozilla.org/en-US/docs/Web/CSS/Attribute_selectors) to select HTML elements based on their attribute values.
 - Sometimes a person authors the article, but other times it's contributed by a news agency.
+:::
+
 Solution
diff --git a/sources/academy/webscraping/scraping_basics_javascript2/12_framework.md b/sources/academy/webscraping/scraping_basics_javascript2/12_framework.md
index bfa205fb40..aadc29177a 100644
--- a/sources/academy/webscraping/scraping_basics_javascript2/12_framework.md
+++ b/sources/academy/webscraping/scraping_basics_javascript2/12_framework.md
@@ -410,11 +410,13 @@ If you export the dataset as JSON, it should look something like this:
 ]
 ```
-Hints:
+:::tip Hints
 - The website uses `DD/MM/YYYY` format for the date of birth. You'll need to change the format to the ISO 8601 standard with dashes: `YYYY-MM-DD`
 - To locate the Instagram URL, use the attribute selector `a[href*='instagram']`. Learn more about attribute selectors in the [MDN docs](https://developer.mozilla.org/en-US/docs/Web/CSS/Attribute_selectors).
+:::
+
 Solution
@@ -503,8 +505,12 @@ async requestHandler({ ..., addRequests }) {
 },
 ```
+:::tip Hint
+
 When navigating to the first IMDb search result, you might find it helpful to know that `enqueueLinks()` accepts a `limit` option, letting you specify the max number of HTTP requests to enqueue.
+:::
+
 Solution
diff --git a/sources/academy/webscraping/scraping_basics_python/01_devtools_inspecting.md b/sources/academy/webscraping/scraping_basics_python/01_devtools_inspecting.md
index 81a62bb5ea..be85fddd00 100644
--- a/sources/academy/webscraping/scraping_basics_python/01_devtools_inspecting.md
+++ b/sources/academy/webscraping/scraping_basics_python/01_devtools_inspecting.md
@@ -143,7 +143,13 @@ We're not here for playing around with elements, though—we want to create a sc
 ### Find FIFA logo
-Open the [FIFA website](https://www.fifa.com/) and use the DevTools to figure out the URL of FIFA's logo image file. Hint: You're looking for an [`img`](https://developer.mozilla.org/en-US/docs/Web/HTML/Element/img) element with a `src` attribute.
+Open the [FIFA website](https://www.fifa.com/) and use the DevTools to figure out the URL of FIFA's logo image file.
+
+:::tip Hint
+
+You're looking for an [`img`](https://developer.mozilla.org/en-US/docs/Web/HTML/Element/img) element with a `src` attribute.
+
+:::
 Solution
diff --git a/sources/academy/webscraping/scraping_basics_python/02_devtools_locating_elements.md b/sources/academy/webscraping/scraping_basics_python/02_devtools_locating_elements.md
index 3a77ec607e..860a6c6fa8 100644
--- a/sources/academy/webscraping/scraping_basics_python/02_devtools_locating_elements.md
+++ b/sources/academy/webscraping/scraping_basics_python/02_devtools_locating_elements.md
@@ -189,7 +189,11 @@ Go to Shein's [Jewelry & Accessories](https://shein.com/RecommendSelection/Jewel
 Go to Guardian's [page about F1](https://www.theguardian.com/sport/formulaone). Use the **Console** to find all HTML elements representing the articles.
-Hint: Learn about the [descendant combinator](https://developer.mozilla.org/en-US/docs/Web/CSS/Descendant_combinator).
+:::tip Hint
+
+Learn about the [descendant combinator](https://developer.mozilla.org/en-US/docs/Web/CSS/Descendant_combinator).
+
+:::
 ![Articles on Guardian's page about F1](./images/devtools-exercise-guardian1.png)
diff --git a/sources/academy/webscraping/scraping_basics_python/06_locating_elements.md b/sources/academy/webscraping/scraping_basics_python/06_locating_elements.md
index 4193c0b139..63de75a0c1 100644
--- a/sources/academy/webscraping/scraping_basics_python/06_locating_elements.md
+++ b/sources/academy/webscraping/scraping_basics_python/06_locating_elements.md
@@ -270,11 +270,17 @@ Djibouti
 ### Use CSS selectors to their max
-Simplify the code from previous exercise. Use a single for loop and a single CSS selector. You may want to check out the following pages:
+Simplify the code from previous exercise. Use a single for loop and a single CSS selector.
+
+:::tip Hints
+
+You may want to check out the following pages:
 - [Descendant combinator](https://developer.mozilla.org/en-US/docs/Web/CSS/Descendant_combinator)
 - [`:nth-child()` pseudo-class](https://developer.mozilla.org/en-US/docs/Web/CSS/:nth-child)
+:::
+
 Solution
diff --git a/sources/academy/webscraping/scraping_basics_python/07_extracting_data.md b/sources/academy/webscraping/scraping_basics_python/07_extracting_data.md
index 47023acd0c..1eda6a0746 100644
--- a/sources/academy/webscraping/scraping_basics_python/07_extracting_data.md
+++ b/sources/academy/webscraping/scraping_basics_python/07_extracting_data.md
@@ -326,13 +326,15 @@ Hamilton reveals distress over ‘devastating’ groundhog accident at Canadian
 ...
 ```
-Hints:
+:::tip Hints
 - HTML's `time` element can have an attribute `datetime`, which [contains data in a machine-readable format](https://developer.mozilla.org/en-US/docs/Web/HTML/Element/time), such as the ISO 8601.
 - Beautiful Soup gives you [access to attributes as if they were dictionary keys](https://beautiful-soup-4.readthedocs.io/en/latest/#attributes).
 - In Python you can create `datetime` objects using `datetime.fromisoformat()`, a [built-in method for parsing ISO 8601 strings](https://docs.python.org/3/library/datetime.html#datetime.datetime.fromisoformat).
 - To get the date, you can call `.strftime('%a %b %d %Y')` on `datetime` objects.
+:::
+
 Solution
diff --git a/sources/academy/webscraping/scraping_basics_python/10_crawling.md b/sources/academy/webscraping/scraping_basics_python/10_crawling.md
index a48ddea3a5..1bd037a3a7 100644
--- a/sources/academy/webscraping/scraping_basics_python/10_crawling.md
+++ b/sources/academy/webscraping/scraping_basics_python/10_crawling.md
@@ -205,7 +205,11 @@ https://en.wikipedia.org/wiki/Cameroon +237
 ...
 ```
-Hint: Locating cells in tables is sometimes easier if you know how to [navigate up](https://beautiful-soup-4.readthedocs.io/en/latest/index.html#going-up) in the HTML element soup.
+:::tip Hint
+
+Locating cells in tables is sometimes easier if you know how to [navigate up](https://beautiful-soup-4.readthedocs.io/en/latest/index.html#going-up) in the HTML element soup.
+
+:::
 Solution
@@ -258,11 +262,13 @@ PA Media: Lewis Hamilton reveals lifelong battle with depression after school bu
 ...
 ```
-Hints:
+:::tip Hints
 - You can use [attribute selectors](https://developer.mozilla.org/en-US/docs/Web/CSS/Attribute_selectors) to select HTML elements based on their attribute values.
 - Sometimes a person authors the article, but other times it's contributed by a news agency.
+:::
+
 Solution
diff --git a/sources/academy/webscraping/scraping_basics_python/11_scraping_variants.md b/sources/academy/webscraping/scraping_basics_python/11_scraping_variants.md
index f183a926bb..0f8237f173 100644
--- a/sources/academy/webscraping/scraping_basics_python/11_scraping_variants.md
+++ b/sources/academy/webscraping/scraping_basics_python/11_scraping_variants.md
@@ -331,8 +331,12 @@ Your output should look something like this:
 ...
 ```
+:::tip Hint
+
 You can find everything you need for working with dates and times in Python's [`datetime`](https://docs.python.org/3/library/datetime.html) module, including `date.today()`, `datetime.fromisoformat()`, `datetime.date()`, and `timedelta()`.
+:::
+
 Solution
diff --git a/sources/academy/webscraping/scraping_basics_python/12_framework.md b/sources/academy/webscraping/scraping_basics_python/12_framework.md
index 3a7f70660b..f96db0ed5b 100644
--- a/sources/academy/webscraping/scraping_basics_python/12_framework.md
+++ b/sources/academy/webscraping/scraping_basics_python/12_framework.md
@@ -453,11 +453,13 @@ If you export the dataset as JSON, it should look something like this:
 ]
 ```
-Hints:
+:::tip Hints
 - Use Python's `datetime.strptime(text, "%d/%m/%Y").date()` to parse dates in the `DD/MM/YYYY` format. Check out the [docs](https://docs.python.org/3/library/datetime.html#datetime.datetime.strptime) for more details.
 - To locate the Instagram URL, use the attribute selector `a[href*='instagram']`. Learn more about attribute selectors in the [MDN docs](https://developer.mozilla.org/en-US/docs/Web/CSS/Attribute_selectors).
+:::
+
 Solution
@@ -553,8 +555,12 @@ async def main():
 ...
 ```
+:::tip Hint
+
 When navigating to the first IMDb search result, you might find it helpful to know that `context.enqueue_links()` accepts a `limit` keyword argument, letting you specify the max number of HTTP requests to enqueue.
+:::
+
 Solution

From 1eb1b634033cf90391c091109748ed58907b5a4b Mon Sep 17 00:00:00 2001
From: Honza Javorek
Date: Thu, 4 Sep 2025 12:08:51 +0200
Subject: [PATCH 2/2] =?UTF-8?q?feat:=20make=20it=20great=20with=20Need=20a?= =?UTF-8?q?=20nudge=3F=E2=84=A2=EF=B8=8F?=
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

---
 .../scraping_basics_javascript2/01_devtools_inspecting.md | 2 +-
 .../02_devtools_locating_elements.md | 2 +-
 .../scraping_basics_javascript2/06_locating_elements.md | 2 +-
 .../scraping_basics_javascript2/07_extracting_data.md | 2 +-
 .../webscraping/scraping_basics_javascript2/10_crawling.md | 4 ++--
 .../webscraping/scraping_basics_javascript2/12_framework.md | 4 ++--
 .../scraping_basics_python/01_devtools_inspecting.md | 2 +-
 .../scraping_basics_python/02_devtools_locating_elements.md | 2 +-
 .../scraping_basics_python/06_locating_elements.md | 2 +-
 .../webscraping/scraping_basics_python/07_extracting_data.md | 2 +-
 .../academy/webscraping/scraping_basics_python/10_crawling.md | 4 ++--
 .../scraping_basics_python/11_scraping_variants.md | 2 +-
 .../webscraping/scraping_basics_python/12_framework.md | 4 ++--
 13 files changed, 17 insertions(+), 17 deletions(-)

diff --git a/sources/academy/webscraping/scraping_basics_javascript2/01_devtools_inspecting.md b/sources/academy/webscraping/scraping_basics_javascript2/01_devtools_inspecting.md
index d1ae82dfc6..2540bfd21b 100644
--- a/sources/academy/webscraping/scraping_basics_javascript2/01_devtools_inspecting.md
+++ b/sources/academy/webscraping/scraping_basics_javascript2/01_devtools_inspecting.md
@@ -146,7 +146,7 @@ We're not here for playing around with elements, though—we want to create a sc
 Open the [FIFA website](https://www.fifa.com/) and use the DevTools to figure out the URL of FIFA's logo image file.
-:::tip Hint
+:::tip Need a nudge?
 You're looking for an [`img`](https://developer.mozilla.org/en-US/docs/Web/HTML/Element/img) element with a `src` attribute.
diff --git a/sources/academy/webscraping/scraping_basics_javascript2/02_devtools_locating_elements.md b/sources/academy/webscraping/scraping_basics_javascript2/02_devtools_locating_elements.md
index ac29662770..0796418c9e 100644
--- a/sources/academy/webscraping/scraping_basics_javascript2/02_devtools_locating_elements.md
+++ b/sources/academy/webscraping/scraping_basics_javascript2/02_devtools_locating_elements.md
@@ -188,7 +188,7 @@ Go to Shein's [Jewelry & Accessories](https://shein.com/RecommendSelection/Jewel
 Go to Guardian's [page about F1](https://www.theguardian.com/sport/formulaone). Use the **Console** to find all HTML elements representing the articles.
-:::tip Hint
+:::tip Need a nudge?
 Learn about the [descendant combinator](https://developer.mozilla.org/en-US/docs/Web/CSS/Descendant_combinator).
diff --git a/sources/academy/webscraping/scraping_basics_javascript2/06_locating_elements.md b/sources/academy/webscraping/scraping_basics_javascript2/06_locating_elements.md
index 451da415ea..09101ee358 100644
--- a/sources/academy/webscraping/scraping_basics_javascript2/06_locating_elements.md
+++ b/sources/academy/webscraping/scraping_basics_javascript2/06_locating_elements.md
@@ -275,7 +275,7 @@ Djibouti
 Simplify the code from previous exercise. Use a single for loop and a single CSS selector.
-:::tip Hints
+:::tip Need a nudge?
 You may want to check out the following pages:
diff --git a/sources/academy/webscraping/scraping_basics_javascript2/07_extracting_data.md b/sources/academy/webscraping/scraping_basics_javascript2/07_extracting_data.md
index bd511ca2a6..e7b81e9450 100644
--- a/sources/academy/webscraping/scraping_basics_javascript2/07_extracting_data.md
+++ b/sources/academy/webscraping/scraping_basics_javascript2/07_extracting_data.md
@@ -349,7 +349,7 @@ Hamilton reveals distress over ‘devastating’ groundhog accident at Canadian
 ...
 ```
-:::tip Hints
+:::tip Need a nudge?
 - HTML's `time` element can have an attribute `datetime`, which [contains data in a machine-readable format](https://developer.mozilla.org/en-US/docs/Web/HTML/Element/time), such as the ISO 8601.
 - Cheerio gives you [.attr()](https://cheerio.js.org/docs/api/classes/Cheerio#attr) to access attributes.
diff --git a/sources/academy/webscraping/scraping_basics_javascript2/10_crawling.md b/sources/academy/webscraping/scraping_basics_javascript2/10_crawling.md
index 7ef838b678..43c386b93c 100644
--- a/sources/academy/webscraping/scraping_basics_javascript2/10_crawling.md
+++ b/sources/academy/webscraping/scraping_basics_javascript2/10_crawling.md
@@ -226,7 +226,7 @@ https://en.wikipedia.org/wiki/Cameroon +237
 ...
 ```
-:::tip Hint
+:::tip Need a nudge?
 Locating cells in tables is sometimes easier if you know how to [filter](https://cheerio.js.org/docs/api/classes/Cheerio#filter) or [navigate up](https://cheerio.js.org/docs/api/classes/Cheerio#parent) in the HTML element tree.
@@ -294,7 +294,7 @@ PA Media: Lewis Hamilton reveals lifelong battle with depression after school bu
 ...
 ```
-:::tip Hints
+:::tip Need a nudge?
 - You can use [attribute selectors](https://developer.mozilla.org/en-US/docs/Web/CSS/Attribute_selectors) to select HTML elements based on their attribute values.
 - Sometimes a person authors the article, but other times it's contributed by a news agency.
diff --git a/sources/academy/webscraping/scraping_basics_javascript2/12_framework.md b/sources/academy/webscraping/scraping_basics_javascript2/12_framework.md
index aadc29177a..bc43ea0508 100644
--- a/sources/academy/webscraping/scraping_basics_javascript2/12_framework.md
+++ b/sources/academy/webscraping/scraping_basics_javascript2/12_framework.md
@@ -410,7 +410,7 @@ If you export the dataset as JSON, it should look something like this:
 ]
 ```
-:::tip Hints
+:::tip Need a nudge?
 - The website uses `DD/MM/YYYY` format for the date of birth. You'll need to change the format to the ISO 8601 standard with dashes: `YYYY-MM-DD`
 - To locate the Instagram URL, use the attribute selector `a[href*='instagram']`. Learn more about attribute selectors in the [MDN docs](https://developer.mozilla.org/en-US/docs/Web/CSS/Attribute_selectors).
@@ -505,7 +505,7 @@ async requestHandler({ ..., addRequests }) {
 },
 ```
-:::tip Hint
+:::tip Need a nudge?
 When navigating to the first IMDb search result, you might find it helpful to know that `enqueueLinks()` accepts a `limit` option, letting you specify the max number of HTTP requests to enqueue.
diff --git a/sources/academy/webscraping/scraping_basics_python/01_devtools_inspecting.md b/sources/academy/webscraping/scraping_basics_python/01_devtools_inspecting.md
index be85fddd00..0332766a62 100644
--- a/sources/academy/webscraping/scraping_basics_python/01_devtools_inspecting.md
+++ b/sources/academy/webscraping/scraping_basics_python/01_devtools_inspecting.md
@@ -145,7 +145,7 @@ We're not here for playing around with elements, though—we want to create a sc
 Open the [FIFA website](https://www.fifa.com/) and use the DevTools to figure out the URL of FIFA's logo image file.
-:::tip Hint
+:::tip Need a nudge?
 You're looking for an [`img`](https://developer.mozilla.org/en-US/docs/Web/HTML/Element/img) element with a `src` attribute.
diff --git a/sources/academy/webscraping/scraping_basics_python/02_devtools_locating_elements.md b/sources/academy/webscraping/scraping_basics_python/02_devtools_locating_elements.md
index 860a6c6fa8..154c7d1a19 100644
--- a/sources/academy/webscraping/scraping_basics_python/02_devtools_locating_elements.md
+++ b/sources/academy/webscraping/scraping_basics_python/02_devtools_locating_elements.md
@@ -189,7 +189,7 @@ Go to Shein's [Jewelry & Accessories](https://shein.com/RecommendSelection/Jewel
 Go to Guardian's [page about F1](https://www.theguardian.com/sport/formulaone). Use the **Console** to find all HTML elements representing the articles.
-:::tip Hint
+:::tip Need a nudge?
 Learn about the [descendant combinator](https://developer.mozilla.org/en-US/docs/Web/CSS/Descendant_combinator).
diff --git a/sources/academy/webscraping/scraping_basics_python/06_locating_elements.md b/sources/academy/webscraping/scraping_basics_python/06_locating_elements.md
index 63de75a0c1..fa8a38fc6d 100644
--- a/sources/academy/webscraping/scraping_basics_python/06_locating_elements.md
+++ b/sources/academy/webscraping/scraping_basics_python/06_locating_elements.md
@@ -272,7 +272,7 @@ Djibouti
 Simplify the code from previous exercise. Use a single for loop and a single CSS selector.
-:::tip Hints
+:::tip Need a nudge?
 You may want to check out the following pages:
diff --git a/sources/academy/webscraping/scraping_basics_python/07_extracting_data.md b/sources/academy/webscraping/scraping_basics_python/07_extracting_data.md
index 1eda6a0746..01814edde9 100644
--- a/sources/academy/webscraping/scraping_basics_python/07_extracting_data.md
+++ b/sources/academy/webscraping/scraping_basics_python/07_extracting_data.md
@@ -326,7 +326,7 @@ Hamilton reveals distress over ‘devastating’ groundhog accident at Canadian
 ...
 ```
-:::tip Hints
+:::tip Need a nudge?
 - HTML's `time` element can have an attribute `datetime`, which [contains data in a machine-readable format](https://developer.mozilla.org/en-US/docs/Web/HTML/Element/time), such as the ISO 8601.
 - Beautiful Soup gives you [access to attributes as if they were dictionary keys](https://beautiful-soup-4.readthedocs.io/en/latest/#attributes).
diff --git a/sources/academy/webscraping/scraping_basics_python/10_crawling.md b/sources/academy/webscraping/scraping_basics_python/10_crawling.md
index 1bd037a3a7..5605fff180 100644
--- a/sources/academy/webscraping/scraping_basics_python/10_crawling.md
+++ b/sources/academy/webscraping/scraping_basics_python/10_crawling.md
@@ -205,7 +205,7 @@ https://en.wikipedia.org/wiki/Cameroon +237
 ...
 ```
-:::tip Hint
+:::tip Need a nudge?
 Locating cells in tables is sometimes easier if you know how to [navigate up](https://beautiful-soup-4.readthedocs.io/en/latest/index.html#going-up) in the HTML element soup.
@@ -262,7 +262,7 @@ PA Media: Lewis Hamilton reveals lifelong battle with depression after school bu
 ...
 ```
-:::tip Hints
+:::tip Need a nudge?
 - You can use [attribute selectors](https://developer.mozilla.org/en-US/docs/Web/CSS/Attribute_selectors) to select HTML elements based on their attribute values.
 - Sometimes a person authors the article, but other times it's contributed by a news agency.
diff --git a/sources/academy/webscraping/scraping_basics_python/11_scraping_variants.md b/sources/academy/webscraping/scraping_basics_python/11_scraping_variants.md
index 0f8237f173..cdd3496af6 100644
--- a/sources/academy/webscraping/scraping_basics_python/11_scraping_variants.md
+++ b/sources/academy/webscraping/scraping_basics_python/11_scraping_variants.md
@@ -331,7 +331,7 @@ Your output should look something like this:
 ...
 ```
-:::tip Hint
+:::tip Need a nudge?
 You can find everything you need for working with dates and times in Python's [`datetime`](https://docs.python.org/3/library/datetime.html) module, including `date.today()`, `datetime.fromisoformat()`, `datetime.date()`, and `timedelta()`.
diff --git a/sources/academy/webscraping/scraping_basics_python/12_framework.md b/sources/academy/webscraping/scraping_basics_python/12_framework.md
index f96db0ed5b..714f200333 100644
--- a/sources/academy/webscraping/scraping_basics_python/12_framework.md
+++ b/sources/academy/webscraping/scraping_basics_python/12_framework.md
@@ -453,7 +453,7 @@ If you export the dataset as JSON, it should look something like this:
 ]
 ```
-:::tip Hints
+:::tip Need a nudge?
 - Use Python's `datetime.strptime(text, "%d/%m/%Y").date()` to parse dates in the `DD/MM/YYYY` format. Check out the [docs](https://docs.python.org/3/library/datetime.html#datetime.datetime.strptime) for more details.
 - To locate the Instagram URL, use the attribute selector `a[href*='instagram']`. Learn more about attribute selectors in the [MDN docs](https://developer.mozilla.org/en-US/docs/Web/CSS/Attribute_selectors).
@@ -555,7 +555,7 @@ async def main():
 ...
 ```
-:::tip Hint
+:::tip Need a nudge?
 When navigating to the first IMDb search result, you might find it helpful to know that `context.enqueue_links()` accepts a `limit` keyword argument, letting you specify the max number of HTTP requests to enqueue.