diff --git a/sources/academy/webscraping/scraping_basics_javascript2/01_devtools_inspecting.md b/sources/academy/webscraping/scraping_basics_javascript2/01_devtools_inspecting.md index 6b14e144e3..2540bfd21b 100644 --- a/sources/academy/webscraping/scraping_basics_javascript2/01_devtools_inspecting.md +++ b/sources/academy/webscraping/scraping_basics_javascript2/01_devtools_inspecting.md @@ -144,7 +144,13 @@ We're not here for playing around with elements, though—we want to create a sc ### Find FIFA logo -Open the [FIFA website](https://www.fifa.com/) and use the DevTools to figure out the URL of FIFA's logo image file. Hint: You're looking for an [`img`](https://developer.mozilla.org/en-US/docs/Web/HTML/Element/img) element with a `src` attribute. +Open the [FIFA website](https://www.fifa.com/) and use the DevTools to figure out the URL of FIFA's logo image file. + +:::tip Need a nudge? + +You're looking for an [`img`](https://developer.mozilla.org/en-US/docs/Web/HTML/Element/img) element with a `src` attribute. + +:::
Solution diff --git a/sources/academy/webscraping/scraping_basics_javascript2/02_devtools_locating_elements.md b/sources/academy/webscraping/scraping_basics_javascript2/02_devtools_locating_elements.md index f148552fcb..0796418c9e 100644 --- a/sources/academy/webscraping/scraping_basics_javascript2/02_devtools_locating_elements.md +++ b/sources/academy/webscraping/scraping_basics_javascript2/02_devtools_locating_elements.md @@ -188,7 +188,11 @@ Go to Shein's [Jewelry & Accessories](https://shein.com/RecommendSelection/Jewel Go to Guardian's [page about F1](https://www.theguardian.com/sport/formulaone). Use the **Console** to find all HTML elements representing the articles. -Hint: Learn about the [descendant combinator](https://developer.mozilla.org/en-US/docs/Web/CSS/Descendant_combinator). +:::tip Need a nudge? + +Learn about the [descendant combinator](https://developer.mozilla.org/en-US/docs/Web/CSS/Descendant_combinator). + +::: ![Articles on Guardian's page about F1](./images/devtools-exercise-guardian1.png) diff --git a/sources/academy/webscraping/scraping_basics_javascript2/06_locating_elements.md b/sources/academy/webscraping/scraping_basics_javascript2/06_locating_elements.md index 9b210d5274..09101ee358 100644 --- a/sources/academy/webscraping/scraping_basics_javascript2/06_locating_elements.md +++ b/sources/academy/webscraping/scraping_basics_javascript2/06_locating_elements.md @@ -273,11 +273,17 @@ Djibouti ### Use CSS selectors to their max -Simplify the code from previous exercise. Use a single for loop and a single CSS selector. You may want to check out the following pages: +Simplify the code from previous exercise. Use a single for loop and a single CSS selector. + +:::tip Need a nudge? + +You may want to check out the following pages: - [Descendant combinator](https://developer.mozilla.org/en-US/docs/Web/CSS/Descendant_combinator) - [`:nth-child()` pseudo-class](https://developer.mozilla.org/en-US/docs/Web/CSS/:nth-child) +::: +
Solution diff --git a/sources/academy/webscraping/scraping_basics_javascript2/07_extracting_data.md b/sources/academy/webscraping/scraping_basics_javascript2/07_extracting_data.md index 7ca821eef8..e7b81e9450 100644 --- a/sources/academy/webscraping/scraping_basics_javascript2/07_extracting_data.md +++ b/sources/academy/webscraping/scraping_basics_javascript2/07_extracting_data.md @@ -349,13 +349,15 @@ Hamilton reveals distress over ‘devastating’ groundhog accident at Canadian ... ``` -Hints: +:::tip Need a nudge? - HTML's `time` element can have an attribute `datetime`, which [contains data in a machine-readable format](https://developer.mozilla.org/en-US/docs/Web/HTML/Element/time), such as the ISO 8601. - Cheerio gives you [.attr()](https://cheerio.js.org/docs/api/classes/Cheerio#attr) to access attributes. - In JavaScript you can use an ISO 8601 string to create a [`Date`](https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/Date) object. - To get the date, you can call `.toDateString()` on `Date` objects. +::: +
Solution diff --git a/sources/academy/webscraping/scraping_basics_javascript2/10_crawling.md b/sources/academy/webscraping/scraping_basics_javascript2/10_crawling.md index 513873f98a..43c386b93c 100644 --- a/sources/academy/webscraping/scraping_basics_javascript2/10_crawling.md +++ b/sources/academy/webscraping/scraping_basics_javascript2/10_crawling.md @@ -226,7 +226,11 @@ https://en.wikipedia.org/wiki/Cameroon +237 ... ``` -Hint: Locating cells in tables is sometimes easier if you know how to [filter](https://cheerio.js.org/docs/api/classes/Cheerio#filter) or [navigate up](https://cheerio.js.org/docs/api/classes/Cheerio#parent) in the HTML element tree. +:::tip Need a nudge? + +Locating cells in tables is sometimes easier if you know how to [filter](https://cheerio.js.org/docs/api/classes/Cheerio#filter) or [navigate up](https://cheerio.js.org/docs/api/classes/Cheerio#parent) in the HTML element tree. + +:::
Solution @@ -290,11 +294,13 @@ PA Media: Lewis Hamilton reveals lifelong battle with depression after school bu ... ``` -Hints: +:::tip Need a nudge? - You can use [attribute selectors](https://developer.mozilla.org/en-US/docs/Web/CSS/Attribute_selectors) to select HTML elements based on their attribute values. - Sometimes a person authors the article, but other times it's contributed by a news agency. +::: +
Solution diff --git a/sources/academy/webscraping/scraping_basics_javascript2/12_framework.md b/sources/academy/webscraping/scraping_basics_javascript2/12_framework.md index bfa205fb40..bc43ea0508 100644 --- a/sources/academy/webscraping/scraping_basics_javascript2/12_framework.md +++ b/sources/academy/webscraping/scraping_basics_javascript2/12_framework.md @@ -410,11 +410,13 @@ If you export the dataset as JSON, it should look something like this: ] ``` -Hints: +:::tip Need a nudge? - The website uses `DD/MM/YYYY` format for the date of birth. You'll need to change the format to the ISO 8601 standard with dashes: `YYYY-MM-DD` - To locate the Instagram URL, use the attribute selector `a[href*='instagram']`. Learn more about attribute selectors in the [MDN docs](https://developer.mozilla.org/en-US/docs/Web/CSS/Attribute_selectors). +::: +
Solution @@ -503,8 +505,12 @@ async requestHandler({ ..., addRequests }) { }, ``` +:::tip Need a nudge? + When navigating to the first IMDb search result, you might find it helpful to know that `enqueueLinks()` accepts a `limit` option, letting you specify the max number of HTTP requests to enqueue. +::: +
Solution diff --git a/sources/academy/webscraping/scraping_basics_python/01_devtools_inspecting.md b/sources/academy/webscraping/scraping_basics_python/01_devtools_inspecting.md index 81a62bb5ea..0332766a62 100644 --- a/sources/academy/webscraping/scraping_basics_python/01_devtools_inspecting.md +++ b/sources/academy/webscraping/scraping_basics_python/01_devtools_inspecting.md @@ -143,7 +143,13 @@ We're not here for playing around with elements, though—we want to create a sc ### Find FIFA logo -Open the [FIFA website](https://www.fifa.com/) and use the DevTools to figure out the URL of FIFA's logo image file. Hint: You're looking for an [`img`](https://developer.mozilla.org/en-US/docs/Web/HTML/Element/img) element with a `src` attribute. +Open the [FIFA website](https://www.fifa.com/) and use the DevTools to figure out the URL of FIFA's logo image file. + +:::tip Need a nudge? + +You're looking for an [`img`](https://developer.mozilla.org/en-US/docs/Web/HTML/Element/img) element with a `src` attribute. + +:::
Solution diff --git a/sources/academy/webscraping/scraping_basics_python/02_devtools_locating_elements.md b/sources/academy/webscraping/scraping_basics_python/02_devtools_locating_elements.md index 3a77ec607e..154c7d1a19 100644 --- a/sources/academy/webscraping/scraping_basics_python/02_devtools_locating_elements.md +++ b/sources/academy/webscraping/scraping_basics_python/02_devtools_locating_elements.md @@ -189,7 +189,11 @@ Go to Shein's [Jewelry & Accessories](https://shein.com/RecommendSelection/Jewel Go to Guardian's [page about F1](https://www.theguardian.com/sport/formulaone). Use the **Console** to find all HTML elements representing the articles. -Hint: Learn about the [descendant combinator](https://developer.mozilla.org/en-US/docs/Web/CSS/Descendant_combinator). +:::tip Need a nudge? + +Learn about the [descendant combinator](https://developer.mozilla.org/en-US/docs/Web/CSS/Descendant_combinator). + +::: ![Articles on Guardian's page about F1](./images/devtools-exercise-guardian1.png) diff --git a/sources/academy/webscraping/scraping_basics_python/06_locating_elements.md b/sources/academy/webscraping/scraping_basics_python/06_locating_elements.md index 4193c0b139..fa8a38fc6d 100644 --- a/sources/academy/webscraping/scraping_basics_python/06_locating_elements.md +++ b/sources/academy/webscraping/scraping_basics_python/06_locating_elements.md @@ -270,11 +270,17 @@ Djibouti ### Use CSS selectors to their max -Simplify the code from previous exercise. Use a single for loop and a single CSS selector. You may want to check out the following pages: +Simplify the code from previous exercise. Use a single for loop and a single CSS selector. + +:::tip Need a nudge? + +You may want to check out the following pages: - [Descendant combinator](https://developer.mozilla.org/en-US/docs/Web/CSS/Descendant_combinator) - [`:nth-child()` pseudo-class](https://developer.mozilla.org/en-US/docs/Web/CSS/:nth-child) +::: +
Solution diff --git a/sources/academy/webscraping/scraping_basics_python/07_extracting_data.md b/sources/academy/webscraping/scraping_basics_python/07_extracting_data.md index 47023acd0c..01814edde9 100644 --- a/sources/academy/webscraping/scraping_basics_python/07_extracting_data.md +++ b/sources/academy/webscraping/scraping_basics_python/07_extracting_data.md @@ -326,13 +326,15 @@ Hamilton reveals distress over ‘devastating’ groundhog accident at Canadian ... ``` -Hints: +:::tip Need a nudge? - HTML's `time` element can have an attribute `datetime`, which [contains data in a machine-readable format](https://developer.mozilla.org/en-US/docs/Web/HTML/Element/time), such as the ISO 8601. - Beautiful Soup gives you [access to attributes as if they were dictionary keys](https://beautiful-soup-4.readthedocs.io/en/latest/#attributes). - In Python you can create `datetime` objects using `datetime.fromisoformat()`, a [built-in method for parsing ISO 8601 strings](https://docs.python.org/3/library/datetime.html#datetime.datetime.fromisoformat). - To get the date, you can call `.strftime('%a %b %d %Y')` on `datetime` objects. +::: +
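The date-related hints in the tip above can be sketched with the standard library alone. This is a minimal illustration, not the exercise's solution: the function name `format_publish_date` and the sample attribute value are assumptions, and reading the attribute from a Beautiful Soup element is left out.

```python
from datetime import datetime


def format_publish_date(datetime_attr):
    """Turn an ISO 8601 string into a short date such as 'Fri Jun 07 2024'."""
    # fromisoformat() accepts offsets like +00:00; a trailing "Z"
    # is only accepted from Python 3.11 onwards.
    published_on = datetime.fromisoformat(datetime_attr)
    return published_on.strftime('%a %b %d %Y')


# Hypothetical value of a `datetime` attribute on a `time` element
print(format_publish_date("2024-06-07T15:33:00+00:00"))
```

Note that `%a` and `%b` follow the active locale, so the day and month names are English only under the default locale.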
Solution diff --git a/sources/academy/webscraping/scraping_basics_python/10_crawling.md b/sources/academy/webscraping/scraping_basics_python/10_crawling.md index a48ddea3a5..5605fff180 100644 --- a/sources/academy/webscraping/scraping_basics_python/10_crawling.md +++ b/sources/academy/webscraping/scraping_basics_python/10_crawling.md @@ -205,7 +205,11 @@ https://en.wikipedia.org/wiki/Cameroon +237 ... ``` -Hint: Locating cells in tables is sometimes easier if you know how to [navigate up](https://beautiful-soup-4.readthedocs.io/en/latest/index.html#going-up) in the HTML element soup. +:::tip Need a nudge? + +Locating cells in tables is sometimes easier if you know how to [navigate up](https://beautiful-soup-4.readthedocs.io/en/latest/index.html#going-up) in the HTML element soup. + +:::
Solution @@ -258,11 +262,13 @@ PA Media: Lewis Hamilton reveals lifelong battle with depression after school bu ... ``` -Hints: +:::tip Need a nudge? - You can use [attribute selectors](https://developer.mozilla.org/en-US/docs/Web/CSS/Attribute_selectors) to select HTML elements based on their attribute values. - Sometimes a person authors the article, but other times it's contributed by a news agency. +::: +
Solution diff --git a/sources/academy/webscraping/scraping_basics_python/11_scraping_variants.md b/sources/academy/webscraping/scraping_basics_python/11_scraping_variants.md index f183a926bb..cdd3496af6 100644 --- a/sources/academy/webscraping/scraping_basics_python/11_scraping_variants.md +++ b/sources/academy/webscraping/scraping_basics_python/11_scraping_variants.md @@ -331,8 +331,12 @@ Your output should look something like this: ... ``` +:::tip Need a nudge? + You can find everything you need for working with dates and times in Python's [`datetime`](https://docs.python.org/3/library/datetime.html) module, including `date.today()`, `datetime.fromisoformat()`, `datetime.date()`, and `timedelta()`. +::: +
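One way the `datetime` pieces named in the tip above might fit together is sketched below. The helper `is_recent`, its parameters, and the sample timestamps are all hypothetical; the sketch only shows how `date.today()`, `datetime.fromisoformat()`, `.date()`, and `timedelta()` combine to test whether a timestamp falls within a number of days.

```python
from datetime import date, datetime, timedelta


def is_recent(iso_timestamp, days, today=None):
    """Return True if the timestamp is at most `days` days before `today`."""
    # Allow injecting `today` so the check is reproducible in tests.
    today = today or date.today()
    # Drop the time part, because only whole days matter here.
    posted_on = datetime.fromisoformat(iso_timestamp).date()
    return today - posted_on <= timedelta(days=days)


# Hypothetical sample values
print(is_recent("2024-05-01T12:00:00+00:00", 60, today=date(2024, 6, 7)))
print(is_recent("2024-01-01T00:00:00+00:00", 60, today=date(2024, 6, 7)))
```

Passing `today` explicitly is a design choice for this sketch only; calling `is_recent(timestamp, 60)` falls back to the current date.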
Solution diff --git a/sources/academy/webscraping/scraping_basics_python/12_framework.md b/sources/academy/webscraping/scraping_basics_python/12_framework.md index 3a7f70660b..714f200333 100644 --- a/sources/academy/webscraping/scraping_basics_python/12_framework.md +++ b/sources/academy/webscraping/scraping_basics_python/12_framework.md @@ -453,11 +453,13 @@ If you export the dataset as JSON, it should look something like this: ] ``` -Hints: +:::tip Need a nudge? - Use Python's `datetime.strptime(text, "%d/%m/%Y").date()` to parse dates in the `DD/MM/YYYY` format. Check out the [docs](https://docs.python.org/3/library/datetime.html#datetime.datetime.strptime) for more details. - To locate the Instagram URL, use the attribute selector `a[href*='instagram']`. Learn more about attribute selectors in the [MDN docs](https://developer.mozilla.org/en-US/docs/Web/CSS/Attribute_selectors). +::: +
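The `strptime()` call from the tip above can be tried in isolation. In this minimal sketch the function name `to_iso_date` and the sample input are assumptions; it shows that a `DD/MM/YYYY` string parsed this way can be rendered as `YYYY-MM-DD` with `isoformat()`.

```python
from datetime import datetime


def to_iso_date(text):
    """Convert a DD/MM/YYYY string to an ISO 8601 date string."""
    # strptime() returns a datetime; .date() drops the midnight time part.
    parsed = datetime.strptime(text, "%d/%m/%Y").date()
    return parsed.isoformat()


# Hypothetical date of birth as it might appear on the page
print(to_iso_date("07/01/1985"))  # → 1985-01-07
```

Input that doesn't match the format raises `ValueError`, which is worth keeping in mind if some pages omit the date.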
Solution @@ -553,8 +555,12 @@ async def main(): ... ``` +:::tip Need a nudge? + When navigating to the first IMDb search result, you might find it helpful to know that `context.enqueue_links()` accepts a `limit` keyword argument, letting you specify the max number of HTTP requests to enqueue. +::: +
Solution