These elements aren't visible to regular visitors. They're there just in case JavaScript doesn't work.
Using our knowledge of Beautiful Soup, we can locate the options and extract the data we need:

```py
listing_url = "https://warehouse-theme-metal.myshopify.com/collections/sales"
listing_soup = download(listing_url)

for product in listing_soup.select(".product-item"):
    ...  # item parsing collapsed in the diff
    if product.select(".product-form__option.no-js option"):
        ...  # variant handling collapsed in the diff
    else:
        item["variant_name"] = None
    data.append(item)
```

The CSS selector `.product-form__option.no-js` matches elements with both `product-form__option` and `no-js` classes. Then we're using the [descendant combinator](https://developer.mozilla.org/en-US/docs/Web/CSS/Descendant_combinator) to match all `option` elements somewhere inside the `.product-form__option.no-js` wrapper.
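As a small, self-contained illustration of the combined class selector plus the descendant combinator (the HTML below is a made-up fragment, not the real store markup):

```python
from bs4 import BeautifulSoup

html = """
<div class="product-form__option no-js">
  <select>
    <option>Red</option>
    <option>Blue</option>
  </select>
</div>
<div class="product-form__option">
  <select><option>Ignored</option></select>
</div>
"""

soup = BeautifulSoup(html, "html.parser")
# ".product-form__option.no-js" requires BOTH classes on the same element,
# so the second <div> doesn't match. The space (descendant combinator) then
# matches <option> tags at any depth inside the matched wrapper.
variants = [option.text for option in soup.select(".product-form__option.no-js option")]
print(variants)  # ['Red', 'Blue']
```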
To scrape IMDb data, you'll need to construct a `Request` object with the appropriate search URL for each movie title. The following code snippet gives you an idea of how to do this:

```py
from urllib.parse import quote_plus

async def main():
    ...  # crawler setup and request building collapsed in the diff
    await context.add_requests(requests)
    ...
```
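The `quote_plus` import is the key piece: it percent-encodes a movie title so it can be embedded safely in a query string. A minimal sketch of building a search URL this way (the exact query-parameter layout of IMDb's search endpoint is an assumption here):

```python
from urllib.parse import quote_plus

def imdb_search_url(title: str) -> str:
    # quote_plus() encodes spaces as "+" and escapes special characters,
    # so arbitrary movie titles are safe to put in a query string.
    # The "/find/?q=...&s=tt" layout is an assumed IMDb search URL shape.
    return f"https://www.imdb.com/find/?q={quote_plus(title)}&s=tt"

print(imdb_search_url("Fight Club"))
# → https://www.imdb.com/find/?q=Fight+Club&s=tt
```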

When navigating to the first search result, you might find it helpful to know that `context.enqueue_links()` accepts a `limit` keyword argument, letting you specify the max number of HTTP requests to enqueue.