```diff
@@ -231,8 +231,8 @@ Open the `products.csv` file we created in the lesson using a spreadsheet applic
 Let's use [Google Sheets](https://www.google.com/sheets/about/), which is free to use. After logging in with a Google account:
 
 1. Go to **File > Import**, choose **Upload**, and select the file. Import the data using the default settings. You should see a table with all the data.
-2. Select the header row. Go to **Data > Create filter**.
-3. Use the filter icon that appears next to `minPrice`. Choose **Filter by condition**, select **Greater than**, and enter **500** in the text field. Confirm the dialog. You should see only the filtered data.
+1. Select the header row. Go to **Data > Create filter**.
+1. Use the filter icon that appears next to `minPrice`. Choose **Filter by condition**, select **Greater than**, and enter **500** in the text field. Confirm the dialog. You should see only the filtered data.
 
 ![CSV in Google Sheets](images/csv-sheets.png)
```
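The lesson step above does the filtering in the Google Sheets UI. As a cross-check, the same `minPrice > 500` condition can be verified programmatically. Here is a minimal sketch using Python's standard `csv` module, assuming (as in the lesson) that `products.csv` is in the working directory and that its `minPrice` values parse as numbers:

```python
import csv

# Minimal sketch: print only the rows where minPrice exceeds 500.
# Assumes the minPrice column holds values that parse as numbers.
with open("products.csv", newline="", encoding="utf-8") as file:
    for row in csv.DictReader(file):
        if float(row["minPrice"]) > 500:
            print(row)
```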
```diff
@@ -215,8 +215,8 @@ Open the `products.csv` file we created in the lesson using a spreadsheet applic
 Let's use [Google Sheets](https://www.google.com/sheets/about/), which is free to use. After logging in with a Google account:
 
 1. Go to **File > Import**, choose **Upload**, and select the file. Import the data using the default settings. You should see a table with all the data.
-2. Select the header row. Go to **Data > Create filter**.
-3. Use the filter icon that appears next to `min_price`. Choose **Filter by condition**, select **Greater than**, and enter **500** in the text field. Confirm the dialog. You should see only the filtered data.
+1. Select the header row. Go to **Data > Create filter**.
+1. Use the filter icon that appears next to `min_price`. Choose **Filter by condition**, select **Greater than**, and enter **500** in the text field. Confirm the dialog. You should see only the filtered data.
 
 ![CSV in Google Sheets](images/csv-sheets.png)
```
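This second file is the variant of the same lesson where the column is named `min_price`. The same filter can be sketched with pandas; using pandas here is our choice for illustration, not something the lesson prescribes:

```python
import pandas as pd

# Same filter as above, for the lesson variant whose column is min_price.
products = pd.read_csv("products.csv")
print(products[products["min_price"] > 500])
```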
```diff
@@ -66,12 +66,12 @@ if __name__ == '__main__':
 In the code, we do the following:
 
 1. We import the necessary modules and define an asynchronous `main()` function.
-2. Inside `main()`, we first create a crawler object, which manages the scraping process. In this case, it's a crawler based on Beautiful Soup.
-3. Next, we define a nested asynchronous function called `handle_listing()`. It receives a `context` parameter, and Python type hints show it's of type `BeautifulSoupCrawlingContext`. Type hints help editors suggest what we can do with the object.
-4. We use a Python decorator (the line starting with `@`) to register `handle_listing()` as the _default handler_ for processing HTTP responses.
-5. Inside the handler, we extract the page title from the `soup` object and print its text without whitespace.
-6. At the end of the function, we run the crawler on a product listing URL and await its completion.
-7. The last two lines ensure that if the file is executed directly, Python will properly run the `main()` function using its asynchronous event loop.
+1. Inside `main()`, we first create a crawler object, which manages the scraping process. In this case, it's a crawler based on Beautiful Soup.
+1. Next, we define a nested asynchronous function called `handle_listing()`. It receives a `context` parameter, and Python type hints show it's of type `BeautifulSoupCrawlingContext`. Type hints help editors suggest what we can do with the object.
+1. We use a Python decorator (the line starting with `@`) to register `handle_listing()` as the _default handler_ for processing HTTP responses.
+1. Inside the handler, we extract the page title from the `soup` object and print its text without whitespace.
+1. At the end of the function, we run the crawler on a product listing URL and await its completion.
+1. The last two lines ensure that if the file is executed directly, Python will properly run the `main()` function using its asynchronous event loop.
 
 Don't worry if some of this is new. We don't need to know exactly how [`asyncio`](https://docs.python.org/3/library/asyncio.html), decorators, or type hints work. Let's stick to the practical side and observe what the program does when executed:
 
```
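The hunk above shows only the explanation, not the program it walks through. A minimal reconstruction of such a crawler, assuming a recent Crawlee for Python release where the crawler classes are importable from `crawlee.crawlers` (the listing URL below is a placeholder, not the lesson's actual target):

```python
import asyncio

from crawlee.crawlers import BeautifulSoupCrawler, BeautifulSoupCrawlingContext


async def main():
    # A crawler object based on Beautiful Soup manages the scraping process.
    crawler = BeautifulSoupCrawler()

    # The decorator registers handle_listing() as the default handler
    # for processing HTTP responses.
    @crawler.router.default_handler
    async def handle_listing(context: BeautifulSoupCrawlingContext):
        # Extract the page title from the soup object and print its
        # text without surrounding whitespace.
        if context.soup.title:
            print(context.soup.title.text.strip())

    # Run the crawler on a product listing URL and await its completion.
    # Placeholder URL, not the lesson's target.
    await crawler.run(['https://example.com/listing'])


if __name__ == '__main__':
    # When executed directly, run main() on asyncio's event loop.
    asyncio.run(main())
```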