In [None]:
# Initialize Otter
import otter
grader = otter.Notebook("hw05.ipynb")

<div class="alert alert-success" markdown="1">

#### Homework 5

# Web Scraping and APIs

### EECS 398-003: Practical Data Science, Fall 2024

#### Due Thursday, October 3rd at 11:59PM
    
</div>

## Instructions

Welcome to Homework 5! In this homework, you will work with and wrangle real-world data from websites & APIs. We'll be writing some HTML, scraping it from the web and parsing it, and making requests to APIs to build structured DataFrames from JSON. See the [Readings section of the Resources tab on the course website](https://practicaldsc.org/resources/#readings) for supplemental resources.

You are given six slip days throughout the semester to extend deadlines. See the [Syllabus](https://practicaldsc.org/syllabus) for more details. With the exception of using slip days, late work will not be accepted unless you have made special arrangements with your instructor.

To access this notebook, you'll need to clone our [public GitHub repository](https://github.com/practicaldsc/fa24/). The [⚙️ Environment Setup](https://practicaldsc.org/env-setup) page on the course website walks you through the necessary steps. Once you're done, you'll submit your completed notebook to Gradescope.

Please start early and submit often. You can submit as many times as you'd like to Gradescope, and we'll grade your **most recent** submission. Remember that the public `grader.check` tests in your notebook are not comprehensive, and that your work will also be graded on hidden test cases on Gradescope after the submission deadline.

This homework is worth a total of **34 points**, 30 of which come from the autograder and 4 of which are for completing our Pre-Midterm Survey (Question 0). The number of points each question is worth is listed at the start of each question. **The four questions in the assignment are independent, so feel free to move around if you get stuck**. Tip: if you're using Jupyter Lab, you can see a Table of Contents for the notebook by going to View > Table of Contents.

<!-- <a name='like-dataframe'>

</a>

<div class="alert alert-warning" markdown="1">
    
**Note**: Throughout this homework, you'll see statements like this frequently:

<blockquote>Complete the implementation of the function ____, which takes in a DataFrame <code>df</code> like <code>other_df</code> and _____.</blockquote>

What this means is that you should assume that `df` has the same number of columns as `other_df`, with the same column titles and data types, but potentially a different number of rows in a different order, with a potentially different index. You should always also assume that `df` has at least one row.

We have you implement functions like this to prevent you from hard-coding your answers to one specific dataset.

</div>
 -->
 
<div class="alert alert-danger" markdown="1">
Unlike in recent homeworks, <tt>for</tt>-loops are <strong>allowed</strong> throughout this entire homework.

</div>

To get started, run the import cell below, plus the cell at the top of the notebook that imports and initializes `otter`.

In [None]:
import os
import pandas as pd
import numpy as np
import requests
from bs4 import BeautifulSoup

## Question 0: Pre-Midterm Survey 📬 <div style="display:inline-block; vertical-align: middle; padding:7px 7px; font-size:10px; font-weight:light; color:white; background-color:#e84c4a; border-radius:7px; text-align:left;">4 Points</div>

We'd like to get your feedback on how the class is going so far, now that we're just over a month in. We've put together a survey that asks you to provide feedback on all aspects of the course. You can provide as much or as little detail as you'd like. We expect it will take 15 minutes to complete.

This survey is **NOT anonymous**, since it's for class credit – it counts towards your score for Homework 5. The responses to the survey will be visible to course staff.

<center><h3>Access the survey <a href="https://docs.google.com/forms/d/e/1FAIpQLSfCT2TfFUWF0gbnfuV_at0bG3w0Za9-KuLIA7cpZm0NL5jbKQ/viewform"><b>here</b></a>.</h3></center>

We will manually add 4 points to your Homework 5 score after the deadline, once we verify that you submitted this survey. Make sure to sign in with your @umich.edu email.

We really appreciate your feedback, thanks! 😊

## Question 1: Practice with HTML Tags 📎 <div style="display:inline-block; vertical-align: middle; padding:7px 7px; font-size:10px; font-weight:light; color:white; background-color:#e84c4a; border-radius:7px; text-align:left;">4 Points</div>

In Question 2, you'll spend plenty of time parsing HTML source code. But before you get your hands dirty trying to extract information from HTML written by other people, it is a good idea to write some basic HTML code yourself. This exercise will help you better understand how the code in a `.html` file is structured.

For this question, you'll create a very basic `.html` file, named `hw05_q01.html`, that satisfies the following conditions:

- It must have `<title>` and `<head>` tags.
- It must also have `<body>` tags. Within the `<body>` tags, it must have:
    - At least two headers.
    - At least three images.
        - At least one image must be a local file.
        - At least one image must be linked to an online source.
        - At least one image must have alternative text that is displayed when the image cannot be loaded.
    - At least three references (hyperlinks) to different web pages.
    - At least one table with two rows and two columns.
    

Make sure to save your file as `hw05_q01.html`. **When submitting this homework to Gradescope, make sure to also upload `hw05_q01.html` along with the _local_ image that you embedded in your site.** You can upload multiple files to Gradescope at a time.
   

Some guidance: 

- You can write and view basic HTML with a Jupyter Notebook, using either a Markdown cell or by using the `IPython.display.HTML` function (which takes in a string of HTML and renders it).
- If you write your HTML code within a Jupyter Notebook, you should later copy your code into a text editor and save it with the `.html` extension. You could also write your HTML in a text editor directly.
- Be sure to open your final `.html` file in a browser and make sure it looks correct on its own.
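
As a rough illustration of the workflow described above, the sketch below builds a bare HTML skeleton as a Python string and saves it with the `.html` extension. The file name `skeleton_example.html` and all of the page contents are placeholders chosen for this example; the skeleton deliberately leaves out the images, links, and table that this question requires.

```python
from pathlib import Path
import tempfile

# A bare skeleton: <head> holds metadata such as <title>,
# and <body> holds the content that is actually displayed.
skeleton = """<!DOCTYPE html>
<html>
  <head>
    <title>Placeholder title</title>
  </head>
  <body>
    <h1>Placeholder heading</h1>
    <p>Placeholder paragraph.</p>
  </body>
</html>
"""

# Write to a temporary folder so this example doesn't clutter your homework directory.
out_path = Path(tempfile.mkdtemp()) / "skeleton_example.html"
out_path.write_text(skeleton, encoding="utf-8")

# In a notebook, IPython.display.HTML(skeleton) would render the same string inline.
print(out_path.name)
```

Opening the saved file in a browser is the quickest way to confirm that the page looks right outside of Jupyter.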

In [None]:
grader.check("q01")

## Question 2: Scraping an Online Bookstore 📚

Browse through the following fake online bookstore: http://books.toscrape.com/. This website is meant for toying with scraping.

By the end of this question, you'll scrape this website, collecting data on all the books that have:
- **_at least_ a four-star rating**, and
- **a price _strictly_ less than £50**, and 
- **belong to specific categories** (more details below). 

This is a multi-step question, which we've broken into several sub-questions to help you organize your work.

### Question 2.1 <div style="display:inline-block; vertical-align: middle; padding:7px 7px; font-size:10px; font-weight:light; color:white; background-color:#e84c4a; border-radius:7px; text-align:left;">4 Points</div>

Complete the implementation of the function `extract_book_links`, which takes in the content of a page that contains book listings as a **string of HTML**, and returns a **list** of URLs of book-specific pages for all books with:
- **_at least_ a four-star rating**, and
-  **a price _strictly_ less than £50**.

Example behavior is given below.

```python
>>> out = extract_book_links(open('data/products.html', encoding='utf-8').read())
>>> len(out)
6

>>> out[1]
'scarlet-the-lunar-chronicles-2_218/index.html'

>>> out[-1]
'ready-player-one_209/index.html'
```

Some guidance:
- The URLs should appear in the order in which they appear in the string of HTML. Additionally, the URLs should be relative: they shouldn't include the base URL, i.e. `'http://books.toscrape.com/catalogue/'`. You'll prepend the base URL when you actually make the requests in Question 2.3.
- Throughout this question, you should use the "Inspect" tool in your browser to view the source code of the pages you're trying to scrape. The public tests for this question are run on the file `data/products.html`, but your code should also work on any page of book listings from https://books.toscrape.com/, e.g. https://books.toscrape.com/catalogue/page-3.html. So, to test your work, you may want to request a few specific pages **outside** of your function; `extract_book_links` itself should not make any requests.
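
If you'd like a refresher on the BeautifulSoup calls involved, here is a small sketch on made-up markup. The tags and class names (`item`, `stars-4`, `cost`) are invented for this example and are **not** the bookstore's actual structure; use the "Inspect" tool to find what the real pages use.

```python
from bs4 import BeautifulSoup

# Hypothetical markup, NOT the bookstore's actual structure.
toy_html = """
<ul>
  <li class="item stars-4"><a href="a/index.html">A</a><span class="cost">$12.50</span></li>
  <li class="item stars-2"><a href="b/index.html">B</a><span class="cost">$8.00</span></li>
</ul>
"""

soup = BeautifulSoup(toy_html, "html.parser")

rows = []
for li in soup.find_all("li", class_="item"):
    link = li.find("a").get("href")    # value of the href attribute
    classes = li.get("class")          # list of class names on the tag
    cost = float(li.find("span", class_="cost").text.strip("$"))  # text -> number
    rows.append((link, classes, cost))

print(rows)
```

Note that `get("class")` returns a list, since a tag can have several classes at once.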

In [None]:
def extract_book_links(text):
    ...

# Feel free to change this input to make sure your function works correctly.
extract_book_links(open('data/products.html', encoding='utf-8').read())

In [None]:
grader.check("q02_1")

### Question 2.2 <div style="display:inline-block; vertical-align: middle; padding:7px 7px; font-size:10px; font-weight:light; color:white; background-color:#e84c4a; border-radius:7px; text-align:left;">5 Points</div>

Complete the implementation of the function `get_product_info`, which takes in the content of a single book-specific page as a **string of HTML**, and a list `categories` of book categories. If the input book is in the list of `categories`, `get_product_info` should return a **dictionary** corresponding to a row in the example DataFrame shown in Question 2.3 (where the keys are the column names and the values are the row values). If the input book is not in the list of `categories`, return `None`.

Example behavior is given below.

```python
>>> html_string = open('data/Frankenstein.html', encoding='utf-8').read()
>>> out = get_product_info(html_string, ['Default'])
>>> type(out)
dict

>>> out.keys()
dict_keys(['UPC', 'Product Type', 'Price (excl. tax)', 'Price (incl. tax)', 'Tax', 'Availability', 'Number of reviews', 'Category', 'Rating', 'Description', 'Title'])

>>> out['Rating']
'Two'

>>> out['Price (incl. tax)']
'£38.00'
```

Some guidance:
- The public tests for this question are run on the file `data/Frankenstein.html`, but your code should also work on any individual book's page from https://books.toscrape.com/, e.g. https://books.toscrape.com/catalogue/sharp-objects_997/index.html. So, to test your work, you may want to request a few specific pages **outside** of your function; `get_product_info` itself should not make any requests.
- Don't worry about the types of the values in your returned dictionary. That is, it's fine if your `'Number of reviews'` value is not stored as type `int`, and it's fine if your price values (e.g. `'Price (incl. tax)'`) are not stored as type `float`.

In [None]:
def get_product_info(text, categories):
    ...

# Feel free to change this input to make sure your function works correctly.
get_product_info(open('data/Frankenstein.html', encoding='utf-8').read(), ['Default'])

In [None]:
grader.check("q02_2")

### Question 2.3 <div style="display:inline-block; vertical-align: middle; padding:7px 7px; font-size:10px; font-weight:light; color:white; background-color:#e84c4a; border-radius:7px; text-align:left;">4 Points</div>

Finally, put everything together. Complete the implementation of the function `scrape_books`, which takes in an integer `k` and a list `categories` of book categories. `scrape_books` should use `requests` to scrape the first `k` pages of the bookstore and return a DataFrame of only the books that have:
- **_at least_ a four-star rating**, and
- **a price _strictly_ less than £50**, and
- **a category that is in the list `categories`**.

Example behavior is given below.

```python
>>> scrape_books(5, ['Default', 'Romance'])
```
<table border="1" class="dataframe">
  <thead>
    <tr style="text-align: right;">
      <th></th>
      <th>UPC</th>
      <th>Product Type</th>
      <th>Price (excl. tax)</th>
      <th>Price (incl. tax)</th>
      <th>Tax</th>
      <th>Availability</th>
      <th>Number of reviews</th>
      <th>Category</th>
      <th>Rating</th>
      <th>Description</th>
      <th>Title</th>
    </tr>
  </thead>
  <tbody>
    <tr>
      <th>0</th>
      <td>e10e1e165dc8be4a</td>
      <td>Books</td>
      <td>Â£22.60</td>
      <td>Â£22.60</td>
      <td>Â£0.00</td>
      <td>In stock (19 available)</td>
      <td>0</td>
      <td>Default</td>
      <td>Four</td>
      <td>For readers of Laura Hillenbrand's Seabiscuit ...</td>
      <td>The Boys in the Boat: Nine Americans and Their...</td>
    </tr>
    <tr>
      <th>1</th>
      <td>c2e46a2ee3b4a322</td>
      <td>Books</td>
      <td>Â£25.27</td>
      <td>Â£25.27</td>
      <td>Â£0.00</td>
      <td>In stock (19 available)</td>
      <td>0</td>
      <td>Romance</td>
      <td>Five</td>
      <td>A Michelin two-star chef at twenty-eight, Viol...</td>
      <td>Chase Me (Paris Nights #2)</td>
    </tr>
    <tr>
      <th>2</th>
      <td>00bfed9e18bb36f3</td>
      <td>Books</td>
      <td>Â£34.53</td>
      <td>Â£34.53</td>
      <td>Â£0.00</td>
      <td>In stock (19 available)</td>
      <td>0</td>
      <td>Romance</td>
      <td>Five</td>
      <td>No matter how busy he keeps himself, successfu...</td>
      <td>Black Dust</td>
    </tr>
    <tr>
      <th>3</th>
      <td>8c9e6bf2467d740d</td>
      <td>Books</td>
      <td>Â£20.59</td>
      <td>Â£20.59</td>
      <td>Â£0.00</td>
      <td>In stock (16 available)</td>
      <td>0</td>
      <td>Default</td>
      <td>Five</td>
      <td>Slay Procrastination, Distraction, and Overwhe...</td>
      <td>The Inefficiency Assassin: Time Management Tac...</td>
    </tr>
  </tbody>
</table>


<br>

Some guidance:

- The first page of the bookstore is at http://books.toscrape.com/catalogue/page-1.html. Subsequent pages can be found by clicking the "Next" button at the bottom of the page. Look at how the URLs change each time you navigate to a new page; think about how to use [f-strings](https://docs.python.org/3/tutorial/inputoutput.html#formatted-string-literals) (or some other string formatting technique) to generate these URLs.
- **`scrape_books` should run in under 180 seconds on the entire bookstore (`k = 50`). `scrape_books` is also the only function that should make `GET` requests; the other two functions parse already-existing HTML.**
- It's fine if your `'Price'` column contains symbols other than `'£'`, as in the example above.
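
To illustrate the f-string idea from the first bullet above, here is a sketch that generates a sequence of numbered URLs. The domain and path (`https://example.com/listing/part-...`) are placeholders made up for this example; click through the real bookstore to work out its pattern.

```python
# Hypothetical URL pattern on a placeholder domain.
urls = [f"https://example.com/listing/part-{i}.html" for i in range(1, 4)]

for url in urls:
    print(url)
```

Since `range(1, 4)` stops before 4, this produces parts 1, 2, and 3.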

In [None]:
def scrape_books(k, categories):
    ...
    
# Feel free to change this input to make sure your function works correctly.
scrape_books(5, ['Default', 'Romance'])

In [None]:
grader.check("q02_3")

## Question 3: Stock Stats 🤑

You're aspiring for a finance job in Chicago, and decide to put your new data wrangling skills to the test by pulling stock data from an API and calculating various statistics. The API we will work with in this question is hosted by Financial Modeling Prep, and can be found at https://site.financialmodelingprep.com/developer/docs#daily-chart-charts. Specifically, we will use the "**Daily Chart EOD**" endpoint – search for it at the linked page.

Some relevant definitions:
- Ticker: A short code that refers to a stock. For example, Apple's ticker is AAPL and Ford's ticker is F. 
- Open: The price of a stock at the beginning of a trading day.
- Close: The price of a stock at the end of a trading day.
- Volume: The total number of shares traded in a day.
- Percent change: The difference in price with respect to the original price, as a percentage.

To make requests to the aforementioned API, you will need an API key. **In order to get one, you will need to make an account at the website.** Once you've signed up, you can use the API key that comes with the free plan. It has a limit of 250 requests per day, which should be more than enough. You will have to include your API key in the URL that you make requests to; see a complete example of such a request at the [**documentation**](https://site.financialmodelingprep.com/developer/docs#Stock-Historical-Price).

### Question 3.1 <div style="display:inline-block; vertical-align: middle; padding:7px 7px; font-size:10px; font-weight:light; color:white; background-color:#e84c4a; border-radius:7px; text-align:left;">4 Points</div>

Complete the implementation of the function `stock_history`, which takes in a string `ticker` and two integers, `year` and `month`, and returns a DataFrame containing the price history for that stock in that month. Keep all of the attributes that are returned by the API.

Example behavior is given below.

```python
>>> out = stock_history('F', 2024, 8)
>>> out.shape
(22, 13)

# August 31st was a Saturday, which is why it doesn't appear – US markets are closed on weekends!
>>> out.head()
```

<table border="1" class="dataframe">
  <thead>
    <tr style="text-align: right;">
      <th></th>
      <th>date</th>
      <th>open</th>
      <th>high</th>
      <th>low</th>
      <th>close</th>
      <th>adjClose</th>
      <th>volume</th>
      <th>unadjustedVolume</th>
      <th>change</th>
      <th>changePercent</th>
      <th>vwap</th>
      <th>label</th>
      <th>changeOverTime</th>
    </tr>
  </thead>
  <tbody>
    <tr>
      <th>0</th>
      <td>2024-08-30</td>
      <td>11.15</td>
      <td>11.23</td>
      <td>11.06</td>
      <td>11.19</td>
      <td>11.19</td>
      <td>44977100</td>
      <td>44977100</td>
      <td>0.04</td>
      <td>0.35874</td>
      <td>11.1575</td>
      <td>August 30, 24</td>
      <td>0.003587</td>
    </tr>
    <tr>
      <th>1</th>
      <td>2024-08-29</td>
      <td>11.02</td>
      <td>11.20</td>
      <td>10.99</td>
      <td>11.11</td>
      <td>11.11</td>
      <td>44989200</td>
      <td>44989200</td>
      <td>0.09</td>
      <td>0.81670</td>
      <td>11.0800</td>
      <td>August 29, 24</td>
      <td>0.008167</td>
    </tr>
    <tr>
      <th>2</th>
      <td>2024-08-28</td>
      <td>11.10</td>
      <td>11.19</td>
      <td>10.98</td>
      <td>11.04</td>
      <td>11.04</td>
      <td>35442200</td>
      <td>35442200</td>
      <td>-0.06</td>
      <td>-0.54054</td>
      <td>11.0775</td>
      <td>August 28, 24</td>
      <td>-0.005405</td>
    </tr>
    <tr>
      <th>3</th>
      <td>2024-08-27</td>
      <td>11.12</td>
      <td>11.22</td>
      <td>10.99</td>
      <td>11.14</td>
      <td>11.14</td>
      <td>44841000</td>
      <td>44841000</td>
      <td>0.02</td>
      <td>0.17986</td>
      <td>11.1175</td>
      <td>August 27, 24</td>
      <td>0.001799</td>
    </tr>
    <tr>
      <th>4</th>
      <td>2024-08-26</td>
      <td>11.32</td>
      <td>11.37</td>
      <td>11.07</td>
      <td>11.11</td>
      <td>11.11</td>
      <td>53070331</td>
      <td>53070331</td>
      <td>-0.21</td>
      <td>-1.86000</td>
      <td>11.2175</td>
      <td>August 26, 24</td>
      <td>-0.018600</td>
    </tr>
  </tbody>
</table>




Some guidance:
- Read the API [**documentation**](https://site.financialmodelingprep.com/developer/docs#Stock-Historical-Price) if you get stuck! In particular, [this page](https://site.financialmodelingprep.com/playground?url=daily-chart-charts) will help you craft the URL you need to make a request to.
- To format the starting and ending dates you'll need to pass to the API, [`pd.date_range`](https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.date_range.html) might be helpful, along with `pd.to_datetime`.
- The [`requests.get`](https://docs.python-requests.org/en/master/user/quickstart/) function returns a Response object, not the data itself. Use the `json` method on the Response object to extract the relevant JSON, as we did in [Lecture 10](https://practicaldsc.org/resources/lectures/lec10/lec10-filled.html#APIs-and-JSON) (you don't need to `import json` to do this).
- You can instantiate a DataFrame using a sequence of dictionaries as input to `pd.DataFrame`. Once you've gotten your response object back, you've done 99% of the work involved with this question.
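
Two of the pandas ideas above can be sketched without touching the API. The example assumes the dates need to be formatted as `'YYYY-MM-DD'` strings, which you should confirm in the documentation, and the `records` list is made-up data rather than a real API response.

```python
import pandas as pd

year, month = 2024, 8  # example inputs

# First and last calendar day of the month.
first_day = pd.Timestamp(year=year, month=month, day=1)
last_day = first_day.replace(day=first_day.days_in_month)

# Every calendar day in between, formatted as 'YYYY-MM-DD' strings (assumed format).
days = pd.date_range(first_day, last_day).strftime("%Y-%m-%d")
print(days[0], days[-1])

# A list of dictionaries (made-up records here) becomes one row per dictionary,
# with the dictionary keys as column names.
records = [
    {"date": "2024-08-01", "value": 1.5},
    {"date": "2024-08-02", "value": 2.5},
]
df = pd.DataFrame(records)
print(df.shape)
```

Using `days_in_month` avoids hard-coding month lengths, including February in leap years.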

In [None]:
def stock_history(ticker, year, month):
    ...
    
# Feel free to change this input to make sure your function works correctly.
stock_history('F', 2024, 8).head()

In [None]:
grader.check("q03_1")

### Question 3.2 <div style="display:inline-block; vertical-align: middle; padding:7px 7px; font-size:10px; font-weight:light; color:white; background-color:#e84c4a; border-radius:7px; text-align:left;">3 Points</div>

Create a function `stock_stats` that takes in a DataFrame outputted by `stock_history` and returns a **tuple** of two numbers:
1. The percent change of the stock throughout the month as a **percentage**.
2. An estimate of the total transaction volume **in billions of dollars** for that month.

Example behavior is given below.

```python
>>> stock_stats(stock_history('F', 2024, 8))
('+3.04%', '13.16B')
```

Both values in the tuple should be **strings** that contain numbers rounded to two decimal places. Add a plus or minus sign in front of the percent change, and make sure that the total transaction volume string ends in a `'B'`. Both strings should **always** have two decimal places, e.g. if a percentage change is 3% exactly, render it as `'+3.00%'`. Python [string formatting](https://docs.python.org/3/tutorial/inputoutput.html) will be helpful.
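
As a quick sketch of the string formatting described above, the format specification `+.2f` forces an explicit sign and always keeps two decimal places. The numbers below are arbitrary examples, not real stock statistics.

```python
pct = 3.0       # arbitrary example values
vol = 13.1567

pct_str = f"{pct:+.2f}%"      # '+' forces a sign even for positive numbers
vol_str = f"{vol:.2f}B"       # two decimal places, then a literal 'B'
neg_str = f"{-1.239:+.2f}%"   # negative numbers keep their minus sign

print(pct_str, vol_str, neg_str)  # +3.00% 13.16B -1.24%
```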

Let's illustrate both calculations. For example, suppose there are only three days in March – March 1st, March 2nd, and March 3rd. If BYND (Beyond Meat) opens at \\$4 on March 1st and closes at \\$5 on March 3rd, its **percent change** for the month of March is: $$\frac{\$5-\$4}{\$4} = +25.00\%$$

That is, when computing the percent change, use the opening price on the first day of the month as the starting price and the closing price on the last day of the month as the ending price.

<br>

**To compute the total transaction volume**, assume that on any given day, the average price of a share is the midpoint of the high and low price for that day, i.e. $\frac{\text{high} + \text{low}}{2}$.

$$ \text{Estimated Total Transaction Volume (in dollars)} = \sum_{\text{days}} \text{Volume (number of shares traded)} \cdot \text{Average Price} $$




Suppose the high and low prices and volumes of BYND on each day are given below.
- March 1st: high \\$5, low \\$3, volume 500 million (0.5 billion).
- March 2nd: high \\$5.5, low \\$2.5, volume 1 billion.
- March 3rd: high \\$5.25, low \\$4, volume 500 million (0.5 billion).

Then, the estimated total transaction volume is:
$$\left( \frac{\$5 + \$3}{2} \cdot 0.5 B \right) + \left( \frac{\$5.5 + \$2.5}{2} \cdot 1 B \right) + \left( \frac{\$5.25 + \$4}{2} \cdot 0.5 B \right) = 8.3125B$$
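
The arithmetic in the hypothetical BYND example can be checked with a few lines of plain Python. This only reproduces the three made-up days above; it is not an implementation of `stock_stats`.

```python
# (high, low, volume in billions of shares) for each of the three hypothetical days.
days = [(5.0, 3.0, 0.5), (5.5, 2.5, 1.0), (5.25, 4.0, 0.5)]

# Midpoint of high and low, times volume, summed over all days.
total = sum((high + low) / 2 * volume for high, low, volume in days)

# Opening price on the first day vs. closing price on the last day.
open_first, close_last = 4.0, 5.0
pct_change = (close_last - open_first) / open_first * 100

print(total, pct_change)  # 8.3125 25.0
```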

In [None]:
def stock_stats(history):
    ...
    
# Feel free to change this input to make sure your function works correctly.
stock_stats(stock_history('F', 2024, 8))

In [None]:
grader.check("q03_2")

## Question 4: Comment Threads 🧵 <div style="display:inline-block; vertical-align: middle; padding:7px 7px; font-size:10px; font-weight:light; color:white; background-color:#e84c4a; border-radius:7px; text-align:left;">6 Points</div>

You regularly browse [Hacker News](https://news.ycombinator.com/) to keep up with the latest news in tech. An example link to a Hacker News article is https://news.ycombinator.com/item?id=18344932. Note that this article has 18 comments and has a `storyid` of 18344932. 

The problem now is that you don't have internet access on your phone during your morning commute to school, so you want to save the interesting stories' comment threads beforehand locally. You find their [API documentation](https://github.com/HackerNews/API) and decide to get to work.

Complete the implementation of the function `get_comments`, which takes in a `storyid` and returns a DataFrame of all the comments below the news story. **Make sure the order of the comments in your DataFrame is from top to bottom, just as you see on the website**. 

Example behavior is given below.

```python
>>> get_comments(18344932).head()
```

<table border="1" class="dataframe">
  <thead>
    <tr style="text-align: right;">
      <th></th>
      <th>id</th>
      <th>by</th>
      <th>parent</th>
      <th>text</th>
      <th>time</th>
    </tr>
  </thead>
  <tbody>
    <tr>
      <th>0</th>
      <td>18380397</td>
      <td>valyala</td>
      <td>18344932</td>
      <td>TimescaleDB is great for storing time series c...</td>
      <td>2018-11-05 06:53:19</td>
    </tr>
    <tr>
      <th>1</th>
      <td>18346406</td>
      <td>msiggy</td>
      <td>18344932</td>
      <td>I&amp;#x27;m excited to give this database a try i...</td>
      <td>2018-10-31 15:20:22</td>
    </tr>
    <tr>
      <th>2</th>
      <td>18348601</td>
      <td>sman393</td>
      <td>18344932</td>
      <td>Can this be used side by side on normal Postgr...</td>
      <td>2018-10-31 19:29:39</td>
    </tr>
    <tr>
      <th>3</th>
      <td>18348631</td>
      <td>RobAtticus</td>
      <td>18348601</td>
      <td>Yep, absolutely. Regular PostgreSQL tables coe...</td>
      <td>2018-10-31 19:34:52</td>
    </tr>
    <tr>
      <th>4</th>
      <td>18348984</td>
      <td>sman393</td>
      <td>18348631</td>
      <td>Good to hear! how does the current TimescaleDB...</td>
      <td>2018-10-31 20:23:46</td>
    </tr>
  </tbody>
</table>

As you see above, the DataFrame that `get_comments` returns should have 5 columns:
- `'id'`: The unique ID of the comment.
- `'by'`: The author of the comment.
- `'parent'`: The unique ID of the comment this comment is replying to.
- `'text'`: The actual comment.
- `'time'`: When the comment was created (in `pd.Timestamp` format).
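
The Hacker News API documentation describes an item's `time` field as Unix time, i.e. seconds since January 1st, 1970. Assuming that's what you get back, here is a sketch of one way to convert such values to `pd.Timestamp` objects. The epoch values below are made up and don't come from a real comment.

```python
import pandas as pd

# Made-up epoch values (in seconds), not taken from a real comment.
seconds = pd.Series([0, 86_400, 1_000_000_000])

# unit="s" tells pandas that the integers count seconds since 1970-01-01.
stamps = pd.to_datetime(seconds, unit="s")
print(stamps)
```

Without `unit="s"`, pandas would interpret the integers as nanoseconds, so every value would land within a few seconds of 1970-01-01.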

Some guidance:
- The URL to make requests to is `'https://hacker-news.firebaseio.com/v0/item/{id}.json'`, where `{id}` should be replaced with the ID of the article or page you are trying to access. 
- Again, do not `import json` – instead, use the `json` method on the Response object you get back.
- Use depth-first search when traversing the comments tree. You will have to do this manually, since you cannot use BeautifulSoup (which is only for HTML documents, not JSON objects). Haven't taken (or finished) EECS 281 and don't know what depth-first search is? Don't worry – the video linked in the green box below will walk through the general idea.
- Make sure the length of your returned DataFrame is the same as the value of the `'descendants'` key in the response JSON, both of which correspond to the number of comments for the story.
- You should ignore "dead" comments (you will know them when you see them), as well as "dead" comments' children. 
- Remember, you're allowed to use loops in this function (both `for`-loops and, **hint**, other types of loops!). You may also want to create at least one helper function.

<div class="alert alert-block alert-success">
You may find <a href="https://www.youtube.com/watch?v=uOfwW-onmpc"><b>this hint video 🎥</b></a> helpful!

Also, the following Python behavior may be useful:

```python
>>> places = [280, 203, 398, 492]
>>> places.pop(0)
280
>>> places
[203, 398, 492]

>>> places = [183, 101] + places
>>> places
[183, 101, 203, 398, 492]

```
 
</div>
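
To see how the list behavior in the box above relates to depth-first search, here is a sketch on a small made-up tree. The `children` dictionary and the `preorder` function are hypothetical stand-ins invented for this example: in `get_comments`, the children of each item would come from API responses instead, and you'd also need to handle dead comments.

```python
# Hypothetical tree: each key is a node ID and each value lists that node's
# children from top to bottom. This dictionary stands in for API responses.
children = {
    1: [2, 5],
    2: [3, 4],
    3: [],
    4: [],
    5: [6],
    6: [],
}

def preorder(root):
    """Return the IDs below `root` in top-to-bottom (depth-first) order."""
    order = []
    to_visit = list(children[root])  # nodes waiting to be visited, next one first
    while to_visit:
        node = to_visit.pop(0)
        order.append(node)
        # Put this node's children ahead of everything else already waiting.
        to_visit = children[node] + to_visit
    return order

print(preorder(1))  # [2, 3, 4, 5, 6]
```

Adding children to the *front* of the waiting list is what makes the traversal depth-first: a node's replies are all visited before its next sibling.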

In [None]:
def get_comments(storyid):
    ...
    
# Feel free to change this input to make sure your function works correctly.
# For example, test your code on newer Hacker News articles, like
# https://news.ycombinator.com/item?id=41660308!
get_comments(18344932)

In [None]:
grader.check("q04")

## Finish Line 🏁

Congratulations! You're ready to submit Homework 5.

To submit your homework:

1. Select `Kernel -> Restart & Run All` to ensure that you have executed all cells, including the test cells.
2. Read through the notebook to make sure everything is fine and all tests passed.
3. Run the cell below to run all tests, and make sure that they all pass.
4. Download your notebook using `File -> Download as -> Notebook (.ipynb)`, then upload your notebook to Gradescope under "Homework 5".
5. Stick around while the Gradescope autograder grades your work. Make sure you see that all **public tests** have passed on Gradescope. **Remember that homeworks have hidden tests, which you will not see your scores on until a few days after the deadline!**
6. Check that you have a confirmation email from Gradescope and save it as proof of your submission.

---

To double-check your work, run the cell below, which reruns all of the autograder tests.

In [None]:
grader.check_all()