Possible bug? - Axios not interpreting large comments correctly #5450
Comments
That is dynamically generated content on the client side, so it doesn't actually exist in the initial HTML code. Use Puppeteer or Playwright instead of Axios. You can disable JS in Chrome Dev Tools or use Postman to see what the original content looks like.
I disabled JS in Chrome Dev Tools. I still see the comments. You suggested using Puppeteer. Does that mean there is a bug in Axios? I would prefer not to use another tool unless I have to. I am just not sure why I can see the first grouping (Players) but Axios/Cheerio cannot see either of the next two groupings (Coaches or Contributors). Are you able to parse those somehow?
This means you are using the wrong tool. To get a dynamically generated DOM, you need to execute the page's JS; you cannot get something from the server response that is not there.
Going to close since this issue is not related to Axios.
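For anyone landing here with the same question, one way to check whether the missing sections are in the raw response at all is to look inside the HTML comments, since the issue description mentions large comment blocks around Coaches and Contributors. Axios returns the response body unchanged, and markup wrapped in `<!-- ... -->` is a comment node rather than elements, so element selectors will not match it. The sketch below is only an illustration under that assumption: `extractCommentBodies`, `findCommentedTables`, and the sample markup are hypothetical and not taken from the real page.

```javascript
// Hypothetical helper: returns the text inside every <!-- ... --> block.
function extractCommentBodies(html) {
  const bodies = [];
  const commentPattern = /<!--([\s\S]*?)-->/g;
  let match;
  while ((match = commentPattern.exec(html)) !== null) {
    bodies.push(match[1].trim());
  }
  return bodies;
}

// Hypothetical helper: keeps only comment bodies that contain a <table>.
function findCommentedTables(html) {
  return extractCommentBodies(html).filter((body) => /<table[\s>]/i.test(body));
}

// Made-up sample markup standing in for a response body (assumption:
// the second table is wrapped in an HTML comment, the first is not).
const sampleHtml = `
<div id="all_players"><table id="players"><tr><td>Player A</td></tr></table></div>
<div id="all_coaches">
<!--
<table id="coaches"><tr><td>Coach B</td></tr></table>
-->
</div>
<!-- just a note -->
`;

const commentedTables = findCommentedTables(sampleHtml);
console.log(commentedTables.length); // 1
console.log(commentedTables[0]);
```

In a real script the string would come from `response.data` of the `axios.get` call, and each recovered string could be passed to `cheerio.load` like any other HTML. If no commented-out tables turn up, the content really is generated on the client, and a headless browser such as Puppeteer or Playwright is the way to go, as suggested above.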
Describe the issue
Using Node.js, I am trying to scrape a particular webpage (https://www.pro-football-reference.com/hof/). I am specifically trying to scrape all three sections (Players, Coaches and Contributors). I can scrape the first section (Players) with no issues. However, the other two are not being read correctly. It looks like it has something to do with large comment blocks that are placed before both the Coaches and Contributors sections. When I look at the axios.get response, the large comments are mostly removed, but the DOM returned (for Coaches) looks like this:
Can anyone help? Thanks in advance! - DLT
Example Code
Expected behavior
This is what I see for players:
Players
...
This is the response I see for coaches:
Coaches