Add a topic about reaching data that selectors cannot reach #3703

Merged (1 commit) on Jun 25, 2019

@Gallaecio (Member) commented Mar 27, 2019

Write a new topic that addresses "Can scrapy be used to scrape dynamic content from websites that are using AJAX?", one of the most-voted Stack Overflow questions about Scrapy.

Note to self: https://docs.scrapy.org/en/latest/topics/developer-tools.html

@codecov (bot) commented Mar 27, 2019

Codecov Report

Merging #3703 into master will not change coverage.
The diff coverage is n/a.

@@           Coverage Diff           @@
##           master    #3703   +/-   ##
=======================================
  Coverage   85.46%   85.46%           
=======================================
  Files         169      169           
  Lines        9664     9664           
  Branches     1440     1440           
=======================================
  Hits         8259     8259           
  Misses       1157     1157           
  Partials      248      248

@Gallaecio changed the title from "[WIP] Add a topic about scraping JavaScript-rendered webpages" to "Add a topic about scraping JavaScript-rendered webpages" on Mar 29, 2019
@Gallaecio force-pushed the ajax-docs branch 3 times, most recently from d2143dc to 1bd21b5, on Mar 29, 2019
@Gallaecio changed the title from "Add a topic about scraping JavaScript-rendered webpages" to "[WIP] Add a topic about scraping JavaScript-rendered webpages" on Apr 4, 2019
@Gallaecio changed the title from "[WIP] Add a topic about scraping JavaScript-rendered webpages" to "Add a topic about scraping JavaScript-rendered webpages" on Apr 4, 2019
@Gallaecio changed the title from "Add a topic about scraping JavaScript-rendered webpages" to "[WIP] Add a topic about scraping JavaScript-rendered webpages" on Apr 4, 2019
@Gallaecio changed the title from "[WIP] Add a topic about scraping JavaScript-rendered webpages" to "Add a topic about reaching data that selectors cannot reach" on Apr 4, 2019
@dangra (Member) commented Jun 17, 2019

Hi @Gallaecio, this doc topic looks good and ready to merge, doesn't it?

@dangra changed the title from "Add a topic about reaching data that selectors cannot reach" to "[MRG+1] Add a topic about reaching data that selectors cannot reach" on Jun 17, 2019

@Gallaecio (Member, Author) commented Jun 17, 2019

It is ready as far as I am concerned. But maybe someone else (@kmike?) wants to review it first.

Once you get the expected response, you can :ref:`extract the desired data from
it <topics-handling-response-formats>`.

If you cannot manage to get the expected response,

@Gallaecio (Member, Author) commented Jun 18, 2019

Actually, it seems I did not finish writing here 😳

@Gallaecio changed the title from "[MRG+1] Add a topic about reaching data that selectors cannot reach" to "[WIP] Add a topic about reaching data that selectors cannot reach" on Jun 18, 2019

@Gallaecio (Member, Author) commented Jun 19, 2019

Completed.

@Gallaecio changed the title from "[WIP] Add a topic about reaching data that selectors cannot reach" to "Add a topic about reaching data that selectors cannot reach" on Jun 19, 2019

@kmike (Member) commented Jun 20, 2019

Hi @Gallaecio! I think it is a much-needed section of the documentation. I've checked it, and it looks good to me, no comments or suggestions :)

However, I wonder if it's possible to find a better name for this section. "Data which selectors can't reach" and "beyond selectors" don't sound precise: the problem is not with selectors (in fact, we suggest using them later). The problem is that the data is not available in the downloaded HTML, or that it can't be easily extracted by parsing the HTML itself and instead requires parsing some other format embedded in the HTML (such as CSS or JS).

@Gallaecio (Member, Author) commented Jun 21, 2019

I must say I’m not happy with the name either. Any suggestions?

@Gallaecio changed the title from "Add a topic about reaching data that selectors cannot reach" to "[WIP] Add a topic about reaching data that selectors cannot reach" on Jun 24, 2019

@Gallaecio (Member, Author) commented Jun 24, 2019

If we extend the topic to cover the easiest scenario as well, the one where selectors work as expected, we could go for a simple, generic title along the lines of “Extracting data” or “Locating and extracting the desired data”. I’m not sure, though.

@raphapassini (Contributor) commented Jun 24, 2019

I would vote for something along the lines of "Selecting dynamically loaded content".

@Gallaecio (Member, Author) commented Jun 24, 2019

Hmm… "Dynamically-loaded" definitely fits everything covered one way or another. I like it.

@Gallaecio changed the title from "[WIP] Add a topic about reaching data that selectors cannot reach" to "Add a topic about reaching data that selectors cannot reach" on Jun 24, 2019

@kmike (Member) commented Jun 25, 2019

👍 👍

@kmike merged commit e5f12fa into scrapy:master on Jun 25, 2019
3 checks passed
@kmike added this to the v1.7 milestone on Jun 25, 2019
@Gallaecio mentioned this pull request on Aug 5, 2019

@pablohoffman (Member) commented Aug 21, 2019

great addition to the doc! 💪

5 participants