Commit

Improve example of execution expression
tac0x2a committed Mar 23, 2021
1 parent 9a8ce62 commit 7afa83c
Showing 3 changed files with 20 additions and 7 deletions.
7 changes: 6 additions & 1 deletion README.md
@@ -86,7 +86,12 @@ agent = Mechanize.new
root_page = agent.get("http://some.scraping.page.net/")

result = root.inject(agent, root_page)
# => [ {"title" => "PageTitle", "content" => "Page Contents" }, ... ]
# => [
# {"title" => "PageTitle 01", "content" => "Page Contents 01" },
# {"title" => "PageTitle 02", "content" => "Page Contents 02" },
# ...
# {"title" => "PageTitle N", "content" => "Page Contents N" }
# ]
```

## Dev
10 changes: 7 additions & 3 deletions USAGE.ja.md
@@ -33,10 +33,14 @@ agent = Mechanize.new
root_page = agent.get("http://some.scraping.page.net/")

result = root.inject(agent, root_page)
# => [ {"title" => "PageTitle1", "content" => "Page Contents1" },
# {"title" => "PageTitle2", "content" => "Page Contents2" }, ... ]

# => [
# => {"title" => "PageTitle 01", "content" => "Page Contents 01" },
# => {"title" => "PageTitle 02", "content" => "Page Contents 02" },
# => ...
# => {"title" => "PageTitle N", "content" => "Page Contents N" }
# => ]
```

This example scrapes, from each page linked by the XPath of the LinkNode (`links_root`), the two texts specified by the XPaths of the TextNodes (`text_title`, `text_content`).

(In other words, it opens each link matched by `//*[@id="menu"]/ul/li/a` and scrapes the text specified by `//*[@id="contents"]/h2` and `//*[@id="contents"]/p[1]`.)
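
The traversal described above can be pictured with a small plain-Ruby sketch. It does not use the library or Mechanize at all: `pages` and `menu_links` are hypothetical in-memory stand-ins for the fetched pages and for the links the menu XPath would match, so only the shape of the result is illustrated.

```ruby
# Hypothetical stand-in for the site: link target => texts that the two
# TextNode XPaths would select on that page. A real run fetches these pages.
pages = {
  "/page01" => { h2: "PageTitle 01", p1: "Page Contents 01" },
  "/page02" => { h2: "PageTitle 02", p1: "Page Contents 02" }
}

# Links assumed to be matched by the LinkNode XPath on the root page.
menu_links = ["/page01", "/page02"]

# For each link, "open" the page and collect the two texts into one hash,
# using the same keys as the example output ("title", "content").
result = menu_links.map do |href|
  page = pages.fetch(href)
  { "title" => page[:h2], "content" => page[:p1] }
end

p result
```

One hash is produced per followed link, which is why the example output is an array with N entries.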
10 changes: 7 additions & 3 deletions USAGE.md
@@ -36,10 +36,14 @@ agent = Mechanize.new
root_page = agent.get("http://some.scraping.page.net/")

result = root.inject(agent, root_page)
# => [ {"title" => "PageTitle1", "content" => "Page Contents1" },
# {"title" => "PageTitle2", "content" => "Page Contents2" }, ... ]

# => [
# {"title" => "PageTitle 01", "content" => "Page Contents 01" },
# {"title" => "PageTitle 02", "content" => "Page Contents 02" },
# ...
# {"title" => "PageTitle N", "content" => "Page Contents N" }
# ]
```

This example scrapes, from each page linked by the XPath of the LinkNode (`links_root`), the two texts specified by the XPaths of the TextNodes (`text_title`, `text_content`).

(That is, it opens each link matched by `//*[@id="menu"]/ul/li/a` and scrapes `//*[@id="contents"]/h2` and `//*[@id="contents"]/p[1]`.)