Scrapy Tutorial (Created on 04 Apr, 2022)

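This is an introductory Scrapy tutorial project that shows how to build a spider to scrape data from a sample website. The project structure is simple, so beginners can get started quickly.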

📂 Project Structure

scrapy_tutorial/
├── books_to_scrape/      # Scrapy project directory, containing spiders and pipelines
├── scrapy.cfg            # Scrapy project configuration file
├── requirements.txt      # List of dependencies
├── scrapy.html           # Sample HTML file
└── .gitignore

🚀 Installation and Usage

1. Create a virtual environment (recommended)

python -m venv venv
source venv/bin/activate   # macOS/Linux
venv\Scripts\activate      # Windows

2. Install the dependencies

pip install -r requirements.txt

3. Enter the Scrapy project directory

cd books_to_scrape

4. Run the spider

scrapy crawl books

By default, the scraped data is printed to the terminal. To save it as JSON:

scrapy crawl books -o output.json
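Once the crawl has finished, the exported file can be checked from Python. The snippet below is a minimal sketch, assuming output.json holds a JSON array of item objects. The helper names load_items and summarize are hypothetical, and the item fields depend on the spider, so none are hard-coded. Note that -o appends to an existing file, which can leave a JSON file invalid after a second run; on Scrapy 2.4 or later, -O overwrites the file instead.

```python
import json
from pathlib import Path


def load_items(path):
    """Return the list of items stored in a Scrapy JSON feed file."""
    with Path(path).open(encoding="utf-8") as f:
        items = json.load(f)
    if not isinstance(items, list):
        raise ValueError(f"expected a JSON array in {path}")
    return items


def summarize(items):
    """Return the item count and the sorted field names seen across items."""
    fields = sorted({key for item in items for key in item})
    return len(items), fields


if __name__ == "__main__":
    output = Path("output.json")  # file name taken from the command above
    if output.exists():
        count, fields = summarize(load_items(output))
        print(f"{count} items with fields: {fields}")
```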

📝 Example Notes

books_to_scrape: a sample spider that scrapes the Books to Scrape website.
scrapy.html: a simple sample HTML file for testing XPath/CSS selectors.
requirements.txt: lists Scrapy and related packages.

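🔧 Technical Highlights

Building a spider with the Scrapy framework
Extracting data with XPath / CSS selectors
Exporting data in JSON/CSV formats
Basic project structure and configuration files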
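JSON/CSV export can also be configured in the project settings instead of on the command line. The fragment below is a sketch only: it assumes Scrapy 2.4 or later (for the overwrite option) and that the settings module is books_to_scrape/settings.py, which this README does not confirm. The output file names are illustrative.

```python
# Sketch of a feed export configuration for the project's settings module
# (assumed to be books_to_scrape/settings.py; adjust to the actual layout).
FEEDS = {
    # Write all scraped items to a JSON file, replacing any previous run.
    "output.json": {
        "format": "json",
        "encoding": "utf8",
        "overwrite": True,
    },
    # Write the same items to a CSV file as well.
    "output.csv": {
        "format": "csv",
        "overwrite": True,
    },
}
```

With a setting like this in place, a plain scrapy crawl books would write both files without any -o flag.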

📚 References

Scrapy official documentation
Books to Scrape: a sample website for scraping practice
