
Scrapy is an open source and collaborative framework for extracting the data you need from websites. This repository contains some Scrapy projects that make my daily work more efficient.


n20010/scrapy_projects


Summary

This repository contains some of my Scrapy projects.
scrapy_tutorial is an archive of what I worked through while learning Scrapy.
get_image contains projects that scrape images from actual websites.

What is Scrapy

An open source and collaborative framework
for extracting the data you need from websites.
In a fast, simple, yet extensible way.

Usage

  1. Set up an Anaconda environment

  2. Create a virtual environment in Anaconda

    Select Python 3.8.
    As of 2021/09, Scrapy was not available on Python 3.9.

  3. Install the components needed to run Scrapy in the Anaconda virtual environment

    pip install -r requirements.txt
    
  4. Install VS Code in your local environment and add the Python extension

  5. Create downloadfolder_path.txt in the same directory as README.md

    downloadfolder_path.txt should contain only the path
    to the folder where downloaded files will be saved.

  6. Clone this repository and move to the project you need

  7. Run this command to execute the spider

    scrapy crawl <spider name you need>
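
Step 5 above relies on a one-line text file that holds the download folder. As a rough sketch of how a spider or pipeline might read it, the helper below loads the path and makes sure the folder exists. The function name `load_download_folder` and the choice to create a missing folder are assumptions for illustration, not something taken from the projects in this repository.

```python
from pathlib import Path


def load_download_folder(path_file="downloadfolder_path.txt"):
    """Return the download folder named in the one-line path file.

    Hypothetical helper: the name and behaviour are illustrative only.
    """
    # The file is expected to contain only a folder path, so strip
    # surrounding whitespace and the trailing newline.
    text = Path(path_file).read_text(encoding="utf-8").strip()
    if not text:
        raise ValueError(f"{path_file} is empty")

    folder = Path(text).expanduser()
    # Assumption: create the folder when it does not exist yet.
    folder.mkdir(parents=True, exist_ok=True)
    return folder
```

Called with no argument, the helper looks for downloadfolder_path.txt in the current working directory, so it would need the right relative path if the spider runs from inside a project folder.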
    

Main commands

bench

Run a simple benchmark test

scrapy bench

startproject

Create a new Scrapy project

scrapy startproject <project name>

genspider

Create a new spider in the current project

(When typing the URL, it is best to remove 'https://' and the trailing '/')

scrapy genspider [-t <template name>] <spider name> <URL>
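
The tip about trimming the URL can be sketched as a small helper. This is only an illustration of the advice above: `to_genspider_domain` is a hypothetical name, and also handling 'http://' is an assumption added here.

```python
def to_genspider_domain(url):
    """Strip a leading scheme and any trailing slashes from a URL.

    Hypothetical helper that mirrors the README tip for genspider.
    """
    # Remove 'https://' (and, as an assumption, 'http://') if present.
    for scheme in ("https://", "http://"):
        if url.startswith(scheme):
            url = url[len(scheme):]
            break
    # Drop the trailing '/' so only the host (and any path) remains.
    return url.rstrip("/")
```

For example, `to_genspider_domain("https://example.com/")` gives `example.com`, which can then be passed as the last argument of scrapy genspider.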

crawl

Run a spider

scrapy crawl <spider name>
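
If a spider has to be started from another Python script (for a scheduled job, say), one possible approach is to wrap the same command with `subprocess`. This is a sketch rather than part of this repository: `build_crawl_command` and `run_spider` are hypothetical names, and it assumes the `scrapy` executable is on PATH inside the active virtual environment.

```python
import subprocess


def build_crawl_command(spider_name, output_file=None):
    """Build the argument list for 'scrapy crawl' (hypothetical helper)."""
    cmd = ["scrapy", "crawl", spider_name]
    if output_file is not None:
        # -o appends the scraped items to the given feed file.
        cmd += ["-o", output_file]
    return cmd


def run_spider(spider_name, project_dir=".", output_file=None):
    """Run the spider from inside the project directory.

    Raises subprocess.CalledProcessError if scrapy exits with an error.
    """
    return subprocess.run(
        build_crawl_command(spider_name, output_file),
        cwd=project_dir,  # must be the folder that contains scrapy.cfg
        check=True,
    )
```

Passing the arguments as a list avoids shell quoting problems when a spider name or output path contains spaces.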

shell

Start the Scrapy shell.
You can use it to check XPath expressions, CSS selectors, etc.

scrapy shell

Special Thanks
