miao
.gitignore
Output.txt
README.md
scrapy.cfg
test.txt

Learning Python web scraping

Lessons and skills learned

  • Basic Scrapy usage.
  • In Python 3, print is a function, so its arguments must always be wrapped in parentheses.
  • Basic XPath knowledge.
  • Python's yield keyword appears inside generator functions. When a generator runs and reaches a yield expression, it pauses and hands the value back to the caller. The next time a value is requested from the generator, execution resumes from where it left off.
  • yield is very useful when each item only needs to be read once. It avoids building a whole list up front, which can cost a lot of memory, especially when there is a huge number of items.
  • In a Scrapy project, the project name and the spider name should not be the same. Otherwise, importing modules in some files will fail.
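
The yield behaviour noted in the list above can be illustrated with a minimal sketch. The function name `squares_up_to` and the item counts are made up for this example and are not part of the project; the code only uses the Python 3 standard library.

```python
import sys


def squares_up_to(limit):
    """Hypothetical generator: yield the squares of 0..limit-1 one at a time."""
    for n in range(limit):
        # Execution pauses here and hands n * n back to the caller.
        # The next request for a value resumes right after this yield.
        yield n * n


squares = squares_up_to(3)
first = next(squares)   # runs the body until the first yield
second = next(squares)  # resumes where the generator paused
rest = list(squares)    # drains whatever is left

print(first, second, rest)

# A generator object stays small no matter how many items it will produce,
# while a list has to hold a reference to every item at once.
# Note: sys.getsizeof reports only the size of the container itself.
lazy = squares_up_to(100_000)
eager = [n * n for n in range(100_000)]
lazy_is_smaller = sys.getsizeof(lazy) < sys.getsizeof(eager)
print(lazy_is_smaller)
```

A Scrapy spider's parse method typically follows the same pattern, yielding one scraped item or request at a time instead of returning a full list.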