chuanwang66/silly_browse

silly_browse

The right way for a programmer to surf the Internet!

  • A crawler built with Selenium and openpyxl (launches a browser window)
# C:\Python35\Scripts\pip.exe install selenium
# C:\Python35\Scripts\pip.exe install openpyxl
# C:\Python35\python.exe super_fish.py
  • A crawler built with grequests and lxml: obtains cookies automatically, does not launch a browser, and supports concurrent fetching (but not non-blocking I/O)
# C:\Python35\Scripts\pip.exe install requests
# C:\Python35\Scripts\pip.exe install grequests
# C:\Python35\Scripts\pip.exe install lxml
# C:\Python35\Scripts\pip.exe install openpyxl
# C:\Python35\python.exe super_fish2.py
  • super_fish2_session.py is super_fish2.py reworked to use a session

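  • super_fish3.py attempts non-blocking network I/O with a "thread pool + requests" approach.
    It is only meant to demonstrate the technique, so only the request that obtains the cookie uses it.
    In everyday development, this kind of non-blocking I/O is usually what we need, rather than concurrent I/O (unless you are writing a crawler).

    super_fish3_test.py focuses on demonstrating how super_fish3.py works.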

  • Download (non-paid) audio from Ximalaya

# C:\Python35\python.exe super_fish_ximalaya.py
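
The "thread pool + requests" idea described for super_fish3.py can be sketched roughly as follows. This is a minimal illustration, not the code from the repository: the names `fetch_cookies`, `fetch_cookies_async`, and `main`, the URL `https://example.com/`, and the timeout values are all assumptions made for the example. The blocking `requests.get` call runs on a worker thread, so the caller gets a `Future` back immediately and only waits when it actually needs the cookie.

```python
from concurrent.futures import Future, ThreadPoolExecutor
from typing import Callable, Dict


def fetch_cookies(url: str) -> Dict[str, str]:
    """Blocking call: GET the page and return its cookies as a dict.

    Assumption: the cookie is obtained with a plain GET request.
    """
    import requests  # imported here so the sketch loads even without requests

    response = requests.get(url, timeout=10)
    response.raise_for_status()
    return response.cookies.get_dict()


def fetch_cookies_async(
    pool: ThreadPoolExecutor,
    url: str,
    fetch: Callable[[str], Dict[str, str]] = fetch_cookies,
) -> Future:
    """Hand the blocking fetch to a worker thread and return at once.

    The returned Future resolves to the cookie dict; the caller is not
    blocked until it calls .result().
    """
    return pool.submit(fetch, url)


def main() -> None:
    # Hypothetical URL, used only for illustration.
    with ThreadPoolExecutor(max_workers=4) as pool:
        future = fetch_cookies_async(pool, "https://example.com/")
        print("request submitted; the main thread is free to do other work")
        cookies = future.result(timeout=15)  # wait only when the value is needed
        print(cookies)
```

Calling `main()` runs the sketch against a real URL. The `fetch` parameter exists only so the blocking step can be swapped out, for example with a stub when experimenting offline.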
