Yanxueshan

Popular repositories

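  1. Scrapy-Redis-Zhihu Public

    A distributed crawler built on scrapy-redis that crawls all Zhihu questions and their answers. It integrates Selenium for simulated login, recognition of English-character CAPTCHAs and inverted-Chinese-character CAPTCHAs, random User-Agent generation, IP proxies, handling of 302 redirects, and more.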

    Python 54 21

  2. Python-DataStructur Public

    Common data structures implemented in Python, including array, linked list, queue, stack, set, map, binary search tree, max heap, segment tree, Trie, union-find, AVL tree, and hash table.

    Python 11 3

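  3. Crawl-Lagou Public

    Crawls the entire Lagou site with Scrapy's CrawlSpider and stores the results in a database. Login is simulated with Selenium. Custom Scrapy DownloaderMiddleware classes provide random User-Agent generation, IP proxies, and Selenium integration. Scrapy's signal mechanism counts the total number of successfully crawled URLs, and Scrapy's stats collection mechanism gathers the failed URLs (failed_url) and writes them to a JSON file for later analysis.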

    Python 8 2

  4. Scrapy-Redis-Jobbole Public

    A distributed crawler built on scrapy-redis that crawls all articles on Jobbole (伯乐在线). It integrates a Bloom filter to deduplicate URLs and uses Twisted to make MySQL inserts asynchronous.

    Python 8 5

  5. Python-Advanced-Program Public

    An in-depth look at some advanced Python topics, such as garbage collection, metaclass programming, iterators, generators, threads, processes, and coroutines.

    HTML 7 5

  6. chat Public

    A multi-user chat room built on sockets and I/O multiplexing.

    Python 4 1
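
The chat repository is described only as a multi-user chat room built on sockets and I/O multiplexing. As a rough illustration of that idea, and not the repository's actual code, the sketch below uses Python's standard `selectors` module to watch a listening socket and every connected client from a single thread. The `ChatServer` class, its method names, and the "relay to everyone except the sender" behaviour are all assumptions made for this example.

```python
import selectors
import socket


class ChatServer:
    """Hypothetical single-threaded chat room driven by one selector."""

    def __init__(self, host="127.0.0.1", port=0):
        self.selector = selectors.DefaultSelector()
        self.clients = set()
        # Port 0 asks the OS for any free port; the real one is in self.address.
        self.listener = socket.create_server((host, port))
        self.listener.setblocking(False)
        self.selector.register(self.listener, selectors.EVENT_READ)
        self.address = self.listener.getsockname()

    def poll_once(self, timeout=0.1):
        """Handle every socket that is readable right now, then return."""
        for key, _events in self.selector.select(timeout):
            if key.fileobj is self.listener:
                self._accept()
            else:
                self._relay(key.fileobj)

    def _accept(self):
        conn, _addr = self.listener.accept()
        conn.setblocking(False)
        self.selector.register(conn, selectors.EVENT_READ)
        self.clients.add(conn)

    def _relay(self, sender):
        try:
            data = sender.recv(4096)
        except BlockingIOError:
            return  # spurious wakeup, nothing to read yet
        except ConnectionError:
            data = b""
        if not data:  # empty read means the peer closed the connection
            self._drop(sender)
            return
        for client in list(self.clients):
            if client is not sender:
                try:
                    # Simplification: a client whose send buffer is full
                    # is dropped instead of having its output queued.
                    client.sendall(data)
                except OSError:
                    self._drop(client)

    def _drop(self, conn):
        self.selector.unregister(conn)
        self.clients.discard(conn)
        conn.close()

    def close(self):
        for conn in list(self.clients):
            self._drop(conn)
        self.selector.unregister(self.listener)
        self.listener.close()
        self.selector.close()
```

A long-running server would call `poll_once()` in a loop. The point of the design is that one thread serves all clients because `select()` reports only the sockets that are ready, so no `recv` call ever blocks the others.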