Some attempts at initializing Selenium and logging in to Taobao #15

Closed
siuszy opened this issue Feb 2, 2020 · 10 comments

@siuszy

siuszy commented Feb 2, 2020

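After studying this section of Mr. Cui's material and trying it out, I found that Taobao's anti-scraping measures against Selenium are now very thorough. Of the approaches people have explored for the slider CAPTCHA that often appears during Taobao login, the simplest is to change the value of window.navigator.webdriver to undefined, that is, to open Chrome in developer mode. I tried this and it did not work. In addition, using the ActionChains module to perform the slide often fails, and afterwards even sliding by hand often fails. More discussion is at the following link: https://www.zhihu.com/question/285659525?sort=created . Here is a fairly simple way to log in: choose Weibo login, which avoids the situations where a slider CAPTCHA may appear. I also hope the experts here can share ideas for solving the slider CAPTCHA!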

import time

import requests
from selenium import webdriver
from selenium.webdriver.common.by import By

try:
    chrome_options = webdriver.ChromeOptions()
    # chrome_options.add_argument('--headless')
    # The next line opens Chrome in "developer mode" by dropping the enable-automation switch
    chrome_options.add_experimental_option('excludeSwitches', ['enable-automation'])
    browser = webdriver.Chrome(options=chrome_options)
    browser.get("https://s.taobao.com/search?q=iPad")
    button = browser.find_element(By.CLASS_NAME, 'login-switch')
    button.click()
    button = browser.find_element(By.CLASS_NAME, 'weibo-login')
    button.click()
    user_name = browser.find_element(By.NAME, 'username')
    user_name.clear()
    user_name.send_keys('*****')  # Weibo user name; the account must already be bound to Taobao
    time.sleep(1)
    user_keys = browser.find_element(By.NAME, 'password')
    user_keys.clear()
    user_keys.send_keys('*****')  # Weibo password
    time.sleep(1)
    button = browser.find_element(By.CLASS_NAME, 'W_btn_g')
    button.click()
    time.sleep(1)
    cookies = browser.get_cookies()
    ses = requests.Session()  # keeps the login state for later requests
    for item in cookies:
        # Copy each browser cookie into the one session; do not re-create the session here
        ses.cookies.set(item["name"], item["value"])
    print('Login succeeded')
except Exception as e:
    print("Login failed:", e)
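
The cookie hand-off at the end of the script can also be factored into a small helper. What follows is only a sketch under stated assumptions: copy_cookies is a hypothetical name, each entry returned by browser.get_cookies() is assumed to be a dict with name and value keys (and usually domain and path), and session is assumed to be a requests.Session or anything with a compatible cookies.set method.

```python
def copy_cookies(selenium_cookies, session):
    """Copy cookies from Selenium's get_cookies() list into a requests-style session.

    `copy_cookies` is a hypothetical helper, not part of Selenium or requests.
    """
    for item in selenium_cookies:
        session.cookies.set(
            item["name"],
            item["value"],
            # Keep the cookie scoped to its original domain and path when known.
            domain=item.get("domain", ""),
            path=item.get("path", "/"),
        )
    return session
```

With the script above it might be used as ses = copy_cookies(browser.get_cookies(), requests.Session()). The session is created once and only its cookie jar is filled, so the login state carries over to later ses.get(...) calls.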
@Germey
Member

Germey commented Feb 9, 2020

Thanks! I've added this to the README.

@zengweigang111

Quoting @Germey: "Thanks! I've added this to the README."
Haha, fancy running into Mr. Cui here.

@R4x7651

R4x7651 commented Feb 19, 2020

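For the CAPTCHAs that appear when logging in through Weibo or Alipay, you can call Alibaba Cloud's CAPTCHA recognition API. It is accurate and fast.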

@literence

In reply to @siuszy's original post and code above:

The part that locates the password input box and types the password won't run for me.
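
The comment gives no traceback, so the cause is unknown. One possibility (an assumption, not a diagnosis) is that the Weibo login form has not finished loading when the script looks for the password field, since the original code only sleeps for a fixed second. A minimal sketch of polling instead of sleeping, using only the standard library; wait_for is a hypothetical helper name:

```python
import time


def wait_for(locate, timeout=10.0, interval=0.5):
    """Call `locate` repeatedly until it returns without raising, or time runs out.

    `wait_for` is a hypothetical helper; `locate` is any zero-argument callable.
    """
    deadline = time.monotonic() + timeout
    while True:
        try:
            return locate()
        except Exception:
            # Give up and re-raise the last error once the deadline has passed.
            if time.monotonic() >= deadline:
                raise
            time.sleep(interval)
```

With the script from the original post it could be called as wait_for(lambda: browser.find_element(By.NAME, 'password')). Selenium also ships its own WebDriverWait with expected_conditions.presence_of_element_located, which does the same job and is usually the better choice in real code.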

@freedom6

Could someone explain what the > means in presence_of_element_located((By.CSS_SELECTOR, '#mainsrp-pager div.form > input'))?

@Germey
Member

Germey commented Feb 29, 2020

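@freedom6 It means direct child node: div.form > input matches only input elements that are immediate children of div.form, not inputs nested deeper inside it.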
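
The difference between a direct child and any descendant can be seen without a browser. The sketch below is an analogy rather than Selenium code: it uses the standard library's ElementTree, whose limited XPath draws the same distinction (/ for children, // for descendants). The markup is made up for illustration and only loosely follows the selector in the question.

```python
import xml.etree.ElementTree as ET

# Hypothetical markup, invented for this example.
PAGE = """
<div id="mainsrp-pager">
  <div class="form">
    <input name="direct"/>
    <span><input name="nested"/></span>
  </div>
</div>
"""

root = ET.fromstring(PAGE)

# Like CSS `div.form > input`: only inputs that are immediate children.
# Note: [@class='form'] is an exact attribute match, stricter than CSS `.form`.
children = [el.get("name") for el in root.findall(".//div[@class='form']/input")]

# Like CSS `div.form input`: inputs at any depth below div.form.
descendants = [el.get("name") for el in root.findall(".//div[@class='form']//input")]

print(children)     # ['direct']
print(descendants)  # ['direct', 'nested']
```

The input wrapped in a span is skipped by the child form and picked up by the descendant form, which is exactly what > controls in the CSS selector.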

@PanZ12580

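Does anyone know how to handle the CAPTCHA that occasionally appears during Weibo login? It has a line running through the middle that I cannot remove no matter what I try, and tesserocr cannot recognize it.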

@lililib

lililib commented May 3, 2020

from selenium import webdriver
from selenium.webdriver import ChromeOptions

option = ChromeOptions()
# Experimental option: drop the enable-automation switch
option.add_experimental_option('excludeSwitches', ['enable-automation'])
browser = webdriver.Chrome(options=option)
# JavaScript injected into every new document before the page's own scripts run
script = '''
Object.defineProperty(navigator, 'webdriver', {
  get: () => undefined
})
'''
browser.execute_cdp_cmd("Page.addScriptToEvaluateOnNewDocument", {"source": script})
browser.get('https://www.taobao.com')

This changes the value of window.navigator.webdriver to undefined.

@cqllzp

cqllzp commented Mar 25, 2021

Is it no longer possible to collect sales figures with a crawler these days?

@Pankangjun

In reply to @lililib's snippet above, which changes window.navigator.webdriver to undefined:

Thank you so much.
