[Question] How to use this with selenium-python on my current page? #23

Closed
M-Zubair10 opened this issue Mar 27, 2022 · 25 comments
Labels
fixed BUG 已修复或问题已解决

Comments

@M-Zubair10

How to use this with selenium-python on my current page?

@QIN2DIM QIN2DIM changed the title from "[Question]" to "[Question] How to use this with selenium-python on my current page?" Mar 27, 2022
@QIN2DIM
Owner

QIN2DIM commented Mar 27, 2022

Imitate and modify.

import time
from typing import Optional

from selenium.common.exceptions import WebDriverException

# YOLO, ArmorCaptcha, ArmorUtils, get_challenge_ctx, ChallengeReset, logger,
# DIR_MODEL and DIR_CHALLENGE come from the demo project's own modules.


def runner(
    sample_site: str,
    lang: Optional[str] = "zh",
    silence: Optional[bool] = False,
    onnx_prefix: Optional[str] = None,
):
    """Human-machine challenge demo: top-level interface"""
    logger.info("Starting demo project...")
    # Instantiating embedded models
    yolo = YOLO(DIR_MODEL, onnx_prefix=onnx_prefix)
    # Instantiating Challenger Components
    challenger = ArmorCaptcha(dir_workspace=DIR_CHALLENGE, lang=lang, debug=True)
    challenger_utils = ArmorUtils()
    # Instantiating the Challenger Drive
    ctx = get_challenge_ctx(silence=silence, lang=lang)
    try:
        for _ in range(5):
            try:
                # Read the hCaptcha challenge test site
                ctx.get(sample_site)
                # Necessary waiting time
                time.sleep(3)
                # Detects if a clickable `hcaptcha checkbox` appears on the current page.
                # The `sample site` must pop up the `checkbox`, where the flexible wait time defaults to 5s.
                # If the `checkbox` does not load in 5s, your network is in a bad state.
                if challenger_utils.face_the_checkbox(ctx):
                    start = time.time()
                    # Enter iframe-checkbox --> Process hcaptcha checkbox --> Exit iframe-checkbox
                    challenger.anti_checkbox(ctx)
                    # Enter iframe-content --> process hcaptcha challenge --> exit iframe-content
                    result = challenger.anti_hcaptcha(ctx, model=yolo)
                    if not result:
                        continue
                    challenger.log(
                        f"End of demo - total: {round(time.time() - start, 2)}s"
                    )
                    break
            # Do not capture the `ChallengeReset` signal in the outermost layer.
            # In the demo project, we wanted the human challenge to pop up, not pass after processing the checkbox.
            # So when this happens, we reload the page to activate hcaptcha repeatedly.
            # But in your project, if you've passed the challenge by just handling the checkbox,
            # there's no need to refresh the page!
            except (WebDriverException, ChallengeReset):
                continue
    finally:
        input("[EXIT] Press any key to exit...")
        ctx.quit()

@QIN2DIM QIN2DIM added the fixed BUG 已修复或问题已解决 label Mar 27, 2022
@M-Zubair10
Author

M-Zubair10 commented Mar 27, 2022 via email

@QIN2DIM
Owner

QIN2DIM commented Mar 27, 2022

@M-Zubair10 in progress🐒

@M-Zubair10
Author

M-Zubair10 commented Mar 27, 2022 via email

@Revadike

Maybe you could also make a standalone proxy server (exe/Linux).

@M-Zubair10
Author

M-Zubair10 commented Mar 28, 2022 via email

@QIN2DIM
Owner

QIN2DIM commented Mar 28, 2022

@M-Zubair10 This project's solution uses a compressed ONNX model, which you can call directly through the opencv-python interface without worrying about the deep-learning details. If you look at requirement.txt, you will see that neither torch nor tensorflow is listed.

@M-Zubair10 @Revadike By the way, I don't have a good grasp of PyPI packaging or container servers, and learning them will take some time. I've also been busy with school lately, so I may not be able to push it anytime soon.🤦‍♂️

@QIN2DIM
Owner

QIN2DIM commented Mar 29, 2022

After a round of research I found that you guys are actually asking the same question...

@M-Zubair10
Author

M-Zubair10 commented Mar 29, 2022 via email

@QIN2DIM
Owner

QIN2DIM commented Mar 29, 2022

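This project is called hcaptcha-challenger, not hcaptcha-solver. My intention was to build a service interface that demonstrates the improvement brought by the embedded YOLOv5 (ONNX) solution; I only care about recognition speed and how efficiently the challenge is passed.

To @Revadike's question: which method is used against hCaptcha doesn't matter; what matters is obtaining the TOKEN returned once the challenge ends.

To @M-Zubair10's question: if you just want to use this challenge method in a selenium program you have written, you can do as I said above and adapt the demo to your own context. If you want to package it for PyPI, the coding difficulty is extremely high.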

@M-Zubair10
Author

M-Zubair10 commented Mar 29, 2022 via email

@QIN2DIM
Owner

QIN2DIM commented Mar 29, 2022

The project is called hcaptcha-challenger, not hcaptcha-solver. My intention was to build a service interface to demonstrate the improvements made by the YOLOv5 (ONNX) embedded solution; I only cared about the speed of recognition and the efficiency of passing challenges.

To @Revadike: it doesn't matter what method is used against hCaptcha. What matters is getting the TOKEN returned at the end of the challenge.

To @M-Zubair10: if you just want to use this challenge method in your selenium application, you can adapt the demo to your own context, as I mentioned above. If you want to package it for PyPI, it's extremely difficult to code.

@Revadike

I meant like an API server

@QIN2DIM
Owner

QIN2DIM commented Mar 29, 2022

ah~ @Revadike I know what you mean: you need a cross-language solution.

@QIN2DIM
Owner

QIN2DIM commented Mar 30, 2022

@M-Zubair10 Ah, I don't think I expressed myself accurately.

The coding difficulty I mentioned yesterday comes mainly from the fact that hcaptcha-challenger cannot be started in every selenium context, because the challenge label is multilingual:

  • when you use selenium on a PC to trigger the challenge, the language of the label depends on the --lang parameter of the WebDriver Options;
  • and when you use selenium on linux, the language of the label depends on the value of the process environment variable LANGUAGE.

All of these recognition operations involve label matching. If the label text is rendered in another language because of this, not only will the model fail, but the whole challenge logic will be seriously flawed.

I would like to make this solution available to developers using a variety of languages with a single code base. However, the two factors I just mentioned that determine the language of the label are set before the WebDriver starts. To reference hcaptcha-challenger's method in the middle of a process for a challenge, you would have to do redundant transcoding, which is unnecessary.
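
As a rough illustration of the two settings described in this comment, here is a minimal sketch that applies both before a WebDriver is created. The helper name `pin_challenge_language` is hypothetical and not part of hcaptcha-challenger; it only assumes that `options` exposes an `add_argument` method, as selenium's `ChromeOptions` does. Whether these two settings are sufficient on a given platform is taken from the comment above, not verified here.

```python
import os


def pin_challenge_language(options, lang="en"):
    """Apply both language settings before the WebDriver starts.

    `options` is any object with an `add_argument` method, for example
    selenium's `webdriver.ChromeOptions()`. Hypothetical helper, not part
    of hcaptcha-challenger.
    """
    # Desktop case from the comment: the --lang browser argument.
    options.add_argument(f"--lang={lang}")
    # Linux case from the comment: the LANGUAGE environment variable of
    # the current process, which a driver started later will inherit.
    os.environ["LANGUAGE"] = lang
    return options


# Usage with selenium (assumed to be installed), before the driver starts:
#   from selenium import webdriver
#   options = pin_challenge_language(webdriver.ChromeOptions(), "en")
#   driver = webdriver.Chrome(options=options)
```

Both values have to be in place before the driver is launched, which is why the helper works on the options object rather than on a running session.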

@M-Zubair10
Author

M-Zubair10 commented Mar 30, 2022 via email

@izoomrud

izoomrud commented Apr 1, 2022

For @Revadike, it doesn't matter what method is used against hCaptcha, it's important to get the TOKEN returned at the end of the challenge.

How can I get TOKEN from there?

I tried putting token = ctx.find_element_by_tag_name('iframe').get_attribute("data-hcaptcha-response") after self.log("Challenge Success") in core.py but it doesn't work.

@QIN2DIM
Owner

QIN2DIM commented Apr 1, 2022

but it doesn't work.

@izoomrud sure

@izoomrud

izoomrud commented Apr 2, 2022

but it doesn't work.

@izoomrud sure

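import time
from selenium.webdriver.common.by import By

ctx.switch_to.default_content()
time.sleep(3)
frames = ctx.find_elements(By.XPATH, "//iframe[@title='widget containing checkbox for hCaptcha security challenge']")
# Open the file once; the context manager closes it when the loop ends.
with open('token.txt', 'a') as file:
    for frame in frames:
        token = frame.get_attribute('data-hcaptcha-response')
        if token:  # get_attribute returns None when the attribute is absent
            file.write(token + '\n')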

.|.

@QIN2DIM
Owner

QIN2DIM commented Apr 2, 2022

@izoomrud That's adorable. The road ahead is long, hahaha.

@3281448091

Execute JavaScript to get the token.
Just use getAttribute("data-hcaptcha-response"), then set the token as the webpage's title and read webdriver.title.
That will work.

@QIN2DIM
Owner

QIN2DIM commented Apr 4, 2022

Execute JavaScript to get the token. Just use getAttribute("data-hcaptcha-response"), then set the token as the webpage's title and read webdriver.title. That will work.

yeah

@shahzain345

Well, what you can do is execute JavaScript in your webdriver. You can use the hcaptcha.getResponse() method to get the response token, which you can then use in your script. The code should look like this:

token = ctx.execute_script("return hcaptcha.getResponse();")

@yeshenshuijiao

Can this be used with playwright?

@QIN2DIM
Owner

QIN2DIM commented Apr 21, 2022

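Can this be used with playwright?

It works the same way: the idea is to detect at runtime whether a challenge has appeared and then handle it through the corresponding interface. However, I have turned this project into a demo and it is very tightly coupled, so if you want to use it in your own project, for now the only option is to write your own version modeled on the demo, because the selenium and playwright APIs are completely different.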

@QIN2DIM QIN2DIM closed this as completed Jul 23, 2022
7 participants