[Question] Allow to change Language #2

Closed
skrrppop opened this issue Feb 20, 2022 · 20 comments
Labels
fixed BUG 已修复或问题已解决

Comments

@skrrppop

It doesn't work when I change the Chrome language to English and change get_labels to return the label in English.
The image classifier only accepts Chinese.

@QIN2DIM
Owner

QIN2DIM commented Feb 20, 2022

I am trying to edit the tag mapping. I didn't think anyone would pay attention to the project. (/▽\)

@QIN2DIM
Owner

QIN2DIM commented Feb 20, 2022

However, my research so far suggests that this i18n compatibility is not an easy job.

On the PC, hCaptcha determines what language to use based on the browser's startup parameters, and on Linux it is based on the session window's locale environment variable.


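In addition, the tags given by hCaptcha may not be in the standard encoding format. You can run the following code in your favorite Python IDE; it is a very clever trick I found while researching how hCaptcha works. Both comparisons should print False, because the strings on the left contain look-alike characters from other alphabets.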

print("AWESΟME" == "AWESOME")
print("bіcycle" == "bicycle")

So, we need to write at least one adaptor function manually to convert these unusual characters.
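
The adapter function mentioned above could be sketched as a character translation table. This is only an illustration: `HOMOGLYPHS` and `normalize_label` are hypothetical names that do not come from the project, the table lists just a handful of look-alikes, and it assumes the odd characters in the two examples are a Greek omicron and a Cyrillic і.

```python
# Hypothetical lookup table: look-alike characters mapped to their ASCII twins.
# Only a small sample; a real table would need every character hCaptcha uses.
HOMOGLYPHS = {
    "\u039f": "O",  # GREEK CAPITAL LETTER OMICRON
    "\u0456": "i",  # CYRILLIC SMALL LETTER BYELORUSSIAN-UKRAINIAN I
    "\u0430": "a",  # CYRILLIC SMALL LETTER A
    "\u0435": "e",  # CYRILLIC SMALL LETTER IE
    "\u043e": "o",  # CYRILLIC SMALL LETTER O
    "\u0441": "c",  # CYRILLIC SMALL LETTER ES
}

# str.maketrans accepts a dict whose keys are single characters.
_TABLE = str.maketrans(HOMOGLYPHS)


def normalize_label(text: str) -> str:
    """Replace known look-alike characters with their ASCII equivalents."""
    return text.translate(_TABLE)


print(normalize_label("AWES\u039fME") == "AWESOME")
print(normalize_label("b\u0456cycle") == "bicycle")
```

A table like this only suits prompts that are meant to be in English; applied to a genuinely Russian prompt it would corrupt legitimate Cyrillic letters.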

QIN2DIM changed the title from "Allow to change Language" to "feat(add): Allow to change Language" on Feb 20, 2022
QIN2DIM added the feature 新特性或新需求 label on Feb 20, 2022
@saitishmukhametov

Just change label_alias and the get_label function. In my language (Russian) nouns take different endings, so you have to account for that in label_alias.
It worked perfectly fine for me.

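        # In get_label: take the noun that follows the Russian preposition "с " ("with").
        try:
            _label = label_obj.text.split('с ')[1]
        except (AttributeError, IndexError):
            ...  # error handling not shown in this comment

        # Map the Russian nouns (as they appear in the prompt) to the classifier labels.
        self.label_alias = {
            "велосипедами": "bicycle",
            "поездами": "train",
            "грузовиками": "truck",
            "автобусами": "bus",
            "самолетами": "aeroplane",
            "лодками": "boat",
            "машинами": "car",
            "мотоциклами": "motorbike",
        }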

@QIN2DIM
Owner

QIN2DIM commented Feb 21, 2022

Are you running on Linux? 👀

@saitishmukhametov

nope

@QIN2DIM
Owner

QIN2DIM commented Feb 21, 2022

Yes, this does solve the problem.

But I'm looking for a way to automatically adapt to the language and match the corresponding splitting function to the locale environment variable. This doesn't look easy.
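
One possible starting point for that kind of automatic adaptation is to derive a short language code from the locale environment variables and use it to pick a splitting function. The sketch below is hypothetical: `detect_language` is not a project function, and the lookup order (`LANGUAGE`, `LC_ALL`, `LC_MESSAGES`, `LANG`) follows the usual gettext convention, which is an assumption about what hCaptcha or the browser actually honours.

```python
import os
from typing import Mapping, Optional


def detect_language(env: Optional[Mapping[str, str]] = None, default: str = "en") -> str:
    """Return a two-letter style language code such as "ru" or "zh"."""
    env = os.environ if env is None else env
    for name in ("LANGUAGE", "LC_ALL", "LC_MESSAGES", "LANG"):
        value = env.get(name, "")
        # LANGUAGE may hold a colon-separated priority list; use the first entry.
        first = value.split(":")[0]
        # Drop the encoding and the territory: "ru_RU.UTF-8" -> "ru".
        code = first.split(".")[0].split("_")[0].split("-")[0].lower()
        # "C" and "POSIX" mean "no particular language", so keep looking.
        if code and code not in ("c", "posix"):
            return code
    return default


print(detect_language({"LANG": "ru_RU.UTF-8"}))
```

The returned code could then serve as the key into a table of per-language splitting functions. Sites that force a challenge language would still bypass this, as discussed later in the thread.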

@QIN2DIM
Owner

QIN2DIM commented Feb 21, 2022

> nope

Did you comment out this line of code? 👀

options.add_argument("--lang=zh-CN")  # may only take effect on Windows
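
A minimal sketch of how the two mechanisms mentioned in this thread (the `--lang` startup argument and a locale environment variable on Linux) might be prepared together. `locale_settings` is a hypothetical helper, and using the `LANGUAGE` variable with an underscore-separated value is an assumption that this thread does not confirm.

```python
from typing import Dict, List, Tuple


def locale_settings(lang: str = "zh-CN") -> Tuple[List[str], Dict[str, str]]:
    """Return (browser arguments, environment overrides) for one challenge language."""
    # Startup argument, in the same form as the line above.
    args = [f"--lang={lang}"]
    # Assumption: on Linux the browser reads LANGUAGE, written like "zh_CN".
    env = {"LANGUAGE": lang.replace("-", "_")}
    return args, env


args, env = locale_settings("en-US")
print(args, env)
```

Each returned argument would be passed to options.add_argument, and the environment overrides would have to be applied (for example with os.environ.update) before the browser process starts.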

@saitishmukhametov

saitishmukhametov commented Feb 21, 2022

The problem is that some sites force a language on hCaptcha. As for the lang option: I only use the antihcaptcha class.
The Greasy Fork hCaptcha solver script solves this problem by uploading the target image to wolframimage or something like that, getting keywords back, and comparing them with a long list of built-in keywords.

@QIN2DIM
Owner

QIN2DIM commented Feb 21, 2022

> stuff

The solution you propose seems to be quite time consuming. Note that the hCaptcha challenge has a time limit of roughly 130 seconds, and the page element will reset after that timeout.

> force language

Yes, like EPIC login, they remove the checkbox and lock the element language.

> regular

But if it's a regular hCaptcha it will have a section to switch the element language.

From the picture below you should understand why I wanted to unify the language in the first place. If we want to adapt to every language, I think we need to rely on other modules, and the word splitting becomes far too complex.

My idea was to find a way to use elements of a particular language in a variety of contexts. Once and for all.

[Image: hcaptcha-language-switcher1]

@QIN2DIM
Owner

QIN2DIM commented Feb 21, 2022

I've standardized the challenge language in the test cases and it works on all major operating systems (commit d6aa5ed).

> that some sites force language on hcaptcha

Could you provide some application scenarios or links about this situation?

@skrrppop
Author

Hey, what I meant is that when I change the language, the label doesn't get recognized.
And if you try to translate it using self.label_alias, not all aliases are included or correct.
Do you understand what I mean?
And thanks for your project, I learned a lot from it.

@QIN2DIM
Owner

QIN2DIM commented Feb 21, 2022

> Hey, what I meant is that when I change the language, the label doesn't get recognized.

So can't it be identified accurately once the language is unified? 👀

> And if you try to translate it using self.label_alias, not all aliases are included or correct.

I don't quite understand the exact meaning of the phrase.

The purpose of self.label_alias is not to translate, but to correct miscoding.

I did a lot of testing and found that the number of image categories in hCaptcha is constant, and that the tags are not completely random or "infinite". So this simple mapping can cover all the cases.

At certain fixed times of the day, the challenge labels encountered are the same.

@skrrppop
Author

I mean getting the label when it's in English and then using self.label_alias to get the corresponding Chinese word, so you can pass it to the image classifier.
It turned out that "aeroplane" shows up as "airplane" in the English label text.

@QIN2DIM
Owner

QIN2DIM commented Feb 21, 2022

I still don't understand what you mean...🤦‍♂️

I think you may be misreading the purpose of this variable. As I mentioned above, the purpose of self.label_alias is not to translate, but to correct miscoding.

@QIN2DIM
Owner

QIN2DIM commented Feb 21, 2022

Oh, my God, there are three people talking in here! Now I know exactly what you need.

@skrrppop
Author

> I still don't understand what you mean...🤦‍♂️
>
> I think you may not be reading the purpose of this variable. I have mentioned above. The purpose of self.label_alias is not to translate, but to correct miscoding.

Ah, I'm an idiot. But when I try to use an English label such as 'train' for _label, it doesn't work.

[Image: Screenshot_1]

@QIN2DIM
Owner

QIN2DIM commented Feb 21, 2022

WHAT! What did you do to the source code? 😂😂😂😂

@QIN2DIM
Owner

QIN2DIM commented Feb 22, 2022

Ultimately, I think it's a pseudo-need. I don't think it should matter what language is used to open the challenge.

If you do have to start a challenge in a particular language for some indescribable reason, you can refer to this friend's approach and write your own splitting method and your own label aliases.

A final reminder. The purpose of self.label_alias is not to translate, but to correct miscoding. So, this layer of mapping must be present even if you open the challenge in English.

QIN2DIM closed this as completed on Feb 22, 2022
QIN2DIM changed the title from "feat(add): Allow to change Language" to "[Question] Allow to change Language" on Feb 22, 2022
QIN2DIM added the fixed BUG 已修复或问题已解决 label and removed the feature 新特性或新需求 label on Feb 22, 2022
@skrrppop
Author

If I use _label = label_obj.text.split()[-1] I get the correct label, but the program doesn't work and the tactical_retreat function returns True.

@QIN2DIM
Owner

QIN2DIM commented Feb 22, 2022

That's because you didn't add the cleansed noun to self.label_alias.
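
The interplay between the splitting step and the alias table can be illustrated with a small sketch. Everything named here (`SPLITTERS`, `LABEL_ALIAS`, `resolve_label`) is hypothetical, the sample prompts are assumed wording rather than captured hCaptcha text, and treating an unmapped noun as the trigger for a retreat is an interpretation of the reply above, not confirmed project behaviour.

```python
from typing import Callable, Dict, Optional

# Per-language splitters: "en" mirrors split()[-1] and "ru" mirrors
# split("с ")[1] from this thread ("\u0441" is the Cyrillic letter "с").
SPLITTERS: Dict[str, Callable[[str], str]] = {
    "en": lambda text: text.split()[-1],
    "ru": lambda text: text.split("\u0441 ")[1],
}

# Hypothetical alias tables: cleansed noun -> label the classifier expects.
LABEL_ALIAS: Dict[str, Dict[str, str]] = {
    "en": {"airplane": "aeroplane", "bicycle": "bicycle", "train": "train"},
    "ru": {"велосипедами": "bicycle", "поездами": "train"},
}


def resolve_label(prompt: str, lang: str) -> Optional[str]:
    """Return the classifier label for a prompt, or None if it cannot be mapped."""
    try:
        noun = SPLITTERS[lang](prompt).strip().lower()
    except (KeyError, IndexError):
        # Unknown language, or the prompt did not have the expected shape.
        return None
    # A correctly split noun still yields None unless the table contains it.
    return LABEL_ALIAS.get(lang, {}).get(noun)


print(resolve_label("Please click each image containing an airplane", "en"))
print(resolve_label("Please click each image containing a submarine", "en"))
```

The second call splits the prompt correctly but returns None, which matches the situation described above: a correct label that is missing from the alias table still cannot be handed to the classifier.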
