Bot detection #1

Closed
Kajo576 opened this issue Feb 9, 2023 · 5 comments

Kajo576 commented Feb 9, 2023

The bot ran one or two searches and then stopped because of Emag's bot detection. How can I solve this?

Owner

viorellu commented Feb 9, 2023

Hello,

Can you describe how you ran it and what error it returned? Is the page you're searching on still available in your browser after you ran the script?

Author

Kajo576 commented Feb 9, 2023

I start the bot from Git Bash on Windows.
The bot lists the test items correctly, then I see:
"================="
After 300 seconds it does not list the items again; it only prints:
"================="
This repeats.

When I open the web page in a browser, I get a captcha and a message saying that unusual activity was detected.
After I solve the captcha I can browse emag again, and the bot finds the test items again.
The bot then makes one search before the captcha issue comes back.

Owner

viorellu commented Feb 9, 2023

I can confirm I can reproduce the steps, but, at least for me, this happens only if I'm not logged in on their website and if I poll the page too quickly (I'd say a 10-second interval triggers the captcha, but it's been hit and miss).

As possible solutions, I'd suggest trying a VPN or a proxy to hide your current IP, or altering the URL that is hardcoded in the script (I used that one over 18 months ago).

Alternatively, try the other script - userinput.sh - and see if it yields better results when you customize the URL with emag's search filters. If that one works better, I'll translate the prompts to English for ease of use. In order, it asks for a URL, a keyword, the maximum desired price, the time between refreshes, and the number of results to show. Only the URL is mandatory; the rest have predefined defaults.
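
For readers who want to see what a prompt flow like that can look like, here is a minimal sketch. It is not the actual userinput.sh: the helper name `ask`, the variable names, and every default value shown are assumptions made up for illustration.

```shell
# Sketch only: NOT the real userinput.sh. All names and defaults are placeholders.

# ask PROMPT [DEFAULT]
# Reads one line from stdin and prints it, or prints DEFAULT if the line is empty.
# The prompt is written to stderr so that $(ask ...) captures only the answer.
ask() {
  local prompt="$1" default="${2-}" answer=""
  if [ -n "$default" ]; then
    printf '%s [%s]: ' "$prompt" "$default" >&2
  else
    printf '%s: ' "$prompt" >&2
  fi
  IFS= read -r answer || true
  printf '%s\n' "${answer:-$default}"
}

# Example flow (commented out so the block only defines the helper):
# url=$(ask "Search URL")
# [ -n "$url" ] || { echo "The URL is mandatory" >&2; exit 1; }
# keyword=$(ask "Keyword" "rtx")              # placeholder default
# max_price=$(ask "Max price" "5000")         # placeholder default
# refresh=$(ask "Refresh interval (s)" "360") # placeholder default
# shown=$(ask "Results to show" "10")         # placeholder default
```

Pressing Enter at any optional prompt keeps the default, which matches the "only the URL is mandatory" behaviour described above.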

Author

Kajo576 commented Feb 9, 2023

Thank you for the answer. I tested these steps, but with no success: within 5 minutes the bot runs into the captcha.
I tried a VPN, other URLs, userinput.sh, and being logged in and logged out, but the result is the same.

Owner

viorellu commented Feb 9, 2023

It looks like emag did some tweaking to their website. I tested this again, and it looks like the captcha verification kicks in after a time limit, which unfortunately makes the script pretty much useless.
I'll do some googling on how to avoid this, if at all possible.

$ i=1; while true; do echo "---------- Iteration $i ----------"; echo $(date +%H:%M:%S) "Results: $(curl -s https://www.emag.ro/search/placi_video/resigilate/filter/memorie-f2645,12-gb-v7440/rtx/c | tr "," "\n" | grep -e "product_name&quot" -e "quot;price&quot" | awk 'NR%2{printf "%s ",$0;next;}1' | awk -F ";" '{print $4 " " $NF}' | awk '{print $NF,$0}' | cut -c 2- | sort -n | cut -f2- -d' ' | sed "s/&quot//" | sed "s/u00ae//" | sed "s/u2122//" | tr \\ " " 2>/dev/null | grep -i rtx | wc -l)"; echo "--------------------"; i=$(($i+1));sleep 360; done
---------- Iteration 1 ----------
17:26:02 Results: 16 <-- expected number of results per query

---------- Iteration 2 ----------
17:32:02 Results: 16

---------- Iteration 3 ----------
17:38:03 Results: 16

---------- Iteration 4 ----------
17:44:03 Results: 0 <--- after ~20 minutes the captcha kicked in

---------- Iteration 5 ----------
17:50:03 Results: 0
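
The log above suggests a simple health signal: a count of 0 where 16 is expected probably means the captcha page was served instead of the search results. Below is a minimal sketch, assuming that interpretation, of how the fixed `sleep 360` could be replaced with a delay that doubles while polls come back empty. The names `next_delay` and `count_results` are hypothetical. This does not bypass the captcha; it only stops the loop from polling at full speed once it is blocked.

```shell
# Sketch only: back off when a poll returns zero results (assumed here to mean
# the captcha page was served instead of the search results).

# next_delay CURRENT_DELAY RESULT_COUNT BASE_DELAY MAX_DELAY
# Prints the number of seconds to wait before the next poll.
next_delay() {
  local current="$1" count="$2" base="$3" max="$4"
  local doubled
  if [ "$count" -gt 0 ]; then
    echo "$base"              # results came back: return to the base interval
    return
  fi
  doubled=$(( current * 2 ))  # no results: double the wait...
  if [ "$doubled" -gt "$max" ]; then
    doubled="$max"            # ...but never beyond the cap
  fi
  echo "$doubled"
}

# Example loop (commented out so the block only defines the helper).
# count_results is a hypothetical wrapper around the curl pipeline shown above.
# delay=360
# while true; do
#   count=$(count_results)
#   echo "$(date +%H:%M:%S) Results: $count"
#   [ "$count" -gt 0 ] || echo "No results; a captcha may be active" >&2
#   delay=$(next_delay "$delay" "$count" 360 3600)
#   sleep "$delay"
# done
```

With a base of 360 seconds and a cap of 3600, a run of empty polls waits 720, 1440, 2880, then 3600 seconds, and the first successful poll resets the interval to 360.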

viorellu closed this as completed Feb 9, 2023