err: print(parse_product(url)) #1
Comments
Well, I don't know why, but after a dumb fix it's working. I added
Hi! Did you leave out the http part of your initial URL? That looks like the most likely cause, as the code still runs as is.
No, my URL looks like
but for some reason the URLs in the output list look like
Are you able to paste your code here or share the link?
s = HTMLSession()
def get_product_links(page):
def parse_product(url):
urls = get_product_links(1)
Thanks. If you check the "href" attribute of the links you are grabbing, they look like this: //www.ceskereality.cz/firmy/elektroinstalace-vd/ There is no scheme on these links (no "https:" in front of the "//"), which is why it works when you add it in. Either add it in where you do, or change it in the initial line like:
OK thanks, it works with it. But I still don't know why it grabs it without https. Is it some security thing?
Because when you do this: links.append(item.find("a", first=True).attrs["href"]) it gets whatever is in the "href" attribute of that element. In this case it is exactly this: //www.ceskereality.cz/firmy/elektroinstalace-vd/ I've not seen that before.
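An href that starts with "//" is a scheme-relative (protocol-relative) URL: the browser reuses the scheme of the page it is on. A scraper has to do that step itself. Here is a minimal sketch using the standard library's urljoin instead of prepending "https:" by hand. BASE_URL and absolutize are hypothetical names for illustration, and the base is assumed to be the site root, since the thread does not show the page URL that was actually fetched.

```python
from urllib.parse import urljoin

# Assumption: the listing page lives on this site; the real page URL
# is not shown in the thread, so the site root stands in for it.
BASE_URL = "https://www.ceskereality.cz/"


def absolutize(href, base=BASE_URL):
    """Resolve an href against the URL of the page it was found on.

    Scheme-relative ("//host/path") and path-relative ("/path") hrefs
    get the missing parts from the base; absolute URLs pass through.
    """
    return urljoin(base, href)


if __name__ == "__main__":
    print(absolutize("//www.ceskereality.cz/firmy/elektroinstalace-vd/"))
```

Calling absolutize on each href before appending it to the list should mean parse_product always receives a URL with a scheme, whichever form the site uses.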
Oh OK, thank you very much for the help. Thanks to your YT I can learn it easily, but with this href I was really confused :D
Hi, sorry for bothering you, I'm just starting with this scraper. Everything works fine until I add
After that I get
I know it's easy for you, but I really don't see it :(