GET request with a header doesn't work. #1911

@DoctorEvil92

I'm having trouble with what should be a simple GET request with a single header. The same request works fine with vanilla requests, but with crawlee it returns a 404. I also tried passing every header shown in the network inspector, and it still returns a 404.
If you want to find the request in the network inspector, I think you need to clear cookies, go to [https://www.gadisline.com/aceite-girasol-abrilsol-botella-1-l], and then change the location in the upper-right corner; otherwise this request doesn't appear.

import asyncio

import requests
from crawlee import Request
from crawlee.crawlers import HttpCrawler, HttpCrawlingContext


async def main():
    # This URL shows up in the network inspector after changing the store on
    # https://www.gadisline.com/aceite-girasol-abrilsol-botella-1-l
    url = "https://catalog.gadisline.com/api/v3/catalog/products/619d2ab0-16d8-4a63-b662-1f9d8dc3e766/search"
    headers = {"store-id": "0c1b3b3a-ed32-43a8-a88e-f124f5920843"}

    # basic test with requests
    r = requests.get(url, headers=headers, timeout=30)
    print("normal requests size:", len(r.text), "\n" + r.text[0:50])

    # test with crawlee
    crawler = HttpCrawler()
    init = Request.from_url(url=url,
                            method="GET",
                            label="TEST_REQUEST",
                            headers=headers)

    @crawler.router.handler("TEST_REQUEST")
    async def test_handler(context: HttpCrawlingContext) -> None:
        print("inside handler!")

    await crawler.run([init])


if __name__ == "__main__":
    asyncio.run(main())
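
One way to narrow this down is to check which headers each client actually puts on the wire. The following is only a rough sketch using the Python standard library, not anything taken from crawlee: it starts a throwaway local server that records the headers of every GET it receives. The names `HeaderEchoHandler`, `start_echo_server`, and `fetch_with_header` are made up for illustration, and the `urllib` call is just a stand-in client to show that the recorder works.

```python
import threading
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer

# Headers of every request the local server has seen, newest last.
received_headers: list[dict[str, str]] = []


class HeaderEchoHandler(BaseHTTPRequestHandler):
    """Hypothetical helper: records request headers and answers 200."""

    def do_GET(self) -> None:
        # Lowercase the names so lookups do not depend on client casing.
        received_headers.append({k.lower(): v for k, v in self.headers.items()})
        body = b"ok"
        self.send_response(200)
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format: str, *args: object) -> None:
        pass  # keep the console quiet


def start_echo_server() -> tuple[HTTPServer, str]:
    # Port 0 lets the OS pick a free port.
    server = HTTPServer(("127.0.0.1", 0), HeaderEchoHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    host, port = server.server_address[:2]
    return server, f"http://{host}:{port}/"


def fetch_with_header(url: str, headers: dict[str, str]) -> int:
    # Stand-in client; bypass any system proxy so the request stays local.
    opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))
    request = urllib.request.Request(url, headers=headers)
    with opener.open(request, timeout=5) as response:
        response.read()
        return response.status


if __name__ == "__main__":
    server, local_url = start_echo_server()
    try:
        fetch_with_header(local_url, {"store-id": "0c1b3b3a-ed32-43a8-a88e-f124f5920843"})
        print(received_headers[-1])
    finally:
        server.shutdown()
        server.server_close()
```

Pointing `url` in the script above at the local address instead of the catalog API, and then comparing what `received_headers` holds for the `requests` call and for the crawler's request, should show whether `store-id` goes missing or whether other headers differ between the two clients.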

Labels

t-tooling: Issues with this label are in the ownership of the tooling team.
