
Host header position causes certain sites to not respond #3265

Closed
@Hemable1

Description

Outbound requests place the Host header near the end of the header block instead of first, which causes problems fetching certain sites. Header order normally shouldn't matter, but I'm running into servers that won't respond unless Host is sent first, as a browser sends it.

Example:
This is through a browser and successfully responds:

GET / HTTP/1.1
Host: www.accuweather.com
User-Agent: Mozilla/5.0 (Windows NT 6.1; Win64; x64; rv:61.0) Gecko/20100101 Firefox/61.0
Accept: */*
Accept-Encoding: gzip, deflate
Connection: close

This is through aiohttp and does not respond (notice the Host header position):

GET / HTTP/1.1
User-Agent: Mozilla/5.0 (Windows NT 6.1; Win64; x64; rv:61.0) Gecko/20100101 Firefox/61.0
Accept: */*
Accept-Encoding: gzip, deflate
Host: www.accuweather.com
Connection: close
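
To make the difference between the two dumps concrete, here is a minimal sketch using only the standard library. It does not call aiohttp or touch the network; `host_first` and `render_request` are hypothetical helper names introduced for illustration, and the header list simply mirrors the aiohttp dump above. It shows how moving Host to the front yields the browser-style ordering.

```python
from typing import List, Tuple

Header = Tuple[str, str]


def host_first(headers: List[Header]) -> List[Header]:
    """Return a copy with any Host header moved to the front.

    The relative order of all other headers is preserved.
    """
    host = [h for h in headers if h[0].lower() == "host"]
    rest = [h for h in headers if h[0].lower() != "host"]
    return host + rest


def render_request(method: str, path: str, headers: List[Header]) -> str:
    """Serialize a request line plus headers as HTTP/1.1 text (no body)."""
    lines = [f"{method} {path} HTTP/1.1"]
    lines += [f"{name}: {value}" for name, value in headers]
    return "\r\n".join(lines) + "\r\n\r\n"


# Header order as it appears in the aiohttp dump (Host after Accept-Encoding).
aiohttp_order: List[Header] = [
    ("User-Agent", "Mozilla/5.0 (Windows NT 6.1; Win64; x64; rv:61.0) "
                   "Gecko/20100101 Firefox/61.0"),
    ("Accept", "*/*"),
    ("Accept-Encoding", "gzip, deflate"),
    ("Host", "www.accuweather.com"),
    ("Connection", "close"),
]

browser_order = host_first(aiohttp_order)
print(render_request("GET", "/", browser_order))
```

The printed request matches the browser dump line for line, so the only difference between the working and failing requests is the position of Host.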

To reproduce this behavior, run the following code and note that it times out.

import asyncio
import aiohttp

async def req(url):
    headers = {
        'User-Agent': 'Mozilla/5.0 (Windows NT 6.1; Win64; x64; rv:61.0) Gecko/20100101 Firefox/61.0'
    }
    timeout = aiohttp.ClientTimeout(total=10)

    async with aiohttp.ClientSession(timeout=timeout, headers=headers) as session:
        async with session.get(url, ssl=False) as resp:
            print(resp.status)
            print(await resp.text())

loop = asyncio.get_event_loop()
loop.run_until_complete(req("https://www.accuweather.com"))

Tested on:
Windows 7 x64
Python 3.7.0
aiohttp 3.3.2
