Receiving a 404 response on a simple GET request that returns 200 using standard requests library #6886

Closed
1 task done
sc0ned opened this issue Aug 19, 2022 · 6 comments

@sc0ned

sc0ned commented Aug 19, 2022

Describe the bug

I'm not sure what is happening here, but it seems that Twitter is somehow detecting and denying access to requests coming from aiohttp. Running a basic GET request using aiohttp returns a 404 page not found error, while running an identical request with the standard requests module produces the expected results.

To Reproduce

  1. Replace the URL in the standard ClientSession example with "https://twitter.com/", change the status assertion to 404, and run the following:
import aiohttp
import asyncio

async def fetch(client):
    async with client.get('https://twitter.com/') as resp:
        assert resp.status == 404
        return await resp.text()

async def main():
    async with aiohttp.ClientSession() as client:
        html = await fetch(client)
        print(html)

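# asyncio.run() creates and closes the event loop; it replaces the
# get_event_loop()/run_until_complete() pattern, which is deprecated in newer Pythons.
asyncio.run(main())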
  2. Run the same request using the standard requests module:
import requests
response = requests.get("https://twitter.com")
print(response.text)

Expected behavior

Both requests should result in a 200 status code, but aiohttp produces a 404 status.

Logs/tracebacks

N/A

Python Version

Python 3.9.6

aiohttp Version

3.8.1

multidict Version

6.0.2

yarl Version

1.7.2

OS

Windows 10

Related component

Client

Additional context

No response

Code of Conduct

  • I agree to follow the aio-libs Code of Conduct
sc0ned added the bug label Aug 19, 2022
@Dreamsorcerer
Member

I think I remember someone reporting this before, but it is probably an issue for Twitter's support. Unless you can figure out what is causing this bizarre response from Twitter, there's not really anything we can do.

Interestingly, /tos and /privacy work fine, but seemingly none of the main Twitter pages do. I'm thinking that static pages are fine, but application pages have some weird logic on them.
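
To make the static-versus-application comparison repeatable, here is a rough sketch that requests several paths with aiohttp and reports each status. Only /tos, /privacy and the root page come from this thread; the names `build_urls`, `probe` and `main` are made up for illustration, and the statuses you see will depend on whatever Twitter returns at the time.

```python
import asyncio
from urllib.parse import urljoin

BASE_URL = "https://twitter.com/"
# The root page from the report, plus the two paths mentioned in the comment.
PATHS = ["", "tos", "privacy"]


def build_urls(base, paths):
    """Join each relative path onto the base URL."""
    return [urljoin(base, path) for path in paths]


async def probe(urls):
    """Return {url: status} for each URL, fetched concurrently with aiohttp."""
    import aiohttp  # third-party; imported lazily so build_urls() works without it

    async def status_of(session, url):
        async with session.get(url) as resp:
            return url, resp.status

    async with aiohttp.ClientSession() as session:
        pairs = await asyncio.gather(*(status_of(session, u) for u in urls))
    return dict(pairs)


async def main():
    """Print one 'status url' line per probed path."""
    results = await probe(build_urls(BASE_URL, PATHS))
    for url, status in results.items():
        print(status, url)
```

Run it with `asyncio.run(main())`. Adding more application paths to `PATHS` should show whether the 404 really tracks the static/application split.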

@Dreamsorcerer
Member

Hmm, #4926 is the only issue I can find that might be related; maybe I misremembered.

@webknjaz
Member

Could be something bizarre, similar to #5643...

@webknjaz
Member

@sc0ned Try setting SSLKEYLOGFILE while capturing the traffic via tcpdump/wireshark. Then follow https://hynek.me/til/tls-troubleshooting/#bonus-peeking-into-encrypted-tls-traffic / https://gitlab.com/wireshark/wireshark/-/wikis/TLS#tls-decryption. Finally, compare the HTTP requests both libs send.
If they are the same, maybe the problem is indeed on the transport level.
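
A possible starting point for the two steps above, as a sketch rather than a verified diagnosis. `make_keylog_context` sets the key log path in code instead of through the SSLKEYLOGFILE environment variable, and `diff_headers` compares the headers each library reports having sent. Both function names are hypothetical, and `ssl.SSLContext.keylog_filename` assumes Python 3.8+ built against OpenSSL 1.1.1 or newer.

```python
import ssl


def make_keylog_context(keylog_path):
    """Default TLS context that also appends session secrets to keylog_path."""
    ctx = ssl.create_default_context()
    # Wireshark can read this file to decrypt the captured TLS traffic.
    ctx.keylog_filename = keylog_path
    return ctx


def diff_headers(a, b):
    """Compare two header mappings, treating header names case-insensitively.

    Returns {name: (value_in_a, value_in_b)} for every header that differs;
    a header missing on one side is reported as None. Repeated headers are
    collapsed to a single value, which is enough for a first look.
    """
    a = {name.lower(): value for name, value in a.items()}
    b = {name.lower(): value for name, value in b.items()}
    return {
        name: (a.get(name), b.get(name))
        for name in sorted(a.keys() | b.keys())
        if a.get(name) != b.get(name)
    }
```

With aiohttp the context can be passed as `aiohttp.TCPConnector(ssl=make_keylog_context("tls-keys.log"))`. For a quick header comparison without a packet capture, `resp.request_info.headers` (aiohttp) and `response.request.headers` (requests) can be fed to `diff_headers`. If the result is empty, that points back at the transport level.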

@dvdblk

dvdblk commented Sep 15, 2023

If anyone encounters this, just use httpx.AsyncClient. The TLS behavior is the same as with requests, and I get a 20x status code as expected.
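
For reference, a minimal sketch of what that swap might look like for the reproduction at the top of this issue. The helper names `is_success`, `fetch_status` and `main` are made up here, and `follow_redirects=True` is an assumption added to match requests and aiohttp, which follow redirects by default while httpx does not.

```python
import asyncio

URL = "https://twitter.com/"


def is_success(status_code):
    """True for any 2xx status, i.e. the '20x' outcome described above."""
    return 200 <= status_code < 300


async def fetch_status(url):
    """Fetch url with httpx.AsyncClient and return the final status code."""
    import httpx  # third-party; imported lazily so is_success() works without it

    # Assumption: follow redirects so the result is comparable to requests/aiohttp.
    async with httpx.AsyncClient(follow_redirects=True) as client:
        resp = await client.get(url)
        return resp.status_code


async def main():
    """Print the status code and whether it is in the 2xx range."""
    status = await fetch_status(URL)
    print(status, "ok" if is_success(status) else "unexpected")
```

Run it with `asyncio.run(main())`. This only works around the symptom; it does not explain why the aiohttp request was answered with a 404.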

@Dreamsorcerer
Member

Seems to work today. Could be something we changed, or could be something they fixed.
