Incorrect cookie handling: third level domain incorrectly uses second level domain cookies #5411

Open
dvorapa opened this issue Apr 2, 2020 · 1 comment

dvorapa commented Apr 2, 2020

When visiting (logging in to) zh.wikisource.org, cookies for the following domain are set:

  • .wikisource.org

If one visits wikisource.org afterwards, cookies for the following domain are set:

  • wikisource.org

After, e.g., logging out of zh.wikisource.org, zh.wikisource.org sends a cookie invalidation request for .wikisource.org. Requests invalidates those cookies correctly and, also correctly, leaves the wikisource.org cookies untouched.
But for further communication with zh.wikisource.org, requests incorrectly uses the remaining wikisource.org cookie. zh.wikisource.org rejects it and responds with another cookie invalidation request for .wikisource.org.

Expected Result

It looks like the cookie handling library used by requests is doing something wrong (at least according to the RFCs), as it should not send the host-only cookie for wikisource.org to zh.wikisource.org. If it's using RFC 2109, "zh.wikisource.org" does not domain-match the cookie for "wikisource.org" because "wikisource.org" doesn't begin with a dot. If it's using RFC 6265, the domainless cookie for "wikisource.org" should have had the host-only-flag set meaning it should not be sent to "zh.wikisource.org". OTOH, it's possible it's being bug-compatible with browsers (RFC 6265 even notes that such a bug exists/existed in some agents in § 4.1.2.3).
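
To make the expected behaviour concrete, here is a minimal sketch of the RFC 6265 rule described above (domain-match from § 5.1.3 plus the host-only-flag). It is not the code requests or Python actually runs; `StoredCookie`, `domain_matches`, `should_send` and the cookie names are hypothetical names chosen only for illustration.

```python
import ipaddress
from dataclasses import dataclass


@dataclass(frozen=True)
class StoredCookie:
    """Hypothetical stand-in for a stored cookie (not a requests class)."""
    name: str
    domain: str      # canonical form, without a leading dot
    host_only: bool  # True when Set-Cookie carried no Domain attribute


def _is_ip_address(host):
    try:
        ipaddress.ip_address(host)
        return True
    except ValueError:
        return False


def domain_matches(host, domain):
    """RFC 6265 domain-match: identical, or a dot-separated suffix of a host name."""
    host, domain = host.lower(), domain.lower()
    if host == domain:
        return True
    return host.endswith("." + domain) and not _is_ip_address(host)


def should_send(cookie, request_host):
    """Decide whether a cookie may be sent to request_host."""
    if cookie.host_only:
        # Host-only cookies go back only to the exact host that set them.
        return request_host.lower() == cookie.domain.lower()
    return domain_matches(request_host, cookie.domain)


# Cookie set by wikisource.org without a Domain attribute (host-only).
host_only = StoredCookie("session", "wikisource.org", host_only=True)
# Cookie set by zh.wikisource.org with Domain=.wikisource.org.
shared = StoredCookie("central", "wikisource.org", host_only=False)

print(should_send(host_only, "wikisource.org"))     # True
print(should_send(host_only, "zh.wikisource.org"))  # False
print(should_send(shared, "zh.wikisource.org"))     # True
```

Under these rules the leftover wikisource.org cookie would never reach zh.wikisource.org, so the loop described below could not start.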

Actual Result

As written above, requests uses the wikisource.org cookie for zh.wikisource.org. zh.wikisource.org rejects it and responds with a cookie invalidation request for .wikisource.org. That request is fulfilled (no cookies for .wikisource.org remain set), but requests still tries to send the wikisource.org cookies to zh.wikisource.org. So the actual result is an endless loop (https://travis-ci.org/github/wikimedia/pywikibot/jobs/669558038#L13763).
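
The cookie selection described above can be checked offline. The sketch below assumes that requests delegates this decision to the standard library's `http.cookiejar` (its `RequestsCookieJar` subclasses `CookieJar`) with the default policy; that assumption has not been verified against the requests source. The cookie name and value and the helper names `make_host_only_cookie` and `cookie_header_for` are made up for illustration.

```python
from http.cookiejar import Cookie, CookieJar, DefaultCookiePolicy
from urllib.request import Request


def make_host_only_cookie(name, value, host):
    """Build a cookie as if Set-Cookie had no Domain attribute (hypothetical helper)."""
    return Cookie(
        version=0, name=name, value=value,
        port=None, port_specified=False,
        domain=host, domain_specified=False, domain_initial_dot=False,
        path="/", path_specified=True,
        secure=False, expires=None, discard=True,
        comment=None, comment_url=None, rest={},
    )


def cookie_header_for(url, policy):
    """Return the Cookie header the jar would attach to a request for url."""
    jar = CookieJar(policy=policy)
    jar.set_cookie(make_host_only_cookie("session", "abc", "wikisource.org"))
    request = Request(url)
    jar.add_cookie_header(request)
    return request.get_header("Cookie")


liberal = DefaultCookiePolicy()  # default: strict_ns_domain is DomainLiberal
strict = DefaultCookiePolicy(
    strict_ns_domain=DefaultCookiePolicy.DomainStrictNonDomain
)

url = "https://zh.wikisource.org/w/api.php"
print(cookie_header_for(url, liberal))  # host-only cookie is sent to the subdomain
print(cookie_header_for(url, strict))   # None: host-only cookie is withheld
```

If the assumption holds, calling `s.cookies.set_policy(strict)` on the session might serve as a workaround; this has not been tested against the reproduction below.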

Reproduction Steps

Simple:

import pywikibot

s = pywikibot.Site('zh', 'wikisource')
print(s.hostname())  # zh.wikisource.org
s.login()

s2 = pywikibot.Site('mul', 'wikisource')
print(s2.hostname())  # wikisource.org
s2.login()

s.logout()
s.login()
# endless loop here

Elaborate:

import requests

s = requests.Session()


def login(url):
    while True:
        # fetch a login token
        r1 = s.post(url=url, data={
            'action': 'query', 'meta': 'tokens',
            'type': 'login', 'format': 'json',
        })
        logintoken = r1.json()['query']['tokens']['logintoken']
        # log in (test account)
        r2 = s.post(url=url, data={
            'action': 'clientlogin',
            'loginreturnurl': 'https://example.com',
            'logintoken': logintoken,
            'username': 'Test20200402',
            'password': 'popokatepetl',
            'format': 'json',
        })
        result = r2.json()
        print(result)
        # retry for as long as the server answers with a 'badtoken' error
        if result.get('error') and result['error'].get('code') == 'badtoken':
            continue
        break
    # print(s.cookies)


def logout(url):
    # fetch a logout (CSRF) token
    r1 = s.post(url=url, data={
        'action': 'query', 'meta': 'tokens',
        'type': 'csrf', 'format': 'json',
    })
    logouttoken = r1.json()['query']['tokens']['csrftoken']
    # log out
    r2 = s.post(url=url, data={
        'action': 'logout', 'token': logouttoken, 'format': 'json',
    })
    print(r2.json())
    # print(s.cookies)


url1 = 'https://zh.wikisource.org/w/api.php'
url2 = 'https://wikisource.org/w/api.php'

login(url1)

login(url2)

logout(url1)

login(url1)
# endless loop here

System Information

$ python -m requests.help

Reproduced on many PCs with many configurations and also on Travis-CI. Here is one:

{
  "chardet": {
    "version": "3.0.4"
  },
  "cryptography": {
    "version": ""
  },
  "idna": {
    "version": "2.6"
  },
  "implementation": {
    "name": "CPython",
    "version": "3.6.9"
  },
  "platform": {
    "release": "5.4.18",
    "system": "Linux"
  },
  "pyOpenSSL": {
    "openssl_version": "",
    "version": null
  },
  "requests": {
    "version": "2.23.0"
  },
  "system_ssl": {
    "version": "1010104f"
  },
  "urllib3": {
    "version": "1.22"
  },
  "using_pyopenssl": false
}

Links:
Reported and described in more detail in: https://phabricator.wikimedia.org/T224712
Reproducible immediately in: https://repl.it/repls/HarmfulBiodegradableExperiments
Pastebin (just in case): https://pastebin.com/7d7Dn9p1

@dvorapa dvorapa changed the title Incorrect cookie handling Incorrect cookie handling: third level domains use second level domain cookies Apr 2, 2020
@dvorapa dvorapa changed the title Incorrect cookie handling: third level domains use second level domain cookies Incorrect cookie handling: third level domain uses second level domain cookies, ends up in endless loop Apr 2, 2020
@dvorapa dvorapa changed the title Incorrect cookie handling: third level domain uses second level domain cookies, ends up in endless loop Incorrect cookie handling: third level domain incorrectly uses second level domain cookies Apr 2, 2020

dvorapa commented May 14, 2020

Is this a Python issue?
