Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Endless history: the history contains a self-reference #6295

Open
CachingFoX opened this issue Nov 26, 2022 · 5 comments
Open

Endless history: the history contains a self-reference #6295

CachingFoX opened this issue Nov 26, 2022 · 5 comments

Comments

@CachingFoX
Copy link

The history of a requests contains a self-reference to the history owner. The history will be endless.

Expected Result

If I traverse recursive the complete history of a requests, this will be a finally graph.
The history of a request is a tree without cycles.

R1 (history: 2)
     R2 (no history)
     R3 (no history)

Actual Result

If I traverse recursive the complete history of a requests, the program breaks with recursive error. RecursionError: maximum recursion depth exceeded while calling a Python object
The history contains a self-reference to the history owner.

The history of a request is a graph with a cycle.

R1 (history: 2)
     R2 (no history)
          R3 (history: 1)
               R3 (history: 1)
                    R3 (history: 1)
                         R3 (history: 1)
                              R3 (history: 1)
                                   ....
id=140537834271072 history=2
	index=0 id=140537834079136
	index=1 id=140537834080960
id=140537834079136 history=0
id=140537834080960 history=1
	index=0 id=140537834080960
id=140537834080960 history=1
	index=0 id=140537834080960
id=140537834080960 history=1
	index=0 id=140537834080960
id=140537834080960 history=1
	index=0 id=140537834080960

....


id=140537834080960 history=1
	index=0 id=140537834080960
id=140537834080960 history=1
	index=0 id=140537834080960
Traceback (most recent call last):
  File "/Users/andreas/PycharmProjects/cce/main.py", line 12, in <module>
    history(requests.get('https://coord.info/GC8T8E8'))
  File "/Users/andreas/PycharmProjects/cce/main.py", line 9, in history
    history(item)
  File "/Users/andreas/PycharmProjects/cce/main.py", line 9, in history
    history(item)
  File "/Users/andreas/PycharmProjects/cce/main.py", line 9, in history
    history(item)
  [Previous line repeated 993 more times]
  File "/Users/andreas/PycharmProjects/cce/main.py", line 5, in history
    print(f"id={id(r)} history={len(r.history)}")
RecursionError: maximum recursion depth exceeded while calling a Python object

Process finished with exit code 1

Reproduction Steps

import requests


def history(r):
    print(f"id={id(r)} history={len(r.history)}")
    for index, item in enumerate(r.history):
        print(f"\tindex={index} id={id(item)}")
    for index, item in enumerate(r.history):
        history(item)


history(requests.get('https://coord.info/GC8T8E8'))

System Information

{
  "chardet": {
    "version": null
  },
  "charset_normalizer": {
    "version": "2.1.1"
  },
  "cryptography": {
    "version": ""
  },
  "idna": {
    "version": "3.4"
  },
  "implementation": {
    "name": "CPython",
    "version": "3.9.2"
  },
  "platform": {
    "release": "21.6.0",
    "system": "Darwin"
  },
  "pyOpenSSL": {
    "openssl_version": "",
    "version": null
  },
  "requests": {
    "version": "2.28.1"
  },
  "system_ssl": {
    "version": "1010109f"
  },
  "urllib3": {
    "version": "1.26.13"
  },
  "using_charset_normalizer": true,
  "using_pyopenssl": false
}
@sethmlarson
Copy link
Member

We're aware, see #2690 which is documented in the README.

@CachingFoX
Copy link
Author

CachingFoX commented Feb 11, 2023

I'm confused. I do not talk about the GIT history. I talked about the history of an request. Please, can you reopen my issue?

@sethmlarson
Copy link
Member

Sorry about that!

@sethmlarson sethmlarson reopened this Feb 11, 2023
@sigmavirus24
Copy link
Contributor

You don't need to traverse it recursively. Each response should have the entire history of the prior response.

In other words,

  • You get r3 back which has a history that looks like [r2, r1].
  • r2 has a history that looks like [r1], etc.

You only need to look at this history of the one you're seeing as the final response. It will contain the whole history.

@sigmavirus24
Copy link
Contributor

So, this is a consequence of someone years ago arguing that each history response should have it's own history despite the linearity of the history and the fact that the top-level response has the entire context.

What's more helpful (in my view) for looking at this problem is:

>>> r = requests.get('https://httpbin.org/redirect/3')
>>> for resp in r.history:
...   print(id(resp), end='->'); print([(id(_), len(_.history)) for _ in resp.history])
... 
140058437748768->[]
140058437750976->[(140058437750976, 1)]
140058437751840->[(140058437750976, 1), (140058437751840, 2)]

Note that the 2nd response (which should have been the redirect request followed from the first item in the history) points to itself, not in fact to the first response.

While this is wrong, and can be fixed, it hasn't been noticed in years, because folks that care about the history are just looking at the top-level history (which contains everything) and don't actually need to look at each response's history.

I'd be in favor of just wiping those histories altogether instead of trying to make this look more correct. Either way, the place to fix this is in

hist.append(resp)
resp.history = hist[1:]

Namely, the first time through, we are called with the initial response that triggered a redirect, so we're doing

hist.append(resp)
resp.history = hist[1:]

This is correct because that 0th item in the final response's history has no other redirects.

We could also, write this two other ways:

resp.history = hist[:]
hist.append(resp)

This will create a new list based off of the existing history we know about, so that initial request will still be [], then the rest would look like:

...
140058437748768->[]
140058437750976->[(140058437748768, 0)]
140058437751840->[(140058437748768, 0), (140058437750976, 1)]


Alternatively, we could do

```py
hist.append(resp)
resp.history = hist[:-1]

Either way should make this work, but I think then, it has downstream affects on

if len(resp.history) >= self.max_redirects:
which we can easily change to

if len(hist) >= self.max_redirects:

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants