handle / catch "HTTP redirect too deep" error / exception #33

Open
infominer33 opened this issue Jun 22, 2020 · 10 comments

@infominer33

This feed is updating and showing up on the planet, but I regularly get this error and the build fails:

[info] found cache entry for >https://blog.bmannconsulting.com/feed.xml<
[info] adding header If-None-Match (etag) >"f192fa56e685ca02fed139e8fec11c4d-ssl-df"< for conditional GET
[info] found cache entry for >https://blog.bmannconsulting.com/feed.xml<
[info] adding header If-None-Match (etag) >"f192fa56e685ca02fed139e8fec11c4d-ssl-df"< for conditional GET
[info] found cache entry for >https://blog.bmannconsulting.com/feed.xml<
[info] adding header If-None-Match (etag) >"f192fa56e685ca02fed139e8fec11c4d-ssl-df"< for conditional GET
[info] found cache entry for >https://blog.bmannconsulting.com/feed.xml<
[info] adding header If-None-Match (etag) >"f192fa56e685ca02fed139e8fec11c4d-ssl-df"< for conditional GET
[info] found cache entry for >https://blog.bmannconsulting.com/feed.xml<
[info] adding header If-None-Match (etag) >"f192fa56e685ca02fed139e8fec11c4d-ssl-df"< for conditional GET
[info] found cache entry for >https://blog.bmannconsulting.com/feed.xml<
[info] adding header If-None-Match (etag) >"f192fa56e685ca02fed139e8fec11c4d-ssl-df"< for conditional GET

*** error: HTTP redirect too deep

##[error]Process completed with exit code 1.
@geraldb
Member

geraldb commented Jun 23, 2020

Thanks for reporting. Weird - this might be an HTTP redirect that redirects to itself (thus, an endless loop). If I get to it, I will check the HTTP headers for the HTTP status code (e.g. 3xx for a redirect) and the redirect location.
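
The header check described above could look roughly like the following. This is a minimal sketch using Ruby's standard Net::HTTP rather than pluto's fetcher library; `describe_response` and `probe` are hypothetical helper names, not part of any existing API.

```ruby
require 'net/http'
require 'uri'

# Summarise one HTTP response: status code and message, plus the
# Location header when the server sent one (as it does for redirects).
def describe_response(response)
  location = response['location']
  line = "#{response.code} #{response.message}"
  line += " location=#{location}" if location
  line
end

# Hypothetical helper: issue a single GET and report what came back.
# Net::HTTP does not follow redirects on its own, so a raw 3xx stays visible.
def probe(url)
  uri = URI(url)
  Net::HTTP.start(uri.host, uri.port, use_ssl: uri.scheme == 'https') do |http|
    describe_response(http.request_get(uri.request_uri))
  end
end

# Example (needs network access):
#   puts probe('http://blog.bmannconsulting.com/feed.xml')
```

Running `probe` against both the http and https URLs would show whether either one answers with a 3xx and where its Location header points.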

@infominer33
Author

Ideally it would just give up and move to the next feed after x times through the loop, and leave a note at the end or in an error log, rather than crashing.
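
A rough sketch of that "skip and report at the end" behaviour, assuming the fetch step raises an exception on failure. `RedirectTooDeep`, `fetch_all` and `demo_fetch` are hypothetical names for illustration and are not taken from pluto's code.

```ruby
# Hypothetical error type standing in for whatever the fetcher raises.
class RedirectTooDeep < StandardError; end

# Fetch every feed; a failure is recorded and the run moves on to the next feed.
# `fetch` is any callable that takes a URL and returns the body or raises.
def fetch_all(urls, fetch)
  results = {}
  errors  = {}
  urls.each do |url|
    begin
      results[url] = fetch.call(url)
    rescue StandardError => e
      errors[url] = "#{e.class}: #{e.message}"
    end
  end
  [results, errors]
end

# Stand-in fetcher for the demo: any URL containing "broken" fails.
demo_fetch = lambda do |url|
  raise RedirectTooDeep, 'HTTP redirect too deep' if url.include?('broken')
  "<rss>#{url}</rss>"
end

results, errors = fetch_all(
  ['https://example.com/ok.xml', 'https://example.com/broken.xml'],
  demo_fetch
)
puts "fetched #{results.size} feed(s)"
errors.each { |url, msg| warn "skipped #{url} (#{msg})" }
```

Collecting the errors in a hash keeps the run going and still leaves something to print or write to a log once all feeds have been tried.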

I mentioned the redirect to the blog owner, but they said they had no other complaints about this, and since the feed items get populated in ~90% of the runs, I don't know how much energy any of us needs to pour into this particular infinite loop.

In any case, I'm studying Ruby pretty hard, as I'm able, so hopefully sooner rather than later I'll have more to offer than bug reports. :)

@geraldb
Member

geraldb commented Jun 24, 2020

Thanks for the update. I see that it must be a different error, because it should give up after 7 tries or so and report an error, not crash. I will try to check the link tomorrow to see if there's an HTTP redirect happening. Thanks for your patience.

@geraldb
Member

geraldb commented Jun 25, 2020

Do you still get the error? If I try to fetch the feed (with pluto's http fetcher library), everything works here. The test script:

require 'fetcher'


url = 'https://blog.bmannconsulting.com/feed.xml'

worker = Fetcher::Worker.new

response = worker.get( url )

puts response.code
puts response.message
puts response.content_type
puts response.body[0..100]

## try http  NOT https
url = 'http://blog.bmannconsulting.com/feed.xml'

response = worker.get( url )

puts response.code
puts response.message
puts response.content_type
puts response.body[0..100]

and the result:

[debug] fetch - get(_response) src: https://blog.bmannconsulting.com/feed.xml
[debug] using direct net http access; no proxy configured
[debug] GET /feed.xml uri=https://blog.bmannconsulting.com/feed.xml, redirect_limit=5
[debug] 200 OK
[debug]   content_type: application/xml, content_length: 9430
200
OK
application/xml
<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
        xmlns:atom="http://www.w3.org/2005/Atom"

[debug] fetch - get(_response) src: http://blog.bmannconsulting.com/feed.xml
[debug] using direct net http access; no proxy configured
[debug] GET /feed.xml uri=http://blog.bmannconsulting.com/feed.xml, redirect_limit=5
[debug] 200 OK
[debug]   content_type: application/xml, content_length: 9430
200
OK
application/xml
<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
        xmlns:atom="http://www.w3.org/2005/Atom"

@geraldb
Member

geraldb commented Jun 25, 2020

I tried again with a cache entry from your log above. It also works. The script:

require 'fetcher'


url = 'https://blog.bmannconsulting.com/feed.xml'

worker = Fetcher::Worker.new

worker.use_cache = true
worker.cache[ url ] = {
    'etag' => 'f192fa56e685ca02fed139e8fec11c4d-ssl-df'
}

response = worker.get( url )

puts response.code
puts response.message
puts response.content_type
puts response.body[0..100]   if response.body

## try http  NOT https
url = 'http://blog.bmannconsulting.com/feed.xml'

worker.cache[ url ] = {
    'etag' => 'f192fa56e685ca02fed139e8fec11c4d-ssl-df'
}

response = worker.get( url )

puts response.code
puts response.message
puts response.content_type
puts response.body[0..100]    if response.body

resulting in

[debug] fetch - get(_response) src: https://blog.bmannconsulting.com/feed.xml
[debug] using direct net http access; no proxy configured
[debug] GET /feed.xml uri=https://blog.bmannconsulting.com/feed.xml, redirect_limit=5
[info] found cache entry for >https://blog.bmannconsulting.com/feed.xml<
[info] adding header If-None-Match (etag) >f192fa56e685ca02fed139e8fec11c4d-ssl-df< for conditional GET
[debug] 304 Not Modified
304
Not Modified

[debug] fetch - get(_response) src: http://blog.bmannconsulting.com/feed.xml
[debug] using direct net http access; no proxy configured
[debug] GET /feed.xml uri=http://blog.bmannconsulting.com/feed.xml, redirect_limit=5
[info] found cache entry for >http://blog.bmannconsulting.com/feed.xml<
[info] adding header If-None-Match (etag) >f192fa56e685ca02fed139e8fec11c4d-ssl-df< for conditional GET
[debug] 304 Not Modified
304
Not Modified

Anyway, please report back if you still get the error, and let me know the exact feed url in your planet.ini - maybe it's different?
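
For reference, the conditional GET that the log lines above describe can be sketched with plain Net::HTTP as follows. This is an illustration only and does not show how the fetcher library builds its requests; `conditional_get_request` and `cached_copy_current?` are hypothetical helpers, and the cache is assumed to be a plain Hash shaped like the one in the test script.

```ruby
require 'net/http'
require 'uri'

# Build a GET request that carries the cached ETag, if there is one.
# `cache` is assumed to be a Hash of url => { 'etag' => ... }.
def conditional_get_request(url, cache)
  request = Net::HTTP::Get.new(URI(url))
  entry = cache[url]
  request['If-None-Match'] = entry['etag'] if entry && entry['etag']
  request
end

# A 304 reply means the cached copy is still current and no body is sent.
def cached_copy_current?(response)
  response.is_a?(Net::HTTPNotModified)
end

# Example (needs network access):
#   uri = URI('https://blog.bmannconsulting.com/feed.xml')
#   req = conditional_get_request(uri.to_s, uri.to_s => { 'etag' => '...' })
#   res = Net::HTTP.start(uri.host, uri.port, use_ssl: true) { |h| h.request(req) }
#   puts cached_copy_current?(res)
```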

@infominer33
Author

OK, I tried replacing https with http in my ini. However, I can't reproduce the error reliably.

I've got it running hourly (it was running twice an hour). It fails to build only some of the time, maybe 1-3x in 24 hours.
In another day I should be able to tell you whether that made a difference or not.

@geraldb
Member

geraldb commented Jun 25, 2020

Thanks for the update. If you have a traceback / stacktrace or some more error logs, that would help. If you run pluto with --verbose you should get a more detailed error (if that's possible).

@infominer33
Author

Good idea! Just at a brief glance I can see some other things I should be tracking in that output.

From now on my action uses verbose, so I can provide better info for errors... and I can also debug any feed issues.

Thanks!

@infominer33
Author

okie here's the --verbose output

[debug] fetch - get(_response) src: http://blog.bmannconsulting.com/feed.xml
[debug] using direct net http access; no proxy configured
[debug] GET /feed.xml uri=http://blog.bmannconsulting.com/feed.xml, redirect_limit=5
[info] found cache entry for >http://blog.bmannconsulting.com/feed.xml<
[info] adding header If-None-Match (etag) >"f192fa56e685ca02fed139e8fec11c4d-ssl-df"< for conditional GET
[debug] 301 Moved Permanently location=https://blog.bmannconsulting.com/feed.xml
[debug] GET /feed.xml uri=https://blog.bmannconsulting.com/feed.xml, redirect_limit=4
[debug] 301 Moved Permanently location=https://blog.bmannconsulting.com/feed.xml
[debug] GET /feed.xml uri=https://blog.bmannconsulting.com/feed.xml, redirect_limit=3
[debug] 301 Moved Permanently location=https://blog.bmannconsulting.com/feed.xml
[debug] GET /feed.xml uri=https://blog.bmannconsulting.com/feed.xml, redirect_limit=2
[debug] 301 Moved Permanently location=https://blog.bmannconsulting.com/feed.xml
[debug] GET /feed.xml uri=https://blog.bmannconsulting.com/feed.xml, redirect_limit=1
[debug] 301 Moved Permanently location=https://blog.bmannconsulting.com/feed.xml
[debug] GET /feed.xml uri=https://blog.bmannconsulting.com/feed.xml, redirect_limit=0
[debug] 301 Moved Permanently location=https://blog.bmannconsulting.com/feed.xml

*** error: HTTP redirect too deep

##[error]Process completed with exit code 1.

@geraldb
Member

geraldb commented Jun 29, 2020

Hello, wow - thanks for your diligence and help. Good to know that it is an HTTP redirect, though it's somewhat weird that it loops forever. You might change the feed_url to use https (to avoid the redirect from the http to the https location).
On the pluto side I will add the HTTP redirect too deep error to the exception handler so that it gets logged instead of exiting, and I will double-check whether the protocol (http/https) maybe gets lost in the location header. Thanks again. Cheers.
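
One way to make the two checks above concrete is a hand-rolled redirect follower that resolves each Location value against the current URL (so the scheme cannot get lost) and stops as soon as a URL repeats. This is a sketch under assumptions, not pluto's or fetcher's actual code: `RedirectError` and `fetch_following_redirects` are hypothetical names, and the `get` callable is injected so the logic can be exercised without network access.

```ruby
require 'net/http'
require 'uri'

# Hypothetical error type raised when redirects cannot be resolved.
class RedirectError < StandardError; end

# Follow redirects by hand. `get` is any callable that takes a URI and
# returns a Net::HTTPResponse.
def fetch_following_redirects(url, get, limit: 5)
  uri = URI(url)
  seen = [uri.to_s]
  loop do
    response = get.call(uri)
    location = response['location']
    return response unless response.is_a?(Net::HTTPRedirection) && location

    # Resolve relative Location values against the current URL so the
    # scheme and host carry over.
    uri = URI.join(uri.to_s, location)
    raise RedirectError, "redirect loop at #{uri}" if seen.include?(uri.to_s)
    raise RedirectError, 'HTTP redirect too deep' if seen.size > limit

    seen << uri.to_s
  end
end

# A real `get` could be, for example:
#   get = ->(u) { Net::HTTP.get_response(u) }
```

A caller that wraps this in `rescue RedirectError` can log the failing feed and carry on. Treating any repeated URL as a loop is a heuristic; with the 301 responses in the verbose log above, it would stop after the second request instead of using up the whole redirect limit.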

@geraldb geraldb added the bug label Jun 29, 2020
@geraldb geraldb changed the title from '*** error: HTTP redirect too deep' to 'handle / catch "HTTP redirect too deep" error / exception' Jun 29, 2020