Ping search engines require login #63

Closed
arufanov opened this Issue · 2 comments

@arufanov

Hello,

Thank you very much for the gem; I use it and like it.

I see that some search engines require login (authorization) before they accept a ping. For example, http://webmaster.yandex.ru/wmconsole/sitemap_list.xml?host=http://www.example.com%2Fsitemap_index.xml.gz requires authorization. How can I handle this situation from the sitemap generator? Are you planning to add credentials support to handle this case?

It would also be great to allow pinging search engines from behind a proxy, including an NTLM proxy (gem "ruby-ntlm", ">= 0.0.1").

I am ready to participate and test.

Sincerely yours,
Artem Rufanov.

P.S.
Have a good day!

@kjvarga
Owner

Hi there,

This is quite a special case. Since sitemaps are generally public, I think most sitemap submission services would allow public ping, but I can see how they may want to restrict that to authenticated users to prevent abuse of their systems.

Internally SitemapGenerator uses 'open-uri', so if NTLM can be made to work with that, it would be easy. Alternatively, it's not a big deal to use Net::HTTP directly, as in the ruby-ntlm README examples. There would then need to be some configuration options for the user, password, domain, or whatever else you need. Not too bad.
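To illustrate the Net::HTTP route, here is a minimal sketch of what such configuration options could look like. The `ProxySettings` struct and the `build_http` helper are hypothetical names, not part of SitemapGenerator; only `Net::HTTP.new` and its proxy arguments come from the standard library. NTLM itself would still need ruby-ntlm on top of this.

```ruby
require "net/http"
require "uri"

# Hypothetical settings holder; the field names are assumptions.
ProxySettings = Struct.new(:host, :port, :user, :password)

# Build a Net::HTTP session for the given URL, routed through the proxy
# when one is configured. No connection is opened until a request is sent.
def build_http(url, proxy = nil)
  uri = URI.parse(url)
  http = if proxy
    Net::HTTP.new(uri.host, uri.port,
                  proxy.host, proxy.port, proxy.user, proxy.password)
  else
    # Passing nil as the proxy address disables proxy detection from ENV.
    Net::HTTP.new(uri.host, uri.port, nil)
  end
  http.use_ssl = (uri.scheme == "https")
  http
end
```

A caller would build the session once and then issue the ping request through it, so the rest of the ping logic would not need to know whether a proxy is involved.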

The Yandex login example that you sent is a bit trickier because it requires you to fill out a web form to log in. That would necessitate using something like Mechanize to fill in the form programmatically. That's not super difficult, but every web form is different, so you're looking at some kind of "adapter" scenario: you would define a Yandex adapter to do the login, and your credentials would probably be stored in the adapter instance. That would be a tall order for me to code up, so if you are really interested in it, by all means take a stab at it. The logic would be in the adapter and not much would have to change in LinkSet#ping_search_engines.
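A rough sketch of the "adapter" idea might look like the following. Every name here (`PingAdapter`, `YandexAdapter`, `ping_with_adapters`) is hypothetical and does not exist in SitemapGenerator, and the Yandex adapter only reports what it would do instead of performing a real login.

```ruby
# Hypothetical adapter interface; none of these names exist in SitemapGenerator.
class PingAdapter
  # Subclasses perform whatever login the engine needs, then submit the sitemap.
  def ping(sitemap_url)
    raise NotImplementedError, "#{self.class} must implement #ping"
  end
end

# Example adapter that keeps the credentials in the adapter instance.
class YandexAdapter < PingAdapter
  def initialize(username, password)
    @username = username
    @password = password
  end

  def ping(sitemap_url)
    # A real adapter would log in here (for example with Mechanize) and
    # submit the form; this sketch only reports what it would do.
    "would log in as #{@username} and submit #{sitemap_url}"
  end
end

# Call every registered adapter and collect the results keyed by engine name.
def ping_with_adapters(adapters, sitemap_url)
  adapters.each_with_object({}) do |(name, adapter), results|
    results[name] = adapter.ping(sitemap_url)
  end
end
```

With this shape, the ping code would only need to iterate over the registered adapters, and each engine's login quirks would stay inside its own class.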

If you just want to put something in manually for yourself, I think the best way is to put your code directly at the end of your sitemap config file.
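As a sketch of such a manual ping, assuming an engine that accepts a plain GET with the sitemap URL as a query parameter: `build_ping_url` and `manual_ping` are hypothetical helpers, and the `%s` placeholder convention and the `search.example.com` address are assumptions made for illustration.

```ruby
require "net/http"
require "uri"

# Build a ping URL by inserting the escaped sitemap URL into a template.
# The "%s" placeholder convention is an assumption of this sketch.
def build_ping_url(template, sitemap_url)
  format(template, URI.encode_www_form_component(sitemap_url))
end

# Send a plain GET to the ping URL and return the HTTP status code as a string.
def manual_ping(template, sitemap_url)
  response = Net::HTTP.get_response(URI.parse(build_ping_url(template, sitemap_url)))
  response.code
end
```

At the end of the config file this could be called as `manual_ping("http://search.example.com/ping?sitemap=%s", "http://www.example.com/sitemap_index.xml.gz")`.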

Any thoughts?

@arufanov

Hi,

So there are two problems:
a) pinging from behind a proxy (including NTLM)
b) pinging engines that require some authorization.

You said that a) is a feature that would be useful in the gem, whereas b) is an "adapter" scenario: you are ready to provide a stub, and anybody interested in a specific adapter would implement it. Is that a correct summary? If I understand your summary correctly, I agree with you. I will submit a Yandex adapter after implementing it myself.
I don't know whether 'open-uri' can handle an NTLM proxy; I need to check. The P.S. contains code that handles this case with the ruby-ntlm gem (gem "ruby-ntlm", ">= 0.0.1"). Maybe it will be useful for you.

Artem.

P.S.
Code that handles the NTLM proxy case:

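# Assumes `url` is a parsed URI and `yml_file` is the loaded YAML config
require 'net/http'
require 'ntlm/http' # from the ruby-ntlm gem

# Define HTTP options
options = {:debug => false, :http_timeout => 60, :method => :get,
           :redirect_count => 0, :max_redirects => 10, :headers => {},
           :parameters => {:page => 1, :per_page => 100}}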

# Configure proxy
if yml_file[::Rails.env]["proxy_host"]
  proxy_host = yml_file[::Rails.env]["proxy_host"]
  proxy_port = yml_file[::Rails.env]["proxy_port"]
  proxy_username = yml_file[::Rails.env]["proxy_username"]
  proxy_password = yml_file[::Rails.env]["proxy_password"]
  proxy = Net::HTTP::Proxy(proxy_host, proxy_port, proxy_username, proxy_password)
  http = proxy.new(url.host, url.port)
else
  http = Net::HTTP.new(url.host, url.port)
end

# Set SSL
if url.scheme == 'https'
  http.use_ssl = true
  # Note: VERIFY_NONE disables certificate verification
  http.verify_mode = OpenSSL::SSL::VERIFY_NONE
end

# Configure session
http.open_timeout = http.read_timeout = options[:http_timeout]
http.set_debug_output $stderr if options[:debug]

# Build the request for POST or GET
request = case options[:method]
  when :post
    post = Net::HTTP::Post.new(url.request_uri)
    post.set_form_data(options[:parameters])
    post
  else
    Net::HTTP::Get.new(url.request_uri)
end

# Set basic auth
if yml_file[::Rails.env]["username"]
  request.basic_auth yml_file[::Rails.env]["username"],
                     yml_file[::Rails.env]["password"]
end

# Set custom headers
options[:headers].each do |key, value|
  request[key] = value
end

# Send the request; redirects are not followed
if yml_file[::Rails.env]["proxy_auth"] == "NTLM"
  # ntlm_auth is provided by the ruby-ntlm gem
  request.ntlm_auth(yml_file[::Rails.env]["proxy_username"],
                    yml_file[::Rails.env]["proxy_domain"],
                    yml_file[::Rails.env]["proxy_password"])
end
response = http.request(request)
http_response_body = response.body
@kjvarga kjvarga closed this