HTTP request header support #48

Open. @ashmckenzie wants to merge 1 commit.

@ashmckenzie

Hi,

I needed to add some additional HTTP request headers and didn't see any support for that currently. Happy to change anything / add more specs to cover the change.

Ash.

@jgnagy

jgnagy Apr 25, 2012

I agree that this should be in there... please accept this so everyone can benefit! :-)

@didlix

didlix Oct 4, 2013

👍

@brutuscat

brutuscat Dec 14, 2014

Contributor

@ashmckenzie would you please submit a new PR in the new fork called Medusa?
Medusa uses OpenURI instead of Net::HTTP, which means the headers should be passed in the options argument, as seen in http.rb. Just make sure that the keys are Strings. See the OpenURI option docs:

options must be a hash.
Each option with a string key specifies an extra header field for HTTP. I.e., it is ignored for FTP without HTTP proxy.
The hash may include other options, where keys are symbols
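
To illustrate the String-versus-Symbol distinction quoted above, here is a minimal sketch. The helper name `open_uri_options` and the sample header values are assumptions made up for this example; they are not part of Medusa or Anemone.

```ruby
require 'open-uri'

# Hypothetical helper: merges custom request headers into an OpenURI
# options hash.
def open_uri_options(headers, options = {})
  # OpenURI treats String keys as extra header fields, so coerce any
  # Symbol header names (and the values) to Strings.
  string_headers = headers.each_with_object({}) do |(name, value), acc|
    acc[name.to_s] = value.to_s
  end
  # Symbol keys are left untouched; OpenURI reads them as its own options.
  options.merge(string_headers)
end

options = open_uri_options(
  { 'User-Agent' => 'ExampleCrawler/1.0', :Accept => 'text/html' },
  read_timeout: 10
)

# A real request would then look like this (not executed here):
#   URI.open('http://example.com/', options) { |io| puts io.read }
```

Coercing the header names up front avoids a Symbol key such as `:Accept` being read as an OpenURI option rather than a header.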

@atgs-ghayakawa

atgs-ghayakawa Feb 27, 2015

👍
Thank you!
With this change I was able to crawl a page that requires Basic authentication.

Sample:

require 'anemone'
require 'base64'
require 'kconv'  # provides String#toutf8

url = "http://exsample.com/test.htm"
# encode64 appends newlines to its output, so strip them from the token
auth_base64 = Base64.encode64('USER:PASSWORD').gsub(/\n/, "")
headers = {"Authorization" => "Basic #{auth_base64}"}

# Pass the custom headers via the :http_request_headers option
Anemone.crawl(url, {:http_request_headers => headers}) do |anemone|
  anemone.on_every_page do |page|
    puts page.url
    puts page.body.toutf8
  end
end
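
As a side note on the header construction in the sample, the token can also be built with `Base64.strict_encode64`, which never inserts line breaks and so needs no `gsub`. This is a sketch only; the helper name `basic_auth_header` is hypothetical and not part of Anemone.

```ruby
require 'base64'

# Hypothetical helper: builds the value of an HTTP Basic "Authorization"
# header from a user name and password.
def basic_auth_header(user, password)
  credentials = "#{user}:#{password}"
  # strict_encode64 emits no newlines, unlike encode64, which adds one
  # after every 60 encoded characters and at the end of the output.
  "Basic #{Base64.strict_encode64(credentials)}"
end

headers = { 'Authorization' => basic_auth_header('USER', 'PASSWORD') }
```

The resulting `headers` hash can be passed to the crawl call in the same way as in the sample.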

@paresharma

paresharma Sep 18, 2015

@chriskite Can we merge this?

@atgs-ghayakawa

atgs-ghayakawa Sep 18, 2015

That is not necessary. It was just for information.

@brutuscat

brutuscat Sep 18, 2015

Contributor

@atgs-ghayakawa @paresharma could you help adapt this PR into a new PR against the new fork, Medusa?

@paresharma

paresharma Sep 18, 2015

@brutuscat
Hi, I forked your fork so that I could open a PR with these changes. But I see that you have already added Basic access authentication (BAA) support:
https://github.com/paresharma/medusa/blob/master/lib/medusa/http.rb#L81-L83

I can just use Medusa in place of Anemone with BAA. 👍

Medusa.crawl(url, { http_basic_authentication: [username, password] }) do |medusa|
  medusa.on_every_page do |page|
    puts page.code
  end
end

Now that I'm at it, I'll do a general cleanup and make it compatible with Ruby 2.2 (which would mean dropping support for Kyoto and Tokyo, I guess). I'll open a PR when it's done. 😄
