Allow to send configuration options to fetcher
oltarasenko committed Dec 30, 2019
1 parent 69dfc31 commit abee884
Showing 4 changed files with 11 additions and 9 deletions.
2 changes: 1 addition & 1 deletion config/config.exs
@@ -32,7 +32,7 @@ use Mix.Config
config :crawly, Crawly.Worker, client: HTTPoison

config :crawly,
- fetcher: Crawly.Fetchers.HTTPoisonFetcher,
+ fetcher: {Crawly.Fetchers.HTTPoisonFetcher, []},

# User agents which are going to be used with requests
user_agents: [
7 changes: 4 additions & 3 deletions lib/crawly/fetchers/fetcher.ex
@@ -2,12 +2,13 @@ defmodule Crawly.Fetchers.Fetcher do
@moduledoc """
A behavior module for defining Crawly Fetchers
- A fetcher is expected to implement a fetch callback which should return a
- Crawly.Response
+ A fetcher is expected to implement a fetch callback which should take
+ Crawly.Request, HTTP client options and return Crawly.Response.
"""

- @callback fetch(request) :: {:ok, response} | {:error, reason}
+ @callback fetch(request, options) :: {:ok, response} | {:error, reason}
when request: Crawly.Request.t(),
response: Crawly.Response.t(),
+ options: map(),
reason: term()
end
2 changes: 1 addition & 1 deletion lib/crawly/fetchers/httpoison_fetcher.ex
@@ -6,7 +6,7 @@ defmodule Crawly.Fetchers.HTTPoisonFetcher do

require Logger

- def fetch(request) do
+ def fetch(request, _client_options) do
HTTPoison.get(request.url, request.headers, request.options)
end
end
9 changes: 5 additions & 4 deletions lib/crawly/worker.ex
@@ -71,14 +71,15 @@ defmodule Crawly.Worker do
result: {:ok, response, spider_name} | {:error, term()}
defp get_response({request, spider_name}) do
# check if spider-level fetcher is set. Overrides the globally configured fetcher.
- # if not set, log warning for explicit config preferred, get the globally-configured fetcher. Defaults to FwtchWithHTTPoison
- fetcher = Application.get_env(
+ # if not set, log warning for explicit config preferred,
+ # get the globally-configured fetcher. Defaults to HTTPoisonFetcher
+ {fetcher, options} = Application.get_env(
:crawly,
:fetcher,
- Crawly.Fetchers.HTTPoisonFetcher
+ {Crawly.Fetchers.HTTPoisonFetcher, []}
)

- case fetcher.fetch(request) do
+ case fetcher.fetch(request, options) do
{:ok, response} ->
{:ok, {response, spider_name}}

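As a rough illustration of what this change enables, the sketch below shows a fetcher whose `fetch/2` actually reads the options configured next to it in the `{module, options}` tuple. `StubFetcher`, the `:status_code` and `:body` keys, and the plain maps standing in for `Crawly.Request` and `Crawly.Response` are hypothetical and not part of this commit. The options are assumed to be a keyword list, matching the `[]` default used in the config, even though the callback spec declares `map()`.

```elixir
# Hypothetical fetcher for illustration only. In a real project it would
# declare `@behaviour Crawly.Fetchers.Fetcher` and receive a Crawly.Request.
defmodule StubFetcher do
  # `options` is whatever was configured alongside the module in the
  # {module, options} tuple. A keyword list is assumed here.
  def fetch(request, options) do
    status = Keyword.get(options, :status_code, 200)
    body = Keyword.get(options, :body, "")

    # A plain map stands in for Crawly.Response in this sketch.
    {:ok, %{request_url: request.url, status_code: status, body: body}}
  end
end

# Mirrors the worker's dispatch: destructure the tuple, then call fetch/2.
{fetcher, options} = {StubFetcher, [status_code: 404]}
request = %{url: "https://example.com", headers: [], options: []}
result = fetcher.fetch(request, options)
```

With this shape, a fetcher that needs no options (like `HTTPoisonFetcher` in this commit) can ignore the second argument, while others can be tuned from config without touching the worker.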
