New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Crawly custom settings #72
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think using the term custom_settings
is not really intuitive of what it actually does.
Essentially, we are overriding certain default settings in config.exs
for only this specific spider.
I think using override_settings
might be better.
I think we should also provide a macro to make simple overrides easy (for some syntactic sugar) , like in my example in the related issue, to get ride of some of the boilerplate. Though this should probably be split to a separate issue. |
Also add in some tests for the following situations:
|
@Ziinc thanks for comments. I was not addressing them yet, as I was in the middle of more massive changes already. Will have a closer look soon! |
a6ecbf2
to
4373d23
Compare
@Ziinc can you please have another glance on it? |
end) | ||
end | ||
|
||
test "settings from the spider are overriding globals" do |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think an additional test where no global setting is defined, and no spider setting is defined, should throw an error-reason tuple.
This would be for developer's debugging convenience
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Well currently Crawly is built with an idea that it should run even without settings. We have default fallback functions for most or all settings.
This idea can be discussed, but lets maybe do it in a separate issue/pr
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok, was thinking along the lines of when devs create their own pipelines/middlewares and want to define/get their own settings.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Otherwise, it still should be possible
lib/crawly/utils.ex
Outdated
defp get_spider_setting(_setting_name, nil), do: nil | ||
|
||
defp get_spider_setting(setting_name, spider_name) do | ||
case function_exported?(spider_name, :settings_override, 0) do |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
should we pass the global config as first arg to the callback? I think it might be beneficial, for example, adjusting pipelines/middlewares on the fly.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could you please show an example? In general, I don't understand the plan yet. However, if you can show samples I am happy to include it.
Otherwise, if it's private we can't override it, right?
Overall looks great 👍 , just some minor changes. The docs on how to access settings in pipelines/middlewares will need to be updated too |
1. Extend Crawly.Spider behaviour with otional custom_settings 2. Add utility function to extract settings from a global config or spider's custom settings
04f640a
to
905b3a5
Compare
@Ziinc I have addressed most (if not all) comments you have made. Could we make another round of the review please :)? |
end | ||
``` | ||
|
||
The full list of overridable settings: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
is it necessary to list this? all settings are overridable, from my understanding.
end) | ||
end | ||
|
||
test "settings from the spider are overriding globals" do |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok, was thinking along the lines of when devs create their own pipelines/middlewares and want to define/get their own settings.
|
||
def get_settings(setting_name, spider_name \\ nil, default \\ nil) do | ||
global_setting = Application.get_env(:crawly, setting_name, default) | ||
case get_spider_setting(setting_name, spider_name) do |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For example:
def get_settings(setting_name, spider_name \\ nil , defualt \\ nil) do
global = Application.get_env(:crawly, setting_name, default)
overrides = cond do
function_exported?(spider_name, :override_settings, 1) when is_atom(spider_name) ->
spider_name.override_settings(global)
function_exported?(spider_name, :override_settings, 0) when is_atom(spider_name) ->
spider_name.override_settings()
true ->
[]
end
merged = Keyword.merge(global, overrides)
Keyword.get(merged, setting_name, default)
end
Allows for more advanced logic in the override function, and the user doesn't have to do a Application.get_env
in the spider module.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am not quite sure about it. Lets maybe discuss it in a separate issue.
@Ziinc I have addressed new code review comments. Let's have another glance. I would suggest discussing all more complex capabilities of custom settings separately, as this PR is quite large already. Also maybe we can do it as a part of callbacks related discussion. (Maybe it's time when we will convert Spider to macro-based definition) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good
🎉🎉🎉🎉 |
or spider's custom settings