Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[MRG+1] Dns cache size and timeout options #1132

Merged
merged 1 commit into from Apr 15, 2015

Conversation

@sibiryakov
Copy link
Contributor

@sibiryakov sibiryakov commented Apr 2, 2015

Continuing of #1092 and #1120.

@kmike
Copy link
Member

@kmike kmike commented Apr 2, 2015

Looks good!

@kmike kmike changed the title Dns cache size and timeout options [MRG+1] Dns cache size and timeout options Apr 2, 2015
@nyov
Copy link
Contributor

@nyov nyov commented Apr 2, 2015

I'm sorry if you've read this, but I'll repost in case my comment got lost.

The getHostByName interface in theory supports a list of values (1, 3, 11, 45), which contains staggerable timeouts. But it is only sum()-med in the default implementation.

To quote the docs: "timeout: Number of seconds after which to reissue the query. When the last timeout expires, the query is considered failed. (type: Sequence of int)"

There is no support for float values from the interface point of view, but support for a sequence.

@sibiryakov
Copy link
Contributor Author

@sibiryakov sibiryakov commented Apr 2, 2015

Unfortunately sequence doesn't work in ThreadedResolver. Instead it uses IReactorTime.callLater which has seconds argument as float. See http://twistedmatrix.com/documents/current/api/twisted.internet.interfaces.IReactorTime.html#callLater
Previously we decided, that it doesn't make much sense to introduce interface changes that doesn't work. IMO, the best would be to wait until twisted.names resolver will be fixed and then perform necessary changes, e.g. switch to sequence of ints.

@nyov
Copy link
Contributor

@nyov nyov commented Apr 2, 2015

IMO, the best would be to wait until twisted.names resolver will be fixed and then perform necessary changes, e.g. switch to sequence of ints.

But then it would be an avoidable change to a user-facing setting, moving DNS_TIMEOUT from a float argument to a sequence later?

@sibiryakov
Copy link
Contributor Author

@sibiryakov sibiryakov commented Apr 2, 2015

We could support both then. By checking the type of value in settings, and using one timeout or sequence.

@nyov
Copy link
Contributor

@nyov nyov commented Apr 2, 2015

Never mind, if you have discussed this and consider it good, lets get some of these PRs merged.
This release keeps dragging on 😫

@kmike
Copy link
Member

@kmike kmike commented Apr 2, 2015

@nyov sorry, we discussed a list vs a single value in a private chat. The problem with list is that from Scrapy user point of view it is meaningless - "provide a list and we'll sum up the values" is a strange API. As @sibiryakov said, we can always add another setting in future, or just allow a list of values as well as a single value for this option.

This PR (like many others) is not marked as Scrapy 1.0, so merging it is not a requirement for 1.0 release. It just happens that when you start paying more attention to PRs more PRs begin to appear :)

@nyov
Copy link
Contributor

@nyov nyov commented Apr 2, 2015

It just happens that when you start paying more attention to PRs more PRs begin to appear :)

Aye. And I was hoping to smuggle some more in as well :] So it's partially my own fault. But as most of the still open tickets for 1.0 are things I can't really contribute to, there is nothing else for me but finding other stuff to push.

@sibiryakov
Copy link
Contributor Author

@sibiryakov sibiryakov commented Apr 13, 2015

Ping! Guys, what about merging it?

pablohoffman added a commit that referenced this pull request Apr 15, 2015
[MRG+1] Dns cache size and timeout options
@pablohoffman pablohoffman merged commit fb85bd4 into scrapy:master Apr 15, 2015
1 check passed
1 check passed
continuous-integration/travis-ci/pr The Travis CI build passed
Details
@@ -4,15 +4,20 @@
from scrapy.utils.datatypes import LocalCache

# TODO: cache misses
# TODO: make cache size a setting

dnscache = LocalCache(10000)

This comment has been minimized.

@nyov

nyov Apr 28, 2015
Contributor

Shouldn't this limit change with DNSCACHE_SIZE? Or is this the upper hard limit.

edit: Ignore this, I missed the dnscache.limit = cache_size line!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Linked issues

Successfully merging this pull request may close these issues.

None yet

4 participants
You can’t perform that action at this time.