Override request_fingerprint with meta field? #126

Closed
jcushman opened this issue Apr 28, 2012 · 5 comments

Comments

@jcushman

Would it make sense to add something to the top of scrapy.utils.request.request_fingerprint like:

if 'fingerprint' in request.meta:
    return request.meta['fingerprint']

This is useful, for example, if you consider two pages to be identical if they share the same productID query parameter:

import re
from scrapy.http import Request

def parse(self, response):
    ...
    request = Request(url)
    m = re.search(r'productID=([^&]+)', url)
    if m:
        # pages sharing a productID get the same fingerprint
        request.meta['fingerprint'] = m.group(1)

I'm currently handling this with a custom duplicate filter, but it seems like it would be broadly useful.
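
To make the proposal concrete, here is a minimal sketch that runs without Scrapy. FakeRequest, fingerprint and make_request are hypothetical stand-ins, and the fallback hash (SHA-1 of method and URL) is an assumption for illustration, not Scrapy's actual fingerprint algorithm. It only shows how a meta override could make two URLs with the same productID compare as duplicates:

```python
import hashlib
import re
from dataclasses import dataclass, field


@dataclass
class FakeRequest:
    """Hypothetical stand-in for scrapy.http.Request (url, method, meta only)."""
    url: str
    method: str = "GET"
    meta: dict = field(default_factory=dict)


def fingerprint(request):
    """Return meta['fingerprint'] if present, else a hash of method and URL.

    The fallback hash is an assumption made for this sketch only.
    """
    if 'fingerprint' in request.meta:
        return request.meta['fingerprint']
    data = (request.method + " " + request.url).encode("utf-8")
    return hashlib.sha1(data).hexdigest()


def make_request(url):
    """Build a request keyed on productID when the URL carries one."""
    request = FakeRequest(url)
    m = re.search(r'productID=([^&]+)', url)
    if m:
        request.meta['fingerprint'] = m.group(1)
    return request


a = make_request("http://example.com/item?productID=42&ref=home")
b = make_request("http://example.com/item?ref=search&productID=42")
c = make_request("http://example.com/about")

# a and b differ as URLs but share a productID, so they share a fingerprint.
same_product = fingerprint(a) == fingerprint(b)
# c has no productID, so it falls back to the URL-based hash.
distinct_page = fingerprint(c) != fingerprint(a)
```

A duplicate filter that stores these values in a set would then skip b after seeing a, which is the behaviour the custom filter currently provides.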

@pablohoffman
Member

This sounds good to me. I'm happy to review and merge a pull request that implements it.

@dangra
Member

dangra commented May 10, 2012

+1 too

Along with this change, what do you think about dropping _request_fingerprint_cache
and storing the computed fingerprint in meta, as suggested in this ticket?

@dangra
Member

dangra commented May 10, 2012

just figured out a problem with storing fingerprint in request.meta :-(

request.meta is propagated by request.replace(), which is commonly used to redirect, retry, or simply propagate intermediate results stored in meta to other URLs, so a fingerprint stored there would be carried over to a request for a different URL
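
The problem can be illustrated with a small sketch that does not depend on Scrapy. FakeRequest is a hypothetical stand-in whose replace() copies meta by default, which is the behaviour described above; the URLs are made up for the example:

```python
from dataclasses import dataclass, field


@dataclass
class FakeRequest:
    """Hypothetical stand-in for scrapy.http.Request (url and meta only)."""
    url: str
    meta: dict = field(default_factory=dict)

    def replace(self, **kwargs):
        """Return a copy with some attributes changed.

        Attributes that are not overridden, including meta, carry over.
        """
        kwargs.setdefault("url", self.url)
        kwargs.setdefault("meta", dict(self.meta))
        return FakeRequest(**kwargs)


original = FakeRequest(
    "http://example.com/item?productID=42",
    meta={"fingerprint": "42"},
)

# A redirect-style replace: new URL, everything else copied.
redirected = original.replace(url="http://example.com/login")

# The fingerprint now describes the old URL, not the one being requested.
stale = redirected.meta.get("fingerprint")
```

With a meta-stored fingerprint, the redirected request would look like a duplicate of the original even though it points somewhere else, unless the key is cleared on every replace().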

@pablohoffman
Member

Yes, just using request.meta['fingerprint'] wouldn't work, but I still like the original proposal of supporting it as an additional source for the fingerprint.

@jcushman mind submitting a pull request? ;)

@dangra
Member

dangra commented Jan 29, 2013

dusty.

3 participants