Since we don't download resources, we currently have no way of knowing basic information about them without hitting each URL.
However, downloading would be rather costly in both time and disk space. We can probably fetch only the header information instead.
One idea is to use Scrapy's cache, if possible, but this needs investigation.
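A minimal sketch of the headers-only idea, assuming a plain HTTP `HEAD` request is acceptable for the target servers. It uses only the Python standard library rather than Scrapy, so it does not answer the cache question. The function name `fetch_headers` and the `WANTED_HEADERS` selection are illustrative assumptions, not part of the project:

```python
from urllib.request import Request, urlopen

# Hypothetical selection; the issue does not say which headers it wants.
WANTED_HEADERS = ("Content-Type", "Content-Length", "Last-Modified")


def fetch_headers(url, timeout=10.0):
    """Send a HEAD request and return the selected response headers.

    Headers the server does not send come back as None.
    """
    # HEAD asks the server for the status line and headers only, no body.
    request = Request(url, method="HEAD")
    with urlopen(request, timeout=timeout) as response:
        return {name: response.headers.get(name) for name in WANTED_HEADERS}
```

Some servers reject or mishandle `HEAD`, so a real implementation would likely need a fallback and error handling for those cases.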
Examples of useful headers to fetch for each downloadable file:
Acceptance criteria:
ETA: 3h
Implemented in #87, pending review and merge.
nightsh