Support multiple exporters when crawling #1336
Comments
I think this feature would be very useful. By the way,
You can achieve this by implementing multiple pipelines that use exporters. Once an item hits the first pipeline, it is recorded in the first format; Scrapy then releases the item to the second pipeline, where you can export it in a different format. See scrapy.exporters in the code and docs; it should be pretty easy.
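A minimal sketch of the pipeline idea, under some assumptions: it uses a single pipeline that fans each item out to two files rather than two chained pipelines, and it uses the standard library's `csv` and `json` writers as stand-ins for exporter classes so that it runs without Scrapy installed. In a real project, `CsvItemExporter` and `JsonLinesItemExporter` from `scrapy.exporters` could take their place. The class name, the file paths, and the assumption that items are dicts (or convertible with `dict()`) are all hypothetical.

```python
import csv
import json


class MultiFormatExportPipeline:
    """Write every scraped item to both a CSV file and a JSON Lines file."""

    def __init__(self, csv_path="items.csv", jsonl_path="items.jl"):
        # Both default paths are hypothetical, chosen for this sketch.
        self.csv_path = csv_path
        self.jsonl_path = jsonl_path
        self.csv_writer = None

    def open_spider(self, spider):
        # newline="" lets the csv module control line endings itself.
        self.csv_file = open(self.csv_path, "w", newline="", encoding="utf-8")
        self.jsonl_file = open(self.jsonl_path, "w", encoding="utf-8")

    def process_item(self, item, spider):
        row = dict(item)  # assumes dict-like items
        if self.csv_writer is None:
            # Take the CSV columns from the first item seen.
            self.csv_writer = csv.DictWriter(
                self.csv_file, fieldnames=list(row), extrasaction="ignore"
            )
            self.csv_writer.writeheader()
        self.csv_writer.writerow(row)
        self.jsonl_file.write(json.dumps(row) + "\n")
        # Return the item so any later pipeline still receives it.
        return item

    def close_spider(self, spider):
        self.csv_file.close()
        self.jsonl_file.close()
```

To try it, the class would be registered in the project's `ITEM_PIPELINES` setting, for example under a hypothetical path such as `myproject.pipelines.MultiFormatExportPipeline`. Because the item is scraped once and written twice, the site is not crawled a second time.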
Would this require the removal of the
@lufte: Not necessarily, we can throw an error if
@curita Just asking because, traditionally (at least in most GNU programs), command-line options don't need to be in a specific order to work. It's mostly a matter of style :)
@lufte if there is no
@kmike: Yes I understand that, but I could still use them and pass them in a weird order like
yes, I think for this case we may need to support another way to set output format - e.g.
That could work :)
I like that syntax too, actually seems simpler and a little bit shorter than
Adding "help-wanted" if there are any volunteers to come up with an implementation of this feature.
I'm scraping a website to export the data into a semantic format (n3). However, I also want to perform some data analysis on that data, so having it in a csv format is more convenient. To get the data in both formats I can do.
However, this scrapes the data twice, which I cannot afford with large amounts of data.
A solution that avoids scraping the data twice is to implement a pipeline that exports the data (see alecxe's suggestion for details). However, as the documentation explains, this is not the preferred way to export data.
Thus, I think it would be worthwhile for Scrapy to support multiple exporters.