Add a command-line option for overwriting exported file #547
Comments
I've been bitten by this too, I prefer the single option to the modifier. +1 for |
Why not overwrite by default and provide an option to append? On Mon, Jan 20, 2014 at 9:16 AM, Daniel Graña notifications@github.com wrote:
|
@darkrho: it's an option, but it changes the current behavior. Does it really deserve a backward-incompatible change? |
@darkrho and it is nicer to have -o and -O. |
If we have But @darkrho raised an interesting question because As a side note: |
How about emulating wget here, of all things. By default create new files if one exists, as We have a PR, could anyone take a look (the question was asked why unittests are failing for it). |
-o option creating filename.out.1 filename.out.2 filename.out.3 files by default looks good to me. |
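The wget-style numbering suggested above could be sketched roughly as follows. This is only an illustration of the idea, not Scrapy code: `next_free_path` is a hypothetical helper name, and the sketch assumes exports go to the local disk.

```python
import os


def next_free_path(path):
    """Return `path` if it is unused, else the first free path.1, path.2, ...

    Hypothetical helper for illustration; not part of Scrapy.
    """
    if not os.path.exists(path):
        return path
    n = 1
    # Probe numbered suffixes until one does not exist yet.
    while os.path.exists(f"{path}.{n}"):
        n += 1
    return f"{path}.{n}"
```

Note that checking for existence and then creating the file is not atomic, so two crawls started at the same moment could still pick the same name.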
By the way, does anyone need an appending option? What is it useful for? |
+1. I realized it was appending the other day when my app broke while updating a JSON file with scrapy crawl. I don't want it to append. I'm not sure whether the current design allows it, but how about sending only the items to stdout with -o and using it with > or >>? |
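A likely reason appending breaks a JSON consumer, assuming each run writes one complete JSON array: two arrays written back to back are not a single valid JSON document. The item contents and the `is_valid_json` helper below are made up for the illustration.

```python
import json

# Assumption: each crawl run serializes its items as one JSON array.
first_run = json.dumps([{"title": "a"}])
second_run = json.dumps([{"title": "b"}])

# Appending the second run to the file left behind by the first one.
appended = first_run + "\n" + second_run


def is_valid_json(text):
    """Return True if `text` parses as a single JSON document."""
    try:
        json.loads(text)
    except json.JSONDecodeError:
        return False
    return True
```

Here `is_valid_json(second_run)` holds while `is_valid_json(appended)` does not, which is the case for overwriting when the output is meant to be parsed as a whole.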
+1 |
+1 Any progress on this? |
+1 for -o option |
The current PR looks inadequate to me. Sorry. Changing the behaviour to append/overwrite looks more daunting. Doing it right would mean modifying the Exporter's codebase to handle append or overwrite options like a File interface (IMHO), with the implementation depending on the actual Exporter backend (local disk, s3, datastore, ...). Also @kmike's proposal
would require more thought. It sounds great, but if |
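For the local-disk case only, a file-like append/overwrite switch might look like the sketch below. `LocalFileTarget` and `export` are hypothetical names invented here, not Scrapy classes; other backends such as s3 would need their own handling.

```python
class LocalFileTarget:
    """Hypothetical local-disk export target; not part of Scrapy."""

    def __init__(self, path, overwrite=False):
        self.path = path
        self.overwrite = overwrite

    def open(self):
        # "wb" truncates an existing file; "ab" keeps it and writes at the end.
        return open(self.path, "wb" if self.overwrite else "ab")


def export(target, payload):
    """Write `payload` (bytes) to the target using its configured mode."""
    with target.open() as f:
        f.write(payload)
```

With `overwrite=False` repeated exports accumulate in the file; with `overwrite=True` only the last export survives.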
I've been using a custom CsvItemExporter to overwrite existing exports and to use a custom CSV_DELIMITER from settings. Now I need to set the FEED_URI in the spider. The problem is that my custom CsvItemExporter no longer clears existing exports, because it only does that on init. |
You can simply output to stdout and redirect that output to a file: |
@dannykopping this is not an option when scheduling through scrapyd. |
In the meantime one could subclass |
What do you think about adding an option to overwrite/recreate exported file?
Something like
or
This is useful during development where old data is not needed. I usually run
multiple times. This is not DRY: if I want to change the file name for the next iteration and preserve the existing file, then I must be careful to update both names (of course, the command comes from shell autocompletion).