Having an option to define an Items output file in the scrapy parse command line #4317

Closed
raphapassini opened this issue Feb 7, 2020 · 11 comments · Fixed by #4377

Comments

@raphapassini
Contributor

I use the scrapy parse command to check the Items output for a given URL, and I use it a lot.

Sometimes I need to explore the Items extracted by scrapy parse later, e.g. to check that something I wanted to fix was fixed for that specific URL without having to touch the source code.

Proposed solution:

Add a new parameter, -o or --output, to the parse command line, allowing the user to set a desired output file for the items.

I could also pipe (|) the result of scrapy parse, filter out what I want, and create a JSON file from it, but it would be much easier if I could simply write scrapy parse {url} --callback=fixed_function -o items.json
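
To make the request concrete, here is a minimal sketch of how an -o/--output option could behave. It uses the standard library's argparse and json rather than Scrapy's own command classes, so it is not how Scrapy implements the parse command; build_parser, write_items, main, and the "parse-sketch" program name are hypothetical, and the items are assumed to be plain dicts that were already extracted.

```python
import argparse
import json


def build_parser():
    # Hypothetical parser: it only illustrates the proposed -o/--output flag
    # and is not Scrapy's actual parse command.
    parser = argparse.ArgumentParser(prog="parse-sketch")
    parser.add_argument("url")
    parser.add_argument("--callback", default="parse")
    parser.add_argument(
        "-o", "--output", metavar="FILE", default=None,
        help="write extracted items to FILE as JSON",
    )
    return parser


def write_items(items, path):
    # Dump already-extracted items (assumed to be plain dicts) to a JSON file.
    with open(path, "w", encoding="utf-8") as f:
        json.dump(items, f, indent=2, ensure_ascii=False)


def main(argv, items):
    # `items` stands in for whatever the chosen callback extracted.
    args = build_parser().parse_args(argv)
    if args.output:
        write_items(items, args.output)
    return args
```

Calling main(["http://example.com", "--callback=fixed_function", "-o", "items.json"], items) would write the items to items.json and leave everything else about the run unchanged.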

@adityaa30
Contributor

Shall I work on this issue?

@Gallaecio
Member

@adityaa30 Please, go ahead, no need to ask for permission 🙂

@adityaa30
Contributor

Okay sure!

@akshaysharmajs
Contributor

Hi, I'm working on this issue and made some changes in scrapy/commands/parse.py, but running "scrapy parse" from the command line does not reflect the changes. Any suggestions?

@akshaysharmajs
Contributor

@Gallaecio How can I run Scrapy from the cloned repository on my local computer? I think the "scrapy parse" command is running the Scrapy package installed through "pip install scrapy".

@elacuesta
Member

@akshaysharmajs you should be able to install Scrapy in "development mode" by executing pip install -e .
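
One way to confirm which copy of a package Python picks up after an editable install is to check where the import resolves. A minimal sketch using only the standard library; module_location is a hypothetical helper name, not part of Scrapy or pip.

```python
import importlib.util


def module_location(name):
    # Return the file path a top-level module would be imported from,
    # or None if the module cannot be found.
    spec = importlib.util.find_spec(name)
    return spec.origin if spec is not None else None


if __name__ == "__main__":
    # With a development install, this path is expected to point into the
    # cloned repository rather than into site-packages.
    print(module_location("scrapy"))
```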

@akshaysharmajs
Contributor

@elacuesta Thanks!

@adityaa30
Contributor

adityaa30 commented Feb 25, 2020

@raphapassini @Gallaecio I have sent a PR, #4377. I have marked it as a work in progress because I have a few questions:

  1. If the user passes the --output option, it may not be necessary to show the scraped data in the terminal. Shall I add a check so that the data is not printed when --output is specified?
  2. Shall I add another option to save the list of requests made to a separate file?

@Gallaecio
Member

@adityaa30 I think it is better to keep the implementation simple, so I personally would not implement 1 or 2 unless people show up asking for them and explaining their use cases.

@adityaa30
Contributor

@Gallaecio Okay!

@raphapassini
Contributor Author

Hey @adityaa30, this is a long-awaited one, thanks for working on this 💃


5 participants