Closed
Description
Absolute paths on Windows contain a colon (:) as part of the drive name,
which leads to unexpected behaviour when such a path is passed as the output URI.
Steps to Reproduce
Case 1: run the command scrapy crawl spider_name -O "C:\path\to\output.json"
Case 2: run the command scrapy crawl spider_name -O "C:\path\to\output.json:json"
Expected behavior
For both commands, the crawl results should be saved to the file
C:\path\to\output.json.
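A minimal sketch of the splitting I would expect, to make the two cases concrete. This is not Scrapy's code: split_output and KNOWN_FORMATS are hypothetical names made up for illustration, and the format list is simply copied from the error message in this report.

```python
from pathlib import PureWindowsPath

# Formats copied from the error message in this report (assumption: the
# list is complete enough for illustration).
KNOWN_FORMATS = {"json", "jsonlines", "jsonl", "jl", "csv", "xml", "marshal", "pickle"}


def split_output(value):
    """Split a -o/-O value into (path, format) without breaking on a drive colon.

    Hypothetical helper for illustration only.
    """
    head, sep, tail = value.rpartition(":")
    # Treat the last colon as a format separator only when what follows it
    # is a known format name; a drive colon is followed by a path instead.
    if sep and tail in KNOWN_FORMATS:
        return head, tail
    # Otherwise fall back to the file extension.
    return value, PureWindowsPath(value).suffix.lstrip(".")


case_1 = split_output(r"C:\path\to\output.json")
case_2 = split_output(r"C:\path\to\output.json:json")
```

With this approach both values resolve to the same path and the json format, which is the behaviour described above.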
Actual behavior
Case 1: error
Unrecognized output format '\path\to\output.json'. Set a supported one (('json', 'jsonlines', 'jsonl', 'jl', 'csv', 'xml', 'marshal', 'pickle')) after a colon at the end of the output URI (i.e. -o/-O <URI>:<FORMAT>) or as a file extension.
Case 2: no errors, but the file is not created. Note: in this example I use the system drive, but the file is not created on any other drive either (i.e., permissions are not the issue).
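A possible explanation for both results, which is my guess from the error message and not a confirmed reading of Scrapy's source: the option value is split on its last colon, and whatever remains is later treated as a URI. The snippet below only uses the Python standard library to show what that would do to the two values.

```python
from urllib.parse import urlparse

case_1 = r"C:\path\to\output.json"
case_2 = r"C:\path\to\output.json:json"

# Case 1: splitting on the last colon cuts off the drive letter and
# treats the rest of the path as the output format.
uri_1, fmt_1 = case_1.rsplit(":", 1)
print(uri_1, fmt_1)

# Case 2: the format is split off correctly, but the remaining path still
# starts with "C:", which a URL parser reads as a URI scheme named "c".
uri_2, fmt_2 = case_2.rsplit(":", 1)
scheme_2 = urlparse(uri_2).scheme
print(uri_2, fmt_2, scheme_2)
```

If this guess is right, Case 1 fails because the "format" is the path without its drive letter (matching the error text above), and Case 2 produces no file because the path is read as a URI with an unknown scheme "c" instead of a local file.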
Reproduces how often
Always
Versions
Scrapy : 2.9.0
lxml : 4.9.2.0
libxml2 : 2.9.12
cssselect : 1.2.0
parsel : 1.8.1
w3lib : 2.1.1
Twisted : 22.10.0
Python : 3.11.4 (tags/v3.11.4:d2340ef, Jun 7 2023, 05:45:37) [MSC v.1934 64 bit (AMD64)]
pyOpenSSL : 23.2.0 (OpenSSL 3.1.1 30 May 2023)
cryptography : 41.0.1
Platform : Windows-10-10.0.19044-SP0