Type error when we try to retrieve the FEEDS setting via CLI and it has a Path object as a key #5383

Closed
waveFrontSet opened this issue Jan 30, 2022 · 2 comments · Fixed by #5384

waveFrontSet commented Jan 30, 2022

Description

When a Path object is used as a key in the FEEDS dictionary and we try to obtain the FEEDS setting via CLI scrapy settings --get FEEDS, a type error is raised:

TypeError: keys must be str, int, float, bool or None, not PosixPath

A complete stack trace is attached below under "Actual behavior".

Path objects as keys are allowed as documented in the FEEDS section of the Feed exports chapter.

Steps to Reproduce

  1. In a freshly generated Scrapy project (e.g. via scrapy startproject feeds_bug), prepend the following lines to settings.py:
from pathlib import Path

FEEDS = {
    Path("some_path/file.csv"): {
        "format": "csv"
    }
}
  2. Execute scrapy settings --get FEEDS

Expected behavior: Some string representation of the FEEDS setting gets printed to standard out, maybe like this:

{"some_path/file.csv": {"format": "csv"}}

Actual behavior: A type error is raised:

Traceback (most recent call last):
  File "/Users/paul/Projects/feeds-bug/.direnv/python-3.9.7/bin/scrapy", line 8, in <module>
    sys.exit(execute())
  File "/Users/paul/Projects/feeds-bug/.direnv/python-3.9.7/lib/python3.9/site-packages/scrapy/cmdline.py", line 145, in execute
    _run_print_help(parser, _run_command, cmd, args, opts)
  File "/Users/paul/Projects/feeds-bug/.direnv/python-3.9.7/lib/python3.9/site-packages/scrapy/cmdline.py", line 100, in _run_print_help
    func(*a, **kw)
  File "/Users/paul/Projects/feeds-bug/.direnv/python-3.9.7/lib/python3.9/site-packages/scrapy/cmdline.py", line 153, in _run_command
    cmd.run(args, opts)
  File "/Users/paul/Projects/feeds-bug/.direnv/python-3.9.7/lib/python3.9/site-packages/scrapy/commands/settings.py", line 37, in run
    print(json.dumps(s.copy_to_dict()))
  File "/Users/paul/.pyenv/versions/3.9.7/lib/python3.9/json/__init__.py", line 231, in dumps
    return _default_encoder.encode(obj)
  File "/Users/paul/.pyenv/versions/3.9.7/lib/python3.9/json/encoder.py", line 199, in encode
    chunks = self.iterencode(o, _one_shot=True)
  File "/Users/paul/.pyenv/versions/3.9.7/lib/python3.9/json/encoder.py", line 257, in iterencode
    return _iterencode(o, 0)
TypeError: keys must be str, int, float, bool or None, not PosixPath

Reproduces how often: 100%
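
The failure does not seem to depend on Scrapy itself. The following is a minimal sketch that reproduces the same class of error with only the standard library, assuming the settings command hands the dict to json.dumps() unchanged, as the traceback suggests. PurePosixPath is used here instead of Path only so that the snippet behaves the same on every platform; the variable names are illustrative.

```python
import json
from pathlib import PurePosixPath

# Same shape as the FEEDS setting from the steps above.
feeds = {PurePosixPath("some_path/file.csv"): {"format": "csv"}}

try:
    json.dumps(feeds)
    error_message = None
except TypeError as exc:
    # json only accepts str, int, float, bool or None as dict keys.
    error_message = str(exc)

print(error_message)
```

If this sketch matches what happens inside the command, any non-primitive key in a dict-valued setting would trigger the same TypeError, not just Path objects.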

Versions

Scrapy       : 2.5.1
lxml         : 4.7.1.0
libxml2      : 2.9.12
cssselect    : 1.1.0
parsel       : 1.6.0
w3lib        : 1.22.0
Twisted      : 21.7.0
Python       : 3.9.7 (default, Dec 11 2021, 11:25:57) - [Clang 13.0.0 (clang-1300.0.29.3)]
pyOpenSSL    : 22.0.0 (OpenSSL 1.1.1m  14 Dec 2021)
cryptography : 36.0.1
Platform     : macOS-11.6-x86_64-i386-64bit

wRAR (Member) commented Jan 30, 2022

(I didn't even know we have a settings command)

wRAR (Member) commented Jan 31, 2022

So it simply takes all settings and, if a setting value is a dict, serializes it into JSON. That can't always work, as not all Python dicts are JSON-serializable (and not only because of the key type requirements). As the command just prints the values, it should be fine if the representation is not 100% faithful, but I'm not sure how to handle serialization problems without explicit handling of special cases. Maybe providing a default function that calls str() would be good enough.
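
One caveat with the str() idea, sketched below with the standard library only: the default hook of json.dumps() is consulted for values that cannot be encoded, but not for dict keys, so default=str on its own would still raise for a Path key. The helper stringify_keys is a hypothetical name introduced for this illustration and is not necessarily what the fix in #5384 does; PurePosixPath stands in for Path so the output is the same on every platform.

```python
import json
from pathlib import PurePosixPath

JSON_KEY_TYPES = (str, int, float, bool, type(None))


def stringify_keys(obj):
    """Hypothetical helper: recursively turn non-JSON dict keys into strings."""
    if isinstance(obj, dict):
        return {
            (key if isinstance(key, JSON_KEY_TYPES) else str(key)): stringify_keys(value)
            for key, value in obj.items()
        }
    if isinstance(obj, (list, tuple)):
        return [stringify_keys(item) for item in obj]
    return obj


feeds = {PurePosixPath("some_path/file.csv"): {"format": "csv"}}

# default=str alone does not help, because it is never applied to keys.
try:
    json.dumps(feeds, default=str)
    default_alone_works = True
except TypeError:
    default_alone_works = False

# Converting the keys first, and keeping default=str for odd values, works.
rendered = json.dumps(stringify_keys(feeds), default=str)
print(default_alone_works)
print(rendered)
```

With the keys converted up front, the output has the shape given under "Expected behavior" above, at the cost of the representation no longer distinguishing a Path key from a plain string key.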
