Conversation
Agent-Logs-Url: https://github.com/pickwicksoft/pystreamapi/sessions/1a3f86a6-3bc8-41dc-9d9f-8f20f6746e42 Co-authored-by: garlontas <70283087+garlontas@users.noreply.github.com>
Agent-Logs-Url: https://github.com/pickwicksoft/pystreamapi/sessions/1a3f86a6-3bc8-41dc-9d9f-8f20f6746e42 Co-authored-by: garlontas <70283087+garlontas@users.noreply.github.com>
Agent-Logs-Url: https://github.com/pickwicksoft/pystreamapi/sessions/201e043c-1007-4214-bfe2-55563c47281f Co-authored-by: garlontas <70283087+garlontas@users.noreply.github.com>
|
Reviewer's GuideImplements a new TOML data loader using tomlkit that lazily parses TOML content from files or strings into nested namedtuples, wires it into the loaders package and extras, adds tomlkit as a dependency, and introduces tests to validate behavior and edge cases. Sequence diagram for TOML loader entrypoint and lazy loadingsequenceDiagram
participant Client
participant TomlLoader
participant LoaderUtils
participant FileSystem
participant Tomlkit
Client->>TomlLoader: toml(src, read_from_src)
alt read_from_src is true
TomlLoader->>TomlLoader: __lazy_load_toml_string(src)
TomlLoader->>Tomlkit: loads(toml_string)
Tomlkit-->>TomlLoader: dict_data
TomlLoader->>TomlLoader: __dict_to_namedtuple(dict_data)
TomlLoader-->>Client: Iterator[namedtuple]
else read_from_src is false
TomlLoader->>LoaderUtils: validate_path(src)
LoaderUtils-->>TomlLoader: file_path
TomlLoader->>FileSystem: open(file_path, read)
FileSystem-->>TomlLoader: toml_string
TomlLoader->>Tomlkit: loads(toml_string)
Tomlkit-->>TomlLoader: dict_data
TomlLoader->>TomlLoader: __dict_to_namedtuple(dict_data)
TomlLoader-->>Client: Iterator[namedtuple]
end
File-Level Changes
Assessment against linked issues
Possibly linked issues
Tips and commandsInteracting with Sourcery
Customizing Your ExperienceAccess your dashboard to:
Getting Help
|
There was a problem hiding this comment.
Hey - I've found 2 issues, and left some high level feedback:
- The
__dict_to_namedtupleconversion assumes all TOML keys are valid Python identifiers, which will break for quoted/space-containing keys; consider normalizing or falling back to a dict for such cases to avoid runtimeValueErrorfromnamedtuple. - The TOML loader always yields a single top-level namedtuple (with arrays-of-tables nested inside), which may differ from how other loaders stream multiple records; it may be worth verifying and documenting that this behavior is intentionally aligned with the existing loader APIs.
Prompt for AI Agents
Please address the comments from this code review:
## Overall Comments
- The `__dict_to_namedtuple` conversion assumes all TOML keys are valid Python identifiers, which will break for quoted/space-containing keys; consider normalizing or falling back to a dict for such cases to avoid runtime `ValueError` from `namedtuple`.
- The TOML loader always yields a single top-level namedtuple (with arrays-of-tables nested inside), which may differ from how other loaders stream multiple records; it may be worth verifying and documenting that this behavior is intentionally aligned with the existing loader APIs.
## Individual Comments
### Comment 1
<location path="pystreamapi/loaders/__toml/__toml_loader.py" line_range="55-59" />
<code_context>
+ return generator()
+
+
+def __dict_to_namedtuple(data, name='Item'):
+ """Recursively convert a dictionary (or list) to namedtuples."""
+ if isinstance(data, dict):
+ fields = list(data.keys())
+ Item = namedtuple(name, fields)
+ return Item(**{k: __dict_to_namedtuple(v, k) for k, v in data.items()})
+ if isinstance(data, list):
</code_context>
<issue_to_address>
**issue:** Raw TOML keys and the `name` argument may not be valid Python identifiers, which can cause `namedtuple` construction to fail.
Because TOML keys and table names can include `-`, spaces, or start with digits, both the generated field names and the `name` you pass to `namedtuple` can be invalid identifiers, causing a `ValueError` at runtime.
To handle this safely:
- Call `namedtuple(name, fields, rename=True)` to automatically fix invalid/duplicate field names.
- Sanitize `name` when it comes from a TOML key (e.g., replace non-identifier characters, or prefix if it starts with a digit).
This will let the loader handle a wider range of valid TOML inputs without failing.
</issue_to_address>
### Comment 2
<location path="tests/_loaders/test_toml_loader.py" line_range="58-67" />
<code_context>
+ with self.mock_file(""):
+ self.assertEqual(list(toml(file_path)), [])
+
+ def test_toml_loader_from_empty_string(self):
+ self.assertEqual(list(toml("", read_from_src=True)), [])
+
+ def test_toml_loader_with_invalid_path(self):
</code_context>
<issue_to_address>
**suggestion (testing):** Consider adding a test for whitespace-only TOML input to mirror the implementation’s `strip()` check
Since empty content is detected via `.strip()`, whitespace-only input (e.g. `" \n"`) should be treated as empty as well. Please add a test like `self.assertEqual(list(toml(" \n", read_from_src=True)), [])` to capture this behavior and prevent regressions in whitespace handling.
```suggestion
def test_toml_loader_with_empty_file(self):
with self.mock_file(""):
self.assertEqual(list(toml(file_path)), [])
def test_toml_loader_from_empty_string(self):
self.assertEqual(list(toml("", read_from_src=True)), [])
def test_toml_loader_from_whitespace_string(self):
self.assertEqual(list(toml(" \n", read_from_src=True)), [])
def test_toml_loader_with_invalid_path(self):
with self.assertRaises(FileNotFoundError):
toml('path/to/invalid.toml')
```
</issue_to_address>Help me be more useful! Please click 👍 or 👎 on each comment and I'll use the feedback to improve your reviews.
| def __dict_to_namedtuple(data, name='Item'): | ||
| """Recursively convert a dictionary (or list) to namedtuples.""" | ||
| if isinstance(data, dict): | ||
| fields = list(data.keys()) | ||
| Item = namedtuple(name, fields) |
There was a problem hiding this comment.
issue: Raw TOML keys and the name argument may not be valid Python identifiers, which can cause namedtuple construction to fail.
Because TOML keys and table names can include -, spaces, or start with digits, both the generated field names and the name you pass to namedtuple can be invalid identifiers, causing a ValueError at runtime.
To handle this safely:
- Call
namedtuple(name, fields, rename=True)to automatically fix invalid/duplicate field names. - Sanitize
namewhen it comes from a TOML key (e.g., replace non-identifier characters, or prefix if it starts with a digit).
This will let the loader handle a wider range of valid TOML inputs without failing.
| def test_toml_loader_with_empty_file(self): | ||
| with self.mock_file(""): | ||
| self.assertEqual(list(toml(file_path)), []) | ||
|
|
||
| def test_toml_loader_from_empty_string(self): | ||
| self.assertEqual(list(toml("", read_from_src=True)), []) | ||
|
|
||
| def test_toml_loader_with_invalid_path(self): | ||
| with self.assertRaises(FileNotFoundError): | ||
| toml('path/to/invalid.toml') |
There was a problem hiding this comment.
suggestion (testing): Consider adding a test for whitespace-only TOML input to mirror the implementation’s strip() check
Since empty content is detected via .strip(), whitespace-only input (e.g. " \n") should be treated as empty as well. Please add a test like self.assertEqual(list(toml(" \n", read_from_src=True)), []) to capture this behavior and prevent regressions in whitespace handling.
| def test_toml_loader_with_empty_file(self): | |
| with self.mock_file(""): | |
| self.assertEqual(list(toml(file_path)), []) | |
| def test_toml_loader_from_empty_string(self): | |
| self.assertEqual(list(toml("", read_from_src=True)), []) | |
| def test_toml_loader_with_invalid_path(self): | |
| with self.assertRaises(FileNotFoundError): | |
| toml('path/to/invalid.toml') | |
| def test_toml_loader_with_empty_file(self): | |
| with self.mock_file(""): | |
| self.assertEqual(list(toml(file_path)), []) | |
| def test_toml_loader_from_empty_string(self): | |
| self.assertEqual(list(toml("", read_from_src=True)), []) | |
| def test_toml_loader_from_whitespace_string(self): | |
| self.assertEqual(list(toml(" \n", read_from_src=True)), []) | |
| def test_toml_loader_with_invalid_path(self): | |
| with self.assertRaises(FileNotFoundError): | |
| toml('path/to/invalid.toml') |



poetry lockto regenerate the lock file to match pyproject.tomlcloses Data loader for TOML #73
Summary by Sourcery
Add support for loading TOML data alongside existing CSV and JSON loaders.
New Features:
Enhancements:
Build:
Tests: