Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Upgrade from toml to tomlkit library #7468

Merged
merged 6 commits into from
Aug 5, 2022
Merged

Conversation

syntapy
Copy link
Contributor

@syntapy syntapy commented Mar 16, 2022

Still maintains functionality to use legacy toml lib in case they pass a decoder to the parser

Todo

  • ❗ I have followed the Contributing to DVC checklist.

  • πŸ“– If this PR requires documentation updates, I have created a separate PR (or issue, at least) in dvc.org and linked it here.

  • Fix lint errors

  • Functional test to ensure comments preserved on parse_toml_for_update

  • Unit test that parse_toml: returns dict if not preserving comments, or TOMLDocument if preserving comments

  • Unit test that parse_toml_for_update returns TOMLDocument

  • Test that modify_toml to preserves formatting

Thank you for the contribution - we'll try to review it as soon as possible. πŸ™

Fixes #7253

@syntapy syntapy requested a review from a team as a code owner March 16, 2022 06:26
@syntapy syntapy requested a review from pmrowla March 16, 2022 06:26
@syntapy syntapy changed the title Tomlkit Upgrade from toml to tomlkit library Mar 16, 2022
@syntapy syntapy marked this pull request as draft March 16, 2022 06:30
@pmrowla pmrowla requested a review from skshetry March 16, 2022 06:52
@syntapy
Copy link
Contributor Author

syntapy commented Mar 16, 2022

Ok, I think I need to change the way I'm doing the mock toml file

@skshetry
Copy link
Member

skshetry commented Mar 16, 2022

Hey @syntapy, thank you so much for contributing.

Still maintains functionality to use legacy toml lib in case they pass a decoder to the parser

We don't need to maintain a legacy toml library. We don't directly depend on it, and only expect:

  • parse_toml to return us a dictionary
  • parse_toml_for_update to return a dictionary that when later dumped via _dump preserves formatting/comments.
  • _dump that can serialize a dictionary to stream as a text
  • modify_toml that provides a contextmanager that can update toml files and preserve formatting/comments. If parse_toml_for_update and _dump preserves formatting, this should automatically work.

Regarding tests, we probably don't need anything other than that the ordering/formatting/comments are preserved on update (if existing tests pass).

@syntapy

This comment was marked as outdated.

@syntapy

This comment was marked as outdated.

@syntapy syntapy force-pushed the tomlkit branch 4 times, most recently from 050c06f to b0b5fdb Compare March 19, 2022 05:07
@syntapy syntapy marked this pull request as ready for review March 19, 2022 09:00
@syntapy
Copy link
Contributor Author

syntapy commented Mar 19, 2022

@skshetry Ok, I think I'm ready to have this looked at. I still want to add a test for the format preservation of modify_toml but if you wan to look at it now I think I've basically got it ready for review

I dont know whats going on with the linter error though and I can't tell what the error in the code is

Also, when I run tests locally, they pass even though theres some tracebacks being shown

As one example from running python -m tests -s --pdb tests/func/params/test_show.py I get

Traceback (most recent call last):
  File "/dvc/dvc/fs/utils.py", line 28, in _link
    func(from_path, to_path)
  File "/dvc/dvc/fs/local.py", line 145, in reflink
    System.reflink(from_info, to_info)
  File "/dvc/dvc/system.py", line 112, in reflink
    System._reflink_linux(source, link_name)
  File "/dvc/dvc/system.py", line 96, in _reflink_linux
    fcntl.ioctl(d.fileno(), FICLONE, s.fileno())
OSError: [Errno 95] Operation not supported

The above exception was the direct cause of the following exception:

But it doesn't cause any test in that batch to fail locally

@syntapy
Copy link
Contributor Author

syntapy commented Mar 19, 2022

Another traceback from running that local command in the above comment is below. And also not sure which test this or the above traceback is from.

Edit I just checked. I am getting same looking tracebacks on main branch, both for this and the above comment, so I probably didn't introduce anything

Traceback (most recent call last):
  File "/dvc/.venv/lib/python3.10/site-packages/funcy/flow.py", line 112, in reraise
    yield
  File "/dvc/dvc/utils/serialize/_yaml.py", line 29, in parse_yaml
    return yaml.load(text) or {}
  File "/dvc/.venv/lib/python3.10/site-packages/ruamel/yaml/main.py", line 434, in load
    return constructor.get_single_data()
  File "/dvc/.venv/lib/python3.10/site-packages/ruamel/yaml/constructor.py", line 119, in get_single_data
    node = self.composer.get_single_node()
  File "_ruamel_yaml.pyx", line 706, in _ruamel_yaml.CParser.get_single_node
  File "_ruamel_yaml.pyx", line 724, in _ruamel_yaml.CParser._compose_document
  File "_ruamel_yaml.pyx", line 775, in _ruamel_yaml.CParser._compose_node
  File "_ruamel_yaml.pyx", line 891, in _ruamel_yaml.CParser._compose_mapping_node
  File "_ruamel_yaml.pyx", line 904, in _ruamel_yaml.CParser._parse_next_event
ruamel.yaml.scanner.ScannerError: while scanning a simple key
  in "<unicode string>", line 8, column 1
could not find expected ':'
  in "<unicode string>", line 9, column 1

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/dvc/dvc/utils/__init__.py", line 490, in wrapper
    vals = func(*args, **kwargs)
  File "/dvc/dvc/repo/params/show.py", line 135, in _gather_params
    param_outs, params_fs_paths = _collect_configs(repo, rev, targets=targets)
  File "/dvc/dvc/repo/params/show.py", line 31, in _collect_configs
    params, fs_paths = collect(
  File "/dvc/dvc/repo/collect.py", line 81, in collect
    outs: Outputs = _collect_outs(repo, output_filter=output_filter, deps=deps)
  File "/dvc/dvc/repo/collect.py", line 24, in _collect_outs
    index.check_graph()  # ensure graph is correct
  File "/dvc/dvc/repo/index.py", line 268, in check_graph
    self.graph  # pylint: disable=pointless-statement
  File "/dvc/.venv/lib/python3.10/site-packages/funcy/objects.py", line 28, in __get__
    res = instance.__dict__[self.fget.__name__] = self.fget(instance)
  File "/dvc/dvc/repo/index.py", line 155, in graph
    return build_graph(self.stages, self.outs_trie)
  File "/dvc/.venv/lib/python3.10/site-packages/funcy/objects.py", line 28, in __get__
    res = instance.__dict__[self.fget.__name__] = self.fget(instance)
  File "/dvc/dvc/repo/index.py", line 74, in stages
    return self.stage_collector.collect_repo(onerror=onerror)
  File "/dvc/dvc/repo/stage.py", line 497, in collect_repo
    return list(self._collect_repo(onerror))
  File "/dvc/dvc/repo/stage.py", line 480, in _collect_repo
    new_stages = self.load_file(file_path)
  File "/dvc/dvc/repo/stage.py", line 304, in load_file
    return self.load_all(path)
  File "/dvc/dvc/repo/stage.py", line 276, in load_all
    stages = dvcfile.stages  # type: ignore
  File "/dvc/dvc/dvcfile.py", line 281, in stages
    data, _ = self._load()
  File "/dvc/dvc/dvcfile.py", line 150, in _load
    return self._load_yaml(**kwargs)
  File "/dvc/dvc/dvcfile.py", line 161, in _load_yaml
    return strictyaml.load(
  File "/dvc/dvc/utils/strictyaml.py", line 282, in load
    raise YAMLSyntaxError(path, text, exc) from cause
dvc.utils.strictyaml.YAMLSyntaxError: unable to read: 'dvc.yaml', YAML file structure is corrupted

@syntapy
Copy link
Contributor Author

syntapy commented Mar 23, 2022

Oh ok, those are the linter error messages. I tried running pre-commit locally on main branch and now I see

.gitignore Outdated Show resolved Hide resolved
@dberenbaum
Copy link
Contributor

Waiting on python-poetry/tomlkit#187?

@syntapy
Copy link
Contributor Author

syntapy commented May 20, 2022

@dberenbaum I think I may have finished the last aspects just now. If so then I believe this PR should be mostly done

@syntapy
Copy link
Contributor Author

syntapy commented May 25, 2022

Final failing test under tomlkit lib (aside from the linter, which I have yet to figure out)

Seems like " chars are being appended to back / front of str values in dictionary when reloading from dump_toml

The test code is in unit/dependency.test_params.py:

1: def test_read_params_toml(tmp_dir, dvc):
2:    parameters_file = "parameters.toml"
3:    dump_toml(parameters_file, {"some": {"path": {"foo": ["val1", "val2"]}}})
4:    dep = ParamsDependency(Stage(dvc), parameters_file, ["some.path.foo"])
5:    assert dep.read_params() == {"some.path.foo": ["val1", "val2"]}

on line 4 in above snippet, dep.read_params() is different from {"some": {"path": {"foo": ["val1", "val2"]}}} such that instead of being straight up strings for "val1" and "val2", it is '"val1"' and '"val2"'

I haven't modified this test, so I shouldn't touch it

@syntapy
Copy link
Contributor Author

syntapy commented Jul 1, 2022

The issue described in the above comment is fixed by python-poetry/tomlkit@1cb4c25 so I believe we're waiting for a new release to incorporate that

@syntapy syntapy requested a review from pmrowla July 7, 2022 17:25
@dberenbaum
Copy link
Contributor

Thanks @syntapy! Looks like this should be ready now (see https://github.com/sdispater/tomlkit/releases/tag/0.11.1)?

@syntapy
Copy link
Contributor Author

syntapy commented Jul 8, 2022

I added that release in the setup.cfg file just the other day. Is there anywhere else I need to do?

@dberenbaum
Copy link
Contributor

🀦 Sorry @syntapy, I missed your updates somehow.

@syntapy
Copy link
Contributor Author

syntapy commented Jul 11, 2022

I mean, its not like I haven't just let this sit anyhow. So yeah, thanks for checking in on this for me

Copy link
Contributor

@pmrowla pmrowla left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Changes look alright to me, but all of these commits need to be squashed before we can merge

@skshetry can you take another look at the current PR

Copy link
Member

@skshetry skshetry left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overall it looks great, I have a few cosmetic suggestions. πŸ”₯

Also thank you so much for all the upstream patches, and for helping us keep implementation simpler. Also, apologies for the late reviews. :)

tests/func/params/test_show.py Outdated Show resolved Hide resolved
dvc/utils/serialize/_toml.py Outdated Show resolved Hide resolved
setup.cfg Outdated Show resolved Hide resolved
@efiop efiop added the awaiting response we are waiting for your reply, please respond! :) label Jul 25, 2022
@syntapy syntapy requested a review from skshetry August 4, 2022 16:11
@skshetry skshetry merged commit c7b22b4 into iterative:main Aug 5, 2022
@syntapy syntapy deleted the tomlkit branch October 6, 2022 03:16
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
awaiting response we are waiting for your reply, please respond! :)
Projects
No open projects
Archived in project
Development

Successfully merging this pull request may close these issues.

toml: being 1.0.0 compliant, preserving style on dump
6 participants