tests don't work when run without git repository #48

frenzymadness · 2022-11-03T17:34:16Z

When packaging software for Fedora, we run tests in an unpacked archive without initialized git repository for the codebase because we are testing the installed version. Would it make sense to make the tests compatible with this way of running them?

Maybe the tests can initialize their own repositories in a tmpdir or something like that.

The current log I have is:

python -m pytest
===================================== test session starts =====================================
platform linux -- Python 3.10.7, pytest-7.1.3, pluggy-1.0.0
rootdir: /home/lbalhar/Software/databooks
collected 118 items                                                                           

tests/test_affirm.py ......................                                             [ 18%]
tests/test_cli.py ..........FF.FFF.F.F.FF                                               [ 38%]
tests/test_common.py ..                                                                 [ 39%]
tests/test_conflicts.py .........                                                       [ 47%]
tests/test_git_utils.py F.                                                              [ 49%]
tests/test_metadata.py ..........                                                       [ 57%]
tests/test_recipes.py ...............                                                   [ 70%]
tests/test_tui.py ..................                                                    [ 85%]
tests/test_data_models/test_base.py .                                                   [ 86%]
tests/test_data_models/test_notebook.py ................                                [100%]

========================================== FAILURES ===========================================
__________________________________________ test_meta __________________________________________

tmpdir = local('/tmp/pytest-of-lbalhar/pytest-1/test_meta0')

    def test_meta(tmpdir: LocalPath) -> None:
        """Remove notebook metadata."""
        read_path = tmpdir.mkdir("notebooks") / "test_meta_nb.ipynb"  # type: ignore
        TestJupyterNotebook().jupyter_notebook.write(read_path)
    
        nb_read = JupyterNotebook.parse_file(path=read_path)
        result = runner.invoke(app, ["meta", str(read_path), "--yes"])
        nb_write = JupyterNotebook.parse_file(path=read_path)
    
>       assert result.exit_code == 0
E       assert 1 == 0
E        +  where 1 = <Result InvalidGitRepositoryError()>.exit_code

tests/test_cli.py:50: AssertionError
______________________________________ test_meta__check _______________________________________

tmpdir = local('/tmp/pytest-of-lbalhar/pytest-1/test_meta__check0')
caplog = <_pytest.logging.LogCaptureFixture object at 0x7fca4a63baf0>

    def test_meta__check(tmpdir: LocalPath, caplog: LogCaptureFixture) -> None:
        """Report on existing notebook metadata (both when it is and isn't present)."""
        caplog.set_level(logging.INFO)
    
        read_path = tmpdir.mkdir("notebooks") / "test_meta_nb.ipynb"  # type: ignore
        TestJupyterNotebook().jupyter_notebook.write(read_path)
    
        nb_read = JupyterNotebook.parse_file(path=read_path)
        result = runner.invoke(app, ["meta", str(read_path), "--check"])
        nb_write = JupyterNotebook.parse_file(path=read_path)
    
        logs = list(caplog.records)
        assert result.exit_code == 1
>       assert len(logs) == 1
E       assert 0 == 1
E        +  where 0 = len([])

tests/test_cli.py:83: AssertionError
______________________________________ test_meta__script ______________________________________

tmpdir = local('/tmp/pytest-of-lbalhar/pytest-1/test_meta__script0')

    def test_meta__script(tmpdir: LocalPath) -> None:
        """Raise `typer.BadParameter` when passing a script instead of a notebook."""
        py_path = tmpdir.mkdir("files") / "a_script.py"  # type: ignore
        py_path.write_text("# some python code", encoding="utf-8")
    
        result = runner.invoke(app, ["meta", str(py_path)])
>       assert result.exit_code == 2
E       assert 1 == 2
E        +  where 1 = <Result InvalidGitRepositoryError()>.exit_code

tests/test_cli.py:139: AssertionError
____________________________________ test_meta__no_confirm ____________________________________

tmpdir = local('/tmp/pytest-of-lbalhar/pytest-1/test_meta__no_confirm0')

    def test_meta__no_confirm(tmpdir: LocalPath) -> None:
        """Don't make any changes without confirmation to overwrite files (prompt)."""
        nb_path = tmpdir.mkdir("notebooks") / "test_meta_nb.ipynb"  # type: ignore
        TestJupyterNotebook().jupyter_notebook.write(nb_path)
    
        result = runner.invoke(app, ["meta", str(nb_path)])
    
        assert result.exit_code == 1
        assert JupyterNotebook.parse_file(nb_path) == TestJupyterNotebook().jupyter_notebook
>       assert result.output == (
            "1 files will be overwritten (no prefix nor suffix was passed)."
            " Continue? [y/n]: \nAborted!\n"
        )
E       AssertionError: assert '' == '1 files will... \nAborted!\n'
E         - 1 files will be overwritten (no prefix nor suffix was passed). Continue? [y/n]: 
E         - Aborted!

tests/test_cli.py:155: AssertionError
_____________________________________ test_meta__confirm ______________________________________

tmpdir = local('/tmp/pytest-of-lbalhar/pytest-1/test_meta__confirm0')

    def test_meta__confirm(tmpdir: LocalPath) -> None:
        """Make changes when confirming overwrite via the prompt."""
        nb_path = tmpdir.mkdir("notebooks") / "test_meta_nb.ipynb"  # type: ignore
        TestJupyterNotebook().jupyter_notebook.write(nb_path)
    
        result = runner.invoke(app, ["meta", str(nb_path)], input="y")
    
>       assert result.exit_code == 0
E       assert 1 == 0
E        +  where 1 = <Result InvalidGitRepositoryError()>.exit_code

tests/test_cli.py:168: AssertionError
_________________________________________ test_assert _________________________________________

caplog = <_pytest.logging.LogCaptureFixture object at 0x7fca4a66dff0>

    def test_assert(caplog: LogCaptureFixture) -> None:
        """Assert that notebook has sequential and increasing cell execution."""
        caplog.set_level(logging.INFO)
    
        exprs = (
            "[c.execution_count for c in exec_cells] == list(range(1, len(exec_cells) + 1))"
        )
        recipe = "seq-increase"
        with resources.path("tests.files", "demo.ipynb") as nb_path:
            result = runner.invoke(
                app, ["assert", str(nb_path), "--expr", exprs, "--recipe", recipe]
            )
    
        logs = list(caplog.records)
>       assert result.exit_code == 0
E       assert 1 == 0
E        +  where 1 = <Result InvalidGitRepositoryError()>.exit_code

tests/test_cli.py:203: AssertionError
__________________________________________ test_fix ___________________________________________

tmpdir = local('/tmp/pytest-of-lbalhar/pytest-1/test_fix0')

    def test_fix(tmpdir: LocalPath) -> None:
        """Fix notebook conflicts."""
        # Setup
        nb_path = Path("test_conflicts_nb.ipynb")
        notebook_1 = TestJupyterNotebook().jupyter_notebook
        notebook_2 = TestJupyterNotebook().jupyter_notebook
    
        notebook_1.metadata = NotebookMetadata(
            kernelspec=dict(
                display_name="different_kernel_display_name", name="kernel_name"
            ),
            field_to_remove=["Field to remove"],
            another_field_to_remove="another field",
        )
    
        extra_cell = BaseCell(
            cell_type="raw",
            metadata=CellMetadata(random_meta=["meta"]),
            source="extra",
        )
        notebook_2.cells = notebook_2.cells + [extra_cell]
        notebook_2.nbformat += 1
        notebook_2.nbformat_minor += 1
    
        git_repo = init_repo_conflicts(
            tmpdir=tmpdir,
            filename=nb_path,
            contents_main=notebook_1.json(),
            contents_other=notebook_2.json(),
            commit_message_main="Notebook from main",
            commit_message_other="Notebook from other",
        )
    
        conflict_files = get_conflict_blobs(repo=git_repo)
        id_main = conflict_files[0].first_log
        id_other = conflict_files[0].last_log
    
        # Run CLI and check conflict resolution
        result = runner.invoke(app, ["fix", str(tmpdir)])
>       fixed_notebook = JupyterNotebook.parse_file(path=tmpdir / nb_path)

tests/test_cli.py:268: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
databooks/data_models/notebook.py:250: in parse_file
    return super(JupyterNotebook, cls).parse_file(
pydantic/main.py:561: in pydantic.main.BaseModel.parse_file
    ???
pydantic/parse.py:64: in pydantic.parse.load_file
    ???
pydantic/parse.py:37: in pydantic.parse.load_str_bytes
    ???
/usr/lib64/python3.10/json/__init__.py:346: in loads
    return _default_decoder.decode(s)
/usr/lib64/python3.10/json/decoder.py:337: in decode
    obj, end = self.raw_decode(s, idx=_w(s, 0).end())
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _

self = <json.decoder.JSONDecoder object at 0x7fca4be6b1f0>
s = '<<<<<<< HEAD\n{"nbformat": 4, "nbformat_minor": 4, "metadata": {"field_to_remove": ["Field to remove"], "another_fiel...xecution_count": 1}, {"metadata": {"random_meta": ["meta"]}, "source": "extra", "cell_type": "raw"}]}\n>>>>>>> other\n'
idx = 0

    def raw_decode(self, s, idx=0):
        """Decode a JSON document from ``s`` (a ``str`` beginning with
        a JSON document) and return a 2-tuple of the Python
        representation and the index in ``s`` where the document ended.
    
        This can be used to decode a JSON document from a string that may
        have extraneous data at the end.
    
        """
        try:
            obj, end = self.scan_once(s, idx)
        except StopIteration as err:
>           raise JSONDecodeError("Expecting value", s, err.value) from None
E           json.decoder.JSONDecodeError: Expecting value: line 1 column 1 (char 0)

/usr/lib64/python3.10/json/decoder.py:355: JSONDecodeError
__________________________________________ test_show __________________________________________

    def test_show() -> None:
        """Show notebook in terminal."""
        with resources.path("tests.files", "tui-demo.ipynb") as nb_path:
            result = runner.invoke(app, ["show", str(nb_path)])
>       assert result.exit_code == 0
E       assert 1 == 0
E        +  where 1 = <Result InvalidGitRepositoryError()>.exit_code

tests/test_cli.py:382: AssertionError
____________________________________ test_show_no_multiple ____________________________________

    def test_show_no_multiple() -> None:
        """Don't show multiple notebooks if not confirmed in prompt."""
        with resources.path("tests.files", "tui-demo.ipynb") as nb:
            dirpath = str(nb.parent)
    
        # Exit code is 0 if user responds to prompt with `n`
        result = runner.invoke(app, ["show", dirpath], input="n")
>       assert result.exit_code == 0
E       assert 1 == 0
E        +  where 1 = <Result InvalidGitRepositoryError()>.exit_code

tests/test_cli.py:457: AssertionError
________________________________________ test_get_repo ________________________________________

    def test_get_repo() -> None:
        """Find git repository."""
        curr_dir = Path(__file__).parent
>       repo = get_repo(curr_dir)

tests/test_git_utils.py:54: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
databooks/git_utils.py:38: in get_repo
    repo = Repo(path=repo_dir)
../../.virtualenvs/databooks/lib/python3.10/site-packages/git/repo/base.py:266: in __init__
    self.working_dir: Optional[PathLike] = self._working_tree_dir or self.common_dir
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _

self = <git.repo.base.Repo ''>

    @property
    def common_dir(self) -> PathLike:
        """
        :return: The git dir that holds everything except possibly HEAD,
            FETCH_HEAD, ORIG_HEAD, COMMIT_EDITMSG, index, and logs/."""
        if self._common_dir:
            return self._common_dir
        elif self.git_dir:
            return self.git_dir
        else:
            # or could return ""
>           raise InvalidGitRepositoryError()
E           git.exc.InvalidGitRepositoryError

../../.virtualenvs/databooks/lib/python3.10/site-packages/git/repo/base.py:347: InvalidGitRepositoryError
=================================== short test summary info ===================================
FAILED tests/test_cli.py::test_meta - assert 1 == 0
FAILED tests/test_cli.py::test_meta__check - assert 0 == 1
FAILED tests/test_cli.py::test_meta__script - assert 1 == 2
FAILED tests/test_cli.py::test_meta__no_confirm - AssertionError: assert '' == '1 files will...
FAILED tests/test_cli.py::test_meta__confirm - assert 1 == 0
FAILED tests/test_cli.py::test_assert - assert 1 == 0
FAILED tests/test_cli.py::test_fix - json.decoder.JSONDecodeError: Expecting value: line 1 c...
FAILED tests/test_cli.py::test_show - assert 1 == 0
FAILED tests/test_cli.py::test_show_no_multiple - assert 1 == 0
FAILED tests/test_git_utils.py::test_get_repo - git.exc.InvalidGitRepositoryError
=============================== 10 failed, 108 passed in 0.60s ================================

The text was updated successfully, but these errors were encountered:

frenzymadness · 2022-11-03T17:42:45Z

By the way, I'm on PyConPL till Sunday so feel free to ping me and we can meet and discuss it.

murilo-cunha · 2022-11-04T07:21:13Z

Yes let's do that!!

Not sure why this is happening at a first glance tbh. Let's meet up and we can look into it 🤝

frenzymadness · 2022-11-05T15:20:30Z

I think we should discuss this. For operations like meta, there is no need to look for git repository. If I run databooks meta something.ipynb without a git repository, I get:

Traceback (most recent call last):
  File "<frozen runpy>", line 198, in _run_module_as_main
  File "<frozen runpy>", line 88, in _run_code
  File "/home/lbalhar/Software/databooks/databooks/__main__.py", line 4, in <module>
    app(prog_name="databooks")
  File "/home/lbalhar/.virtualenvs/databooks/lib/python3.11/site-packages/typer/main.py", line 214, in __call__
    return get_command(self)(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/lbalhar/.virtualenvs/databooks/lib/python3.11/site-packages/click/core.py", line 1130, in __call__
    return self.main(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/lbalhar/.virtualenvs/databooks/lib/python3.11/site-packages/click/core.py", line 1055, in main
    rv = self.invoke(ctx)
         ^^^^^^^^^^^^^^^^
  File "/home/lbalhar/.virtualenvs/databooks/lib/python3.11/site-packages/click/core.py", line 1655, in invoke
    sub_ctx = cmd.make_context(cmd_name, args, parent=ctx)
              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/lbalhar/.virtualenvs/databooks/lib/python3.11/site-packages/click/core.py", line 920, in make_context
    self.parse_args(ctx, args)
  File "/home/lbalhar/.virtualenvs/databooks/lib/python3.11/site-packages/click/core.py", line 1378, in parse_args
    value, args = param.handle_parse_result(ctx, opts, args)
                  ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/lbalhar/.virtualenvs/databooks/lib/python3.11/site-packages/click/core.py", line 2360, in handle_parse_result
    value = self.process_value(ctx, value)
            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/lbalhar/.virtualenvs/databooks/lib/python3.11/site-packages/click/core.py", line 2322, in process_value
    value = self.callback(ctx, self, value)
            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/lbalhar/.virtualenvs/databooks/lib/python3.11/site-packages/typer/main.py", line 833, in wrapper
    return callback(**use_params)  # type: ignore
           ^^^^^^^^^^^^^^^^^^^^^^
  File "/home/lbalhar/Software/databooks/databooks/cli.py", line 52, in _config_callback
    get_config(
  File "/home/lbalhar/Software/databooks/databooks/config.py", line 19, in get_config
    repo_dir = get_repo().working_dir
               ^^^^^^^^^^
  File "/home/lbalhar/Software/databooks/databooks/git_utils.py", line 38, in get_repo
    repo = Repo(path=repo_dir)
           ^^^^^^^^^^^^^^^^^^^
  File "/home/lbalhar/.virtualenvs/databooks/lib/python3.11/site-packages/git/repo/base.py", line 266, in __init__
    self.working_dir: Optional[PathLike] = self._working_tree_dir or self.common_dir
                                                                     ^^^^^^^^^^^^^^^
  File "/home/lbalhar/.virtualenvs/databooks/lib/python3.11/site-packages/git/repo/base.py", line 347, in common_dir
    raise InvalidGitRepositoryError()
git.exc.InvalidGitRepositoryError

And in situations where you need a git repository for example for getting a list of files with conflicts, the app should look for git repository to which the file belongs to instead of the one in CWD. Let's say I have a git repo in /foo/bar/ and a file with conflict in /foo/bar/notebooks/file.ipynb and I run the tool from /tmp - the tool should find the repo in /foo/bar.

But the main question is, why do you need a repository at all? I think it can be necessary only for the new diff command but meta, assert, fix and show commands don't need it at all.

murilo-cunha · 2022-11-05T16:24:55Z

That's a good point. We always look for git repos by default to look for config files (look until the repo root and if there's no pyproject.toml until then use default values) - but I agree we shouldn't see errors if we cannot find them.

why do you need a repository at all? I think it can be necessary only for the new diff command but meta, assert, fix and show commands don't need it at all.

That would be the case for meta and assert, but for fix and diff we do need the git repos - fix we actually look at the git-ls-files to get the blobs of the original files and compare them (so we're not reeeeeally fixing anything but rather going back to previous versions and comparing them). And for that we need the git repo info.

the app should look for git repository to which the file belongs to instead of the one in CWD

Completely agree. There is an implementation for finding the common parent (in case you pass different paths) and looking for the repo root there but I don't think we're using it everywhere (which we definitely should).

So summary of changes:

Return default values in the CLI in case we don't find the repo (for meta and assert)
All git repos should be found based on the file paths, not the current working dir

murilo-cunha · 2022-11-08T12:31:50Z

Done!! And thanks for the suggestion! Maybe worth double checking with you that the fixes are working?

@frenzymadness

murilo-cunha mentioned this issue Nov 7, 2022

Repo found from paths #54

Merged

murilo-cunha closed this as completed in #54 Nov 8, 2022

frenzymadness mentioned this issue Nov 8, 2022

Fix test_get_repo #56

Merged

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

tests don't work when run without git repository #48

tests don't work when run without git repository #48

frenzymadness commented Nov 3, 2022

frenzymadness commented Nov 3, 2022

murilo-cunha commented Nov 4, 2022

frenzymadness commented Nov 5, 2022

murilo-cunha commented Nov 5, 2022

murilo-cunha commented Nov 8, 2022

tests don't work when run without git repository #48

tests don't work when run without git repository #48

Comments

frenzymadness commented Nov 3, 2022

frenzymadness commented Nov 3, 2022

murilo-cunha commented Nov 4, 2022

frenzymadness commented Nov 5, 2022

murilo-cunha commented Nov 5, 2022

murilo-cunha commented Nov 8, 2022