Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

re-implement external objects #86

Open
15r10nk opened this issue May 6, 2024 · 18 comments
Open

re-implement external objects #86

15r10nk opened this issue May 6, 2024 · 18 comments

Comments

@15r10nk
Copy link
Owner

15r10nk commented May 6, 2024

current implementation

The current implementation of external objects has some drawbacks.

  • the name of the external objects changes every time because it is the hash of the data.
  • they are all stored in the same directory.

There where reasons why it was implemented the way they are currently, but some changes in the implementation of inline-snapshot could allow a different implementation now.

new implementation

the interface will not change very much outsource(data) will still create external(name) objects, but the name of the external objects will now be a uuid.

def test_something():
    assert outsource("text"*50) == snapshot(external("92f570b0-e411-46d3-bece-108aeaac6b1f.txt"))

The uuid/name of the file will not change. inline-snapshot will only change the content of the file when you run --inline-snapshot=fix

These external files will be stored beside you tests. The snapshot will be stored in __inline_snapshot__/test_things/test_something/92f570b0-e411-46d3-bece-108aeaac6b1f.txt if the example above is from test_things.py.

inline-snapshot will report an error if it detects that the same external object is referenced twice in the same file. It might technical not necessary, but it would make the behavior and implementation more complicated if it would be allowed.

I don't want to provide the option to name the external objects (at least for now) like outsource("text"*100, name="long_test") == snapshot(external("long_test.txt")). The folder structure should provide enough information to find your data.
an explicit name would have the following problems:

  • the user would have to provide a unique name
  • what should happen when the user changes the name? The idea of the current implementation is that inline-snapshot sticks to the uuid. Should it be different if the user chooses the name?

This is currently only an idea and feedback is very welcome.

@adriangb
Copy link

adriangb commented May 10, 2024

Id like to snapshot parquet/arrow data. You can get a meaningful human readable diff via polars/pandas assert_frame_equal and such. So I think it would be nice to get a load/dump function and/or a diff function hook so I can load/dump as parquet but compare using assert_frame_equal.

For binary images we’ve also implemented a mask where a heatmap of the the diff is saved to a failure file ({hash}.diff.{suffix}?) which is another functionality for the compare callback.

@Galsor
Copy link

Galsor commented May 13, 2024

I agree your current suggestion would be a nice improvement of the current implementation.

What do you think of adding encode and decode optional arguments to outsource allowing custom file and data types to be handled ?

from pydantic import BaseModel

class CustomModel(BaseModel):
    name: str
    age: int

def test_model():
    model = CustomModel(name="foo", age=42)
    assert outsource(model, encode=pickle.dumps, decode=pickle.loads, suffix=".pkl") == snapshot(external("76a970z09f413-56d9-erfd-095aerdv60a1f.pkl"))

I believe it could provide more explicit error traces for non-textual data.

Additionally, would it be relevant to add a .value property to external class to allow ad-hoc testing functions such as:

assert pd. assert_frame_equal(outsource(df, encode= DataFrame.to_csv, decode=pd.read_csv, suffix=".csv").value, snapshot(external("76a970z09f413-56d9-erfd-095aerdv60a1f.csv").value))

Thank you for your work. It tackles a pain point I had to implement several times in data science pipelines with bearly satisfacting solutions...

@15r10nk
Copy link
Owner Author

15r10nk commented May 14, 2024

Thank you @Galsor and @adriangb for your feedback. Support for DataFrames seems to be a very much wanted feature (see also #67), but there are some problems with DataFrames.
It is also the first time for me working with pandas please correct me if I miss something in the following.

>>> DataFrame()==DataFrame()
Empty DataFrame
Columns: []
Index: []

a DataFrame comparison returns a DataFrame which leads to problems in the current logic of inline-snapshot (it expects it to be a bool). The other problem is that repr(DataFrame()) returns something which is no python code, but this should be fixable with #85

@lucianosrp proposed the following in #67:

def test_get_data():
    df = get_data()
    assert (df == snapshot()).all().all()

This might work, but does not look very intuitive in for me, but I have not much experience with pandas. Maybe it is what someone who uses pandas wants to write. Please let me know.

Another problem with this syntax is the following.
inline-snapshot relies on pytest_assertrepr_compare to represent the diff.
I specialized it for DataFrames and got the following result:

# conftest.py
def pytest_assertrepr_compare(config, op, left, right):
    if isinstance(left,DataFrame) and isinstance(right,DataFrame):
        try:
            assert_frame_equal(left,right)
        except AssertionError as e:
            return ["diff:"]+str(e).split("\n")
d1=DataFrame({'col1': [True, True], 'col2': [True, False]})
d2=DataFrame({'col1': [True, True], 'col2': [True, True]})

def test_foo():
    assert (d1==d2).all().all()
    def test_foo():
>       assert (d1==d2).all().all()
E       assert False
E        +  where False = <bound method Series.all of col1     True\ncol2    False\ndtype: bool>()
E        +    where <bound method Series.all of col1     True\ncol2    False\ndtype: bool> = col1     True\ncol2    False\ndtype: bool.all
E        +      where col1     True\ncol2    False\ndtype: bool = <bound method DataFrame.all of    col1   col2\n0  True   True\n1  True  False>()
E        +        where <bound method DataFrame.all of    col1   col2\n0  True   True\n1  True  False> = diff:
E                 DataFrame.iloc[:, 1] (column name="col2") are different
E                 
E                 DataFrame.iloc[:, 1] (column name="col2") values are different (50.0 %)
E                 [index]: [0, 1]
E                 [left]:  [True, False]
E                 [right]: [True, True].all

I don't know if this would be ok for you. I don't like the first E + where lines very much. The .all at the end looks also strange.

I think there might be two ways how DataFrames could be integrated:

  1. provide more hooks to customize inline-snapshot for special types like DataFrames
  2. write a DataFrameWrapper(DataFrame()) with eq and repr methods which work with the existing inline-snapshot logic.
    def test_foo():
        assert outsource(DataFrameWrapper(df))==snapshot(external("76a970z09f413-56d9-erfd-095aerdv60a1f.pkl"))
        # or for small DataFrames
        assert DataFrameWrapper(df)==snapshot(DataFrameWrapper({"col_1":[1,2],"col_2":[3,4]}))

encode/decode load/dump

Yes, Customization for specific types is also important. My current idea looks something like this

@register_external_handler(DataFrame)
class DataFrameHandler:
    suffix=".parquet"

    def load(filename: Path) -> DataFrame:
        ...

    def save(filename: Path, value: DataFrame):
        ...

This handler would be used every time you outsource() a DataFrame.
@Galsor I don't think that it is a good idea to use encode/decode as arguments of outsource(), because you would have to specify them every time you want to outsource your CustomType in your example, but please let me know if you want to do it this way for some reason.

some questions

For binary images we’ve also implemented a mask where a heatmap of the the diff is saved to a failure file ({hash}.diff.{suffix}?) which is another functionality for the compare callback.

@adriangb are you still talking about DataFrames? and which compare callback do you mean?

Additionally, would it be relevant to add a .value property to external class to allow ad-hoc testing functions such as:

@Galsor I don't know how this syntax could work. There are currently ways how you can use snapshots in custom functions (see #81 (comment)), but you have no way how you can access the value of the snapshot. I know that you want to access the .value of the external but this has other problems because snapshot() would then try to save repr(value), which is your DataFrame(...) and not external(...).

@adriangb
Copy link

adriangb commented May 14, 2024

@adriangb are you still talking about DataFrames? and which compare callback do you mean?

Yes, I'm still talking about DataFrames. As you point out they are not trivially comparable. Hence why libraries like pandas ship testing helpers: https://pandas.pydata.org/docs/reference/api/pandas.testing.assert_frame_equal.html. I would propose that you add a compare callback to snapshot or external (not sure which) so that one can do something like outsource(df, load=polars.read_parquet, dump=polars.DataFrame.write_parquet) == snapshot(compare=polars.testing.assert_frame_equal). Now the data gets saved and loaded from parquet and compared via polars.testing.assert_frame_equal instead of ==. No repr needed. A global per-type option would also make sense to reduce boilerplate (your register_external_handler approach above) and it could handle saving, loading and selecting a comparison function.

@15r10nk
Copy link
Owner Author

15r10nk commented May 14, 2024

I think the compare logic can be viewed separate from load/store.

The following example works with the current version of inline-snapshot and uses assert_frame_equal to show a diff.

Here is a small wrapper for the DataFrame:

from pandas import DataFrame

class Wrapper:
    def __init__(self,*a,**ka):
        if len(a)==1 and len(ka)==0 and isinstance(a[0],DataFrame):
            self.df=a[0]
        else:
            self.df=DataFrame(*a,**ka)

    def __repr__(self):
        return f"Wrapper({self.df.to_dict()!r})"

    def __eq__(self,other):
        if not isinstance(other,Wrapper):
            return NotImplemented
        return (self.df == other.df).all().all()

I extend pytest in conftest.py:

from df_wrapper import Wrapper
from pandas.testing import assert_frame_equal

def pytest_assertrepr_compare(config, op, left, right):
    if isinstance(left,Wrapper) and isinstance(right,Wrapper):
        try:
            assert_frame_equal(left.df,right.df)
        except AssertionError as e:
            return str(e).split("\n")

the test:

from pandas import DataFrame

from inline_snapshot import snapshot
from df_wrapper import Wrapper


d1 = DataFrame({"col1": [1, 2], "col2": [3, 4]})


def test_foo():
    assert Wrapper(d1) == snapshot(
        Wrapper({"col1": {0: 1, 1: 2}, "col2": {0: 3, 1: 5}}) 
        # I changed the last 4 to 5 in col2 to make the test fail
    )

and this is how it looks when the test fails.

    def test_foo():
>       assert Wrapper(d1) == snapshot(
            Wrapper({"col1": {0: 1, 1: 2}, "col2": {0: 3, 1: 5}})
        )
E       assert DataFrame.iloc[:, 1] (column name="col2") are different
E         
E         DataFrame.iloc[:, 1] (column name="col2") values are different (50.0 %)
E         [index]: [0, 1]
E         [left]:  [3, 4]
E         [right]: [3, 5]

This does not uses outsource, but the error message would look the same with outsource.

Do you want something like this?

@adriangb
Copy link

adriangb commented May 14, 2024

Yes I agree it can be handled separately. And yes I do want something like that! I think it can even be done without Wrapper right?

from pandas import DataFrame
from pandas.testing import assert_frame_equal

def pytest_assertrepr_compare(config, op, left, right):
    if isinstance(left,DataFrame) and isinstance(right,DataFrame) and op == "==":
        try:
            assert_frame_equal(left, right)
        except AssertionError as e:
            return str(e).split("\n")

Thank you for teaching me about pytest_assertrepr_compare. But of course this may break if I start doing assert outsource(df) == snapshot() so maybe it does have to be integrated into the inline-snapshot API.

@15r10nk
Copy link
Owner Author

15r10nk commented May 15, 2024

I think it can even be done without Wrapper right?

Yes, but you get an error message which looks like this (which is from my previous comment):

    def test_foo():
>       assert (d1==d2).all().all()
E       assert False
E        +  where False = <bound method Series.all of col1     True\ncol2    False\ndtype: bool>()
E        +    where <bound method Series.all of col1     True\ncol2    False\ndtype: bool> = col1     True\ncol2    False\ndtype: bool.all
E        +      where col1     True\ncol2    False\ndtype: bool = <bound method DataFrame.all of    col1   col2\n0  True   True\n1  True  False>()
E        +        where <bound method DataFrame.all of    col1   col2\n0  True   True\n1  True  False> = diff:
E                 DataFrame.iloc[:, 1] (column name="col2") are different
E                 
E                 DataFrame.iloc[:, 1] (column name="col2") values are different (50.0 %)
E                 [index]: [0, 1]
E                 [left]:  [True, False]
E                 [right]: [True, True].all

You need the .all().all() because equality is not defined on DataFrames. The problem is that the following code does not work.

>>> bool(DataFrame()==DataFrame())
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/frank/.pyenv/versions/dirty_equal_3.12/lib/python3.12/site-packages/pandas/core/generic.py", line 1577, in __nonzero__
    raise ValueError(
ValueError: The truth value of a DataFrame is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().

This means also that you can not use a Dataframe in a complex list/dict structure.

>>> [DataFrame({"col0":[1]})]==[DataFrame({"col0":[2]})]
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/frank/.pyenv/versions/dirty_equal_3.12/lib/python3.12/site-packages/pandas/core/generic.py", line 1577, in __nonzero__
    raise ValueError(
ValueError: The truth value of a DataFrame is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().

Which would mean that you are not able to snapshot such structures.

Using a Wrapper which provides an __eq__ implementation solves all this issues and saves me from providing custom hooks to make inline-snapshot work with datatypes where __eq__ returns something which can not be converted to a bool.

Note

I have found several other ways to compare DataFrames https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.equals.html
I don't know what would be the preferred way to compare DataFrames or if there needs to be a choice.

@Galsor
Copy link

Galsor commented May 15, 2024

@lucianosrp proposed the following in #67:

def test_get_data():
   df = get_data()
   assert (df == snapshot()).all().all()

This might work, but does not look very intuitive in for me, but I have not much experience with pandas. Maybe it is what someone who uses pandas wants to write. Please let me know.

Doesn't look intuitive to me either.

Is there any constraint related to the snapshot implementation that requires sticking to the assert x == y syntax?
My pandas unit tests often look like:

def test_df(mock_input):
   #Given
   expected_df = load_test_df_expected()
   #When
   df = custom_process(mock_input)
   #Then
   assert_frame_equal(df, expected_df)

And I would love to be able to write:

def test_df(mock_input):
   #When
   df = custom_process(mock_input)
   #Then
   assert_frame_equal(outsource(df), snapshot())

But this involves outsource(df) and snapshot to return the value of the underlying DataFrame at test time.

Wrapper solution

This implementation faces issues with the nasty dtypes handled by pandas. Could be tricky to handle all cases especially with all DataFrame flavors blossoming these days (Polars, Ibis, Modin, Spark etc.).
Here is quick test I explored:

import pandas as pd
from datetime import timedelta

data = {
    "category": pd.Categorical(["A", "B", "C"]),
    "string": ["foo", "bar", "baz"],
    "timedelta": [timedelta(days=1), timedelta(hours=2), timedelta(minutes=30)],
}

df = pd.DataFrame(data)

After --inline-snapshot=create it results in:

def test_df():
    assert Wrapper(df) == snapshot(
        Wrapper(
            {
                "category": {0: "A", 1: "B", 2: "C"},
                "string": {0: "foo", 1: "bar", 2: "baz"},
                "timedelta": {
                    0: Timedelta("1 days 00:00:00"),
                    1: Timedelta("0 days 02:00:00"),
                    2: Timedelta("0 days 00:30:00"),
                },
            }
        )
    )

Where Timedelta returned by to_dict() is a Pandas type:

type(df.to_dict()["timedelta"][0])
Out[8]: pandas._libs.tslibs.timedeltas.Timedelta

It generates NameErrors in the generated code due to the lacking imports.

@lucianosrp
Copy link

Doesn't look intuitive to me either.

I agree, currently using:

 df.to_dict("records") == snapshot()

Which is more intuitive and produces a nice and readable dict. Suits perfectly for my case.

@15r10nk
Copy link
Owner Author

15r10nk commented May 15, 2024

Is there any constraint related to the snapshot implementation that requires sticking to the assert x == y syntax?

One goal I have with inline-snapshot is to provide the ability to write custom functions which make use of the inline-snapshot logic.
An assert_frame_equal which works with snapshots could look like this:

from df_wrapper import Wrapper
from pandas.testing import assert_frame_equal
from pandas import DataFrame

from inline_snapshot import outsource,snapshot

def assert_frame_equal(df,df_snapshot):
    # just to show that you can some dynamic things with inline-snapshot
    if df.size > 5:
        # this case does not work currently, because outsource only works with str/bytes
        assert outsource(Wrapper(df)) == df_snapshot 
    else:
        assert Wrapper(df) == df_snapshot
    

def custom_process():
    return DataFrame({"col0":[1,2]})


def test_df():
   #When
   df = custom_process()
   #Then
   assert_frame_equal(df, snapshot(Wrapper({"col0": {0: 1, 1: 2}})))

This example is still missing some parameters, but I think that it should be possible to write an implementation which provides the same signature.

Changing your imports would then be enough if you want to use snapshots with assert_frame_equal:

#from pandas.testing import assert_frame_equal
from inline_snapshot_pandas import assert_frame_equal

This might be the easiest way to adopt inline-snapshot for existing pandas tests.

It generates NameErrors in the generated code due to the lacking imports.

I plan to generate import-statements in a future release, but this is not my priority now.
Cases like this could also be tricky because the type is part of some private module (_libs). It might also be re-exported in some __init__.py or somewhere else ... and this is the part where it gets tricky to automate.
importing the class by hand is currently the easiest solution pandas.Timedelta.

@15r10nk
Copy link
Owner Author

15r10nk commented May 15, 2024

I created a assert_frame_equal implementation which works with snapshot arguments.

You find the code here #87. I'm interested in your feedback and if something like this could work for you. outsourcing of the Dataframes does currently not work but i think that it could be implemented with a outsource=True argument later.

short example:

from inline_snapshot._pandas import assert_frame_equal

from inline_snapshot import snapshot
from pandas import DataFrame

def test_df():
   df = DataFrame({"col0": [1, 2]})

   # the second argument can be a snapshot,
   assert_frame_equal(df, snapshot())

   # and the generated code looks like this.
   assert_frame_equal(df, snapshot(DataFrame({"col0": {0: 1, 1: 2}})))

   # it can also be used without a snapshot
   assert_frame_equal(df, df)

@adriangb
Copy link

That looks really nice! Personally I don't particularly care about inline dataframes, I would only ever extern them. I'd also be interested in a more generic version of this because:

  1. We're using Polars not Pandas.
  2. We also want to compare externe'd images and in particular we want to save a heatmap of differences to a new image and include the path to that in the assertion failure (so that a human can reasonably review the difference).

@15r10nk
Copy link
Owner Author

15r10nk commented May 16, 2024

@adriangb could you provide me a code sample of what you are doing? I'm new to pandas/polars/... and want to understand how you are using these libraries and what are common testing patterns.

Are these real (visual) images?
Are you generating these heat maps with something else like matplotlib or with polars?

@15r10nk
Copy link
Owner Author

15r10nk commented May 16, 2024

I think I understand what you mean. You want to outsource images like png and show a visual diff (color different pixels red for example). Am I right?
I plan to implement this in #48.

I also looked at polars and assert_frame_equal looks the same there. It should be possible to apply the same solution like for pandas.

@adriangb
Copy link

Yes, but I don't need that to be built into inline-snapshot. A callback where I print out the path to the file like I have now is fine. I do think a heatmap for comparing images is a good idea for your web UI.

And yes polars can be implemented similar to pandas, but do you really want to have custom stuff in inline-snapshot for every library out there?

@15r10nk
Copy link
Owner Author

15r10nk commented May 16, 2024

but do you really want to have custom stuff in inline-snapshot for every library out there?

No, I want to put it into a extra package. This merge-request (#87) is just for me to play around and for you to try it out and give feedback.

My implementation for assert_frame_equal is actually so independent of inline-snapshot that I don't need to import it 😃.

@adriangb
Copy link

I guess my question is: will I be able to write my own assert_frame_equal for <insert arbitrary library that produces non reprable data structures?

@15r10nk
Copy link
Owner Author

15r10nk commented May 16, 2024

Yes, extensibility is one of my goals. #85 will allow you to register custom repr implementations and overwrite the default repr() behavior of classes which you have not control about (because the class are part of some other package).

This implementation is just a bit more tricky because DataFrame.eq does not return a bool.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

4 participants