Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Improve testing infrastructure. #871

Merged
merged 1 commit into from Dec 19, 2023
Merged

Improve testing infrastructure. #871

merged 1 commit into from Dec 19, 2023

Conversation

1uc
Copy link
Collaborator

@1uc 1uc commented Nov 27, 2023

Adds/reimplements abstractions for the following:

  • Create multi-dimensional array filled with suitable values.
  • Traits for accessing values.
  • Traits for hiding the difference of DataSet and Attribute.
  • Useful utilities such as ravel, unravel and flat_size.

Copy link

codecov bot commented Nov 27, 2023

Codecov Report

Attention: 16 lines in your changes are missing coverage. Please review.

Comparison is base (30a9d14) 84.97% compared to head (586310f) 85.62%.

Files Patch % Lines
tests/unit/test_all_types.cpp 90.69% 12 Missing ⚠️
tests/unit/data_generator.hpp 97.76% 4 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##           master     #871      +/-   ##
==========================================
+ Coverage   84.97%   85.62%   +0.65%     
==========================================
  Files          71       73       +2     
  Lines        5118     5461     +343     
==========================================
+ Hits         4349     4676     +327     
- Misses        769      785      +16     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@1uc 1uc force-pushed the 1uc/improve-data-generator branch 7 times, most recently from 8e44c75 to 1a87fe7 Compare November 28, 2023 11:10
@1uc 1uc force-pushed the 1uc/improve-data-generator branch 11 times, most recently from 687d71d to 9a97a69 Compare December 1, 2023 08:17
Adds/reimplements abstractions for the following:

  * Create multi-dimensional array filled with suitable values.
  * Traits for accessing values.
  * Traits for hiding the difference of DataSet and Attribute.
  * Useful utilities such as `ravel`, `unravel` and `flat_size`.
@1uc 1uc force-pushed the 1uc/improve-data-generator branch from 9a97a69 to 586310f Compare December 1, 2023 08:24
@1uc 1uc marked this pull request as ready for review December 1, 2023 08:59
@pramodk pramodk requested a review from alkino December 8, 2023 08:26
`DataGenerator`. For example:
```
auto dims = std::vector<size_t>{4, 2};
auto values = testing::DataGenerator<std::vector<std::array<double, 2>>::create(dims);
Copy link
Collaborator

@tdegeus tdegeus Dec 19, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would use a library for this! For example, you can use the NumPy equivalent xtensor. Writing and maintaining some extra custom code is extra work and error-prone. I would say that testable data are as simple as

xt::xarray<int> a = xt::arange<int>(5 * 4 * 3).reshape({5, 4, 3}); 

Then you can use functions like ravel, ravel_index, ... just like in NumPy. No new API, no new code.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd like to test the following:

template<class Container>
void check_write_dataset(const Container& container) {
  file.createDataSet("foo", container);
}

Hence, I need to able to create arrays of all supported types

std::vector<T>
std::vector<std::array<T, 4>>
std::array<std::vector<T>, 4>
boost::
// xt:: (once supported in core)

and so forth, and be able to fill them with exactly the correct values. Being able to have any container filled with sensible values, requires being able to a) allocate containers of a given shape, b) assign values. Those two feature make up a good chunk of the PR. Then there very minor niceties like copying, etc.

The second issue is that I need a somewhat uniform API for creating the product of template<class T> Container with a bunch of different scalar types T. That's the other large chunk of the PR.

How does an xt::array help in this situation? Yes, I can test XTensor, but I can't test writing of any of the other containers. If I use XTensor arrays to write, I can read the other containers, but I still can't check that the values are correct, because I don't have access to some API unifying trait to hide the difference of a[i][j], a(i,j) etc. (gets worse in 3D).

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If there is a pressing reason to use a custom implementation, of course, it's fine. Anyway, this is not a public implementation, nor will change a lot.

I just thought it would be easier to write some casters but avoid implementing (un)ravel(index) functions and equality operators.

```
using trait = testing::ContainerTraits<std::vector<std::array<int, 2>>;
// auto x = values[1][0];
auto x = trait::get(values, {1, 0});
Copy link
Collaborator

@tdegeus tdegeus Dec 19, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Following the command above:

auto x = a(1, 0);

there is also

auto y = a.flat(10)

to read directly from the flat storage.

(comment applicable below)

@tdegeus
Copy link
Collaborator

tdegeus commented Dec 19, 2023

It's a good idea to streamline testing. However, as I stated in some comments, there are existing libraries for this, and I would strongly suggest not to seek a custom implementation. Using a library comes with the advantage that:

  1. It does not imply extra code that has to be maintained.
  2. It comes with a fixed API that is well-documented.
  3. It comes with print functions.
  4. The libraries are usually themselves intensively tested.

@1uc
Copy link
Collaborator Author

1uc commented Dec 19, 2023

In defense of the complexity: it found several bugs/quirks or things to follow up on:

It would trivially fix:

Any non-uniformity is very painful to us (not just our users), and therefore has a higher chance of getting noticed and fixed. The following come to mind:

Copy link
Contributor

@ferdonline ferdonline left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it's a nice idea to unify Dataset and Property testing. You also added tests convering that, so I really don't have anything to nitpick about. If you want to wait for other reviews, great, but from my side good to go

@1uc 1uc merged commit 7638232 into master Dec 19, 2023
36 checks passed
@1uc 1uc deleted the 1uc/improve-data-generator branch December 19, 2023 14:04
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants