Skip to content

Conversation

@pasabanov
Copy link
Member

@pasabanov pasabanov commented Jan 26, 2025

Types of changes

  • Bug fix
  • Documentation
  • Testing
  • Refactoring, reformatting, cleanup
  • Dependency version updates
  • Configuration (submodules, CI/CD)

Related: #1 #4 #15 #17 ALFI-lib/test_data#1

Description

1. Dependencies:

  • Added submodule github.com:ALFI-library/test_data to tests/test_data, containing a TOML file with test data for distributions.
  • Added submodule github.com:marzer/tomlplusplus to tests/deps/toml++, containing a header-only library for TOML processing.
  • In helper.py:
    • Added git submodule update --init to dependency install commands.
    • In connection, added git to the list of dependencies installed by apt.
  • Added submodules: true to actions/checkout in .github/workflows/*.yml files.

2. Testing:

  • Added distribution function testing using data from the TOML file:
    • Updated tests/CMakeLists.txt to reflect changes.
    • Included toml++/toml.hpp in tests/test_utils.h.
    • Added helper function to_vector to tests/test_utils.h, converting toml::array to std::vector.
    • Rewrote tests/dist/test_dist.cpp to read TOML data and use it for testing all distribution functions.
    • Removed legacy testing code.

3. New files:

  • Added ALFI/ALFI/util/points.h containing:
    • Functions lin_map and lin_mapped for mapping points between intervals.
    • Functions stretch (moved from ALFI/ALFI/dist.h) and stretched.

4. Fixes:

  • In expect_eq function in tests/test_utils.h: replaced EXPECT_NEAR with EXPECT_TRUE and alfi::util::numeric::are_equal.
  • Revised functions in alfi::dist namespace (sigmoid, sigmoid_stretched, erf, erf_stretched) to improve logical consistency and return values closer to interval boundaries.
  • Fixed the stretch function:
    • Previously subtracted (a+b)/2 instead of (min+max)/2.
    • Now implemented via lin_map.
    • NOTE: Test results show this significantly reduces precision for all _stretched functions in alfi::dist.

5. Documentation:

  • Added descriptions for alfi::dist::sigmoid and alfi::dist::erf.
  • Used Qt-inspired Doxygen style: without extra asterisks (*), tab-indented descriptions - for readability and space efficiency.

Screenshots, demonstrating the differences between the unfixed and fixed versions of the sigmoid, sigmoid_stretched, erf and erf_stretched functions:

before fix after fix

Motivation

Why TOML-based testing? This approach was chosen to:

  1. Future-proof cross-language compatibility
    Isolating test data as a separate entity allows seamless reuse across potential future ports (Python, JavaScript, GNU Octave, etc.), avoiding duplication.
  2. Enable automated generation and validation
    Test datasets can be programmatically generated and verified.
  3. Improve maintainability
    Declarative format provides human-readable test cases.
    TOML was chosen over YAML/JSON for its simple and whitespace-agnostic syntax.

How to test

Tests now require submodules to run. To download the submodules one may use the following approaches:

  1. Clone the repository with the --recurse-submodules option:
    git clone --recurse-submodules https://github.com/ALFI-library/ALFI
    # or
    git clone --recurse-submodules git@github.com:ALFI-library/ALFI
  2. Update the submodules after clone:
    git submodule update --init
  3. Use the helper.py utility to update all the dependencies, including apt dependencies (requires sudo!) and the submodules:
    helper.py -d
    # or
    helper.py --deps

Future improvements

  1. Extend data-driven testing to all other parts of the library.
  2. Investigate the precision reduction for _stretched functions in alfi::dist after the stretch function fix.
  3. Find the best approach to use in the lin_map function.

@pasabanov pasabanov added config Configuring the project docs Improvements or additions to documentation fix Fixing some bugs refactor Improving code structure without adding new functionality tests Adding or changing tests labels Jan 26, 2025
@pasabanov pasabanov added this to the First Release Version milestone Jan 26, 2025
@pasabanov pasabanov self-assigned this Jan 26, 2025
1. Dependencies:
  - Added submodule `github.com:ALFI-library/test_data` to `tests/test_data`,
    containing the `dist/dist.toml` file with test data for distributions.
  - Added submodule `github.com:marzer/tomlplusplus` to `tests/deps/toml++`,
    containing a header-only library for TOML processing.
  - In `helper.py`:
    - Added `git submodule update --init` to dependency install commands.
    - In connection, added `git` to the list of dependencies installed by `apt`.
  - Added `submodules: true` to `actions/checkout` in `.github/workflows/*.yml` files.

2. Testing:
  - Added distribution function testing using data from the TOML file:
    - Updated `tests/CMakeLists.txt` to reflect changes.
    - Included `toml++/toml.hpp` in `tests/test_utils.h`.
    - Added helper function `to_vector` to `tests/test_utils.h`,
      converting `toml::array` to `std::vector`.
    - Rewrote `tests/dist/test_dist.cpp` to read TOML data
      and use it for testing all distribution functions.
    - Removed legacy testing code.

3. New files:
  - Added `ALFI/ALFI/util/points.h` containing:
    - Functions `lin_map` and `lin_mapped` for mapping points between intervals.
    - Functions `stretch` (moved from `ALFI/ALFI/dist.h`) and `stretched`.

4. Fixes:
  - In `expect_eq` function in `tests/test_utils.h`:
    replaced `EXPECT_NEAR` with `EXPECT_TRUE` and `alfi::util::numeric::are_equal`.
  - Revised functions in `alfi::dist` namespace
    (`sigmoid`, `sigmoid_stretched`, `erf`, `erf_stretched`)
    to improve logical consistency and return values closer to interval boundaries.
  - Fixed the `stretch` function:
    - Previously subtracted `(a+b)/2` instead of `(min+max)/2`.
    - Now implemented via `lin_map`.
    - NOTE: Test results show this **significantly** reduces precision
      for all `_stretched` functions in `alfi::dist`.

5. Documentation:
  - Added descriptions for `alfi::dist::sigmoid` and `alfi::dist::erf`.
@pasabanov pasabanov changed the title Implemented TOML-based testing for distributions, related fixes and documentation Implemented data-driven testing for dist, related fixes and docs Jan 26, 2025
@pasabanov pasabanov linked an issue Jan 26, 2025 that may be closed by this pull request
const auto mid2 = (c + d) / 2;
const auto scale = (d - c) / (b - a);
for (auto& point : points) {
point = mid2 + scale * (point - mid1);
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's unclear whether the mapping should be based on the beginning or the middle of the segment, or if some hybrid approach should be used.

  • Which way would be more accurate?
  • How does this affect the fact that the points in the library are symmetrical relative to the midpoints of the segments?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

config Configuring the project docs Improvements or additions to documentation fix Fixing some bugs refactor Improving code structure without adding new functionality tests Adding or changing tests

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants