Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

labels support #426

Open
14 tasks
9il opened this issue Jul 8, 2022 · 9 comments
Open
14 tasks

labels support #426

9il opened this issue Jul 8, 2022 · 9 comments

Comments

@9il
Copy link
Member

9il commented Jul 8, 2022

Minimal support:

  • opIndex that returns slices, should also attach labels
  • indexed patch for labels
  • opIndex that returns indexed slices, should also attach labels
  • API to access labels
  • API to attach and detach labels
  • API to sort ndslices across labels
  • API to access ndslices sorted across labels such as atLabel!d, lowerBound!d, upperBound!d, transactionIndex!d
  • check that base mutation API like popFront works
  • Common API like opIndex access to elements should work the same way like for common slices without labels
  • Port tests from Series

Advanced support:

  • Adopt mir.ndslice.dynamic
  • Adopt mir.ndslice.topology
  • Adopt mir.algorithm.iteration
  • Adopt the rest of mir.ndslice.*
@jmh530
Copy link
Contributor

jmh530 commented May 16, 2023

Also note that functions that take a slice as input can't accept a slice with labels. I show using isMatrix below, but isConvertibleToSlice also works. So functions that don't have an overload for isConvertibleToSlice should if possible.

/+dub.sdl:
dependency "mir-algorithm" version="*"
+/

import mir.ndslice.allocation: slice;
import mir.ndslice.slice: Slice, SliceKind;
import mir.ndslice.traits: isMatrix;

void foo(SliceKind kind)(Slice!(double*, 2, kind) x) {}
void bar(M)(M x) if(isMatrix!M) {}

void main()
{
    auto dataframe = slice!(double, string)(4, 3);
    foo(dataframe); //does not compile
    bar(dataframe);
}

@jmh530
Copy link
Contributor

jmh530 commented May 18, 2023

toSlice is another one of the functions (among many) that suffers from this problem. The problem is that relying on something like isSlice!S may result in a lot of template bloat. So what you would have to do is have some functions that constraint on isSlice!S and that just forward to some implementation function after stripping the labels.

One related problem is that there is a bug (#23924) related to how DMD handles overloads with template enum parameters combined with variadic templates that makes some of this more difficult than it should be.

Another solution that had been discussed in the past is to have DataFrame as a separate type that takes a slice as an alias this member.

import mir.ndslice.slice;

struct DataFrame(Iterator, size_t N, SliceKind kind, Labels...)
{
    Slice!(Iterator, N, kind) x;
    alias x this;
}

void foo(Iterator, size_t N, SliceKind kind)(Slice!(Iterator, N, kind) x) {}

void main()
{
    DataFrame!(double*, 1, Contiguous, string) x;
    foo(x);
}

@jmh530
Copy link
Contributor

jmh530 commented May 19, 2023

Eh, maybe a better approach would be define something like

enum bool isSliceExact(T) = is(T == Slice!(Iterator, N, kind), Iterator, size_t N, SliceKind kind);

and then use it like

void foo(Iterator, size_t N, SliceKind kind)(Slice!(Iterator, N, kind) x) {}
void foo(T)(T x) if(isConvertibleToSlice!T && !isSliceExact!T) {x.toSlice.foo}

This should avoid the problem of matching both of them. It should work with slices that have labels as well.

@9il
Copy link
Member Author

9il commented May 19, 2023

Labels may be a breaking change. mir-algorithm and mir-ion have to try to avoid them until at least 2026.

@jmh530
Copy link
Contributor

jmh530 commented May 19, 2023

What happens in 2026?

@9il
Copy link
Member Author

9il commented May 20, 2023

The thing is that I am going to spend far less time with Mir. But still have to support it for a few years. On the other hand, we can add new code if that will not affect existing behavior.

@jmh530
Copy link
Contributor

jmh530 commented May 22, 2023

Thanks. I was just confused by the reference to 2026. Your previous comment made it seem like improved labels support might come, but not until at least after 2026. This most recent comment suggests the support is not likely to come at all.

If one of my PRs is adding too much new code or changes, just let me know and we can close. I've got an outstanding lubeck PR, but I'm thinking about just moving the functionality I need into mir-stat.

@9il
Copy link
Member Author

9il commented May 25, 2023

Actually, if the API doesn't change existing code in Lubeck, you can add new API ther.

@jmh530
Copy link
Contributor

jmh530 commented May 25, 2023

@9il See here. I don't introduce breaking changes, but I am making changes to existing code (the diff in some cases makes it look like there are more significant changes than there are).

My objective from these changes is to have multivariate normal densities in mir.stat. Having functions to handle quadratic forms and Mahalanobis distance is useful in this, and improving the handling of matrix multiplication makes writing those functions simpler. It doesn't really matter to me where this stuff is located.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants