Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Create ScientificTypes.jl extension #31

Merged
merged 5 commits into from
Jul 11, 2023
Merged

Create ScientificTypes.jl extension #31

merged 5 commits into from
Jul 11, 2023

Conversation

MilesCranmer
Copy link
Member

@MilesCranmer MilesCranmer commented Jul 6, 2023

@OkonSamuel @ablaom might one of you have some time to perhaps look at this and let me know if this is the correct way to forward scitype traits? It's a pretty lightweight interface; it basically just forwards the scitype call on the value of a physical quantity, while ignore the units.

The tricky part about this is integration that the units of Quantity{...,Dimensions{...}} objects are in SI base units, so as an example: you technically could not have a Count number of femtometers. It would need to be represented as a float, and therefore would be Continuous. This is a limitation of DynamicQuantities.jl perhaps, although you could create a custom AbstractDimensions that uses femtometers as a base.

I'm also opening to simply declaring scitype(::AbstractQuantity) = Continuous in all cases if that is better semantics, as physical quantities usually are measured continuously. Then again, perhaps someone has a custom scitype for the values they are working with – and it wouldn't make sense to override their chosen trait.

Hre are the current tests which illustrate the usage:

using DynamicQuantities
using ScientificTypes

x = 1.0u"m/s"

@test scitype(x) <: Continuous
@test scitype([x]) <: AbstractVector{<:Continuous}
@test scitype(randn(32) .* u"m/s") <: AbstractVector{<:Continuous}

# Converted to integer 1 m/s => Count
@test scitype(Quantity{Int}(x)) <: Count

X = (; x=randn(32) .* u"m/s")

@test scitype(X) <: Table{<:AbstractVector{<:Continuous}}

sch = schema(X)

@test first(sch.names) == :x
@test first(sch.scitypes) == Continuous
@test first(sch.types) <: Quantity{Float64}

@github-actions
Copy link

github-actions bot commented Jul 6, 2023

Benchmark Results

main 6e9336b... t[main]/t[6e9336b...]
creation/Quantity(x) 3.1 ± 0.1 ns 3.1 ± 0.1 ns 1
creation/Quantity(x, length=y) 3.4 ± 0 ns 3.4 ± 0 ns 1
time_to_load 0.139 ± 0.0013 s 0.138 ± 0.00086 s 1.01
with_numbers/*real 3.4 ± 0 ns 3.4 ± 0 ns 1
with_numbers/^int 24.1 ± 2.4 ns 24.1 ± 2.4 ns 1
with_numbers/^int * real 24.1 ± 2.4 ns 24.1 ± 2.4 ns 1
with_quantity/+y 7.7 ± 0 ns 7.7 ± 0.1 ns 1
with_quantity//y 3.4 ± 0 ns 3.4 ± 0 ns 1
with_self/dimension 1.7 ± 0 ns 1.7 ± 0 ns 1
with_self/inv 3.4 ± 0 ns 3.4 ± 0 ns 1
with_self/ustrip 1.7 ± 0 ns 1.7 ± 0 ns 1

Benchmark Plots

A plot of the benchmark results have been uploaded as an artifact to the workflow run for this PR.
Go to "Actions"->"Benchmark a pull request"->[the most recent run]->"Artifacts" (at the bottom).

@OkonSamuel
Copy link

@MilesCranmer I don't see anything wrong with your idea.
ScientificTypes.jl is pretty similar to this package as it also allows users to get the scientific types of objects with types that may not be known at compile time. (e.g Any[1, 1//3, 1.5]).
So it's okay to pass the value to the scitype function.
The only thing is that if for instance scitype(1.0u"m/s") returns Continuous we require that it actually behaves like one. Which I don't think will be an issue as it plays well with other Continuous objects right?

While your idea is fine. The implementations isn't quite right.
At the moment the ScientificTypes interface is quite convoluted.
You may want to wait a bit as we are working on improving the ScientificTypes interface and making it easier for users to extend.

@MilesCranmer
Copy link
Member Author

Which I don't think will be an issue as it plays well with other Continuous objects right?

Yep, it basically inherits all behaviour from the value type. So if it stores a float, then it acts exactly like a float with functions in Base.

You may want to wait a bit as we are working on improving the ScientificTypes interface and making it easier for users to extend.

My only question would be: is it better I extend scitype from ScientificTypes, or from MLJModelInterface?

(The main reason I’m doing this is to get SymbolicRegression working with units)

@MilesCranmer MilesCranmer changed the base branch from main to extra-utils-2 July 10, 2023 06:22
@MilesCranmer MilesCranmer changed the base branch from extra-utils-2 to main July 10, 2023 06:26
@MilesCranmer
Copy link
Member Author

Okay I think I fixed the interface:

import ScientificTypes as ST
import ScientificTypesBase as STB

STB.scitype(x::AbstractQuantity, C::ST.DefaultConvention) = STB.scitype(ustrip(x), C)
STB.Scitype(::Type{<:AbstractQuantity{T}}, C::ST.DefaultConvention) where {T} = STB.Scitype(T, C)

@MilesCranmer
Copy link
Member Author

Merging now so I can get quantities working in SymbolicRegression.jl. But we can change this in the future if need be.

@MilesCranmer MilesCranmer merged commit cf26dc5 into main Jul 11, 2023
7 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants