Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

JSON Improvements, small Table stuff, Statistic in Enso not Java and few other minor bits. #3964

Merged
merged 35 commits into from
Dec 14, 2022

Conversation

jdunkerley
Copy link
Member

@jdunkerley jdunkerley commented Dec 8, 2022

Pull Request Description

  • Aligned compare_to so returns Type_Error if that is wrong type for Text, Ordering and Duration.
  • Add empty_object, empty_array. get_or_else, at, field_names and length to Json.
  • Fix Json serialisation of NaN and Infinity (to "null").
  • Added length, at and to_vector to Pair (allowing it to be treated as a Vector).
  • Added running_fold to the Vector and Range.
  • Added first and last to the Vector.Builder.
  • Allow order_by to take a single Sort_Column or have a mix of Text and Sort_Column.Name in a Vector.
  • Allow select_columns_helper to take a Text value. Allows for a single field in group_by in cross_tab.
  • Added Patch and Custom to HTTP_Method.
  • Added running Statistic calculation and moved more of the logic from Java to Enso. Performance seems similar to pure Java version now.

Checklist

Please include the following checklist in your PR:

  • The documentation has been updated if necessary.
  • All code conforms to the
    Scala,
    Java,
    and
    Rust
    style guides.
  • All code has been tested:
    • Unit tests have been written where possible.
    • If GUI codebase was changed: Enso GUI was tested when built using BOTH
      ./run ide build and ./run ide watch.

@jdunkerley jdunkerley force-pushed the wip/jd/table-union-initial branch 5 times, most recently from ac2413b to c246a19 Compare December 13, 2022 23:37
@jdunkerley jdunkerley marked this pull request as ready for review December 13, 2022 23:37
@jdunkerley jdunkerley changed the title JSON Improvements, Quality of life Table stuff and Table.Union JSON Improvements, small Table stuff, Statistic in Enso not Java and few other minor bits. Dec 13, 2022
Comment on lines 191 to 197
## Gets the value associated with the given index in this object.

Arguments:
- index: The index position from which to get the value.
at : Integer -> Json ! No_Such_Field | Out_Of_Bounds
at self index = case self of
Json.Array array -> array.at index
Json.String text -> Json.String (text.at index)
Json.Null -> Json.Null
_ -> Error.throw (Illegal_Argument.Error "Json.at: self must be a String or an Array.")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When can No_Such_Field be thrown here? Seems spurious.
Also, technically we should add Illegal_Argument here though. And the same for get and get_or_else.

Btw. should we allow get and at to be aliases of each other to emulate how in JS we use the same syntax for both: obj[field] and array[index]? Just a thought, although I guess keeping them separate is safer, so I'm probably for keeping it - but wanted to address possibilities.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Legacy from a previous step.
This is a partial step forward better JSON support is coming. The current is a disaster to work with.

Comment on lines +46 to +48
compare_to self that = case that of
_ : Ordering -> self.to_sign.compare_to that.to_sign
_ -> Error.throw (Type_Error.Error Ordering that "that")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Btw. what yielded this change?

I'm fine with it but usually we do not check the input types anyway, but I'm wondering what led to this change and if maybe we can do a more general solution at Incomparable_Values.handle_errors level?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It should be a Type_Error and when I was reviewing compare_to spotted it.
If passed an invalid type for comparing we should be consistent and return a Type_Error.

Copy link
Member

@radeusgd radeusgd left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looked at most changes and tests and they seem good. I want to have a deeper look at the statistics though.


java_stats = statistics.map .order
skip_moments = java_stats.all s->s.is_nothing
moments = if skip_moments then Nothing else Moments.new (java_stats.filter v->v!=Nothing . fold 0 .max)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder if we should allow 2.max Nothing == 2, then we'd be able to skip the filter step.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Don't want to change that behavior at the moment.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ok, fair enough

Comment on lines -6 to +7
## UNSTABLE
## PRIVATE
UNSTABLE
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This seems a bit wrong.

I think we should have a separate tag for marking entries that should not be shown in the GUI.

These errors are not meant to be private - normal text user should be able to use them. Well, even GUI user may want to value.catch Illegal_Argument.

If we mark stuff that is intended for usage by users as private, they will end up using it anyway and alongside it, they will be using the truly private stuff too.

Not something to address in this PR, but I really think we should separate truly private implementation details from stuff that is just meant to be hidden to the GUI. Ideally, before the release.

Otherwise, we will have no good way of saying 'this is an implementation detail, do not rely on it staying as it is'.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agree the concept of genuinely private keeps coming up. Using the comment to mask is just a workaround.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Personally, I'm fine with the PRIVATE comment, although I see that a genuine private would be even better. Still, not good to mix our current only concept of privacy with a separate concept of 'users are totally FINE using this, but do not show in the GUI component browser'.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

And it sounds like a simple thing to workaround, can we coordinate with IDE/SuggestionDatabase guys to just add one more keyword, like TEXT-ONLY or HIDDEN which would for now work the same way as PRIVATE (i.e. the modification in GUI / DB just needs to add an ||) and use it here? This way we'll avoid abusing PRIVATE and can do this properly before we get proper private (which probably won't happen too soon).

Comment on lines +1057 to +1065
Test.specify "should allow a grouping by text" <|
t = Table.new [["Group", ["A","B","A","B","A","B","A","B","A"]], ["Key", ["x", "x", "x", "x", "y", "y", "y", "z", "z"]], ["Value", [1, 2, 3, 4, 5, 6, 7, 8, 9]]]
t1 = t.cross_tab "Group" "Key"
t1.row_count . should_equal 2
t1.column_count . should_equal 4
t1.at "Group" . to_vector . should_equal ["A", "B"]
t1.at "x" . to_vector . should_equal [2, 2]
t1.at "y" . to_vector . should_equal [2, 1]
t1.at "z" . to_vector . should_equal [1, 1]
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe worth double-checking that the grouping supports Unicode normalization in backends with test_selection.supports_unicode_normalization == True?

test/Tests/src/Data/Json_Spec.enso Show resolved Hide resolved
test/Tests/src/Data/Json_Spec.enso Outdated Show resolved Hide resolved
test/Tests/src/Data/Pair_Spec.enso Outdated Show resolved Hide resolved
import java.util.Iterator;
import java.util.function.BiFunction;
import java.util.stream.Stream;
import org.graalvm.polyglot.Value;

public class CountMinMax {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure if I understand why we keep that in Java?
Seems like it should be well emulated by a collection of 4 Refs.
Although I do like that it is encapsulated in a purposeful object - this is neat. Just wondering if the object couldn't be managed by Enso? Or were Refs not performant enough here?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Massively slower

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, ok then. interesting though as in principle it should have similar characteristics. Is it worth discussing with @JaroslavTulach if we want to find out why?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes we will have a ticket to look at it

It accepts a Text value to check if the value contains it. In case of
Table operations, it can accept another column - then the corresponding
values from the source column and the provided column are checked.
Not_Contains (substring:Text)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

s/Not_Contains/Missing?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sounds nice, but the Not_ prefix is consistent with all the other operations so I'm not sure if consistency here will not be more important.

@jdunkerley jdunkerley requested a review from 4e6 as a code owner December 14, 2022 16:35
@jdunkerley jdunkerley added the CI: Ready to merge This PR is eligible for automatic merge label Dec 14, 2022
@mergify mergify bot merged commit 77fe69d into develop Dec 14, 2022
@mergify mergify bot deleted the wip/jd/table-union-initial branch December 14, 2022 19:40
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
CI: Ready to merge This PR is eligible for automatic merge
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants