Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ARROW-13345: [C++] Add basic implementation for log to base b #10898

Closed

Conversation

rommelDB
Copy link
Contributor

@rommelDB rommelDB commented Aug 6, 2021

Changes:

  • Added basic implementation for log to base b.
  • Added unit tests.

@github-actions
Copy link

github-actions bot commented Aug 6, 2021

@pitrou
Copy link
Member

pitrou commented Aug 9, 2021

Is it useful for the log base to be given as a Datum rather than a function option? Are there use cases where one wants to lookup the log base in a column?

@rommelDB
Copy link
Contributor Author

rommelDB commented Aug 9, 2021

Is it useful for the log base to be given as a Datum rather than a function option? Are there use cases where one wants to lookup the log base in a column?

@pitrou To be honest, I'm not sure if that such a scenario exists or whether that scenario is frequent. In any case, this draft was my first approach, and I can certainly change it to consider the base as a single value as a function option. Accordingly, I think that the base as a value makes more sense.

@pitrou
Copy link
Member

pitrou commented Aug 9, 2021

cc @ianmcook @jonkeane for opinions about the log base question.

@jonkeane
Copy link
Member

jonkeane commented Aug 9, 2021

That seems pretty uncommon to me. I could contrive a few examples that aren't totally outlandish (say you're scaling/normalizing a value by groups, you might save the log base that you did the normalization with to be able to transform something back into the original values. But that's pretty contrived, and I bet most people would approach that with a lookup table and a for loop, basically.

@ianmcook
Copy link
Member

I like having the option for the base to be an array. I suspect that in the vast majority of cases users will pass a scalar base, but this implementation is the most flexible way to do it.

@pitrou
Copy link
Member

pitrou commented Aug 10, 2021

I like having the option for the base to be an array. I suspect that in the vast majority of cases users will pass a scalar base, but this implementation is the most flexible way to do it.

That sounds fine to me.

@rommelDB rommelDB marked this pull request as ready for review August 11, 2021 14:54
Copy link
Member

@lidavidm lidavidm left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, looks good to me overall! I left some comments on the test cases.

auto func =
std::make_shared<ArithmeticFloatingPointFunction>(name, Arity::Binary(), doc);
for (const auto& ty : FloatingPointTypes()) {
auto output = is_integer(ty->id()) ? float64() : ty;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: this line is redundant now (ty is never integer). I had filed ARROW-13361 to clean this up.

@@ -1589,6 +1636,19 @@ const FunctionDoc log1p_checked_doc{
"-inf or NaN."),
{"x"}};

const FunctionDoc logb_doc{
"Compute log of x to base b of arguments element-wise",
("Values <= -1 return -inf or NaN. Null values return null.\n"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Values <= 0?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed.

}

TYPED_TEST(TestBinaryArithmeticFloating, Log) {
// Integer arguments promoted to double, sanity check here
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: I don't think this comment applies here

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You are right.

this->AssertBinop(Logb, "[1.0, 10.0, null]", "[10.0, 10.0, null]", "[0.0, 1.0, null]");
this->AssertBinop(Logb, "[1.0, 2.0, null]", "[2.0, 2.0, null]", "[0.0, 1.0, null]");
this->AssertBinop(Logb, "[10.0, 100.0, 1000.0, null]", this->MakeScalar(10),
"[1.0, 2.0, 3.0, null]");
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should also test what happens on Log of NaN, Inf, and -Inf. We should also check the error cases below here, with both check_overflow = true and check_overflow = false. Or really, it should look similar to the tests for the unary log functions here:

TYPED_TEST(TestUnaryArithmeticFloating, Log) {

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

More unit tests were added.

@rommelDB rommelDB requested a review from lidavidm August 11, 2021 23:01
@pitrou pitrou changed the title ARROW-13345: [C++] Added basic implementation for log to base b ARROW-13345: [C++] Add basic implementation for log to base b Aug 12, 2021
@pitrou
Copy link
Member

pitrou commented Aug 12, 2021

@rommelDB Can you try to fix the build errors on Windows (see CI)?

Copy link
Member

@lidavidm lidavidm left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. As Antoine notes, there's a build failure on Windows, you'll likely need a cast somewhere: https://github.com/apache/arrow/pull/10898/checks?check_run_id=3306264947#step:7:901

@lidavidm lidavidm closed this in c6b0dda Aug 13, 2021
@lidavidm
Copy link
Member

Thanks for implementing this!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

5 participants