[C++] Implement "random" nullary function that generates uniform random between 0 and 1 #28196

Closed
asfimport opened this issue Apr 15, 2021 · 11 comments

Comments

@asfimport
Collaborator

asfimport commented Apr 15, 2021

This is similar to PostgreSQL's random()

https://www.postgresql.org/docs/8.2/functions-math.html

Reporter: Wes McKinney / @wesm
Assignee: Alex Suhan / @asuhan

Related issues:

PRs and other links:

Note: This issue was originally created as ARROW-12404. Please see the migration documentation for further details.


Antoine Pitrou / @pitrou:
@edponce


Eduardo Ponce / @edponce:
I will work on this, but I want to be clear about the requirement. Arrow has non-nullary uniform random number generators that generate N numbers, defined in the testing module (e.g., random_real()). Will the nullary random() be included in the random namespace and make use of GenerateOptions?


Antoine Pitrou / @pitrou:
No, this shouldn't use any functionality defined in testing, as it is not built by default. Also, we won't necessarily want to use the same random number generator or initialization parameters.

(see also ARROW-10797 for a slightly related topic)


yibocai#1:
If we are to implement from scratch a method other than std::uniform_real_distribution, it's better to take issue ARROW-12533 into account.
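
For reference, a minimal sketch of the standard-library baseline mentioned here: producing N uniform doubles in [0, 1) with std::uniform_real_distribution. The helper name `FillUniform`, the choice of std::mt19937_64, and the explicit seed parameter are assumptions made for illustration; none of them are Arrow APIs.

```cpp
#include <cstddef>
#include <cstdint>
#include <random>
#include <vector>

// Hypothetical helper (not an Arrow API): returns `length` doubles drawn
// uniformly from [0, 1) using the standard-library engine and distribution.
std::vector<double> FillUniform(std::size_t length, std::uint64_t seed) {
  std::mt19937_64 engine(seed);  // assumed engine, chosen only for the sketch
  std::uniform_real_distribution<double> dist(0.0, 1.0);
  std::vector<double> out(length);
  for (double& v : out) {
    v = dist(engine);
  }
  return out;
}
```

The same seed reproduces the same values on a given standard library, but the C++ standard does not fix the distribution's algorithm, so output and speed can differ between implementations.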


Eduardo Ponce / @edponce:
Hi @cyb70289,

I am working on this issue and noticed that you submitted #10283 for ARROW-12533. Can I duplicate the RNGs for the compute layer?


yibocai#1:
Sure, @edponce


Eduardo Ponce / @edponce:
Thanks! I know that code duplication is a "code smell", but the compute and testing layers are independent of each other. As I understand it, Arrow does not (yet) have a well-defined common submodule to supply utility functions across layers, and this might be further limited if the APIs of the same functions vary across layers.


Antoine Pitrou / @pitrou:
Utility code goes into arrow/util/ ;)


Antoine Pitrou / @pitrou:
That said, I'll reiterate that an RNG for testing needn't have the same properties as an RNG exposed as a public API. We need to be a bit more careful about the public API RNG:

  • needs to have good-quality statistical output
  • needs to have consistent, predictable performance
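
One way to keep the per-value cost predictable, sketched under stated assumptions: take the top 53 bits of a 64-bit generator output and scale by 2^-53, which yields doubles in [0, 1) without relying on how a particular standard library implements std::uniform_real_distribution. SplitMix64 is used below only as a short stand-in engine; the thread does not say which generator the final implementation uses, and the names `SplitMix64`, `NextUnitDouble`, and `RandomUnitDoubles` are hypothetical.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Stand-in 64-bit generator (SplitMix64), used only to keep the sketch short.
struct SplitMix64 {
  std::uint64_t state;
  explicit SplitMix64(std::uint64_t seed) : state(seed) {}

  std::uint64_t operator()() {
    std::uint64_t z = (state += 0x9E3779B97F4A7C15ULL);
    z = (z ^ (z >> 30)) * 0xBF58476D1CE4E5B9ULL;
    z = (z ^ (z >> 27)) * 0x94D049BB133111EBULL;
    return z ^ (z >> 31);
  }
};

// Keeps the top 53 bits (the width of a double's significand) and scales
// by 2^-53, so the result is always in [0, 1).
inline double NextUnitDouble(SplitMix64& gen) {
  return static_cast<double>(gen() >> 11) * (1.0 / 9007199254740992.0);
}

// Hypothetical helper: `length` uniform doubles from a single seeded stream.
std::vector<double> RandomUnitDoubles(std::size_t length, std::uint64_t seed) {
  SplitMix64 gen(seed);
  std::vector<double> out(length);
  for (double& v : out) {
    v = NextUnitDouble(gen);
  }
  return out;
}
```

Each value costs one generator call, one shift, and one multiply, with no rejection loop, so throughput does not depend on the values drawn. Whether a given engine meets the statistical-quality requirement is a separate question that this sketch does not settle.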


Eduardo Ponce / @edponce:
Thanks for the feedback, @pitrou. I will definitely be careful with the public RNG. I will circle back on this when I have a stable implementation.


Antoine Pitrou / @pitrou:
Issue resolved by pull request 11864
#11864
