[Docs] add missing arrayXYZ functions #62299

Merged: 18 commits, Apr 10, 2024

Contributor

@Blargian Blargian commented Apr 4, 2024

Closes #1934 as part of the functions project to document missing functions.

This PR adds documentation for the following functions listed in #1934:

  • array → already exists as array(x1, …), operator [x1, …] here
  • arrayDotProduct
  • arrayEnumerateDenseRanked
  • arrayEnumerateUniqRanked
  • arrayFirstOrNull
  • arrayFlatten → already exists here
  • arrayLastOrNull
  • arrayPartialShuffle
  • arrayShuffle

Changelog category (leave one):

  • Documentation (changelog entry is not required)

Documentation entry for user-facing changes

  • Documentation is written (mandatory for new features)

Contributor Author

Blargian commented Apr 4, 2024

@justindeguzman @johnnymatthews Could one of you please give this a look over when you get a chance?

@Algunenano added the "can be tested" label (Allows running workflows for external contributors) on Apr 8, 2024
@robot-clickhouse-ci-1 added the "pr-documentation" label (Documentation PRs for the specific code PR) on Apr 8, 2024
Contributor

robot-clickhouse-ci-1 commented Apr 8, 2024

This is an automated comment for commit 19916de with description of existing statuses. It's updated for the latest CI running

| Check name | Description | Status |
|---|---|---|
| A Sync | There's no description for the check yet, please add it to tests/ci/ci_config.py:CHECK_DESCRIPTIONS | ⏳ pending |
| Mergeable Check | Checks if all other necessary checks are successful | ❌ failure |

Successful checks

| Check name | Description | Status |
|---|---|---|
| CI running | A meta-check that indicates the running CI. Normally, it's in success or pending state. The failed status indicates some problems with the PR | ✅ success |
| Compatibility check | Checks that clickhouse binary runs on distributions with old libc versions. If it fails, ask a maintainer for help | ✅ success |
| Docs check | Builds and tests the documentation | ✅ success |
| PR Check | There's no description for the check yet, please add it to tests/ci/ci_config.py:CHECK_DESCRIPTIONS | ✅ success |
| Style check | Runs a set of checks to keep the code style clean. If some of tests failed, see the related log from the report | ✅ success |
| Unit tests | Runs the unit tests for different release types | ✅ success |

@Blargian changed the title from "Document missing arrayXYZ functions" to "[Docs] add missing arrayXYZ functions" on Apr 9, 2024
@rschu1ze self-assigned this on Apr 10, 2024
docs/en/sql-reference/functions/array-functions.md

## arrayPartialShuffle

Returns an array of the same size as the original array where elements in range `[1..limit]` are a random subset of the original array. Remaining `(limit..N]` shall contain the elements not in `[1..limit]` range in an undefined order.

Member

Please document what N stands for (=the cardinality/size of the input array).

Member

It is not clear to the reader what this function is actually doing.

  • [1..limit] is supposed to be a random (aka. shuffled) subset of the input array
  • (limit..N] is supposed to be a random subset of the remaining elements.

So function arrayPartialShuffle basically permutes the whole thing? How is it then different from arrayShuffle?

Contributor Author

@rschu1ze I've tried to make it clearer what this function does - both in the description, and in the examples. How does it look to you now?

Member

It is better now, thank you!
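
For readers following this thread, the behaviour under discussion can be modelled outside ClickHouse. This is a minimal Python sketch, assuming the function behaves like a partial Fisher-Yates shuffle that stops after `limit` swap steps. The algorithm choice, the helper name `partial_shuffle`, the 0-based indexing, and the treatment of an out-of-range `limit` as a full shuffle are all assumptions made for illustration; none of them is taken from the ClickHouse source.

```python
import random


def partial_shuffle(arr, limit, seed=None):
    """Hypothetical model of a partial shuffle (not ClickHouse code).

    Runs only the first `limit` steps of a Fisher-Yates shuffle, so the
    first `limit` positions hold a random selection from the input and
    the remaining positions hold whatever elements are left over.
    """
    rng = random.Random(seed)  # a fixed seed makes the result repeatable
    out = list(arr)            # work on a copy; the input is not modified
    n = len(out)

    # Assumption: a limit outside 1..n means "shuffle everything".
    if limit < 1 or limit > n:
        limit = n

    for i in range(limit):
        # Pick a partner from the not-yet-fixed tail (i itself included).
        j = rng.randrange(i, n)
        out[i], out[j] = out[j], out[i]
    return out


if __name__ == "__main__":
    data = list(range(1, 11))
    print(partial_shuffle(data, 1, seed=42))
    print(partial_shuffle(data, 2, seed=42))
```

Under this model a full shuffle is the special case `limit = N`, while `limit = 1` performs a single swap and so changes at most two positions. That is one way to read the difference from `arrayShuffle` raised earlier in this thread.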


## arrayPartialShuffle

Given an input array of cardinality `N`, returns an array of size N where elements in the range `[1...limit]` are shuffled and the remaining elements in the range `(limit...n]` are unshuffled.

Member

In "of size N", "N" should be quoted.

(really minor, can be fixed in a follow-up PR)

```
[6,2,3,4,5,1,7,8,9,10]
```

In this example, the `limit` is increased to `2` and a `seed` value is provided. The order

Member

"The order" seems a remnant.

@rschu1ze added this pull request to the merge queue on Apr 10, 2024
Merged via the queue into ClickHouse:master with commit daf2fdc on Apr 10, 2024
28 of 29 checks passed
@robot-ch-test-poll3 added the "pr-synced-to-cloud" label (The PR is synced to the cloud repo) on Apr 10, 2024
Labels

  • can be tested: Allows running workflows for external contributors
  • pr-documentation: Documentation PRs for the specific code PR
  • pr-synced-to-cloud: The PR is synced to the cloud repo

Projects
None yet
Development

Successfully merging this pull request may close these issues.

Document arrayXYZ functions.
6 participants