randomASCII function #8401

Merged

Conversation

BayoNet
Contributor

@BayoNet BayoNet commented Dec 25, 2019

I hereby agree to the terms of the CLA available at: https://yandex.ru/legal/cla/?lang=en

Changelog category (leave one):

  • New Feature

Changelog entry (up to few sentences, required except for Non-significant/Documentation categories):

Added the randomASCII(length) function, which generates a string of random printable ASCII characters.

Detailed description:

Documentation inside the PR.

@BayoNet BayoNet added the pr-feature Pull request with new product feature label Dec 25, 2019
WriteBufferFromVector<ColumnString::Chars> buf_to(data_to);

std::default_random_engine generator;
std::uniform_int_distribution<int> distribution(32, 127); //Printable ASCII symbols
Member

127 is not printable: it is the DEL control character. The printable ASCII range is 32 to 126.
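
A minimal sketch of the corrected range, assuming a standalone helper rather than the PR's `WriteBufferFromVector` code path. The name `makePrintableASCII` and the choice of `std::mt19937` are illustrative assumptions, not part of the PR.

```cpp
#include <cstddef>
#include <random>
#include <string>

// Hypothetical helper (not the PR's code): builds a string of `length`
// characters drawn uniformly from the printable ASCII range.
std::string makePrintableASCII(std::size_t length, std::mt19937 & rng)
{
    // 32 is ' ' and 126 is '~'. 127 (DEL) is a control character, so the
    // upper bound of the closed interval is 126, not 127.
    std::uniform_int_distribution<int> distribution(32, 126);

    std::string result;
    result.reserve(length);
    for (std::size_t i = 0; i < length; ++i)
        result.push_back(static_cast<char>(distribution(rng)));
    return result;
}
```

Note that `std::uniform_int_distribution` takes a closed interval, so both bounds are included in the output.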

str_length = static_cast<size_t>(vec_from[i]);
}

generator.seed( rd() );
Member

This is slow.

You could also add a performance test; see dbms/tests/performance.
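
One way to read this comment is that reseeding from `std::random_device` for every row is the expensive part. The sketch below, under that assumption, seeds a single engine once and takes up to eight characters from each 64-bit draw. `fillRows` and its parameters are hypothetical names, and the byte-to-character mapping is slightly non-uniform because 256 values are folded onto 95.

```cpp
#include <cstddef>
#include <cstdint>
#include <random>
#include <string>
#include <utility>
#include <vector>

// Hypothetical sketch: one engine, seeded once, is reused for every row.
std::vector<std::string> fillRows(const std::vector<std::size_t> & lengths, std::uint64_t seed)
{
    std::mt19937_64 rng(seed);  // seeded once, outside the per-row loop
    std::vector<std::string> rows;
    rows.reserve(lengths.size());

    for (std::size_t length : lengths)
    {
        std::string row(length, ' ');
        std::size_t pos = 0;
        while (pos < length)
        {
            // One 64-bit draw supplies up to eight characters.
            std::uint64_t bits = rng();
            for (int k = 0; k < 8 && pos < length; ++k, bits >>= 8)
            {
                // Map a byte in [0, 255] onto [32, 126].
                std::uint64_t offset = ((bits & 0xFF) * 95) >> 8;
                row[pos++] = static_cast<char>(32 + offset);
            }
        }
        rows.push_back(std::move(row));
    }
    return rows;
}
```

A caller that wants non-deterministic output could pass `std::random_device{}()` as the seed, which still touches the device only once per call.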


generator.seed( rd() );

if (str_length > 0){
Member

This condition is redundant: the loop checks it anyway before its first iteration.
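
To illustrate the point, here is a small sketch with a hypothetical helper `appendChars` (not taken from the PR). A `for` loop evaluates its condition before the first iteration, so a zero length simply skips the body without any extra guard.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: appends `str_length` copies of `fill` to `out`.
// No `if (str_length > 0)` guard is needed around the loop.
void appendChars(std::string & out, std::size_t str_length, char fill)
{
    // The condition `j < str_length` is tested before the first iteration,
    // so the body never runs when str_length == 0.
    for (std::size_t j = 0; j < str_length; ++j)
        out.push_back(fill);
}
```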

return name;
}

size_t getNumberOfArguments() const override { return 1; }
Member

We can also support additional "disambiguation" arguments, similar to the rand function.


void executeImpl(Block & block, const ColumnNumbers & arguments, size_t result, size_t input_rows_count) override
{
if (!(executeType<UInt8>(block, arguments, result, input_rows_count)
Member

BTW, the cost of a virtual call is OK here; we can use the IColumn::getUInt method and avoid manual dispatching.
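
The idea can be sketched with toy stand-ins. `ColumnLike`, `VectorColumnLike` and `readLengths` below are hypothetical and are not ClickHouse's classes; they only mimic the shape of a virtual `getUInt` accessor so that one code path serves every integer width, instead of a chain of `executeType<UInt8>`, `executeType<UInt16>`, and so on.

```cpp
#include <cstddef>
#include <cstdint>
#include <utility>
#include <vector>

// Toy stand-ins for illustration only; these are NOT ClickHouse's classes.
struct ColumnLike
{
    virtual ~ColumnLike() = default;
    virtual std::uint64_t getUInt(std::size_t row) const = 0;
    virtual std::size_t size() const = 0;
};

template <typename T>
struct VectorColumnLike final : ColumnLike
{
    std::vector<T> data;

    explicit VectorColumnLike(std::vector<T> values) : data(std::move(values)) {}

    std::uint64_t getUInt(std::size_t row) const override
    {
        return static_cast<std::uint64_t>(data[row]);
    }

    std::size_t size() const override { return data.size(); }
};

// One code path for every integer width: the virtual call does the dispatch.
std::vector<std::size_t> readLengths(const ColumnLike & column)
{
    std::vector<std::size_t> lengths(column.size());
    for (std::size_t i = 0; i < column.size(); ++i)
        lengths[i] = static_cast<std::size_t>(column.getUInt(i));
    return lengths;
}
```

The trade-off is one virtual call per row, which is small next to the cost of generating a whole string for that row.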

@alexey-milovidov
Member

An "ideal" solution (unless you propose a better one) would be:

@alexey-milovidov
Member

Demo:

SELECT arrayStringConcat(arrayMap(x -> char(intDiv((rand(x) % 255) * 95, 256) + 32), range(100)))

┌─arrayStringConcat(arrayMap(lambda(tuple(x), char(plus(intDiv(multiply(modulo(rand(x), 255), 95), 256), 32))), range(100)))─┐
│ +P0\?rgPZ:iMVo7xHGQ>v-_>-2?6R6PEfe._VT<9<d$4Cv#FE9!:;;JSVL4r6!r "brtUb\"8FOy-!6J/KWvPUl4^Rp/`Yr+Zo:x                       │
└────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┘
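
The arithmetic in the demo query can be checked in isolation. The sketch below mirrors `intDiv((rand(x) % 255) * 95, 256) + 32` in C++; the function name `toPrintableASCII` is an illustrative assumption. Since `r % 255` is at most 254, the product is at most 24130, the integer quotient by 256 is at most 94, and the result therefore stays within 32 to 126.

```cpp
#include <cstdint>

// Mirrors the demo query's arithmetic: intDiv((r % 255) * 95, 256) + 32.
// (r % 255) lies in [0, 254], so the result lies in [32, 126].
constexpr char toPrintableASCII(std::uint32_t r)
{
    return static_cast<char>((r % 255) * 95 / 256 + 32);
}

// Compile-time checks of both ends of the range.
static_assert(toPrintableASCII(0) == ' ', "lower bound is the space character");
static_assert(toPrintableASCII(254) == '~', "upper bound is the tilde character");
```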


Accepts the name and arguments of a model as input. Returns Float64.
```sql
randomASKII(length)
Member

Typo: randomASKII should be randomASCII.

**Syntax**

```sql
randomASKII(length)
Member

Typo: randomASKII should be randomASCII.

class FunctionRandomASCII : public IFunction
{
public:
static constexpr auto name = "randomASCII";
Member

Let's name it randomPrintableASCII to avoid confusion.

1000
Member

You can create reference files simply by running:
clickhouse-client -n --testmode < query.sql > query.reference

@alexey-milovidov alexey-milovidov self-assigned this Dec 28, 2019
alexey-milovidov added a commit that referenced this pull request Dec 28, 2019
@alexey-milovidov alexey-milovidov mentioned this pull request Dec 28, 2019
alexey-milovidov added a commit that referenced this pull request Dec 29, 2019
@alexey-milovidov alexey-milovidov merged commit 4b81715 into ClickHouse:master Dec 29, 2019