New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add repeat subcommand to builtin string #3864

Closed
Greynad opened this Issue Feb 21, 2017 · 18 comments

Comments

Projects
None yet
5 participants
@Greynad
Contributor

Greynad commented Feb 21, 2017

why?

I often find myself in need of a way to replicate a string n times and there is currently no way to do this natively in any shell. So people came up with a few ways to achieve this, i.e:

python -c 'print "A"*10'
ruby -e 'puts "A"*10'
perl -E 'say "A"x10'
printf "A%.0s" {1..10}

Creating a function to wrap any of these method is fine, but why bother when we could have a native function?

Proposal

Since (to my knowledge) no other shell has this feature, let's add this to fish!

The proposed syntax is:

string repeat -t 10 'string'

Repeat subcommand could take these arguments:

-t, --times: number of times to repeat the string
-n, --no-newline: whether to append a newline or not

alternatives:

-c, --count: number of times to repeat the string (as proposed by @krader1961 )
The newline could be appended automatically if we're not piping the output to another program

Misc

edge cases:

  • Should 'string repeat -t 0 test' yield an error or be allowed and not output anything?

A preliminary version of this has been proposed #3858

@floam

This comment has been minimized.

Show comment
Hide comment
@floam

floam Feb 22, 2017

Member

I'm not convinced we really need a builtin to do this. What's the common use-case?

It's easy to wrap and accomplish a variety of ways. Could use for loops or just a printf and a string replace which are builtins.

printf "%0.*d" 20 0 | string replace -a 0 Foo
FooFooFooFooFooFooFooFooFooFooFooFooFooFooFooFooFooFooFooFoo

(There's certainly a more clever way I'm not thinking of).

Member

floam commented Feb 22, 2017

I'm not convinced we really need a builtin to do this. What's the common use-case?

It's easy to wrap and accomplish a variety of ways. Could use for loops or just a printf and a string replace which are builtins.

printf "%0.*d" 20 0 | string replace -a 0 Foo
FooFooFooFooFooFooFooFooFooFooFooFooFooFooFooFooFooFooFooFoo

(There's certainly a more clever way I'm not thinking of).

@Greynad

This comment has been minimized.

Show comment
Hide comment
@Greynad

Greynad Feb 22, 2017

Contributor

A lot of scripting languages implements this feature (see python/ruby/perl examples)
In the terminal world, you can come up with a lot of clever hacks that works on some platform and not others, i.e bsd seq can be used seq -f "A" -s '' 10 but gnu seq cannot. Using a scripting language just to do this one task is also a bit cumbersome.

You can find a lot of threads regarding those 'hacks' on stackoverflow but no clean, portable solution really exists to this problem.

Now on the actual usefullness of such a feature, it highly depends on what tasks you're used to accomplish with your shell. I tend to use fish scripting the same way I'd use python or ruby, so i'd use this feature to quickly create files or inputs with repetitions, fuzzing, separating outputs with a given char ("----"), etc... without ever having to call another scripting lang.

The purpose of this PR is to fill that gap with an easy, fast builtin implementation.

Contributor

Greynad commented Feb 22, 2017

A lot of scripting languages implements this feature (see python/ruby/perl examples)
In the terminal world, you can come up with a lot of clever hacks that works on some platform and not others, i.e bsd seq can be used seq -f "A" -s '' 10 but gnu seq cannot. Using a scripting language just to do this one task is also a bit cumbersome.

You can find a lot of threads regarding those 'hacks' on stackoverflow but no clean, portable solution really exists to this problem.

Now on the actual usefullness of such a feature, it highly depends on what tasks you're used to accomplish with your shell. I tend to use fish scripting the same way I'd use python or ruby, so i'd use this feature to quickly create files or inputs with repetitions, fuzzing, separating outputs with a given char ("----"), etc... without ever having to call another scripting lang.

The purpose of this PR is to fill that gap with an easy, fast builtin implementation.

@krader1961 krader1961 added this to the fish-future milestone Feb 22, 2017

@krader1961

This comment has been minimized.

Show comment
Hide comment
@krader1961

krader1961 Feb 22, 2017

Contributor

I'm not convinced we really need a builtin to do this.

It's not so much that we need a builtin since, as you pointed out, this can already be accomplished via other means. But the same can be said for much of the string functionality itself. I would have used this had it been available when I wrote the unit tests for verifying that read works correctly when handed too much data. I instead did something more clumsy using printf. If this weren't a borderline trivial, and small, enhancement to the string builtin I would be inclined to vote no. But given that it is and we have a proof of concept implementation from a volunteer I don't see any reason not to implement it.

The only questions I have involve the details of the implementation. Such as should it be --times or --count? What is the behavior if you don't specify a repetition count? Is that an error or does it default to zero (no output) or one (single instance of the string)? What if the user explicitly asks for --count=0? Is that an error or do we simply produce no output? A negative value could be an error but are there other useful meanings? For example, would it represent an upper bound on the number of chars written? That is, string repeat --count=-1024 abcdef means to repeat the string abcdef until 1023 characters have been written plus the newline rather than repeating the string 1024 times. The final repetition would be truncated as necessary. Or perhaps we accommodate that scenario by adding a --limit flag and treat --count=-1 to mean repeat until the limit is reached.

Contributor

krader1961 commented Feb 22, 2017

I'm not convinced we really need a builtin to do this.

It's not so much that we need a builtin since, as you pointed out, this can already be accomplished via other means. But the same can be said for much of the string functionality itself. I would have used this had it been available when I wrote the unit tests for verifying that read works correctly when handed too much data. I instead did something more clumsy using printf. If this weren't a borderline trivial, and small, enhancement to the string builtin I would be inclined to vote no. But given that it is and we have a proof of concept implementation from a volunteer I don't see any reason not to implement it.

The only questions I have involve the details of the implementation. Such as should it be --times or --count? What is the behavior if you don't specify a repetition count? Is that an error or does it default to zero (no output) or one (single instance of the string)? What if the user explicitly asks for --count=0? Is that an error or do we simply produce no output? A negative value could be an error but are there other useful meanings? For example, would it represent an upper bound on the number of chars written? That is, string repeat --count=-1024 abcdef means to repeat the string abcdef until 1023 characters have been written plus the newline rather than repeating the string 1024 times. The final repetition would be truncated as necessary. Or perhaps we accommodate that scenario by adding a --limit flag and treat --count=-1 to mean repeat until the limit is reached.

@krader1961 krader1961 added the RFC label Feb 22, 2017

@ridiculousfish

This comment has been minimized.

Show comment
Hide comment
@ridiculousfish

ridiculousfish Feb 22, 2017

Member

This seems harmless to me.

Member

ridiculousfish commented Feb 22, 2017

This seems harmless to me.

@Greynad

This comment has been minimized.

Show comment
Hide comment
@Greynad

Greynad Feb 23, 2017

Contributor

My two cents on this:

  • I like the --count flag better
  • not providing --count flag or --count=0 should both yield an error
  • I don't think it's necessary to clog the functionnality by giving meaning to a negative --count value
Contributor

Greynad commented Feb 23, 2017

My two cents on this:

  • I like the --count flag better
  • not providing --count flag or --count=0 should both yield an error
  • I don't think it's necessary to clog the functionnality by giving meaning to a negative --count value
@floam

This comment has been minimized.

Show comment
Hide comment
@floam

floam Feb 23, 2017

Member

I like -n as short for --count. I also think --count 0 should just have no output, to make it easier for scripts that might give different values and not want to guard for 0.

Member

floam commented Feb 23, 2017

I like -n as short for --count. I also think --count 0 should just have no output, to make it easier for scripts that might give different values and not want to guard for 0.

@floam

This comment has been minimized.

Show comment
Hide comment
@floam

floam Feb 23, 2017

Member

Do we really even need an --option? Since there's only one option we always require, couldn't it just be repeat NUM STRING?

Member

floam commented Feb 23, 2017

Do we really even need an --option? Since there's only one option we always require, couldn't it just be repeat NUM STRING?

@Greynad

This comment has been minimized.

Show comment
Hide comment
@Greynad

Greynad Feb 23, 2017

Contributor

Could you explain the logic behind -n as short for --count. The -n is already short for --no-newline.
no output in case of 0 and no flags at all is fine with me.

Contributor

Greynad commented Feb 23, 2017

Could you explain the logic behind -n as short for --count. The -n is already short for --no-newline.
no output in case of 0 and no flags at all is fine with me.

@floam

This comment has been minimized.

Show comment
Hide comment
@floam

floam Feb 23, 2017

Member

Oh, I missed the no-newline thing.

A lot of unix utilities use -n for counts.

Member

floam commented Feb 23, 2017

Oh, I missed the no-newline thing.

A lot of unix utilities use -n for counts.

@floam

This comment has been minimized.

Show comment
Hide comment
@floam

floam Feb 23, 2017

Member

Is -n for a newline after each string, or newline after all the output? In either case it could not really be necessary. It's not hard to add a trailing newline yourself - I'm thinking in command substitution cases, you'll actually want no newline by default.

Member

floam commented Feb 23, 2017

Is -n for a newline after each string, or newline after all the output? In either case it could not really be necessary. It's not hard to add a trailing newline yourself - I'm thinking in command substitution cases, you'll actually want no newline by default.

@Greynad

This comment has been minimized.

Show comment
Hide comment
@Greynad

Greynad Feb 23, 2017

Contributor

default behavior appends a newline at the very end of the output. -n removes it.
We could either get rid of the -n flag or just make it default to no newline.
Maybe there are a few cases where having a newline at the end is required, but I'm not thinking of any at the moment. So if someone can prove having the ability to append a newline is indeed useful, we'll keep it, otherwise we'll get rid of it.

Contributor

Greynad commented Feb 23, 2017

default behavior appends a newline at the very end of the output. -n removes it.
We could either get rid of the -n flag or just make it default to no newline.
Maybe there are a few cases where having a newline at the end is required, but I'm not thinking of any at the moment. So if someone can prove having the ability to append a newline is indeed useful, we'll keep it, otherwise we'll get rid of it.

@krader1961

This comment has been minimized.

Show comment
Hide comment
@krader1961

krader1961 Feb 24, 2017

Contributor

I do prefer --count. I do agree with @floam that a count of zero should be legal and produce no output. However, we shouldn't default to a count of zero; i.e., a --count flag is mandatory unless a --limit is specified (see below). I also agree the short flag should be -n for consistency with many other commands.

I disagree with @floam that it should be a positional argument rather than a flag because that is an unusual pattern likely to cause problems; e.g., does the count occur before or after the string.

I think we should have a --limit flag to impose an upper bound on the number of characters emitted. In conjunction with this feature omitting the --count flag or specifying a value less than or equal to zero means repeat until the limit is reached. Obviously a count less than one with no limit should be an error. The limit does not include the trailing newline, if any.

I don't understand this comment by @floam:

I'm thinking in command substitution cases, you'll actually want no newline by default.

Command substitutions always remove the trailing newline if present. So it isn't necessary to do anything special in that case. On the other hand I can envision use cases where the output of string repeat is piped into another command or redirecting to a file and terminating the output with a newline should be the default behavior. I would keep the default to add a newline to the output unless the string being repeated ends with a newline. Make the --no-newline short flag -N so as not to conflict with -n for the repeat count. This is also consistent with many other commands where less frequently used short options are in uppercase to avoid conflicting with a more frequently used short option.

Contributor

krader1961 commented Feb 24, 2017

I do prefer --count. I do agree with @floam that a count of zero should be legal and produce no output. However, we shouldn't default to a count of zero; i.e., a --count flag is mandatory unless a --limit is specified (see below). I also agree the short flag should be -n for consistency with many other commands.

I disagree with @floam that it should be a positional argument rather than a flag because that is an unusual pattern likely to cause problems; e.g., does the count occur before or after the string.

I think we should have a --limit flag to impose an upper bound on the number of characters emitted. In conjunction with this feature omitting the --count flag or specifying a value less than or equal to zero means repeat until the limit is reached. Obviously a count less than one with no limit should be an error. The limit does not include the trailing newline, if any.

I don't understand this comment by @floam:

I'm thinking in command substitution cases, you'll actually want no newline by default.

Command substitutions always remove the trailing newline if present. So it isn't necessary to do anything special in that case. On the other hand I can envision use cases where the output of string repeat is piped into another command or redirecting to a file and terminating the output with a newline should be the default behavior. I would keep the default to add a newline to the output unless the string being repeated ends with a newline. Make the --no-newline short flag -N so as not to conflict with -n for the repeat count. This is also consistent with many other commands where less frequently used short options are in uppercase to avoid conflicting with a more frequently used short option.

@faho

This comment has been minimized.

Show comment
Hide comment
@faho

faho Feb 25, 2017

Member

I think we should have a --limit flag to impose an upper bound on the number of characters emitted.

string split already has "-m" or "--max", so use that for consistency?

Member

faho commented Feb 25, 2017

I think we should have a --limit flag to impose an upper bound on the number of characters emitted.

string split already has "-m" or "--max", so use that for consistency?

@krader1961

This comment has been minimized.

Show comment
Hide comment
@krader1961

krader1961 Feb 25, 2017

Contributor

string split already has "-m" or "--max", so use that for consistency?

Agreed.

Contributor

krader1961 commented Feb 25, 2017

string split already has "-m" or "--max", so use that for consistency?

Agreed.

@krader1961

This comment has been minimized.

Show comment
Hide comment
@krader1961

krader1961 Mar 4, 2017

Contributor

@Greynad, I think we have a rough consensus regarding how this should behave so I'm removing the RFC label. Feel free to create a new pull-request to replace PR #3858 at your convenience that addresses the points raised in this issue.

Contributor

krader1961 commented Mar 4, 2017

@Greynad, I think we have a rough consensus regarding how this should behave so I'm removing the RFC label. Feel free to create a new pull-request to replace PR #3858 at your convenience that addresses the points raised in this issue.

@Greynad

This comment has been minimized.

Show comment
Hide comment
@Greynad

Greynad Mar 4, 2017

Contributor

@krader1961 Thank you! Will do, I just have one last question before making the changes.
How will the --max flag behave? Do we use it as a standalone flag or with --count?
Something along the line of: string repeat -m 100 foo?

Contributor

Greynad commented Mar 4, 2017

@krader1961 Thank you! Will do, I just have one last question before making the changes.
How will the --max flag behave? Do we use it as a standalone flag or with --count?
Something along the line of: string repeat -m 100 foo?

@krader1961

This comment has been minimized.

Show comment
Hide comment
@krader1961

krader1961 Mar 4, 2017

Contributor

I think it is reasonable to allow --max without --count to mean repeat as many times as necessary to reach the max number of characters. If combined with --count then we stop repeating the string when the count is reached even if we haven't reached the maximum limit on the output. Documenting this interaction clearly will probably be the hardest part of implementing this issue.

Contributor

krader1961 commented Mar 4, 2017

I think it is reasonable to allow --max without --count to mean repeat as many times as necessary to reach the max number of characters. If combined with --count then we stop repeating the string when the count is reached even if we haven't reached the maximum limit on the output. Documenting this interaction clearly will probably be the hardest part of implementing this issue.

@Greynad Greynad referenced this issue Mar 8, 2017

Closed

String repeat #3882

@krader1961 krader1961 modified the milestones: 2.6.0, fish-future Mar 15, 2017

@krader1961

This comment has been minimized.

Show comment
Hide comment
@krader1961

krader1961 Mar 15, 2017

Contributor

I added a mention of this to the CHANGELOG.md file.

It was a pleasure working with you on this change, @Greynad. I hope to see more work like this from you.

Contributor

krader1961 commented Mar 15, 2017

I added a mention of this to the CHANGELOG.md file.

It was a pleasure working with you on this change, @Greynad. I hope to see more work like this from you.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment