intelligent slice functionality for strings #413

Closed
aman-godara opened this issue May 22, 2021 · 8 comments · Fixed by #414
Labels
idea Proposition of an idea and opening an issue to discuss it

Comments

@aman-godara
Member

aman-godara commented May 22, 2021

Description

Name of functionality: Slice
Signature: slice(string, start(optional)= 1 or len(string), end(optional)=len(string) or 1, stride(optional)= 1, include_start(optional)= .true.)
Output: a new string_type object

Traverses the input string from index start to index end in steps of stride and returns the visited characters as a new string. start may be greater than end, which also lets the function reverse the input string. The start index is always included in the output unless include_start is set to .false., in which case the end index is included instead. So either the start index or the end index will be included in the output string (or both).
The function is an intelligent one: if the user omits one or more of the three optional arguments (start, end, stride), it infers them from the optional arguments that are given (see examples).
Arguments that the user does provide are taken as given, and the user is responsible for them (see example 3).

Examples:

  1. slice('12345', stride=-1) should give '54321'; start = 5, end = 1
  2. slice('abcd', stride=-2, include_start=.false.) should give 'ca'; start = 4, end = 1
  3. slice('abcde', 5, 2, 1) will give '' (empty string); user gave 1 as the stride argument
  4. slice('abcde', 5, 2) will give 'edcb'; stride = -1
  5. slice('abcde', end = 1, stride = -2) will give 'eca'; start = 5
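
The five examples can be reproduced with a small model of the proposed semantics. This is only a sketch in Python, not the code from #414: the name slice_str is hypothetical, the rule for inferring stride when it is omitted is an assumption read off example 4, and skipping positions outside 1..len(string) is an assumption the proposal does not settle.

```python
def slice_str(string, start=None, end=None, stride=None, include_start=True):
    """Sketch of the proposed slice: 1-based positions, start and end inclusive."""
    n = len(string)
    if stride == 0:
        raise ValueError("stride must be nonzero")
    if stride is None:
        # Assumption: go backwards only when both bounds are given and start > end.
        backwards = start is not None and end is not None and start > end
        stride = -1 if backwards else 1
    # Missing bounds follow the direction of travel.
    if start is None:
        start = 1 if stride > 0 else n
    if end is None:
        end = n if stride > 0 else 1
    step = 1 if stride > 0 else -1
    if include_start:
        # Anchor the walk at start and move towards end.
        positions = list(range(start, end + step, stride))
    else:
        # Anchor the walk at end, then restore the start-to-end order.
        positions = list(range(end, start - step, -stride))[::-1]
    # Assumption: positions outside 1..n are skipped rather than rejected.
    return "".join(string[i - 1] for i in positions if 1 <= i <= n)


print(slice_str("12345", stride=-1))                      # 54321
print(slice_str("abcd", stride=-2, include_start=False))  # ca
print(repr(slice_str("abcde", 5, 2, 1)))                  # ''
print(slice_str("abcde", 5, 2))                           # edcb
print(slice_str("abcde", end=1, stride=-2))               # eca
```

Note that example 3 stays empty because the explicit stride of 1 is respected even though start is greater than end.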

Prior Art

Python has it, but not exactly in the manner proposed above. In Python one can do

s = 'spiderman'
print(s[1:9:2])

which will give 'pdra'.
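
To make the difference from the proposal concrete: Python slices are 0-based with an exclusive stop, while the proposed function uses 1-based positions with both ends inclusive. The helper below is a hypothetical converter (the name to_python_slice is not from the proposal) and assumes start and end both lie within 1..len(string).

```python
def to_python_slice(start, end, stride=1):
    """Map 1-based inclusive (start, end, stride) onto a Python slice object."""
    if stride > 0:
        # 1-based inclusive end equals the 0-based exclusive stop.
        return slice(start - 1, end, stride)
    # Going backwards, the exclusive stop sits one position before end.
    stop = end - 2
    # A stop of -1 would be read as "last character", so use None instead.
    return slice(start - 1, stop if stop >= 0 else None, stride)


text = "spiderman"
print(text[to_python_slice(2, 9, 2)])        # pdra, same as text[1:9:2]
print("abcde"[to_python_slice(5, 2, -1)])    # edcb
print("12345"[to_python_slice(5, 1, -1)])    # 54321
```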

Please review the functionality and let me know your thoughts on this.

@aman-godara aman-godara added the idea Proposition of an idea and opening an issue to discuss it label May 22, 2021
@aman-godara
Member Author

aman-godara commented May 22, 2021

Another possibility would be a flexible stride argument: instead of asking the user for the exact (signed) value of stride, the user would be asked for its absolute value only, and the sign would follow from start and end.
Taking example 3 from above:
slice('abcde', 5, 2, 1) would give 'edcb'; stride is converted from 1 to -1.
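
The sign adjustment in this second possibility could look like the following sketch. The function name flexible_stride is hypothetical, and treating start equal to end as a forward walk is an assumption.

```python
def flexible_stride(start, end, stride):
    """Keep only the magnitude of stride and take its sign from start and end."""
    magnitude = abs(stride)
    # Assumption: start == end counts as a forward walk.
    return magnitude if start <= end else -magnitude


print(flexible_stride(5, 2, 1))   # -1, as in the reworked example 3
print(flexible_stride(1, 4, 2))   # 2
```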

@ivan-pi
Member

ivan-pi commented May 23, 2021

There is an extract function proposed in #406 with functionality similar to slice. I'm not sure whether the original proposal included an optional stride argument.

In any case I would support this. It might be a good idea to overload this for the intrinsic character type, even if the intrinsic slice syntax exists.

@aman-godara
Member Author

I prefer the first possibility over the second. A user who wants the second behavior can get it by adding an if/else condition before passing arguments to the code written in #414. But if we implement slice the second way, a user who wants the first behavior won't be able to use the slice function.

@Beliavsky

Can the slice function be ELEMENTAL? It appears to me that it can, since its arguments are scalars and it returns a scalar.

@awvwgk
Member

awvwgk commented May 24, 2021

@Beliavsky Consider this example:

print *, slice("abcdef", 1, [1, 2, 3, 4, 5, 6], 1)

The resulting scalar character values cannot be stored in one array because they have different lengths.
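
The length mismatch is easy to see by computing the six results separately. The sketch below is in Python, where a list can hold strings of different lengths; a Fortran character array cannot, since all of its elements share one length.

```python
text = "abcdef"

# One result per end value, mirroring slice("abcdef", 1, [1, 2, 3, 4, 5, 6], 1).
results = [text[0:end] for end in range(1, 7)]
lengths = [len(r) for r in results]

print(results)  # ['a', 'ab', 'abc', 'abcd', 'abcde', 'abcdef']
print(lengths)  # [1, 2, 3, 4, 5, 6]
```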

@Carltoffel
Member

Since this approach doesn't use Fortran indices but function arguments instead, we could also allow negative start/end values which will count from the end. There are two ways I can think of:

  1. do it exactly like in Python: -i would then be a shortcut for len(string)-i
    slice('abcd', end = -1) returns abc
  2. or keep the Fortran standard of indices starting at 1, but counted from the end: -2 is the index of the letter 'c' in 'abcd'
    slice('abcd', end = -2) returns abc
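
The two conventions differ by one position, as the following sketch shows. Both function names are hypothetical, and the results are treated as 1-based inclusive end positions, as in the proposal.

```python
def resolve_python_style(index, length):
    """Option 1: a negative index -i stands for length - i."""
    return length + index if index < 0 else index


def resolve_from_end(index, length):
    """Option 2: -1 is the last character, -2 the one before it, and so on."""
    return length + index + 1 if index < 0 else index


text = "abcd"
end1 = resolve_python_style(-1, len(text))  # position 3
end2 = resolve_from_end(-2, len(text))      # position 3

# A 1-based inclusive end equals Python's exclusive stop, so text[:end] applies.
print(text[:end1], text[:end2])  # abc abc
```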

@ivan-pi
Member

ivan-pi commented May 27, 2021

Here is the description of the extract function proposed in section 3.7.1 of the iso_varying_strings document:

3.7.1 EXTRACT (string [, start, finish])
Description. Extracts a specified substring from a string.
Class. Elemental function.
Arguments.
string shall be either of type VARYING_STRING or type default CHARACTER
start (optional) shall be of type default INTEGER.
finish (optional) shall be of type default INTEGER.
Result Characteristics. Of type VARYING_STRING.
Result Value. The result value is a copy of the characters of the argument string between positions start and finish, inclusive. If start is absent or less than one, the value one is used for start. If finish is absent or greater than LEN(string), the value LEN(string) is used for finish. If finish is less than start, the result is a zero-length string.
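
The clamping rules in the Result Value paragraph can be modelled as below. This is a Python sketch of the quoted description only, not the implementation proposed in #406; positions are 1-based and inclusive.

```python
def extract(string, start=None, finish=None):
    """Model of EXTRACT: copy the characters from start to finish, inclusive."""
    n = len(string)
    # An absent or too-small start is replaced by 1.
    if start is None or start < 1:
        start = 1
    # An absent or too-large finish is replaced by len(string).
    if finish is None or finish > n:
        finish = n
    # A finish before start yields a zero-length string.
    if finish < start:
        return ""
    return string[start - 1:finish]


print(extract("abcdef", 2, 4))    # bcd
print(extract("abc", 0, 10))      # abc
print(repr(extract("abc", 3, 2))) # ''
```

Unlike the slice proposal, out-of-range bounds are clamped and there is no stride, so the result is never reversed.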

@ivan-pi
Member

ivan-pi commented May 27, 2021

> Since this approach doesn't use Fortran indices but function arguments instead, we could also allow negative start/end values which will count from the end. There are two ways I can think of:
>
> 1. do it exactly like in python, `-i` would then be a shortcut for `len(string)-i`
>    `slice('abcd', end = -1)` returns `abc`
>
> 2. or keep the Fortran standard of indices starting at 1, but counted from the end: `-2` is the index of the letter 'c' in 'abcd'
>    `slice('abcd', end = -2)` returns `abc`

@Carltoffel, there is some discussion of the different indexing systems in #311 (comment). I'm not sure whether it is applicable here.

5 participants