StringProcessing: Add unescaped_findall
Fabian Neuschmidt committed Mar 30, 2015
1 parent 5bd7964 commit ce0e70a
Showing 2 changed files with 25 additions and 2 deletions.
17 changes: 16 additions & 1 deletion coalib/parsing/StringProcessing.py
@@ -387,4 +387,19 @@ def unescaped_finditer(string, sub, start=None, end=None):
yield position
start = position + 1
else:
raise StopIteration
raise StopIteration

@Makman2

Makman2 Mar 31, 2015

Member

Squash this change with the previous commit to correct the line ending issue.



def unescaped_findall(string, sub, start=None, end=None):
    """
    Lists all indices in the string where substring sub is found
    unescaped, such that sub is contained in the slice s[start:end].
    :param string: Arbitrary String
    :param sub: Substring of which the position is to be found
    :param start: Begin of string slice that restricts search area
    :param end: End of string slice that restricts search area
    :return: List of all positions of sub in string, independent of
             slice borders!
    """
    return list(unescaped_finditer(string, sub, start, end))
10 changes: 9 additions & 1 deletion coalib/tests/parsing/StringProcessingTest.py
@@ -11,6 +11,7 @@
from coalib.parsing.StringProcessing import unescaped_find
from coalib.parsing.StringProcessing import unescaped_rfind
from coalib.parsing.StringProcessing import unescaped_finditer
from coalib.parsing.StringProcessing import unescaped_findall


class StringProcessingTest(unittest.TestCase):
@@ -791,7 +792,14 @@ def test_unescaped_finditer(self):
        self.assertEqual(list(unescaped_finditer("acb\\cdec", "c")), [1, 7])
        self.assertEqual(list(unescaped_finditer("acb\\cde\\c", "c")), [1])

    def test_unescaped_findall(self):
        self.assertEqual(unescaped_findall("abcde", "c"), [2])
        self.assertEqual(unescaped_findall("abcde", "c", 1, 3), [2])
        self.assertEqual(unescaped_findall("abcde", "c", 3, 4), [])
        self.assertEqual(unescaped_findall("abcde", "z"), [])
        self.assertEqual(unescaped_findall("acb\\cdec", "c"), [1, 7])
        self.assertEqual(unescaped_findall("acb\\cde\\c", "c"), [1])


if __name__ == '__main__':
    unittest.main(verbosity=2)

5 comments on commit ce0e70a

@Makman2
Member

I don't think this function is needed. Users can collect the yielded values themselves with list() (and the average Python programmer should know that).
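
For illustration only, here is a minimal sketch of collecting a generator's results with list(). The generator find_unescaped is a hypothetical stand-in written for this example, not the project's unescaped_finditer; it assumes that a hit counts as escaped when an odd number of backslashes directly precedes it.

```python
def find_unescaped(string, sub, start=None, end=None):
    """Hypothetical stand-in: yield each index where sub occurs unescaped."""
    position = string.find(sub, start, end)
    while position != -1:
        # Count the backslashes directly in front of the hit.
        backslashes = 0
        while (position > backslashes
               and string[position - backslashes - 1] == "\\"):
            backslashes += 1
        # An even count means the backslashes escape each other, not the hit.
        if backslashes % 2 == 0:
            yield position
        position = string.find(sub, position + 1, end)
    # Falling off the end finishes the generator. An explicit
    # "raise StopIteration" here would become a RuntimeError on
    # Python 3.7+ (PEP 479).


# The caller decides whether to materialize the results.
positions = list(find_unescaped("acb\\cdec", "c"))
```

With this stand-in, the list-returning wrapper from the diff amounts to the single list(...) call on the last line.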

@sils
Member

@sils sils commented on ce0e70a Mar 31, 2015

The Python stdlib usually has such mini-wrappers, but I think I agree: I prefer having only iterators and converting to a list manually as needed.
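
One stdlib pair of this kind, shown here as an illustration rather than as the example the commenter had in mind, is re.finditer and re.findall. The sample text and pattern are made up for the sketch.

```python
import re

text = "a1b22c333"

# Lazy form: an iterator of match objects, consumed only as needed.
lazy = re.finditer(r"\d+", text)

# Eager form: a list of the matched strings.
eager = re.findall(r"\d+", text)

# Converting the lazy form by hand produces the same list.
manual = [match.group() for match in re.finditer(r"\d+", text)]
```

Note that the two are not exact mirrors: re.finditer yields match objects, while re.findall returns strings (or tuples of groups when the pattern contains groups).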

@fneu
Contributor

@fneu fneu commented on ce0e70a Apr 1, 2015

It's just a simplification that makes code stay out of the way so scrutinizer is happy :)

Anyway, if you both support that opinion, I will:

  • remove find, rfind and findall and stay with finditer. One can use lfind, *sth, rfind = finditer() and findall = list(finditer())
  • provide a faster position_unescaped
  • adapt to your style for tests for now
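
A small sketch of the unpacking idiom from the first bullet, using plain iterators in place of finditer(). One caveat worth noting: star unpacking with two fixed targets needs at least two items, so it raises ValueError when there are fewer than two hits. The helper first_and_last is hypothetical and only shows one way around that.

```python
def first_and_last(positions):
    """Hypothetical helper: (first, last) index of an iterable, or (-1, -1)."""
    first = last = -1
    for position in positions:
        if first == -1:
            first = position
        last = position
    return first, last


# Star unpacking works when the iterator yields at least two positions.
lfind, *middle, rfind = iter([1, 7, 9])

# With fewer than two positions it raises ValueError.
try:
    only, *rest, final = iter([4])
    unpack_failed = False
except ValueError:
    unpack_failed = True

# findall is simply the materialized iterator.
findall = list(iter([1, 7, 9]))
```

The helper returns the same index twice for a single hit and (-1, -1) for none, mirroring the -1 convention of str.find.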

@Makman2
Member

@Makman2 Makman2 commented on ce0e70a Apr 1, 2015

So you need a find method that returns indices? You can use MatchObjects; they also contain this kind of information. I think unescaped_search_for() would cover this need. If you need to yield indices, you can wrap unescaped_search_for() inside a new generator that just returns match.start() or match.end() (these functions give the matching positions).
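
A rough sketch of that wrapping idea follows. Since the signature of unescaped_search_for() is not shown on this page, the sketch uses a hypothetical stand-in, search_matches, built on re.finditer. The stand-in ignores escaping entirely; only the wrapper pattern is the point.

```python
import re


def search_matches(pattern, string):
    """Hypothetical stand-in for a search function yielding match objects."""
    return re.finditer(pattern, string)


def match_starts(matches):
    """Yield only the start index of each match object."""
    for match in matches:
        yield match.start()


def match_ends(matches):
    """Yield only the end index (exclusive) of each match object."""
    for match in matches:
        yield match.end()


starts = list(match_starts(search_matches("c", "abcdec")))
ends = list(match_ends(search_matches("c", "abcdec")))
```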

@Makman2
Member

@Makman2 Makman2 commented on ce0e70a Apr 1, 2015

I think position_unescaped (which I would rename to is_position_unescaped to expose the boolean character of this function) is not so bad. You can extend it with start/stop parameters or implement a manual look-behind to make it more efficient. It's especially useful if some passages of a string handle escapes and others do not.
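
One possible shape for such a function, sketched under assumptions: the real position_unescaped is not shown on this page, so the signature, the start parameter, and the rule that an odd run of preceding backslashes escapes a character are all guesses made for this example.

```python
def is_position_unescaped(string, position, start=0):
    """Hypothetical sketch: True if string[position] is not escaped.

    Looks behind from position, counting consecutive backslashes, and
    never looks further back than start.
    """
    backslashes = 0
    index = position - 1
    while index >= start and string[index] == "\\":
        backslashes += 1
        index -= 1
    # An even run of backslashes escapes itself, not the character.
    return backslashes % 2 == 0


sample = "acb\\cde\\\\c"  # the characters: a c b \ c d e \ \ c

plain = is_position_unescaped(sample, 1)           # nothing in front
escaped = is_position_unescaped(sample, 4)         # one backslash in front
double = is_position_unescaped(sample, 9)          # two backslashes in front
ignored = is_position_unescaped(sample, 4, start=4)  # look-behind cut off
```

The start parameter lets a caller treat everything before it as a passage that does not handle escapes, which is the mixed-passage case mentioned above.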
