Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Remove get() and unsafe_get() from String #57

Merged
merged 6 commits into from
Oct 17, 2018
Merged

Remove get() and unsafe_get() from String #57

merged 6 commits into from
Oct 17, 2018

Conversation

aorenste
Copy link
Contributor

Lots of code was abusing String::get() (and String::unsafe_get()) ending up in O(n^2) behavior. Removed String::get() and modified the String API to work better with StringIterator.

@facebook-github-bot facebook-github-bot added the CLA Signed Do not delete this pull request or issue due to inactivity. label Oct 14, 2018
@josephsavona
Copy link
Contributor

Love the direction - obviously :-) - will review a bit later

Copy link
Contributor

@josephsavona josephsavona left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There's a lot here, this is just first pass. Will look more tomorrow.

@@ -282,9 +342,8 @@ fun testSubstringLarge(): void {
@test
fun testTrim(): void {
assertEqual("".trim(), "");
assertEqual("a".trim(), "a");
assertEqual("a".trimRight(), "a");
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is already tested in testTrimRight, why the change here?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Accidental debug stuff left it - I'll remove it

this
}

mutable fun takeString(n: Int = Int::max): String {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This couples advancing the iterator with collecting its results. How about replacing this with mutable fun collectString(): String, and replacing usage of takeString(n) with take(n).collectString()?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure - that's really just renaming this to collectString() and removing the parameter.

String::fromChars(Array::createFromIterator(this.take(n)));
}

mutable fun takeUntil(end: readonly StringIterator): String {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The name is similar to takeWhile, so I would have guessed that it was "take until the predicate returns true" (inversely, take while the predicate returns false). The closest operation I can think of for this is slice(). I'm not sure about that name either, but takeUntil definitely feels confusing.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure - it's basically slice() with no "start" (since that makes less sense for an iterator). If you want I can rename to "slice"

true
}

mutable fun search(pattern: String): Bool {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's call this contains(), I think most people's intuition about that name is more likely to be correct re behavior and return type.

Note that calling it "contains" does slightly differ from the behavior of "contains()" for Sequences - there, it's whether the predicate value appears as an item in the sequence - but I think Strings and Sequences are different enough that we should break with consistency. It's always possible to view a String as Sequence of chars, for example.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Calling search() contains() feels weird to me - search() implies that it's moving the iterator whereas contains() implies to me that it's not (when it really is).

}
}

mutable fun rsearch(pattern: String): Bool {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"containsRight()". Weird name, weird operation. What is this for?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is so you can find the right-most occurrence of something:
"a/b/c/d".getEndIter().rsearch("/") returns an iterator pointing at the last "/".

}
}

mutable fun rfind(p: Char -> Bool): ?Char {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For consistency the r-prefixed methods should be -Right, so findRight().

// Careful, this is O(n) (because of utf8)
@cpp_runtime
native private fun unsafe_get(x: Int): Char;
static fun fromGenerator(gen: () -> mutable Iterator<Char>): String {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ideally we could prevent needing more and more "fromFoo()" type constructor methods. For this case, it seems like fromIterator() would be sufficient - it works with iterators, and if you have a generator function you just have to call it first to get an iterator. The only reason I can see to add this exact variation is that we don't have list comprehension style syntax. I'd like something like

String::fromIterator(...while (cond) yield blah)

cc @mjhostet @tnowacki

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This was just handy because a lot of the other functions were in the form:

gen: () -> mutable Iterator<Char> = () -> {
   ... stuff ...
};
String::fromChars(Array::fromIterator(gen()))

If you dislike it I can make the functions use the pattern directly...

this.searchIndex(offset + 1, f)
}
// Finds the given pattern in the string, starting from 'from'
fun search(f: Char -> Bool): ?mutable StringIterator {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ugh didn't realize we (still) had this method instead of renaming to "contains()". Disregard my note about this method on StringIterator, we can rename both in a follow-up.

src/native/IR.sk Outdated
@@ -144,7 +144,8 @@ base class UTF8String{
backEndUsesUtf8 = ("\u1234".length() != 1);

utf8 = if (backEndUsesUtf8) {
Array::fillBy(string.length(), i -> string[i].code())
i = string.getIter();
Array::fillBy(string.length(), _ -> i.next().fromSome().code())
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can we use String::utf8 for this now?

this.kind match {
| LineComment() -> this.value.sub(2, this.value.length() - 2)
| DelimitedComment() -> this.value.sub(2, this.value.length() - 4)
| LineComment() -> v.getIter().forward(2).takeString()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This pattern shows up a bunch now: get iterator, advance forward by N chars, then "take" either to the end or to some end iterator (that is itself created by getting the end iterator and reversing M steps). Seems like we should make slicing from a start to negative-end offset easier.

.getIter()
.forward(3)
.takeUntil(comment.value.getEndIter().rewind(1)),
)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

same pattern here

Copy link
Collaborator

@pikatchu pikatchu left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks Aaron!

@pikatchu pikatchu merged commit 3bfdfc5 into skiplang:master Oct 17, 2018
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
CLA Signed Do not delete this pull request or issue due to inactivity.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants