This starts the process of expanding Blaze's string column support to include `upper` and `lower`. First-class (and optimized) support for common string operations helps with the string-munging pain points that users hit.

I'm marking these as part of the "experimental" API for now, since I'm not wild about the `str_upper`, `str_lower`, etc. naming scheme. I'd like to find a better naming system for these if we can.

We have an immediate need for `upper` and `lower`, so I'm putting these in for 0.10.

Regarding naming schemes: I like the Pandas style, `df.col.str.upper().str.replace(...)`. We could expand that to include `df.col.dt.datetimemethod()` for datetimes as well.

We'll have to think about which string and datetime methods we want to support, for which backends, and what the semantics are when the method in question returns multiple values. All of that is outside the scope of this PR.
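To make the namespace idea concrete, here is a rough sketch of how a `.str` accessor could hang string methods off a column so that calls chain the way the Pandas example does. `Column` and `StringMethods` are hypothetical names invented for illustration. This is not Blaze's expression system, and it evaluates eagerly on plain Python lists, whereas a real implementation would presumably build lazy expressions.

```python
class StringMethods:
    """Hypothetical namespace that groups string operations for a column."""

    def __init__(self, values):
        self._values = values

    def upper(self):
        return Column([v.upper() for v in self._values])

    def lower(self):
        return Column([v.lower() for v in self._values])

    def replace(self, old, new):
        return Column([v.replace(old, new) for v in self._values])

    def len(self):
        # Inside a method body, `len` still resolves to the builtin.
        return Column([len(v) for v in self._values])


class Column:
    """Hypothetical stand-in for a string column (not a Blaze type)."""

    def __init__(self, values):
        self.values = list(values)

    @property
    def str(self):
        # Each access returns a fresh accessor bound to this column's data.
        return StringMethods(self.values)


col = Column(["Alice", "Bob"])
result = col.str.upper().str.replace("B", "8")
print(result.values)
```

Because every method returns a new `Column`, the next `.str` lookup works without any extra plumbing, and a `.dt` accessor for datetimes could follow the same pattern.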
This PR also deprecates `strlen` and adds `str_len` for consistency.