feat: add String.splitOn_of_valid
#495
But why? You say that Checking the separator for emptiness in |
* `List.rdropSublist` drops a sublist from the tail end of a list.
* `List.splitOnSublist` splits a list at every occurrence of a separator sublist. The separators are not in the result.
State and prove two theorems about `String`: `splitOnAux_of_valid` and `splitOn_of_valid`.
Force-pushed from fbbcb46 to 9b42bad.
Looking at the definitions of |
@digama0 There are some errors in
#eval "".splitOn "_" == "".splitOn' "_" -- output: true
#eval "My<>life<>for<>Aiur!".splitOn "<>" == "My<>life<>for<>Aiur!".splitOn' "<>" -- true
#eval "BABABA".splitOn "A" -- ["B", "B", "B", ""]
#eval "BABABA".splitOn' "A" -- ["B", "B", "BA"] |
Why not use the KMP matcher at Std.Data.Array.Match? |
@chabulhwi Based on your example, it seems like As for using the KMP matcher, I was thinking of making the same suggestion but this is a definition which is targeting core, which does not have the KMP matcher. We may consider a |
You're right, but I think I can fix this error. |
Why not just use and prove properties of the existing function? |
I think the current definition of
theorem splitOnAux_of_valid (sep₁ sep₂ l m r acc) :
    splitOnAux ⟨l++m++sep₁++r⟩ ⟨sep₁++sep₂⟩ ⟨utf8Len l⟩ ⟨utf8Len (l++m++sep₁)⟩ ⟨utf8Len sep₁⟩ acc =
      acc.reverse++(List.splitOnList.go r (sep₁++sep₂) sep₂ (m++sep₁).reverse).map mk := by
  sorry
theorem splitOn_of_valid (s sep) : splitOn s sep = (List.splitOnList s.1 sep.1).map mk := by
  simpa [splitOn] using splitOnAux_of_valid [] sep.1 [] [] s.1 []
When proving Of course, we should make sure |
In order to prove When
#eval "AB".splitOnAux "C" ⟨0⟩ ⟨0⟩ ⟨1⟩ []
#eval "AB".splitOnAux "C" ⟨1⟩ ⟨1⟩ ⟨0⟩ [""]
#eval "AB".splitOnAux "C" ⟨1⟩ ⟨2⟩ ⟨0⟩ [""]
#eval ("B"::[""]).reverse
#eval ["", "B"]
#eval "DB".splitOnAux "C" ⟨0⟩ ⟨0⟩ ⟨1⟩ []
#eval "DB".splitOnAux "C" ⟨0⟩ ⟨1⟩ ⟨0⟩ []
#eval "DB".splitOnAux "C" ⟨0⟩ ⟨2⟩ ⟨0⟩ []
#eval ("DB"::[]).reverse
#eval ["DB"]
The weird behavior of |
Like I said earlier, The whole reason that |
Adding |
* `Nat.le` and `List.append`.
* `List`:
  * `List.rdropSublist` drops a sublist from the tail end of a list.
  * `List.splitOnSublist` splits a list at every occurrence of a separator sublist. The separators are not in the result.
  * `simp` lemmas for `List.rdropSublist`.
* `String`: `splitOnAux_of_valid` and `splitOn_of_valid`.

Before this pull request gets merged, I need to replace `String.splitOnAux` and `String.splitOn` in Lean core with the modified versions I made:

The function `splitOnAux'` differs from its original version in two ways. First, it checks whether the separator (`sep`) is empty. Second, it checks whether `sep.atEnd j` before it checks whether `s.get i == sep.get j`. For these reasons, the outputs of the original `splitOnAux` and its modified version are different for some inputs, as shown in the `#eval` commands below:

I'll open an issue on Lean 4 GitHub issues after I get feedback about `splitOnAux'` and `splitOn'`.
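
The modified definitions themselves did not survive on this page, so the following is only a rough sketch of what a function with the two described changes could look like. It is not the code from this pull request. The bodies of `splitOnAux'` and `splitOn'`, the `Sketch` namespace, and the handling of the end-of-input case are all assumptions made for illustration. It is written against the `String.Pos` API of Lean 4 core around the time of this discussion and may need adjusting on other toolchains. It uses `partial def` to sidestep the termination proof, which also means it cannot be used as-is for theorems like `splitOn_of_valid`.

```lean
namespace Sketch

/-- Hypothetical variant of `String.splitOnAux`; NOT the code from this pull request.
`b` is the start of the current piece, `i` the read position in `s`,
`j` the read position in `sep`, and `r` the pieces found so far (in reverse order). -/
partial def splitOnAux' (s sep : String) (b i j : String.Pos) (r : List String) : List String :=
  if sep == "" then
    -- Change 1: an empty separator never splits anything.
    [s]
  else if s.atEnd i then
    -- End of input. If the separator was just completed, close the current
    -- piece before it and add a trailing empty piece; otherwise keep the rest.
    let r := if sep.atEnd j then "" :: s.extract b (i - j) :: r else s.extract b i :: r
    r.reverse
  else if sep.atEnd j then
    -- Change 2: test `sep.atEnd j` before comparing any characters.
    splitOnAux' s sep i i 0 (s.extract b (i - j) :: r)
  else if s.get i == sep.get j then
    splitOnAux' s sep b (s.next i) (sep.next j) r
  else
    -- Mismatch: restart the separator. Like the original, this does not backtrack in `s`.
    splitOnAux' s sep b (s.next i) 0 r

/-- Hypothetical wrapper mirroring `String.splitOn`. -/
def splitOn' (s sep : String) : List String :=
  splitOnAux' s sep 0 0 0 []

end Sketch

-- Compare the sketch with the core function on an input from the discussion above.
#eval "BABABA".splitOn "A"
#eval Sketch.splitOn' "BABABA" "A"
```

Testing `sep.atEnd j` first means `sep.get j` is only ever called at a position inside the separator, which is the kind of invariant a proof about valid positions would want to rely on.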