You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
Parsers can be very slow when the | is involved. I tried to write a parser for paths in the following form: /?foo/bar/baz/(**)?/filename that would extract to (Option[Path], Boolean, String). It looked like this:
private[impl] val globParser: Parser[(Option[Path], Boolean, String)] = {
val sep = ('/': Parser[Char]) | ('\\': Parser[Char])
val star: Parser[Char] = '*'
val notSpecial = charClass(c => c != '/' && c != '\\' && c != '*', "not meta")
val pathParts = matched(notSpecial.* ~ sep.? ~ (notSpecial.+ <~ sep).*).?
val recursivePart = sep.? ~> (star.+ <~ sep).?
(OptSpace ~> pathParts ~ recursivePart ~ matched(not(sep, "").+)).map {
case ((prefix, Some(Seq('*', '*'))), filename) => (prefix.map(Paths.get(_)), true, filename)
case ((prefix, _), filename) => (prefix.map(Paths.get(_)), false, filename)
}
}
The parser worked fine for short paths, but the performance was really terrible as the number of path prefixes grew.
One thing that I noticed is that the map is applied eagerly. Naively, I assumed that map would only be applied after all of the bytes had been parsed. With this assumption, I tried splitting the parser into something like pathParts ~ matched(.*).flatMap { (prefix, rest) => ... } but it behaved more or less exactly the same because the flatMap was immediately applied.
I think the Parsers are more or less working as designed, but the design makes it somewhat difficult to write even modestly complicated parsers with reasonable performance (see #4890).
The text was updated successfully, but these errors were encountered:
This is more of an observation than an issue:
Parsers can be very slow when the
|
is involved. I tried to write a parser for paths in the following form:/?foo/bar/baz/(**)?/filename
that would extract to(Option[Path], Boolean, String)
. It looked like this:The parser worked fine for short paths, but the performance was really terrible as the number of path prefixes grew.
One thing that I noticed is that the
map
is applied eagerly. Naively, I assumed that map would only be applied after all of the bytes had been parsed. With this assumption, I tried splitting the parser into something likepathParts ~ matched(.*).flatMap { (prefix, rest) => ... }
but it behaved more or less exactly the same because theflatMap
was immediately applied.I think the Parsers are more or less working as designed, but the design makes it somewhat difficult to write even modestly complicated parsers with reasonable performance (see #4890).
The text was updated successfully, but these errors were encountered: