Original bug ID: 7024
Reporter: flindgren
Status: closed (set by @xavierleroy on 2017-02-16T14:16:33Z)
Resolution: fixed
Priority: normal
Severity: minor
Target version: 4.03.0+dev / +beta1
Fixed in version: 4.03.0+dev / +beta1
Category: otherlibs
Monitored by: @gasche
The Str library documentation states that the $ metacharacter "[m]atches at end of line (either at the end of the matched string, or just before a newline character)". However, it appears that $ only matches before LF and not before other kinds of line ending (say, CRLF). The documentation is not consistent with the observed behaviour.
Steps to reproduce
From an OCaml toplevel:
let stringlf = "test\n";;
let stringcrlf = "test\r\n";;
let regexp = (Str.regexp "test$");;
Str.string_match regexp stringlf 0;; (* - : bool = true *)
Str.string_match regexp stringcrlf 0;; (* - : bool = false *)
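A possible workaround for the case above, sketched under the assumption that the input may end lines with either LF or CRLF, is to allow an optional carriage return before $. The name regexp_crlf is hypothetical, and the str library has to be loaded first (for example with #load "str.cma";; in the toplevel).

```ocaml
(* Requires the str library, e.g. #load "str.cma";; in the toplevel. *)
let stringlf = "test\n"
let stringcrlf = "test\r\n"

(* Hypothetical workaround: accept an optional CR just before the
   end of line, so that both LF and CRLF endings are matched. *)
let regexp_crlf = Str.regexp "test\r?$"

let matches_lf = Str.string_match regexp_crlf stringlf 0
let matches_crlf = Str.string_match regexp_crlf stringcrlf 0
```

This leaves the behaviour of $ itself unchanged; it only makes the pattern tolerant of a trailing CR.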
The standard library input functions will (when the file is being read in text rather than binary mode) translate \r\n into \n at reading time under Windows. This means that you should not manipulate strings containing \r\n in the OCaml world, and in particular that Str can assume that lines end with \n.
Do you have a particular reason for manipulating raw strings that have not been read in O_TEXT mode?
The program reads many files through the Unix module, which does not seem to support text mode. Some text files may embed binary data and cannot be read in translating modes.
But regardless of whether my use case is valid, is this LF-only behaviour documented somewhere? The documentation of Str refers only generally to line endings and newlines, without specifying which kind they must be. Is it documented elsewhere?
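For data read through the Unix module, one option is to do by hand the translation that text-mode input would have done. The following is only a minimal sketch using the standard library; normalize_crlf is a hypothetical helper, and it should be applied only to the portions of a file known to be text, since rewriting embedded binary data would corrupt it.

```ocaml
(* Hypothetical helper: rewrite every CR LF pair as a single LF,
   leaving lone CR characters untouched. *)
let normalize_crlf (s : string) : string =
  let n = String.length s in
  let buf = Buffer.create n in
  String.iteri
    (fun i c ->
      (* Drop a CR only when it is immediately followed by LF. *)
      if not (c = '\r' && i + 1 < n && s.[i + 1] = '\n') then
        Buffer.add_char buf c)
    s;
  Buffer.contents buf
```

After normalization, a pattern such as "test$" can be matched against the result as if the text had been read in text mode.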