How to replace with Regex that needs multiple lines #71
Hi, firstly, thanks for using unix4j. The short answer to your question: the sed expression that you are trying is not supported by unix4j's sed. In more detail:
If you still want to use unix4j to solve your problem, you will first have to get around the multi-line issue. This can, for instance, be achieved by loading the file into a string and replacing newlines with some special character sequence:

String singleLine = Unix4j.fromFile("issue.txt").toStringResult().replace("\n", "<NL>").replace("\r", "");

You can then replace the <NL> placeholders with sed:

List<String> result = Unix4j.fromString(singleLine)
    .sed("s/<NL>(\\d\\d\\d\\d)/\n$1/g")
    .sed("s/<NL>/ /g")
    .toStringList();

This will result in the following output given the sample input from above:
Of course, it is far from ideal to perform the newline replacement in memory, especially for large files; you may want to replace the newlines in the file itself with a different method, or use a different tool altogether. As I said, Unix4j is not really well suited for multi-line operations. I hope this helps anyway. (The example has been added to the git repo as a unit test.)
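If the file fits in memory anyway, one alternative is to skip the placeholder step and let java.util.regex do the multi-line match directly with a negative lookahead. The following is only a sketch, not part of unix4j: the class name NewlineJoiner, the method name joinContinuationLines and the sample text are made up for illustration, and the real input file from the question is not reproduced here.

```java
import java.util.regex.Pattern;

// Hypothetical helper, not part of unix4j.
final class NewlineJoiner {
    // Matches a line break that is NOT followed by four digits
    // and is NOT the final line break of the input.
    private static final Pattern CONTINUATION_BREAK =
            Pattern.compile("\\r?\\n(?!\\d{4}|\\z)");

    // Replaces every such line break with a single space.
    static String joinContinuationLines(String text) {
        return CONTINUATION_BREAK.matcher(text).replaceAll(" ");
    }

    public static void main(String[] args) {
        // Made-up sample: lines starting with four digits begin a new record.
        String sample = "2001 first\ncontinued\n2002 second\n";
        System.out.print(joinContinuationLines(sample));
    }
}
```

Like the placeholder approach, this holds the whole text in memory, so it has the same limitation for large files.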
Hi terzerm, thanks for your response.
Personally, I would process the file manually, always looking one line ahead, and write the output directly to a new file, for instance with BufferedReader and PrintWriter. Something like this:

// needs java.io.BufferedReader, FileReader, FileWriter and PrintWriter
try (BufferedReader reader = new BufferedReader(new FileReader("issue.txt"));
     PrintWriter writer = new PrintWriter(new FileWriter("result.txt"))) {
    final StringBuilder lineBuffer = new StringBuilder(256);
    String line;
    while ((line = reader.readLine()) != null) {
        // a line starting with 4 digits begins a new record: write out the previous one
        if (line.matches("^\\d\\d\\d\\d.*")) {
            if (lineBuffer.length() > 0) {
                writer.println(lineBuffer);
                lineBuffer.setLength(0);
            }
        }
        // join continuation lines with a single space
        lineBuffer.append(lineBuffer.length() > 0 ? " " : "").append(line);
    }
    // write out the last record
    if (lineBuffer.length() > 0) {
        writer.println(lineBuffer);
    }
    writer.flush();
}
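The loop above can also be packaged as a method that works on any Reader and Writer, which makes it easy to try out in memory before pointing it at real files. This is a minimal sketch under a few assumptions: the class name RecordJoiner, the method name joinRecords and the sample input are invented for illustration, and records are terminated with "\n" rather than the platform line separator.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.StringWriter;
import java.io.Writer;

// Hypothetical helper, not part of unix4j.
final class RecordJoiner {
    // Streams lines from 'in' and writes one joined record per line to 'out'.
    // A line starting with four digits begins a new record; all other
    // lines are appended to the current record, separated by a space.
    static void joinRecords(Reader in, Writer out) throws IOException {
        BufferedReader reader = new BufferedReader(in);
        StringBuilder record = new StringBuilder(256);
        String line;
        while ((line = reader.readLine()) != null) {
            if (line.matches("\\d{4}.*") && record.length() > 0) {
                out.write(record.toString());
                out.write('\n');
                record.setLength(0);
            }
            if (record.length() > 0) {
                record.append(' ');
            }
            record.append(line);
        }
        // Write the last record, if any.
        if (record.length() > 0) {
            out.write(record.toString());
            out.write('\n');
        }
        out.flush();
    }

    public static void main(String[] args) throws IOException {
        // Made-up sample input, processed entirely in memory.
        StringWriter result = new StringWriter();
        joinRecords(new StringReader("2001 first\ncontinued\n2002 second\n"), result);
        System.out.print(result);
    }
}
```

For real files, a FileReader and FileWriter can be passed in instead; only one record is kept in memory at a time, so this also works for large inputs.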
See SedTest.java
Hi,
I am having trouble getting the sed function to work with a regex. I am trying to remove the newline character (\n) if the following line does not start with 4 digits.
I have a file looking like this
The following command works on the commandline
But when I try to use it like this
I get the following Exception
Am I doing something wrong?
What would be the best way to get this to work?
Thank you in advance.