performance of csvwriter.processRecord() with conversions #75
Comments
Thanks! Can you give a sample of code + input that simulates the problem so I can be sure I'm testing using a valid scenario? |
this nearly drove me crazy now... |
By the way, I have written a little Conversion that may be useful for others as well. Feel free to add it to the built-in Conversions in a future release if you think it's worth it.
import com.univocity.parsers.conversions.TrimConversion;
public class TrimToLengthConversion extends TrimConversion {
private final int length;
/**
* Constructor
*
* @param length
* the maximum length the value is allowed to have
*/
public TrimToLengthConversion(int length) {
this.length = length;
}
/**
* cuts off from the end of the string if it doesn't fit into the required length after trim()
*
* @see com.univocity.parsers.conversions.TrimConversion#execute(java.lang.String)
*/
@Override
public String execute(String input) {
String value = super.execute(input);
// guard against a null result (e.g. for null input) before measuring the length
if (value != null && value.length() > length) {
value = value.substring(0, length).trim();
}
return value;
}
}
and some unit tests (JUnit :-)):
import static org.junit.Assert.assertEquals;
import org.junit.Test;
public class TrimToLengthConversionTest {
private static final String TEST_INPUT17 = "Dies ist ein Text";
private static final String TEST_INPUT18 = "Dies ist ein Text ";
private static final String TEST_INPUT34 = " Dies ist ein Text mit 34 Zeichen. ";
private static final String EXPECTED_O8 = "Dies ist";
private static final String EXPECTED_17 = "Dies ist ein Text";
@Test
public void testExecuteString1() {
int length = 8;
TrimToLengthConversion conv = new TrimToLengthConversion(length);
String output = conv.execute(TEST_INPUT17);
assertEquals(EXPECTED_O8, output);
assertEquals(length, output.length());
output = conv.execute(TEST_INPUT34);
assertEquals(EXPECTED_O8, output);
assertEquals(length, output.length());
}
@Test
public void testExecuteString2() {
TrimToLengthConversion conv = new TrimToLengthConversion(17);
String output = conv.execute(TEST_INPUT17);
assertEquals("Input has been cut off, although the required maximum length hasn't been exceeded", TEST_INPUT17, output);
output = conv.execute(TEST_INPUT34);
assertEquals(EXPECTED_17, output); // test the input gets trim()ed first and checked for cut off afterwards
}
@Test
public void testExecuteString3() {
TrimToLengthConversion conv = new TrimToLengthConversion(18);
String output = conv.execute(TEST_INPUT34);
assertEquals(EXPECTED_17, output); // first trim() input, then cut off and finally should be trim()ed again to remove whitespaces now at the end. the result is shorter than the specified maximum length of 18 now...
output = conv.execute(TEST_INPUT18);
assertEquals("Input didn't get trim()ed (without exceeding maximum length)", TEST_INPUT17, output);
}
} |
Thank you for the suggestion! I'll add the option to trim to a given length in the TrimConversion itself and also update the annotation class to take the maximum length as an optional argument. |
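To make the idea of an optional maximum length on an annotation concrete, here is a minimal, self-contained sketch. It is not the library's code: the annotation `MaxTrim`, the `Row` class and the `applyTrims` helper are hypothetical names invented for illustration, and the convention that a negative length means "no limit" is an assumption.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.Field;

final class TrimAnnotationSketch {
    private TrimAnnotationSketch() {}

    /** Hypothetical annotation: a negative length is assumed to mean "trim only, no limit". */
    @Retention(RetentionPolicy.RUNTIME)
    @Target(ElementType.FIELD)
    @interface MaxTrim {
        int length() default -1;
    }

    /** Hypothetical record type used only to demonstrate the annotation. */
    static final class Row {
        @MaxTrim(length = 8) String title = "  Dies ist ein Text  ";
        @MaxTrim String note = "  ok  ";
        String raw = "  untouched  ";
    }

    /** Trims every annotated String field and cuts it to the optional maximum length. */
    static void applyTrims(Object bean) {
        for (Field field : bean.getClass().getDeclaredFields()) {
            MaxTrim trim = field.getAnnotation(MaxTrim.class);
            if (trim == null || field.getType() != String.class) {
                continue; // not annotated, or not a String
            }
            try {
                field.setAccessible(true);
                String value = (String) field.get(bean);
                if (value == null) {
                    continue;
                }
                value = value.trim();
                if (trim.length() >= 0 && value.length() > trim.length()) {
                    // cut, then trim again to drop whitespace exposed by the cut
                    value = value.substring(0, trim.length()).trim();
                }
                field.set(bean, value);
            } catch (IllegalAccessException e) {
                throw new IllegalStateException("Cannot access field " + field.getName(), e);
            }
        }
    }

    public static void main(String[] args) {
        Row row = new Row();
        applyTrims(row);
        System.out.println("[" + row.title + "] [" + row.note + "] [" + row.raw + "]");
    }
}
```

Keeping the length optional with a default means existing uses of the annotation would keep their current behaviour and only opt in to truncation when a length is given.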
Adding trim-to-length support in Conversions and annotation.
Added support for trim to length. Now the existing Available on version 2.1.0-SNAPSHOT |
Trim to length has been included in version 2.0.2 already. No need to wait for 2.1.0 as this was a minimal change. |
Hi, |
…ving whitespaces at end of truncated value.
Thanks for that! I've just fixed this to behave as it should and released a 2.5.5-SNAPSHOT build to include this adjustment. |
Hi - I commented on the commit itself, but I'm cross-posting here just in case. The change seems to have broken TrimConversion for empty (0-length) strings. We are getting the following when writing out a file with empty strings in fields that have the @Trim annotation applied: "causeClass": "java.lang.StringIndexOutOfBoundsException", |
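For readers wondering how an empty string can trip up a trim-to-length routine, the following is a guess at the failure mode, not the library's actual code. A loop that walks backwards over trailing whitespace with `value.charAt(end - 1)` throws `StringIndexOutOfBoundsException` on a 0-length value unless it first checks `end > 0`. The class and method names below are hypothetical; the sketch only shows the guarded variant.

```java
final class TrimToLengthSketch {
    private TrimToLengthSketch() {}

    /**
     * Trims the input, cuts it to at most maxLength characters, then removes
     * any whitespace left at the end by the cut. Null input yields null.
     */
    static String trimToLength(String input, int maxLength) {
        if (input == null) {
            return null;
        }
        if (maxLength < 0) {
            throw new IllegalArgumentException("maxLength must not be negative");
        }
        String value = input.trim();
        if (value.length() > maxLength) {
            value = value.substring(0, maxLength);
        }
        int end = value.length();
        // The "end > 0" check is the guard: without it, charAt(-1) is reached
        // for an empty value and StringIndexOutOfBoundsException is thrown.
        while (end > 0 && value.charAt(end - 1) <= ' ') {
            end--;
        }
        return value.substring(0, end);
    }

    public static void main(String[] args) {
        System.out.println("[" + trimToLength("", 5) + "]");
        System.out.println("[" + trimToLength("   ", 5) + "]");
        System.out.println("[" + trimToLength(" Dies ist ein Text mit 34 Zeichen. ", 18) + "]");
    }
}
```

A unit test for the empty string and for a whitespace-only string, next to the tests posted earlier in this thread, would catch this class of regression.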
Thanks @Avksoft I've fixed and released a 2.5.6-SNAPSHOT build |
Hi
I've finished an implementation where I use CsvWriter (the brand new final version 2.0 of your great lib) with an OutputValueSwitch and then write out rows by providing them as a Map to the method processRecord().
One row type of the switch has 57 columns, and 3 of them apply a TrimConversion. The Map I'm passing in has 41 entries, so some columns remain empty. processRecord() takes approximately 1700 ms to write one row, which makes the whole job very long-running.
A second row type of the switch has 16 columns without any conversions; the input map is nearly the same as for the former row type (so it holds many more values than this row type needs). This second row type takes at most one or two milliseconds.
If I comment out the conversions for the first row type, it is ultrafast as well. Therefore I think the problem is probably not specific to processing a Map or to using an OutputValueSwitch, but I have described the whole use case in case it matters...
If you need any further details just let me know
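A small timing harness can help turn the numbers above into a reproducible comparison. The sketch below is self-contained and does not call the library: `formatRow` is a hypothetical stand-in for the real row-writing call, and the column names (`col0`, `col1`, ...) are made up. It only shows how to warm up, time many rows, and report an average per row for a "with trim" and a "without trim" variant.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.Consumer;

final class RowTimingSketch {
    private RowTimingSketch() {}

    /** Builds a row map with the given number of entries (hypothetical column names). */
    static Map<String, Object> sampleRow(int entries) {
        Map<String, Object> row = new LinkedHashMap<>();
        for (int i = 0; i < entries; i++) {
            row.put("col" + i, "  value " + i + "  ");
        }
        return row;
    }

    /** Stand-in for a row writer: joins the first `columns` values, optionally trimming them. */
    static String formatRow(Map<String, Object> row, int columns, boolean trim) {
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < columns; i++) {
            if (i > 0) {
                out.append(',');
            }
            Object value = row.get("col" + i);
            if (value != null) { // columns missing from the map stay empty
                out.append(trim ? value.toString().trim() : value.toString());
            }
        }
        return out.toString();
    }

    /** Runs warm-up calls, then returns the average nanoseconds per measured call. */
    static long averageNanosPerRow(Consumer<Map<String, Object>> rowWriter,
                                   Map<String, Object> row, int warmupRows, int measuredRows) {
        if (measuredRows <= 0) {
            throw new IllegalArgumentException("measuredRows must be positive");
        }
        for (int i = 0; i < warmupRows; i++) {
            rowWriter.accept(row);
        }
        long start = System.nanoTime();
        for (int i = 0; i < measuredRows; i++) {
            rowWriter.accept(row);
        }
        return (System.nanoTime() - start) / measuredRows;
    }

    public static void main(String[] args) {
        Map<String, Object> row = sampleRow(41); // 41 entries, as in the report above
        long withTrim = averageNanosPerRow(r -> formatRow(r, 57, true), row, 1_000, 10_000);
        long withoutTrim = averageNanosPerRow(r -> formatRow(r, 57, false), row, 1_000, 10_000);
        System.out.println("with trim:    " + withTrim + " ns/row");
        System.out.println("without trim: " + withoutTrim + " ns/row");
    }
}
```

To measure the real scenario, the stand-in lambda would be replaced with the actual `processRecord()` call on a configured writer, once with the conversions registered and once without.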