Wrong result for FixedWidthParser #511

Open
pajusin opened this issue Aug 26, 2022 · 1 comment

Comments

pajusin commented Aug 26, 2022

FixedWidthParser returns a wrong result when the parsed row is shorter than the range configured in the annotation (from, to). See the unit test below:

public static class LINE {
	public LINE() {
	}

	@Parsed
	@FixedWidth(from = 5, to = 10)
	String row;
}

@Test
public void testFixedWidthAnnotation2() throws Exception {
	BeanListProcessor<LINE> rowProcessor = new BeanListProcessor<LINE>(LINE.class);
	FixedWidthParserSettings parserSettings = new FixedWidthParserSettings();
	parserSettings.setProcessor(rowProcessor);
	FixedWidthParser parser = new FixedWidthParser(parserSettings);

	// The row is 10 characters long, so positions 5 to 10 exist: this is OK.
	parser.parse(new StringReader("     12123"));
	List<LINE> beans = rowProcessor.getBeans();
	assertEquals(beans.get(0).row, "12123");

	// The row is only 2 characters long, so positions 5 to 10 do not exist in it.
	// The parser returns "1" here, but it should return "" or null.
	parser.parse(new StringReader(" 1"));
	beans = rowProcessor.getBeans();
	assertEquals(beans.get(0).row, "");
}
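
To make the expected behavior concrete, here is a minimal plain-Java sketch of the slicing the test above assumes. `FixedWidthSlice` and `slice` are hypothetical names used only for illustration; they are not part of univocity-parsers and say nothing about how the library is implemented. The sketch assumes `from` is inclusive and `to` is exclusive, which matches the first assertion in the test.

```java
/** Hypothetical helper for illustration only; not part of univocity-parsers. */
final class FixedWidthSlice {

    /**
     * Returns the characters of row at positions [from, to),
     * or null when the row ends at or before position from.
     * Assumption: from is inclusive and to is exclusive.
     */
    static String slice(String row, int from, int to) {
        if (from < 0 || to < from) {
            throw new IllegalArgumentException("invalid range: " + from + ".." + to);
        }
        if (row.length() <= from) {
            // The row has no characters in the requested range.
            return null;
        }
        // A row that ends inside the range yields only the characters present.
        int end = Math.min(to, row.length());
        return row.substring(from, end);
    }

    public static void main(String[] args) {
        System.out.println(slice("     12123", 5, 10)); // the full 5-character field
        System.out.println(slice(" 1", 5, 10));         // no data at positions 5 to 10
    }
}
```

With this reading, the second row in the test has nothing at positions 5 to 10, so the value "1" at position 1 should never end up in `row`.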

mjawadbutt commented Oct 12, 2022

I ran into a similar issue. To summarize, two conditions must both be true to reproduce the error:

  1. In fixed-width parsing, there is a gap between the last and the second-last field definitions, i.e.
    :
    :
    fixedWidthFields.addField("Serial no", 0, 6); //-- third last field
    fixedWidthFields.addField("Costing Date", 10, 16); //-- second last field .. DDMMYY
    fixedWidthFields.addField("Labor Cost Code", 20, 30); //-- last field .. Alphanumeric,10

(So there is a gap, from position 16 to position 19, between the second-last and the last field.)

  2. The last value in the data row contains no more characters than the gap is wide.

In this case, the value is assigned to the actual last column (i.e. the Labor Cost Code field) rather than being treated as part of the gap and ignored.

For example, if the row is:
SNO___COSTIN_ABCD

then after parsing, the field values are:
SNO__
COSTIN_
ABCD

whereas they should be:
SNO__
COSTIN_
null

As long as the last value in the row contains no more characters than the gap is wide, the error manifests. As soon as it has more characters than the gap, the result becomes correct.

My workaround was to define the gap as an explicit field and ignore its value in the code.
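
A rough plain-Java sketch of that workaround, for a layout like the one above: it turns every hole between consecutive fields into an explicit, recognizably named field, so the layout becomes contiguous. `GapFiller`, `Field`, `fillGaps` and the `gap_` naming are hypothetical and exist only for this illustration; each resulting entry would then be registered with `fixedWidthFields.addField(name, start, end)` as in the definitions above, assuming the end position is exclusive.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper for illustration only; not part of univocity-parsers. */
final class GapFiller {

    /** A named column covering positions [start, end) of a row. */
    static final class Field {
        final String name;
        final int start;
        final int end;

        Field(String name, int start, int end) {
            this.name = name;
            this.start = start;
            this.end = end;
        }

        @Override
        public String toString() {
            return name + "[" + start + "," + end + ")";
        }
    }

    /**
     * Returns a contiguous layout. The input must be sorted by start position
     * and must not overlap. Every hole becomes a field named gap_START_END.
     */
    static List<Field> fillGaps(List<Field> fields) {
        List<Field> result = new ArrayList<>();
        int cursor = 0; // first position not yet covered by any field
        for (Field f : fields) {
            if (f.start < cursor || f.end <= f.start) {
                throw new IllegalArgumentException("unsorted, overlapping or empty field: " + f);
            }
            if (f.start > cursor) {
                result.add(new Field("gap_" + cursor + "_" + f.start, cursor, f.start));
            }
            result.add(f);
            cursor = f.end;
        }
        return result;
    }

    /** True for the filler fields created by fillGaps, whose values should be ignored. */
    static boolean isGap(Field f) {
        return f.name.startsWith("gap_");
    }

    public static void main(String[] args) {
        List<Field> layout = fillGaps(List.of(
                new Field("Serial no", 0, 6),
                new Field("Costing Date", 10, 16),
                new Field("Labor Cost Code", 20, 30)));
        for (Field f : layout) {
            System.out.println(f + (isGap(f) ? "  <- ignore" : ""));
        }
    }
}
```

For the three fields shown, this adds two filler fields, covering positions 6 to 9 and 16 to 19. Their values are simply skipped when reading the parsed row.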
