Writing multi-schema / master-detail files #66
The tutorial doesn't include anything (yet) for many new features in the upcoming 2.0.0 version. Support for this has been in place for some time. Check the test cases here.
Thank you very much for the advice! This feature looks amazing! I'm now done evaluating uniVocity-parsers in the context where we need a solution. Kudos to you: there are quite a few special cases, and univocity-parsers masters them all! Thank you for developing and sharing such a great library!
I've done some tests and now have to ask again. I'm not sure if I'm doing something wrong or if it is a bug:

```java
final ObjectRowWriterProcessor clientProcessor = new ObjectRowWriterProcessor();
final ObjectRowWriterProcessor accountProcessor = new ObjectRowWriterProcessor();

OutputValueSwitch writerSwitch = new OutputValueSwitch();
writerSwitch.addSwitchForValue("Account", accountProcessor, "type", "balance", "bank", "account", "swift");
writerSwitch.addSwitchForValue("Client", clientProcessor);

CsvWriterSettings settings = new CsvWriterSettings();
settings.getFormat().setLineSeparator("\n");
settings.setHeaderWritingEnabled(false);
settings.setRowWriterProcessor(writerSwitch);

CsvWriter writer = new CsvWriter(new File("path/to/file/filename"), settings);
```

The code above writes OK. My next thought was that I have to declare which value belongs to which column (header), so the writer can decide which columns are provided and which ones are empty:

```java
Map<String, Object> rowData = new HashMap<String, Object>();
rowData.put("type", "Account");
rowData.put("balance", "sp2");
rowData.put("bank", "sp3");
rowData.put("account", "sp4");
rowData.put("swift", "sp5");
writer.processRecord(rowData);
```

This resulted in an empty file, as the keys from the map were used as headers but their ordering was different from the one provided. So I tried a header mapping:

```java
Map<String, String> headermapping = new HashMap<String, String>();
headermapping.put("type", "type");
headermapping.put("balance", "balance");
headermapping.put("bank", "bank");
headermapping.put("account", "account");
headermapping.put("swift", "swift");
writer.processRecord(headermapping, rowData);
```

The result was the same, except that the headers were now taken from the headermapping map instead of the rowData map, but without any mapping logic being applied (verified by debugging). Am I on the totally wrong track to achieve the above, or is this a bug?
Hello there. Regarding your first issue: writing a row with fewer values than the declared headers will produce a shorter output row. On the second issue with maps: use a LinkedHashMap instead of a HashMap, so the insertion order of the keys is preserved.
…eaders, as identified in github issue #66
As promised, I made a few adjustments:
- Expanding rows to match the number of headers
- Throwing an exception when nothing matches the OutputValueSwitch
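The row-expansion adjustment can be pictured with a stdlib-only sketch. The helper name `padToHeaders` is hypothetical; the library performs the equivalent internally when `setExpandIncompleteRows(true)` is enabled:

```java
import java.util.Arrays;

public class ExpandRows {
    // Pads a short row with nulls so it matches the header count,
    // mirroring what setExpandIncompleteRows(true) does internally.
    static Object[] padToHeaders(Object[] row, int headerCount) {
        if (row.length >= headerCount) {
            return row;
        }
        return Arrays.copyOf(row, headerCount); // new slots are null
    }

    public static void main(String[] args) {
        Object[] padded = padToHeaders(new Object[]{"SUB3", "v16", "v17"}, 4);
        System.out.println(padded.length);     // 4
        System.out.println(padded[3] == null); // true
    }
}
```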
Thank you for the adjustments. What do you think: when deciding which switch applies, wouldn't it be possible to iterate over all switches, if they are provided with their own headers?
Something I forgot: your tip with LinkedHashMap works, thanks. But if you get the map to write from some code you don't control, this solution is ruled out.
The header definition of an output switch is used only to manage the output sequence of its written columns after a given format is identified. The format is identified by reading what data is present at a given column (by default, column index 0). This requires that all rows come with the relevant data (in your case, the account type) in the expected position (0) for the switch to work. When you pass a map as the input, with no headers defined in the writer settings, it will simply iterate through the map keys to produce a record. In the case of a HashMap, that sequence cannot be determined. In the particular case where you want to use a Map + OutputValueSwitch, you can control the sequence of headers by providing a LinkedHashMap of header mappings:
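The ordering behavior described here can be seen with plain java.util maps; nothing univocity-specific is needed in this minimal sketch:

```java
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.Map;

public class MapOrder {
    public static void main(String[] args) {
        // LinkedHashMap preserves insertion order, so the writer sees
        // the headers in the sequence you defined them.
        Map<String, String> ordered = new LinkedHashMap<>();
        ordered.put("type", "type");
        ordered.put("balance", "balance");
        ordered.put("bank", "bank");
        ordered.put("account", "account");
        ordered.put("swift", "swift");
        System.out.println(ordered.keySet());
        // [type, balance, bank, account, swift]

        // A plain HashMap makes no ordering guarantee at all.
        Map<String, String> unordered = new HashMap<>(ordered);
        System.out.println(unordered.keySet()); // order is unspecified
    }
}
```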
This is a bit contrived, and a slightly better approach would be to add a few overrides that allow you to provide the sequence as an array of headers; in that case you would be able to call something along those lines. Deriving the headers from each item of the output switch would be inefficient, as that would have to be executed over and over, each time you send in a map of values to write.
OK, good point. Providing the headerMapping map as an instance of LinkedHashMap works. Background: we will have some output to write with different row types. The schema is defined by an external partner, and for each type of record we only need to fill a few of the many columns. So it would have been very handy to define the columns of each row type and, when writing, just pass in a map containing only the fields that are needed.
Not sure if I understood your question. Can you give me an example of expected input rows and expected output?
Let's say there can be three row types with these headers:

Input as Map instances (one line = one map; key => value):

Most of the maps contain only partial data relative to the defined headers; e.g. the first SUB1 record doesn't contain the keys b, c, f. Desired output:
I would like to define the input schemata and then write row by row using something along those lines. CSV with only one type of record works fine, but the combination of multi-schema CSV and maps containing only partial data is what I didn't find.
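The requested behavior can be sketched in plain Java. The helper `toRow` is hypothetical and not part of the library; it only illustrates padding a partial map against an ordered header list:

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class PartialRow {
    // Builds a CSV row from an ordered header list and a map that may
    // contain only some of the headers; missing keys become empty columns.
    static String toRow(String[] headers, Map<String, String> values) {
        StringBuilder row = new StringBuilder();
        for (int i = 0; i < headers.length; i++) {
            if (i > 0) {
                row.append(',');
            }
            String v = values.get(headers[i]);
            if (v != null) {
                row.append(v);
            }
        }
        return row.toString();
    }

    public static void main(String[] args) {
        Map<String, String> sub1 = new LinkedHashMap<>();
        sub1.put("type", "SUB1");
        sub1.put("a", "v5");
        sub1.put("d", "v6");
        String[] headers = {"type", "a", "b", "c", "d"};
        System.out.println(toRow(headers, sub1)); // SUB1,v5,,,v6
    }
}
```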
OK, so I just committed a few changes that will make this work as expected, and also allow you to use any type of map without problems.
Thank you for the quick implementation. Unfortunately there is still a bug. Here is an additional unit test you can use:

```java
@Test
public void testMultiple2() {
    OutputValueSwitch writerSwitch = new OutputValueSwitch("type"); // switch based on field name
    writerSwitch.addSwitchForValue("SUPER", new ObjectRowWriterProcessor(), "type", "h1", "h2", "h3", "h4");
    writerSwitch.addSwitchForValue("SUB1", new ObjectRowWriterProcessor(), "type", "a", "b", "c", "d", "e", "f", "g");
    writerSwitch.addSwitchForValue("SUB2", new ObjectRowWriterProcessor(), "type", "p", "q", "r", "s", "t", "u", "v",
            "w", "x", "y", "z");
    writerSwitch.addSwitchForValue("SUB3", new ObjectRowWriterProcessor(), "type", "a", "b", "c");

    CsvWriterSettings settings = new CsvWriterSettings();
    settings.setExpandIncompleteRows(true);
    settings.getFormat().setLineSeparator("\n");
    settings.setHeaderWritingEnabled(false);
    settings.setRowWriterProcessor(writerSwitch);

    StringWriter output = new StringWriter();
    CsvWriter writer = new CsvWriter(output, settings);

    writer.writeRow(newMap("SUPER", "h1=>v1;h2=>v2;h3=>v3"));
    writer.writeRow(newMap("SUB1", "a=>v5;d=>v6;e=>v7;g=>v8"));
    writer.writeRow(newMap("SUB2", "q=>v9;u=>v10;w=>v11;y=>v12"));
    writer.writeRow(newMap("SUB1", "a=>v13;d=>v14;g=>v15"));
    writer.writeRow(newMap("SUB1", "a=>v16;d=>v17;f=>v18"));
    writer.writeRow(newMap("SUB3", "a=>v16;b=>v17"));
    writer.writeRow(newMap("SUPER", "h1=>v1;h3=>v3"));
    writer.close();

    assertEquals(""
            + "SUPER,v1,v2,v3,\n"
            + "SUB1,v5,,,v6,v7,,v8\n"
            + "SUB2,,v9,,,,v10,,v11,,v12\n"
            + "SUB1,v13,,,v14,,,v15,,,\n"
            + "SUB1,v16,,,v17,,v18,,,,\n"
            + "SUB3,v16,v17,\n"
            + "SUPER,v1,,v3,\n",
            output.toString());
}
```

By the way: in your unit test, the values for expected and actual in assertEquals appear to be swapped.
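The newMap helper used in the test above is not shown in the thread; one plausible stdlib-only implementation, assuming a "type" key plus key=>value pairs separated by semicolons:

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class NewMapHelper {
    // Parses "k1=>v1;k2=>v2" into an insertion-ordered map and adds
    // the row type under the "type" key, as the test rows expect.
    static Map<String, Object> newMap(String type, String pairs) {
        Map<String, Object> row = new LinkedHashMap<>();
        row.put("type", type);
        for (String pair : pairs.split(";")) {
            String[] kv = pair.split("=>");
            row.put(kv[0], kv[1]);
        }
        return row;
    }

    public static void main(String[] args) {
        System.out.println(newMap("SUPER", "h1=>v1;h2=>v2"));
        // {type=SUPER, h1=v1, h2=v2}
    }
}
```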
A second little issue: in the
Fixed everything :)
Oh, sorry. It seems TestNG and JUnit have the two assertEquals parameters swapped. Naively, I didn't think about which test framework you might use, and with the imports stripped away, the test case was a valid JUnit test case as well, which is what I'm used to :-) I'll give it a try later on, thank you.
Works like a charm. 👍
Hey, thanks a lot for the valuable input. I feel much more confident about the features I'm adding to this library when people have something to complain about, as it means they are being used in the real world.
I came across one more little issue in the context of this use case. Besides this, there is a little bug in the exception message mentioned above.
Thank you! This makes a lot of sense. When writing from maps, unknown keys should be ignored. The use cases you reported should all work now.
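The "ignore unknown keys" behavior can be sketched as a simple filter over the declared headers. The helper `filterKnown` is hypothetical; the actual fix lives inside the library:

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;

public class FilterKeys {
    // Drops map entries whose keys are not among the declared headers,
    // so stray keys no longer break the writer.
    static Map<String, String> filterKnown(Set<String> headers, Map<String, String> values) {
        Map<String, String> known = new LinkedHashMap<>();
        for (Map.Entry<String, String> e : values.entrySet()) {
            if (headers.contains(e.getKey())) {
                known.put(e.getKey(), e.getValue());
            }
        }
        return known;
    }

    public static void main(String[] args) {
        Map<String, String> row = new LinkedHashMap<>();
        row.put("type", "SUB1");
        row.put("unexpected", "x");
        System.out.println(filterKnown(Set.of("type", "a"), row));
        // {type=SUB1}
    }
}
```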
I have seen the documentation about reading master-detail style files (https://github.com/uniVocity/univocity-parsers#reading-master-detail-style-files) and found #17, "creation of multiple types of Java beans".
Is there something similar for writing files that I have missed so far?
Here is the use case I'm evaluating:
This style might be too complex even for reading, I fear, as it has more than just one kind of detail record (actually I even have to write more than two kinds of detail records 😟).
My thoughts on solving this so far:
The headers have to be declared as comments, so this would not be a problem. The master rows could be written as usual. When writing fixed-width output, the detail rows could be written using fixedWidthWriter.writeRow(string), while string could be collected from a second, third, ... fixed-width writer using fww2.processRecordToString(). In the case of CSV (the use case above), this seems a bit more difficult to me!?
Did I miss something which makes this style of writing files easier?