Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Create a test suite showing all the features of supercsv #56

Closed
kbilsted opened this issue Jul 5, 2015 · 21 comments
Closed

Create a test suite showing all the features of supercsv #56

kbilsted opened this issue Jul 5, 2015 · 21 comments
Assignees

Comments

@kbilsted
Copy link
Member

kbilsted commented Jul 5, 2015

An idea could be to create a unit test called "FeaturesOfSuperCsv" which has a test for each row of the comparison matrix showing that supercsv in fact has this feature. Missing features should be made into individual issues which the community then can implement or reject.

this test is nice for

  • new people on the team
  • people doing comparisons
  • for linking from the documentation pages

features to show can be found in

@ZioberMichal
Copy link
Collaborator

Thanks for pointing me to this issue. I will try to start working on test suite this week.

@kbilsted
Copy link
Member Author

kbilsted commented Jul 5, 2015

Sounds fantastic. Not sure how we should structure this? I mean we cannot put ALL the features into one test I guess since we have extra modules e.g. for jodatime. Any ideas? Maybe put a "feature" test in each package?

Then we also need from the frontpage, to link to alle the "featuretests"

@ZioberMichal
Copy link
Collaborator

"Feature" test in each package sounds good. We can start using this approach and when we will see better one we can refactor already created tests.

Edit:
We should decide how we want to solve this for example for "Tokenizer" section from the first link:

  1. custom separator
  2. custom quote
  3. custom escape
  4. custom EOL
  5. \n in delimited field
  6. escaped quote in quoted field
  7. different escape / quote
  8. deal with leading/trailing whitespace

we can

A: Create TokenizerTest which will be containing 8 test for each related features. Put this test to "features" subpackage.
B: Split it on separated 8 feature tests in "features" subpackage.

@kbilsted
Copy link
Member Author

kbilsted commented Jul 6, 2015

I'm not sure what provides the better overview. Maybe it will be clear after some code

@ZioberMichal
Copy link
Collaborator

I have implemented first test for "Tokenizer" section and actually it is not possible to:

  1. change escape character.
  2. set different characters for "quote character" and "escape character"

Or maybe it is possible, but I could not find any feature which allows that. Actually, we can trim leading and trailing white-characters but we have to choose proper cell processors. I think, it should be listed in documentation or guideline.

Below you can see class and simple tests. I did not want to push them as PR because actually I do not know what to do with tests for which we are not able to change "escape character". Should we thrown exception "Not implemented yet!" or something like this? What do you think?

package org.supercsv.features;

import org.junit.Assert;
import org.junit.Test;
import org.supercsv.cellprocessor.constraint.NotNull;
import org.supercsv.cellprocessor.ift.CellProcessor;
import org.supercsv.io.CsvBeanWriter;
import org.supercsv.io.CsvListWriter;
import org.supercsv.prefs.CsvPreference;

import java.io.IOException;
import java.io.StringWriter;
import java.util.Arrays;
import java.util.List;

/**
 * Test class which checks all features listed on <a href="http://csveed.org/comparison-matrix.html">comparison page</a> in "Tokenizer" section.
 *
 * @author Michał Ziober
 */
public class TokenizerFeaturesTest {


    @Test
    public void testCustomSeparator() throws IOException {
        List<String> data = Arrays.asList("John", "Connor");
        CellProcessor[] processors = { new NotNull(), new NotNull() };

        char customSeparator = '+';
        CsvPreference customPreference = new CsvPreference.Builder('"', customSeparator, "").build();
        StringWriter writer = new StringWriter();
        CsvListWriter beanWriter = new CsvListWriter(writer, customPreference);
        beanWriter.write(data, processors);
        beanWriter.close();

        Assert.assertEquals("John+Connor", writer.toString());
    }

    @Test
    public void testCustomQuote() throws IOException {
        List<String> data = Arrays.asList("John \n Connor");
        CellProcessor[] processors = { new NotNull() };

        char customQuote = '|';
        CsvPreference customPreference = new CsvPreference.Builder(customQuote, ',', "").build();
        StringWriter writer = new StringWriter();
        CsvListWriter beanWriter = new CsvListWriter(writer, customPreference);
        beanWriter.write(data, processors);
        beanWriter.close();

        Assert.assertEquals("|John  Connor|", writer.toString());
    }

    //TODO: not possible to change
    @Test
    public void testCustomEscape() throws IOException {
        List<String> data = Arrays.asList("\"John Connor\"");
        CellProcessor[] processors = { new NotNull() };

        CsvPreference customPreference = new CsvPreference.Builder('"', ',', "").build();
        StringWriter writer = new StringWriter();
        CsvListWriter beanWriter = new CsvListWriter(writer, customPreference);
        beanWriter.write(data, processors);
        beanWriter.close();

        Assert.assertEquals("\"\"\"John Connor\"\"\"", writer.toString());
    }

    @Test
    public void testCustomEOL() throws IOException {
        List<String> data = Arrays.asList("John Connor");
        CellProcessor[] processors = { new NotNull() };

        String customEndOfLine = ">\r\n";
        CsvPreference customPreference = new CsvPreference.Builder('"', ',', customEndOfLine).build();
        StringWriter writer = new StringWriter();
        CsvListWriter beanWriter = new CsvListWriter(writer, customPreference);
        beanWriter.write(data, processors);
        beanWriter.close();

        Assert.assertEquals("John Connor>\r\n", writer.toString());
    }

    @Test
    public void testNewLineInDelimitedField() throws IOException {
        List<String> data = Arrays.asList("Jo\nhn", "Con\nnor");
        CellProcessor[] processors = { new NotNull(), new NotNull() };

        CsvPreference customPreference = new CsvPreference.Builder('"', ',', "\n").build();
        StringWriter writer = new StringWriter();
        CsvListWriter beanWriter = new CsvListWriter(writer, customPreference);
        beanWriter.write(data, processors);
        beanWriter.close();

        Assert.assertEquals("\"Jo\nhn\",\"Con\nnor\"\n", writer.toString());
    }

    @Test
    public void testEscapedQuoteInQuotedField() throws IOException {
        List<String> data = Arrays.asList("Joh\"n", "Con\"nor");
        CellProcessor[] processors = { new NotNull(), new NotNull() };

        CsvPreference customPreference = new CsvPreference.Builder('"', ',', "").build();
        StringWriter writer = new StringWriter();
        CsvListWriter beanWriter = new CsvListWriter(writer, customPreference);
        beanWriter.write(data, processors);
        beanWriter.close();

        Assert.assertEquals("\"Joh\"\"n\",\"Con\"\"nor\"", writer.toString());
    }

    //TODO: not possible to separate
    @Test
    public void testDifferentEscapeAndQuote() throws IOException {
        List<String> data = Arrays.asList("\"John Connor\"");
        CellProcessor[] processors = { new NotNull() };

        char customQuote = '"';
        CsvPreference customPreference = new CsvPreference.Builder(customQuote, ',', "").build();
        StringWriter writer = new StringWriter();
        CsvListWriter beanWriter = new CsvListWriter(writer, customPreference);
        beanWriter.write(data, processors);
        beanWriter.close();

        Assert.assertEquals("\"\"\"John Connor\"\"\"", writer.toString());
    }

    //TODO: not possible to trim values
    @Test
    public void testDealWithLeadingTrailingWhitespace() throws IOException {
        List<String> data = Arrays.asList("     John", "Connor     ");
        CellProcessor[] processors = { new NotNull(), new NotNull() };

        char customQuote = '"';
        CsvPreference customPreference = new CsvPreference.Builder(customQuote, ',', "").surroundingSpacesNeedQuotes(false).build();
        StringWriter writer = new StringWriter();
        CsvListWriter beanWriter = new CsvListWriter(writer, customPreference);
        beanWriter.write(data, processors);
        beanWriter.close();

        Assert.assertEquals("     John,Connor     ", writer.toString());
    }
}

@jamesbassett
Copy link
Contributor

Hey Michał, looks like all of those tests are for writing, whereas any features that Tokenizer should support are related to reading. Some features only apply to reading or writing, others apply to both.

Regarding testDealWithLeadingTrailingWhitespace(), you're correct that we don't automatically trim and the only way to do that is to use a cell processor. I think this is logical. surroundingSpacesNeedQuotes can be set to true to force surrounding quotes for values with leading/trailing spaces, but setting to false (it defaults to false btw) just disables that behaviour (it will still quote if the quote mode says so, or if the value has a special character).

Here's some related issues:

I don't think the feature tests should break the build (perhaps we should just @Ignore it and run it on demand), but it would be good to document our position (against the test method) on any features that aren't supported. For example, you guys might disagree, but I really hate the backslash escaping - so many issues are caused by invalid CSV and this feature only makes the situation worse.

@ZioberMichal
Copy link
Collaborator

Actually I wanted wrote these tests for "read" mode as well but at the beginning I just wanted show first draft and talk about that a little. Anyway, that a good point that some features are available in "read" and "write" mode and some only in one mode. I agree that @Ignore annotation should be used if do not want to break build.

@kbilsted
Copy link
Member Author

kbilsted commented Jul 9, 2015

How we organise the tests are still unclear to me but a test fixture for reading features and one for writing features sound great.

With the tests we need a table in the documentation to quickly give an overview of the features possibly with a column referring to the specific tests by linking to the source code maybe.

@ZioberMichal
Copy link
Collaborator

It is a good idea to create documentation which explicitly tells which features are available. A little sum up:

  1. I will create one test class for read and one for write mode, TokenizerFeaturesReadTest and TokenizerFeaturesWriteTest accordingly.
  2. Features which are not available will be represented by ignored tests.

When tests will be ready and "well implemented" we can create new section in documentation.

ZioberMichal added a commit to ZioberMichal/super-csv that referenced this issue Jul 10, 2015
1. Added two new tests which explicitly checks features from
http://csveed.org/comparison-matrix.html page (Tokenizer section).
@ZioberMichal
Copy link
Collaborator

I've pushed tests for Tokenizer read/write features. Let me know what do you think about this implementation.

ZioberMichal added a commit to ZioberMichal/super-csv that referenced this issue Jul 11, 2015
1. Added two new tests which explicitly checks features from
http://csveed.org/comparison-matrix.html page (Tokenizer section).
ZioberMichal added a commit to ZioberMichal/super-csv that referenced this issue Jul 11, 2015
1. Added two new tests which explicitly checks features from
http://csveed.org/comparison-matrix.html page (Tokenizer section).
@ZioberMichal
Copy link
Collaborator

I've also added tests which explicitly checks features from http://csveed.org/comparison-matrix.html page from “Line Reading” section.

@kbilsted
Copy link
Member Author

Great I was about to asking just that
Den 11/07/2015 14.43 skrev "Michał Ziober" notifications@github.com:

I've also added tests which explicitly checks features from
http://csveed.org/comparison-matrix.html page from “Line Reading” section.


Reply to this email directly or view it on GitHub
#56 (comment).

@ZioberMichal
Copy link
Collaborator

We have to write tests also for "Bean Mapping" and "Writing CSV" sections. "Documentation" sections is completed for "super-csv". I am not sure whether is something to do related with "Quality of error feedback", what do you think? "Convenience" section is also not clear for me.

Will you merge this PR as it is now or you want to wait for tests related with other sections?

@kbilsted
Copy link
Member Author

Already merged.. but feel free to send more stuff 👍

@ZioberMichal
Copy link
Collaborator

I am not sure whether it is already merged ... :)

I am a little confused what some ideas mean in this comparison matrix in "Bean mapping" section.

Idea Status Question
converts to primitives set manually How it should be tested? What does it mean "set manually".
converts to basic Objects set manually What "basic Objects" mean? I am not sure...
Converter support no We have these cell processors. Do you think that this is it?
Returns typed Bean yes I am not sure how to test it. Method read returns generic T type. Should I use reflection to check whether this method returns generic type or we can skip it in test suite?

What do you think guys?

@jamesbassett
Copy link
Contributor

My interpretation...

Idea Status James' interpretation
converts to primitives set manually Relates to using CsvListReader and converting columns to primitives (int, long, etc). This requires cell processors, i.e. 'set manually'
converts to basic Objects set manually Relates to using CsvBeanReader/CsvDozerBeanReader converting to simple POJOs (no deep or indexed mapping). This requires cell processors for all non-String fields, i.e. 'set manually'
Converter support no After looking at the csveed API (see [BeanReaderInstructions.setConverter](http://javadocs.csveed.org/org/csveed/bean/BeanReaderInstructions.html#setConverter%28java.lang.String, org.csveed.bean.conversion.Converter%29)), this is exactly the same as cell processors. But I think csveed can do some of this conversion automatically. We could achieve this by setting up a global mapping of cell processors to use when mapping, e.g. if we're reading a CSV column into an Integer and no cell processor is defined for a column, then use new Optional(new ParseInt()).
Returns typed Bean yes No idea - it says opencsv doesn't support this but it looks like it returns typed beans to me...

@ZioberMichal
Copy link
Collaborator

Thanks for it!

I've added new tests. Could you please, @kbilsted, @jamesbassett review them?

ZioberMichal added a commit to ZioberMichal/super-csv that referenced this issue Jul 15, 2015
1. Added two new tests which explicitly checks features from
http://csveed.org/comparison-matrix.html page.
kbilsted added a commit that referenced this issue Jul 31, 2015
#56 - test suite for Tokenizer read/write operations.
@ZioberMichal
Copy link
Collaborator

@kbilsted ,
Implemented tests covers below sections:

  1. Tokenizer
  2. Line Reading
  3. Bean Mapping

Next sections mostly covers "concepts" not "implementation":

  1. Writer - each our writer can write to Writer.
  2. Documentation - done.
  3. Quality of error feedback - I think that we should create new issue for each feature but mainly we have implemented this.
  4. Convenience - as above
  5. Open Source Community - done 😏

Do you think we can do anything more in this issue?

@kbilsted
Copy link
Member Author

kbilsted commented Sep 3, 2015

Sure create the new issues and close. Do we need an issue for making a documentation page giving an overview of all features of supercsv linking into the tests?

@ZioberMichal
Copy link
Collaborator

I think that we can create new documentation page in this issue.

P.S: Until first of October I have a lot of work and probably I will not be able to do it sooner.

@ZioberMichal
Copy link
Collaborator

I have created new issues which should be implemented:

  1. Custom escape character #79
  2. Split cell to multiple properties #80
  3. Join multiple cells into one property #81
  4. Read all lines #82

Also, I have asked to update Comparison matrix in this task 42BV/CSVeed#65.

I checked documentation and it contains already all needed informations. I think that we can create new documentation page when above tasks will be finished.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

3 participants