Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

add skip column feature in excel file readers #64

Closed

Conversation

parikshitdutta
Copy link

@parikshitdutta parikshitdutta commented Mar 14, 2021

Closes BATCH-EXT-40 [#40]

@parikshitdutta parikshitdutta mentioned this pull request Mar 14, 2021
@mdeinum
Copy link
Collaborator

mdeinum commented Mar 15, 2021

First of all thanks for the contribution. However, as the author of this library, I would suggest to either

  1. Implement this with the RowMapper ignore the column while mapping to the object
  2. Implement this with a RowSet and RowSetFactory and ignore it there.

The readers for Excel are, deliberately, inspired by the the FlatFileItemReader (amongst others) and only allow for skipping rows and not columns. That is better left to either of the aforementioned solutions.

@parikshitdutta
Copy link
Author

parikshitdutta commented Mar 15, 2021

Thanks @mdeinum for your feedback.

As per my understanding and as you also mentioned in your second suggested solution, that is the best way to implement the functionality is using RowSet and RowSetFactory

If you refer to my submitted PR, the logic for skipping column is defined in Rowset and DefaultRowSet respectively, however I have invoked the same from AbstractExcelItemReader.doRead() to keep the skipping column option available at the readers level.

Do you mean, I rather should abstract it from readers, and keep the implementation details at RowSet level?

In that scenario, I should use RowSet.next() to implement the same.

Your thought @mdeinum ?

@mdeinum
Copy link
Collaborator

mdeinum commented Mar 15, 2021

The logic is a bit scattered, the ItemReader now suddenly needs to know it needs to skip columns. That should be a sole responsibility of the RowSet and should be transparent for the ItemReader and other components using a RowSet. Another thing that bothers me is why just a single column? What if you have 10 columns and want to skip multiple columns (or looking at it from a different side, read only a part of that column).

I'd rather see something like available in the DelimitedLineTokenizer (the RowSet (initial inspiration came from that and the JDBC support in Spring). That will read all columns by default or only the list of included columns. For the remainder of the infrastructure, it is completely transparent that this is happening.

The columns to include could be part of the RowSetMetaData so that it can apply the filtering to the headers as well and it could be obtained/used as well when reading the current row (and filter out the given columns).

@mdeinum
Copy link
Collaborator

mdeinum commented Sep 19, 2023

Closing due to inactivity. If we want to revisit we can re-open or create a new issue for this.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants