Add support for Powerpoint, Word doc, PDF, Outlook email, ZIP, GZ, BZ2 #25

Closed
tooptoop4 wants to merge 2 commits

Conversation

tooptoop4

Fixes #16

Member

@ebyhr ebyhr left a comment

Thanks for creating the PR. Please remove the redundant first commit, add tests and resolve conflicts.

Comment on lines +96 to +313
lst.add("bcc_list " + bcc_list);
lst.add("subject " + subject);
lst.add("body " + body);
lines = lst.iterator();
totalBytes = input.getCount();

try (CountingInputStream input = new CountingInputStream(byteSource.openStream())) {
lines = plugin.getIterator(byteSource);
if (plugin.skipFirstLine()) {
lines.next();
} catch (IOException e) {
throw new UncheckedIOException(e);
}
totalBytes = input.getCount();
}
catch (IOException e) {
throw new UncheckedIOException(e);
//zip
else if (tblName.endsWith(".zip") || tblName.contains(".zip?")) {
    ArrayList<String> lst = new ArrayList<>();
    // try-with-resources closes the reader and every stream it wraps, so no finally block is needed
    try (CountingInputStream input = new CountingInputStream(byteSource.openStream());
            ZipInputStream zin = new ZipInputStream(input);
            BufferedReader bufferedReader = new BufferedReader(new InputStreamReader(zin))) {
        // read every entry line by line; the lines of all entries are concatenated
        while (zin.getNextEntry() != null) {
            String line;
            while ((line = bufferedReader.readLine()) != null) {
                lst.add(line);
            }
        }
        lines = lst.iterator();
        totalBytes = input.getCount();
    }
    catch (IOException e) {
        throw new UncheckedIOException(e);
    }
}
//gz
else if (tblName.endsWith(".gz") || tblName.endsWith(".gzip") || tblName.contains(".gz?") || tblName.contains(".gzip?")) {
    //getting unreadable compressed data back right now!
    //todo need to fix with https://stackoverflow.com/a/11093226/8874837 https://www.rgagnon.com/javadetails/java-HttpUrlConnection-with-GZIP-encoding.html
    ArrayList<String> lst = new ArrayList<>();
    // try-with-resources closes the reader and the underlying stream, so no finally block is needed
    try (CountingInputStream input = new CountingInputStream(byteSource.openStream());
            BufferedReader in = new BufferedReader(new InputStreamReader(new GZIPInputStream(input)))) {
        String inputLine;
        while ((inputLine = in.readLine()) != null) {
            lst.add(inputLine);
        }
        lines = lst.iterator();
        totalBytes = input.getCount();
    }
    catch (IOException e) {
        throw new UncheckedIOException(e);
    }
}
//bz2
else if (tblName.endsWith(".bz2") || tblName.endsWith(".bzip2") || tblName.contains(".bz2?") || tblName.contains(".bzip2?")) {
    ArrayList<String> lst = new ArrayList<>();
    // try-with-resources closes the reader and the underlying stream, so no finally block is needed
    try (CountingInputStream input = new CountingInputStream(byteSource.openStream());
            BufferedReader in = new BufferedReader(new InputStreamReader(new MultiStreamBZip2InputStream(input)))) {
        String inputLine;
        while ((inputLine = in.readLine()) != null) {
            lst.add(inputLine);
        }
        lines = lst.iterator();
        totalBytes = input.getCount();
    }
    catch (IOException e) {
        throw new UncheckedIOException(e);
    }
}
else {
    //text/csv..etc
    try (CountingInputStream input = new CountingInputStream(byteSource.openStream())) {
        lines = plugin.getIterator(byteSource);
        if (plugin.skipFirstLine()) {
            lines.next();
        }
        totalBytes = input.getCount();
    }
    catch (IOException e) {
        throw new UncheckedIOException(e);
    }
Member

@ebyhr ebyhr Jun 7, 2020

Please refer to the existing implementation. I wouldn't add file-specific logic here.
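
One possible reading of this suggestion, as a rough sketch only: keep the line-reading code format-agnostic and choose a decompressing wrapper in one small place. The project's existing implementation is not shown in this thread, so the class `CompressedLines` and its methods `decompress`, `readLines`, and `gzip` are hypothetical names, and only gzip is handled because it ships with the JDK. Stripping the query string before checking the extension is also an assumption; it approximates the `contains(".gz?")` checks in the diff.

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

// Hypothetical helper, not part of the project: selects a decompressing
// wrapper from the file name so the reading loop has no per-format branches.
final class CompressedLines {
    private CompressedLines() {}

    // Drops a query string ("data.gz?x=1" -> "data.gz") and wraps gzip input.
    static InputStream decompress(String name, InputStream raw) throws IOException {
        int query = name.indexOf('?');
        String path = query >= 0 ? name.substring(0, query) : name;
        if (path.endsWith(".gz") || path.endsWith(".gzip")) {
            return new GZIPInputStream(raw);
        }
        // Other formats (zip, bz2, ...) would be added here; plain text passes through.
        return raw;
    }

    // Reads all lines as UTF-8 (an assumption; the diff uses the platform default charset).
    static List<String> readLines(String name, InputStream raw) throws IOException {
        List<String> lines = new ArrayList<>();
        // Declaring the raw stream as a resource closes it even if the wrapper fails to open.
        try (InputStream source = raw;
                BufferedReader reader = new BufferedReader(
                        new InputStreamReader(decompress(name, source), StandardCharsets.UTF_8))) {
            String line;
            while ((line = reader.readLine()) != null) {
                lines.add(line);
            }
        }
        return lines;
    }

    // Test helper: gzip-compresses a string in memory.
    static byte[] gzip(String text) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (GZIPOutputStream out = new GZIPOutputStream(bytes)) {
            out.write(text.getBytes(StandardCharsets.UTF_8));
        }
        return bytes.toByteArray();
    }

    public static void main(String[] args) throws IOException {
        byte[] compressed = gzip("a,b\n1,2\n");
        System.out.println(readLines("data.csv.gz?x=1", new ByteArrayInputStream(compressed)));
        System.out.println(readLines("data.csv",
                new ByteArrayInputStream("a,b\n1,2\n".getBytes(StandardCharsets.UTF_8))));
    }
}
```

The in-memory round trip in `main` also hints at how the requested tests could look: compress a known payload, read it back through the same code path as a plain file, and compare the lines.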

@tooptoop4
Author

found some bugs in the readers

@tooptoop4 tooptoop4 closed this Sep 20, 2020

2 participants