-
Notifications
You must be signed in to change notification settings - Fork 9
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Adds improvements to CSV Importer #33
Conversation
Please review after adding support for |
lib/daru/io/importers/csv.rb
Outdated
|
||
def process_compression | ||
return ::Zlib::GzipReader.new(open(@path)).read if @compression == :gzip | ||
open(@path) | ||
end |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is it fine to compute .gzip
readings in the from_csv
method itself, or would it be neater to have Importers::CSVGZip
inheriting Importers::CSV
? I have the same doubt regarding exporter (PR #34) as well.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmmm I would say inherit so that there is very explicit division of responsibilities. @zverok WDYT?
.rubocop.yml
Outdated
@@ -47,7 +47,7 @@ Metrics/ClassLength: | |||
Max: 200 | |||
|
|||
Metrics/ParameterLists: | |||
Max: 10 | |||
Max: 15 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
/me looks suspiciously
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@zverok - We finally have an Importer with dozens of arguments. 😉
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am not sure it is a good sign, to be honest. Maybe it is time to switch to some options
/allowed_options
and something.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@zverok - Acknowledged. Combined CSV stdlib options like :col_sep
, :converters
, etc. to **options
with default values. It's now back to < 10.
lib/daru/io/importers/csv.rb
Outdated
def compression?(algorithm, *formats) | ||
return true if @compression == algorithm | ||
formats.each { |f| return true if @path.end_with?(f) } | ||
false |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I believe than any?
should help here.
lib/daru/io/importers/csv.rb
Outdated
.by_col | ||
.map do |col_name, values| | ||
next [col_name, []] if values.nil? | ||
[col_name, values[@skiprows..-1]] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I believe ?:
would be surprisingly more clear here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm, but I thought that this looked more neater. If not, sure - we can change it to ?:
. 👍
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
TBH, I believe that return ... if
and next .... if
is a last resort (when clear statement of entire algo is impossible, and you need to split it in steps). BUT!!!! If it looks cleaner for you and it is mindful solution -- let it stay. I want to share my experience, not to force my personal styles :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK, let's merge it. I have some doubts on details, but let's speak about it later.
@zverok - Acknowledged, please feel free to open an issue to discuss about the details. (just to make sure it's not forgotten in due course of time 😉 ) |
Features added :
:skiprows
option to skip the firstskiprows
rows.csv.gz
format withcompression: :gzip
optionCloses issue #15
Closes issue #30
Closes issue #32
Tests are again, yet to be beautified as per PR #27