Autoload csv files from data directory #2761
Oh goodness, I thought I did this! Thanks for the PR. Looks pretty good to me.
data[key] = SafeYAML.load_file(path)
case File.extname(path).downcase
when '.csv'
data[key] = CSV.read(path, headers: true).map(&:to_hash)
We follow the GitHub Ruby Style Guide, which dictates we use hash rockets:
data[key] = CSV.read(path, :headers => true).map(&:to_hash)
Additionally, what happens if no header is specified? /cc @benbalter
Hashrocket added.
As for headers, if you didn't have headers in the CSV, there would be no way to do things like site.members.name (as there wouldn't be anything to say it was a name), so I think it's OK for Jekyll to support a very precise definition of CSV, i.e. comma-separated with a header row. That's what most people will want to use anyway. If there isn't a header row, you'd get junk data, because the first data row would be read as the column names. There's currently no simple way to be sure whether a CSV has a header or not, so we can't really throw an error.
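To make the "junk data" case concrete, here is a minimal sketch using Ruby's standard `csv` library. The sample rows and the `rows` helper are made up for illustration and are not part of the PR; the only call taken from the diff is the `:headers => true` option followed by `map(&:to_hash)`.

```ruby
require 'csv'

# Hypothetical sample data, not taken from the PR.
with_header    = "name,github\nAlice,alice\nBob,bob\n"
without_header = "Alice,alice\nBob,bob\n"

# Illustrative helper that mirrors the call in the diff,
# but parses a string instead of reading a file.
def rows(text)
  CSV.parse(text, :headers => true).map(&:to_hash)
end

rows_with = rows(with_header)
# => [{"name"=>"Alice", "github"=>"alice"}, {"name"=>"Bob", "github"=>"bob"}]

# With no header row, the first record is silently used as the column names,
# so one record is lost and the keys are meaningless.
rows_without = rows(without_header)
# => [{"Alice"=>"Bob", "alice"=>"bob"}]

p rows_with
p rows_without
```

No exception is raised in the second case, which is why this probably has to be a documentation matter rather than an error.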
I agree that we should enforce headers. I would really like a way to show some sort of error if no headers exist. Or add a huuuge warning in the docs and the release notes should say
@benbalter may also have an idea. He works with this kind of data quite often.
We've been building http://csvlint.io recently for CSV validation, and I'm 99% sure we don't have a reliable way to autodetect headers, so I expect it'll have to be a documentation thing. Anyway, we'll see what the others say first!
I agree with @Floppy that detecting whether a CSV file has a header is unreliable in the general case. It works great on big juicy files with cells stuffed with numbers, dates, and the like, but it breaks your heart on important edge cases, including tables with few rows, or a table full of short strings. I think it'd definitely be reasonable to treat the following cases as errors:
For anything that tries to be much smarter than that, it'd be great to have a configuration switch to turn it off when predictability is important. Very happy user of the
Great set of criteria. Thinking more about it now, this kind of validation would better serve the
That could work. The core of csvlint.io is in a gem, https://github.com/theodi/csvlint.rb/. We could add the heuristic @paulfitz suggests to that, and integrate that check into
Thank you for shipping it with 2.4.0! Love it <3
Sometimes it's simplest to store data in CSV format. This PR autoloads these files as well, just like JSON or YAML.
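As a rough sketch of the idea, the extension dispatch from the diff could look like the following when pulled out into a standalone script. The helper names `read_data_file` and `load_data_dir` are hypothetical and not Jekyll's actual methods, the file names are invented, and the standard library's `YAML.load_file` stands in for `SafeYAML.load_file` so the example runs without extra gems.

```ruby
require 'csv'
require 'yaml'
require 'tmpdir'

# Hypothetical helper: pick a parser based on the file extension.
def read_data_file(path)
  case File.extname(path).downcase
  when '.csv'
    # Each row becomes a hash keyed by the header row.
    CSV.read(path, :headers => true).map(&:to_hash)
  else
    # Assumption: stdlib YAML used here in place of SafeYAML.
    YAML.load_file(path)
  end
end

# Hypothetical helper: build a hash of data keyed by file name without extension.
def load_data_dir(dir)
  data = {}
  Dir.glob(File.join(dir, '*.{yaml,yml,json,csv}')).sort.each do |path|
    key = File.basename(path, '.*')
    data[key] = read_data_file(path)
  end
  data
end

# Usage with a throwaway directory and made-up files.
loaded = Dir.mktmpdir do |dir|
  File.write(File.join(dir, 'members.csv'), "name,github\nAlice,alice\n")
  File.write(File.join(dir, 'settings.yml'), "title: Example\n")
  load_data_dir(dir)
end

p loaded['members']  # => [{"name"=>"Alice", "github"=>"alice"}]
p loaded['settings'] # => {"title"=>"Example"}
```

With data shaped like this, each CSV row ends up as a hash, so a template can look up a column by its header name.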