Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix appending to a sqlite table from a CSV with a header #77

Merged
merged 17 commits into from
Feb 9, 2015
Merged

Fix appending to a sqlite table from a CSV with a header #77

merged 17 commits into from
Feb 9, 2015

Conversation

cpcloud
Copy link
Member

@cpcloud cpcloud commented Feb 8, 2015

No description provided.


statement = """
(echo '.mode csv'; echo '.import {abspath} {tblname}';) | sqlite3 {dbpath}
pipe=$(mktemp -t pipe) && rm -f $pipe && mkfifo -m 600 $pipe && (tail +{n} {abspath} > $pipe &) && (echo '.separator {delim}'; echo ".import $pipe {tblname}";) | sqlite3 {dbpath}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The first statement is probably better if there isn't a header. Also we'll need to be sensitive to windows.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We raise an MDNotImplementedError() if os.name == 'nt', in the execution function

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we still get fast loads on windows in the no-header case?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

no, that would cause us to search for a new path, which would likely be through Iterator -> sa.Table.insert()

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i think keeping the first statement around is just going to be a bigger maintenance burden in the future. any particular reason you want to keep it?

@mrocklin
Copy link
Member

mrocklin commented Feb 8, 2015

i think keeping the first statement around is just going to be a bigger maintenance burden in the future. any particular reason you want to keep it?

If the user is on windows and if there is no header then it is useful and fast. Could code be something like the following?

if not has_header:
    statement = ...
elif not os.name = 'nt':
    statement = ...
else:
    raise NotImplementedError()

@cpcloud
Copy link
Member Author

cpcloud commented Feb 8, 2015

no the commands with be different because batch commands are different than shell commands. echoing newlines in particular is very different

@cpcloud
Copy link
Member Author

cpcloud commented Feb 8, 2015

i'll work out a windows solution at least for things without headers

@cpcloud
Copy link
Member Author

cpcloud commented Feb 8, 2015

well it turns out that sqlite takes arguments so that's good. we may be able to remove some code

@cpcloud
Copy link
Member Author

cpcloud commented Feb 9, 2015

i think is ready to go, we're passing on windows, i've xfailed the tests for csvs that have a header

@mrocklin
Copy link
Member

mrocklin commented Feb 9, 2015

+1

cpcloud added a commit that referenced this pull request Feb 9, 2015
Fix appending to a sqlite table from a CSV with a header
@cpcloud cpcloud merged commit b9097ac into blaze:master Feb 9, 2015
@cpcloud cpcloud deleted the fix-sqlite-csv-with-header branch February 9, 2015 01:01
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants