Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

sqlite-utils memory can't deal with multiple files with the same name #325

Closed
karlb opened this issue Sep 8, 2021 · 4 comments
Closed
Labels
bug Something isn't working

Comments

@karlb
Copy link

karlb commented Sep 8, 2021

When I use multiple files with the same name, e.g. in sqlite-utils memory a/bug.csv b/bug.csv, sqlite-utils creates invalid views.

Traceback (most recent call last):
  File "/home/karl/.local/bin/sqlite-utils", line 8, in <module>
    sys.exit(cli())
  File "/home/karl/.local/pipx/venvs/sqlite-utils/lib/python3.9/site-packages/click/core.py", line 1137, in __call__
    return self.main(*args, **kwargs)
  File "/home/karl/.local/pipx/venvs/sqlite-utils/lib/python3.9/site-packages/click/core.py", line 1062, in main
    rv = self.invoke(ctx)
  File "/home/karl/.local/pipx/venvs/sqlite-utils/lib/python3.9/site-packages/click/core.py", line 1668, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
  File "/home/karl/.local/pipx/venvs/sqlite-utils/lib/python3.9/site-packages/click/core.py", line 1404, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/home/karl/.local/pipx/venvs/sqlite-utils/lib/python3.9/site-packages/click/core.py", line 763, in invoke
    return __callback(*args, **kwargs)
  File "/home/karl/.local/pipx/venvs/sqlite-utils/lib/python3.9/site-packages/sqlite_utils/cli.py", line 1299, in memory
    db[csv_table].transform(types=tracker.types)
  File "/home/karl/.local/pipx/venvs/sqlite-utils/lib/python3.9/site-packages/sqlite_utils/db.py", line 1287, in transform
    self.db.execute(sql)
  File "/home/karl/.local/pipx/venvs/sqlite-utils/lib/python3.9/site-packages/sqlite_utils/db.py", line 421, in execute
    return self.conn.execute(sql)
sqlite3.OperationalError: error in view t1: no such table: main.bug

This can be reproduced with

#!/bin/bash
mkdir foo
mkdir bar
echo -e 'col1,col2\nval1,val2' > foo/bug.csv
echo -e 'col3,col4\nval3,val4' > bar/bug.csv
sqlite-utils memory */bug.csv 'SELECT 1'

Ideally, the tables would get unique names by including the next path segment until the names are unique. But just making the numbered t* aliases work would be good enough.

This problem can of course be worked around by renaming the files, but it would be nice if this case was handled more gracefully.

Thanks a lot for this great tool!

@simonw simonw added the bug Something isn't working label Sep 22, 2021
@simonw
Copy link
Owner

simonw commented Sep 22, 2021

The t1 and t2 aliases were meant to handle this case, but the are no good if the tool throws an error.

@simonw
Copy link
Owner

simonw commented Sep 22, 2021

Oddly I can't replicate this on macOS:

(sqlite-utils) sqlite-utils % ls foo/*.csv
foo/bug.csv
(sqlite-utils) sqlite-utils % ls bar/*.csv
bar/bug.csv
(sqlite-utils) sqlite-utils % sqlite-utils memory foo/bug.csv bar/bug.csv --schema
CREATE TABLE "bug" (
   [col1] TEXT,
   [col2] TEXT
);
CREATE VIEW t1 AS select * from [bug];
CREATE VIEW t AS select * from [bug];

@simonw
Copy link
Owner

simonw commented Sep 22, 2021

Here's the relevant code:

csv_path = pathlib.Path(path)
csv_table = csv_path.stem
csv_fp = csv_path.open("rb")
rows, format_used = rows_from_file(csv_fp, format=format, encoding=encoding)

I can fix this by checking to see if csv_table is already in use and adding a suffix.

@simonw simonw closed this as completed in c1b26ee Sep 22, 2021
simonw added a commit that referenced this issue Sep 22, 2021
@simonw
Copy link
Owner

simonw commented Sep 22, 2021

Updated documentation: https://sqlite-utils.datasette.io/en/latest/cli.html#running-queries-directly-against-csv-or-json

If two files have the same name they will be assigned a numeric suffix:

$ sqlite-utils memory foo/data.csv bar/data.csv "select * from data_2"

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working
Projects
None yet
Development

No branches or pull requests

2 participants