Add permalink virtual field to items table #6

Open: wants to merge 1 commit into base: main
3 changes: 2 additions & 1 deletion .gitignore
@@ -5,4 +5,5 @@ __pycache__/
venv
.eggs
.pytest_cache
-*.egg-info
+*.egg-info
+build
Author comment:
This was created as a result of `pip install .`; if that's not the correct way to work locally, we can remove it.

22 changes: 22 additions & 0 deletions README.md
@@ -78,3 +78,25 @@ Run Datasette like this:
$ datasette -m metadata.json hacker-news.db

The timestamp columns will now be rendered as human-readable dates, and any HTML in your posts will be displayed as rendered HTML.

## Package Development

After cloning, install the dependencies (preferably in a virtual environment):

```sh
pip install --editable '.[test]'
```

This gives you everything you need to run and develop the package. Running the tests should now work:

```sh
pytest
```

As you make changes to the code, you can re-run it using:

```sh
.venv/bin/hacker-news-to-sqlite
```

Which should reflect your changes immediately.

Author comment:

This whole section is basically a total guess. If you have a different process (that you can either document or link) that would be super helpful!
9 changes: 9 additions & 0 deletions hacker_news_to_sqlite/cli.py
@@ -124,6 +124,15 @@ def ensure_tables(db):
            {"id": int, "type": str, "by": str, "time": int, "title": str, "text": str},
            pk="id",
        )
    # includes hidden columns
    all_column_names = {
        c[1] for c in db.execute("PRAGMA table_xinfo([items])").fetchall()
Author comment:
Generated columns are hidden and are thus not included in db['items'].column_dict
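
A minimal sketch of the difference, using Python's built-in `sqlite3` module and a simplified stand-in for the `items` table (the two-column schema is an assumption for illustration, not the project's real one). It assumes an SQLite build with generated-column support (3.31 or later). Single quotes are used for the string literal here, since that is the standard SQL form.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Simplified, hypothetical version of the items table.
conn.execute("CREATE TABLE items (id INTEGER PRIMARY KEY, title TEXT)")
conn.execute(
    "ALTER TABLE items ADD COLUMN permalink TEXT "
    "GENERATED ALWAYS AS ('https://news.ycombinator.com/item?id=' || id) VIRTUAL"
)

# In both pragmas the column name is the second field (index 1) of each row.
visible = {row[1] for row in conn.execute("PRAGMA table_info([items])")}
everything = {row[1] for row in conn.execute("PRAGMA table_xinfo([items])")}

# The generated column is still readable in ordinary queries.
conn.execute("INSERT INTO items (id, title) VALUES (22490556, 'example')")
permalink = conn.execute("SELECT permalink FROM items").fetchone()[0]

print(sorted(visible))
print(sorted(everything))
print(permalink)
```

`table_info` leaves out generated columns, so checking for `permalink` against it would try to add the column again on every run; `table_xinfo` avoids that.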

    }
    if "permalink" not in all_column_names:
        db.execute(
Author comment:
This could be improved once simonw/sqlite-utils#411 is resolved.

Author comment:
Also, I was mulling this over a bit: making the column virtual provides backwards compatibility, but it might be better to create a real column for new databases. It'll take up space storing a lot of nearly identical strings, but it won't incur repeated computation at runtime.

I was thinking about it in the context of adding a num_children column to make interacting with the string kids column easier. This can be done in pure SQLite as another virtual column, but it would be easier to pre-compute it in new databases (and provide a virtual column for existing ones).
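
A rough sketch of that idea, assuming `kids` holds a JSON array as text and that SQLite's JSON functions are available in the build. The `num_children` name comes from the comment above; the reduced schema and the `permalink_stored` name are hypothetical. It also shows why existing databases need the virtual form: SQLite does not allow a STORED generated column to be added with `ALTER TABLE`.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Simplified stand-in for the items table; kids is a JSON array stored as text.
conn.execute("CREATE TABLE items (id INTEGER PRIMARY KEY, kids TEXT)")
conn.execute("INSERT INTO items VALUES (1, '[22491039, 22490633, 22491277]')")

# Existing databases: a VIRTUAL column can be added in place and is
# computed each time it is read.
conn.execute(
    "ALTER TABLE items ADD COLUMN num_children INTEGER "
    "GENERATED ALWAYS AS (json_array_length(kids)) VIRTUAL"
)
num_children = conn.execute(
    "SELECT num_children FROM items WHERE id = 1"
).fetchone()[0]

# A STORED generated column cannot be added with ALTER TABLE, so it would
# have to be declared in CREATE TABLE when a new database is built.
try:
    conn.execute(
        "ALTER TABLE items ADD COLUMN permalink_stored TEXT "
        "GENERATED ALWAYS AS ('https://news.ycombinator.com/item?id=' || id) STORED"
    )
    stored_added = True
except sqlite3.OperationalError:
    stored_added = False

print(num_children, stored_added)
```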

            'ALTER TABLE items ADD COLUMN permalink TEXT GENERATED ALWAYS as ("https://news.ycombinator.com/item?id=" || id) VIRTUAL;'
        )

    if "users" not in db.table_names():
        db["users"].create(
            {"id": str, "created": int, "karma": int, "about": str}, pk="id"
1 change: 1 addition & 0 deletions tests/test_hacker_news_to_sqlite.py
@@ -72,6 +72,7 @@ def test_import_user(tmpdir, requests_mock):
"time": 1583377246,
"kids": "[22491039, 22490633, 22491277, 22492319, 22490883, 22491996, 22502812, 22491049, 22491052, 22491001, 22490704]",
"parent": 22485489,
'permalink': 'https://news.ycombinator.com/item?id=22490556',
"text": "The approach that has worked best for me is...",
"title": None,
}