This repository has been archived by the owner on Nov 28, 2023. It is now read-only.

Deployed front-end doesn't reload to include latest data #10

Closed
kairstenfay opened this issue Jul 3, 2020 · 12 comments · Fixed by #12
Comments

@kairstenfay
Contributor

It's unclear at this point whether this is a deployment configuration issue (and thus should be filed at seattleflu/backoffice) or something that can be fixed in this repo.

In the past couple of days, the deployed app at backoffice.seattleflu.org/switchboard has failed to update the front-end with the latest data in the sqlite database.

See this Slack thread for the steps I took to debug, as I'm unsure how much barcode/REDCap link sharing I should be doing in a public setting.

@kairstenfay
Contributor Author

From @tsibley:

My guess is that this issue is related to how we're updating the SQLite file. Particularly if Datasette holds open file handles to it.

If we can replicate locally, then using lsof to look at the inodes the Datasette processes have open and comparing to the inode of the latest SQLite file would be helpful next time this occurs.
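A minimal sketch of that check (assuming the scan-switchboard service name and the data/scan-redcap.sqlite path used elsewhere in this thread):

    # PID of the running Datasette process
    pid=$(systemctl show --property MainPID --value scan-switchboard)

    # SQLite file descriptors Datasette currently holds open; note the inode
    # column and any "(deleted)" annotation
    lsof -p "$pid" | grep -F .sqlite

    # Inode of the SQLite file currently on disk; a mismatch with the inodes
    # above means Datasette is still reading the old, deleted file
    stat -c %i data/scan-redcap.sqlite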

@tsibley
Member

tsibley commented Jul 6, 2020

Reproducing locally might also be easier by running Datasette single-threaded so that test requests are sure to hit the same process/thread.

In this code:

https://github.com/seattleflu/scan-switchboard/blob/92a9a9ecbbe9e035f51821bad60fd611d79e1ade/Makefile#L6-L8

We're not updating the SQLite file in place, but replacing it with a different file. That was intentional, to avoid a bad update leaving behind an existing-but-broken database file, but it may not play nice with Datasette. Casual testing during development seemed to show that Datasette/SQLite copes just fine, but more rigorous testing may show otherwise.
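For context, the linked recipe builds the database under a temporary name and then renames it over the old file, roughly like this (a paraphrase of the Makefile lines, not an exact copy):

    data/scan-redcap.sqlite: data/record-barcodes.ndjson derived-tables.sql
    	sqlite-utils insert --nl $@.new record_barcodes $<
    	sqlite3 $@.new < derived-tables.sql
    	mv $@.new $@  # the rename swaps in a new file, so the inode changes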

@tsibley
Member

tsibley commented Jul 6, 2020

Fortuitously, this just happened again in production. I grabbed an lsof snapshot of the Datasette process (PID reported by systemctl status scan-switchboard) and stat'd the current SQLite file.

[screenshot: lsof output for the Datasette process and stat of the current SQLite file]

The current SQLite file is inode 2560088, and as suspected, the Datasette threads are still holding open file descriptors for the deleted files. lsof helpfully reports this, but you can also see the inodes (last number before the file path) are different (2560067 and 2560061). They're not even consistent across threads, since the threads presumably opened the path at different times.

I think in-place updates will work if we stop re-creating/overwriting the database file every time and instead only update it in place via SQL commands. But we should test this robustly locally, and to do so consistently we'll need to make sure Datasette is running single-threaded. (Not sure if that's possible, though?)
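A quick way to confirm an update really is in place (a sketch, assuming the same data path as above):

    stat -c %i data/scan-redcap.sqlite  # note the inode
    make -B data/scan-redcap.sqlite     # rebuild with the candidate recipe
    stat -c %i data/scan-redcap.sqlite  # should print the same inode if the file was not replaced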

@kairstenfay kairstenfay self-assigned this Jul 6, 2020
@joverlee521

Can we just update in-place via cp?

cp $@.new $@
rm $@.new
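For what it's worth, cp and mv differ on exactly the point at issue: cp opens the existing destination and rewrites its contents, keeping the same inode, while mv (a rename) swaps in a new inode. A throwaway demonstration (illustrative file names only):

    echo old > demo.db
    stat -c %i demo.db   # original inode

    echo new > demo.tmp
    cp demo.tmp demo.db
    stat -c %i demo.db   # same inode: contents overwritten in place

    echo newer > demo.tmp
    mv demo.tmp demo.db
    stat -c %i demo.db   # different inode: the file itself was replaced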

@kairstenfay
Contributor Author

kairstenfay commented Jul 7, 2020

Working on this now, but wanted to say that reverting commit de07e5c4 produces the same error (Datasette still pointing to a deleted sqlite file).

Same goes for @joverlee521's cp/rm suggestion, unfortunately.

@kairstenfay
Contributor Author

kairstenfay commented Jul 7, 2020

If I understand @tsibley correctly, updating the sqlite file in place also results in the same error.

Never mind; further testing shows it works fine with this in-place recipe:

data/scan-redcap.sqlite: data/record-barcodes.ndjson derived-tables.sql
	sqlite-utils insert --nl $@ record_barcodes $<
	sqlite3 $@ < derived-tables.sql

@kairstenfay
Contributor Author

kairstenfay commented Jul 7, 2020

Ah, my original testing process was flawed. It appears both @joverlee521's solution,

data/scan-redcap.sqlite: data/record-barcodes.ndjson derived-tables.sql
	sqlite-utils insert --nl $@.new record_barcodes $<
	sqlite3 $@.new < derived-tables.sql
	cp $@.new $@
	rm $@.new

, and updating in place (my code in the previous comment) avoid the Datasette threads holding open file descriptors to deleted files.
However, the in-place solution does not remove deleted barcodes.

@kairstenfay
Contributor Author

kairstenfay commented Jul 7, 2020

Description of my local testing:

  1. restart the service with systemctl restart scan-switchboard
  2. run make -B but with a project, e.g. SCAN IRB English, removed.
  3. run make -B again but including the project removed in step 2.
  4. see if the web app can access the newly included data

@tsibley
Member

tsibley commented Jul 7, 2020

Testing with Datasette using multiple threads is likely to give misleading results, as your different requests may hit different threads which only open the SQLite file as necessary. This is why different threads in the lsof output above had different inodes opened.

Using cp instead is a good thought! But I'm wary of overwriting the entire SQLite file in-place, as that may modify other parts of the SQLite file structure behind Datasette's back and lead to other subtle issues. Depending on how the SQLite libraries handle this, it might be ok, but I don't know SQLite's guarantees here.

One option might be to leave the SQLite file itself alone and instead do the replacement atomically within SQL. That is, within a single transaction, truncate the data table before insert. I added support to sqlite-utils for this yesterday in a PR, thinking we might need it.
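If and when that flag lands, the recipe could keep writing to the same file and let sqlite-utils clear and repopulate the table within one transaction, something like (a sketch based on the --truncate flag proposed in that PR):

    data/scan-redcap.sqlite: data/record-barcodes.ndjson derived-tables.sql
    	sqlite-utils insert --nl --truncate $@ record_barcodes $<
    	sqlite3 $@ < derived-tables.sql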

@tsibley
Member

tsibley commented Jul 7, 2020

The num_sql_threads option to Datasette allows specifying how many Python threads are used to open the SQLite file and run queries. If I set it to 1 by:

pipenv run ./bin/serve --config num_sql_threads:1

then overwrite the file in-place with cp and make a new request, I see this error in the logs:

ERROR: conn=<sqlite3.Connection object at 0x7f27e1bfd1f0>, sql = 'select count(*) from [duplicate_record_ids]', params = None: database disk image is malformed

@kairstenfay
Contributor Author

> The num_sql_threads option to Datasette allows specifying how many Python threads are used to open the SQLite file and run queries. If I set it to 1 by:
>
> pipenv run ./bin/serve --config num_sql_threads:1
>
> then overwrite the file in-place with cp and make a new request, I see this error in the logs:
>
> ERROR: conn=<sqlite3.Connection object at 0x7f27e1bfd1f0>, sql = 'select count(*) from [duplicate_record_ids]', params = None: database disk image is malformed

Hey Tom, where do you see this error? I invoked ./bin/serve with the num_sql_threads option, and I don't see that error in my syslog, systemctl status, or journalctl output.

@tsibley
Member

tsibley commented Jul 7, 2020

It's in the terminal output when I run

pipenv run ./bin/serve --config num_sql_threads:1

tsibley added a commit that referenced this issue Jul 8, 2020
…file

This moves the update from the filesystem layer into the SQL layer, thus
allowing multiple processes to coordinate.  Datasette holds open SQLite
connections and was keeping references to the deleted files.

Since the Makefile recipe got more complicated, move it to a shell
script.

Resolves #10.
tsibley added a commit that referenced this issue Jul 8, 2020
…file

This moves the update from the filesystem layer into the SQL layer, thus
allowing multiple processes to coordinate.  Datasette holds open SQLite
connections and was keeping references to the deleted files.

The new --truncate option to `sqlite-utils insert` is added in a PR I
submitted <simonw/sqlite-utils#118>.  For now,
sqlite-utils is installed from our fork.  When/if --truncate is released
officially, we can switch back to installing from PyPI.

Resolves #10.
@tsibley tsibley closed this as completed in d5bd474 Jul 8, 2020