Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Applications using both gtk3 and sqlcipher do not work #385

Closed
eli-schwartz opened this issue Mar 18, 2021 · 13 comments
Closed

Applications using both gtk3 and sqlcipher do not work #385

eli-schwartz opened this issue Mar 18, 2021 · 13 comments

Comments

@eli-schwartz
Copy link

Expected Behavior

An sqlcipher-using program should not care what UI toolkits or other dependencies use indirectly. Opening databases exclusively via the sqlcipher API should work in a self-contained manner.

Actual Behavior

Recent versions of gtk3 directly link to upstream sqlite3 to provide generic system features.

Applications that link to both gtk3 and sqlcipher report data corruption due to inability to open the database, then offer to delete the encrypted database and start afresh.

See for example:
signalapp/Signal-Desktop#4513 (comment)
https://bugs.archlinux.org/task/69980

Steps to Reproduce

Install Signal Desktop or Element Desktop on Arch Linux, and downgrade gtk3 to 3.24.27-3.
Try to access encrypted messages.

Recommended fix

Since sqlcipher provides an overlapping API with sqlite, it is not safe to load both libraries into the same process space. The symbols will clash and unpredictably resolve one or the other.

Symbol clashing can be fixed! Please consider using a GNU ld version script: https://sourceware.org/binutils/docs-2.29/ld/VERSION.html

The build-time API will remain the same. The ABI will change, and after relinking, programs which use e.g. sqlite3_open via sqlcipher will resolve the symbol sqlite3_open@SQLCIPHER (or however you choose to name it) which ensures the symbol comes from libsqlcipher.so rather than libsqlite3.so

SQLCipher version: 4.4.2

@sjlombardo
Copy link
Member

Hello @eli-schwartz, thanks for your interest in SQLCipher. This situation with Signal is serious and undesirable, however, the proposed change to SQLCipher core may not be appropriate in this case. SQLCipher is intended solely as a drop-in replacement for SQLite. The exact duplication of the API and symbols is deliberate because the intended use is for it to be loaded instead of SQLite, not alongside it.

That said, this issue can and should be handled downstream at the application level:

  1. While it may feel like a workaround, using LD_PRELOAD is a legitimate approach here because it will substitute the system SQLite with SQLCipher which is the intended usage model;
  2. Signal could statically link SQLCipher so that it cannot be substituted at runtime by another library; or
  3. The Signal application build could include its own ld version script when it builds SQLCipher from source as you proposed

That said, as noted before in other similar tickets, we recommend against linking multiple versions of SQLite/SQLCipher simultaneously because it is potentially dangerous and introduces a documented risk of database corruption.

Also, as an important point of clarification, this issue is not actually corrupting the Signal database as reported in signalapp/Signal-Desktop#4513 and https://bugs.archlinux.org/task/69980. Rather, when the system libsqlite3.so is being used by Signal and it encounters a pre-existing encrypted database, it simply does not know how to open it. The library reports that it is not a known database format, but the database is just fine. It is the Signal application behavior that offers to delete the database and start over (which is perhaps a dangerous thing to do).

We usually advise that applications using SQLCipher via dynamic linking perform a runtime check to verify that the expected libsqlcipher.so library is being used. Executing the PRAGMA cipher_version statement provides a simple way for an application to do this. SQLCipher will report its version and build type, which can be checked before use of the library. If a non-SQLCipher library is loaded instead the statement will return an empty result set, and the application can fail safely. This might be a good addition to Signal to help prevent unexpected behavior like this.

Finally, in addition to the potential data loss / functional issue for existing Signal users, this issue risks data exposure for new users. The use of the system SQLite will not throw any errors if an encrypted database does not already exist. I suspect a new install will just leave the user's message database unencrypted on the file system. A runtime version check will help address this problem as well.

Let us know if this rationale makes sense or if you have any other thoughts or questions.

@eli-schwartz
Copy link
Author

eli-schwartz commented Mar 19, 2021

I did acknowledge that applications may report this, then offer to delete something they shouldn't. This is of course not the fault of sqlcipher. But it is an example of real world effects.

I tried to make it clear that the larger issue here is sqlcipher is incorrect for assuming that an application gets to choose to only link one of (sqlcipher, libsqlite) and not the other. But maybe I wasn't clear.

Applications that have NO dependencies other than sqlcipher and a system libc can choose this. It is not safe, sane, or reasonable to assume any other kind of software gets to choose. Gtk3 demonstrates this by providing operating system level functionality that is supposed to use boring old regular unencrypted sqlite databases. Gtk3 "tracker" filesystem indexing runs in the same process space as, and (this is important) across a project boundary from, "signal, a gtk3 application that uses sqlcipher for the signal database".

Running two different sql libraries, each on a different database, in the same process space... is not fundamentally a bad or wrong thing to do. The problem is when they both bind to the same symbol namespace, and the C compiler cannot instruct the binary at runtime to use the same version of the symbol that each location was actually linked to.

I don't understand why it would be inappropriate for sqlcipher core. Users and developers still compile against "sqlite_foo", not "sqlite_foo@SQLCIPHER", this is entirely just compiler symbol mangling to avoid runtime collision.

Unless you want people to be able to compile applications against libsqlite, distribute the binaries to users, and let the users secretly swap in sqlcipher, collide the symbols, and trick the binary into successfully reading encrypted databases???

@eli-schwartz
Copy link
Author

Note that cipher_version seems like it would not be useful for any purpose other than to gracefully error out and say "sorry, Mr. User, we can't find the right sql library so we have no idea what to do. Your system is broken. Sorry."

Which is admittedly better than claiming the database is broken and deleting data. But I can't help feeling that the right solution is to make gtk3 applications actually able to use sqlcipher.

@eli-schwartz
Copy link
Author

Note also that signal happens to use a forked private copy of sqlcipher that doesn't do the right thing, but other applications use the system sqlcipher, I mentioned for example element-desktop. Those can't be fixed by statically compiling, or by version scripts on a system loaded external .so

@sjlombardo
Copy link
Member

sjlombardo commented Mar 22, 2021

Hello @eli-schwartz. Before going any further into discussion about whether this should be included in SQLCipher core, we decided to run a proof of concept to determine if this approach is a viable option to solve the original problem.

This involved setting up a test case as follows. First, libsqlcipher.so was built using the standard SQLCipher source release, but with the addition of the following version script included via -Wl,--version-script=version.map

SQLCIPHER_4 {
  global: sqlite3*;
  local: *;
};

The resulting .so was verified to ensure it included the version tag via nm

$ nm libsqlcipher.so
...
0000000000000000 A SQLCIPHER_4
...

Next, we created two simple shared libraries that simply invoke some basic functions of the sqlite3_* API (e.g sqlite3_open, sqlite3_exec, etc.) and print version information.

The first, liba.so, is linked against the system SQLite (libsqlite3.so via libsqlite3-dev on Ubuntu 20.04.02 LTS):

$ ldd liba.so 
	...
	libsqlite3.so.0 => /usr/lib/x86_64-linux-gnu/libsqlite3.so.0 (0x00007fbe0608e000)
	...

The second, libb.so, is linked against the libsqlcipher.so library prepared in the previous step.

$ ldd libb.so 
	...
	libsqlcipher.so.0 => ../sqlcipher/.libs/libsqlcipher.so.0 (0x00007fcc8e404000)
	...
	libcrypto.so.1.1 => /usr/lib/x86_64-linux-gnu/libcrypto.so.1.1 (0x00007fcc8ddcd000)
	...

nm shows libb.so is referencing the intended versioned symbols:

$ nm libb.so 
...
0000000000001279 T run_test_b
                 U sqlite3_close@@SQLCIPHER_4
                 U sqlite3_column_text@@SQLCIPHER_4
                 U sqlite3_column_type@@SQLCIPHER_4
                 U sqlite3_exec@@SQLCIPHER_4
                 U sqlite3_finalize@@SQLCIPHER_4
                 U sqlite3_libversion@@SQLCIPHER_4
                 U sqlite3_open@@SQLCIPHER_4
                 U sqlite3_prepare_v2@@SQLCIPHER_4
                 U sqlite3_step@@SQLCIPHER_4
...

Finally a simple driver program was linked with -la -lb to call the test function in each library.

Unfortunately, even when using the version script for libsqlcipher.so, if the system libsqlite3.so (without versioning) is resolved first, it is used for all the sqlite3* symbols including for libb.so.

To round out the experiment we did prepare a specialized standalone build of libsqlite3.so which used its own version script with a different version tag. Only in this case, where both libsqlite3.so and libsqlcipher.so were versioned, did the linker use distinct sqlite3_* APIs for the two libraries.

Based on these results, it seems that in order for this to work consistently, both the libsqlite3.so and libsqlcipher.so would need to be build with version scripts on Linux. When only SQLCipher is built with a version script, the non-versioned symbols from libsqlite3.so may be substituted. SQLite core and the precompiled libraries included with Linux distributions, omit the requisite version information. This results in the same undesirable behavior.

I will be the first to admit that my knowledge of DSO versioning is limited. On the other hand, there are posts on other forums which describe this same behavior, including in the context of problems substituting specific versions of SQLite (albeit with Firefox).

If you are able to provide advice on how to alter this linker behavior, or you can prepare a working reference example that demonstrates this solution as intended, we'd be happy to take a look at it and discuss further.

primeos added a commit to primeos/nixpkgs that referenced this issue May 14, 2021
AFAIK this is the only reliable way for us to ensure SQLCipher will be
loaded instead of SQLite. It feels like a hack/workaround but according
to the SQLCipher developers [0] "this issue can and should be handled
downstream at the application level: 1. While it may feel like a
workaround, using LD_PRELOAD is a legitimate approach here because it
will substitute the system SQLite with SQLCipher which is the intended
usage model;".

This fixes NixOS#108772 for NixOS 20.09 users who upgrade to NixOS 21.05 and
replaces NixOS#117555.

For nixos-unstable users this will unfortunately break everything again
so we should add a script to ease the transition (in a separate commit
so that we can revert it for NixOS 21.05).

[0]: sqlcipher/sqlcipher#385 (comment)
@rfjakob
Copy link

rfjakob commented Dec 4, 2021

Landed here because my Signal installation exploded, but back on the topic of symbol versioning:

Reading https://maskray.me/blog/2020-11-26-all-about-symbol-versioning#linker-behavior suggests that libb.so should have referenced sqlite3_close@SQLCIPHER_4 (single "@") instead of sqlite3_close@@SQLCIPHER_4 (double "@@").

@sjlombardo
Copy link
Member

Hey @rfjakob - thanks for taking a look at this. Can you provide any advice on how to link so that the single "@" is used? I've read through the document you linked to, but don't see any guidance on how to do so.

@heftig
Copy link

heftig commented Dec 6, 2021

signalapp/better-sqlite3#3 should have made it irrelevant since the symbols are no longer exposed.

PS: Whoops, misinterpreted this issue, sorry. This one is about sqlcipher in general, not just in the context of Signal/Electron.

@heftig
Copy link

heftig commented Dec 6, 2021

Hey @rfjakob - thanks for taking a look at this. Can you provide any advice on how to link so that the single "@" is used? I've read through the document you linked to, but don't see any guidance on how to do so.

I think you would need to change the headers and add __attribute__ ((__symver__ ("sqlite3_<name>@SQLCIPHER_4"))) to each item.

@sjlombardo
Copy link
Member

After reviewing many related issues, it seems that almost all dependent projects have been able to resolve these issues either via linking changes or preloading, with no invasive changes required to SQLCipher. I will close this for now, as we haven't had any other reported issues in more than a year.

@sergiomb2
Copy link

After reviewing many related issues, it seems that almost all dependent projects have been able to resolve these issues either via linking changes or preloading, with no invasive changes required to SQLCipher. I will close this for now, as we haven't had any other reported issues in more than a year.

I don't agree with you , it is required change SQLCipher to not clash with SQLLite , you should find a way to change all symbols
sqlcipher should not provide an overlapping API with sqlite and should be safe to load both libraries into the same process space and fix the possibility of " symbols clash and unpredictably resolve of one or the other library".

@developernotes
Copy link
Member

Hi @sergiomb2,

sqlcipher should not provide an overlapping API with sqlite and should be safe to load both libraries into the same process space and fix the possibility of " symbols clash and unpredictably resolve of one or the other library".

This isn't just an issue with SQLCipher, nor of addressing duplicate symbols; linking multiple versions of SQLite into the same application pose a database corruption possibility 1.

Footnotes

  1. https://sqlite.org/howtocorrupt.html#multiple_copies_of_sqlite_linked_into_the_same_application

@sergiomb2
Copy link

Hi @sergiomb2,

sqlcipher should not provide an overlapping API with sqlite and should be safe to load both libraries into the same process space and fix the possibility of " symbols clash and unpredictably resolve of one or the other library".

This isn't just an issue with SQLCipher, nor of addressing duplicate symbols; linking multiple versions of SQLite into the same application pose a database corruption possibility 1.

Footnotes

1. https://sqlite.org/howtocorrupt.html#multiple_copies_of_sqlite_linked_into_the_same_application [↩](#user-content-fnref-1-1a7c95ca657bacfedd2fdd3919a177c6)

Quoting "Recent versions of gtk3 directly link to upstream sqlite3 to provide generic system features." the link above describe exactly the same problem, when an app link to 2 versions on SQLite .

Clearly SQLite is problematic with versions, but be forced to use export LD_PRELOAD= is not a good solution , so modify all symbols with a prefix or suffix and bump the "so" version , I think is an easy thing to do

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

6 participants