Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add DB layer and API for the Autofill Component #3582

Merged
merged 1 commit into from Sep 21, 2020
Merged

Conversation

lougeniaC64
Copy link
Contributor

@lougeniaC64 lougeniaC64 commented Sep 15, 2020

Fixes #3521 and #3522

Pull Request checklist

  • Quality: This PR builds and tests run cleanly
    • automation/all_tests.sh runs to completion and produces no failures
    • Note: For changes that need extra cross-platform testing, consider adding [ci full] to the PR title.
  • Tests: This PR includes thorough tests or an explanation of why it does not
  • Changelog: This PR includes a changelog entry in CHANGES_UNRELEASED.md or an explanation of why it does not need one
    • Any breaking changes to Swift or Kotlin binding APIs are noted explicitly
  • Dependencies: This PR follows our dependency management guidelines
    • Any new dependencies are accompanied by a summary of the due dilligence applied in selecting them.

Copy link
Contributor

@linabutler linabutler left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks like a terrific first cut, @lougeniaC64, thank you so much! I left some comments and nits throughout, but they’re suggestions, not hard blockers. I think after @mhammond has a look, we can get this landed quickly and iterate on it with PRs instead of having a big long-running branch.

Rust forms is aliiiiive! It’s aliiiiiive! ⚡⚡

components/forms/sql/create_shared_schema.sql Outdated Show resolved Hide resolved
time_last_modified INTEGER NOT NULL,
times_used INTEGER NOT NULL DEFAULT 0,

sync_change_counter INTEGER NOT NULL DEFAULT 1
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It’s a bit weird, but the mirror is a copy of the server state the last time we synced...it’s only written to by sync, when it fetches new records, and after uploading any local records. So it doesn’t need a sync_change_counter.

sync_change_counter INTEGER NOT NULL DEFAULT 1
);

CREATE TABLE IF NOT EXISTS addresses_tombstones (
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How do you feel about representing tombstones “inline”, like logins does—like a time_deleted field in addresses_data—instead of in their own table? On the one hand, it means dropping the NOT NULL constraints on data, or writing more complicated CHECK constraints. On the other, it avoids bugs like ones we’ve had in bookmarks, where we accidentally forgot to add a tombstone for a deleted item, or forgot to remove a tombstone so the same GUID showed up in data and tombstones?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've definitely seen various ways for representing deleted records so this is very useful information. Avoiding bugs is ideal so I'll make these changes. Ignore my comments above on the NULL vs NOT NULL database constraints. 😬

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The other downside of that is that there's a risk that items have values and the deleted flag, which would be bad. It also means that almost every query must remember to have an extra condition to check the deleted flag - which would be easy to forget as testing will rarely show it up as the vast majority of time there will be no tombstones.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

From yesterday's meeting: let's keep tombstones in their own table. If we want to make sure a GUID doesn't end up in both, we can write a pair of triggers on addresses_data and addresses_tombstones that check if the GUID we're inserting exists in the other table, and does a RAISE(FAIL, ...) if so. Keeping the tombstones inline means we have to NULL out all the fields, and it's still possible to have weird stuff like a row with a name, address, and time_deleted.


CREATE TABLE IF NOT EXISTS credit_cards_data (
guid TEXT NOT NULL PRIMARY KEY,
cc_name TEXT NOT NULL, -- full name
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The same question as for nullable fields in addresses. Would it be worth relaxing the constraints so that only guid and cc_number (and time_created and time_last_modified) are required, in case we can’t capture all the fields, or did you want to use ”” for values we don’t know?

- check if deleted on create and/or update?
*/

fn add_address(tx: &Transaction<'_>, mut address: Address) -> Result<Address> {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Very nice! 💯

If you’re open to an API suggestion and it doesn’t complicate the code too much, how do you feel about splitting out the editable fields into their own struct (like NewAddressFields), and passing that in? You could then create an Address from AddressFields, with all the “internal” fields (GUID, timestamps, use counts) set in this function, and return that Address with the internal fields set like you have now!

I’m suggesting that because it feels a bit awkward for a caller to pass dummy fields in an Address—even if they’re zero or ””—and then having this function mutate them and return the modified address back.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh yes, I was just running into complications from this as I completed the CRUD functions for addresses. 👍🏾

#[serde(rename = "id")]
pub guid: Guid,

pub given_name: String,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If the database columns are nullable, these could be Option<String>.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But why is that a good thing? :)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When I wrote this at first, I was thinking you'd need to handle the field not existing—so if it was Option<String>, you'd need to remember to check Some and None, but with a regular String, you might forget to check if given_name.is_empty(). But like you said yesterday, you could still end up with Some(""), and it makes the API so clunky to handle it that it's not a good thing 😅

components/forms/sql/create_shared_schema.sql Outdated Show resolved Hide resolved
Ok(address)
}

fn get_address(conn: &Connection, guid: &Guid) -> Result<Option<Address>> {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let’s return an AddressDoesNotExist error if the address doesn’t exist, and simplify the return signature to just Result<Address>. I’m thinking, if the caller wants to distinguish between address not found and other kinds of DB errors, it can destructure the error instead of doing a double-unwrap.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This was an oversight on my part but the query_row function I'm using to execute my sql statement actually returns an error if no rows are found. I'm going to remove the Option from the return type but let me know if I should convert the QueryReturnedNoRows type to a custom error.

|row| {
Ok(Address {
guid: Guid::from_string(row.get::<_, String>("guid")?),
given_name: row.get::<_, String>("given_name")?,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
given_name: row.get::<_, String>("given_name")?,
given_name: row.get("given_name")?,

I’m not sure if this would compile (and please ignore me if you tried this already, and I’m talking nonsense!), but the type checker might be smart enough to infer this is a String without having to write out the turbofish.

Also, holy moly, so many fields! 😅

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It compiles! Love it, so much cleaner.

Comment on lines 17 to 18
#[error("Quota exceeded: {0:?}")]
QuotaError(QuotaReason),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I know this came from webext_storage, but forms doesn’t have quotas, so let’s nix these (and InvalidConnectionType, ConnectionAlreadyOpen, and WrongApiForClose, too, I think).

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yep - I had this comment queued too - remove anything that's not actually used :)

@lougeniaC64
Copy link
Contributor Author

This looks like a terrific first cut, @lougeniaC64, thank you so much! I left some comments and nits throughout, but they’re suggestions, not hard blockers. I think after @mhammond has a look, we can get this landed quickly and iterate on it with PRs instead of having a big long-running branch.

Rust forms is aliiiiive! It’s aliiiiiive! ⚡⚡

Thanks! I have the remainder of the addresses API committed locally but I'm going to make these updates before I push them. TL;DR this is a very very WIP PR.

Copy link
Member

@mhammond mhammond left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks great Lougenia, nice work! As another meta comment, it looks like maybe rustfmt wasn't run? I see some odd formatting and missing line-ends at the end of files.

But this is awesome, and I agree with Lina that after another pass with our nits we should land it 🎉

components/forms/sql/create_shared_schema.sql Outdated Show resolved Hide resolved
tel TEXT, -- Stored in E.164 format
email TEXT,

-- TODO: removed computed fields, need to figure out
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we should consider removing them now.

sync_change_counter INTEGER NOT NULL DEFAULT 1
);

CREATE TABLE IF NOT EXISTS addresses_tombstones (
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The other downside of that is that there's a risk that items have values and the deleted flag, which would be bad. It also means that almost every query must remember to have an extra condition to check the deleted flag - which would be easy to forget as testing will rarely show it up as the vast majority of time there will be no tombstones.

#[serde(rename = "id")]
pub guid: Guid,

pub given_name: String,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But why is that a good thing? :)

Cargo.toml Outdated
@@ -1,6 +1,7 @@
[workspace]
# Note: Any additions here should be repeated in default-members below.
members = [
"components/forms",
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As a meta comment, I wonder if "forms" really is the correct name for the component? forms carries a bit of baggage from desktop - there's both an engine and a collection named 'forms', but this component is completely unrelated to both of them. Maybe "autofill"?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I actually really like this idea because even without the desktop baggage the word "forms" is still quite ambiguous. "Autofill" on the other hand is much more meaningful. I just landed all the local updates I was working on so I'll push a rename next.

time_deleted INTEGER NOT NULL
) WITHOUT ROWID;

CREATE TABLE IF NOT EXISTS credit_cards_data (
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we should removing the CC table too if we don't see ourselves populating it in the short term - it might mislead the casual reader. It also allows us to side-step the requirements around client-side encryption of the CC number etc. Leaving it in as a comment saying like "we think the CC table might one day look like..." might be fine too.


#[serde(default)]
#[serde(deserialize_with = "deserialize_timestamp")]
pub time_created: i64,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think I'd rather see a distinct type for timestamp fields - it's worked out quite well in places, and I wish I'd done that in webext-storage, particularly when integrating with desktop.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are you referring to this?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yep!

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since this would be used in more than one component should it the types logic be moved out of the places component? Would support be a good place for it?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

support sounds perfect!

Comment on lines 17 to 18
#[error("Quota exceeded: {0:?}")]
QuotaError(QuotaReason),
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yep - I had this comment queued too - remove anything that's not actually used :)

@codecov-commenter
Copy link

codecov-commenter commented Sep 17, 2020

Codecov Report

Merging #3582 into main will increase coverage by 0.19%.
The diff coverage is 92.07%.

Impacted file tree graph

@@            Coverage Diff             @@
##             main    #3582      +/-   ##
==========================================
+ Coverage   76.77%   76.96%   +0.19%     
==========================================
  Files         230      237       +7     
  Lines       30928    31319     +391     
==========================================
+ Hits        23744    24104     +360     
- Misses       7184     7215      +31     
Impacted Files Coverage Δ
components/autofill/src/error.rs 0.00% <0.00%> (ø)
examples/autofill-utils/src/autofill-utils.rs 0.00% <0.00%> (ø)
components/support/types/src/lib.rs 88.23% <88.23%> (ø)
components/autofill/src/db.rs 92.30% <92.30%> (ø)
components/autofill/src/api.rs 99.31% <99.31%> (ø)
components/autofill/build.rs 100.00% <100.00%> (ø)
components/autofill/src/schema.rs 100.00% <100.00%> (ø)
... and 4 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 7762c4b...c369ecd. Read the comment docs.

Copy link
Contributor

@linabutler linabutler left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great! I left a few comments, but looks wonderful on the whole, let's :shipit: and iterate!

sync_change_counter INTEGER NOT NULL DEFAULT 1
);

CREATE TABLE IF NOT EXISTS addresses_tombstones (
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

From yesterday's meeting: let's keep tombstones in their own table. If we want to make sure a GUID doesn't end up in both, we can write a pair of triggers on addresses_data and addresses_tombstones that check if the GUID we're inserting exists in the other table, and does a RAISE(FAIL, ...) if so. Keeping the tombstones inline means we have to NULL out all the fields, and it's still possible to have weird stuff like a row with a name, address, and time_deleted.

#[serde(rename = "id")]
pub guid: Guid,

pub given_name: String,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When I wrote this at first, I was thinking you'd need to handle the field not existing—so if it was Option<String>, you'd need to remember to check Some and None, but with a regular String, you might forget to check if given_name.is_empty(). But like you said yesterday, you could still end up with Some(""), and it makes the API so clunky to handle it that it's not a good thing 😅


#[serde(default)]
#[serde(deserialize_with = "deserialize_timestamp")]
pub time_created: i64,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yep!

#[serde(rename = "id")]
pub guid: Guid,

pub address_fields: NewAddressFields,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: Since the type is already called Address, how do you feel about fields instead of address_fields? Writing out, say, address.address_fields.tel feels a little redundant.

#[serde(default)]
pub times_used: i64,

pub sync_change_counter: i64,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's not expose this as part of the API—and let's make it either pub(crate), or have a test-only helper to read the change counter. I'm thinking because the change counter is an implementation detail that gets bumped whenever an address is created or modified, but consumers don't need to know that's how it works under the hood.

tel: row.get("tel")?,
email: row.get("email")?,
};
Ok(Address {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would you mind factoring this out into an Address::from_row helper function that get_all_addresses and get_address could use?

|row| row.get(0),
)?;

if exists {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmmm, I think this is missing a DELETE FROM addresses_data to drop the row?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I honestly wasn't sure how this worked so thank you for the clarification. And for sure, as you stated below, this simplifies the "exists" check.

components/forms/src/api.rs Outdated Show resolved Hide resolved
time_last_used,
time_last_modified,
times_used,
sync_change_counter
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks like another good place to use COMMON_COLS.

sync_change_counter
FROM addresses_data d
WHERE guid = :guid
AND NOT EXISTS (
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If delete_address deletes the row from addresses_data (and we rely on a trigger or something to make sure a GUID can't be in both addresses_data and addresses_tombstones), we can avoid the NOT EXISTS checks 😄

@lougeniaC64 lougeniaC64 force-pushed the forms-api branch 2 times, most recently from c4c12a7 to 59a1e13 Compare September 18, 2020 23:02
@lougeniaC64 lougeniaC64 changed the title [WIP] Forms DB and API Autofill Component DB and API Sep 18, 2020
@lougeniaC64 lougeniaC64 changed the title Autofill Component DB and API Add DB layer and API for the Autofill Component Sep 18, 2020
@lougeniaC64 lougeniaC64 marked this pull request as ready for review September 18, 2020 23:03
@lougeniaC64
Copy link
Contributor Author

FYI: I didn't include updating the places component to use the Timestamp logic that I copied into the support component because this PR is big enough as is. I'll handle that in a follow-on PR. 😬

Copy link
Member

@mhammond mhammond left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Lovely work Lougenia - let's get it landed!

@lougeniaC64 lougeniaC64 merged commit 60ab9d1 into main Sep 21, 2020
@lougeniaC64 lougeniaC64 deleted the forms-api branch September 21, 2020 14:57
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Create basic Forms API
4 participants