
Design backend for Datablock Storage #55554

Closed
4 of 6 tasks
snickell opened this issue Jan 3, 2024 · 6 comments

snickell commented Jan 3, 2024

Datablock Storage (#55084) is switching project storage from a browser=>firebase connection to a browser=>rails=>mysql connection. We have tentatively determined a schema (see #55344 (comment)) and measured its perf characteristics (#55189); now we need to design the Rails backend.

  • Characterize async behavior of the existing Firebase backend
  • Determine acceptable load (in particular, invocations/s) for the Rails backend
  • Decide batch size (if batching is possible within the async semantics)
  • Write out & profile SQL queries for doing all required checks (rate limit, table count, etc) for each data block
  • Figure out how we want to implement de-duped tables, which also solves for "current table" (the live tables like Daily Weather, Spotify, and COVID caseload)
  • Design backend API

snickell commented Jan 3, 2024

Performance and async characteristics of the existing Firebase blocks

  • Our target burst rate (the existing Firebase rate) is 20 inserts/s (technically, 300 inserts per 15s).
  • Exceeding this limit results in failure messages in the code.org debug console. It does not appear to surface the failure programmatically (?).
  • A single call to the createRecord(table, record, callback) block takes 0.75s from invocation to callback.
    • We take the createRecord() callback to mean "the record is committed".
    • However, before it is committed, the new record is available instantly (measured <1ms) to a subsequent readRecords() call. Since this is faster than wire latency allows, we surmise that there is local buffering of data combined with optimistic updating.
    • Question: Is this local optimistic updating done by the official Firebase JS libraries, or by a local store we implement (redux or similar)?
    • Question: If the buffering is done by Firebase, do we need it? Will we break student code if we don't support synchronous inserts?

Performance Characteristics of our MySQL DB

Implementation Ideas

If our target were 0.75s from invocation of createRecord() to the record existing in the DB, we could easily batch calls, reducing load on our Rails backend. A strawman idea would be to flush batches every 0.5s or 1s. This would meet or exceed the existing latency until callback, and would take advantage of the improved performance of batched inserts (no need to re-acquire the lock every time; see below). A sketch follows.
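
A minimal sketch of what one flushed batch could look like, assuming the unfirebase.records schema and the MIN-lock approach described in the next comment; the channel ('shared'), table ('words'), and record JSON values are placeholders:

BEGIN;
  -- One lock acquisition covers the whole batch (see the MIN-lock
  -- discussion in the next comment).
  SELECT MIN(record_id) FROM unfirebase.records
    WHERE channel_id='shared' AND table_name='words' LIMIT 1 FOR UPDATE;
  SELECT @id := IFNULL(MAX(record_id),0) FROM unfirebase.records
    WHERE channel_id='shared' AND table_name='words';

  -- Insert everything queued during the 0.5s or 1s window in one statement.
  INSERT INTO unfirebase.records VALUES
    ('shared', 'words', @id+1, '{"word":"apple"}'),
    ('shared', 'words', @id+2, '{"word":"banana"}'),
    ('shared', 'words', @id+3, '{"word":"cherry"}');
COMMIT;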

snickell commented Jan 3, 2024

Auto-increment of RecordIDs is Tricky

Our existing record IDs are auto-incremented using a Firebase counters "table". MySQL InnoDB tables don't support auto-incrementing one column of a composite key relative to the other key columns (see, e.g., https://stackoverflow.com/questions/18120088/defining-composite-key-with-auto-increment-in-mysql).

  • We evaluated putting record_id as the first column and making it AUTO_INCREMENT, but this counter would be shared by ALL records in ALL Datablock Storage tables, and while it probably preserves the existing semantics, we think it makes for a bad student experience when your records start with huge IDs like 458849898.
  • Alternatively, we could lock on MAX(record_id) for a given channel_id and table_name, but this has race conditions under MySQL's transaction isolation levels.

The problem with MAX is that with three concurrent inserts (one starts, a second starts and then stalls, and a third starts) you can still get a duplicate ID. Since we always increase the ID when inserting, we realized we could lock on MIN instead, as a sort of "table lock" (not a SQL table lock, but a Datablock Storage table lock) that is more stable: because IDs only ever grow, the row holding the minimum ID doesn't change, so every concurrent insert contends on the same row lock.

There are probably some race conditions left with this (?), but under the common scenarios we thought of, the following looks pretty good for manually incrementing the ID on insert:

-- Enable per-query profiling so we can time each step.
SET profiling = 1;

BEGIN;
  -- Lock the row holding the minimum record_id for this Datablock Storage
  -- table: IDs only grow, so this row is stable and serializes concurrent
  -- inserts on the same (channel_id, table_name).
  SELECT MIN(record_id) FROM unfirebase.records
    WHERE channel_id='shared' AND table_name='words' LIMIT 1 FOR UPDATE;

  -- Compute the next ID (1 if the table is empty).
  SELECT @id := IFNULL(MAX(record_id),0)+1 FROM unfirebase.records
    WHERE channel_id='shared' AND table_name='words';

  INSERT INTO unfirebase.records
    VALUES ('shared', 'words', @id, '{}');
COMMIT;

SHOW PROFILES;
Profiles of the above approach look good

We profiled the above lock-then-insert approach on a Datablock Storage table with 5000 rows (shared/words). The combined steps to lock and insert a row take about a millisecond overall (the Duration column is in seconds; query 50's DO SLEEP(20) is a deliberate pause in the test session, not part of the insert cost).

# Query_ID, Duration, Query
'47', '0.00013625', 'BEGIN'
'48', '0.00029125', 'SELECT MIN(record_id) FROM unfirebase.records WHERE channel_id=\'shared\' AND table_name=\'words\' LIMIT 1 FOR UPDATE'
'49', '0.00027725', 'SELECT @id := IFNULL(MAX(record_id),0)+1 FROM unfirebase.records WHERE channel_id=\'shared\' AND table_name=\'words\'\nLIMIT 0, 2000'
'50', '20.00030250', 'DO SLEEP(20)'
'51', '0.00039750', 'INSERT INTO unfirebase.records\n     VALUES\n     (\'shared\', \'words\', @id, \'{}\')'
'52', '0.00017625', 'COMMIT'

Additionally, we used SHOW ENGINE INNODB STATUS to look at the locks being created. Using MIN means we hold only one lock. We were also able to verify that when a lock wasn't cleaned up, it only affected the one Datablock Storage table; rows in other Datablock Storage tables remained accessible.
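
For reference, a hedged sketch of how the lock can be inspected from a second session while the inserting transaction is still open. This assumes MySQL 8.0+, where performance_schema.data_locks is available; on older versions, SHOW ENGINE INNODB STATUS (as used above) is the tool:

-- Run in a second session while the inserting transaction holds its lock:
-- only rows of the one Datablock Storage table should appear.
SELECT object_schema, object_name, lock_type, lock_mode, lock_data
  FROM performance_schema.data_locks
  WHERE object_schema = 'unfirebase';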

snickell commented Jan 3, 2024

It appears that Firebase RTDB (what we are using) does in fact do optimistic updating: https://firebase.google.com/docs/database/admin/save-data#section-writes-offline

There's very little documentation of this, probably because it's fundamental to Firebase's design.

snickell commented Jan 4, 2024

Enforcing limits in the backend:

  • maxPropertySize: 4096. The max value length of a key-value pair; this is enforced in our schema, because the Value column of the KeyValuePairs table is VARCHAR(4096).
  • maxRecordSize: 4096. The max length of a record; this needs to be enforced by the Rails backend, because the MySQL JSON type doesn't allow setting a max length.
  • maxTableCount: 10. This doesn't seem to be enforced in rules.bolt (?); we'll enforce it on the backend by doing a SELECT COUNT(tableName) FROM Tables WHERE channelID=___ (see the sketch after this list).
  • maxTableRows: 20000. The max number of records inside a table: SELECT COUNT(recordId) FROM Records WHERE channelID=___ AND table_name=____.
  • maxKeySize: 768. A Firebase limit on the full key path (like /v3/channels/shared). We don't test for this particular limit; instead we keep lengths bounded by declaring table_name, the key-value pair key, etc. as VARCHAR(1024). We don't care how long JSON key names are inside a record (Firebase would), and instead enforce the overall maxRecordSize above.
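
A minimal sketch of the backend check queries, assuming the snake_case column names (channel_id, table_name, record_id) used in the profiling above rather than the camelCase names from the Firebase config; 'shared' and 'words' are placeholders, and unfirebase.tables is an assumed name for the Tables table mentioned above:

-- maxTableCount: refuse to create an 11th table in a channel.
SELECT COUNT(table_name) FROM unfirebase.tables
  WHERE channel_id = 'shared';

-- maxTableRows: refuse to insert a 20001st record into a table.
SELECT COUNT(record_id) FROM unfirebase.records
  WHERE channel_id = 'shared' AND table_name = 'words';

-- maxRecordSize: checked in Rails before the INSERT, since MySQL's JSON
-- type has no declarable maximum length.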

cnbrenci commented Jan 5, 2024

Today, the data blocks do not use optimistic updating: e.g., if you call a set and then a get immediately after, without waiting for the callback, it is nondeterministic whether the get returns the old or the new value. So we are OK with that staying the same in the new architecture. We will batch, but we won't worry about optimistic updating of local state.

cnbrenci changed the title from "Design backend for projectDB" to "Design backend for Datablock Storage" on Jan 17, 2024
snickell added the unfirebase (https://github.com/orgs/code-dot-org/projects/4) label on Feb 3, 2024

snickell commented Feb 16, 2024

We've basically finished this item, since we've already executed against the schema design. Breaking out remaining small issues and closing:

  1. Consider batching create/update record requests in JS #56615
  2. Schema for library manifest: Implement Library Manifest System #56472
  3. Rate limiting design: Figure out Rate Limiting for Datasets on MySQL #55481
