fix: 500 error when published component size is too large #249

Open
morgan-wowk wants to merge 1 commit into master from validate-component-api


@morgan-wowk commented May 22, 2026

Fix: validate component text size before MySQL insert

Problem

POST /api/published_components/ was crashing with an unhandled pymysql.err.DataError
when component text exceeded the MySQL TEXT column limit (65,535 bytes):

(1406, "Data too long for column 'text' at row 1")

The existing MAX_COMPONENT_SIZE = 300_000 check was in characters, not bytes, and set
far above the actual column capacity — so components between ~65 KB and 300 KB passed
validation and hit the database, which rejected them with a 500.
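
To make the gap concrete, here is a minimal sketch of how a payload could pass the old character-based check and still exceed the column limit. The helper names `passes_old_check` and `fits_in_text_column` are hypothetical and exist only for illustration; the two constants are the values quoted above.

```python
# Values quoted in the description above.
OLD_MAX_COMPONENT_SIZE = 300_000   # old limit, counted in characters
MYSQL_TEXT_MAX_BYTES = 65_535      # MySQL TEXT column capacity, in bytes


def passes_old_check(text: str) -> bool:
    """Hypothetical stand-in for the old character-count validation."""
    return len(text) <= OLD_MAX_COMPONENT_SIZE


def fits_in_text_column(text: str) -> bool:
    """True if the UTF-8 encoding of `text` fits in a MySQL TEXT column."""
    return len(text.encode("utf-8")) <= MYSQL_TEXT_MAX_BYTES


# 100,000 ASCII characters: under the old character limit, over the byte limit.
ascii_text = "a" * 100_000
# 30,000 three-byte characters: only 30,000 characters, but 90,000 bytes.
cjk_text = "\u4e2d" * 30_000

for sample in (ascii_text, cjk_text):
    print(passes_old_check(sample), fits_in_text_column(sample))
```

Both samples are accepted by the character check and rejected by the byte check, which is the window in which the database error surfaced as a 500.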

Solution

Replace the character-count check with a byte-length check against the real column limit,
and raise ApiValidationError (→ HTTP 422) instead of letting the DB error propagate.

Byte limit

PostgreSQL and SQLite map Text() to effectively unlimited storage (about 1 GB), so this limit is based on MySQL.

# Baseline: MySQL TEXT column maximum (65,535 bytes).
# If the column type is ever widened (e.g. to MEDIUMTEXT), update this constant.
MAX_COMPONENT_SIZE_BYTES = 65_535

UTF-8 encoding

# PyMySQL always encodes Python str to UTF-8 on the wire, so UTF-8 byte length
# is exactly what MySQL receives and measures against the column's byte limit.
text_bytes = len(text.encode("utf-8"))
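
Put together, the check might look roughly like the sketch below. This is not the code from the PR: `ApiValidationError` here is a local stand-in for `errors.ApiValidationError`, the function name `validate_component_text_size` is hypothetical, and the sketch assumes the connection and the column both use UTF-8.

```python
MAX_COMPONENT_SIZE_BYTES = 65_535  # MySQL TEXT column maximum, in bytes


class ApiValidationError(Exception):
    """Local stand-in for errors.ApiValidationError (mapped to HTTP 422)."""


def validate_component_text_size(text: str) -> int:
    """Return the UTF-8 byte length of `text`, or raise if it is over the limit."""
    text_bytes = len(text.encode("utf-8"))
    if text_bytes > MAX_COMPONENT_SIZE_BYTES:
        raise ApiValidationError(
            f"Component text is {text_bytes} bytes; "
            f"the maximum is {MAX_COMPONENT_SIZE_BYTES} bytes."
        )
    return text_bytes
```

Raising a validation error before the insert means the caller gets a client error with a readable message rather than a database exception.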

Changes

  • cloud_pipelines_backend/component_library_api_server.py — replace MAX_COMPONENT_SIZE (character count) with MAX_COMPONENT_SIZE_BYTES = 65_535 (byte count); raise errors.ApiValidationError instead of ValueError
  • tests/test_component_library_api_server.py — add boundary tests for exactly 65,535 bytes (accepted) and 65,536 bytes (rejected with 422)
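
The boundary cases listed above can be sketched as plain test functions. This is an illustration rather than the PR's test file: `is_within_limit` is a hypothetical helper that checks the byte length directly, whereas the real tests presumably exercise the API and assert on the 422 response.

```python
MAX_COMPONENT_SIZE_BYTES = 65_535


def is_within_limit(text: str) -> bool:
    """Hypothetical helper mirroring the byte-length check described above."""
    return len(text.encode("utf-8")) <= MAX_COMPONENT_SIZE_BYTES


def test_exactly_at_limit_is_accepted():
    assert is_within_limit("a" * 65_535)


def test_one_byte_over_limit_is_rejected():
    assert not is_within_limit("a" * 65_536)


def test_multibyte_boundary():
    # "\u4e2d" encodes to 3 bytes in UTF-8, and 21_845 * 3 == 65_535.
    at_limit = "\u4e2d" * 21_845
    assert is_within_limit(at_limit)
    # One extra ASCII byte pushes the payload to 65,536 bytes.
    assert not is_within_limit(at_limit + "a")
```

The multibyte case is worth including because it is exactly the situation a character-count check would get wrong.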

Copy link
Copy Markdown
Collaborator Author

This stack of pull requests is managed by Graphite. Learn more about stacking.

@morgan-wowk force-pushed the validate-component-api branch from ffd738d to 965a91a on May 22, 2026 19:02
Comment thread tests/test_component_library_api_server.py Fixed
@morgan-wowk force-pushed the validate-component-api branch from 965a91a to 5b737fe on May 22, 2026 19:13
@morgan-wowk marked this pull request as ready for review May 22, 2026 19:14
@morgan-wowk requested a review from Ark-kun as a code owner May 22, 2026 19:14

- MAX_COMPONENT_SIZE = 300_000
+ # Baseline: MySQL TEXT column maximum (65,535 bytes).
+ MAX_COMPONENT_SIZE_BYTES = 65_535

Thanks @morgan-wowk for actioning this issue!

Tangle is supposed to be DB-agnostic, and this limit applies only to MySQL; Postgres and SQLite don't have it. I'm wondering if we should:

  1. Move this change to oasis-backend, since that would be the proper place for something that applies to KateSQL (MySQL) only.
  2. Create a dialect-aware helper class in Tangle (too complex IMHO, and I don't think we have a precedent for that, nor would it be wise to create one now).
  3. Something else?
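
For illustration only, option 2 might look something like the sketch below. Nothing here exists in Tangle: the mapping, the function names, and the choice to treat unknown dialects as unlimited are all assumptions. The keys follow the dialect names SQLAlchemy commonly reports (`mysql`, `postgresql`, `sqlite`).

```python
from typing import Dict, Optional

# Per-dialect byte limit for a TEXT column. None means the application
# does not enforce a limit for that dialect. Hypothetical mapping.
TEXT_LIMIT_BYTES_BY_DIALECT: Dict[str, Optional[int]] = {
    "mysql": 65_535,
    "postgresql": None,
    "sqlite": None,
}


def max_text_bytes(dialect_name: str) -> Optional[int]:
    """Return the byte limit for `dialect_name`, or None if none is enforced.

    Unknown dialects also return None, i.e. they are treated as unlimited.
    """
    return TEXT_LIMIT_BYTES_BY_DIALECT.get(dialect_name)


def text_fits(text: str, dialect_name: str) -> bool:
    """True if the UTF-8 encoding of `text` fits the dialect's TEXT limit."""
    limit = max_text_bytes(dialect_name)
    return limit is None or len(text.encode("utf-8")) <= limit
```

Even in this small form, the helper has to decide what to do for unknown dialects, which hints at the complexity concern raised in option 2.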

TEXT Column Size Limits by Database

| Database   | TEXT Max Size           | Effectively Unlimited? |
|------------|-------------------------|------------------------|
| MySQL      | 65,535 bytes (~64 KB)   | No                     |
| PostgreSQL | ~1 GB                   | Yes                    |
| SQLite     | ~1 GB (default)         | Yes                    |

MySQL

"A TEXT column with a maximum length of 65,535 (2^16 − 1) bytes. The effective maximum length is less if the value contains multibyte characters."

Source: MySQL 8.4 Reference Manual - String Data Type Syntax
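
As a rough worked example of the "effective maximum length is less" remark, the sketch below computes how many repetitions of a given character fit in 65,535 bytes. It assumes the column stores text as UTF-8 (for example `utf8mb4`); the function name `max_chars_for` is hypothetical.

```python
MYSQL_TEXT_MAX_BYTES = 65_535


def max_chars_for(sample_char: str) -> int:
    """How many copies of `sample_char` fit in a TEXT column, assuming UTF-8."""
    bytes_per_char = len(sample_char.encode("utf-8"))
    return MYSQL_TEXT_MAX_BYTES // bytes_per_char


# One-, two-, three-, and four-byte characters in UTF-8.
for ch in ("a", "\u00e9", "\u4e2d", "\U0001F600"):
    print(repr(ch), len(ch.encode("utf-8")), max_chars_for(ch))
```

So the character capacity ranges from 65,535 for pure ASCII down to 16,383 for text made entirely of four-byte characters.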

PostgreSQL

"In addition, PostgreSQL provides the text type, which stores strings of any length."

"the longest possible character string that can be stored is about 1 GB"

Source: PostgreSQL Docs - Character Types

SQLite

"The maximum length of a TEXT or BLOB in bytes."
#define SQLITE_MAX_LENGTH 1000000000

Source (source header): sqliteLimit.h

"The maximum number of bytes in a string or BLOB in SQLite is defined by the preprocessor macro SQLITE_MAX_LENGTH. The default value of this macro is 1 billion (1,000,000,000)."

Source (docs): SQLite Implementation Limits
