Skip to content

Conversation

@techouse
Copy link
Owner

This pull request introduces two new helper classes, DataTransferManager and SchemaWriter, to modularize and improve the codebase for MySQL-to-SQLite data migration. These classes encapsulate the logic for chunked data transfer and schema creation, respectively, and are now referenced in the main transporter type definitions. Additionally, a small update is made to the linter configuration.

New modular helpers:

  • Added DataTransferManager in data_transfer.py to handle chunked data transfer from MySQL to SQLite, including reconnect logic and efficient batch inserts with progress reporting.
  • Added SchemaWriter in schema_writer.py to build SQLite schemas from MySQL metadata, supporting tables, indices, and foreign keys with detailed translation logic.

Integration with main transporter:

  • Updated types.py to include type-checking imports for the new helpers and added _schema_writer and _data_transfer attributes to the MySQLtoSQLiteAttributes class, integrating the new classes into the main transporter context. [1] [2]

Tooling:

  • Updated tox.ini to add I300 to the list of ignored flake8 warnings, likely to reduce noise from import order/style checks.

@techouse techouse self-assigned this Nov 16, 2025
@techouse techouse added the enhancement New feature or request label Nov 16, 2025
@coderabbitai
Copy link

coderabbitai bot commented Nov 16, 2025

Walkthrough

This pull request refactors the transporter module by extracting schema building, data transfer, and type translation logic into three dedicated modules: SchemaWriter manages SQLite table creation, DataTransferManager handles chunked data migration, and type_translation provides MySQL-to-SQLite conversion utilities. The transporter is updated to delegate to these components.

Changes

Cohort / File(s) Summary
New type translation module
src/mysql_to_sqlite3/type_translation.py
Introduces comprehensive utilities for translating MySQL schemas and data types to SQLite equivalents, including column type translation, expression normalisation, identifier quoting/escaping, and default value conversion.
New schema writer module
src/mysql_to_sqlite3/schema_writer.py
Implements SchemaWriter class that constructs CREATE TABLE statements from MySQL metadata, handling primary keys, indexes, foreign keys, and type mapping.
New data transfer module
src/mysql_to_sqlite3/data_transfer.py
Introduces DataTransferManager class for chunked data transfer with reconnection handling, progress tracking via tqdm, and error recovery for MySQL server loss.
Transporter refactoring
src/mysql_to_sqlite3/transporter.py
Refactors to use dependency injection, delegating schema building to SchemaWriter, data transfer to DataTransferManager, and type translation to type_translation module; removes legacy implementations.
Type hints update
src/mysql_to_sqlite3/types.py
Adds TYPE_CHECKING imports and type annotations for SchemaWriter and DataTransferManager in MySQLtoSQLiteAttributes.
Flake8 configuration
tox.ini
Adds I300 to flake8 ignore list and .tox to exclude list.

Sequence Diagram(s)

sequenceDiagram
    participant Transporter
    participant SchemaWriter
    participant DataTransferManager
    participant TypeTranslation
    participant MySQL
    participant SQLite

    Transporter->>Transporter: _get_schema_writer()
    Transporter->>SchemaWriter: _build_create_table_sql(table_name)
    SchemaWriter->>MySQL: Query column metadata
    SchemaWriter->>TypeTranslation: translate_type_from_mysql_to_sqlite()
    SchemaWriter->>TypeTranslation: translate_default_from_mysql_to_sqlite()
    SchemaWriter->>SQLite: CREATE TABLE statement
    
    Transporter->>Transporter: _get_data_transfer_manager()
    Transporter->>DataTransferManager: transfer_table_data(table_name, sql)
    
    alt Chunking enabled
        DataTransferManager->>MySQL: fetchmany(chunk_size)
        loop For each chunk
            DataTransferManager->>TypeTranslation: encode_data_for_sqlite()
            DataTransferManager->>SQLite: executemany()
        end
    else No chunking
        DataTransferManager->>MySQL: fetchall()
        DataTransferManager->>TypeTranslation: encode_data_for_sqlite()
        DataTransferManager->>SQLite: executemany()
    end
    
    DataTransferManager->>SQLite: commit()
    
    alt MySQL connection lost
        DataTransferManager->>MySQL: reconnect()
        DataTransferManager->>DataTransferManager: transfer_table_data() [retry]
    end
Loading

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

  • type_translation.py requires careful examination of MySQL-to-SQLite type mapping logic, especially edge cases for JSON, binary types, boolean defaults, and collation handling.
  • schema_writer.py involves intricate SQL construction with primary key detection, index building, and foreign key constraint translation; requires verification of metadata query correctness.
  • data_transfer.py handles critical data transfer with reconnection logic and chunking; error handling paths need validation.
  • transporter.py refactoring spans numerous method delegations; verify all call sites are correctly updated and no logic was inadvertently altered during extraction.
  • Cross-module interactions between transporter, SchemaWriter, and DataTransferManager should be validated for correct initialisation and state management.

Possibly related PRs

Poem

🐰 A rabbit's ode to refactoring

Modular magic, code flows clean,
SchemaWriter builds SQL dreams,
DataTransfer hops with chunked delight,
Type translation shines so bright!
From MySQL's warren to SQLite's burrow—
No more chaos, only furrow! 🌱

Pre-merge checks and finishing touches

❌ Failed checks (2 warnings)
Check name Status Explanation Resolution
Description check ⚠️ Warning The description covers key changes and includes relevant sections, but the PR template requires a 'Type of change' checkbox section and testing details, which are entirely absent from the provided description. Complete the PR description by adding the 'Type of change' section (marking 'New feature' as applicable) and 'How Has This Been Tested?' and 'Checklist' sections from the template to meet repository standards.
Docstring Coverage ⚠️ Warning Docstring coverage is 33.33% which is insufficient. The required threshold is 80.00%. You can run @coderabbitai generate docstrings to improve docstring coverage.
✅ Passed checks (1 passed)
Check name Status Explanation
Title check ✅ Passed The title 'split up transporter.py' accurately reflects the main objective of this PR, which refactors the transporter module by extracting DataTransferManager and SchemaWriter into separate modules.
✨ Finishing touches
  • 📝 Generate docstrings
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Post copyable unit tests in a comment
  • Commit unit tests in branch refactor/split-it-up

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

@techouse techouse changed the title ♻️ split up transporter.py ♻️ refactor transporter Nov 16, 2025
@codecov
Copy link

codecov bot commented Nov 16, 2025

Codecov Report

❌ Patch coverage is 97.61905% with 10 lines in your changes missing coverage. Please review.
✅ Project coverage is 98.23%. Comparing base (b15c593) to head (c472fee).

Files with missing lines Patch % Lines
src/mysql_to_sqlite3/schema_writer.py 94.93% 4 Missing ⚠️
src/mysql_to_sqlite3/type_translation.py 98.85% 3 Missing ⚠️
src/mysql_to_sqlite3/types.py 60.00% 2 Missing ⚠️
src/mysql_to_sqlite3/data_transfer.py 97.29% 1 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##           master     #116      +/-   ##
==========================================
- Coverage   98.81%   98.23%   -0.58%     
==========================================
  Files           8       11       +3     
  Lines         925     1018      +93     
==========================================
+ Hits          914     1000      +86     
- Misses         11       18       +7     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

🧹 Nitpick comments (6)
src/mysql_to_sqlite3/schema_writer.py (1)

167-227: Confirm foreign key generation semantics when migrating only a subset of tables

The foreign key block correctly:

  • Uses server version–aware JOIN vs LEFT JOIN on information_schema to account for MySQL 8.0.20+ behaviour.
  • Quotes all identifiers before emitting FOREIGN KEY clauses and normalises ON UPDATE / ON DELETE rules to upper case.

However, it only checks _without_tables and _without_foreign_keys. Prior guidance was to skip FK generation when a table subset restriction is applied to avoid references to tables that are not being created. Based on learnings.

If that behaviour is now implemented by forcing _without_foreign_keys when a subset is selected, this is fine; if not, this block may reintroduce FKs pointing at tables that are intentionally omitted.

Consider explicitly short‑circuiting FK generation here when a subset of tables is being migrated (e.g. when ctx._mysql_tables is non‑empty or exclusions are active), or filtering out FKs whose ref_table is not in the target table set. Please confirm which layer currently enforces this and that previous behaviour is preserved.

src/mysql_to_sqlite3/type_translation.py (1)

27-124: Tighten TIMESTAMP detection for clarity in type translation

translate_type_from_mysql_to_sqlite is generally well‑structured and nicely parameterised, but:

if data_type in "TIMESTAMP":
    return "DATETIME"

relies on substring membership in the literal "TIMESTAMP". Given the earlier branches, this will behave as intended for TIMESTAMP inputs, but the use of in here is surprising and brittle.

I’d suggest switching to an explicit equality check for clarity and future‑proofing:

-    if data_type in "TIMESTAMP":
-        return "DATETIME"
+    if data_type == "TIMESTAMP":
+        return "DATETIME"

The rest of the mapping table (numeric, string and JSON handling, plus the sqlglot‑backed fallback) looks solid and aligned with previous behaviour. Based on learnings.

After making this change, please rerun the existing type‑mapping tests to confirm there’s no behavioural change and that new edge cases (e.g. unusual TIMESTAMP variants) are still covered.

src/mysql_to_sqlite3/data_transfer.py (1)

65-90: Double‑check reconnect behaviour when CR_SERVER_LOST is raised during data transfer

The MySQL error handling correctly:

  • Detects CR_SERVER_LOST and limits reconnect attempts to a single retry via the attempting_reconnect flag.
  • Calls ctx._mysql.reconnect() before re‑invoking transfer_table_data.
  • Logs MySQL and SQLite failures with table context before re‑raising.

One potential gap is that, on reconnect, this method does not itself re‑execute the SELECT that populated ctx._mysql_cur or otherwise reposition the cursor; it assumes that calling fetchmany/fetchall after reconnect() is sufficient. Depending on how MySQLtoSQLite sets up the cursor and whether the underlying connector preserves statement state across reconnect(), this may or may not fully reproduce the previous reconnection behaviour. Based on learnings.

It would be worth verifying, ideally via tests, that:

  • After a simulated CR_SERVER_LOST during fetchmany/fetchall, the reconnect path actually re‑reads the intended rows rather than failing with a cursor error or skipping/duplicating chunks.
  • If necessary, consider re‑executing the original SELECT (e.g. using a stored statement on ctx or the cursor) and fast‑forwarding via _current_chunk_number before resuming inserts.

Also, you may want to switch the error logging in these except blocks to logger.exception(...) to capture tracebacks automatically.

src/mysql_to_sqlite3/transporter.py (3)

195-197: Eager vs lazy initialisation of SchemaWriter/DataTransferManager

You currently both eagerly construct SchemaWriter/DataTransferManager in __init__ and support lazy creation via _get_schema_writer()/_get_data_transfer_manager(). Behaviour is correct, but having two initialisation paths makes the lifecycle a bit harder to reason about and encourages direct attribute access.

If you are comfortable nudging callers/tests towards the accessors, you could simplify by dropping the eager instantiation and relying solely on the getters, which already cache the instances:

-        self._schema_writer = SchemaWriter(self)
-        self._data_transfer = DataTransferManager(self)

This keeps the constructor lighter and ensures all use goes through the same code path. If backwards compatibility with direct ._schema_writer / ._data_transfer access is required, the current approach is still acceptable.

As per coding guidelines.

Also applies to: 303-315


317-333: Index name de-duplication helper meets collision-avoidance goals

_get_unique_index_name correctly tracks seen names and appends numeric suffixes, ensuring uniqueness across the SQLite database while preserving the original base name. This pairs well with the SchemaWriter logic that prefixes with <table>_ when there is a table-name collision or --prefix-indices is set.

One minor consideration: logging every rename at INFO may be noisy on large schemas with many similar index names. If you see verbose logs in practice, consider dropping this to DEBUG:

-        self._logger.info(
+        self._logger.debug(
             'Index "%s" renamed to "%s" to ensure uniqueness across the SQLite database.',
             base_name,
             candidate,
         )

Not essential, but it can keep logs cleaner on big migrations.

As per coding guidelines.


470-478: Route data transfer through the wrapper for consistency

You now have _transfer_table_data() delegating to DataTransferManager.transfer_table_data(...), but transfer() calls self._data_transfer.transfer_table_data(...) directly. This works today, yet it creates two slightly different calling patterns and bypasses the lazy getter.

To keep the abstraction clean and future-proof (e.g. if _transfer_table_data grows extra bookkeeping like metrics or debug behaviour), I would suggest calling the wrapper from transfer():

-                        self._data_transfer.transfer_table_data(
-                            table_name=table_name,  # type: ignore[arg-type]
-                            sql=sql,
-                            total_records=total_records_count,
-                        )
+                        self._transfer_table_data(
+                            table_name=table_name,  # type: ignore[arg-type]
+                            sql=sql,
+                            total_records=total_records_count,
+                        )

This also ensures all use goes through _get_data_transfer_manager() rather than relying on the attribute being pre-populated.

As per coding guidelines.

Also applies to: 602-606

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between b15c593 and c472fee.

📒 Files selected for processing (6)
  • src/mysql_to_sqlite3/data_transfer.py (1 hunks)
  • src/mysql_to_sqlite3/schema_writer.py (1 hunks)
  • src/mysql_to_sqlite3/transporter.py (12 hunks)
  • src/mysql_to_sqlite3/type_translation.py (1 hunks)
  • src/mysql_to_sqlite3/types.py (2 hunks)
  • tox.ini (1 hunks)
🧰 Additional context used
📓 Path-based instructions (5)
src/**/*.py

📄 CodeRabbit inference engine (.github/copilot-instructions.md)

Keep mypy passing (Python 3.9 baseline); avoid untyped dynamic attributes in source code

Files:

  • src/mysql_to_sqlite3/data_transfer.py
  • src/mysql_to_sqlite3/schema_writer.py
  • src/mysql_to_sqlite3/type_translation.py
  • src/mysql_to_sqlite3/types.py
  • src/mysql_to_sqlite3/transporter.py
{src,tests}/**/*.py

📄 CodeRabbit inference engine (.github/copilot-instructions.md)

Adhere to lint suite: Black line length 120 and isort profile=black import ordering for all Python files

Files:

  • src/mysql_to_sqlite3/data_transfer.py
  • src/mysql_to_sqlite3/schema_writer.py
  • src/mysql_to_sqlite3/type_translation.py
  • src/mysql_to_sqlite3/types.py
  • src/mysql_to_sqlite3/transporter.py
src/mysql_to_sqlite3/types.py

📄 CodeRabbit inference engine (.github/copilot-instructions.md)

src/mysql_to_sqlite3/types.py: Keep type hints exhaustive in types.py and update when MySQLtoSQLite constructor kwargs change
When constructor kwargs change, update types.py accordingly for mypy
Update typing in types.py when new CLI parameters are added to the constructor

Files:

  • src/mysql_to_sqlite3/types.py
src/mysql_to_sqlite3/transporter.py

📄 CodeRabbit inference engine (.github/copilot-instructions.md)

src/mysql_to_sqlite3/transporter.py: Implement and modify core transfer logic (schema introspection, batching, type/default translation, reconnection handling) in transporter.py
Convert AUTO_INCREMENT single primary keys to INTEGER PRIMARY KEY AUTOINCREMENT only when the translated type is integer; log a warning otherwise
Avoid index naming collisions: if an index name equals its table name or --prefix-indices is set, prefix with

_
Centralize default value translation in _translate_default_from_mysql_to_sqlite; extend this function for new MySQL constructs
Handle JSON columns: if SQLite has JSON1, map MySQL JSON to JSON; otherwise map to TEXT unless --json-as-text is set
Skip foreign key generation when a table subset restriction is applied
Use _setup_logger for logging; do not instantiate new loggers ad hoc
For large tables, prefer chunked transfer using fetchmany(self._chunk_size) and executemany inserts
On CR_SERVER_LOST during schema or data transfer, attempt a single reconnect; preserve this behavior
Disable PRAGMA foreign_keys during bulk load and re-enable in finally; ensure early returns still re-enable
Use INSERT OR IGNORE to handle potential duplicate inserts
Thread new CLI parameters through the MySQLtoSQLite constructor
Raise ValueError in constructor for invalid configuration
When --debug is not set, swallow and log errors; when --debug is set, re-raise for stack inspection
Extend _translate_type_from_mysql_to_sqlite when adding type mappings and add corresponding tests
Extend _translate_default_from_mysql_to_sqlite to support new default expressions and add tests
Centralize progress/UI changes around tqdm and honor the --quiet flag

Files:

  • src/mysql_to_sqlite3/transporter.py
src/mysql_to_sqlite3/{cli.py,transporter.py}

📄 CodeRabbit inference engine (.github/copilot-instructions.md)

src/mysql_to_sqlite3/{cli.py,transporter.py}: Never log raw passwords
Support --skip-ssl to disable MySQL SSL; default to encrypted where possible and do not silently change defaults

Files:

  • src/mysql_to_sqlite3/transporter.py
🧠 Learnings (26)
📓 Common learnings
Learnt from: CR
Repo: techouse/mysql-to-sqlite3 PR: 0
File: .github/copilot-instructions.md:0-0
Timestamp: 2025-10-18T21:08:55.973Z
Learning: Applies to src/mysql_to_sqlite3/transporter.py : Implement and modify core transfer logic (schema introspection, batching, type/default translation, reconnection handling) in transporter.py
Learnt from: CR
Repo: techouse/mysql-to-sqlite3 PR: 0
File: .github/copilot-instructions.md:0-0
Timestamp: 2025-10-18T21:08:55.973Z
Learning: Applies to src/mysql_to_sqlite3/transporter.py : Extend _translate_type_from_mysql_to_sqlite when adding type mappings and add corresponding tests
Learnt from: CR
Repo: techouse/mysql-to-sqlite3 PR: 0
File: .github/copilot-instructions.md:0-0
Timestamp: 2025-10-18T21:08:55.973Z
Learning: Applies to src/mysql_to_sqlite3/transporter.py : Centralize progress/UI changes around tqdm and honor the --quiet flag
Learnt from: CR
Repo: techouse/mysql-to-sqlite3 PR: 0
File: .github/copilot-instructions.md:0-0
Timestamp: 2025-10-18T21:08:55.973Z
Learning: Applies to src/mysql_to_sqlite3/transporter.py : For large tables, prefer chunked transfer using fetchmany(self._chunk_size) and executemany inserts
Learnt from: CR
Repo: techouse/mysql-to-sqlite3 PR: 0
File: .github/copilot-instructions.md:0-0
Timestamp: 2025-10-18T21:08:55.973Z
Learning: Applies to src/mysql_to_sqlite3/transporter.py : Skip foreign key generation when a table subset restriction is applied
📚 Learning: 2025-10-18T21:08:55.973Z
Learnt from: CR
Repo: techouse/mysql-to-sqlite3 PR: 0
File: .github/copilot-instructions.md:0-0
Timestamp: 2025-10-18T21:08:55.973Z
Learning: Applies to src/mysql_to_sqlite3/transporter.py : Implement and modify core transfer logic (schema introspection, batching, type/default translation, reconnection handling) in transporter.py

Applied to files:

  • src/mysql_to_sqlite3/data_transfer.py
  • src/mysql_to_sqlite3/schema_writer.py
  • src/mysql_to_sqlite3/type_translation.py
  • src/mysql_to_sqlite3/types.py
  • src/mysql_to_sqlite3/transporter.py
📚 Learning: 2025-10-18T21:08:55.973Z
Learnt from: CR
Repo: techouse/mysql-to-sqlite3 PR: 0
File: .github/copilot-instructions.md:0-0
Timestamp: 2025-10-18T21:08:55.973Z
Learning: Applies to src/mysql_to_sqlite3/transporter.py : For large tables, prefer chunked transfer using fetchmany(self._chunk_size) and executemany inserts

Applied to files:

  • src/mysql_to_sqlite3/data_transfer.py
  • src/mysql_to_sqlite3/types.py
  • src/mysql_to_sqlite3/transporter.py
📚 Learning: 2025-10-18T21:08:55.973Z
Learnt from: CR
Repo: techouse/mysql-to-sqlite3 PR: 0
File: .github/copilot-instructions.md:0-0
Timestamp: 2025-10-18T21:08:55.973Z
Learning: Applies to src/mysql_to_sqlite3/transporter.py : Extend _translate_type_from_mysql_to_sqlite when adding type mappings and add corresponding tests

Applied to files:

  • src/mysql_to_sqlite3/data_transfer.py
  • src/mysql_to_sqlite3/schema_writer.py
  • src/mysql_to_sqlite3/type_translation.py
  • src/mysql_to_sqlite3/types.py
  • src/mysql_to_sqlite3/transporter.py
📚 Learning: 2025-10-18T21:08:55.973Z
Learnt from: CR
Repo: techouse/mysql-to-sqlite3 PR: 0
File: .github/copilot-instructions.md:0-0
Timestamp: 2025-10-18T21:08:55.973Z
Learning: Applies to src/mysql_to_sqlite3/transporter.py : Centralize progress/UI changes around tqdm and honor the --quiet flag

Applied to files:

  • src/mysql_to_sqlite3/data_transfer.py
📚 Learning: 2025-10-18T21:08:55.973Z
Learnt from: CR
Repo: techouse/mysql-to-sqlite3 PR: 0
File: .github/copilot-instructions.md:0-0
Timestamp: 2025-10-18T21:08:55.973Z
Learning: Applies to src/mysql_to_sqlite3/transporter.py : Thread new CLI parameters through the MySQLtoSQLite constructor

Applied to files:

  • src/mysql_to_sqlite3/data_transfer.py
  • src/mysql_to_sqlite3/schema_writer.py
  • src/mysql_to_sqlite3/transporter.py
📚 Learning: 2025-10-18T21:08:55.973Z
Learnt from: CR
Repo: techouse/mysql-to-sqlite3 PR: 0
File: .github/copilot-instructions.md:0-0
Timestamp: 2025-10-18T21:08:55.973Z
Learning: Applies to src/mysql_to_sqlite3/transporter.py : On CR_SERVER_LOST during schema or data transfer, attempt a single reconnect; preserve this behavior

Applied to files:

  • src/mysql_to_sqlite3/data_transfer.py
📚 Learning: 2025-10-18T21:08:55.973Z
Learnt from: CR
Repo: techouse/mysql-to-sqlite3 PR: 0
File: .github/copilot-instructions.md:0-0
Timestamp: 2025-10-18T21:08:55.973Z
Learning: Applies to src/mysql_to_sqlite3/transporter.py : Skip foreign key generation when a table subset restriction is applied

Applied to files:

  • src/mysql_to_sqlite3/data_transfer.py
  • src/mysql_to_sqlite3/schema_writer.py
📚 Learning: 2025-10-18T21:08:55.973Z
Learnt from: CR
Repo: techouse/mysql-to-sqlite3 PR: 0
File: .github/copilot-instructions.md:0-0
Timestamp: 2025-10-18T21:08:55.973Z
Learning: Applies to src/mysql_to_sqlite3/transporter.py : Centralize default value translation in _translate_default_from_mysql_to_sqlite; extend this function for new MySQL constructs

Applied to files:

  • src/mysql_to_sqlite3/data_transfer.py
  • src/mysql_to_sqlite3/schema_writer.py
  • src/mysql_to_sqlite3/type_translation.py
  • src/mysql_to_sqlite3/types.py
  • src/mysql_to_sqlite3/transporter.py
📚 Learning: 2025-10-18T21:08:55.973Z
Learnt from: CR
Repo: techouse/mysql-to-sqlite3 PR: 0
File: .github/copilot-instructions.md:0-0
Timestamp: 2025-10-18T21:08:55.973Z
Learning: Applies to src/mysql_to_sqlite3/transporter.py : Disable PRAGMA foreign_keys during bulk load and re-enable in finally; ensure early returns still re-enable

Applied to files:

  • src/mysql_to_sqlite3/data_transfer.py
📚 Learning: 2025-10-18T21:08:55.973Z
Learnt from: CR
Repo: techouse/mysql-to-sqlite3 PR: 0
File: .github/copilot-instructions.md:0-0
Timestamp: 2025-10-18T21:08:55.973Z
Learning: Applies to src/mysql_to_sqlite3/{cli.py,transporter.py} : Support --skip-ssl to disable MySQL SSL; default to encrypted where possible and do not silently change defaults

Applied to files:

  • src/mysql_to_sqlite3/data_transfer.py
📚 Learning: 2025-10-18T21:08:55.973Z
Learnt from: CR
Repo: techouse/mysql-to-sqlite3 PR: 0
File: .github/copilot-instructions.md:0-0
Timestamp: 2025-10-18T21:08:55.973Z
Learning: Applies to src/mysql_to_sqlite3/transporter.py : Use INSERT OR IGNORE to handle potential duplicate inserts

Applied to files:

  • src/mysql_to_sqlite3/data_transfer.py
📚 Learning: 2025-10-18T21:08:55.973Z
Learnt from: CR
Repo: techouse/mysql-to-sqlite3 PR: 0
File: .github/copilot-instructions.md:0-0
Timestamp: 2025-10-18T21:08:55.973Z
Learning: Applies to src/mysql_to_sqlite3/transporter.py : Convert AUTO_INCREMENT single primary keys to INTEGER PRIMARY KEY AUTOINCREMENT only when the translated type is integer; log a warning otherwise

Applied to files:

  • src/mysql_to_sqlite3/schema_writer.py
  • src/mysql_to_sqlite3/type_translation.py
  • src/mysql_to_sqlite3/types.py
📚 Learning: 2025-10-18T21:08:55.973Z
Learnt from: CR
Repo: techouse/mysql-to-sqlite3 PR: 0
File: .github/copilot-instructions.md:0-0
Timestamp: 2025-10-18T21:08:55.973Z
Learning: Applies to src/mysql_to_sqlite3/types.py : Keep type hints exhaustive in types.py and update when MySQLtoSQLite constructor kwargs change

Applied to files:

  • src/mysql_to_sqlite3/schema_writer.py
  • src/mysql_to_sqlite3/type_translation.py
  • src/mysql_to_sqlite3/types.py
  • src/mysql_to_sqlite3/transporter.py
📚 Learning: 2025-10-18T21:08:55.973Z
Learnt from: CR
Repo: techouse/mysql-to-sqlite3 PR: 0
File: .github/copilot-instructions.md:0-0
Timestamp: 2025-10-18T21:08:55.973Z
Learning: Applies to src/mysql_to_sqlite3/types.py : When constructor kwargs change, update types.py accordingly for mypy

Applied to files:

  • src/mysql_to_sqlite3/schema_writer.py
  • src/mysql_to_sqlite3/type_translation.py
  • src/mysql_to_sqlite3/types.py
  • src/mysql_to_sqlite3/transporter.py
📚 Learning: 2025-10-18T21:08:55.973Z
Learnt from: CR
Repo: techouse/mysql-to-sqlite3 PR: 0
File: .github/copilot-instructions.md:0-0
Timestamp: 2025-10-18T21:08:55.973Z
Learning: Applies to src/mysql_to_sqlite3/transporter.py : Avoid index naming collisions: if an index name equals its table name or --prefix-indices is set, prefix with <table>_

Applied to files:

  • src/mysql_to_sqlite3/schema_writer.py
  • src/mysql_to_sqlite3/types.py
  • src/mysql_to_sqlite3/transporter.py
📚 Learning: 2025-10-18T21:08:55.973Z
Learnt from: CR
Repo: techouse/mysql-to-sqlite3 PR: 0
File: .github/copilot-instructions.md:0-0
Timestamp: 2025-10-18T21:08:55.973Z
Learning: Applies to src/mysql_to_sqlite3/types.py : Update typing in types.py when new CLI parameters are added to the constructor

Applied to files:

  • src/mysql_to_sqlite3/schema_writer.py
  • src/mysql_to_sqlite3/type_translation.py
  • src/mysql_to_sqlite3/types.py
  • src/mysql_to_sqlite3/transporter.py
📚 Learning: 2025-10-18T21:08:55.973Z
Learnt from: CR
Repo: techouse/mysql-to-sqlite3 PR: 0
File: .github/copilot-instructions.md:0-0
Timestamp: 2025-10-18T21:08:55.973Z
Learning: Applies to tests/unit/test_*.py : Add unit tests for new type mappings in _translate_type_from_mysql_to_sqlite

Applied to files:

  • src/mysql_to_sqlite3/type_translation.py
  • src/mysql_to_sqlite3/types.py
  • src/mysql_to_sqlite3/transporter.py
📚 Learning: 2025-10-18T21:08:55.973Z
Learnt from: CR
Repo: techouse/mysql-to-sqlite3 PR: 0
File: .github/copilot-instructions.md:0-0
Timestamp: 2025-10-18T21:08:55.973Z
Learning: Applies to src/mysql_to_sqlite3/transporter.py : Extend _translate_default_from_mysql_to_sqlite to support new default expressions and add tests

Applied to files:

  • src/mysql_to_sqlite3/type_translation.py
  • src/mysql_to_sqlite3/transporter.py
📚 Learning: 2025-10-18T21:08:55.973Z
Learnt from: CR
Repo: techouse/mysql-to-sqlite3 PR: 0
File: .github/copilot-instructions.md:0-0
Timestamp: 2025-10-18T21:08:55.973Z
Learning: Applies to src/mysql_to_sqlite3/cli.py : Add new user-facing CLI options only in src/mysql_to_sqlite3/cli.py (Click command entrypoint)

Applied to files:

  • src/mysql_to_sqlite3/type_translation.py
📚 Learning: 2025-10-18T21:08:55.973Z
Learnt from: CR
Repo: techouse/mysql-to-sqlite3 PR: 0
File: .github/copilot-instructions.md:0-0
Timestamp: 2025-10-18T21:08:55.973Z
Learning: Applies to src/mysql_to_sqlite3/transporter.py : Handle JSON columns: if SQLite has JSON1, map MySQL JSON to JSON; otherwise map to TEXT unless --json-as-text is set

Applied to files:

  • src/mysql_to_sqlite3/type_translation.py
  • src/mysql_to_sqlite3/transporter.py
📚 Learning: 2025-10-18T21:08:55.973Z
Learnt from: CR
Repo: techouse/mysql-to-sqlite3 PR: 0
File: .github/copilot-instructions.md:0-0
Timestamp: 2025-10-18T21:08:55.973Z
Learning: Applies to mysql_utils.py : Use mysql_utils.py for charset/collation sets and MySQL value adaptation; add related helpers here

Applied to files:

  • src/mysql_to_sqlite3/type_translation.py
  • src/mysql_to_sqlite3/transporter.py
📚 Learning: 2025-10-18T21:08:55.973Z
Learnt from: CR
Repo: techouse/mysql-to-sqlite3 PR: 0
File: .github/copilot-instructions.md:0-0
Timestamp: 2025-10-18T21:08:55.973Z
Learning: Applies to sqlite_utils.py : Use sqlite_utils.py for SQLite capability detection and value adaptation; add related helpers here

Applied to files:

  • src/mysql_to_sqlite3/type_translation.py
  • src/mysql_to_sqlite3/types.py
  • src/mysql_to_sqlite3/transporter.py
📚 Learning: 2025-10-18T21:08:55.973Z
Learnt from: CR
Repo: techouse/mysql-to-sqlite3 PR: 0
File: .github/copilot-instructions.md:0-0
Timestamp: 2025-10-18T21:08:55.973Z
Learning: Applies to tests/** : Use pytest; keep tests under tests/ mirroring src/mysql_to_sqlite3/ structure

Applied to files:

  • src/mysql_to_sqlite3/types.py
📚 Learning: 2025-10-18T21:08:55.973Z
Learnt from: CR
Repo: techouse/mysql-to-sqlite3 PR: 0
File: .github/copilot-instructions.md:0-0
Timestamp: 2025-10-18T21:08:55.973Z
Learning: Applies to src/mysql_to_sqlite3/transporter.py : Use _setup_logger for logging; do not instantiate new loggers ad hoc

Applied to files:

  • src/mysql_to_sqlite3/transporter.py
📚 Learning: 2025-10-18T21:08:55.973Z
Learnt from: CR
Repo: techouse/mysql-to-sqlite3 PR: 0
File: .github/copilot-instructions.md:0-0
Timestamp: 2025-10-18T21:08:55.973Z
Learning: Applies to src/mysql_to_sqlite3/transporter.py : Raise ValueError in constructor for invalid configuration

Applied to files:

  • src/mysql_to_sqlite3/transporter.py
🧬 Code graph analysis (5)
src/mysql_to_sqlite3/data_transfer.py (4)
src/mysql_to_sqlite3/sqlite_utils.py (1)
  • encode_data_for_sqlite (35-48)
src/mysql_to_sqlite3/transporter.py (1)
  • MySQLtoSQLite (47-617)
tests/unit/test_transporter.py (2)
  • sql (175-176)
  • reconnect (69-70)
tests/unit/mysql_to_sqlite3_test.py (3)
  • fetchmany (563-564)
  • fetchall (560-561)
  • commit (410-411)
src/mysql_to_sqlite3/schema_writer.py (1)
src/mysql_to_sqlite3/transporter.py (6)
  • _quote_sqlite_identifier (253-254)
  • _escape_mysql_backticks (257-258)
  • _translate_type_from_mysql_to_sqlite (233-242)
  • _translate_default_from_mysql_to_sqlite (272-284)
  • _data_type_collation_sequence (287-294)
  • _get_unique_index_name (317-333)
src/mysql_to_sqlite3/type_translation.py (2)
src/mysql_to_sqlite3/sqlite_utils.py (1)
  • CollatingSequences (51-56)
src/mysql_to_sqlite3/transporter.py (6)
  • _valid_column_type (221-222)
  • _column_type_length (225-226)
  • _decode_column_type (229-230)
  • _transpile_mysql_type_to_sqlite (261-269)
  • _transpile_mysql_expr_to_sqlite (245-246)
  • _normalize_literal_with_sqlglot (249-250)
src/mysql_to_sqlite3/types.py (2)
src/mysql_to_sqlite3/data_transfer.py (1)
  • DataTransferManager (23-90)
src/mysql_to_sqlite3/schema_writer.py (1)
  • SchemaWriter (19-233)
src/mysql_to_sqlite3/transporter.py (3)
src/mysql_to_sqlite3/data_transfer.py (2)
  • DataTransferManager (23-90)
  • transfer_table_data (30-90)
src/mysql_to_sqlite3/schema_writer.py (2)
  • SchemaWriter (19-233)
  • _build_create_table_sql (26-233)
src/mysql_to_sqlite3/type_translation.py (11)
  • escape_mysql_backticks (171-173)
  • _valid_column_type (27-28)
  • _column_type_length (31-35)
  • _decode_column_type (38-46)
  • translate_type_from_mysql_to_sqlite (49-123)
  • _transpile_mysql_type_to_sqlite (176-250)
  • _transpile_mysql_expr_to_sqlite (126-137)
  • _normalize_literal_with_sqlglot (140-151)
  • quote_sqlite_identifier (154-168)
  • translate_default_from_mysql_to_sqlite (253-394)
  • data_type_collation_sequence (397-428)
🪛 Ruff (0.14.4)
src/mysql_to_sqlite3/data_transfer.py

41-41: Value being cast to int is already an integer

Remove unnecessary int call

(RUF046)


78-82: Use logging.exception instead of logging.error

Replace with exception

(TRY400)


85-89: Use logging.exception instead of logging.error

Replace with exception

(TRY400)

src/mysql_to_sqlite3/schema_writer.py

71-93: Possible SQL injection vector through string-based query construction

(S608)


170-198: Possible SQL injection vector through string-based query construction

(S608)

src/mysql_to_sqlite3/type_translation.py

65-65: Avoid specifying long messages outside the exception class

(TRY003)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (9)
  • GitHub Check: Test (python3.10, mariadb:10.0, 1, true, 3.10)
  • GitHub Check: Test (python3.9, mariadb:10.0, 1, true, 3.9)
  • GitHub Check: Test (python3.14, mariadb:5.5, 1, true, 3.14)
  • GitHub Check: Test (python3.13, mariadb:5.5, 1, true, 3.13)
  • GitHub Check: Test (python3.11, mariadb:5.5, 1, true, 3.11)
  • GitHub Check: Test (python3.12, mariadb:5.5, 1, true, 3.12)
  • GitHub Check: Test (python3.9, mariadb:5.5, 1, true, 3.9)
  • GitHub Check: Test (python3.10, mariadb:5.5, 1, true, 3.10)
  • GitHub Check: Codacy Static Code Analysis
🔇 Additional comments (8)
tox.ini (1)

105-112: flake8 ignore additions look consistent with the existing tooling stack

Adding I300 to the ignore list is reasonable given that import ordering is already enforced via isort (profile=black), and excluding .tox matches the other tool sections. No issues from a tooling/config perspective.

Please run the flake8 environment once to confirm no unexpected codes are now silently ignored.

src/mysql_to_sqlite3/types.py (1)

12-15: New helper attributes are correctly threaded into the typing surface

The TYPE_CHECKING imports for SchemaWriter and DataTransferManager and the corresponding _schema_writer / _data_transfer attributes on MySQLtoSQLiteAttributes keep the attributes in sync with the actual transporter internals, which is exactly what types.py is meant to track. This should help mypy catch attribute usage on these helpers. Based on learnings.

Please confirm that MySQLtoSQLite always initialises _schema_writer and _data_transfer (or adjust these to Optional[...]) so that the declared attributes match runtime behaviour.

Also applies to: 94-95

src/mysql_to_sqlite3/schema_writer.py (2)

26-69: AUTO_INCREMENT single‑column primary key handling matches the documented behaviour

The column loop correctly:

  • Uses _translate_type_from_mysql_to_sqlite with Integer_Types to decide when to emit INTEGER PRIMARY KEY AUTOINCREMENT.
  • Logs a warning and skips the special handling when an AUTO_INCREMENT PK is not an integer type, which aligns with the previous guidance about only promoting integer PKs.
  • Delegates default and collation logic to _translate_default_from_mysql_to_sqlite and _data_type_collation_sequence, keeping that complexity centralised in the translation helpers.

This looks like a faithful extraction of the existing transporter logic. Based on learnings.

Please run the existing schema‐creation tests (especially around AUTO_INCREMENT handling and defaults) to ensure there are no behaviour changes from the refactor.


70-163: Index translation and naming collision handling is sound and preserves prior guarantees

The STATISTICS query and subsequent loop:

  • Correctly aggregates index columns and MySQL column types via GROUP_CONCAT, and uses the auto_increment aggregate to decide when a PRIMARY index should not be re‑emitted (i.e. the single integer AUTO_INCREMENT case).
  • Uses ctx._prefix_indices and ctx._get_unique_index_name to avoid collisions and to prefix indices when requested, which matches the “avoid index naming collisions / prefix with <table>_” guidance.
  • Properly quotes all SQLite identifiers via _quote_sqlite_identifier and only interpolates constant SQL fragments (auto_increment expression and group_by_extra) into the query text, so the SQL injection warning from static analysis here appears to be a false positive.

Overall, the index DDL generation looks correct and robust.

Please verify that the existing index‑naming tests (including collision cases and --prefix-indices) still pass against this refactored implementation.

src/mysql_to_sqlite3/type_translation.py (1)

253-395: Centralised default and collation translation logic is thorough; rely on tests to guard edge cases

The new translate_default_from_mysql_to_sqlite and data_type_collation_sequence functions:

  • Consolidate a lot of previously scattered logic (binary/hex defaults, charset introducers, generated defaults, boolean/time functions, NULL handling, and arithmetic expressions) into one place.
  • Correctly distinguish binary column types when column_default is bytes, and use CHARSET_INTRODUCERS consistently across bytes and string cases.
  • Use sqlglot both to transpile expressions and to normalise literals, which should reduce the need for ad‑hoc string parsing.
  • Restrict COLLATE clauses to types with text affinity while avoiding JSON/BLOB, and fall back to re‑transpiling the MySQL type when there’s ambiguity.

This is a good refactor, but because the control flow is complex and touches many MySQL edge cases, it’s easy for regressions to slip in.

Please ensure there is test coverage (or add tests if missing) for:

  • Binary/hex defaults with and without charset introducers.
  • Generated temporal defaults (CURTIME/CURDATE/CURRENT_TIMESTAMP/NOW) and their _GENERATED variants.
  • BOOLEAN defaults under different SQLite versions.
  • Collation assignment for various textual and JSON/BLOB‑like types.

That will help guarantee parity with the previous transporter implementation as the library evolves.

Also applies to: 397-428

src/mysql_to_sqlite3/data_transfer.py (1)

30-64: Chunked vs non‑chunked transfer logic is straightforward and respects quiet/progress settings

transfer_table_data:

  • Uses trange with _current_chunk_number and ceil(total_records / _chunk_size) for chunked transfers, updating _current_chunk_number each iteration, and wraps fetchmany in an executemany with encode_data_for_sqlite per value.
  • Falls back to a single executemany over fetchall() wrapped in tqdm when chunking is disabled, again honouring ctx._quiet.
  • Commits via ctx._sqlite.commit() only after all inserts have been attempted.

This keeps all progress/UI handling in one place and matches the previous guidance about chunked transfers and tqdm usage. Based on learnings.

Please confirm that _current_chunk_number is reset appropriately per table (e.g. before each call into transfer_table_data) so chunk progress and any reconnect logic remain accurate across multiple tables.

src/mysql_to_sqlite3/transporter.py (2)

3-3: Type/default translation and helper wiring look consistent

The move to centralise column-type parsing, default expression handling, identifier quoting, and MySQL backtick escaping via mysql_to_sqlite3.type_translation is coherent and aligns with the goal of sharing logic across components. The from __future__ import annotations import and the pylint: disable=protected-access are justified here, as you are intentionally exposing internal helper functions for reuse and tests while keeping the transporter thin. I do not see issues with the wiring of _valid_column_type, _decode_column_type, _translate_type_from_mysql_to_sqlite, default translation, or collation handling.

As per coding guidelines.

Also applies to: 23-23, 33-35, 50-50, 220-294


335-361: Reconnection handling for tables and views remains robust

The refactored _create_table and _create_view keep the single-reconnect semantics on CR_SERVER_LOST: first failure triggers a reconnect and retry; a second failure logs that the reconnection attempt is aborted and re-raises. Error logging for other MySQL/sqlite errors is preserved, and commits are still done only on success.

This matches the existing behaviour while simplifying the control flow, so the changes look good.

As per coding guidelines.

Also applies to: 445-459

@techouse techouse closed this Nov 16, 2025
@techouse techouse deleted the refactor/split-it-up branch November 16, 2025 20:06
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

enhancement New feature or request

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants