refactor: refactor StatefulApiExtractor to fix busy buffer and EOF errors #8405

Merged: klesh merged 2 commits into apache:main from caioq:fix/extract-execution on Apr 29, 2025

caioq (Contributor) commented Apr 24, 2025

Summary

This PR updates the StatefulApiExtractor.Execute() method to avoid using a long-lived cursor while writing to the database. Previously, the cursor was kept open during data iteration and insertions, which led to errors like unexpected EOF and busy buffer. This change introduces a two-phase process: first, we collect all raw data IDs using a cursor and then close it. After that, each row is loaded individually by ID and processed safely. This avoids concurrency issues with the database connection and improves stability for high-volume data extractions.
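
The two-phase flow described above can be sketched roughly as follows. This is only an illustration, not the code from this PR: `RawStore`, `RawRow`, `ExtractTwoPhase`, and `memStore` are hypothetical names, and the in-memory store stands in for a real database so the sketch has no external dependencies.

```go
package main

import (
	"errors"
	"fmt"
)

// RawRow stands in for one row of a raw-data table (hypothetical shape).
type RawRow struct {
	ID   uint64
	Data string
}

// RawStore is a hypothetical abstraction over the database.
// It is NOT DevLake's dal interface.
type RawStore interface {
	// ListIDs returns all matching raw-row IDs and must release its
	// result set before returning.
	ListIDs() ([]uint64, error)
	// LoadByID fetches a single row with a short, independent query.
	LoadByID(id uint64) (*RawRow, error)
	// Save persists the records extracted from one raw row.
	Save(results []string) error
}

// ExtractTwoPhase collects the IDs first (phase 1), then loads, extracts,
// and saves each row on its own (phase 2), so no result set stays open
// while writes are issued. It returns the number of rows processed.
func ExtractTwoPhase(store RawStore, extract func(*RawRow) ([]string, error)) (int, error) {
	ids, err := store.ListIDs()
	if err != nil {
		return 0, fmt.Errorf("collecting raw ids: %w", err)
	}
	processed := 0
	for _, id := range ids {
		row, err := store.LoadByID(id)
		if err != nil {
			return processed, fmt.Errorf("loading raw row %d: %w", id, err)
		}
		results, err := extract(row)
		if err != nil {
			return processed, fmt.Errorf("extracting raw row %d: %w", id, err)
		}
		if err := store.Save(results); err != nil {
			return processed, fmt.Errorf("saving results of row %d: %w", id, err)
		}
		processed++
	}
	return processed, nil
}

// memStore is an in-memory RawStore used only to exercise the sketch.
type memStore struct {
	rows  map[uint64]*RawRow
	order []uint64
	saved []string
}

func (m *memStore) ListIDs() ([]uint64, error) {
	// Return a copy so callers cannot mutate the store's ordering.
	return append([]uint64(nil), m.order...), nil
}

func (m *memStore) LoadByID(id uint64) (*RawRow, error) {
	row, ok := m.rows[id]
	if !ok {
		return nil, errors.New("raw row not found")
	}
	return row, nil
}

func (m *memStore) Save(results []string) error {
	m.saved = append(m.saved, results...)
	return nil
}
```

The trade-off in this shape is one extra query per row in exchange for never holding a read open across writes; the ID list itself is small compared with the raw payloads.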

Does this close any open issues?

Closes #7826

Screenshots

Testing with GitHub connection
image

Testing with Jira connection
image

Other Information

This issue was observed during the extractIssues step in the Jira plugin. It occurred consistently on subsequent executions of the task, but not on the first run after restarting the application.
This change ensures a more robust extraction flow without requiring increased max_allowed_packet settings or other MySQL tuning.

dosubot (bot) added labels on Apr 24, 2025: size:M (This PR changes 30-99 lines, ignoring generated files), component/plugins (This issue or PR relates to plugins), pr-type/bug-fix (This PR fixes a bug), severity/p1 (This bug affects functionality or significantly affects UX)
caioq changed the title from "Fix/extract execution" to "refactor: refactor StatefulApiExtractor to fix busy buffer and EOF errors" on Apr 24, 2025

klesh (Contributor) left a comment

Does the error disappear after this patch?
I would recommend using db.Pluck to fetch all IDs at once instead of using db.Cursor.
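
For readers unfamiliar with the distinction, "fetching all IDs at once" means draining the ID result set into memory and closing it before any other statement runs on the connection. The sketch below shows that idea in generic terms. It is not DevLake's db.Pluck implementation: `rowIterator`, `pluckIDs`, and `fakeRows` are hypothetical names, and `rowIterator` only mirrors the subset of methods that `*sql.Rows` from Go's `database/sql` package provides.

```go
package main

import "fmt"

// rowIterator is the subset of *sql.Rows (database/sql) this sketch needs,
// so a real result set could be passed in as well as the fake below.
type rowIterator interface {
	Next() bool
	Scan(dest ...any) error
	Err() error
	Close() error
}

// pluckIDs drains a single-column result set into a slice and closes it,
// so nothing is left open when the caller starts issuing writes.
func pluckIDs(rows rowIterator) ([]uint64, error) {
	defer rows.Close()
	var ids []uint64
	for rows.Next() {
		var id uint64
		if err := rows.Scan(&id); err != nil {
			return nil, fmt.Errorf("scanning id: %w", err)
		}
		ids = append(ids, id)
	}
	// Err reports any failure that ended the iteration early.
	if err := rows.Err(); err != nil {
		return nil, fmt.Errorf("iterating ids: %w", err)
	}
	return ids, nil
}

// fakeRows is an in-memory rowIterator used only to exercise the sketch.
type fakeRows struct {
	ids    []uint64
	pos    int
	closed bool
}

func (f *fakeRows) Next() bool {
	if f.pos >= len(f.ids) {
		return false
	}
	f.pos++
	return true
}

func (f *fakeRows) Scan(dest ...any) error {
	if len(dest) != 1 {
		return fmt.Errorf("expected 1 destination, got %d", len(dest))
	}
	p, ok := dest[0].(*uint64)
	if !ok {
		return fmt.Errorf("unsupported destination type %T", dest[0])
	}
	*p = f.ids[f.pos-1]
	return nil
}

func (f *fakeRows) Err() error { return nil }

func (f *fakeRows) Close() error {
	f.closed = true
	return nil
}
```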

caioq force-pushed the fix/extract-execution branch from 09dd090 to c040a3b on April 28, 2025 16:26
dosubot (bot) added size:XL (This PR changes 500-999 lines, ignoring generated files) and removed size:M on Apr 28, 2025
caioq force-pushed the fix/extract-execution branch from c040a3b to 09dd090 on April 28, 2025 17:14
dosubot (bot) added size:M (This PR changes 30-99 lines, ignoring generated files) and removed size:XL on Apr 28, 2025
caioq force-pushed the fix/extract-execution branch from 09dd090 to 522728d on April 28, 2025 17:16

caioq (Contributor, Author) commented Apr 28, 2025

> Does the error disappear after this patch? I would recommend to use db.Pluck to fetch all ids at once instead of using db.Cursor

Good idea, done in a32009a.

Yes, I tested it locally and the error disappeared.
I tested it with 8K Jira issues and the following database configuration:

mysql:
    image: mysql:8
    volumes:
      - mysql-storage:/var/lib/mysql
    restart: always
    ports:
      - 3306:3306
    environment:
      MYSQL_ROOT_PASSWORD: admin
      MYSQL_DATABASE: lake
      MYSQL_USER: merico
      MYSQL_PASSWORD: merico
      TZ: UTC
    command: --character-set-server=utf8mb4
      --collation-server=utf8mb4_bin
      --skip-log-bin
      --max_allowed_packet=9M
      --net_read_timeout=6
      --net_write_timeout=6

Without this fix, this setup was already enough to reproduce the error on subsequent executions.

klesh (Contributor) commented Apr 29, 2025

Nice work. Thanks for your contribution.

klesh merged commit 8bc1f1a into apache:main on Apr 29, 2025
10 checks passed

Development

Successfully merging this pull request may close these issues.

[Bug][Jira] issues disappearing from dataset, progressive load issue?
