Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

No documents synced if source table does not have Primary Keys #467

Open
lio-p opened this issue Feb 14, 2023 · 13 comments
Open

No documents synced if source table does not have Primary Keys #467

lio-p opened this issue Feb 14, 2023 · 13 comments
Labels
community-driven enhancement New feature or request

Comments

@lio-p
Copy link

lio-p commented Feb 14, 2023

Bug Description

Using MySQL Connector, if the source table does not have primary keys, no documents are being synced. No messages explain what happened.

To Reproduce

  • Create a table with no PK.
CREATE TABLE faq_no_pk (
    title TEXT,
    content TEXT
)
  • Ingest documents to the table.
  • In Kibana, create a new Elasticsearch index and choose MySQL Connector as a connector.
  • Sync contents with the database
  • No documents are being synced

Expected behavior

I understand that it's not working probably because Elasticsearch relies on the primary keys to track the synchronization state. But I think it would be interesting to have a message that explains how many documents have been omitted.

Environment

  • Version 8.6.1
@lio-p lio-p added the bug Something isn't working label Feb 14, 2023
@danajuratoni
Copy link
Contributor

A primary key is mandatory, but missing user feedback seems to be a bug.
According to the design we should log an warning/error mentioning which table we're skipping, ideally also a suggestion for a possible fix e.g. consider adding an auto-increment key.
The warning / error should be looped back to Kibana.
cc: @moxarth-elastic

@parth-crest
Copy link
Collaborator

According to the design we should log an warning/error mentioning which table we're skipping,

@danajuratoni We have that logger message where we inform user that we have skipped that table because it has no primary key.

possible fix e.g. consider adding an auto-increment key.

You mean do you want us to add a random unique id to the elastic document when primary key is not attached, in this case all the document will be reindex in each sync.
OR
You mean it would done at the database level by end user.

@danajuratoni
Copy link
Contributor

Cool, for connector clients, that log message you linked is exactly what we need.
For the native connector we'd need a similar warning fed back to Kibana.

@sphilipse
Copy link
Member

This probably calls for adding a 'warnings' or 'messages' field to the connector protocol to communicate these kinds of things, maybe with a severity (log/warn/error)? Would definitely be useful in the jobs index, I'm not sure whether that would be as useful in the connector document itself.

Adding a possible 'warning' status to the connector and job would help, too.

@parth-crest
Copy link
Collaborator

Agree, 'warning' status needs to be added at the framework level and the connector should be able to pass the messages via any method that appends the message to the 'warning' field.

@tarekziade what are your thoughts on this ?

@tarekziade
Copy link
Contributor

Excellent idea - I would use the standard levels though for the severity field, so info, warning error critical etc.

@sphilipse
Copy link
Member

sphilipse commented Mar 3, 2023

We'd probably get a structure like this:

messages: [
  { severity: 'warning', message: 'Skipped table due to missing primary key'},
  { severity: 'info', message: 'Synced five tables with 120 entries}
]

@parth-crest parth-crest added enhancement New feature or request and removed bug Something isn't working labels Mar 7, 2023
@danajuratoni danajuratoni added this to the 2023-03-15 - 2023-03-28 milestone Mar 10, 2023
@danajuratoni danajuratoni added 8.7 and removed 8.7 labels Mar 10, 2023
@danajuratoni danajuratoni removed this from the 2023-03-15 - 2023-03-28 milestone Mar 10, 2023
@danajuratoni
Copy link
Contributor

Marking this issue as blocked by the work needed to support additional severity warnings. Postponed to 8.8.

@danajuratoni
Copy link
Contributor

Moving this to 8.9, as work for additional severity warning levels is postponed.

@khushbu-elastic
Copy link
Collaborator

@danajuratoni Is there any update on this considering the 8.9 release?

@khushbu-elastic
Copy link
Collaborator

@danajuratoni Please let us know if there is any update on this?

@chantzlarge
Copy link

chantzlarge commented Jan 2, 2024

Moving this to 8.9, as work for additional severity warning levels is postponed.

Is the intention to add support for additional indexes or simply improve the warning message? It would be helpful for us if we were able to specify alternative indexes as we are currently unable to index materialize views due to this constraint.

@seanstory
Copy link
Member

I agree, it would be good for us to revisit these constraints, and find a way to allow the user to provide a column/function to be used as _id so that views and even arbitrary joins could be indexed. This relates to #2002

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
community-driven enhancement New feature or request
Projects
None yet
Development

No branches or pull requests

9 participants