Generic Source Plugin #2264

Closed
ashoktelukuntla opened this issue Feb 11, 2023 · 1 comment
Labels: duplicate, plugin - source

Comments

ashoktelukuntla (Contributor) commented Feb 11, 2023

Is your feature request related to a problem? Please describe.

Users want to migrate data into OpenSearch from existing self-managed or managed OpenSearch clusters. The plugin should let users read data, transform it, and write it to OpenSearch clusters. This is useful not only for migrating data but also for replaying and reindexing it.

Describe the solution you'd like

Create a source plugin that lets users bulk read and bulk write on a schedule against a given OpenSearch cluster. The plugin should be extensible so that it can take additional user-defined sources. Users should be able to create and schedule a data-migration pipeline that performs the following steps:

  • Auto-discover indexes (list all of them) or take a given index
  • Iterate over an index and read (fetch) all of its data
  • Enrich/transform the data (optional)
  • Sink to OpenSearch using Data Prepper
  • Reconcile and report by comparing source and sink data
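The reconcile/report step could be as simple as comparing per-index document counts. The following is a minimal sketch under that assumption; `IndexReport` and `reconcile` are hypothetical names for illustration and are not part of Data Prepper.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class IndexReport:
    """Document counts for one index on both sides of the migration."""
    index: str
    source_docs: int
    sink_docs: int

    @property
    def matches(self) -> bool:
        return self.source_docs == self.sink_docs


def reconcile(source_counts: Dict[str, int], sink_counts: Dict[str, int]) -> List[IndexReport]:
    """Build one report per source index; an index missing from the sink counts as 0."""
    return [
        IndexReport(index, count, sink_counts.get(index, 0))
        for index, count in sorted(source_counts.items())
    ]


if __name__ == "__main__":
    for report in reconcile({"logs": 10, "metrics": 5}, {"logs": 10}):
        status = "ok" if report.matches else "MISMATCH"
        print(f"{report.index}: source={report.source_docs} sink={report.sink_docs} {status}")
```

Matching counts do not prove that the documents are identical, so a stricter check (for example, comparing document IDs or checksums) may be needed for some migrations.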

A cron expression can be used to schedule the migration of data. For example, schedule: "* * * * *" loads data every minute.
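To illustrate why `"* * * * *"` fires every minute, here is a small sketch of a five-field cron matcher. It is a simplification written for this example: it supports only `*`, `*/n`, and comma-separated integers, and it requires every field to match, whereas standard cron treats restricted day-of-month and day-of-week fields as alternatives. It says nothing about how a plugin would actually parse its schedule.

```python
from datetime import datetime


def _field_matches(spec: str, value: int, start: int) -> bool:
    """Match one cron field; `start` is the lowest legal value for that field."""
    if spec == "*":
        return True
    if spec.startswith("*/"):
        # Steps are counted from the start of the field's range.
        return (value - start) % int(spec[2:]) == 0
    return value in {int(part) for part in spec.split(",")}


def cron_matches(expression: str, when: datetime) -> bool:
    """Return True if `when` falls on a minute selected by `expression`."""
    fields = expression.split()
    if len(fields) != 5:
        raise ValueError("expected five fields: minute hour day month weekday")
    minute, hour, day, month, weekday = fields
    cron_weekday = (when.weekday() + 1) % 7  # cron: 0 = Sunday; Python: 0 = Monday
    return all((
        _field_matches(minute, when.minute, 0),
        _field_matches(hour, when.hour, 0),
        _field_matches(day, when.day, 1),
        _field_matches(month, when.month, 1),
        _field_matches(weekday, cron_weekday, 0),
    ))
```

With this matcher, `"* * * * *"` matches any timestamp, while `"*/15 * * * *"` matches only minutes 0, 15, 30, and 45 of each hour.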

Additional context

The plugin should accept cluster configuration, including hostname:port, user credentials, and, optionally, an index and a query (e.g. match_all).
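The configuration described above might be modeled roughly as follows. This is only a sketch: the class name `SourceConfig` and its field names are assumptions made for illustration, not the option names of any existing Data Prepper plugin.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, Optional


@dataclass
class SourceConfig:
    """Hypothetical settings for reading from a source OpenSearch cluster."""
    host: str                    # "hostname:port"
    username: str
    password: str
    index: Optional[str] = None  # None means "discover all indexes"
    query: Dict[str, Any] = field(default_factory=lambda: {"match_all": {}})

    def __post_init__(self) -> None:
        # Require an explicit numeric port after the last colon.
        hostname, sep, port = self.host.rpartition(":")
        if not sep or not hostname or not port.isdigit():
            raise ValueError(f"host must look like hostname:port, got {self.host!r}")
```

Leaving `index` unset and defaulting `query` to `match_all` mirrors the request that both be optional.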

I envision the following sequence of steps:

  1. cat indices - https://opensearch.org/docs/1.2/opensearch/rest-api/cat/cat-indices/
  2. Iterate over an index
  3. Query the index, e.g. with match_all, or with a scroll query for large indexes
  4. Enrich/transform the data (optional)
  5. Use a Data Prepper pipeline to ingest the data into OpenSearch
  6. Report on data from sink and source
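Steps 1 to 3 can be sketched against the OpenSearch REST API: `_cat/indices?format=json` to list indexes, then a search with a `scroll` parameter followed by repeated `_search/scroll` calls. In the sketch below, `send` is a hypothetical transport callable (method, path, JSON body) supplied by the caller, so the HTTP client, authentication, and error handling are deliberately left out; the function names are also assumptions for illustration.

```python
from typing import Any, Callable, Dict, Iterator, List, Optional

# Hypothetical transport: (method, path, json_body) -> decoded JSON response.
Send = Callable[[str, str, Optional[Dict[str, Any]]], Any]


def list_indices(send: Send) -> List[str]:
    """Step 1: list index names using the cat indices API in JSON format."""
    rows = send("GET", "/_cat/indices?format=json", None)
    return [row["index"] for row in rows]


def scroll_documents(
    send: Send,
    index: str,
    query: Optional[Dict[str, Any]] = None,
    page_size: int = 500,
    keep_alive: str = "1m",
) -> Iterator[Dict[str, Any]]:
    """Steps 2 and 3: yield every hit in `index`, one scroll page at a time."""
    body = {"size": page_size, "query": query or {"match_all": {}}}
    page = send("POST", f"/{index}/_search?scroll={keep_alive}", body)
    scroll_id = page.get("_scroll_id")
    try:
        while page["hits"]["hits"]:
            yield from page["hits"]["hits"]
            page = send(
                "POST",
                "/_search/scroll",
                {"scroll": keep_alive, "scroll_id": scroll_id},
            )
            scroll_id = page.get("_scroll_id", scroll_id)
    finally:
        # Release the scroll context on the cluster once iteration ends.
        if scroll_id:
            send("DELETE", "/_search/scroll", {"scroll_id": scroll_id})
```

Each yielded hit could then be handed to the optional transform step and on to the sink. Passing the transport in as a callable also makes the read loop easy to exercise without a live cluster.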

**References:**

This should be similar to the logstash-input-opensearch-plugin provided in the OpenSearch project.

https://opensearch.org/blog/community/2022/05/introducing-logstash-input-opensearch-plugin-for-opensearch/
https://github.com/opensearch-project/logstash-input-opensearch

ashoktelukuntla changed the title from "X Source Plugin" to "Generic Source Plugin" Feb 11, 2023
dlvenable (Member) commented

@ashoktelukuntla, thank you for this issue. It is very similar to issue #1985. Can you add your input to that issue so that we can incorporate it into the final solution?

asifsmohammed added the plugin - source label and removed the untriaged label Feb 15, 2023
dlvenable added the duplicate label Apr 25, 2023
rajeshLovesToCode added a commit to rajeshLovesToCode/data-prepper that referenced this issue May 8, 2023
…rch-project#2264

Signed-off-by: rajeshLovesToCode <rajesh.dharamdasani3021@gmail.com>
rajeshLovesToCode added 13 more commits to rajeshLovesToCode/data-prepper that referenced this issue between May 11 and May 18, 2023
dlvenable pushed a commit that referenced this issue May 22, 2023
Resolves #1985,#2264

Signed-off-by: rajeshLovesToCode <rajesh.dharamdasani3021@gmail.com>

Signed-off-by: Taylor Gray <tylgry@amazon.com>
Co-authored-by: Taylor Gray <tylgry@amazon.com>