Make extractors usable on their own #19

MeltyBot · 2018-06-25T16:54:40Z

Migrated from GitLab: https://gitlab.com/meltano/meltano/-/issues/20

Originally created by @joshlambert on 2018-06-25 16:54:40

We discussed this in the past but opted not to do it, because of the added complexity of building capabilities we did not immediately need: support for databases other than Postgres, splitting the extractors into separate projects, etc.

Now that we have more engineering resources on board and have addressed the near-term needs of our internal data team, I think we should revisit this topic for a few reasons:

Each of these extractors has its own value and could generate interest on its own. For example, a good SFDC, Zuora, or Netsuite extractor would be useful for the broader community.
It will take some time for us to really make the full Meltano experience great, end to end.
While that work is being done, we could start generating interest and critical mass with just the extractors themselves.
Right now, however, there are a few major hurdles in driving usage of these extractors:

Our extractors only output to Postgres. There is no support for exporting to a file or to any other database type. If your EDW runs on BigQuery, we can't help you.
There is no SEO for the individual extractors. If you google for "sfdc extract", you aren't going to get a good hit based on the full Meltano readme.
Further, the extractors aren't easily usable on their own. They are expected to be used in the context of the full project. For example, there is no canned image; instead, they are pulled down with a git checkout.
We currently operate as a monorepo, and it is not user friendly to work on these in isolation. Our issues, MRs, READMEs, etc. all cover a broader scope than the simple, sharp tool of extracting from a source.

There are some downsides:

There will be some work to really "productize" these individually, if we are going with our own system.

We should accelerate the work on outputting to an intermediate format for the extractors, so we can support multiple storage engines (PG, MySQL, BigQuery, Redshift, Snowflake, etc.). We can then build individual loaders for these.
We will need to rework the pipelines to build a final image for each extractor, then update the main CI pipeline.
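
To make the intermediate-format idea more concrete, here is a minimal sketch of how an extractor and a loader might be decoupled. It assumes JSON Lines as the intermediate format, which is only one possible choice. The function names (`extract_to_jsonl`, `read_jsonl`, `load_into_sqlite`) and the sample records are hypothetical, SQLite merely stands in for a real target such as PG or BigQuery, and none of this reflects the existing extractor code.

```python
import io
import json
import sqlite3
from typing import Iterable, Iterator, TextIO


def extract_to_jsonl(records: Iterable[dict], out: TextIO) -> int:
    """Write each record as one JSON object per line; return the count."""
    count = 0
    for record in records:
        out.write(json.dumps(record, sort_keys=True) + "\n")
        count += 1
    return count


def read_jsonl(src: TextIO) -> Iterator[dict]:
    """Yield records back from the intermediate format, skipping blank lines."""
    for line in src:
        if line.strip():
            yield json.loads(line)


def load_into_sqlite(src: TextIO, conn: sqlite3.Connection, table: str) -> int:
    """Hypothetical loader; SQLite is a stand-in for any storage engine."""
    rows = list(read_jsonl(src))
    if not rows:
        return 0
    # Assumption: the first record defines the column set for the table.
    columns = sorted(rows[0])
    col_list = ", ".join(f'"{c}"' for c in columns)
    placeholders = ", ".join("?" for _ in columns)
    # The table name is interpolated directly, so it must come from trusted config.
    conn.execute(f'CREATE TABLE IF NOT EXISTS "{table}" ({col_list})')
    conn.executemany(
        f'INSERT INTO "{table}" ({col_list}) VALUES ({placeholders})',
        [tuple(row.get(c) for c in columns) for row in rows],
    )
    conn.commit()
    return len(rows)


if __name__ == "__main__":
    # Made-up sample data; a real extractor would pull from SFDC, Zuora, etc.
    buffer = io.StringIO()
    extract_to_jsonl([{"id": 1, "name": "Acme"}, {"id": 2, "name": "Globex"}], buffer)
    buffer.seek(0)
    load_into_sqlite(buffer, sqlite3.connect(":memory:"), "accounts")
```

The extractor knows nothing about the target database here, so supporting another storage engine would only mean writing another loader that reads the same lines.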

This work may delay the effort to productize the full Meltano project, for example building the data mapping feature.

MeltyBot · 2022-05-30T01:38:04Z

View 38 previous comments from the original issue on GitLab

MeltyBot closed this as completed May 30, 2022
