Remove infobright-storage folder from 4-storage #285

Closed
alexanderdean opened this issue Jul 1, 2013 · 3 comments

@alexanderdean
Member

Remove this folder: keeping it is confusing, given that the current ETL doesn't work with Infobright.

@ghost ghost assigned alexanderdean Jul 1, 2013
@sriv

sriv commented Jul 1, 2013

Is there a reason for ETL not working with Infobright? Can it be patched somehow?

@alexanderdean
Member Author

Hi @sriv - the reason the current Hadoop ETL does not work with Infobright (or MySQL) is that the flatfile format Infobright/MySQL require for loading is slightly different from the one required by Redshift/Postgres.

From a roadmap perspective, it makes more sense for us to add Postgres support next (as it requires no change to the ETL), rather than adding in Infobright/MySQL (which would require work on the ETL - work which we would throw away when we launch Avro support in 0.9.x).

In other words, the current ETL flow is:

raw events > ETL > Redshift/Postgres-format flatfiles

Later this year we will move to:

                                           > Hive
                                           > HBase
raw events > ETL > binary Avro event files > MongoDB
                                           > more ETL > Redshift/Postgres-format flatfiles
                                                      > MySQL/Infobright-format flatfiles
                                                      > other-format flatfiles
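
As a rough illustration of the planned fan-out, the flow above can be written as a nested structure and walked to list every route from raw events to a final target. This is only a sketch: the stage names are copied from the diagram, while `PLANNED_FLOW` and `paths` are hypothetical names made up for this example and are not part of the ETL.

```python
# Hypothetical model of the planned flow. Stage names come from the diagram;
# the nested-dict representation is an assumption made for illustration.
PLANNED_FLOW = {
    "raw events": {
        "ETL": {
            "binary Avro event files": {
                "Hive": {},
                "HBase": {},
                "MongoDB": {},
                "more ETL": {
                    "Redshift/Postgres-format flatfiles": {},
                    "MySQL/Infobright-format flatfiles": {},
                    "other-format flatfiles": {},
                },
            }
        }
    }
}


def paths(tree, prefix=()):
    """Yield every root-to-leaf path through the flow as a tuple of stage names."""
    for stage, children in tree.items():
        here = prefix + (stage,)
        if children:
            # Intermediate stage: keep descending.
            yield from paths(children, here)
        else:
            # Leaf stage: this is a final storage target.
            yield here


if __name__ == "__main__":
    for route in paths(PLANNED_FLOW):
        print(" > ".join(route))
```

Walking the structure shows that the Infobright/MySQL flatfiles become one more output of the second ETL step, alongside the Redshift/Postgres ones, rather than a change to the first ETL.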

To answer your question: yes, the ETL could be forked and patched to load into Infobright - it would require tweaking a few data types. This could work as a stop-gap until MySQL/Infobright support is added back in later this year:

raw events > forked ETL > MySQL/Infobright-format flatfiles

@alexanderdean
Member Author

Done in 0.8.8.
