Next Steps: What more can be done beyond the programming challenge? #18

Closed
arunkpatra opened this issue Jul 29, 2020 · 1 comment

arunkpatra commented Jul 29, 2020

Ask

List the items you would like or need to do if given more time.

Next Steps

  1. The analytical queries are not tuned for massive scale. They would need tuning to take advantage of Redshift's query optimization features, such as carefully chosen distkey and sortkey columns.
  2. Use Amazon Redshift Spectrum to query S3 data directly, with Redshift acting as a conduit between the business application and the OLAP engine.
  3. Tweak the data model so that it mimics the actual GC business model more closely. There are opportunities to merge certain entities to facilitate better OLAP queries.
  4. Build a UI that surfaces the data and presents powerful visualizations.
  5. Implement a more advanced algorithm for breakage forecast.
  6. Productionize the API: security, exception handling, logging, monitoring, tracing, scaling, stress testing, deployment, containerization, etc.
  7. An ambitious goal would be to develop a robust breakage forecast Machine Learning model trained on actual production data. This is of significant commercial value.
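
As a rough illustration of point 1, here is a minimal sketch of how a distribution key and sort keys might be declared when creating a Redshift table. The table name `gc_transaction` and its columns are hypothetical and are not taken from this project's data model. The helper only builds the DDL string and does not connect to a cluster.

```python
def build_create_table(table, columns, distkey, sortkeys):
    """Build a Redshift CREATE TABLE statement with DISTKEY and SORTKEY.

    `columns` is a list of (name, sql_type) pairs. All names used here are
    hypothetical placeholders, not the project's real schema.
    """
    names = [name for name, _ in columns]
    # Fail early if a key refers to a column that is not declared.
    for key in [distkey, *sortkeys]:
        if key not in names:
            raise ValueError(f"unknown key column: {key}")

    column_sql = ",\n".join(f"    {name} {sql_type}" for name, sql_type in columns)
    sort_sql = ", ".join(sortkeys)
    return (
        f"CREATE TABLE {table} (\n{column_sql}\n)\n"
        f"DISTSTYLE KEY\n"
        f"DISTKEY ({distkey})\n"
        f"SORTKEY ({sort_sql});"
    )


if __name__ == "__main__":
    # Hypothetical fact table: distribute on the join column, sort on the
    # column most often used in range filters.
    print(build_create_table(
        "gc_transaction",
        [("card_id", "BIGINT"), ("transaction_date", "DATE"), ("amount", "DECIMAL(12,2)")],
        distkey="card_id",
        sortkeys=["transaction_date", "card_id"],
    ))
```

Which columns to pick depends on the actual query patterns. A common starting point is to distribute on the column used most often in joins and to sort on the column used most often in range filters, then check the result against real data volumes.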

Thoughts on cost

  1. Point 1 requires a massive amount of actual test data and some effort.
  2. Points 2 through 4 in the next steps above are fairly straightforward.
  3. Point 5 requires effort and a more comprehensive data model and business research.
  4. Point 6 is necessary to build production-worthy code. It takes a non-trivial amount of time and effort.
  5. Point 7 is a significantly complex effort, but has the highest commercial value. It's probably a valuable goal for the GC business and can be a real income generator.
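
For context on the breakage forecast in points 5 and 7, the sketch below shows the kind of simple baseline that a more advanced algorithm or ML model would be measured against. It assumes that "GC" refers to gift cards and that "breakage" means issued value that is never redeemed. The `Cohort` type, its fields, and the function names are hypothetical and are not this project's implementation.

```python
from dataclasses import dataclass


@dataclass
class Cohort:
    """Totals for one historical batch of issued cards (hypothetical shape)."""
    issued_value: float
    redeemed_value: float


def historical_breakage_rate(cohorts):
    """Share of issued value that was never redeemed across past cohorts."""
    issued = sum(c.issued_value for c in cohorts)
    redeemed = sum(c.redeemed_value for c in cohorts)
    if issued <= 0:
        raise ValueError("total issued value must be positive")
    return 1.0 - redeemed / issued


def forecast_breakage(new_issued_value, cohorts):
    """Naive forecast: apply the pooled historical rate to newly issued value."""
    return new_issued_value * historical_breakage_rate(cohorts)


if __name__ == "__main__":
    history = [Cohort(1000.0, 900.0), Cohort(2000.0, 1800.0)]
    print(forecast_breakage(500.0, history))
```

A pooled rate like this ignores card age, seasonality, and customer behavior, which is why a better forecast needs the richer data model and business research mentioned in point 3 of this list.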
@arunkpatra arunkpatra added the enhancement New feature or request label Jul 29, 2020
@arunkpatra arunkpatra self-assigned this Jul 29, 2020
@arunkpatra arunkpatra pinned this issue Aug 6, 2020
@arunkpatra

Closing this for now.
