Question: Differentiating between Lambda and Spark architectures #1

jmandel1027 · 2019-08-16T17:52:09Z

Hey there!

Really enjoyed your article on Towards Data Science! It's a great explainer on the architecture and shows the scope of all the different moving parts quite well.

One thing that might be helpful for other peeps coming to this repo would be separate branches (or another repo, what have you) with the Spark and AWS setups respectively. At a glance it's kind of confusing to see which elements belong to which config.

That said, this is a pretty sweet project and reference point for peeps trying to dig into these tools!

chollinger93 · 2019-08-23T23:58:22Z

Hey,

thanks :)

This is a good point, but I am not a huge fan of persistent branches (as opposed to release branches) for functionality. However, the README.md could definitely use some work.

The long-term plan is to add a third version that re-uses part of the Spark code but relies on fewer Hadoop-focused technologies.

jmandel1027 · 2019-08-24T16:51:32Z

For sure, yeah, release tags would totally do the trick and provide an easy way to access the correct state for each version.

Ooh, interesting, how so? As in a more optimized way to mount the files from S3 and run the queries over Spark? That's super interesting. Athena is very powerful but sometimes not well suited for real-time workloads.

One thing that might be interesting to explore is a Lambda that activates a Fargate task. This would give you more flexibility to execute the Spark jobs without worrying about hitting the Lambda timeout, although I think the limit is around 15 minutes now, so it might be moot. Pairing Lambda and Step Functions with Fargate tasks might be an ergonomic way to handle the query and return the results.
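To make the Lambda-to-Fargate idea a bit more concrete, here's a rough sketch in Python with boto3. It is not taken from this repo, just an illustration under some assumptions: the `ECS_CLUSTER`, `TASK_DEFINITION`, and `SUBNET_IDS` environment variables, the `spark-job` container name, and the `query_id` event field are all hypothetical placeholders. The only real API used is the ECS `run_task` call.

```python
import os


def build_run_task_request(query_id, cluster, task_definition, subnets,
                           container_name="spark-job"):
    """Build the keyword arguments for ECS RunTask with the Fargate launch type.

    All argument values are supplied by the caller; `spark-job` is a
    hypothetical container name, not one defined by this project.
    """
    return {
        "cluster": cluster,
        "taskDefinition": task_definition,
        "launchType": "FARGATE",
        "count": 1,
        # Fargate tasks use awsvpc networking, so subnets are required.
        "networkConfiguration": {
            "awsvpcConfiguration": {
                "subnets": list(subnets),
                "assignPublicIp": "DISABLED",
            }
        },
        # Pass the query identifier to the container as an environment variable.
        "overrides": {
            "containerOverrides": [
                {
                    "name": container_name,
                    "environment": [
                        {"name": "QUERY_ID", "value": str(query_id)}
                    ],
                }
            ]
        },
    }


def handler(event, context, ecs_client=None):
    """Lambda entry point: start the long-running job on Fargate and return.

    The Lambda only launches the task, so its own timeout no longer bounds
    the job. `ecs_client` can be injected for local testing.
    """
    if ecs_client is None:
        import boto3  # imported lazily so the module loads without boto3
        ecs_client = boto3.client("ecs")

    request = build_run_task_request(
        query_id=event["query_id"],  # hypothetical event field
        cluster=os.environ["ECS_CLUSTER"],
        task_definition=os.environ["TASK_DEFINITION"],
        subnets=os.environ["SUBNET_IDS"].split(","),
    )
    response = ecs_client.run_task(**request)

    # RunTask reports placement problems in "failures" instead of raising.
    if not response.get("tasks"):
        raise RuntimeError(f"Fargate task not started: {response.get('failures')}")
    return {"taskArn": response["tasks"][0]["taskArn"]}
```

A Step Functions state machine could then poll the returned task ARN and collect the results once the task stops, which is the part that keeps the Lambda itself short-lived.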

chollinger93 · 2019-10-29T21:28:33Z

Created separate branches

chollinger93 added the enhancement label Aug 23, 2019

chollinger93 self-assigned this Aug 23, 2019

chollinger93 closed this as completed Oct 29, 2019
