Ideas for improvement #6

Open
scharlottej13 opened this issue Nov 10, 2021 · 2 comments
Comments

@scharlottej13

Opening up this issue to start a conversation on things that can be cut and/or improved (per @ncclementi's suggestion!)

@ncclementi
Contributor

Thank you @scharlottej13 for opening this issue and starting the conversation.

Currently, the tutorial is a bit long: it takes between 1.5 and 2 hours to complete, and there are certain topics that can probably be removed to make the material easier to chew.

Since this tutorial is meant to be introductory, I think we can:

  • Cut the Futures section
    • This is a more advanced topic, so we might want to just remove the section.
  • Skip the Single Machine vs Distributed Schedulers explanation, since we recommend people use the distributed scheduler
    • This implies redesigning the notebooks to use the distributed scheduler from the beginning, which will also help expose the dashboard earlier.
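As a rough sketch of what "distributed scheduler from the beginning" could look like in the first notebook cell (the worker counts and array sizes here are illustrative assumptions, not taken from the tutorial):

```python
import dask.array as da
from dask.distributed import Client, LocalCluster

# Assumption: a small in-process local cluster is enough for a tutorial laptop.
# processes=False keeps everything in one process, which avoids needing a
# `if __name__ == "__main__":` guard when this runs as a script.
cluster = LocalCluster(n_workers=2, threads_per_worker=1, processes=False)
client = Client(cluster)

# Attendees can open this link right away to watch the dashboard.
print("Dashboard:", client.dashboard_link)

# Any Dask collection computed from here on uses the distributed scheduler.
x = da.ones((1000, 1000), chunks=(250, 250))
total = x.sum().compute()
print("Sum of ones:", total)

client.close()
cluster.close()
```

Creating the `Client` up front means there is no need to explain scheduler selection later; every `.compute()` call in the notebook is routed through it automatically.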

cc: @pavithraes @rrpelgrim, as you have taught similar tutorials, what's your experience with these topics? Is there anything else that might be confusing or too advanced for beginners?

@avriiil

avriiil commented Nov 11, 2021

Thanks for starting this conversation @scharlottej13!

I agree with @ncclementi. I think the tutorial currently tries to be too exhaustive and complete (lots of task graphs, starting off with Delayed, etc.), which can be intimidating for novice users. I think we should move towards 'wow-ing' people with the power of Dask first... and only then explain how it works.

The analogy we're using in evangelism atm is that we want to show people a shiny race car, get them to step in and take it for a test drive (no mechanic skills or understanding of the inner workings of the engine needed here), and then be super impressed by the results. If at that point people are like - "Hey, how does this actually work?" or "Hey, can I take out the engine and build my own car/hovercraft/spaceship?"...then we can dive into that.

With that in mind, what I've been doing is:

  1. Start with a no-code slide deck to build intuition and excitement around what Dask is and the problems it can solve for you -- ~10 minutes

-- move to notebooks --

  1. Start with a quick, flashy 'showing off' of the various Dask race cars: dask.dataframe to scale pandas, dask.array to scale NumPy, Dask-ML to scale scikit-learn, and a very quick sneak peek into the engine with a simple dask.delayed example (to tease any intermediate/expert users in the room) -- ~10 minutes

  2. Then jump into the dask.dataframe car and take it for a test drive. Show them how to move from pandas to Dask and how to control that car (API, etc.) -- ~20 minutes

  3. Then jump into the Dask-ML car and take it for a drive -- ~15 minutes

  4. Then say - "Cool stuff, right? Do you want to know how this works?" and talk a little more about delayed -- ~10 minutes

  5. Skip Schedulers and Futures

  6. Q&A

My alternative layout of the notebooks lives here for now, but I would like to combine efforts and end up with a set of 'master' notebooks and slides in this coiled/dask-mini-tutorial repo that we can then fork whenever we give a presentation.
https://github.com/coiled/coiled-resources/tree/main/dask-tutorial/notebooks

My non-code slides live here. These need iteration; I'm not totally convinced by my own narrative line on this one:
https://docs.google.com/presentation/d/1BMhxuTuOg1jRYFANDvbb-GNszpyH-JKPKnbGEsO5GtQ/edit?usp=sharing

We should also refresh the longer data-science-at-scale tutorial with some of these messaging strategies in mind.

Curious what @MrPowers thinks based on his meetup experiences.
